A comprehensive study on the Bayesian modelling of extreme rainfall: A case study from Pakistan

In this paper, the modelling of extreme rainfall is carried out in Pakistan by analysing annual daily maximum rainfall data via frequentist and Bayesian approaches. In frequentist settings, the parameters and return levels of the best fitted probabilistic model (i.e., generalized extreme value) are estimated using maximum likelihood and linear moments method. On the other side, under the Bayesian framework, the parameters and return levels are calculated both for noninformative and informative priors. This task is completed with the help of the Markov Chain Monte Carlo method using the Metropolis‐Hasting algorithm. This study also highlights a procedure to build an informative prior through historical records of the underlying processes from other nearby weather stations. The findings attained from the Bayesian paradigm demonstrate that the posterior inference could be affected by the choice of past knowledge used for the construction of informative priors. Additionally, the best method for the modelling of extreme rainfall over the country is decided with the support of assessment measures. In general, the Bayesian paradigm linked with the informative priors offers an adequate estimations scheme in terms of accuracy as compared to frequentist methods, accounting for ambiguity in parameters and return levels. Hence, these findings are very helpful in adopting accurate flood protection measures and designing infrastructures over the country.


| INTRODUCTION
Pakistan is a country with exclusive rainfall patterns and appearances than other countries in the world (Arif et al., 2019). In particular, Pakistan faces two different rainfall seasons namely summer and winter. In summer, rainfall happens mainly during the monsoon season (early July to September). Therefore, July and August are considered the peak months for monsoon rainfalls. In winter, mostly rainfall events occur from (mid-December to March) (Ahmad et al., 2013). Unquestionably, extreme rainfall events are frequently connected with climate fluctuations, which may cause a series of natural disasters such as flash floods, heavy winds, and landslides.
Consequently, rapid fluctuations in the climate have frequently increased the number of heavy floods in the country.
During the 2020 monsoon rainfall spell, numerous rain-related losses were reported in Pakistan. According to Aljazeera News (2020), 31 casualties were reported in the southern Sindh province, whereas 23 people expired in Khyber Pakhtunkhwa province. Furthermore, 15 deaths were reported in southwestern Baluchistan province and 8 in Punjab province. Likewise, 13 more people passed away elsewhere in Pakistan's northern areas, including three in Pakistan administered Kashmir.
According to the Federal Flood Commission (FFC) report, floods have now become a regular feature in the country. Due to downpours, the government has faced an overall financial loss of more than US$ 38 billion during the past 70 years. The massive defeat in the economy, particularly in the agriculture sector, has dramatically influenced the country's progress.
Rainfall patterns have continuously been examined including the estimation of the rainfall distribution and the identification of damp or dry events on a specific day. However, information about the amount and happenings of extreme rainfall is inevitable for different purposes such as sustainable water resource management, government planning for water-related disasters and preparation of different hydraulic structures (Ahmad et al., 2016;Hussain et al., 2017). On the other side, it is unquestionable that the infrastructures and the region's economy might be affected by extreme events. For these reasons, statistical modelling and prediction of extreme environmental events are required for future planning. Moreover, the challenging task in modelling extreme events is to realize the happening probabilities linked with events that are extrapolated beyond the observed data.
Procedures to analyse extreme values comprise the frequency of happening of extreme events with the practice of probabilistic models for both similar or dissimilar processes for one or more climate variables (Noto and La Loggia, 2009;Lenderink and Fowler, 2017). Several research studies exist in literature in favour of selection and evaluation of different extreme value probabilistic models for extreme data, but due to the accessibility of the small length of observed data as compared to return periods of interest, continuously this work has been challenging and provocative (Fadhilah et al., 2007;El Adlouni and Ouarda, 2010;Olofintoye et al., 2009;Suhaila et al., 2011;Rahman et al., 2013;Khudri and Sadia, 2013;Marani and Ignaccolo, 2015;Ahmad et al., 2016;Ahmad et al., 2019).
Extreme value analysis might be helpful to assess both the probability of happening and the magnitude of extreme events. So, extreme value theory authorizes researchers to measure an event's random behaviour that originates in the upper or lower tails. Standard extreme value analysis is frequently performed based on the most straightforward inferential procedure; however, the data structure might be complex. Consequently, the statistical modelling of extreme weather events using extreme value theory agrees to accomplish the complex system that is natural in the extreme data to enhance inferential procedures.
For the modelling of extreme weather variables, the two essential methods are engaged in extreme value theory, namely block maxima and peak-over-threshold (POT) (see, e.g., Coles, 2001;Rivas et al., 2008;Bücher et al., 2019). In the block maxima method, we model a maximum value of each year acquired from the large sample via generalized extreme value (GEV) distribution (Ferreira and De Haan, 2015). On the other side, the POT method deals with those values that exceed the high threshold level in the observed data with the application of generalized Pareto distribution (Davison and Smith, 1990;Ferreira and De Haan, 2015). However, according to Madsen et al. (1997) and Eastoe and Tawn (2012)), the block maxima is the favourite for modelling extremes, whereas in the POT procedure, sometimes selecting an appropriate threshold is not an easy task.
From the advances in the statistical modelling point of view, extreme value analyses customarily have been carried out using frequentist approaches. For instance, Hosking (1990) introduced the linear moments (LM) method for the study of extreme data. Many studies are available in the literature concerned with applying the LM method (for instance, Elamir and Seheult, 2003;Hosking and Wallis, 2005;Khan et al., 2021). Furthermore, Ahmad et al. (2013) and Ahmad et al. (2016) used LM method to model the monsoon rainfalls patterns in Pakistan. They establish the best fit distribution among five extreme value distributions by classical modelling. On the other side, Coles and Dixon (1999), Coles (2001), and Ahmad et al. (2019) used likelihood-based inference methods for modelling extreme value models. Researchers are more interested in Bayesian modelling than a classical setup to obtain more valuable results about uncertainty extreme environmental events.
Meanwhile, extreme data are scarce by their nature. The statistical inference on extremes could be enhanced by the Bayesian paradigm's support that allows supplementary evidence about the processes via prior knowledge. For interested readers, many studies exist in the literature (see, e.g., Coles andTawn, 1996, 2005;Coles and Powell, 1996;Beirlant et al., 2004;Chu and Zhao, 2011;Naghettini, 2017, chap. 11;Diriba et al., 2017;Ahmad et al., 2019;Diriba and Debusho, 2020). However, the Bayesian analysis of extreme events is not dependent on the critical assumptions that are obligatory for the frequentist framework by the asymptotic theory (Smith, 1985;Coles, 2001;Smith, 2005). Generally, the prior knowledge on extreme value model parameters is rare to discover that fits the probabilistic models via Bayesian procedure. Therefore, impressive results on this strengthen us to increase the precision of the estimates. For instance, Diriba et al. (2017) have been examined the properties of priors on the parameter estimates of GEV distribution and the prior effect on return levels for the wind speed data of Cape Town, South Africa.
Furthermore, Ahmad et al. (2019)) have studied the effects of priors on the parameters as well as return levels for the rainfall data of Lahore station, Punjab, Pakistan. Although, they did not generalize their results over the country because their findings were limited only to one province. They did not study any other frequentist procedure except maximum likelihood estimation (MLE). These studies inspire the authors to develop extreme rainfall modelling over the country and explore an entire narrative of uncertainty in parameters and return levels (RLs) of GEV distribution. For this reason, some sites over the country are considered for experimental work. Consequently, the essential purpose of this modelling is to predict extreme rainfall events in the future over the country. The occurrence of uncertainty in the future forecasts of extremes makes the study of extreme events even more vital and critical (Coles et al., 2003;Zhu et al., 2013;First, 2019). Thus, it is mandatory to characterize their behaviour statistically.
In application point of view, the key objective of the study for extreme environmental events is to recognize the properties of the larger RLs of the variable of interest. For instance, the estimates of RLs for an annual maximum of the extreme event could be predicted as these observations provide an expected value of return level that exceeds once, on average, in a given return period (Coles, 2001;Diriba and Debusho, 2020). Hence, statistical findings from the systematic study of extreme climatic events suggest high analytical power. Also, this research study aims to examine extreme climate fluctuations and varying patterns of events which may help to know the behaviour of extreme weather events.
The rest of the paper is organized as follows. In Section 2, the materials and methods are presented for data analysis. The data description with their exploratory analysis and generalized extreme value model with block maxima are explained. Also, both frequentist (MLE and LM methods) and Bayesian Markov Chain Monte Carlo (MCMC) paradigm with noninformative (NIPs) plus Informative Priors (IPs) for parameters estimation and RLs are established in the same section. Also, the assessment measures are described in Section 2. In Section 3, the results are discussed. For instance, how NIPs and IPs affect parameter estimates of GEV and RLs in Bayesian settings, the results of all three methods are compared based on assessment measures. Conclusion and some recommendations are given in the final Section 4.

| Data description and exploratory analysis
Throughout this paper, the data comprise a daily aggregate of rainfall (in millimetres) of 11 weather stations all over Pakistan recorded by automatic weather stations. The data had been taken from Pakistan Metrological Center Karachi corresponds to 32 years from 1985 to 2016. Data have been selected on the following standard criteria: the length of the data, variability, quality, urbanization, and climate changes. Later, the ADMRS was extracted from the daily rainfall data using the block maxima method. AMDRS is a single maximum value for any specific year and station among all recorded daily rainfall values. The extracted data of 11 weather stations, namely Lahore, Drosh, Chitral, Jacobabad, Khuzdar, Rohri, Nawabshah, Lasbela, Hyderabad, Chhor, and Pasni, were utilized for analysis and prediction. For deriving of IPs, the rainfall characteristics of two new weather stations at various distances were engaged, namely Mohenjo-daro and Dera Ismael Khan (D.I. Khan). The length of the data for these stations was 30 years from 1987 to 2016.
The D.I. Khan station is considered for the construction of IPs and it is located in the centre of the country. Besides, the D.I. Khan district is situated between district Bhakkar of south Punjab, Mianwali of North Punjab, Zhob of Baluchistan, Indus river, and South Waziristan of Pakistan tribal belt. The Mohenjo-daro station was chosen for prior elicitation due to very short distances from other various sites of the Sindh and Baluchistan provinces. The plots under the Google map spatial linkage encompass selected areas studied in the present research are shown in Figure 1. Blue pinpoints highlight the observed weather stations, and green pinpoints indicate the stations used for IPs construction.
The descriptive analysis for the amount of ADMRS of different stations selected all over the country is briefed in Table 1. The mean of ADMRS fluctuates from 34.44 to 87.61 mm. The ADMRS of the Jacobabad station has comparatively large variation, that is, the observations are more spread (sample CV) than other data sets. One observation (e.g., 323 mm) in the Jacobabad ADMRS is big enough compared to others and may be a source for this large CV. It can be observed that the Drosh station gained less relative variation against other stations in the study.
Besides, most Sindh province stations have larger sample CVs compared to Punjab and Khyber Pakhtunkhwa (KPK). The ADMRS of two Baluchistan sites (i.e., Lasbela and Khuzdar) and one station of KPK province as Chitral are pretty more skewed compared to other data sets.
Moreover, it is necessary to test the fundamental assumptions of any annual maximum series before conducting a final analysis in the field of statistical hydrology  because the final results could be doubtful without satisfying the basic assumptions. The hydrological series fundamental assumptions are independence, homogeneity, randomness, and stationarity (Naghettini, 2017, chap. 7). Thus, we ensured that the data fulfilled the basic assumptions and can be used to model extreme rainfall. The most suitable probability model for ADMRS is decided among various models (generalized extreme value, Pearson type three, generalized Pareto, generalized logistic) by using some nonparametric procedures. Hence, the GEV distribution remains most suitable for the observed ADMRS from various country sites. Moreover, the bestfitted probability model and estimation of its parameters via frequentist and Bayesian techniques are presented in the subsequent section.

| Block maxima and generalized extreme value distribution
To model the extreme observations using GEV, a data of N independent values w 1 , w 2 , …, w N is first blocked into k block of size n, with n essentially large, and hence N =kn: For rainfall data, the block size is usually a month, season or a year. For instance, 1 year stands for n ≈ 365 days. Then the maxima or extreme value Þis selected from each block. This produces a data of k annual maxima series named block maxima to which the GEV distribution family can be fitted. Suppose the yearly maxima w 1 , w 2 , …, w n are independent and identically distributed (i.i.d) with distribution function of G w ð Þ: Let M n =max w 1 , w 2 , …, w n ð Þ , n Ν and if there are sequences of normalizing constants c n >0 f g and d n ℜ such that as n ! ∞, where F is a nondegenerate distribution function, the distribution function G is called to be in the domain of attraction of extreme value distribution F, {i.e., G F w ð Þ}. Besides, the F follows the family of the probability distribution that has the form where w : 1 +κ w −μ ð Þ=σ f g , μ,σ>0, and κ are location, scale, and shape parameters of GEV distribution (Beirlant et al., 2004). The shape parameter affects the behaviour of the upper tail of the distribution. Moreover, GEV distribution is the mixture of three limiting extreme value distributions, that is, Gumbel distribution, Freshet distribution, and Weibull distribution. If κ ! 0 the GEV distribution in (2) relates to the Gumbel distribution. For κ>0 and κ<0, the expression given in (2) called Frechet and negative Weibull distributions, respectively (Coles, 2001).

| Parameter estimation of GEV distribution
Primarily, the maximum likelihood estimation method and linear moments method were applied to estimate GEV distribution parameters. In MLE, we differentiate the function given in (2) for w i , for instance, when κ ≠ 0 the density of GEV is given by The maximum likelihood estimates (MLEs) of the parameters μ, σ, and κ, sayμ,σ, andκ are calculated by maximizing the logarithm of the joint likelihood, that is, maximizing l μ,σ, κ; w 1 ,w 2 ,::, concerning unknown parameters, say μ,σ, and κ. Since the solution of log-likelihood is not an easy task, in particular, the maximization is solved by quasi-Newton procedure with numerical iteration (Diriba et al., 2017;Diriba and Debusho, 2020).
In LM method computations, we use the linear combinations of order statistics values. This method was introduced by Hosking (1990). The LM provides simple and more efficient estimators of extremal data characteristics and the parameters of the distribution. Let W 1 , W 2 , …,W r be the random sample of magnitude n, with cumulative distribution function F w ð Þ and quantile function w F ð Þ. Suppose W 1:r ≤W 2:r ≤…≤W r:r be the order statistics of the selected random samples. For the random variable W, the rth population LM as explained by Ahmad et al. (2016): Usually, we require the first four LM for r =1, 2, 3, 4: Additionally, LM can also be considered as the linear combinations of probability-weighted moments as given: The first four LM are The LM ratios τ =λ 2 =λ 1 , τ 3 , and τ 4 denote the linear coefficient of variation, linear skewness, and linear kurtosis, respectively. The GEV parameters were estimated by using some approximations discussed by Hosking and Wallis (2005, p. 196). Moreover, the theoretical estimates of the parameters of GEV are given as followŝ

| Return level estimation for GEV model
The attention to extreme environmental events analysis sometimes does not generally rely on the estimates of extreme value distribution parameters; however, it applies the fitted model to calculate the other quantities. Return level estimates play a dynamic role in rainfall modelling for calculating the future hazard associated with return periods conforming to a fitted model. For instance, the estimates of extreme quantiles for ADMRS of an event could be calculated because these observations assess the return level of the event predicted on averagely exceeds once in a specific number of years. The RLs for the GEV model corresponding to the return period T =1=p, denoted by w p where F w p = 1 −p À Á and 0<p<1, is attained by using quantile function by the inverse of (2) given by (Coles, 2001) and also discussed by Ahmad et al. (2019) &Debusho (2020).
The return level w p is determined by quantiles of GEV distribution associated with the upper tail probability p: For the GEV model, the MLEs of the return level w p , indicated byŵ p is gained by substituting the MLEŝ μ,σ, andκ (Rao, 1973).

| Bayesian analysis
As in the maximum likelihood procedure, suppose ADMRS w = w 1 , w 2 ,…, w n ð Þare i.i.d and their distribution falls within the parametric GEV family. Moreover, now in the Bayesian setting, the GEV distribution parameters (μ, σ, and κ) are dealt as random variables for which we identify the prior distributions. The prior information helps us enhance the knowledge provided by the observed data. Let θ = μ, σ, κ ð Þand suppose the prior for θ with no evidence to the actual data can be expressed by a probability density function g θ θ ð Þ. Then using Bayes theorem to combine the likelihood and prior knowledge and to get the posterior density for θ has the following form: Where L θ=w ð Þ indicates the likelihood function of GEV distribution given in (4) and f θ=w ð Þ is the posterior distribution for θ, and the integral is set over the parametric space Θ. In this research, both the NIPs and IPs were engaged. The NIPs were specified by considering there is least or no external information accessible about the parameters, separate from the data. To generate the NIPs for the GEV parameter designated θ = μ,σ, κ ð Þ, the parametrization φ =logσ is done in the place of σ due to more manageable tasks in the specification of prior and to secure the positivity of scale parameter σ. Since for priors specification, the joint density for θ was supposed in the following form The following marginal independent NIPs in different studies (Coles and Tawn, 2005;Fawcett and Walshaw, 2008;Eli et al., 2012;Diriba et al., 2017;Ahmad et al., 2019;Diriba and Debusho, 2020) were used.
These are known as independent Gaussian priors with mean 0 and large variances (e.g., N 0, 10,000 ð Þindicates Gaussian distribution with 0 mean and 10,000 variances). The higher variances of the distribution enforce enough to create flat marginal priors, which confirm the lack of external information. On the other hand, the IPs were built by using the procedure given by (Coles and Tawn, 1996), that is, prior knowledge was provoked in terms of extreme quantiles. The method engaged for GEV is briefly described in the following paragraph.
Remember that the return level w p i , i= 1, 2, 3 in expression (15) with p 1 >p 2 >p 3 , be the quantiles calculated corresponding to T return period from historical extreme rainfall data of two suitable sites over the country. For example, the quantiles w p i are estimated independently for both Mohenjo-daro and D.I. Khan stations by replacing MLEs of GEV parameters in Equation (2). Coles and Tawn (1996) discuss a joint prior distribution for GEV parameters generated from extreme quantile (w p 1 , w p 2 ,w p 3 ) by employing given probabilities p 1 >p 2 >p 3 . One minor complication with these techniques is that the quantiles w p i , i=1, 2, 3 must be in natural order (e.g., w p 1 <w p 2 <w p 3 ); hence, the fundamental assumption of independent priors w p i ,i =1, 2, 3 would not be fulfilled. Subsequently, they recommended to use the quantile differences:w where w p 0 denotes a physically lower endpoint of the process variable (e.g., rainfall) and supposed to be w p 0 =0. It can be noticed that the change in quantile endorses the ordering of quantiles. Since independent marginal priors based on the quantile differences are now supposed to be independent gamma distribution with parameters (ν i , γ i ), i= 1, 2, 3, and can be written in the following form: From Equations (19) and (20), we can develop the joint prior for the (ν i , γ i ), i =1, 2, 3, from the Gamma distribution in the following formw Then the joint prior for (w p 1 , w p 2 , w p 3 ), by considering w p 0 =0, is stated as g w p 1 , w p 2 , and has written in a short form with w p 1 <w p 2 <w p 3 discussed by Diriba et al. (2017), Ahmad et al. (2019), and Diriba and Debusho (2020). Then incorporating expression (15) in Equation (21) and multiply by the Jacobian of the transformation from w p 1 , w p 2 , w p 3 À Á ! θ = μ, σ, κ ð Þ, it provides an expression for the prior in terms of the GEV parameter vector θ. Moreover, it has the following form for w p 1 <w p 2 <w p 3 . According to Diriba and Debusho (2020), the Jacobean used in (22) where z i = −log(1−p i ), i = 1, 2, 3.

| Assessment measures
Assessment measures were used to compare the performance between the classical or frequentist (i.e., MLE and LM approaches) and the Bayesian MCMC paradigm with NIPs and IPs in estimating GEV parameters and RLs for ADMRS recorded from different weather stations over the country. Moreover, these measures could distinguish the accuracy among results obtained through classical and Bayesian procedures. So, the proposed assessment measures are relative root mean square error (RRMSE), relative absolute error (RAE). Both measures encompass evaluating the difference between the observed and the estimated values corresponding to the assumed distribution (Ahmad et al., 2019). The mathematical form of the tests are given in (24) and (25) where w j:n means the observed sample values of jth order statistics of a random sample, whileq F j À Á are estimated quantiles parallel to jth Weibull plotting position F =j= n +1 ð Þ, where j shows the ranks of the data. The method that gained the lowest RRMSE and RAS would be considered an efficient method for modelling Pakistan's extreme rainfall data.

| Generalized extreme value distribution using frequentist methods
In this section, the GEV model of Equation (2) with a block maxima was fitted to ADMRS using MLE and LM methods. Moreover, the parameter estimates (μ,σ, andκ) with their standard errors (SE) of the GEV stationary model are given in Table 2.It can be noted from Table 2 that the shape parameterκ obtained through the LM method is less than zero for the Hyderabad station. This specifies a bounded upper tail to the distribution of ADMRS. The negative shape parameter suggests a heavier tail for GEV distribution, which offers smaller quantiles, particularly when the quantiles for ADMRS are estimated for longer return periods (Hosking, 1990).
The estimated RLs for different return periods are presented in Table 3, for both MLE and LM procedures. As a result, the RL estimates are smaller and consistent for the LM method, while RLs are larger for the MLE method. Moreover, the variations in RLs could be due to the skewness in the ADMRS. Consequently, to distinguish the summary of a series for skewed distributions, the median is more energetic to deal with outliers than the mean. The variation between the median and the mean can signify the magnitude of unusual values in the RLs (Diriba and Debusho, 2020). Hence, this task is further examined with the Bayesian paradigm application.

| Bayesian modelling of ADMRS using noninformative and informative priors
This section deals with inferences about ADMRS obtained through Bayesian analysis by NIPs and IPs support. The IPs were constructed independently for Mohenjo-daro and D.I. Khan. Therefore, two sites were used to elicit prior distributions. Both sites were lying at different distances from the observed sites. The distance's influence on the parameters and the RLs were evaluated in general.
The NIPs were built for the GEV parameters θ = μ, σ, κ ð Þby assuming there is no reliable prior information about the process to express the prior distributions apart from the data. Thus, the priors joint density for θ given in expression (17) was assumed with the parametrization φ =logσ. Hence, the noninformative independent priors given in (18) were incorporated. The scale parameter of GEV (i.e., σ) was reparametrized as φ =logσ to hold the positive property of variance. The Gaussian distributions having zero mean and higher variances in expression (18) enforce the flat independent marginal priors (also known as diffuse or vague priors), which exhibit the lack of external information (Eli et al., 2012;Ahmad et al., 2019).
The MCMC procedure with the Metropolis-Hasting (M-H) recipe was used for generating the samples from posterior distributions. By applying H-M, 50,000 iterations were produced for all sites, of which the initial 10,000 were burn-in. For simulation, different starting points were considered to perceive the chain convergence. Hence, all the chains had mixed well with the original data. The posterior means (PMs), standard deviations (SDs), and 95% CIs via NIPs for the GEV parameters for different sites are given in Table 4. Hence, it can be observed that the PMs and SDs are close to MLEs and LM estimates of GEV parameters except for the shape parameter estimates for various sites obtained through L-moments (see Table 2). It is anticipated that for vague priors PMs would be close to L-moments and MLEs as they incorporate slight evidence to the likelihood. On the other hand, the historical data of rainfall of two weather stations, namely Mohenjo-daro and D.I. Khan were used to formulate the IPs. Using the procedure given in Section 2.5, the prior distributions for the GEV model were built with the support of quantiles andw p 3 G 0:294, 282:395 ð Þ . The PMs with their SDs and 95% CIs of GEV parameters from IPs are also given in Table 5. The findings show that PMs for the GEV model parameters from the settings of informative priors are very close to the results gained from the NIPs. Also, the IPs built from Mohenjo-daro and D.I. Khan stations abridged the posterior SDs of GEV parameters for various sites than the SDs obtained via NIPs and frequentist methods. The smaller SDs indicate a reduction in uncertainty. This is happened due to the use of historical information from two nearby weather stations over the country.
To understand how GEV parameters were affected by the IPs based on the historical data of two different weather stations. The posterior densities (PDs) of the parameters found through NIPs and IPs were compared. The estimated densities of GEV parameters (μ, σ, and κ) for three sites, namely Hyderabad, Khuzdar, and Chitral are plotted in Figure 2. The posterior densities of the parameters of the model for remaining are given in Figure S1 (see, e.g., supporting information file). Notice that the distributions of location parameters are symmetric for all three sites. On the other hand, the distributions of scale and shape parameters are positively skewed. Furthermore, the location, scale, and shape parameters for IPs are built using the information from Mohenjo-daro and D.I. Khan showing high peaks. The fluctuations in posterior densities indicate that the posterior distributions are sensitive to the IPs from which the prior knowledge was produced.
As we discussed earlier, IPs were based on the knowledge that considered the functions of two stations over the country with the combination of mean and quantiles of the data. Weather conditions of those regions from which the data acquired are assumed to be relatively homogeneous. For instance, the Lahore station was selected by pretending that the environmental conditions of other cities of the province Punjab similar to Lahore. According to our evaluation, the estimates of GEV parameters based on IPs are susceptible. Therefore, selecting appropriate weather stations for the formulations of IPs is also an important task. The validation and robustness of estimates are verified by observing the effect of other IPs on the model parameters. For this purpose, the IPs were elicited by exploiting the Jaisalmer station historical records of the neighbouring country India. Figure S3 (see, e.g., supporting information file) shows that the influence of IPs on parameters of the models for Rohri data was similar to IPs generated from Mohenjo-daro and D.I. Khan historical records. Based on this evidence, our current methodology could be extended easily to other ungagged sites of the country. Moreover, the same discussion can be followed to interpret the remaining stations namely Rohri, Nawabshah, Chhor, Lasbela, Pasni, Jacobabad, Drosh, and Lahore. Generally, the above-discussed sites are very important from a geographical point of view over the country. A spatial map of the country and three sites (namely Hyderabad, Khuzdar, and Chitral) with their neighbouring areas in concern division are presented in Figure 3. In addition, Figure 3b covered the area of the Hyderabad division of the Sindh province and the red area indicates the Hyderabad city which was quite affected during the latest monsoon seasons. The left bottom map described the Kalat division of the Balouchistan province. The red colour indicates the Khuzdar district of the Kalat division. This division also received a lot of damages during monsoon seasons. Figure 3d presented the Malakand division of KPK province, and the red colour indicates the Chitral district. During the 2020 monsoon, the Malakand division got extreme rain events and faced land sliding problems, and flashed floods. Furthermore, the daily life of the people in these areas of the country is mostly affected by heavy rainfalls during every monsoon. The modelling presented in this paper could be very helpful in policymaking and the country's development.

| Influence of priors on return levels
To inspect the influence of the NIPs and IPs on RLs, the posterior density plots for (0<p<1) were constructed by considering the vector of observations. Consequently, these are obtained from the marginal posterior distributions of GEV parameters. For instance, the posterior densities for 10, 25, 50, 100, and 500 years were obtained against the different posterior distributions. The RLs are also sensitive in the context of the choice of values p (e.g., p=0:1, 0:04, 0:02, 0:01, and 0:005).
The posterior densities plots of site Khuzdar for 10-, 25-, 50-, 100-, and 500 years RLs based on NIPs and IPs are F I G U R E 3 (a) The Pakistan spatial map with the elevation above the sea level (m), (b) the Hyderabad division of the province Sindh, the red colour indicates Hyderabad district, which is most affected due to heavy rains, (c) the Kalat division of the province Baluchistan, the red colour represents the affected site namely Khuzadar, and (d) described the Malakand division of the province Khyber Pakhtunkhwa, the red colour indicates the affected area Chitral with substation Darosh [Colour figure can be viewed at wileyonlinelibrary.com] presented in Figure 4. Also, the posterior densities plots for the RLs of the remaining sites namely Rohri, Nawabshah, Hyderabad, Chhor, Lasbela, Pasni, Jacobabad, Drosh, Chitral, and Lahore are presented in Figure S2 (see for example, supporting information file). From the plots given in Figure 4, it can be realized that the IPs also affected the RLs distribution. Furthermore, the PDs of IPs have appeared with high peaks as compared with PDs of NIPs. Also, to some extent, the distributions of all RLs are skewed to the right side. On the other hand, the posterior densities for higher RLs of stations (i.e., Rohri, Hyderabad, and Jacobabad) are not interpretable for the case of NIPs. So far, when comparing MLE with the LM method in a frequentist framework, we acquired different mean RLs, which could be due to the heavy tail of the GEV distribution or skewness detected in the data. The mean RLs obtained for Jacobaabad weather station through MLE are very high as compared to the LM method. Thus, the RLs estimates could be improved for all sites by choosing suitable summary measures for the inference. Generally, the posterior medians could be used in place of means in the Bayesian setting. Additionally, the skewed densities of return levels indicate the uncertainty inside the model for developing reasonable upper bounds of the return levels as compared to lower limits for higher return periods (Coles and Tawn, 2005) and (Ahmad et al., 2019). Therefore, posterior medians were obtained as the best choice than the posterior mean for 10, 25, 50, 100, and 500 years RLs of ADMRS using NIPs and IPs. The RLs via NIPs are provided in Table 6 while the RLs based on the IPs are given in Table 7.
The posterior medians of RLs obtained via IPs are close to RLs based on MLEs except for the Jacobabad weather station. From the results of Jacobabad station. On the other hand, RLs calculated through L-moments are smaller as compared to other methods. This might be happened due to the negativity of the shape parameter. The best estimation method for future modelling could be decided by assessment measures. For this purpose, we used two assessment measures. A useful discussion for those procedures was given in the next section.

| Model selection through assessment measures
The methods used for the analysis and/or for the modelling of extreme rainfall data were compared in this section. The comparison task was carried out by using the assessment measures namely RRMSE and RAE given in Equations (24) and (25), respectively. Also, Table 8 was created by using the "Fgmutils" R package. From the findings of reputation of Bayesian inference for the extreme rainfall data. This approach deals with the uncertainties linked with excesses of weather variables efficiently. Essentially, the IPs for the Bayesian method constructed from surrounding weather stations increase the accuracy of the parameter estimates than the frequentist approaches. This study partially supports the Bayesian paradigm results of (Ahmad et al., 2019). Even though it is arguable to say which of the methods offer accurate estimates, it can be contended that the supplement of uncertainties via IPs in the Bayesian framework significantly enhanced the findings of the estimates for ADMRS at different weather stations over the country. Conversely, the precision of the estimates could be more improved with a suitable choice of weather stations used for the elicitation of IPs.

| CONCLUSION
In this paper, our attention was to develop the modelling of extreme rainfall patterns all over the country by applying frequentist and Bayesian methods. Frankly speaking, we were unable to analyse the data of all stations across the country. Therefore, the ADMRS of suitable weather stations of the country were chosen by keeping in mind that the environmental conditions are homogenous across the stations in the provinces of the country. The data used for this study from different weather stations around the country were shown to follow the family of extreme value distributions (i.e., GEV distribution). In a frequentist setting, the parameters of GEV distribution were estimated through MLE and LM methods. Furthermore, RLs for (10, 25, 50, 100, and 500 years) were also calculated for MLE and LM methods. The RLs based on LM were showed consistency while examined by the birds-eye view. But, the results obtained from both methods provided evidence that there would be extreme rain events across the country in the future.
Modelling the behaviour of such extremes events within the Bayesian paradigm at different weather stations throughout the country offers more beauty to this paper. Bayesian MCMC is respected when climatic indications are unusual, and also the behaviour of extreme rainfall is similar over the region from which the data were acquired. Consequently, in these circumstances, the authors have preferred a Bayesian paradigm over the frequentist methods. This needs a genuine construction of IPs, thus it provides great estimation accuracy. Similar to the frequentist setting, in the Bayesian framework, the parameter of GEV distribution and RLs for (10, 25, 50, 100, and 500 years) were estimated via NIPs and IPs formulated from two suitable weather stations over the country. Moreover, the parameter estimates and RLs for GEV were sensitive to those sites used for the elicitation of IPs. Consequently, the present study also supports proper choices of the neighbouring stations, since the devising of IPs is significant as the estimates and the accuracies are profound to these priors.
Additionally, assessment measures were used to adopt a superior method for modelling ADMRS among frequentist and Bayesian approaches. The smaller values of assessment measures proved the precision of the Bayesian MCMC method associated with IPs. Thus, our current methodology could be implemented easily to other ungagged sites of the country. Also, the information from the neighbouring countries (for instance, India, Afghanistan, and Iran) could be utilized as prior knowledge. On the other hand, the findings of the proposed method could be very helpful for policymakers and hydrologists. Hence, engineers can take help from this study in designing dams, bridges, culverts, and flood control devices in Pakistan. The study could be improved more by considering nonstationary rainfall series and by inspecting a linear time trend in the location parameter of GEV, and also with the exercise of spatial modelling.