## 1. Introduction and General Overview

[2] Many hydrological phenomena are characterized by properties that are (or seem to be) stochastically correlated. These features are often treated as correlated random variables and modeled by multivariate joint distribution functions. In particular, the application of multivariate frequency analysis to hydrological variables has grown quickly since the introduction of copulas in hydrology and geosciences by *De Michele and Salvadori* [2003]. The possibility of splitting the statistical inference of the marginal distributions and the dependence structure (or copula) provided by Sklar's theorem [*Sklar*, 1959], along with the increasing availability of open source statistical software, has simplified the analysis and allowed the construction of new flexible joint distribution functions with heterogeneous dependence structures and marginal distributions. In the last 10 years, copula-based joint distributions have been defined for a number of hydrological phenomena that are deemed to be characterized by correlated features, such as the peak, volume, and duration of flood events; the peak, mean intensity, duration, and volume of hyetographs; or the peak, severity, duration, and areal extension of droughts. A large part of the literature on multivariate frequency analysis of hydrological data has focused on the inference procedures, providing a copula-based joint distribution and its lower-dimensional marginal and conditional distributions as a final result [e.g., *Yue et al*., 1999; *Yue*, 2000, 2001a, 2001b; *Grimaldi and Serinaldi*, 2006; *Zhang and Singh*, 2007; *Karmakar and Simonovic*, 2009; *Ghizzoni et al*., 2010]. The resultant joint and conditional probabilities are sometimes transformed into return periods by relationships such as *T* = *μ*/(1 − *p*), where *μ* is the average interarrival time between two events, *p* is a generic probability of nonexceedance (univariate, conditional, or multivariate), and *T* is the return period corresponding to *p*.
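The return period relation is simple enough to state as a short function; the following sketch assumes the standard form *T* = *μ*/(1 − *p*) with illustrative values, not ones taken from the case studies cited above.

```python
def return_period(p: float, mu: float = 1.0) -> float:
    """Return period T = mu / (1 - p), where mu is the average
    interarrival time between two events and p is a generic
    nonexceedance probability (univariate, conditional, or multivariate)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return mu / (1.0 - p)

# With annual maxima (mu = 1 year), p = 0.99 corresponds to the 100-year event.
print(return_period(0.99))
```

The same relation applies unchanged whether *p* is a univariate, conditional, or joint probability, which is precisely why the resulting return periods need careful interpretation in the multivariate case.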

[3] However, moving from the univariate framework to the multivariate one, a number of practical problems arise concerning the inference procedures and their reliability, the interpretation of the results, and their operational use in practical design problems. In this respect, only a few studies have gone beyond the inference stage and tackled the problems posed by real-world hydrologic engineering applications. A parallel with univariate frequency analysis can help to introduce the key aspects that have likely prevented a wider (and appropriate) application and diffusion of multivariate frequency analysis. In a univariate framework, considerable effort has been devoted over several decades to finding the distribution that most accurately fits the data at hand. With this objective, a variety of multiparameter distributions, more or less theoretically based and flexible, have been suggested [*Rao and Hamed*, 2000], along with a number of inference procedures such as moments, L-moments, maximum likelihood, maximum entropy, and Bayesian techniques. However, this pursuit of high accuracy and a perfect fit is often frustrated by the limited sample size of hydrological data, which implies large uncertainty in the extreme quantiles. This uncertainty must be clearly defined, assessed, and taken into account in design applications [*Montanari*, 2007, 2011; *Xu et al*., 2010]. Even in a univariate framework, the uncertainty is not always easy to incorporate into design practice and to communicate [*Di Baldassarre et al*., 2010]. Therefore, practical problems are often solved by using simple point estimates. However, in univariate frequency analysis, rather simple analytical and Monte Carlo methods allow us to explore the uncertainty of extreme quantiles, such as extreme flood peaks, and to assess its impact on the final output of the risk analysis. In addition, the lack of information resulting from small sample sizes is well known, and several approaches, such as regionalization procedures [*Hosking and Wallis*, 1997; *Reed*, 2002] and information merging [*Reed*, 2002; *Merz and Blöschl*, 2008a, 2008b; *Viglione et al*., 2013], have been suggested to reduce it.
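As an illustration of the "rather simple Monte Carlo methods" mentioned above, the following sketch bootstraps a 95% confidence interval for the 100-year quantile of a Gumbel distribution fitted by the method of moments. The sample, its size, and all parameter values are hypothetical and serve only to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(42)
EULER = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_fit_moments(x):
    """Method-of-moments estimates of the Gumbel location and scale."""
    scale = np.std(x, ddof=1) * np.sqrt(6.0) / np.pi
    loc = np.mean(x) - EULER * scale
    return loc, scale

def gumbel_quantile(p, loc, scale):
    """Inverse CDF of the Gumbel distribution."""
    return loc - scale * np.log(-np.log(p))

# Synthetic "observed" annual maxima: a small sample, as is typical in practice.
sample = rng.gumbel(loc=10.0, scale=3.0, size=40)

# Nonparametric bootstrap of the 100-year quantile (p = 0.99).
estimates = []
for _ in range(2000):
    boot = rng.choice(sample, size=sample.size, replace=True)
    estimates.append(gumbel_quantile(0.99, *gumbel_fit_moments(boot)))
lo, hi = np.percentile(estimates, [2.5, 97.5])
print(f"95% bootstrap CI for the 100-year quantile: [{lo:.1f}, {hi:.1f}]")
```

With only 40 observations, the interval is typically wide relative to the point estimate, which is the essence of the sampling uncertainty discussed in the text.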

[4] In multivariate frequency analysis, the situation is similar. Most of the literature focuses on finding the joint distribution that best fits the multivariate empirical distribution of the data without accounting for the uncertainty, and thus the reliability, of the estimates and their applicability to design problems, whereas regionalization procedures are at an early stage of development [*Chebana and Ouarda*, 2009]. As the small size of hydrological samples has a dramatic impact on the reliability of predictions of extreme quantiles [*Klemeš*, 2000a, 2000b; *Reed*, 2002; *Serinaldi*, 2009; *Xu et al*., 2010], a strong impact is also expected in a multivariate framework, in which the uncertainty of the marginal distributions combines with the uncertainty of the dependence structure, whereas the sample size generally remains small (e.g., annual maxima). Moreover, in a multivariate context, an additional problem is the existence of infinitely many combinations of the studied variables (e.g., flood peak and duration) that share the same joint probability even though their impacts on the design can be very different [e.g., *Salvadori et al*., 2011]. This makes the selection of appropriate design events more difficult than in the univariate analysis.

### 1.1. Overview of Multivariate Design Events

[5] To make the presentation clearer, and with no loss of generality, it is worth visualizing the concepts mentioned above by referring to the bivariate case. Figure 1 shows the bivariate cumulative distribution function (CDF) and probability density function (PDF) of two generic random variables *X* and *Y* that follow a Gumbel logistic model [*Gumbel and Mustafi*, 1967], resulting from the combination of standard Gumbel marginal distributions and a Gumbel-Hougaard copula with a given value of the association parameter (and of the corresponding Kendall correlation coefficient). The figure also shows 5000 realizations simulated from this joint distribution and the level curves obtained by cutting the CDF and PDF surfaces with horizontal planes. Each level curve of the CDF is the locus of points (*x*, *y*) sharing the same joint probability of nonexceedance *p*, and each level curve of the PDF is the locus of points sharing the same joint density. As specified by *Salvadori et al*. [2011], these curves become *p*-level surfaces and hypersurfaces when the distribution is trivariate or multivariate. Keeping the discussion in two dimensions, Figure 1 illustrates that a virtually infinite number of combinations of *X* and *Y* share the same joint probability of nonexceedance *p*. However, the simulations show that some points lying on a *p*-level curve are more likely to be simulated than others, and the cloud of points obviously reflects the shape of the joint density. Therefore, even though all points on a *p*-level curve have the same joint probability, under the hypothesis that the process is bivariate logistic, the different likelihood of each point must be taken into account to select appropriate design scenarios.
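To make the notion of a *p*-level curve concrete, the sketch below traces one such curve for a Gumbel-Hougaard copula with standard Gumbel marginals. The association parameter θ = 2 is an arbitrary illustrative value, not the one behind Figure 1; for a fixed *p*, the copula equation C(*u*, *v*) = *p* is solved for *v* in closed form and the result is mapped to the data scale through the marginal quantile function.

```python
import numpy as np

def p_level_curve(p, theta, n=200):
    """Points (x, y) on the p-level curve of a Gumbel-Hougaard copula
    C(u, v) = exp(-((-ln u)**theta + (-ln v)**theta)**(1/theta))
    with standard Gumbel marginals F(x) = exp(-exp(-x)).
    For each u in (p, 1), solve C(u, v) = p for v, then map both
    coordinates from the copula scale to the data scale."""
    s = (-np.log(p)) ** theta
    u = np.linspace(p + 1e-9, 1 - 1e-9, n)
    v = np.exp(-(s - (-np.log(u)) ** theta) ** (1.0 / theta))
    to_gumbel = lambda w: -np.log(-np.log(w))  # standard Gumbel quantile
    return to_gumbel(u), to_gumbel(v)

x, y = p_level_curve(p=0.99, theta=2.0)
```

Every pair (x, y) returned by this function shares the same joint nonexceedance probability *p*, which is exactly the multiplicity of design scenarios discussed in the text.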

[6] The problem of selecting design events along a *p*-level curve of a bivariate CDF has been well known since at least the mid-1980s, when it was tackled, for instance, by *Sackl and Bergman* [1987] in flood frequency analyses involving flood peak *Q* and flood volume *V* [see also *Bergman and Sackl*, 1989; *Bergmann et al*., 2001]. Namely, *Sackl and Bergman* [1987] used the peak runoff time to define an appropriate subset of a *p*-level curve through a physical boundary. After being overlooked for some years, the problem was recently reconsidered by *Chebana and Ouarda* [2011] and *Salvadori et al*. [2011], who exploited the ease of mathematical manipulation offered by copulas and the advances in multivariate frequency analysis. In particular, *Chebana and Ouarda* [2011] used geometric properties of the joint CDF *p*-level curves to distinguish a so-called “proper part” of the curves (which corresponds to the central, truly curved part) from a “naïve part” (the straight horizontal and vertical segments; Figure 1, bottom left). This approach is free from physical considerations and can be applied to every pair of random variables, thus generalizing to some extent the early idea of *Sackl and Bergman* [1987]. Almost simultaneously, *Salvadori et al*. [2011] suggested another general approach to restrict the range of design events with probability *p*. Using the copula-based formalism, and denoting a *p*-level region with the unique and general term critical layer, they introduced the idea of using a suitable function that weights the realizations lying on the critical layer of interest. Following this approach, one can freely choose the weight function that best fits the practical needs. One of these weight functions could be the joint PDF, which describes the likelihood associated with each point lying on a CDF *p*-level curve (or critical layer). Therefore, one of the solutions proposed by *Salvadori et al*. [2011] is to choose the point with the highest likelihood along the *p*-level curve. It should be noted that focusing on the density along a *p*-level curve corresponds to considering the conditional distribution resulting from the integration of the joint density along the curve. Following this reasoning, *Volpi and Fiori* [2012] derived such a distribution and defined the proper part of the *p*-level curve as the subset of points delimited by the two points that bound a given fraction of the area under the conditional distribution mentioned above. This method introduces a probabilistic selection of the proper part of a *p*-level curve, thus generalizing the approach of *Chebana and Ouarda* [2011]. Two examples of these conditional distributions are shown in Figure 2. The figure illustrates two *p*-level curves corresponding to the logistic model described previously. The simulated sample of size 5000 is also reported to better visualize how the conditional distributions reflect the spread of the points that follow the logistic model along the *p*-level curves.
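The idea of weighting points on a critical layer by the joint density can be sketched numerically. The code below traces a *p*-level curve of a bivariate Gumbel logistic CDF (θ = 2 and *p* = 0.99 are arbitrary illustrative values, not the parameters behind Figure 2), approximates the joint density along the curve by central finite differences of the CDF, and picks the most likely point, i.e., the "highest likelihood" selection rule attributed above to *Salvadori et al*. [2011].

```python
import numpy as np

THETA = 2.0  # illustrative association parameter (an assumption)
P = 0.99     # joint nonexceedance probability of the level curve

def joint_cdf(x, y, theta=THETA):
    """Bivariate Gumbel logistic CDF with standard Gumbel marginals:
    F(x, y) = exp(-(exp(-theta*x) + exp(-theta*y))**(1/theta))."""
    return np.exp(-(np.exp(-theta * x) + np.exp(-theta * y)) ** (1.0 / theta))

def joint_pdf(x, y, h=1e-3):
    """Joint density approximated by central finite differences of the CDF."""
    return (joint_cdf(x + h, y + h) - joint_cdf(x + h, y - h)
            - joint_cdf(x - h, y + h) + joint_cdf(x - h, y - h)) / (4.0 * h * h)

# Trace the p-level curve: solve F(x, y) = P for y as a function of x.
s = (-np.log(P)) ** THETA
x_min = -np.log(-np.log(P))               # marginal p-quantile
xs = np.linspace(x_min + 1e-4, x_min + 8.0, 2000)
ys = -np.log(s - np.exp(-THETA * xs)) / THETA

# The most likely design event is the point of maximum density on the curve.
dens = joint_pdf(xs, ys)
i = int(np.argmax(dens))
print(f"most likely point on the {P}-level curve: ({xs[i]:.2f}, {ys[i]:.2f})")
```

For this exchangeable model the maximum falls on the diagonal x = y, as expected from the symmetry of the logistic dependence structure; normalizing `dens` by its integral along the curve would yield the conditional distribution of *Volpi and Fiori* [2012].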

### 1.2. Overview of Uncertainty of Joint Quantiles

[7] The problem of the uncertainty that affects a joint distribution estimated from a small sample has been tackled in very few studies. For instance, *Huard et al*. [2006] and *Silva and Lopes* [2008] considered a Bayesian approach to improve the accuracy of model selection. Even though this method could in principle be devised to explore the uncertainty of the joint quantiles, the authors did not investigate this aspect. *Durante and Salvadori* [2010] and *Salvadori and De Michele* [2011] recognized the need to account for uncertainty, suggesting caution about the reliability of the inference results and their practical application. In spite of its importance, uncertainty assessment seems to have been widely overlooked in multivariate frequency analysis, even though it is a prominent aspect of univariate frequency analysis and can often make the applicability of the results questionable.

[8] Following *Montanari* [2011], uncertainty can be classified as uncertainty related to data, inherent (or structural) uncertainty, and epistemic uncertainty. In this study, inherent uncertainty refers to the intrinsic behavior of hydrological processes, is irreducible, and is described by probabilistic models (univariate and multivariate distribution functions), whereas epistemic (reducible) uncertainty refers to the model and parameter uncertainty of the marginals and the copula related to the limited size of hydrological records. Model and parameter uncertainty is epistemic in that it can be reduced as new information (data and metadata) becomes available. In particular, the uncertainty of the copula parameters commonly leads to stronger or weaker structures of dependence around the point estimates. In a bivariate case, this results in changes of the shape and curvature of the *p*-level curves (Figure 3a). On the other hand, the marginal uncertainty commonly entails a shift of the *p*-level curves along the two axes due to the fluctuations of the position, scale, and shape parameters of the marginal distributions (Figure 3b). Both sources of uncertainty combine randomly (Figure 3c) and must be carefully considered. Moreover, even when the model is perfectly specified, quantile estimates are characterized by the sampling uncertainty related to the size of the analyzed data set. Since large uncertainty usually affects the univariate analysis of extreme values, it is likely to be at least as prominent in a multivariate framework and can be considered the main obstacle to a practical (effective and reliable) use of the quantiles resulting from multivariate (extreme value) frequency analyses. It should be noted that the multiplicity of scenarios lying on a *p*-level region must not be confused with the variability related to the epistemic and sampling uncertainty. Indeed, comparing Figures 2 and 3d, it is clear that exploring a level curve or, equivalently, the conditional distribution defined on it, does not allow the assessment of the uncertainty of its location over the plane.
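A minimal Monte Carlo illustration of the epistemic (parameter) uncertainty discussed above: a synthetic sample is drawn from a Gumbel-Hougaard copula via the standard Marshall-Olkin construction (with Kanter's positive-stable sampler), and the copula parameter, estimated by inversion of Kendall's τ (θ = 1/(1 − τ)), is bootstrapped. The sample size, θ = 2, and the bootstrap settings are all illustrative assumptions, not values from this paper's case studies.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_gumbel_copula(n, theta, rng):
    """Sample n pairs (u, v) from a Gumbel-Hougaard copula via the
    Marshall-Olkin construction: u_i = exp(-(e_i / s)**(1/theta)),
    with s a positive stable variate drawn by Kanter's method."""
    alpha = 1.0 / theta
    t = rng.uniform(0.0, np.pi, n)           # Kanter's angular variate
    w = rng.exponential(1.0, n)
    s = (np.sin(alpha * t) / np.sin(t) ** (1.0 / alpha)
         * (np.sin((1.0 - alpha) * t) / w) ** ((1.0 - alpha) / alpha))
    e = rng.exponential(1.0, (2, n))
    return np.exp(-(e / s) ** alpha)

def kendall_tau(x, y):
    """Kendall's tau (O(n^2); ties from resampling slightly shrink it)."""
    n = len(x)
    iu = np.triu_indices(n, 1)
    dx = np.sign(x[:, None] - x[None, :])[iu]
    dy = np.sign(y[:, None] - y[None, :])[iu]
    return (dx @ dy) / (n * (n - 1) / 2)

# "Observed" sample: 50 pairs from a copula with theta = 2 (tau = 0.5).
u, v = sample_gumbel_copula(50, 2.0, rng)

# Bootstrap the copula parameter estimated by tau inversion, theta = 1/(1 - tau).
thetas = []
for _ in range(500):
    idx = rng.integers(0, 50, 50)            # resample pairs with replacement
    tau = kendall_tau(u[idx], v[idx])
    thetas.append(1.0 / (1.0 - tau))
lo, hi = np.percentile(thetas, [2.5, 97.5])
print(f"95% bootstrap interval for theta: [{lo:.2f}, {hi:.2f}]")
```

Each bootstrap value of θ corresponds to a different family of *p*-level curves, so the spread of `thetas` translates directly into the fluctuations of shape and curvature sketched in Figure 3a.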

[9] Based on the above discussion, in this study we first discuss some key points related to the uncertainty assessment of extreme quantiles in univariate frequency analysis. Then, we show how these concepts extend to multivariate distributions, highlighting the difficulty of performing a reliable model specification and of obtaining accurate estimates of extreme multivariate quantiles from small data sets. Simple Monte Carlo procedures are therefore suggested to simulate sets of events that account for both the variability along the *p*-level regions and the epistemic and sampling uncertainty. A methodology to summarize the uncertainty by multidimensional confidence intervals (CIs) is also discussed. The proposed methodology is applied to two case studies already analyzed in the literature. Concluding remarks close the study.