## 1. Introduction

[2] Extreme hydrological events such as floods, droughts and rain storms may have significant economic and social consequences. Hydrological frequency analysis (FA) procedures are essential and commonly used for the analysis and prediction of such extreme events, which have a direct impact on reservoir management and dam design. Flood FA is based on the estimation of the probability of exceedance of the event *x_T* corresponding to a quantile of a given return period *T*, e.g., *T* = 10, 50 or 100 years. The random variable *X* is commonly taken to be the peak of the flood, which is the maximum of the daily streamflow series during a hydrological year or season. Relating the magnitude of extreme events to their frequency of occurrence, through the use of probability distributions, is the principal aim of FA [*Chow et al.*, 1988].
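As a brief illustration of how the return period relates to the quantile, the standard relation can be written as follows. This sketch assumes annual maxima, so that *T* is expressed in years, and it writes *F* for the cumulative distribution function of *X*, a symbol introduced here for illustration only.

```latex
P(X > x_T) = \frac{1}{T},
\qquad
x_T = F^{-1}\!\left(1 - \frac{1}{T}\right)
```

For example, *T* = 100 years corresponds to an exceedance probability of 0.01 in any given year.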

[3] The accurate estimation of the risk associated with the design and operation of water infrastructures requires a good knowledge of flood characteristics. Indeed, an overestimation of the design flood leads to oversized hydraulic structures and therefore involves additional costs, while an underestimation of the design flood leads to material damage and loss of human lives. Flood FA is commonly employed to study this risk. It has traditionally been carried out for the analysis of flood peaks in a univariate context. The reader is referred to *Cunnane* [1987] and *Rao and Hamed* [2000].

[4] In general, a flood is described through a number of correlated characteristics, e.g., peak, volume and duration. The univariate treatment of each flood characteristic ignores their dependence structure. Consequently, the univariate framework is less representative of the phenomenon and reduces the accuracy of risk estimation. Several authors have therefore focused on the joint treatment of flood characteristics through the use of multivariate techniques such as multivariate distributions and copulas [e.g., *Yue et al.*, 1999; *Shiau*, 2003; *Zhang and Singh*, 2006; *Chebana and Ouarda*, 2011a]. Multivariate studies have improved the estimation accuracy and provide information concerning the dependence structure between flood characteristics. The multivariate framework has been applied to several types of hydrological events, such as floods, droughts and storms. For floods, for instance, it is used for hydraulic structure design and extreme event prediction purposes (see *Chebana and Ouarda* [2011a] for recent references).

[5] Despite their usefulness, univariate and multivariate FA approaches have a number of limitations and drawbacks. The separate or joint use of hydrograph characteristics constitutes a major simplification of the real phenomenon. Furthermore, the way these characteristics are determined is neither unique nor objective (in particular, the flood starting and ending dates). In addition, each flood characteristic can be seen as a real-valued transformation of the hydrograph, e.g., the peak is the maximum. For hydrological applications, the bivariate setting is widely employed to treat two hydrological variables. A limited number of studies deal with the trivariate setting, e.g., those of *Serinaldi and Grimaldi* [2007] and *Zhang and Singh* [2007]. Trivariate models generally suffer from limited representativeness and from complex formulations. Note that, in general, the number of associated parameters grows rapidly with the dimension of the model, and the resulting uncertainty increases accordingly. In addition, higher dimensions are not considered in hydrological practice. Finally, given the lack of data in hydrology, working with a limited number of extracted characteristics represents a loss of information in comparison to the overall available series.

[6] The main data source in FA is the daily streamflow series, which over a year constitutes a hydrograph and from which the univariate and multivariate variables are extracted. The total information that is available in a hydrograph is necessary for the effective planning of water resources and for the design and management of hydraulic structures. The entire hydrograph, as a curve with respect to time, can be considered as a single observation within the functional context. In the univariate and the multivariate settings, an observation is a real value and a vector, respectively. Therefore, the functional framework, which treats the whole hydrograph as a functional observation (function or curve), is more representative of the real phenomenon and makes better use of the available data. Figure 1 illustrates and summarizes the three frameworks.
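The distinction between the three kinds of observation can be made concrete with a minimal Python sketch. The hydrograph below is synthetic, and the function name `flood_characteristics` and the flow `threshold` used to delimit the flood are hypothetical choices made only for illustration; as noted above, the determination of flood start and end dates is not unique.

```python
import numpy as np

def flood_characteristics(q, threshold):
    """Extract (peak, volume, duration) from a daily streamflow series.

    q         : daily mean flows in m^3/s (one hydrological year)
    threshold : hypothetical flow level above which a day counts as flood
    """
    q = np.asarray(q, dtype=float)
    above = q > threshold
    peak = float(q.max())                      # maximum daily flow (m^3/s)
    duration = int(above.sum())                # number of days above threshold
    volume = float(q[above].sum() * 86400.0)   # m^3: daily mean flow x seconds per day
    return peak, volume, duration

# Synthetic single-peak hydrograph (illustration only, not real data).
days = np.arange(365)
q = 50.0 + 400.0 * np.exp(-0.5 * ((days - 120) / 15.0) ** 2)

peak, volume, duration = flood_characteristics(q, threshold=100.0)

univariate_obs = peak                                  # a real value
multivariate_obs = np.array([peak, volume, duration])  # a vector
functional_obs = q                                     # the whole curve
```

The univariate and multivariate observations are both reductions of `functional_obs`; changing `threshold` alters the volume and duration but leaves the curve itself untouched.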

[7] In the hydrological literature, there have been some efforts toward a representation of the hydrograph as a function, such as in the study of the design flood hydrograph [e.g., *Yue et al.*, 2002] and in the flow duration curve study [e.g., *Castellarin et al.*, 2004], where the mean, median and variation are presented as curves. These studies underlined the importance of considering the shape of the hydrograph, which is necessary, for instance, for water resources planning, design and management. The shape of flood hydrographs for a given river may change according to the observed storm or snowmelt events. More practical issues and examples related to the hydrograph are discussed by, e.g., *Yue et al.* [2002] and *Chow et al.* [1988]. Note that the main flood characteristics, i.e., peak, volume and duration, cannot completely capture the shape of the hydrograph. The study of hydrographs by *Yue et al.* [2002], and similar studies, are simplistic and limited, as they approximate the flood hydrograph using a two-parameter beta density and consider only single-peak hydrographs. On the other hand, the flow duration curve approach [*Castellarin et al.*, 2004] is in the univariate setting, and the presented functional elements (e.g., mean and median curves) are important but remain limited. These previous studies show the need to introduce a statistical framework to study the whole hydrograph and to perform further statistical analysis. The functional framework is more general and more flexible and can represent a large variety of hydrographs.

[8] Functional data are becoming increasingly common in a variety of fields. This has sparked growing interest in the development of adapted statistical tools for analyzing such data. For instance, *Ramsay and Silverman* [2005], *Ferraty and Vieu* [2006], and *Dabo-Niang and Ferraty* [2008] provided detailed surveys of a number of parametric and nonparametric techniques for the analysis of functional data. In practice, the use of functional data analysis (FDA) has benefited from the availability of appropriate statistical tools and high-performance computers. Furthermore, the use of FDA allows us to make the most of the information contained in the functional data. The aims of FDA are mainly the same as those of classical statistical analysis, e.g., representing and visualizing the data, studying variability and trends, comparing different data sets, as well as modeling and predicting. The majority of classical statistical techniques, such as principal component analysis, linear models, confidence interval estimation and outlier detection, have been extended to the functional context [e.g., *Ramsay and Silverman*, 2005]. FDA has been applied successfully, for instance, to the El Niño climatic phenomenon [*Ferraty et al.*, 2005] and to radar wave curve classification [*Dabo-Niang et al.*, 2007]. *Dabo-Niang et al.* [2010] proposed a spatial heterogeneity index to compare the effects of bioturbation on oxygen distribution. *Delicado et al.* [2008] and *Monestiez and Nerini* [2008] considered spatial functional kriging methods to model different temperature series. Sea ice data are treated in the FDA context by *Koulis et al.* [2009].

[9] The functional methodology constitutes a natural extension of univariate and multivariate hydrological FA approaches (see Figure 1). This new approach uses all available data by employing the whole hydrograph as a functional observation. In other words, FDA makes it possible to analyze hydrological data exhaustively by conducting a single analysis on the whole data set instead of several univariate or multivariate analyses. In addition, the approach proposed by *Yue et al.* [2002] can be generalized in the FDA context, where it becomes more flexible and includes hydrographs with different shapes, such as multipeak ones.

[10] Given the above arguments, for hydrological applications, the functional context can be seen as an alternative framework to the univariate and multivariate ones, or it can be employed as a parallel complement that adds insight to the results obtained with the two other frameworks. The main objective of the present paper is to draw attention to the functional nature of the data, which can be exploited in all statistical techniques for hydrological applications through the FDA framework. A second objective is to introduce some of the FDA techniques, point out their advantages and illustrate their applicability in the hydrological framework. In the present paper, we focus on hydrological FA.

[11] Four main steps are required in order to carry out a comprehensive hydrological FA: (1) descriptive and exploratory analysis and outlier detection, (2) verification of FA assumptions, i.e., stationarity, homogeneity and independence, (3) modeling and estimation, and (4) evaluation and analysis of the risk. Step 1 is commonly carried out in univariate hydrological FA as pointed out, e.g., by *Rao and Hamed* [2000], *Kite* [1988], and *Stedinger et al.* [1993], whereas in the multivariate framework it was investigated recently by *Chebana and Ouarda* [2011b]. Contrary to the univariate setting, exploratory analysis in the multivariate and functional settings is not straightforward and requires more effort. Table 1 summarizes the four FA steps and their status in each of the three frameworks. As indicated there, the specific aim of the present paper is to treat step 1, which deals with data visualization, location and scale measures as well as outlier detection. A new nongraphical method to detect functional outliers is also proposed in the present paper. The presented techniques are applied to floods on the basis of daily streamflow series from a basin in the province of Quebec, Canada.

| FA Steps | Univariate Framework | Multivariate Framework | Functional Framework |
|---|---|---|---|
| 1. Exploratory analysis and outlier detection | Large body of literature: Cunnane [1987], Kite [1988], Stedinger et al. [1993], Rao and Hamed [2000] | Very sparse body of literature: Chebana and Ouarda [2011b] | The specific aim of the present paper |
| 2. Checking the FA assumptions: stationarity, homogeneity, independence | Large body of literature: Yue et al. [2002], Kundzewicz et al. [2005], Khaliq et al. [2009] | Very sparse body of literature: Chebana et al. [2010] | To be developed |
| 3. Modeling and estimation | Large body of literature: Cunnane [1987], Bobée and Ashkar [1991] | Large body of recent literature: Shiau [2003], Zhang and Singh [2006], Salvadori et al. [2007] | To be developed |
| 4. Risk evaluation and analysis | Large body of literature: Chow et al. [1988] | Sparse but growing body of literature: Shiau [2003], Chebana and Ouarda [2011a] | To be developed |

Note that in the univariate framework, step 1 is straightforward and is generally not treated separately. The references are given only as examples from the literature because of space limitations.

[12] Exploratory data analysis as a preliminary step of FA is useful for the comparison of hydrological samples and for the selection of the appropriate model for hydrological variables. It consists of a close inspection of the data to quantify and summarize the properties of the samples, for instance, through location and scale measures. Outliers can have negative impacts on the selection of the appropriate model as well as on the estimation of the associated parameters. In order to base the inference on the right data set, the detection and treatment of outliers are also important elements of FA [*Barnett and Lewis*, 1998]. Therefore, it is essential to start with the basic analysis (step 1) in order to perform a complete functional FA.
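To give a rough idea of what location and scale measures can look like for curves, the following Python sketch computes pointwise mean, median and standard deviation curves for a sample of hydrographs. It is only a simple illustration on synthetic data: pointwise summaries are an assumption made here for brevity and are not necessarily the functional measures developed in this paper, and all variable names are hypothetical.

```python
import numpy as np

# Synthetic sample of annual hydrographs (illustration only, not real data):
# one row per year, one column per day.
rng = np.random.default_rng(0)
days = np.arange(365)
n_years = 30
peaks = rng.uniform(300.0, 600.0, size=n_years)   # peak amplitude above base flow
timing = rng.normal(120.0, 10.0, size=n_years)    # day of the flood peak
hydrographs = 50.0 + peaks[:, None] * np.exp(
    -0.5 * ((days[None, :] - timing[:, None]) / 15.0) ** 2
)

# Pointwise location and scale summaries, each a curve of length 365.
mean_curve = hydrographs.mean(axis=0)
median_curve = np.median(hydrographs, axis=0)
std_curve = hydrographs.std(axis=0, ddof=1)       # sample standard deviation per day
```

Each summary is itself a curve over the year, so the location and scale of the sample are described day by day rather than by a single number or vector.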

[13] This paper is organized as follows. The theoretical background of functional statistical methods is presented in section 2 in its general form. In section 3, the functional framework is adapted to floods. The functional FA methods are applied, in section 4, to a real-world case study representing daily streamflows from the province of Quebec, Canada. A discussion as well as a comparison with multivariate FA are also reported in section 4. Conclusions and perspectives are presented in section 5.