## 1. Introduction

[2] Debris flows often carry a tremendous hazard and risk potential. Accurate assessment of this risk is key to its proper management. One of the main ingredients of such risk assessment is a debris flow frequency-magnitude analysis based on records of past debris flow events. A frequency-magnitude analysis relates the volume of debris flows to the likelihood of their occurrence. However, the nature of the data and of the problem greatly challenges standard analytical techniques. Some of the standard methods are discussed in *Jakob* [2012] along with the issues surrounding their use. The present paper outlines a methodology that allows one to combine historical data with expert judgment and thus improve the precision of statistical estimation of the frequency-magnitude relation.

[3] Historical data, comprising volume estimates of debris flows, are usually not directly observed but rather need to be reconstructed using absolute dating methods such as dendrochronology, radiocarbon dating, and varve chronologies [*Chiverell and Jakob*, 2012]. In this process, large events can be identified, but even then their volumetric values tend to be prone to large errors due to erosion. Although such records of past events can span time periods of over a hundred years, they usually contain only a small number of event data points, owing both to the rarity of the phenomenon and to the fact that smaller events remain undetected.

[4] Once a record of debris flows for a given location is obtained, it is used to estimate return levels, that is, debris flow volumes that are exceeded with a specified annual frequency or probability, for a range of return periods. For instance, in British Columbia, Canada, the landslide safety criterion with respect to life-threatening or catastrophic landslides is set to the 10,000-year return level in the landslide safety guidelines provided by the *Ministry of Transportation and Infrastructure* [2009]. The return period of 10,000 years falls well beyond the span of most available data records, and hence estimation of the associated return level has to rely on a model-based extrapolation. A standard statistical approach to this kind of problem is to make use of the asymptotic results of extreme value theory (EVT). The basic references on statistics of extremes and its applications include *Beirlant et al*. [2004], *Embrechts et al*. [1997], *Coles* [2001], and *Reiss and Thomas* [2007]. *Katz et al*. [2002] is a review article with a focus on the application of extreme value methods in hydrology. The models motivated by EVT serve as a theoretically justified basis for extrapolation to extreme levels of the process but, as with any model, the results of such extrapolations have to be treated with caution.
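To make the notion of a return level concrete, the following minimal sketch computes the volume whose annual exceedance probability equals 1/T. It assumes, for illustration only, that events above a threshold `u` arrive as a Poisson process with yearly rate `rate` and that their excesses follow a generalized Pareto distribution with scale `sigma` and shape `xi`. These names, this parameterization, and all numerical values are assumptions of the sketch and are not taken from the analysis in this paper.

```python
import math

def annual_exceedance_prob(z, u, sigma, xi, rate):
    """P(at least one event larger than z occurs in a year), for z >= u."""
    survival = (1.0 + xi * (z - u) / sigma) ** (-1.0 / xi)  # GP tail beyond u
    return 1.0 - math.exp(-rate * survival)

def return_level(T, u, sigma, xi, rate):
    """Volume whose annual exceedance probability is 1/T (requires xi != 0)."""
    target = -math.log1p(-1.0 / T)  # yearly rate of exceedances of the level
    if target > rate:
        raise ValueError("return level falls below the threshold u")
    return u + (sigma / xi) * ((rate / target) ** xi - 1.0)

# Hypothetical parameter values, chosen only to show the extrapolation.
u, sigma, xi, rate = 1.0e5, 2.0e5, 0.5, 0.1
for T in (100, 1000, 10000):
    z = return_level(T, u, sigma, xi, rate)
    print(T, round(z), annual_exceedance_prob(z, u, sigma, xi, rate))
```

With a positive shape parameter the computed level grows like a power of T, which suggests why an extrapolation to the 10,000-year level is so sensitive to the estimated tail.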

[5] Inference for the chosen model can be performed using a variety of methods. The moments-based methods have been shown to possess better small sample properties [see, e.g., *Madsen et al*., 1997]; however, precision seems to be gained at the expense of underestimation of return levels when data come from a very heavy-tailed distribution (based on a simulation study described in the appendix "Performance of Estimation Methods When Sampling From a Heavy-Tailed GP Distribution"), a likely scenario in the case of debris flows we consider. This is a serious drawback from the risk management perspective. An alternative such as the maximum likelihood method is not usually recommended for small data samples. Because of the small sample size, confidence intervals for maximum likelihood return level estimates tend to be very wide, reflecting tremendous sampling variability, especially in view of the extreme value problem at hand. Such wide confidence intervals are impractical in real-life decision making. Maximum likelihood return level estimates also suffer from bias, albeit on the positive side. One approach to address these issues of maximum likelihood is via restriction of the domain of possible values of the model shape parameter [see *Coles and Dixon*, 1999; *Martins and Stedinger*, 2001]. The shape parameter is key in determining the tail of the assumed distribution. However, such restrictions on the tail may not necessarily be justified in the present context.
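One mechanism behind such underestimation can be illustrated with a small simulation. The sketch below is not the simulation study referred to above; it assumes the classical method-of-moments estimator for the generalized Pareto (GP) distribution, which is only one of the moments-based methods, and uses hypothetical values for the true parameters and sample size. Since the ratio of squared sample mean to sample variance is positive, this estimator cannot return a shape value of 1/2 or more, so a heavier tail is necessarily underestimated.

```python
import random
import statistics

def sample_gp(n, sigma, xi, rng):
    """Draw n generalized Pareto excesses by inverse transform (xi != 0)."""
    # 1 - rng.random() lies in (0, 1], so the negative power is always defined.
    return [sigma / xi * ((1.0 - rng.random()) ** (-xi) - 1.0) for _ in range(n)]

def gp_moment_estimates(xs):
    """Method-of-moments estimates (sigma_hat, xi_hat) for the GP distribution."""
    m = statistics.fmean(xs)
    ratio = m * m / statistics.variance(xs)  # estimates 1 - 2*xi when xi < 1/2
    return 0.5 * m * (ratio + 1.0), 0.5 * (1.0 - ratio)

rng = random.Random(1)
true_sigma, true_xi, n, reps = 1.0, 0.8, 15, 2000  # illustrative values only
xi_hats = [gp_moment_estimates(sample_gp(n, true_sigma, true_xi, rng))[1]
           for _ in range(reps)]
print("true shape:", true_xi)
print("largest shape estimate:", max(xi_hats))
print("mean shape estimate:", statistics.fmean(xi_hats))
```

Every replicate yields a shape estimate below 1/2 although the samples are drawn with shape 0.8, and a shape estimate that is too small tends to pull extrapolated return levels down.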

[6] In our approach, in order to improve the overall input of the analysis as well as the precision of the return level estimation, we propose to supplement the available small sample of historical debris flow events with additional information in the form of expert judgment. In particular, we have sought an expert opinion concerning the likely magnitudes of debris flow return levels. Kris Holm, a senior geoscientist at BGC Engineering in Vancouver, Canada, has kindly agreed to be our debris flow expert and to provide the required information.

[7] The idea that incorporation of additional information into the analysis has the potential to make estimation more precise and reduce the bias has been exploited in various forms and contexts. The use of expert opinion similar to our approach has been suggested in *Coles and Tawn* [1996]. In flood frequency analyses, *Jin and Stedinger* [1989] combine the regional and historical information via maximum likelihood estimation, while *O'Connell et al*. [2002] employ a Bayesian methodology to include historical and paleohydrologic bound data. Examples of methods and case studies using regional information are *Coles and Powell* [1996], *Casson and Coles* [1999], and *Ribatet et al*. [2006]. *Coles and Dixon* [1999] and *Martins and Stedinger* [2001] incorporate additional information by imposing restrictions on the model shape parameter, as mentioned above.

[8] To combine the expert opinion with the available data, we make use of Bayesian techniques. For a recent review of Bayesian analysis of hydrologic extremes, refer to *Renard et al*. [2013]; an earlier review paper on Bayesian methods in extreme value modeling is *Coles and Powell* [1996]. The (likelihood) model we assume for the data is based on the point process representation of excesses over a high threshold. In the current application, as a threshold we use a volume estimate above which debris flows can be identified. Exceedance times or, equivalently, in our case, times of debris flow events are assumed to follow a Poisson process, with volume amounts and occurrence times being independent. This model is determined by three parameters. Without reference to the data, the probability distribution of the parameter vector, known as the prior distribution, is specified using expert opinion. Applying Bayes' theorem, the prior distribution can be updated to incorporate the available record of data. The resulting posterior distribution can be interpreted as the sampling distribution of the parameter estimate that combines the given data and the prior information derived from the expert's judgment. The posterior distribution should be less spread out than the sampling distribution of, say, the maximum likelihood estimate, as the latter uses less information. Hence, it can lead to shorter interval estimates of return levels. Details are provided in section 2. As a case study, presented in section 3, we consider the record of debris flows at Capricorn Creek on Mount Meager in British Columbia, Canada. A discussion of the results in comparison to other methods and a sensitivity analysis with respect to prior choice and data uncertainty are given in section 4. Section 5 summarizes our findings and conclusions. The appendix "Performance of Estimation Methods When Sampling From a Heavy-Tailed GP Distribution" supplements the analysis in section 4.1.
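The updating step described above can be sketched in a few lines of code. The sketch assumes the standard point-process likelihood with location `mu`, scale `sigma`, and shape `xi`, a hypothetical prior placed directly on these parameters (a simplification; the expert opinion sought here concerns return levels), a random-walk Metropolis sampler, and made-up data. None of these choices is taken from sections 2 or 3, and the numbers do not represent the Capricorn Creek record.

```python
import math
import random

def log_likelihood(theta, volumes, u, years):
    """Point-process log-likelihood of event volumes above threshold u (xi != 0)."""
    mu, sigma, xi = theta
    if sigma <= 0.0 or xi == 0.0:
        return -math.inf
    at_u = 1.0 + xi * (u - mu) / sigma
    terms = [1.0 + xi * (x - mu) / sigma for x in volumes]
    if at_u <= 0.0 or min(terms) <= 0.0:
        return -math.inf                 # outside the model's support
    log_rate = -math.log(at_u) / xi      # log of the yearly rate of exceeding u
    if log_rate > 700.0:
        return -math.inf                 # likelihood effectively zero; avoids overflow
    return (-years * math.exp(log_rate)
            - len(volumes) * math.log(sigma)
            - (1.0 / xi + 1.0) * sum(math.log(t) for t in terms))

def log_prior(theta):
    """HYPOTHETICAL prior standing in for elicited expert judgment."""
    mu, sigma, xi = theta
    if sigma <= 0.0:
        return -math.inf
    # Normal priors on mu and xi, lognormal prior on sigma; all numbers made up.
    return (-0.5 * (mu / 3.0) ** 2
            - 0.5 * math.log(sigma) ** 2 - math.log(sigma)
            - 0.5 * ((xi - 0.5) / 0.5) ** 2)

def metropolis(volumes, u, years, start, steps, step_size, rng):
    """Random-walk Metropolis draws from the posterior of (mu, sigma, xi)."""
    def log_post(theta):
        lp = log_prior(theta)
        if lp == -math.inf:
            return lp
        return lp + log_likelihood(theta, volumes, u, years)

    current, current_lp = list(start), log_post(start)
    draws = []
    for _ in range(steps):
        proposal = [c + rng.gauss(0.0, step_size) for c in current]
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        draws.append(tuple(current))
    return draws

# Made-up volumes (arbitrary units) above a detection threshold; NOT a real record.
volumes = [1.2, 1.5, 2.0, 3.1, 5.0, 9.0, 12.0]
u, years = 1.0, 150.0
draws = metropolis(volumes, u, years, start=(-2.0, 0.5, 0.5),
                   steps=5000, step_size=0.15, rng=random.Random(0))
kept = draws[1000:]                      # discard an initial burn-in
print("posterior mean of xi:", sum(d[2] for d in kept) / len(kept))
```

Interval estimates of a return level could then be read off by transforming each retained draw into the corresponding return level and taking empirical quantiles. A single step size for all three parameters keeps the sketch short, although a tuned proposal would likely mix better.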