## 1. Introduction

[2] An important concern in the development of national flood frequency guidelines such as Bulletin 17B [*Interagency Committee on Water Data* (*IACWD*), 1982] or a new Bulletin 17C [*Stedinger and Griffis*, 2008; *England and Cohn*, 2008] is that the procedure be robust. That is, the recommended procedure should be reasonably efficient when the assumed characteristics of the flood distribution hold, while not performing poorly when those assumptions are violated. A critical issue is whether the low (or zero) flows in an annual flood series are relevant in estimating the probabilities of the largest events.

[3] Annual-peak-flow series, particularly those in the Western United States, often contain so-called “low outliers.” In the context of Bulletin 17B, low outliers are small values that depart from the trend of the rest of the data [*IACWD*, 1982] and often reflect a situation wherein the smaller flood flows are unusually small given what one would expect based on the larger flood flows. For example, Figure 1 depicts the logarithms of 1-day rainfall flood annual peak flows for the Sacramento River at Shasta Dam, as computed by the Army Corps of Engineers, 1932–2008 (77 years). At this site, the three smallest observations appear visually to be unusually small. The figure includes a lognormal distribution fit to the top 74 observations. The standard Grubbs-Beck test [*Grubbs and Beck*, 1972] (see equation (2), 10% significance level) generates a threshold of 7373 cubic feet per second (cfs), and thus correctly identifies the smallest observation as a low outlier, but not the second and third smallest observations.
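The single-outlier screening described above can be sketched in a few lines. The sketch below assumes the Bulletin 17B form of the test, in which the threshold is the log-space mean minus a critical value times the log-space standard deviation, and it uses a commonly quoted polynomial approximation to the 10% critical value for sample sizes of roughly 5 to 150; that approximation may differ in form from equation (2). The function names and the peak-flow values are hypothetical and synthetic; they are not the Shasta Dam record.

```python
import math
import statistics


def grubbs_beck_k(n: int) -> float:
    """Approximate one-sided 10% Grubbs-Beck critical value for sample size n.

    Assumption: a commonly quoted approximation, intended for about 5 <= n <= 150.
    """
    log_n = math.log10(n)
    return -0.9043 + 3.345 * math.sqrt(log_n) - 0.4046 * log_n


def grubbs_beck_threshold(flows):
    """Low-outlier threshold in flow units, computed from log10 moments."""
    logs = [math.log10(q) for q in flows]
    mean = statistics.fmean(logs)
    sd = statistics.stdev(logs)  # sample standard deviation (n - 1 divisor)
    return 10 ** (mean - grubbs_beck_k(len(logs)) * sd)


def low_outliers(flows):
    """Flows that fall below the single-outlier threshold, smallest first."""
    threshold = grubbs_beck_threshold(flows)
    return sorted(q for q in flows if q < threshold)


# Synthetic annual peaks in cfs (hypothetical values for illustration only).
peaks_cfs = [12000, 15000, 18000, 22000, 25000,
             30000, 34000, 41000, 52000, 60]

threshold_cfs = grubbs_beck_threshold(peaks_cfs)
flagged = low_outliers(peaks_cfs)
print(f"threshold = {threshold_cfs:.0f} cfs, flagged = {flagged}")
```

Because the mean and standard deviation are computed from the full sample, several unusually small values inflate the standard deviation and lower the threshold, which is one way a single-outlier test can miss the second and third smallest observations.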

[4] Low outliers and potentially influential low flows (PILFs) in annual-peak-flow series often reflect physical processes that are not relevant to the processes associated with large floods. Consequently, the magnitudes of small annual peaks typically do not reveal much about the upper right-hand tail of the frequency distribution, and thus should not have a highly influential role when estimating the risk of large floods. *Klemes* [1986] correctly observes:

[5] “It is by no means hydrologically obvious why the regime of the highest floods should be affected by the regime of flows in years when no floods occur, why the probability of a severe storm hitting this basin should depend on the accumulation of snow in the few driest winters, why the return period of a given heavy rain should be by an order of magnitude different depending, say, on slight temperature fluctuations during the melting seasons of a couple of years.”

[6] The distribution of the proposed test statistic is derived specifically for the purpose Klemes suggests: to identify small “nuisance” values. Paradoxically, moments-based statistical procedures, when applied to the logarithms of flood flows to estimate flood risk, can assign high leverage to the smallest peak flows. For this reason, procedures are needed to identify potentially influential small values so that their influence on flood-quantile and flood-risk estimates can be limited.
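The leverage that a single small peak can exert on log-space moments can be illustrated with a short numerical sketch. It computes the bias-adjusted sample skew coefficient of the log10 flows with and without the smallest observation. The data are synthetic and the function name is hypothetical; the example is meant only to show the direction and rough size of the effect, not to reproduce any result from this paper.

```python
import math
import statistics


def sample_skew(values):
    """Bias-adjusted sample skew: n * sum(d^3) / ((n - 1)(n - 2) s^3)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    third = sum((v - mean) ** 3 for v in values)
    return n * third / ((n - 1) * (n - 2) * sd ** 3)


# Synthetic annual peaks in cfs (hypothetical); one year has almost no flow.
peaks_cfs = [12000, 15000, 18000, 22000, 25000,
             30000, 34000, 41000, 52000, 60]

logs_all = [math.log10(q) for q in peaks_cfs]
logs_without_smallest = [math.log10(q) for q in sorted(peaks_cfs)[1:]]

skew_all = sample_skew(logs_all)
skew_without_smallest = sample_skew(logs_without_smallest)

print(f"log-space skew, all peaks:        {skew_all:.2f}")
print(f"log-space skew, smallest removed: {skew_without_smallest:.2f}")
```

In this synthetic sample the one near-zero year drives the log-space skew strongly negative, whereas the remaining nine peaks are nearly symmetric in log space. A fitted distribution that honors the full-sample skew would therefore be shaped largely by the smallest observation.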

[7] This paper presents a generalization of the Grubbs-Beck statistic [*Grubbs and Beck*, 1972] that can provide a standard to identify multiple potentially influential small flows. The present work is motivated by ongoing efforts [*England and Cohn*, 2008; *Stedinger and Griffis*, 2008] to explore potential improvements to Bulletin 17B [*IACWD*, 1982] with moments-based, censored-data alternatives [*Cohn et al.*, 1997; *Griffis et al.*, 2004]. The proposed statistic is constructed following the reasoning in *Rosner* [1975], who developed a two-sided R-Statistic “many outlier” test (RST) that is based on the following argument:

[8] “[t]he idea is to compute a measure of location and spread (*a* and *b*) from the points that cannot be outliers under either the null or alternative hypotheses, i.e. the points that remain after deleting 100 *p*% of the sample from each end.”

[9] In Rosner's implementation, *p* is some fraction of the total number of observations, *n*. We consider a one-sided test statistic based on these concepts to detect PILFs in the left-hand tail.
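Rosner's trimming idea, restricted to the left-hand tail, can be sketched as follows. The sketch computes the location *a* and spread *b* from the observations that remain after removing a fraction *p* from each end, and then standardizes the smallest observations against them. It is not the statistic proposed in this paper; the function names, the use of the mean and sample standard deviation for *a* and *b*, the rounding rule for the number of trimmed points, and the synthetic data are all assumptions made for illustration.

```python
import math
import statistics


def trimmed_location_spread(values, p):
    """Return (a, b): mean and sample standard deviation of the trimmed sample.

    Assumption: floor(p * n) points are removed from each end.
    """
    xs = sorted(values)
    n = len(xs)
    k = int(p * n)
    core = xs[k:n - k]
    if len(core) < 2:
        raise ValueError("too few observations remain after trimming")
    return statistics.fmean(core), statistics.stdev(core)


def standardized_low_values(values, p):
    """Standardized distances (x - a) / b for the trimmed-off low values."""
    a, b = trimmed_location_spread(values, p)
    xs = sorted(values)
    k = int(p * len(xs))
    return [(x - a) / b for x in xs[:k]]


# Synthetic annual peaks in cfs (hypothetical), analyzed in log10 space.
peaks_cfs = [12000, 15000, 18000, 22000, 25000,
             30000, 34000, 41000, 52000, 60]
log_peaks = [math.log10(q) for q in peaks_cfs]

z_low = standardized_low_values(log_peaks, p=0.2)
print([round(z, 2) for z in z_low])
```

Because *b* is computed from a trimmed sample, it understates the spread of the full distribution, so the standardized values cannot be compared with ordinary normal quantiles. Critical values would have to come from the null distribution of the statistic, which this sketch does not attempt to derive.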