## 1. Introduction

[2] The current theory of water resources engineering assumes that the frequency of hydrologic variables critical to planning and management follows an underlying, time-invariant stochastic process that can be estimated and utilized for risk assessment in engineering problems. Advances in hydroclimatic science over the past two decades have made it increasingly clear that this theory oversimplifies the probabilistic characterization of hydrology. Low-frequency climate oscillations and interannual shifts in antecedent watershed conditions (e.g., soil moisture and snowpack) have been shown to directly modify the likelihood of river flows critical to water management [*Georgakakos et al.*, 1998; *Grantz et al.*, 2005; *Sankarasubramanian et al.*, 2009; *Gong et al.*, 2010; *Pui et al.*, 2011]. The effects of climate and watershed precursors on streamflow are sometimes predictable, particularly at a seasonal time step, and are even extendable to the frequency, magnitude, and duration of extreme hydrologic responses [*Hirschboeck*, 1988; *Mosley*, 2000].

[3] While the vast majority of research has focused on the predictability of flood flow processes [*Cayan et al.*, 1999; *Olsen et al.*, 1999; *Jain and Lall*, 2000, 2001; *Kwon et al.*, 2008], recent work has found similar possibilities for extreme low flows as well [*Kiem and Franks*, 2004; *Verdon-Kidd and Kiem*, 2010; *Steinschneider and Brown*, 2011]. The predictability of low-flow events introduces the potential to produce year-to-year, forecast-informed estimates of critical low-flow statistics that could have significant implications for water quality management practices. In this paper we present a new approach to predict low-flow variability with hydrologic forecasts and apply the method to critical low flows in two major river basins in the northeast United States. The approach couches a semiparametric local likelihood method in a Bayesian framework to develop forecast-informed quantile estimates of annual minimum 7 day low flows (SDLFs), along with their associated error.

[4] A description of the frequency of critical low-flow statistics is vital to an accurate risk assessment of water quality degradation. Stochastic models of critical low flows are used to determine the T-year low-flow event for rivers and estuaries that dilute contaminants from point and nonpoint sources, allowing for the effective design of regulations that preserve water quality under sensitive flow conditions. For instance, total maximum daily loads (TMDLs) are often designed using the 10th percentile of annual minimum 7 day low flows (i.e., the 7Q10) so that water quality standards will on average be violated only for the week of lowest flow once in every 10 years. Water quality trading markets are often based around contaminant concentration limits at the outlets of major rivers that depend on estimates of the 7Q10 at those locations. Despite their critical importance to the development of water quality regulations, there is a relative paucity of research regarding the frequency distributions of low flows, especially in contrast to similar efforts dedicated to flood flow analysis [*Vogel and Wilson*, 1996]. Those studies that have explored low-flow frequency analysis have met with limited success [*Smakhtin*, 2001]. In part this is due to the difficulty of identifying scalable models that can capture the true probabilistic behavior of extreme low flows through both space [*Vogel and Kroll*, 1990] and time [*Saunders and Lewis*, 2003].
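As a concrete illustration of the 7Q10 definition, the following minimal sketch computes annual minimum 7 day mean flows from a daily record and takes their empirical 10th percentile. The synthetic flow record, the function names, and the use of an interpolated empirical quantile (rather than a fitted frequency distribution) are all assumptions made for illustration only.

```python
import math
import random


def seven_day_means(daily_flows):
    """Moving 7-day mean flows for one year of daily data."""
    window = 7
    return [
        sum(daily_flows[i:i + window]) / window
        for i in range(len(daily_flows) - window + 1)
    ]


def annual_min_7day(daily_flows_by_year):
    """Annual minimum 7-day mean flow for each year in the record."""
    return [min(seven_day_means(flows)) for flows in daily_flows_by_year]


def empirical_quantile(values, p):
    """Linearly interpolated empirical quantile, with 0 <= p <= 1."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


# Synthetic 40-year daily record (hypothetical data, arbitrary flow units):
# a seasonal cycle scaled by a random year-to-year wetness factor.
random.seed(0)
record = []
for year in range(40):
    wetness = random.lognormvariate(0.0, 0.3)
    record.append([
        wetness
        * (60 + 40 * math.cos(2 * math.pi * day / 365))
        * random.lognormvariate(0.0, 0.2)
        for day in range(365)
    ])

sdlf = annual_min_7day(record)            # one 7-day low flow per year
q7_10 = empirical_quantile(sdlf, 0.10)    # empirical 7Q10
print(f"Empirical 7Q10: {q7_10:.2f}")
```

In regulatory practice the 7Q10 is typically obtained from a frequency distribution fitted to the annual series; the empirical quantile is used here only to keep the example short.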

[5] As water quality continues to become a concern in rivers across the United States, there is a growing need to better characterize the year-to-year risks of critical low flows. Currently, typical water quality regulations (e.g., those based on the 7Q10) employ a static approach, basing requirements on historical statistics and enforcing the same level of treatment in each year, despite year-to-year variability in streamflows. Where there is strong interannual or decadal variability, this may lead to overly conservative regulations in some years and overly lax regulations in others. The potential effects of climate change, which may introduce trends in flow conditions, may further erode the utility of such static approaches to regulation [*Lins and Slack*, 1999].

[6] Recently, a number of studies have begun adapting frequency analysis procedures to nonstationary hydrologic time series, with a particular focus on flood statistics. The most popular approach has been to condition the parameters of a flood frequency distribution on different covariates through regression [*Coles*, 2001; *Cox et al.*, 2002; *Katz et al.*, 2002]. For example, *Coles* [2001] conditioned the location parameter of a Generalized Extreme Value (GEV) distribution for a flood series on time to model evolving flood risk. Frequency modeling with time-variant parameters has also been used to test for statistically significant trends and changing variability in flood series [*Delgado et al.*, 2010], as well as for regional flood frequency modeling [*Cunderlik and Ouarda*, 2006; *Leclerc and Ouarda*, 2007]. Other methods for frequency modeling of nonstationary floods include the incorporation of trends into distribution moments [*Strupczewski et al.*, 2001a, 2001b; *Strupczewski and Kaczmarek*, 2001], flood magnification and recurrence reduction factors [*Vogel et al.*, 2011], and quantile regression [*Sankarasubramanian and Lall*, 2003]. A thorough review of many of these studies and others can be found in *Khaliq et al.* [2006].

[7] To the authors' knowledge, there have only been a handful of studies that have applied a nonstationary frequency analysis to drought statistics. For example, *Burke et al.* [2010] fit the parameters of a peaks-over-threshold model to monthly drought indices projected for the next century using global and regional climate models. These parameters were allowed to vary through time via regression on global mean temperatures, enabling the estimation of nonstationary extreme monthly drought events at different return intervals. *Garcia Galiano et al.* [2011] and *Giraldo Osorio and Garcia Galiano* [2012] used Generalized Additive Models for Location, Scale, and Shape (GAMLSS) to model the nonstationary behavior of annual series of maximum lengths of dry spells. Besides these studies, no other research was found that attempted to extend nonstationary frequency analysis to dry-period extremes.

[8] Two recent approaches in the flood flow literature provide particularly interesting avenues for further research and application to low flows. These include a Bayesian frequency analysis with time-variant parameters that depend on parametric regressions [*El Adlouni et al.*, 2007; *Kwon et al.*, 2008; *Ouarda and El Adlouni*, 2011], as well as a semiparametric local likelihood procedure [*Sankarasubramanian and Lall*, 2003]. The use of Bayesian methods to conduct frequency analyses with time-variant parameters is relatively new. A primary advantage of the Bayesian approach is that it provides an efficient estimation framework that allows prior knowledge of model parameters to be integrated into the analysis. For instance, *El Adlouni et al.* [2007] and *Ouarda and El Adlouni* [2011] conditioned the parameters of a GEV distribution of annual maximum precipitation on the Southern Oscillation Index (SOI), using a Bayesian prior distribution in a generalized maximum likelihood analysis [*Martins and Stedinger*, 2000] to inform the estimation of the shape parameter and restrict its values to a reasonable range. Various linear and quadratic regression models were explored to condition the location, shape, and scale parameters on the SOI. *Kwon et al.* [2008] employed a similar technique in a hierarchical Bayesian framework in which the prior distribution of the location parameter of a Gumbel density function was specified through a linear regression on several different hydroclimate covariates. One potential downside of these studies is the reliance on parametric modeling to relate distribution parameters to covariates. The identification of appropriate analytic functions can become very complicated when the relationship between the covariates and the original hydrologic data exhibits significant nonlinearity or heteroskedasticity.

[9] Semiparametric and nonparametric methods have been popular approaches to circumvent such difficulties of strictly parametric modeling. *Sankarasubramanian and Lall* [2003] used a semiparametric local likelihood method to condition the frequency distribution of annual maximum floods on seasonal forecasts based on El Niño–Southern Oscillation (ENSO) and the Pacific Decadal Oscillation (PDO). They found that the semiparametric local likelihood approach could address nonlinearity and heteroskedasticity in the relationship between flood flows and predictors more flexibly than parametric methods, but suggested that Bayesian approaches should be pursued to better characterize quantile and parameter uncertainty.
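The idea behind local likelihood estimation can be sketched in a few lines. The example below is a deliberately simplified, locally constant variant and not the estimator of *Sankarasubramanian and Lall* [2003]: it assumes a lognormal distribution for the flows, a Gaussian kernel, a fixed bandwidth, and synthetic data, all of which are illustrative assumptions. Observations whose covariate lies close to the forecast value `x0` receive more weight in the likelihood, so the fitted parameters, and hence the quantile, vary with the forecast.

```python
import math
import random
from statistics import NormalDist


def gaussian_kernel_weights(covariates, x0, bandwidth):
    """Weight each observation by its covariate's closeness to x0."""
    return [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in covariates]


def local_lognormal_fit(flows, covariates, x0, bandwidth):
    """Maximize the kernel-weighted lognormal log-likelihood at x0.

    For a locally constant model the maximizer has a closed form:
    the weighted mean and weighted standard deviation of the log flows.
    """
    weights = gaussian_kernel_weights(covariates, x0, bandwidth)
    total = sum(weights)
    logs = [math.log(q) for q in flows]
    mu = sum(w * y for w, y in zip(weights, logs)) / total
    var = sum(w * (y - mu) ** 2 for w, y in zip(weights, logs)) / total
    return mu, math.sqrt(var)


def conditional_quantile(flows, covariates, x0, bandwidth, p=0.10):
    """Quantile of the locally fitted lognormal distribution at x0."""
    mu, sigma = local_lognormal_fit(flows, covariates, x0, bandwidth)
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


# Synthetic example (hypothetical data): low flows rise with the forecast index.
random.seed(1)
forecast = [random.gauss(0.0, 1.0) for _ in range(60)]
sdlf = [math.exp(3.0 + 0.5 * x + random.gauss(0.0, 0.2)) for x in forecast]

dry_year = conditional_quantile(sdlf, forecast, x0=-1.0, bandwidth=0.5)
wet_year = conditional_quantile(sdlf, forecast, x0=1.0, bandwidth=0.5)
print(f"Conditional 10th percentile: dry {dry_year:.1f}, wet {wet_year:.1f}")
```

As the bandwidth grows, the weights become uniform and the fit collapses to the ordinary static estimate; a small bandwidth tracks the covariate more closely at the cost of higher sampling variance.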

[10] This paper synthesizes the two approaches described above and explores an application to low-flow hydrology. A variation of the semiparametric local likelihood estimation method is developed in a Bayesian framework to condition the fit of frequency distributions to SDLF series on seasonal hydrologic forecasts for two major rivers in the northeast United States. Prior distributions for model parameters are developed from regional information and used in the Bayesian estimation procedure to ensure statistically and physically meaningful posterior distributions. The hydrologic forecast for both rivers is developed using measures of oceanic circulation in the North Atlantic Ocean. A static Bayesian inference with noninformative priors is also considered for comparison. Estimates of the 7Q10 are then developed for each year in the record using both estimators in a “leave-one-out” approach, and the results are compared to highlight the benefits of the conditional fitting procedure.
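The “leave-one-out” comparison can be pictured with the short sketch below. It is an illustrative outline only: the flow and covariate values are made-up numbers, the function names are hypothetical, and the static estimator assumes a lognormal distribution fitted by maximum likelihood rather than the Bayesian inference used in this study. Each year is withheld in turn, the estimator is fitted to the remaining years, and a 7Q10 estimate is produced for the withheld year.

```python
import math
from statistics import NormalDist, mean, pstdev


def static_7q10(train_flows, train_covariates, target_covariate):
    """Static lognormal 7Q10 estimate; ignores the covariates by design."""
    logs = [math.log(q) for q in train_flows]
    return math.exp(mean(logs) + pstdev(logs) * NormalDist().inv_cdf(0.10))


def leave_one_out(estimator, flows, covariates):
    """Estimate each year's value using only the other years' data."""
    estimates = []
    for i in range(len(flows)):
        train_flows = flows[:i] + flows[i + 1:]
        train_covariates = covariates[:i] + covariates[i + 1:]
        estimates.append(
            estimator(train_flows, train_covariates, covariates[i])
        )
    return estimates


# Hypothetical annual 7-day low flows and forecast covariates.
flows = [42.0, 35.5, 51.2, 28.9, 47.3, 39.8, 33.1, 55.6]
covariates = [0.1, -0.4, 0.9, -1.2, 0.6, 0.0, -0.7, 1.3]

static_estimates = leave_one_out(static_7q10, flows, covariates)
for year, estimate in enumerate(static_estimates):
    print(f"year {year}: static 7Q10 estimate = {estimate:.1f}")
```

Any conditional estimator with the same three-argument signature can be passed to `leave_one_out` in place of `static_7q10`, which keeps the comparison between static and forecast-informed estimates on identical training sets.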

[11] The rest of the paper proceeds as follows. The semiparametric local likelihood estimator and Bayesian modeling framework are introduced in section 2. Section 3 provides a brief overview of the method's application to two rivers in the northeast United States. Results are presented in section 4, and the study concludes with a discussion in section 5.