## 1. Introduction

[2] Quantile regression has been extensively applied [*Koenker and Bassett*, 1978; *Koenker*, 2005; *Ma and Koenker*, 2006; *Keyzer and Pande*, 2009] in parametric and nonparametric statistics. Quantile regression has also been applied in the context of hydrologic forecasting [*Weerts et al*., 2011]. A particular case of quantile model selection, in the form of model selection based on flow duration curves (FDCs), has also been extensively studied in hydrology [*Yu and Yang*, 2000; *Son and Sivapalan*, 2007; *Westerberg et al*., 2011; *Blazkova and Beven*, 2009].

[3] However, the extension of quantile regression to hydrological model selection is nontrivial. For example, in the context of hydrological forecasting (such as *Weerts et al*. [2011]), linear quantile forecasts make one critical assumption: that a forecasting model (or parts of it) can be linearized at a point in space or time. Such an assumption carries several implicit subassumptions on the differentiability of the forecasting model and on the allowed perturbations, which may not hold when the forecasting temporal range is large or when the temporal resolution of the forecasting model is coarse. This may lead to inaccurate quantile predictions, with the quantile estimates of the parameters being no more meaningful than partial derivatives of the nonlinear forecasting model.

[4] *Pande* [2013] provides a theoretical foundation for the extension of quantile regression to hydrological model selection. The study reveals some interesting properties based on which structural deficiencies can be diagnosed and structure improvements can be assessed. A quantile (specific) model is obtained by minimizing a quantile specific loss function, called the asymmetric loss function. The quantile model can then be used to predict the observed quantile. However, the approach is not limited to quantile model selection and prediction. Model structure deficiencies can lead to the crossing of two quantile model predictions because of the biases in predicting observed quantiles that structure deficiencies introduce. The optimal value of the loss function at a quantile contains full information on the bias, and thus on the structure deficiency, at that quantile. The loss function values of a set of candidate model structures at a given quantile thus order the structures in terms of their deficiency at that quantile. The asymmetric loss function across quantiles can therefore be used to assess deficiencies of a set of candidate structures.
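As an illustration of the ordering described above, the following minimal sketch evaluates the asymmetric (check) loss of *Koenker and Bassett* [1978] and ranks candidate structures by it. The function names, the structure labels, and the numbers are hypothetical and chosen only for illustration; the predictions are assumed to come from quantile models that have already been calibrated at the given quantile.

```python
def asymmetric_loss(observed, predicted, tau):
    """Mean asymmetric (check) loss at quantile tau, with 0 < tau < 1."""
    total = 0.0
    for y, q in zip(observed, predicted):
        residual = y - q
        # Under-prediction (residual >= 0) is weighted by tau,
        # over-prediction (residual < 0) by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(observed)


def rank_structures(observed, predictions_by_structure, tau):
    """Order candidate structures by their loss at tau (smallest loss first)."""
    losses = {
        name: asymmetric_loss(observed, predicted, tau)
        for name, predicted in predictions_by_structure.items()
    }
    return sorted(losses.items(), key=lambda item: item[1])


# Hypothetical observations and predictions of two already calibrated structures.
observed = [2.0, 4.0, 6.0, 8.0, 10.0]
predictions = {
    "structure_A": [3.0, 5.0, 7.0, 9.0, 11.0],
    "structure_B": [2.0, 4.5, 6.0, 9.0, 10.0],
}
ranking = rank_structures(observed, predictions, tau=0.9)
for name, loss in ranking:
    print(name, loss)
```

Under this sketch, the structure listed first has the smaller loss at the chosen quantile and would be read as the less deficient one at that quantile.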

[5] Quantile hydrological model selection (or estimation) encompasses some standard model selection techniques based on summary statistics, for example, the minimization of the mean absolute error. Quantile model selection at a quantile value of 0.5 provides a model that minimizes the mean absolute error. In some cases, quantile model selection is also “equivalent” to maximum likelihood model selection. A model selected by minimizing mean absolute error is equivalent to assuming a Laplace likelihood function. A model selected by minimizing the mean square error (ordinary least squares, OLS; a “mean” model) is a weighted sum of models selected at quantiles in the neighborhood of (and including) 0.5, such as at 0.4, 0.5, and 0.6 [*Koenker and Bassett*, 1978], though the assignment of weights is less formal. Yet another way to estimate a mean model is to take the mean of quantile models with quantile values ranging between 0 and 1.
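The equivalences stated above can be sketched in a few lines. The notation is an assumption introduced here for illustration and is not taken from *Pande* [2013]: $y_t$ is the observation at time $t$, $\hat{y}_t(\theta)$ the model prediction for parameters $\theta$, $u_t = y_t - \hat{y}_t(\theta)$ the residual, $N$ the sample size, $b > 0$ a fixed Laplace scale, and $Q_Y(\tau)$ the $\tau$-quantile of a random variable $Y$ with finite mean.

```latex
\begin{align*}
\rho_\tau(u) &= u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr)
  && \text{(asymmetric loss)} \\
\rho_{0.5}(u) &= \tfrac{1}{2}\,\lvert u \rvert
  \;\Longrightarrow\;
  \arg\min_\theta \sum_{t=1}^{N} \rho_{0.5}(u_t)
  = \arg\min_\theta \frac{1}{N}\sum_{t=1}^{N} \lvert u_t \rvert
  && \text{(mean absolute error)} \\
-\log \prod_{t=1}^{N} \frac{1}{2b}\, e^{-\lvert u_t \rvert / b}
  &= N \log(2b) + \frac{1}{b} \sum_{t=1}^{N} \lvert u_t \rvert
  && \text{(Laplace likelihood, fixed } b\text{)} \\
\mathbb{E}[Y] &= \int_0^1 Q_Y(\tau)\, d\tau
  && \text{(mean as an average over quantiles)}
\end{align*}
```

For a fixed scale $b$, maximizing the Laplace likelihood over $\theta$ therefore minimizes the sum of absolute residuals, which is the quantile problem at $\tau = 0.5$ up to a constant factor.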

[6] The model selection strategy that is applied in this paper is unique in the following aspects: (1) the method is applicable to time series of any variable of interest (flux or storage), (2) model selection is based on one quantile at a time, unlike strategies based on FDCs that identify a model such that its FDC closely matches the observed FDC, (3) a loss function of *Koenker and Bassett* [1978] based on absolute deviations is employed as an objective function, which removes the need to identify quantiles of an observed time series, and (4) linearization of a hydrological model is not undertaken, unlike in linear quantile hydrological forecasting methods. This paper presents applications that demonstrate the methodology and the utility of the theory presented in *Pande* [2013].

[7] First, the nature of the bias in predicting an observed quantile due to model structure deficiency is exhaustively studied alongside the performance of variants of quantile hydrologic prediction. A complex model structure and French Broad river basin data are used for this purpose. The quantile model selection method is then implemented for a parsimonious hydrologic model developed for western India, where it is shown that quantile model selection enables the detection of model structure improvement. Finally, several variants of the model structure used in the first case study are applied to the Guadalupe river basin. Multiple model structures are compared using quantile model selection, and the inferred model structure deficiencies are discussed.

[8] All the properties proved for a generic hydrologic quantile model selection problem (QE2 problem) in *Pande* [2013] also hold for the model structures of the applications, provided that assumptions 1–7 hold. Appendix A provides the relevant definitions for the QE2 specification and the assumptions.

[9] These assumptions are valid for the cases studied here. The set **K** then represents the feasible space for the model specific parameter set (the values that the parameter sets are allowed to take). Assumptions 1, 4, 5, and 7 hold for these cases (we allow for an initial burn-in period, which leads to positive initial storages in the case of flexible model structures). Assumptions 2 and 6 on differentiability can be relaxed without affecting the propositions (not done in the text to maintain brevity). If the functions in these assumptions are not differentiable, their gradients can be replaced by subgradients at the optimum. Continuity is still required, but all hydrologic models are continuous in parameters. Monotonicity in at least one parameter in Assumption 2 holds in general; for example, flow is monotonic in recession parameters. The gradients of the predictive equation with respect to the parameters are independent of each other, as required by Assumption 2. The only restrictive assumption may be that input forcings are nonzero for any time *t* (Assumption 3). It is shown to hold for the parsimonious dry land modeling case study, though not for the other two case studies. The sensitivity of the propositions to this assumption is left for future work (at present it does not appear to contribute to major steps within the proofs). Different optimizers are used for the case studies: a gradient-based minimizer in the parsimonious model case study and a complex evolution-based minimizer in the others. Both are initiated with multiple starting points to ensure that a solution is near a global minimum (Assumption 7). The parsimonious model case study is a constrained problem to tame the ill effects of model complexity, while the other case studies have reasonably large samples for calibration. Thus, the main result of *Pande* [2013] that the asymmetric loss function is a measure of model structure deficiency (and its utility to rank model structures based on the asymmetric loss function at a particular quantile value) is not diluted by these more general case studies.
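The practice of initiating an optimizer from multiple starting points can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the case studies: the one-parameter linear reservoir, the synthetic noise-free data, and the simple pattern search are all hypothetical stand-ins (the case studies use a gradient-based and a complex evolution-based minimizer, neither of which is reproduced here).

```python
def simulate_flow(k, rainfall, initial_storage=10.0):
    """Hypothetical toy reservoir: Q_t = k * S_t, S_{t+1} = S_t + P_t - Q_t."""
    storage, flows = initial_storage, []
    for p in rainfall:
        q = k * storage
        flows.append(q)
        storage = storage + p - q
    return flows


def asymmetric_loss(observed, predicted, tau):
    """Mean asymmetric (check) loss at quantile tau."""
    total = 0.0
    for y, q in zip(observed, predicted):
        residual = y - q
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(observed)


def local_search(objective, start, lower, upper, step=0.2, tol=1e-6):
    """One-dimensional pattern search (a stand-in for the actual optimizers)."""
    best_x, best_f = start, objective(start)
    while step > tol:
        improved = False
        for candidate in (best_x + step, best_x - step):
            candidate = min(max(candidate, lower), upper)  # keep within bounds
            value = objective(candidate)
            if value < best_f:
                best_x, best_f, improved = candidate, value, True
        if not improved:
            step /= 2.0  # refine the search once no neighbor improves
    return best_x, best_f


def multistart_calibration(observed, rainfall, tau, starts=(0.1, 0.5, 0.9)):
    """Run the local search from several starts and keep the lowest loss."""
    def objective(k):
        return asymmetric_loss(observed, simulate_flow(k, rainfall), tau)

    results = [local_search(objective, s, 0.01, 0.99) for s in starts]
    return min(results, key=lambda result: result[1])


# Synthetic, noise-free data generated with an assumed recession parameter of 0.3.
rainfall = [5.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 8.0, 0.0, 1.0] * 6
observed = simulate_flow(0.3, rainfall)
best_k, best_loss = multistart_calibration(observed, rainfall, tau=0.9)
print(best_k, best_loss)
```

Keeping the lowest loss across starts does not guarantee a global minimum; it only reduces the chance that the reported solution is a poor local one, which is the sense in which multiple starting points support Assumption 7.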

[10] The paper is organized as follows. Section 2 studies and validates the performance of quantile model selection on French Broad river basin data using a flexible model structure. Section 3 then assesses the deficiency of a parsimonious dry land model developed for western India [*Pande et al*., 2010, 2012a]. Section 4 orders various model structures for the Guadalupe river basin in terms of structural deficiency. Finally, section 5 concludes the paper.