## 1. Introduction

[2] Synthetic simulation of streamflow sequences is used in a variety of applications, including reservoir operation and the evaluation of water supply reliability. Multiple reservoirs and stream sections are often considered in a system's operation plan. For this purpose, streamflows generated at different sites need to be consistent. This implies that the flow at a downstream gauge is the sum of the tributary flows; the annual flow is the sum of the monthly flows; the monthly fractions of flow in wet and dry years are representative; and the dependencies of flows between sites are reproduced. To this end, the disaggregation problem can be thought of as simulation from the conditional probability density function (PDF) *f*(*X*∣*Z*), where *X* is a vector of disaggregated flows (e.g., monthly flows) and *Z* comprises the aggregate flows (e.g., annual flows) and other terms (e.g., the first month's correlation with the last month of the previous year), subject to the condition that the disaggregated flows add up to the aggregate flows, which is the additivity property. Often a simpler approach has been used, consisting of fitting a model of the form

$$X = AZ + B\varepsilon$$

where *Z* is usually taken to be just the annual flow, *A* and *B* are matrices of model parameters that are estimated to ensure the additivity property, and ɛ is the stochastic term. Notice that the above form is that of a linear regression, which has a rich developmental history; consequently, the main assumption is that the stochastic term, and hence the data (*X* and *Z*), are normally distributed. To achieve this, the data are typically transformed to a normal distribution by appropriate transforms before the model is fit. The simulation proceeds as follows: (1) An aggregate streamflow is generated from an appropriate linear or nonlinear model or equivalent data set. (2) The simulated aggregate flow is then disaggregated using the above model. (3) The simulated flows are back transformed to the original space. This linear stochastic framework for streamflow disaggregation was first developed by *Valencia and Schaake* [1973] and subsequently modified and improved by several others [*Mejia and Rousselle*, 1976; *Lane*, 1979; *Salas et al.*, 1980; *Stedinger and Vogel*, 1984; *Stedinger et al.*, 1985; *Salas*, 1985; *Santos and Salas*, 1992].
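As a rough illustration of the linear framework described above, the following Python sketch fits a model of the form *X* = *AZ* + *B*ɛ to mean-removed flows and disaggregates one aggregate value. The moment estimators used here (*A* from the cross covariance of *X* and *Z*, and *BB*ᵀ from the residual covariance) are the usual ones for a regression of this form and are an assumption, since the text does not spell them out. The function names are hypothetical, and the sketch works directly in the untransformed space, omitting the normalizing transform.

```python
import numpy as np

def fit_linear_disaggregation(X, Z):
    """Fit X = A Z + B eps on mean-removed data (illustrative sketch).

    X: array (n_years, n_seasons) of disaggregated flows.
    Z: array (n_years,) of aggregate flows, assumed equal to X.sum(axis=1).
    """
    x_mean = X.mean(axis=0)
    z_mean = Z.mean()
    Xc = X - x_mean
    Zc = (Z - z_mean)[:, None]
    n = X.shape[0]
    S_xx = Xc.T @ Xc / (n - 1)          # covariance of the seasonal flows
    S_xz = Xc.T @ Zc / (n - 1)          # cross covariance with the aggregate
    S_zz = Zc.T @ Zc / (n - 1)          # variance of the aggregate (1 x 1)
    A = S_xz @ np.linalg.inv(S_zz)      # regression coefficients, shape (m, 1)
    BBt = S_xx - A @ S_xz.T             # residual covariance
    # BBt is singular when Z is the exact sum of X, so a Cholesky factor may
    # not exist; a symmetric eigendecomposition is used instead.
    w, V = np.linalg.eigh(BBt)
    w = np.clip(w, 0.0, None)           # remove tiny negative round-off values
    B = V * np.sqrt(w)                  # B @ B.T reproduces BBt
    return x_mean, z_mean, A, B

def disaggregate(z, x_mean, z_mean, A, B, rng):
    """Disaggregate one aggregate value z into seasonal flows."""
    eps = rng.standard_normal(B.shape[1])
    return x_mean + A[:, 0] * (z - z_mean) + B @ eps
```

Because the entries of *A* sum to one and the columns of *B* sum to zero when *Z* is the exact sum of *X*, the disaggregated values add up to the aggregate in the space where the model is fit, up to round-off.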

[3] Since these models are fit in the transformed space, the additivity of the disaggregated flows to the aggregate flows in the original space after back transformation is not guaranteed. Hence several adjustments have to be made [e.g., *Lane*, 1982; *Stedinger and Vogel*, 1984; *Grygier and Stedinger*, 1988]. Furthermore, the model is designed to reproduce the statistics in the transformed space but reproduction is not guaranteed in the original space.
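A toy numerical sketch may help show why additivity is lost. It assumes a log transform and a zero-sum perturbation in log space, both chosen only for illustration. It then applies a simple proportional rescaling to restore the sum, which is one possible adjustment and not necessarily the procedure used in the works cited above. The function name is hypothetical.

```python
import numpy as np

def proportional_adjustment(x_back, z_target):
    """Rescale back-transformed flows so that they sum to z_target."""
    x_back = np.asarray(x_back, dtype=float)
    total = x_back.sum()
    if total <= 0:
        raise ValueError("flows must have a positive sum")
    return x_back * (z_target / total)

# Toy flows that add up to an aggregate of 60 in the original space.
x_original = np.array([10.0, 20.0, 30.0])
z_aggregate = x_original.sum()

# Perturb the flows in log space with disturbances that sum to zero there.
log_disturbance = np.array([0.2, -0.1, -0.1])
x_back = np.exp(np.log(x_original) + log_disturbance)

# The back-transformed flows no longer add up to the aggregate.
additivity_error = x_back.sum() - z_aggregate

# Proportional rescaling restores the sum but changes each individual flow.
x_adjusted = proportional_adjustment(x_back, z_aggregate)
```

The rescaling alters each flow by the same factor, so statistics fitted in the transformed space are not exactly those of the adjusted flows, which is the concern raised in the paragraph above.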

[4] Alternative approaches to disaggregation [*Tao and Delleur*, 1976; *Todini*, 1980; *Koutsoyiannis*, 1992; *Koutsoyiannis and Manetas*, 1996; *Koutsoyiannis*, 2001] allow representation of non-Gaussian data directly in the disaggregation scheme to avoid the need for data transformation. These techniques can incorporate the skewness of the historic data into the stochastic term [*Tao and Delleur*, 1976; *Todini*, 1980; *Koutsoyiannis*, 1999]. *Koutsoyiannis* [2001] provides a stepwise disaggregation scheme that incorporates an adjustment procedure that preserves the additivity property and certain higher-order statistics. These methods are iterative in nature and thus computationally intensive, and they still require assumptions of linearity.

[5] Recent advances in nonparametric methods (see *Lall* [1995] for an overview of nonparametric methods and their applications to hydroclimatic data) provide an attractive alternative to linear parametric methods. Unlike the linear approach, where a single linear model is fit to the entire data set, nonparametric methods involve “local” functional fitting: the function is fit to a small number of neighbors at each point. This approach has the ability to capture arbitrary features (nonlinearity, non-normality, etc.) exhibited by the data. Nonparametric methods have been applied to a variety of hydroclimate modeling questions, including stochastic daily weather generation [*Rajagopalan and Lall*, 1999; *Yates et al.*, 2003], streamflow simulation [*Lall and Sharma*, 1996; *Sharma et al.*, 1997; *Prairie et al.*, 2006], streamflow forecasting [*Grantz et al.*, 2005; *Singhrattna et al.*, 2005], and flood frequency estimation [*Moon and Lall*, 1994], to mention a few.

[6] Kernel-estimator-based nonparametric streamflow simulation at a single site was developed by *Sharma et al.* [1997], who also demonstrated its advantage over traditional linear models. *Sharma and O'Neil* [2002] improved on this to capture the interannual dependence. However, kernel methods can be inefficient in higher dimensions, as noted by *Sharma and O'Neil* [2002], and as such are difficult to implement in multivariate problems such as space-time disaggregation in a network. *Lall and Sharma* [1996] developed a K-nearest-neighbor (K-NN) bootstrap approach to time series modeling and applied it to streamflow simulation. Because it is a bootstrap method, values not observed in the historic data will not be generated in the simulations. To address this, a modified version of the K-NN bootstrap was developed by *Prairie et al.* [2005, 2006], and this was further used in streamflow forecasting [*Grantz et al.*, 2005; *Singhrattna et al.*, 2005]. Semiparametric approaches that combine traditional linear modeling and bootstrap methods for streamflow simulation have also been developed [*Souza Filho and Lall*, 2003; *Srinivas and Srinivasan*, 2001].
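The idea behind a K-NN bootstrap for a lag-1 time series can be sketched as follows. This is a minimal illustration and not the cited authors' implementation. The choice of *k* as the square root of the number of candidate pairs, the resampling weights proportional to 1/*j* for the *j*th nearest neighbor, and the function name are assumptions made for the example.

```python
import numpy as np

def knn_bootstrap_series(history, n_steps, k=None, rng=None):
    """Simulate a series by resampling successors of nearest neighbors."""
    history = np.asarray(history, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    n_pairs = history.size - 1              # (current, successor) pairs
    if n_pairs < 1:
        raise ValueError("history needs at least two values")
    if k is None:
        k = max(1, int(round(np.sqrt(n_pairs))))   # assumed heuristic for k
    k = min(k, n_pairs)
    # Assumed weight kernel: closer neighbors are more likely to be picked.
    weights = 1.0 / np.arange(1, k + 1)
    weights /= weights.sum()

    current = history[rng.integers(0, n_pairs)]    # random starting state
    simulated = np.empty(n_steps)
    for t in range(n_steps):
        # Distance from the current state to every value that has a successor.
        distance = np.abs(history[:-1] - current)
        neighbors = np.argsort(distance, kind="stable")[:k]
        pick = neighbors[rng.choice(k, p=weights)]
        current = history[pick + 1]                # successor of chosen neighbor
        simulated[t] = current
    return simulated
```

Every simulated value is a successor drawn from the historic record, which makes concrete the limitation noted above: the plain bootstrap cannot produce values outside the observed data.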

[7] *Tarboton et al.* [1998] developed a kernel-based approach (an extension of the single-site methodology of *Sharma et al.* [1997]) for temporal (i.e., annual to monthly) streamflow disaggregation. *Kumar et al.* [2000] adopted K-NN bootstrap techniques in conjunction with an optimization scheme for spatial and temporal disaggregation of monthly streamflows to daily flows. They indicate that disaggregating monthly flows to daily flows is a higher-dimensional problem that cannot always be well represented by traditional parametric disaggregation techniques. Additionally, daily flows typically display nonlinear flow dynamics that are not adequately modeled with traditional techniques. The optimization framework allows for increased flexibility in specifying the functional relationships the disaggregation scheme needs to preserve, but at a great computational cost. *Srinivas and Srinivasan* [2005] developed a semiparametric multisite disaggregation method that they termed the hybrid moving block bootstrap multisite model (HMM). In this approach a parametric model (such as a linear autoregressive model) is fit to the data, and the residuals from this model are resampled by block bootstrapping (the nonparametric component). This method is able to incorporate the strengths of both parametric and nonparametric models but still requires multiple steps.

[8] In practical terms, there is a need for a robust, simple, and parsimonious approach for space-time streamflow disaggregation that can capture the features exhibited by the data. To this end, we develop here a K-NN-based disaggregation framework. The proposed framework and algorithm are described first, followed by their application to four streamflow sites in the Upper Colorado River basin. We conclude with a summary and a discussion of applications and future directions for this research.