The Colorado River basin experienced the worst drought on record during 2000–2004. Paleoreconstructions of streamflow for the preobservational period show droughts of greater magnitude and duration, indicating that the recent drought is not unusual. The rich information provided by paleoreconstructions should be incorporated in stochastic streamflow models, enabling the generation of realistic flow scenarios required for robust water resources planning and management. However, the magnitudes of reconstructed streamflow have a high degree of uncertainty. This apparent weakness of the paleodata has made their use in water resources planning contentious, despite their availability for many decades. However, few contest the accuracy of hydrologic state (i.e., dry and wet periods). A key question is how to combine the long paleoreconstructed streamflow information of lower reliability with the shorter observational data to develop a framework for streamflow simulation. We propose a unique stochastic streamflow simulation framework combining these two data sets. This has two components: (1) a nonhomogeneous Markov chain model, developed using the paleodata, which is used to simulate the hydrologic state, and (2) a nonparametric K-nearest neighbor (K-NN) time series bootstrap of observational flow magnitudes conditioned on the hydrologic state, thus combining the respective strengths of the two data sets. The framework is demonstrated for the Lees Ferry, Arizona, stream gauge on the Colorado River. The simulations show the ability to reproduce relevant statistics of the observational period and generate a rich variety of wet and dry sequences for use in sustainable management of water resources.
 Clearly, the rich information provided by paleoreconstructed streamflows should be incorporated in stochastic streamflow models to enable the generation of a realistic variety of plausible flow scenarios. However, the magnitudes of reconstructed streamflow have a high degree of uncertainty. Typically, a regression model is fit to the observed streamflow with a suite of tree ring observations as the predictors. This fitted model is then used to estimate streamflows in the preobservational period using the tree ring observations [Meko et al., 1995]. The reconstructed streamflows can be sensitive to the choice of model, as demonstrated by Hidalgo et al. [2000]. This apparent weakness of the paleoreconstructed flow data has made their use in a water resources planning context contentious, despite the availability of paleoreconstructed data for many decades. In spite of these apparent weaknesses, few dispute the duration and frequency of dry and wet periods (i.e., the hydrologic state) obtained from the reconstructions [Woodhouse et al., 2006]. The key question is how to combine the long paleoreconstructed streamflow information of lower reliability with the shorter but reliable observational data to develop a framework for simulation of streamflow scenarios.
 To address this question, we propose a new two-step process in which the hydrologic state (i.e., wet or dry) is modeled using the paleoreconstruction data and the flow magnitudes derived from the observational data. Specifically, a nonhomogeneous Markov chain model [Rajagopalan et al., 1996, 1997] is built on the paleodata that is then used to simulate the hydrologic state. The flow magnitudes are then generated conditioned on the simulated hydrologic state using a K-nearest neighbor (K-NN) conditional time series bootstrap [Lall and Sharma, 1996], thereby using the strengths of both of these data sets. The data sets used, the proposed framework, and the application to the Lees Ferry, Arizona, stream gauge on the Colorado River are described in the following sections.
2. Data Sets
 As mentioned earlier, two data sets, paleoreconstructed streamflow and observed flows, are used in this study. These are described below.
2.1. Natural Streamflow
 The natural streamflow data for the Colorado River basin are developed by the Bureau of Reclamation (Reclamation) and updated regularly. Annual updates addressing data changes and additions are typical. Naturalized streamflows are computed by removing anthropogenic impacts (e.g., reservoir regulation and consumptive water use) from the recorded historic flows. Prairie and Callejo present a detailed description of methods and data used for the computation of natural flows in the Colorado River basin. This study uses the annual water year (October–September) natural streamflow at Lees Ferry, Arizona, for the period 1906–2005.
2.2. Paleoreconstructed Streamflow
 This study also uses the annual water year streamflow reconstructions from tree ring information at the Lees Ferry, Arizona, gauge, completed by Woodhouse et al. [2006] for the period 1490–1997. Tree ring widths are influenced by climate and available soil moisture and thus are good integrators of the weather fluctuations, just as streamflow is a watershed integration of hydrologic and climatologic processes. Consequently, the tree ring widths are well correlated with annual runoff. To gather ring width data, a series of trees is cored at multiple locations, chosen such that the tree species have annual rings sensitive to moisture availability. Selecting the species and the location is very important for this effort [Meko et al., 1995]. Two core samples are taken from each tree for cross dating, and the ring widths are measured, obtaining the chronology of tree ring widths. The attractive aspect of tree-ring-based reconstructions, unlike other paleoproxy data, is that trees that put on annual rings have natural dating, with the outer ring corresponding to the current year and the subsequent inner rings corresponding to past years. A standard series of techniques [Stokes and Smiley, 1968; Swetnam et al., 1985] is employed to process the ring width series. Typically, the series is first detrended to remove the effects of reduced ring width with aging. Next, the ring width series from various cores at a single location are combined to develop a “site chronology” [Cook et al., 1990]. The site chronology is related to observed streamflow during the overlap period; typically, a multiple linear regression model is fit [Weisberg, 1985]. For the Colorado River at the Lees Ferry, Arizona, gauge, the regression model developed by Woodhouse et al. [2006], using all the available pool of chronologies (30 in total), explains approximately 84% of the annual variance of the observed streamflow.
The fitted regression model is then used to estimate the streamflow during the preobservation period when tree ring information is available, thus obtaining the reconstructed streamflow series.
 It is known that tree ring widths are influenced by variables other than moisture availability, especially during high streamflow periods, which degrades their ability to accurately represent high flow years. Further, different data sets and techniques to process tree ring information can result in substantial differences in the reconstructed flows [Hidalgo et al., 2000]. This can be seen in Figure 2, where four different streamflow reconstructions at the Lees Ferry, Arizona, gauge are shown, including the earliest reconstruction of Stockton and Jacoby, later reconstructions by Hidalgo et al. [2000], that of Hirschboeck and Meko as part of the Salt River Project, and the most recent reconstruction by Woodhouse et al. [2006]. Each reconstruction used a different set of tree ring chronologies and different processing methods. Of particular interest is the increased severity of drought and reduced overall mean displayed by the Hidalgo reconstruction. Unfortunately, the variability across reconstructions has not helped instill confidence in the use of these data by policy makers and water managers in the Colorado River basin, despite growing interest in using them. Despite their differences, reconstructions tend to agree quite well on “wet” and “dry” years [Woodhouse et al., 2006], as seen in Figure 3. We found that three or more reconstructions agree on the hydrologic state 88% of the time, while all four methods agree 65% of the time on an annual basis. This offers the potential to use the paleoreconstructed streamflows to model the hydrologic state (i.e., wet or dry) of the system and use the observational data for the flow magnitude. This forms the basis of our proposed framework.
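 The agreement percentages quoted above can be computed along the following lines. This is a minimal sketch, assuming each reconstruction has already been converted to a 0/1 (dry/wet) series over a common set of years; the function name `agreement_rates` and the four short series are hypothetical and are not the reconstructions used in this study.

```python
def agreement_rates(state_series):
    """Fraction of years in which >= 3 series agree, and in which all agree.

    state_series: list of equal-length 0/1 lists, one per reconstruction.
    """
    n_years = len(state_series[0])
    n_series = len(state_series)
    three_or_more = all_agree = 0
    for year_states in zip(*state_series):
        wet = sum(year_states)
        majority = max(wet, n_series - wet)  # size of the larger camp
        if majority >= 3:
            three_or_more += 1
        if majority == n_series:
            all_agree += 1
    return three_or_more / n_years, all_agree / n_years


# Hypothetical states for four reconstructions over four years.
recon = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
]
rate_three, rate_all = agreement_rates(recon)
```

 With four binary series a year either splits 2–2 or has at least three series in agreement, so the first rate is simply one minus the frequency of even splits.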
3. Proposed Framework
 As mentioned above, the proposed framework combines the paleoreconstructed streamflows with the observational data to simulate robust streamflow scenarios for use in water resources management. The paleoreconstructed data are used to model the hydrologic state of the system. The median of the observed flows is used as a threshold: a period is defined as wet if its flow is greater than the threshold and dry if its flow is less. Epochs of wet and dry periods identified using this criterion are illustrated in Figures 2 and 3. They illustrate the persistence in wet/dry regimes that suggests a Markov chain based model. Because the state transition appears to vary through time, a nonhomogeneous Markov chain modeling approach is appropriate. The streamflow magnitudes are then simulated from the conditional probability density function, given the wet or dry state, using a nonparametric K-nearest neighbor bootstrap approach. The framework is shown in Figure 4. The two components of the framework, along with background information, are described below. Hereinafter we refer to this framework as nonparametric paleoconditioning (NPC).
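 The state definition can be illustrated with a short sketch. It assumes that the same threshold, the median of the observed flows, is applied to both the observed and the paleoreconstructed series; the flow values are hypothetical and `classify_states` is an illustrative name.

```python
from statistics import median


def classify_states(flows, threshold):
    """Return 1 (wet) where the flow exceeds the threshold, else 0 (dry)."""
    return [1 if q > threshold else 0 for q in flows]


# Hypothetical annual flows in MAF, for illustration only.
observed = [12.0, 18.5, 15.2, 9.8, 20.1, 14.0]
paleo = [13.1, 16.4, 14.7, 11.0]

threshold = median(observed)  # median of the observed record
observed_states = classify_states(observed, threshold)
paleo_states = classify_states(paleo, threshold)
```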
3.1. Modeling the Hydrologic State
 Markov chains have been extensively used to model daily precipitation occurrence [Gabriel and Neumann, 1962; Todorovic and Woolhiser, 1975; Smith and Schreiber, 1974; Salas, 1993, and references within]. Typically, for a two-state (wet, dry) first-order model (i.e., state transition at the current time step depends on the previous state), the transition probabilities are directly estimated from the data by counting the proportion of transitions to a wet year from a dry year, Pdw, and the probability of a wet year followed by a dry year, Pwd. The probability of a dry year followed by a dry year can be obtained as Pdd = 1 − Pdw; likewise, the probability of a wet year followed by a wet year can be obtained as Pww = 1 − Pwd. The transition probabilities can be readily used to simulate the hydrologic states and consequently, their frequencies. If these transition probabilities are assumed to be stationary and calculated from the entire data, then it is a “stationary” Markov chain. Here, though (Figures 2 and 3), the frequencies of wet and dry periods are varying (i.e., nonstationary) over time.
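 For reference, the stationary two-state, first-order case described above amounts to counting transitions. The sketch below is illustrative only: the state series is hypothetical, and the function names are not taken from any cited work.

```python
import random


def transition_probabilities(states):
    """Estimate P_dw and P_wd by counting transitions in a 0/1 series."""
    n_dry = n_dry_to_wet = n_wet = n_wet_to_dry = 0
    for prev, cur in zip(states, states[1:]):
        if prev == 0:
            n_dry += 1
            n_dry_to_wet += cur
        else:
            n_wet += 1
            n_wet_to_dry += 1 - cur
    p_dw = n_dry_to_wet / n_dry if n_dry else 0.0
    p_wd = n_wet_to_dry / n_wet if n_wet else 0.0
    return p_dw, p_wd


def simulate_states(p_dw, p_wd, start, length, rng):
    """Simulate a stationary two-state chain; P_dd and P_ww are complements."""
    states = [start]
    for _ in range(length - 1):
        p_wet = p_dw if states[-1] == 0 else 1.0 - p_wd
        states.append(1 if rng.random() < p_wet else 0)
    return states


states = [0, 0, 1, 1, 1, 0, 1, 0, 0, 1]  # hypothetical dry/wet sequence
p_dw, p_wd = transition_probabilities(states)
simulated = simulate_states(p_dw, p_wd, start=0, length=25, rng=random.Random(1))
```

 Because the probabilities are estimated once from the whole series, this chain cannot reproduce the time-varying wet and dry frequencies noted in Figures 2 and 3.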
 Nonparametric alternatives [e.g., Rajagopalan et al., 1996, 1997; Mehrotra et al., 2004; Mehrotra and Sharma, 2005] offer a more general and flexible approach. In particular, here we use the nonhomogeneous Markov model (NHM) developed by Rajagopalan et al. [1996], in which the transition probability at any time t is estimated as a weighted average of the transitions within a window of size H centered on t. The window size H is obtained from objective criteria. The NHM was developed to model a daily precipitation process and was subsequently applied to modeling the occurrence of El Niño–Southern Oscillation [Rajagopalan et al., 1997]. We adapt the NHM framework for modeling the streamflow states, as described below.
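 The transition probabilities, $P_{dw}(t)$ and $P_{wd}(t)$, for a given year are estimated by a discrete nonparametric kernel estimator given as

$$P_{dw}(t) = \frac{\sum_{i=1}^{n} K\left(\frac{t - t_i}{h_{dw}}\right)\left[1 - S_{t_i - 1}\right] S_{t_i}}{\sum_{i=1}^{n} K\left(\frac{t - t_i}{h_{dw}}\right)\left[1 - S_{t_i - 1}\right]} \tag{1}$$

$$P_{wd}(t) = \frac{\sum_{i=1}^{n} K\left(\frac{t - t_i}{h_{wd}}\right) S_{t_i - 1}\left[1 - S_{t_i}\right]}{\sum_{i=1}^{n} K\left(\frac{t - t_i}{h_{wd}}\right) S_{t_i - 1}} \tag{2}$$

where $K(\cdot)$ = the kernel function, $S_t$ = system hydrologic state (1 = wet, 0 = dry) at time t, $S_{t-1}$ = system hydrologic state at time t − 1, $h_{(\cdot)}$ = the kernel bandwidth, t = year of interest, and n = the number of values in the window $t - h_{(\cdot)}$ to $t + h_{(\cdot)}$. The discrete quadratic kernel function developed by Rajagopalan and Lall is used, which is given as

$$K(x) = \frac{3h}{4h^2 - 1}\left(1 - x^2\right), \quad |x| \le 1 \tag{3}$$

where $x = (t - t_i)/h_{(\cdot)}$ measures the distance of event $t_i$ from the year of interest t within the bandwidth $h_{(\cdot)}$, where $h_{(\cdot)}$ is an integer. The weights from the kernel function are nonnegative and sum to unity. It can be seen that the estimates of transition probabilities at any year t are based only on the transitions within a window $t - h_{(\cdot)}$ to $t + h_{(\cdot)}$.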
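 The kernel-weighted estimate of a time-varying transition probability can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the normalization 3h/(4h² − 1) is chosen so that the quadratic weights over the integer offsets −h, ..., h sum to one, and the names `quad_kernel` and `nhm_transition` are hypothetical.

```python
def quad_kernel(x, h):
    """Discrete quadratic kernel; weights over offsets -h..h sum to one."""
    if abs(x) > 1:
        return 0.0
    return 3.0 * h / (4.0 * h * h - 1.0) * (1.0 - x * x)


def nhm_transition(states, t, h, from_state):
    """Kernel-weighted probability of leaving from_state at index t.

    from_state=0 gives P_dw(t); from_state=1 gives P_wd(t).
    Only transitions inside the window t-h..t+h contribute.
    """
    num = den = 0.0
    first = max(1, t - h)
    last = min(len(states) - 1, t + h)
    for i in range(first, last + 1):
        if states[i - 1] != from_state:
            continue  # year i did not start from the state of interest
        w = quad_kernel((t - i) / h, h)
        den += w
        if states[i] != from_state:
            num += w  # a switch occurred in year i
    return num / den if den > 0 else float("nan")


# Hypothetical state series: strictly alternating dry/wet years.
alternating = [0, 1, 0, 1, 0, 1, 0, 1]
p_dw_mid = nhm_transition(alternating, t=4, h=3, from_state=0)
p_wd_mid = nhm_transition(alternating, t=4, h=3, from_state=1)
```

 Years at the edge of the window (|x| = 1) receive zero weight under this kernel, so a bandwidth of h effectively uses the 2h − 1 central years.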
 The transition probability estimators (1) and (2) are fully defined once the bandwidth $h_{(\cdot)}$ is determined for each. The optimal bandwidth is selected objectively by a least squares cross-validation (LSCV) procedure [Scott, 1992], adapted to the NHM case by Rajagopalan et al. [1996],

$$\mathrm{LSCV}(h) = \frac{1}{n}\sum_{i=1}^{n}\left[1 - \hat{P}_{-t_i}(t_i)\right]^2 \tag{4}$$

where n = the number of observations ($n_{dw}$ or $n_{wd}$), and $\hat{P}_{-t_i}(t_i)$ = the estimate of the transition probability (wd or dw) at year $t_i$, based on data ranging from $t_i - h$ to $t_i + h$, with the exclusion of $t_i$ (the transition at $t_i$ should not be included when attempting to approximate that value). The 1 in equation (4) results from an assumption that the prior probability of transition is 1 for the years on which a transition has occurred. The value of h that minimizes the LSCV function is selected as the optimal bandwidth. The bandwidths $h_{dw}$ and $h_{wd}$ are objectively determined and subsequently used in the estimators (1) and (2) to estimate the transition probabilities for each year. The LSCV function does not always yield a clear minimum; therefore it is preferable to find an estimate for all available transitions to obtain a range of bandwidths. When a clear minimum is not found, it is recommended that a minimum delta h value (i.e., 0.0001) be determined to objectively find a minimum LSCV based on reaching the minimum delta h. We chose the clear minimum found within each complementary transition, but found little sensitivity over the range of possible bandwidths.
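 A leave-one-out bandwidth search of this kind might look like the sketch below. Several details are assumptions made for illustration: the score is averaged over the years in which the transition of interest occurred, a year whose leave-one-out window contains no usable data is scored as an estimate of 0, and the kernel normalization is the one that makes the weights sum to one. None of the names come from the cited work.

```python
def quad_kernel(x, h):
    """Discrete quadratic kernel (normalization is an assumption)."""
    if abs(x) > 1:
        return 0.0
    return 3.0 * h / (4.0 * h * h - 1.0) * (1.0 - x * x)


def loo_estimate(states, t, h, from_state):
    """Transition probability at t estimated without the transition at t."""
    num = den = 0.0
    for i in range(max(1, t - h), min(len(states) - 1, t + h) + 1):
        if i == t or states[i - 1] != from_state:
            continue
        w = quad_kernel((t - i) / h, h)
        den += w
        if states[i] != from_state:
            num += w
    return num / den if den > 0 else 0.0  # assumption: no data scores as 0


def lscv(states, h, from_state):
    """Mean squared gap between 1 and the leave-one-out estimate."""
    years = [i for i in range(1, len(states))
             if states[i - 1] == from_state and states[i] != from_state]
    if not years:
        return float("inf")
    errors = [(1.0 - loo_estimate(states, t, h, from_state)) ** 2 for t in years]
    return sum(errors) / len(errors)


def best_bandwidth(states, candidates, from_state):
    """Return the candidate bandwidth with the smallest LSCV score."""
    return min(candidates, key=lambda h: lscv(states, h, from_state))


alternating = [0, 1, 0, 1, 0, 1, 0, 1]  # hypothetical state series
h_dw = best_bandwidth(alternating, [2, 3], from_state=0)
```

 In this toy series a bandwidth of 2 leaves only zero-weight edge years once the target year is excluded, which is why the wider window is preferred.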
 The best Markov chain model order is generally selected as the minimizer of the Akaike information criterion [Gates and Tong, 1976]. For the Lees Ferry paleoreconstructed data we found the two-state, first-order model to be optimal.
3.2. Modeling the Flow Magnitudes
 The streamflow magnitudes, as mentioned earlier, are modeled based on the observed data and conditioned upon the hydrologic state simulated using the paleodata. This model can be described as the conditional probability density function (PDF),

$$f\left(x_t \mid S_t, S_{t-1}, x_{t-1}\right)$$

where $x_t$ is the flow at the current time t, conditioned on the current system state $S_t$, the previous system state $S_{t-1}$, and the previous flow $x_{t-1}$.
 Simulation from this conditional PDF is achieved by a K-NN bootstrap method [Lall and Sharma, 1996; Rajagopalan and Lall, 1999]. Typically, the K nearest neighbors of the current feature vector [St, St−1, xt−1] are identified in the observational data. One of the neighbors is selected using a weight function that gives the highest probability to the nearest neighbor and the lowest to the farthest. The streamflow of the year that sequentially follows the selected neighbor is the simulated value for the current time.
 This case is unique in that the feature vector includes discrete and continuous variables. Further, the discrete variables indicate system state as 0 or 1, i.e., dry or wet, while the continuous variable is a considerably larger value. If this disparity in magnitude is not considered in the neighbor choice, the state information will not influence the neighbor choice. The neighbor would be chosen based solely on xt−1. Therefore determination from the feature vector [St, St−1, xt−1] is split into two steps. First the discrete variables are identified as members in one of the four categories (ww, wd, dw, dd) identified from the state vector [St, St−1]. In the second step, the K-nearest neighbors of xt−1 that lie within the appropriate category are identified. The flow for the following year, xt, corresponding to the neighbor selected for xt−1, is then sampled.
 In this work, Kj = nj, where j = 1, ..., 4 indexes the four state categories and nj is the number of values in category j. With a larger observational data set the number of nearest neighbors can also be based on the heuristic scheme $K = \sqrt{n}$ [Lall and Sharma, 1996], following the asymptotic arguments of Fukunaga. Objective criteria such as generalized cross validation (GCV) can also be used [Lall and Sharma, 1996; Prairie et al., 2005]. The Kj neighbors were weighted with the function

$$W(i) = \frac{1/i}{\sum_{m=1}^{K_j} 1/m}, \quad i = 1, \ldots, K_j$$

where i is the rank of the neighbor, with i = 1 denoting the nearest.
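 The two-step neighbor selection can be sketched as follows, assuming the rank-based 1/i weighting of Lall and Sharma [1996] and Kj equal to the number of observed years in the matching state category. The flows are hypothetical, and the function names are illustrative.

```python
import random


def knn_weights(k):
    """Rank-based weights: the nearest neighbor (rank 1) is most likely."""
    total = sum(1.0 / i for i in range(1, k + 1))
    return [(1.0 / i) / total for i in range(1, k + 1)]


def sample_flow(obs_flows, obs_states, s_prev, s_cur, x_prev, rng):
    """Draw x_t given (S_t, S_{t-1}, x_{t-1}) from the observed record."""
    # Step 1: keep observed years t whose state pair matches the category.
    candidates = [t for t in range(1, len(obs_flows))
                  if obs_states[t - 1] == s_prev and obs_states[t] == s_cur]
    if not candidates:
        raise ValueError("no observed years in this state category")
    # Step 2: rank candidates by the distance of their previous flow to x_prev.
    candidates.sort(key=lambda t: abs(obs_flows[t - 1] - x_prev))
    weights = knn_weights(len(candidates))  # K_j = n_j
    chosen = rng.choices(candidates, weights=weights, k=1)[0]
    return obs_flows[chosen]


# Hypothetical observed flows (MAF) and their states for a 14.6 MAF threshold.
obs_flows = [12.0, 18.5, 15.2, 9.8, 20.1, 14.0]
obs_states = [0, 1, 1, 0, 1, 0]
draw = sample_flow(obs_flows, obs_states, s_prev=0, s_cur=1, x_prev=10.0,
                   rng=random.Random(7))
```

 Filtering on the discrete states first prevents the much larger flow magnitudes from swamping the 0/1 state variables in a single distance metric.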
3.3. Implementation Algorithm
 The complete framework combines the two models. The simulation proceeds as follows. First, a simulation horizon is identified, which is application dependent. Suppose a T-year horizon is chosen.
 1. Randomly resample a block of T years from the paleoreconstructed streamflows, say 1651–1680.
 2. Generate flow states S(t) where t = 1, 2,…,T using the transition probabilities of the resampled years from step 1 above.
 3. Generate flow magnitudes x(t) for each t = 1,2,…,T from the conditional PDF f(xt∣St, St−1, xt−1) using the K-NN bootstrap approach described in the previous section.
 4. Repeat steps 2 and 3 to obtain as many simulations as required.
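 The four steps above can be sketched end to end as follows. This is a compact illustration, not the implementation used in this study. Assumptions made here include: the kernel normalization, a fallback transition probability of 0.5 when a window has no usable data, a fallback to matching only the current state when a state-pair category is empty, and a random starting flow drawn from observed years in the starting state. All names and data are hypothetical.

```python
import random
from statistics import median


def kernel(x, h):
    return 3.0 * h / (4.0 * h * h - 1.0) * (1.0 - x * x) if abs(x) <= 1 else 0.0


def p_switch(states, t, h, frm):
    """Kernel-weighted probability of leaving state frm at paleo index t."""
    num = den = 0.0
    for i in range(max(1, t - h), min(len(states) - 1, t + h) + 1):
        if states[i - 1] == frm:
            w = kernel((t - i) / h, h)
            den += w
            if states[i] != frm:
                num += w
    return num / den if den > 0 else 0.5  # assumption: uninformative fallback


def knn_pick(obs_flows, obs_states, s_prev, s_cur, x_prev, rng):
    """Step 3: conditional K-NN draw with rank-based 1/i weights."""
    cands = [j for j in range(1, len(obs_flows))
             if obs_states[j - 1] == s_prev and obs_states[j] == s_cur]
    if not cands:  # assumption: relax to the current state only
        cands = [j for j in range(1, len(obs_flows)) if obs_states[j] == s_cur]
    cands.sort(key=lambda j: abs(obs_flows[j - 1] - x_prev))
    weights = [1.0 / rank for rank in range(1, len(cands) + 1)]
    return obs_flows[rng.choices(cands, weights=weights, k=1)[0]]


def simulate_trace(paleo_states, obs_flows, threshold, T, h_dw, h_wd, rng):
    obs_states = [1 if q > threshold else 0 for q in obs_flows]
    # Step 1: pick a random block of T consecutive paleo years.
    start = rng.randrange(1, len(paleo_states) - T + 1)
    s_prev = paleo_states[start - 1]
    x_prev = rng.choice([q for q, s in zip(obs_flows, obs_states) if s == s_prev])
    trace = []
    for t in range(start, start + T):
        # Step 2: simulate the state from the block's transition probabilities.
        if s_prev == 0:
            s_cur = 1 if rng.random() < p_switch(paleo_states, t, h_dw, 0) else 0
        else:
            s_cur = 0 if rng.random() < p_switch(paleo_states, t, h_wd, 1) else 1
        x_prev = knn_pick(obs_flows, obs_states, s_prev, s_cur, x_prev, rng)
        s_prev = s_cur
        trace.append(x_prev)
    return trace


rng = random.Random(0)
paleo_states = [rng.randint(0, 1) for _ in range(80)]   # hypothetical paleo states
obs_flows = [rng.uniform(8.0, 22.0) for _ in range(30)]  # hypothetical flows, MAF
threshold = median(obs_flows)
# Step 4: repeat to obtain as many traces as required.
traces = [simulate_trace(paleo_states, obs_flows, threshold, 20, 5, 5, rng)
          for _ in range(3)]
```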
4. Model Evaluation
 The proposed framework (NPC) is applied to the paleoreconstructed streamflows (1490–1997) and observed natural flows (1906–2005) at Lees Ferry, Arizona, on the Colorado River. For this work, 500 simulations, each 100 years in length (same as the length of the observed flows), were generated.
 A suite of basic distributional statistics are computed including the annual (1) mean, (2) standard deviation, (3) coefficient of skew, (4) maximum, (5) minimum, and (6) lag-1 autocorrelation. Surplus and drought statistics include the average length surplus (avgLS), average length drought (avgLD), average surplus (avgS), and deficit (avgD) volume. Surplus (drought) is defined as values above (below) a threshold, here the median of the observed record. Figure 5 describes the computation of these surplus and drought statistics based on the threshold.
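 One way to compute the four spell statistics is sketched below. It assumes that a spell is a run of consecutive years on one side of the threshold and that avgS and avgD are the mean per-spell accumulated departures from the threshold; the exact definitions are those of Figure 5, which this sketch does not attempt to reproduce. The flows are hypothetical.

```python
def spell_statistics(flows, threshold):
    """Return avgLS, avgLD, avgS and avgD for a flow sequence."""
    spells = {"surplus": [], "drought": []}  # lists of (length, volume)
    kind, length, volume = None, 0, 0.0
    for q in flows:
        cur = "surplus" if q > threshold else "drought"
        if kind is not None and cur != kind:
            spells[kind].append((length, volume))  # close the finished spell
            length, volume = 0, 0.0
        kind = cur
        length += 1
        volume += abs(q - threshold)
    if kind is not None:
        spells[kind].append((length, volume))

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return {
        "avgLS": mean([n for n, _ in spells["surplus"]]),
        "avgLD": mean([n for n, _ in spells["drought"]]),
        "avgS": mean([v for _, v in spells["surplus"]]),
        "avgD": mean([v for _, v in spells["drought"]]),
    }


stats = spell_statistics([16, 17, 12, 11, 10, 18], threshold=15)
```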
 The results are displayed as box plots where the box represents the interquartile range (IQR) and whiskers extend to the 5th and 95th percentiles of the simulations and outliers are shown as points beyond the whiskers. The statistics of the observed record are represented as a triangle, and the statistics of the paleoreconstructed record are represented as a circle. Performance on a given statistic is judged as good when the observed or paleostatistic, depending on the statistic of interest, falls within the interquartile range of the box plots, while increased variability is indicated by a wider box plot.
 First, the four sets of time-varying transition probabilities estimated from the NHM estimator (equations (1), (2), (3), and (4)) over the paleoperiod are shown in Figure 6. The optimal bandwidth minimizing the LSCV was found to be 37 years for the wet-wet transition and 19 years for the dry-dry transition. The other two transition probabilities are complements of these. The epochal behavior in the transition probabilities is quite apparent. We draw attention to two epochs. The first is the early 1900s, when the probability of transition to a wet state is higher than 0.5 and the probability of transition to a dry state is much lower than 0.5; this is the wettest epoch in the past 500 years and is also the epoch when the water sharing compact agreements on the Colorado River basin were developed. In contrast, the second is the early 1600s, when the probability of transition to a dry state is much higher than 0.5; this is one of the driest periods in the paleorecord. There is also a steady decline in the probability of transition to a wet state in recent decades and a corresponding increase in the probability of transition to a dry state. Thus using these varied transition probabilities will provide a richer variety of wet and dry sequences, as seen in the results that follow.
 The simulations capture the basic distributional statistics of the observed streamflow within the IQR (Figure 7). This is consistent with the methodology in that the K-NN bootstrap approach resamples the observed data. Since the generated sequences are of the same length as the observed, the basic statistics of the observed streamflows are well captured, as to be expected. These distributional statistics of the paleorecord are not expected to be captured.
 Box plots of surplus and drought statistics are shown in Figure 8, along with the corresponding values from the observed record represented as a triangle and those from the paleorecord represented as a circle. The simulations from NPC generate longer drought and surplus sequences relative to the observed record, which can be seen by the observed statistics falling low within or below the IQR in Figure 8. The avgLS and avgLD of the paleodata are well reproduced in the NPC simulations. The avgS and avgD are influenced by both the magnitudes of flow, which are resampled from the observed record, and the state sequences from the paleorecord; therefore these statistics represent a blend of both these records. For comparison, a simple K-NN lag-1 model as described by Lall and Sharma [1996] was used to resample the observed natural flow record with no influence from the paleorecord. Figure 9 shows the drought and surplus statistics from this simple model. The avgLS and avgLD as well as the avgS and avgD of the observed record are captured well within the IQR from this simulation, but the corresponding statistics from the paleorecord are not, as is to be expected. Also, these simulations display reduced variability, seen as a tighter IQR, compared with Figure 8. The NPC framework is able to produce more varied drought and surplus sequences than can be obtained from resampling only the observed data.
 The distributions of surplus and drought lengths are displayed as histograms in Figures 10 and 11, respectively, for the observed, paleo, and NPC simulations. The histogram from the NPC simulations appears to be a smoothed version of that from the paleorecord, and the longest wet and dry spell lengths in the observed record are limited. Visually, the tail behavior of the histograms from the paleorecord and NPC simulations can be seen to be different from that of the observed record. The risk of a 6-year or longer dry spell (i.e., probability of exceedance) is 0% from the observed, 10.1% from the paleo, and 8.6% from the NPC simulations. The NPC provides a better sense of this risk, while the observed data show no risk of this. Also, the tails of the NPC drought and surplus plots extend to include event durations not seen in either the paleo or the observed record. This is a new contribution to the field and is valuable in quantifying risk and planning for extreme events. The impacts on results from a decision support system that incorporated alternate hydrologic simulations, including NPC simulations, were published by Bureau of Reclamation [2007] for the Colorado River basin operations. That study found that use of NPC simulations indicated greater risk of lower reservoir conditions than when only using simulations based on the observed or paleorecord alone. These findings indicate the importance of developing sequences of flows not seen in the observed period but probable based on state information from the paleorecord.
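 The exceedance risk quoted above can be computed from a state sequence as sketched below. The sketch assumes the risk is the fraction of dry spells that last at least the given number of years; the state series is hypothetical.

```python
def spell_lengths(states, target):
    """Lengths of consecutive runs of `target` (0 = dry, 1 = wet)."""
    lengths, run = [], 0
    for s in states:
        if s == target:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


def exceedance_probability(lengths, min_length):
    """Fraction of spells lasting at least min_length years."""
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n >= min_length) / len(lengths)


states = [0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0]  # hypothetical dry/wet years
dry = spell_lengths(states, target=0)
risk_6yr = exceedance_probability(dry, min_length=6)
```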
 In the Colorado River basin the critical sequence of concern is a series of droughts connected over 12 years with surplus years interspersed. Such sequences are not represented in the drought statistics described above; Timilsena et al. address this through the use of a 5-year moving average to determine the hydrologic variable and thus periods of drought. Furthermore, the drought and surplus statistics estimated above are based on a preselected threshold (here the median streamflow of the observed period), so the results are sensitive to this choice. To avoid this, a better approach is to determine the storage required for a given streamflow sequence to meet various demand levels. This incorporates the effect of multiple linked droughts and thus is more realistic in representing critical droughts. The algorithm used for this purpose, termed the sequent peak algorithm [Loucks et al., 1981], is given as
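$$S_i' = \begin{cases} S_{i-1}' + d - y_i & \text{if positive} \\ 0 & \text{otherwise} \end{cases} \qquad S_c = \max\left(S_1', S_2', \ldots, S_N'\right)$$

where $S_i'$ is the storage at time step i, d is the demand or yield, $y_i$ is the streamflow from a sequence of N values at time i, and $S_c$ is the storage capacity. This algorithm is also widely used for designing reservoir capacities.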
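 A single-pass form of the sequent peak calculation is sketched below, assuming the standard recursion in which the required storage accumulates whenever demand exceeds inflow and resets to zero otherwise, with an initial storage requirement of zero. The flows and demand are hypothetical.

```python
def sequent_peak(flows, demand):
    """Storage capacity needed to meet a constant demand over the sequence."""
    storage = 0.0   # running storage requirement S'_i
    required = 0.0  # largest requirement seen so far, S_c
    for y in flows:
        storage = max(0.0, storage + demand - y)
        required = max(required, storage)
    return required


capacity = sequent_peak([10, 20, 12, 11, 25], demand=15)
```

 In this example the two consecutive low-flow years (12 and 11) accumulate a requirement of 7, larger than the requirement of 5 from the isolated first year, which illustrates how linked droughts drive the result.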
 The algorithm is run for various demand (yield) levels with the historic flow (triangle) and with each trace of the 500 simulations (box plots), as shown in Figure 12. The sequences generated from the NPC framework introduce significant and realistic flow variability and, as a result, reduced system reliability. For example, consider a demand of 16.5 million acre feet (MAF; 1 MAF = 1.233 × 10^9 m^3). To reliably meet this demand, based on the historic inflow sequence (triangle), a storage capacity of 325 MAF is required. The box plot shows considerable variability in the required storage capacity based on the 500 traces simulated from the combined framework. Furthermore, the box plot, shown as a PDF (Figure 13a) or a cumulative distribution function (CDF) (Figure 13b), can easily be used to find the reliability. For a storage capacity of 60 MAF (the approximate current storage capacity of the Colorado River basin), a demand of 16.5 MAF cannot be met 98.9% of the time. The reliability is the area under the PDF curve below 60 MAF, which is 1 minus the area of the hatched region in Figure 13a, or 1 − 0.989 = 0.011 as read from the CDF. The reliability of alternate storage capacities can be found from Figure 13a or 13b in a similar manner.
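 Reading the reliability from the empirical CDF amounts to counting the traces whose required storage does not exceed the available capacity. A minimal sketch, with hypothetical required-storage values rather than results from this study:

```python
def reliability(required_storages, capacity):
    """Fraction of traces whose required storage is within the capacity."""
    within = sum(1 for s in required_storages if s <= capacity)
    return within / len(required_storages)


# Hypothetical required storages (MAF) from four simulated traces.
required = [50.0, 70.0, 90.0, 40.0]
rel = reliability(required, capacity=60.0)
failure = 1.0 - rel  # fraction of traces in which the demand is not met
```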
 The sequent peak method assumes that the demand level is constant through time and must be met in all years. In real operations, however, this is not the case. As a result, the reliability estimates obtained above tend to be too simplistic and conservative and provide only a coarse representation of the actual system reliability. Therefore we urge caution in using these results to draw policy implications. To fully represent the actual operations of the water resources in a river basin, a decision support system that incorporates variable demand schedules, the proper topographic layout of the river system reservoirs and diversion points, and operating policies must be used. This will help provide realistic estimates of reliability for the various decision components of the system, as demonstrated by Prairie [2006] and Bureau of Reclamation [2007, Appendix N].
 As seen from the results, the utility of the proposed framework of combining information from paleoreconstructions and observations is to produce a rich variety of wet and dry spells, which are crucial for robust water resources planning. Investigation of the spell distribution shows that the combination approach generates a higher risk of extended wet and dry spells. This risk will have a significant impact on the water resources management in the basin, especially when the current modeling framework does not model events greater than 5 years in length. The spell variability generated here is much richer than what can be obtained from traditional time series modeling of the observed data [Prairie, 2006].
6. Summary and Discussion
 A novel framework for combining information from multiple sources in generating scenarios was developed. The methodology is data driven, flexible, and easy to implement. Other variations of the framework are possible, especially for generating state information, such as (1) fitting a stationary Markov chain separately on different epochs specified by the user, or (2) bootstrapping blocks of paleodata and using the state information.
 The presented framework combines the long paleoreconstructed streamflow information of lesser reliability with the shorter but reliable observational data. The framework has two components: (1) a nonhomogeneous Markov chain model developed on the paleodata that is then used to simulate the hydrologic state, and (2) a K-nearest neighbor (K-NN) time series bootstrap to simulate the streamflow magnitude from the observational data conditioned on the hydrologic state and the previous flow magnitude. This framework combines the respective strengths of the two data sets. Furthermore, it is robust and parsimonious. The framework was applied to paleoreconstructed streamflow and observational data for the Lees Ferry, Arizona, streamflow gauge on the Colorado River. The simulations showed the ability to capture all the distributional statistics of the observational period and also generate a rich variety of wet and dry sequences that will benefit the sustainable management of water resources in the basin.
 It is difficult to quantify the significance of combining the paleodata with the observational data in comparison with solely using the observational period data when generating simulations. A Kolmogorov-Smirnov test was used to compare the distributions from both methods, and a significant difference in the overall distributions was not found. However, the distributions do present different tail probabilities that cannot be demonstrated with the Kolmogorov-Smirnov test. These differing tail probabilities present a revised picture of risk that is typically determined with a decision support system. Prairie [2006, chapter 5] presents results from a decision support system which demonstrate that using the combined data set rather than the observational data alone influences decision variables sensitive to tail probabilities, such as those affected by extreme events, for instance, protracted drought or surplus.
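 For readers unfamiliar with the test, the two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between two empirical CDFs, which is why it responds mainly to differences in the body of the distributions rather than in the tails. The sketch below computes only the statistic (not the p-value) using the standard library; the samples are hypothetical.

```python
def ks_statistic(a, b):
    """Largest absolute difference between the empirical CDFs of a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Advance both samples past every value <= v, then compare the CDFs.
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])
disjoint = ks_statistic([1, 2], [3, 4])
# One extreme value among ten moves the statistic by only 0.1.
tail_only = ks_statistic(list(range(1, 11)), list(range(1, 10)) + [100])
```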
 In the presented results, currently the threshold used to determine system state as well as drought and surplus statistics is based on the median of the observed flow. This threshold can be modified or more states could be included as required on a case by case basis.
 A slightly modified version of this technique can be used to generate streamflow sequences based on climate change projections. In this modification the state sequence would be generated using the paleo and observed data and the streamflow magnitudes would be resampled from the PDF of flows from climate change projections.
 The annual streamflow generated at Lees Ferry, Arizona, from this approach can be spatially and temporally disaggregated [Prairie et al., 2007] to obtain monthly flow scenarios at all the gauges in the basin. These scenarios are used in a basin-wide decision model [Prairie, 2006] and help determine realistic estimates of risk and reliability for various decision components in the water resources system, facilitating effective long-term planning.
 Funding for this research by the Bureau of Reclamation's Lower Colorado regional office via grant 04PG303326 is gratefully acknowledged. Continued support by Kib Jacobson of the Bureau of Reclamation's Upper Colorado regional office is appreciated. Thanks are also due the Center for Advanced Decision Support in Water and Environmental Systems (CADSWES) at the University of Colorado, Boulder, for use of its facilities and computational support.