## 1. Introduction

[2] Statistical downscaling provides a way to use climate model output for local-scale applications. Typical grid sizes for global-scale simulations are on the order of 100–200 km, so the raw global-scale model output is of limited use when information is required at local scales. The objective of downscaling is to overcome this scale mismatch and to exploit the skill of atmospheric forecasts at local scales.

[3] In short, statistical downscaling develops relationships between large-scale atmospheric circulation variables and local climate information (e.g., precipitation and temperature observations at individual stations). Using these observed relationships, forecasts of atmospheric variables can be translated into forecasts of local climate variables. Several methods of varying complexity have been used to perform statistical downscaling. *Zorita and von Storch* [1998] classified existing statistical methods into three categories: (1) linear methods (e.g., canonical correlation analysis), (2) classification methods (e.g., weather generators and regression trees), and (3) deterministic nonlinear methods (e.g., neural networks). They also proposed an analog method and compared its results with those of one method chosen from each of the three categories in reconstructing average December–February (DJF) precipitation over the Iberian Peninsula for the period 1901–1989.

[4] *Widmann et al.* [2003] applied three different statistical downscaling methods that used simulated precipitation fields from the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis [*Kalnay et al.*, 1996] as the predictor. These methods are (1) local rescaling of the simulated precipitation, (2) downscaling using singular value decomposition (SVD), and (3) local rescaling with a dynamical correction. The authors applied the three methods to reconstruct historical (1958–1994) wintertime precipitation over Oregon and Washington and concluded that local rescaling with dynamical correction and SVD-based downscaling yielded comparable skill over the Pacific Northwest region. *Salathé* [2003] forced a hydrologic model of the Yakima River in central Washington with three downscaled precipitation fields to compare the effectiveness of the downscaling methods. One of these methods was an analog method that used the 1000-hPa geopotential height field from the NCEP-NCAR reanalysis as a predictor. *Salathé* [2003] showed that downscaling by local scaling of simulated large-scale precipitation from the NCEP-NCAR model was quite successful in streamflow simulations in the Yakima basin.

[5] In this paper we present a downscaling methodology based on the *K*-nearest neighbor (*K*-nn) algorithm. The *K*-nn algorithm is described for use in a stochastic weather generator by *Lall and Sharma* [1996], *Rajagopalan and Lall* [1999], *Buishand and Brandsma* [2001], and *Yates et al.* [2003]. The fundamental idea of the *K*-nn algorithm is to search for analogs of a feature vector (vector of variables for which analogs are sought) based on similarity criteria in the observed time series. In the weather generator model, the day immediately following the analog day is taken as the next day in the generated sequence, and the process is repeated. In the method presented here, local-scale station information is used for analog days selected on the basis of global-scale climate model output.
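As a rough illustration of the analog idea described above, the following Python sketch selects the *K* historical days whose large-scale feature vectors lie closest to a forecast feature vector, then draws one of them and returns that day's station observation. It is a minimal sketch under stated assumptions, not the implementation used in this paper: the Euclidean distance, the 1/rank resampling weights, the function names, and the toy data are all assumptions made for illustration.

```python
import math
import random


def knn_analog_days(feature, history, k):
    """Return indices of the k historical days nearest to `feature`.

    Assumption: similarity is plain Euclidean distance between feature vectors.
    """
    distances = [(math.dist(feature, past), day) for day, past in enumerate(history)]
    distances.sort()  # nearest first; ties broken by day index
    return [day for _, day in distances[:k]]


def rank_weights(k):
    """Weights proportional to 1/rank, normalised to sum to one.

    Assumption: a decreasing rank-based kernel; other kernels are possible.
    """
    raw = [1.0 / rank for rank in range(1, k + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def downscale(feature, history, station_obs, k, rng):
    """Pick one analog day at random (rank-weighted) and return its station value."""
    neighbors = knn_analog_days(feature, history, k)
    chosen = rng.choices(neighbors, weights=rank_weights(k), k=1)[0]
    return chosen, station_obs[chosen]


# Hypothetical toy data: one two-element feature vector per historical day,
# and one station precipitation value (arbitrary units) for the same days.
history = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (1.2, 0.9)]
station_obs = [0.0, 2.5, 10.0, 3.1]

forecast_feature = (1.1, 1.0)
analog_day, local_value = downscale(
    forecast_feature, history, station_obs, k=2, rng=random.Random(0)
)
```

Repeating the draw with different random seeds, or with feature vectors from different ensemble members, would yield an ensemble of downscaled station values.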

[6] Though transfer-function-based models (e.g., multiple linear regression, or MLR) are widely used [*Antolik*, 2000], the *K*-nn based approach developed here has several advantages. First, this method is data-driven and makes no assumptions about the underlying marginal and joint probability distributions of the variables. For example, downscaling precipitation using MLR requires a two-step process [e.g., *Clark et al.*, 2004]: we must first account for the intermittent nature of precipitation (typically modeled using a logistic regression), and then transform precipitation amounts to normal space to satisfy the normality assumption inherent in least squares regression. Second, *K*-nn based downscaling will be shown to intrinsically preserve the spatial covariability and consistency of the downscaled climate fields. Third, ensemble medium-range forecast (MRF) runs can be readily utilized in the downscaling process, and there is no need to use the ensemble mean of MRF predictors, as is normally done in regression models. Finally, the ensemble spread information from MRF runs can be utilized to develop spread-skill relationships, which is not possible in an MLR model [e.g., *Clark et al.*, 2004].

[7] The *K*-nn downscaling methodology was tested on four example river basins distributed over the continental United States, covering both snowmelt- and rainfall-dominated hydrologic regimes. These four basins are (1) the Animas River in southwestern Colorado, (2) the east fork of the Carson River on the California/Nevada border, (3) the Cle Elum River in central Washington, and (4) the Alapaha River in southern Georgia (Figure 1).

[8] Section 2 provides a description of the data used in the analysis. Section 3 describes the *K*-nn methodology developed for statistical downscaling. Section 4 presents a discussion of the results from the four example river basins. Section 5 is a summary of the techniques and results.