Corresponding author: S. K. Jha, Water Research Centre, School of Civil and Environmental Engineering, University of New South Wales, Sydney, NSW 2052, Australia. (firstname.lastname@example.org)
 A downscaling approach based on multiple-point geostatistics (MPS) is presented. The key concept underlying MPS is to sample spatial patterns from within training images, which can then be used in characterizing the relationship between different variables across multiple scales. The approach is used here to downscale climate variables including skin surface temperature (TSK), soil moisture (SMOIS), and latent heat flux (LH). The performance of the approach is assessed by applying it to data derived from a regional climate model of the Murray-Darling basin in southeast Australia, using model outputs at two spatial resolutions of 50 and 10 km. The data used in this study cover the period from 1985 to 2006, with 1985 to 2005 used for generating the training images that define the relationships of the variables across the different spatial scales. Subsequently, the spatial distributions for the variables in the year 2006 are determined at 10 km resolution using the 50 km resolution data as input. The MPS geostatistical downscaling approach reproduces the spatial distribution of TSK, SMOIS, and LH at 10 km resolution with the correct spatial patterns over different seasons, while providing uncertainty estimates through the use of multiple realizations. The technique has the potential to not only bridge issues of spatial resolution in regional and global climate model simulations but also in feature sharpening in remote sensing applications through image fusion, filling gaps in spatial data, evaluating downscaled variables with available remote sensing images, and aggregating/disaggregating hydrological and groundwater variables for catchment studies.
 Characterizing changes in local climate variables is required to better understand the movement of water through the atmosphere, land surface, and subsurface interfaces at basin scales and to quantify the influence of potential changes in the hydrological system across a range of resolutions. Among other variables, the surface skin temperature (TSK), soil moisture (SMOIS), latent heat flux (LH), and vegetation fraction (VEGFRA) are key variables in coupling atmospheric forecasting models with land surface models [Evans et al., 2011; McCumber and Pielke, 1981; Walker and Rowntree, 1977]. These variables are inherently interrelated and combine to affect water balances across catchment scales [McCabe et al., 2008a]. For instance, while the vegetation fraction directly influences the SMOIS, LH, and TSK, it is likewise affected by variations in each of these other related variables through complex nonlinear relationships [Rodriguez-Iturbe, 2000].
 General circulation models (GCMs) provide a mechanism through which to predict the climate under scenarios of future change. However, the output of these GCMs is often too coarse (grid cells of hundreds of kilometers) for studying the potential effects of hydrological variability and change at the regional and local scale. For example, hydrological studies related to the effect of climate change on areas such as flood prediction, groundwater recharge, agricultural water use, and urban water supply all require information at the catchment scale, which is beyond the capacity of current generation GCMs to provide. To bridge the mismatch of spatial scale between the GCM and the scale of operational interest, some form of data downscaling is necessary.
 There are two broad categories of downscaling approaches existing in the current literature: statistical downscaling and dynamical downscaling. Comprehensive reviews on these approaches have been provided by various researchers [Evans et al., 2012; Maraun et al., 2010; Wilby and Wigley, 1997; Wilby, 2002], so only a summary is provided here. Statistical downscaling approaches are based mainly on regression relationships between the coarse scale information obtained from GCM (predictors) and observed variables at the local or fine scale (predictands) [Hewitson and Crane, 1996]. Statistical downscaling approaches have their own limitations, such as (i) the selection of informative predictors is an onerous task and (ii) the predictors depend on their availability from GCM outputs and also on the region and the season under consideration. The latter limitation is crucial as it determines the characteristics of the downscaled climate scenario. In addition, the derived relationship between the predictors and predictands is assumed to be valid in a future perturbed condition, which cannot be verified [Chu et al., 2010].
 There is no universally accepted statistical downscaling approach applicable in climate change impact studies. Therefore, a recent trend in the scientific community has been to first apply several different downscaling methods and compare the results using verification data [Raje and Mujumdar, 2011]. The output of the “best” downscaled model is then forced into the hydrological model to assess the future scenario of climate change [Liu et al., 2011]. Furthermore, all relevant uncertainties stemming from the selection of the downscaling methods and hydrological methods have to be taken into account. Many researchers identify the method of statistical downscaling as a major source of uncertainty in impact studies [Segui et al., 2010; Stoll et al., 2011; Teutschbein et al., 2011]. Further investigation of statistical downscaling approaches is therefore warranted.
 Statistical downscaling approaches are often preferred as they are computationally less burdensome than the other major approach: dynamical downscaling [Kidson and Thompson, 1998]. In dynamical downscaling, a regional climate model (RCM) is applied using GCM outputs as boundary conditions. This approach takes into account detailed terrain and land cover information and provides physically consistent results [Kunstamann et al., 2004]. Dynamical downscaling also inherently accounts for the multivariate spatial dependencies and the cross-dependencies between variables, which statistical approaches often do not [Sain et al., 2011]. Although RCMs are useful for investigating the large-scale circulation, there are limitations to their application in impact studies at the regional and local levels [Giorgi, 2006], mostly related to the need to explicitly account for model biases and the difficulty of characterizing downscaling uncertainty. Furthermore, the physical simulation of the climatic system in RCMs is computationally very expensive. Hence, there is growing interest in the research community in developing statistical techniques to further downscale the RCM output to the point locations required for impact studies [Evans, 2012; Hoffmann et al., 2011]. These methods range from simple spatial interpolation to weather generators and machine-learning techniques [Evans, 2012].
 Geostatistical simulation methods provide useful tools for analyzing spatially correlated variables. Whether in predicting point values from areal data for univariate problems [Kyriakidis, 2004; Kyriakidis and Yoo, 2005] or in the use of kriging and cokriging techniques to obtain fine resolution imagery from coarse scale images [Nishii et al., 1996], there are many applications of such approaches in geology, earth science, and remote sensing [Atkinson et al., 2008; Liu et al., 2007; Mariethoz et al., 2012; Pardo-Iguzquiza et al., 2006, 2010; Zhang et al., 2012]. However, most of the existing approaches are based on linear geostatistics [Goovaerts, 1997]. Such methods rely on assumptions of linear correlation with covariates or assume a multi-Gaussian spatial dependence, involving specific types of patterns [Journel and Zhang, 2006]. From the perspective of land-atmosphere interactions, it is known that the structure of land-atmosphere variables and their interrelationships can be highly nonlinear [Wallace and Hobbs, 2006]. As a result, new statistical downscaling approaches are needed that are capable of reproducing such nonlinear characteristics.
 Some geostatistical approaches to downscaling have been proposed that use multiple-point geostatistics (MPS) [Strebelle, 2002], a nonparametric approach that is free from simplifying assumptions such as linearity, maximum entropy, and multi-Gaussianity [Gómez-Hernández and Wen, 1998]. However, applications to downscaling are so far limited to cases with a single variable and, in addition, are restricted to either categorical variables such as land cover category [Boucher, 2009] or to self-similar images [Mariethoz et al., 2011]. In this paper, the geostatistical approach of direct sampling (DS), based on MPS, is applied for downscaling variables typically of interest in land-atmospheric studies of the hydrological sciences. Traditional geostatistical approaches use variograms, whereas the multiple-point statistics-based approaches use training images to describe spatial continuity. The DS algorithm, as described in Mariethoz et al. , extends the idea of MPS to continuous and multivariate problems.
 The MPS approach involves deriving spatial arrangements of values from a training image and storing them in a database [Strebelle, 2002]. The database is used for retrieving the conditional probabilities for the simulation. In contrast, the DS approach does not require a database and can generate stochastic fields representing complex statistical and spatial properties directly from the training image. The method can use multivariate training images, generating random fields where the (possibly nonlinear) relationships between variables are reproduced as in the training image. Such an approach is used in this study, with a training image generated from 20 years of seasonal data.
 The rationale of our approach is to identify the dependence between coarse- and fine-scale patterns using past observations expressed as training images. Patterns contained in these training images are resampled conditionally to local coarse values, resulting in locally accurate downscaled estimates that mimic the interscale relationships observed in the training images, i.e., in past observations for which the variables are informed at all scales considered. We demonstrate the use of the DS approach with RCM simulation data at different resolutions and by comparing the downscaled results with high-resolution RCM outputs. Unlike typical statistical downscaling approaches, the proposed method provides physically consistent spatial patterns of TSK, SMOIS, and LH over different seasons, which are then used as downscaling predictive models.
2.1. Direct Sampling
 The methodology adopted for downscaling the climate variables is based on the DS approach. The principle of DS is to sequentially determine the value of each pixel, conditional to the values of other neighboring pixels, which can be at the same scale or at a different scale. A value is obtained by conditional sampling of the training image, which is deemed representative of the patterns at all scales considered. This is accomplished with the following algorithm: let x be a pixel in the image where the variable of interest Z(x) needs to be downscaled. Equivalently, pixels in the training image are denoted y. We denote Nx as the ensemble of the n closest pixels of x. Note that x is a two-dimensional (2-D) vector, each of its components representing a coordinate in a 2-D Cartesian space. In the case of one variable, a neighborhood is defined around the node to be simulated as $N_x = \{Z(x + h_1), \ldots, Z(x + h_n)\}$, where h1, …, hn represent the vectors between the pixel x and the neighboring pixels. The basic idea is to find another location y in the training image that has a neighborhood Ny similar to Nx. The training image is sampled by randomly selecting pixels contained within it and defining a neighborhood $N_y = \{Z(y + h_1), \ldots, Z(y + h_n)\}$ that has the same lag vectors as in Nx. The distance $d(N_x, N_y)$ represents the mismatch between Nx and Ny. At the first occurrence of a mismatch below a given threshold t, the data at the node y of the training image Z(y) is specified at location x of the simulation grid. Since the distance is under the threshold, the value at location y is a sample of Z conditional to the neighborhood Nx. The important feature of the DS method is that using the first sample with a distance lower than t is equivalent to sampling from the high-dimensional distribution:
$$F(z, x, N_x) = \mathrm{Prob}\{Z(x) \le z \mid N_x\} \tag{1}$$
 This sampling strategy was first used by Shannon and applied to geostatistical simulations by Mariethoz et al.
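 The univariate sampling loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`mean_abs_distance`, `direct_sample`), the `max_tries` cap, the fallback to the best candidate found when no candidate meets the threshold, and the scaling of the distance by the number of neighbors are all hypothetical choices made for the example.

```python
import random

def mean_abs_distance(a, b):
    """Manhattan (L1) distance divided by the number of neighbors.
    The division by n is an assumption made for this sketch."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def direct_sample(ti, lags, known, t=0.01, max_tries=10000, rng=None):
    """Sample one value for a target pixel x from a 2-D training image.

    ti    : training image as a list of rows
    lags  : list of (di, dj) lag vectors h_1..h_n defining the neighborhood
    known : values Z(x + h_i) already informed around the target pixel
    t     : distance threshold; the first candidate at or below it is accepted
    """
    rng = rng or random.Random()
    rows, cols = len(ti), len(ti[0])
    best_val, best_d = None, float("inf")
    for _ in range(max_tries):
        # Randomly pick a candidate location y in the training image
        i, j = rng.randrange(rows), rng.randrange(cols)
        # Skip candidates whose neighborhood falls outside the image
        if not all(0 <= i + di < rows and 0 <= j + dj < cols for di, dj in lags):
            continue
        candidate = [ti[i + di][j + dj] for di, dj in lags]
        d = mean_abs_distance(known, candidate)
        if d < best_d:
            best_val, best_d = ti[i][j], d
        if d <= t:
            break  # first acceptable match: stop scanning
    # Fallback (assumption): return the best candidate seen so far
    return best_val

# Toy training image with a left-to-right gradient
ti = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
value = direct_sample(ti, [(0, -1), (0, 1)], [0, 2], t=0.0, rng=random.Random(0))
```

 In this toy case only the central column has both a left and a right neighbor, so the sampled value is the central one. In a downscaling setting, the lag vectors would also point to pixels of the coarse grid.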
 Let us now consider a multivariate case consisting of m variables Z1(x), …, Zm(x). The joint distance between multiple variables is defined, with the objective to find pixel values matching the neighborhoods of all variables taken together. The consequence is that the values sampled from (1) will have the same cross-dependencies as observed in the multiple variables of the training image. For each variable k, k = 1, …, m, the number nk of neighboring nodes can be the same or different. Therefore, we can define an individual neighborhood for each variable k as $N_x^k = \{Z_k(x + h_1^k), \ldots, Z_k(x + h_{n_k}^k)\}$. Then, the multivariate neighborhood consists of the ensemble of all these individual neighbors joined across all m variables: $N_x = \{N_x^1, \ldots, N_x^m\}$.
 The joint distance between multivariate neighborhoods is then a linear combination of individual distances:
$$d(N_x, N_y) = \sum_{k=1}^{m} w_k \, d(N_x^k, N_y^k) \tag{2}$$
where the weights wk sum to 1 and represent the relative importance given to each variable. The distance function d(.) can be defined as any valid distance function between the vectors of values in Nx and Ny. A detailed description of the calculation of different types of distances is provided in Mariethoz et al. . Some examples of distances are Euclidean, Manhattan, or transform-invariant [Mariethoz and Kelly, 2011]. The Manhattan distance was used for all cases presented in this paper. Apart from the computation of the distance, the rest of the algorithm is identical to the univariate case. It should be noted that since our approach is nonparametric, no normalization or variable transformation is necessary. The downscaling methodology is represented in Figure 1 and further described in section 3.1 to explain the process of generation of training images and conditioning data.
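 The weighted combination of per-variable distances can be illustrated with a short Python sketch. The function names, the dictionary layout, and the numerical values are hypothetical and serve only as an example; the per-variable distance is a Manhattan distance averaged over the neighbors, where the averaging is an assumption of this sketch.

```python
def mean_abs_distance(a, b):
    """Manhattan (L1) distance divided by the number of neighbors (assumed scaling)."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def joint_distance(nx, ny, weights, tol=1e-3):
    """Linear combination of per-variable distances between two neighborhoods.

    nx, ny  : dict mapping variable name -> list of neighbor values
    weights : dict mapping variable name -> weight w_k (expected to sum to 1)
    """
    if abs(sum(weights.values()) - 1.0) > tol:
        raise ValueError("weights are expected to sum to 1")
    return sum(w * mean_abs_distance(nx[k], ny[k]) for k, w in weights.items())

# Illustrative values only: two variables, two neighbors each
nx = {"TSK": [300.0, 302.0], "LH": [50.0, 70.0]}
ny = {"TSK": [301.0, 301.0], "LH": [60.0, 60.0]}
d = joint_distance(nx, ny, {"TSK": 0.5, "LH": 0.5})
```

 Because each variable keeps its own neighborhood, the number of neighbors may differ between variables without changing the form of the combination.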
2.2. Dimensionality of Multivariate Training Images
 In much of the literature devoted to MPS, training images are “images” in the commonly accepted sense and can be either 2-D or three-dimensional (3-D), depending on the application [Strebelle, 2002]. For example, applications to remote sensing would call for 2-D images, while applications to aquifer modeling might call for 3-D models. Space-time applications such as rainfall modeling are also 3-D because they include two dimensions in space and an additional temporal dimension. The DS approach, by introducing the possibility to use multivariate training images, extends the complexity and the dimensionality of the training image objects. The addition of several variables can be seen as a four-dimensional (4-D) construct, with coordinate k defining which variable is considered.
 The downscaling application implemented in this paper considers 2-D land-atmosphere variables and their temporal variations, therefore making the problem effectively 3-D. The addition of several variables considered together adds one dimension, calling for 4-D training images. Since the training images need to contain the relationship between coarse and fine scale, each variable considered needs to be present in the training image at both resolutions, therefore doubling the total number of variables.
3. Data Set and Description of Variables
 The focus area for the study is the Murray-Darling basin (MDB) in southeast Australia (Figure 2). With a size of approximately 10⁶ km² and supporting a population of over 3 × 10⁶ people, the MDB is the largest and most economically productive catchment in Australia. Because of its importance in the Australian economy, the effect of climate change on the long-term productivity and sustainability of the basin is of high importance. The data set used in this study was derived from the work of Evans and McCabe, who evaluated the Weather Research and Forecasting (WRF) model RCM for the period of 1985 to 2009 over the MDB, finding good agreement across multiple time scales including subdaily [Evans and Westra, 2012]. WRF has also been extensively tested over this region at the event scale using multiple physical parameterizations [Evans et al., 2012].
 In this study, WRF model outputs at spatial resolutions of 50 km (coarse resolution) and 10 km (fine resolution) were used to demonstrate our approach, with grids of 41 × 53 and 155 × 185 nodes, respectively. The number of cells in the simulation grid (x, y) is (161, 171) with a grid cell size of 0.09°. The sample period includes daily output for the period 1985 to 2006. Twenty years of data from 1985 to 2005 at both resolutions were used as training images (calibration dataset). Using the MPS approach, downscaled 10 km predictions for each season for the year 2006 were calculated, assuming that the 50 km resolution is known for this year. The higher resolution WRF outputs for the year 2006 were used to evaluate the capability of the downscaling approach. Daily values of both the 50 and 10 km resolution of TSK, SMOIS, LH, and VEGFRA were aggregated seasonally, considering months of December, January, and February (DJF) for the summer; March, April, and May (MAM) for autumn; June, July, and August (JJA) for winter; and September, October, and November (SON) for the spring season. Note that, throughout the paper, the seasons mentioned correspond to southern hemisphere (austral) seasons.
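 The seasonal aggregation of daily values can be sketched as follows. This is an illustrative example only: the function name and input layout are hypothetical, averaging is assumed as the aggregation operator, and the sketch does not address to which year a December value is assigned, since the text does not specify it.

```python
from collections import defaultdict

# Austral seasons as defined in the text
SEASON_OF_MONTH = {
    12: "DJF", 1: "DJF", 2: "DJF",   # summer
    3: "MAM", 4: "MAM", 5: "MAM",    # autumn
    6: "JJA", 7: "JJA", 8: "JJA",    # winter
    9: "SON", 10: "SON", 11: "SON",  # spring
}

def seasonal_means(daily):
    """Average daily values per season.

    daily : iterable of (month, value) pairs for one pixel and one variable
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, value in daily:
        season = SEASON_OF_MONTH[month]
        sums[season] += value
        counts[season] += 1
    return {season: sums[season] / counts[season] for season in sums}

# Illustrative TSK values in kelvin
means = seasonal_means([(12, 300.0), (1, 310.0), (7, 280.0)])
```

 Applied pixel by pixel at both the 50 and 10 km resolutions, such an aggregation yields one map per variable, season, and year.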
 Land surface variables, such as the TSK, SMOIS, and LH, are not linearly related, with multiple thresholds in the system affecting the strength, and even the sign, of their relationships. TSK is an estimate of the temperature of a very thin surface layer of the land or water and responds rapidly to changes in direct sunshine or shade. SMOIS plays a key role in partitioning components of the water balance such as infiltration, runoff, and evaporation and can vary significantly due to large heterogeneity in land cover types, soil type, leaf area index, and topography [Brocca et al., 2012; Liu et al., 2010]. In this study, the SMOIS data correspond to the moisture content in the top 10 cm of the soil and will respond to precipitation and evaporation much more quickly than in deeper layers [Manfreda et al., 2007; McCabe et al., 2005b]. LH describes the energy used for transporting the water from the land surface to the atmosphere as evapotranspiration, with the SMOIS condition directly influencing the evaporative flux [Kalma, 2008]. It is a critical variable in defining water and energy exchange over the Earth's terrestrial surface, with the net radiation available at the Earth surface divided principally between latent heat and sensible heat flux [Jiménez et al., 2011].
3.1. Application of DS for Downscaling
 The training images contain seasonal WRF outputs for 20 years, including the variables TSK, SMOIS, and LH at both fine and coarse resolutions. The statistics and patterns of land-atmosphere values for each variable are known to differ with season. As such, to represent seasonal changes in the land-atmospheric variables, a separate training image was built for each season.
 Each training image is a 4-D object, containing data averages for a given year and season, for 20 different years, and for the three variables to be downscaled (TSK, SMOIS, and LH) at resolutions of 50 and 10 km. As such, there are 20 training images for each season. The training images also contain the 10 km VEGFRA, latitude, and longitude as covariates in the simulation. Using equation (2), many variables can be added as covariates in the simulation. Here, latitude and longitude are considered sufficient to address the nonstationarity, but one could consider alternatives such as elevation, distance to the coast, etc. VEGFRA is included as climatological monthly values in the climate model simulations, so could also be applied to a future unknown climate state. In all, the total number of variables is nine. The reason for using all the variables simultaneously is that there exist nonlinear relationships among them, which should ideally be preserved. Due to large variations in atmospheric variables over the land and other static features such as topography, water bodies, coastal areas, etc., a constraint needs to be applied to prevent using patterns that emerge from very different locations. Including latitude and longitude in the computation of the distance has the consequence of imposing such local consistency in the downscaled spatial features [Honarkhah and Caers, 2012], while the VEGFRA constraint limits the result to areas with similar levels of vegetation cover. The result is that patterns from similar spatial areas as their coarse counterpart are more likely to be identified and used for downscaled values. The resulting training images are assumed to contain the various higher-order statistical behavior of the system and are thereafter used as a nonparametric spatiotemporal multiscale model.
 The setup of the DS model is described in Figure 1. The layers on the right-hand side represent one variable of the training image at resolutions of 50 and 10 km (i.e., the available data in the period 1985–2005). Each of the variables has the same two sets of information for each of the 20 years of data. The left-hand side indicates new data (i.e., year 2006) that needs to be downscaled. Here, the information is available only at the coarse resolution and the values of the variable at the finer grid (shown in gray color) are those to be generated. Since multivariate patterns are considered (each scale being considered as a different variable), Nx is made of both coarse and fine values in the simulation grid, and similarly Ny considers both scales together in the training image. Equation (2) compares these together, therefore preserving the spatial relationships across scales. In our case, we jointly simulate three variables (LH, SMOIS, and TSK), each of which has to be considered at both fine and coarse scales.
 For the DS, we used neighborhoods consisting of 20 pixels for high-resolution and 20 pixels for low-resolution variables, except for the latitude and longitude, for which a single neighbor (the central pixel of the pattern) is enough to define the location of a point. The distance function in (2) is used with a distance threshold of t = 0.01. All variables are given a weight wk = 0.1333 in the distance calculation, except for latitude and longitude, which have a weight of 0.0333 to allow patterns from across a reasonably broad area of the domain to be considered. Regarding parameterization of the method, we refer the reader to a comprehensive discussion and sensitivity analysis on the DS parameters provided in Meerschman et al. Note that the DS also allows the imposition of a temporal dependence between 1 year and the next by defining neighbors as adjacent pixels in the third dimension (corresponding to time). However, we do not impose such dependence here because, in the case of downscaling seasonal values, we assume that the temporal dependence is entirely driven by the coarse data. Our training image is a stack of 20 maps, each of which is equally likely to be considered in the conditional sampling. As the DS is a stochastic simulation method, we run the downscaling method 50 times to produce Monte-Carlo realizations, obtaining 50 different values at each point of the domain. Therefore, the results take the form of local probability distributions, which allows for an assessment of the uncertainty inherent to the downscaling procedure.
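 As a quick consistency check of this parameterization, the nine layers, their weights, and their neighborhood sizes can be written out explicitly. The layer labels (for example `TSK_50km`) and the dictionary layout are hypothetical names chosen for this sketch; only the counts, weights, and neighborhood sizes come from the text.

```python
# Three downscaled variables, each present at both resolutions
downscaled = ["TSK", "SMOIS", "LH"]
layers = [f"{v}_{res}km" for v in downscaled for res in (50, 10)]
# Covariates: 10 km vegetation fraction plus the two location layers
layers += ["VEGFRA_10km", "latitude", "longitude"]

location_layers = {"latitude", "longitude"}

# Weight w_k per layer: 0.0333 for latitude/longitude, 0.1333 otherwise
weights = {name: (0.0333 if name in location_layers else 0.1333) for name in layers}

# Neighborhood size per layer: 1 for latitude/longitude, 20 otherwise
n_neighbors = {name: (1 if name in location_layers else 20) for name in layers}

total_weight = sum(weights.values())
print(len(layers), round(total_weight, 4))
```

 The nine weights add up to about 0.9997 (7 × 0.1333 + 2 × 0.0333), which matches the requirement that the weights sum to 1 up to rounding of the reported values.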
4. Results and Discussion
 For validation, the downscaled results are compared against the actual 10 km WRF outputs for all the variables in the year 2006. In this section, findings based on comparative analysis of the downscaled results with this reference year are presented. The analysis is based on evaluating how well the downscaled approach reproduces the spatial features present at the fine scale. Quantitative validation of the results is examined in the subsequent section.
 Figure 3 presents the coarse input data and the results obtained from DS simulation runs for TSK, SMOIS, and LH for the summer of 2006, along with the reference image. The coarse data, as shown in the first row, has 50 km grids, and therefore the land features are homogenized over large areas. The color patterns in all three variables illustrate the overall distribution of the magnitude of variables, but there is no distinguishable feature available. The efficacy of DS can be seen by comparing the figures in the first row with the downscaled spatial distribution of all the variables obtained from a single realization of DS in the second row. The figures in the second row not only capture the overall variation across geographical locations but also preserve detailed land features. For example, the conditioning data for LH (last figure in the first row) only show that the LH increases from west to east. There is a small area on the east coast where LH has its highest magnitude. However, the corresponding DS output (last figure in the second row), while maintaining those regions of high and low values of LH, also provides insight into regions with high LH in the south of the basin, which were not visible in the conditioning data. There are also specific features on all images at the edge of the domain, related to the boundary conditions of the RCM used. Since these features are present in the training image at both scales, they are also reproduced in the downscaled realizations, as expected.
 The spatial distribution obtained from the mean of 50 realizations of DS is presented in the third row. The number of realizations is chosen to demonstrate the performance of our approach, while balancing the inherent computational cost. By comparing figures in the second and third row, it is clear that, although the zones of high/low values of the variables remain at the same locations, by taking the mean of 50 realizations, the features have been smoothed. A single realization will preserve the textural properties of the training image but is nonunique, as it is just a single possible scenario. The DS results can be compared directly to the reference figure presented in the fourth row, which shows an excellent match for the TSK and LH and a reasonable match for the SMOIS. A quantitative assessment of this comparison is presented in section 4.2.
 Apart from small local scale variations, downscaled results across all realizations present similar features (refer to Figure S1 in the Supporting Information). The variability of the downscaled results can be seen in the standard deviation map shown in the fifth row of Figure 3. It is evident that the highest values of the standard deviation occur around the mountainous and coastal areas, where land features change abruptly to water bodies. Outside of these areas, the downscaled reproductions express similar standard deviation values with each of the three variables and exhibit little variation.
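 The mean and standard deviation maps discussed above are pixelwise statistics over the set of realizations. A minimal sketch follows, assuming each realization is stored as a list of rows; the function name is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not state which one was used.

```python
from statistics import mean, pstdev

def ensemble_stats(realizations):
    """Pixelwise mean and standard deviation over a list of 2-D grids.

    realizations : list of grids (each a list of rows) with identical shape
    """
    rows, cols = len(realizations[0]), len(realizations[0][0])
    mean_map = [[mean([r[i][j] for r in realizations]) for j in range(cols)]
                for i in range(rows)]
    # Population standard deviation (assumption for this sketch)
    sd_map = [[pstdev([r[i][j] for r in realizations]) for j in range(cols)]
              for i in range(rows)]
    return mean_map, sd_map

# Two toy realizations on a 1 x 2 grid
mean_map, sd_map = ensemble_stats([[[1.0, 2.0]], [[3.0, 2.0]]])
```

 In the toy case the first pixel varies between realizations and the second does not, so only the first pixel has a nonzero standard deviation.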
 Depending on the application, it may be preferable to retain either the mean or the individual realizations. If one is interested in the most probable value at each pixel (also known as conditional estimation), the mean value is appropriate. However, to correctly propagate the uncertainty in downscaling, one should keep in mind that all realizations have to be considered together. For example, if the downscaling results are used as input for a hydrological model, the hydrological model should run with each of the downscaled realizations, and the result is a set of 50 equiprobable hydrological predictions. Such predictions are generally expressed as probability density functions that could be used as input into management or policy models. Although all realizations have an equivalent chance of representing the model “reality,” for the rest of the paper, only a single realization of the DS simulation runs is presented for analysis, due to the challenge of graphically presenting all reproductions.
4.1. Reproduction of Hydrological Variables
4.1.1. Surface Temperature (TSK)
 Figure 4 represents the seasonal variation of TSK obtained from DS simulation runs for summer, autumn, winter, and spring. Figures in the first column present the map of TSK at the coarse resolution of 50 km. The result of DS for a single realization is shown in the second column and the reference values in the third column. The difference plot of mean values of 50 realizations and the reference values is shown in the fourth column. Comparison of the second and third columns shows that the DS is able to capture the spatial distribution of TSK very well. In all seasons, the northern and central parts of the basin have high TSK values, while the southern region and some of the eastern coast show relatively low TSK. During the summer season, minimum and maximum values of TSK vary between 292 K and 317 K, while during the winter season they vary between 276 K and 298 K, respectively, matching the corresponding minimum and maximum values of TSK in the reference maps. Note that the color scale is different for each season in Figure 4. The corresponding standard deviation (SD) map of DS simulation runs is shown in Figure S2 (figures in the first column, Supporting Information). These figures indicate that the maximum SD, corresponding to the highest downscaling uncertainty, occurs near the southern and eastern mountainous and coastal areas, where low values of TSK occur. The difference plot shows that, during the summer season, the mean value of downscaled TSK is lower than the reference values in the northwestern region. Similar results are observed on the southern coast during the spring season. During autumn and winter, there is an improved match between the downscaled results and reference, showing zero difference. The presence of water bodies such as canals, lakes, and reservoirs produces higher values than the reference in all seasons.
4.1.2. Soil Moisture
 Figure 5 compares the downscaled spatial distribution of SMOIS with the reference map for different seasons. Figures in different rows are arranged in the same sequence as described previously. For all seasons, the western part of the basin has lower SMOIS relative to the reference data set. The peak SMOIS values routinely occur in the southeast. It can also be observed that SMOIS has high values during the winter season, while, in spring, SMOIS is low across the entire basin. A closer inspection of the second and third columns demonstrates that the downscaled SMOIS from the DS simulations agrees reasonably well with the WRF outputs. It is clear that the SMOIS has specific features like rivers, lakes, and other water bodies that are present in the reference map but which DS has not been able to reproduce. These discrepancies are clearly visible in the difference map (figures in the fourth column) and can be attributed to the fact that hardly any of the land features are visible in the conditioning data (figures in the first column) that were provided as input to the DS. The SD maps (figures in the second column of Figure S2, Supporting Information) show that the maximum SD values occur in many areas across the entire basin but predominantly in the southern part. We observed that these values are often collocated with the water bodies that are spread across the entire basin and where SMOIS is specified with a zero value.
4.1.3. Latent Heat Flux
 The DS simulation results for the spatial variation of the LH are shown in Figure 6, which retains the same layout as Figures 4 and 5. By comparing the simulation response in the second and third columns, it can be observed that the DS results are able to capture the spatial variation in LH for all of the seasons. For instance, the higher values of LH on the eastern coast are very well reproduced by DS in the summer, autumn, and spring seasons. It can also be seen that LH is high in the southeast part of the basin in the summer and spring seasons. During the winter season, very low values of LH are observed everywhere in the basin, which is again captured by the DS runs (second and third columns of row 3). Note that the color bar is different in the winter. The difference map (fourth column) shows that, throughout the basin, there is a wide range of differences between the mean values of downscaled LH from 50 realizations and the reference values. However, during the winter season, the difference is relatively uniform over the entire basin. The SD maps (figures in the third column of Figure S2, Supporting Information) show SD patterns that are similar in all seasons except winter, when the entire basin has small and almost constant SD.
 The areas in the difference map where the highest errors are observed correspond to locations where the coarse variable patterns are not informative enough to characterize the smaller scale processes. For example, in Figure 6, the lower part of the domain presents high LH values for spring and summer on the fine-scale model, whereas the corresponding coarse scale is mostly featureless. The coarse scale is therefore poorly informative, and the information available for downscaling is not sufficient to identify the high LH values. This results in the observed high estimation bias (large values in the difference map) and is confirmed by the high values also present at the same locations in the standard deviation maps.
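 The difference and SD maps discussed above can be illustrated with a minimal sketch. It assumes the realizations are stacked in an array of shape (number of realizations, rows, columns); the function name `ensemble_maps` and the synthetic values are hypothetical and are not taken from the study.

```python
import numpy as np

def ensemble_maps(realizations, reference):
    """Return the ensemble mean, difference, and SD maps.

    realizations : array of shape (n_real, ny, nx), one downscaled field each
    reference    : array of shape (ny, nx), the fine-scale reference field
    """
    realizations = np.asarray(realizations, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mean_map = realizations.mean(axis=0)        # cell-wise mean over realizations
    diff_map = mean_map - reference             # ensemble mean minus reference
    sd_map = realizations.std(axis=0, ddof=1)   # cell-wise spread (sample SD)
    return mean_map, diff_map, sd_map

# Synthetic stand-in for 50 realizations of an LH-like field (values are made up).
rng = np.random.default_rng(0)
reference = rng.normal(100.0, 20.0, size=(4, 5))
realizations = reference + rng.normal(0.0, 5.0, size=(50, 4, 5))
mean_map, diff_map, sd_map = ensemble_maps(realizations, reference)
```

 Cells where both `diff_map` and `sd_map` are large would correspond to the poorly informed locations described in the text.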
4.2. Quantitative Measures of Error
 As mentioned earlier, although the simulation was run for 50 realizations to demonstrate the efficacy of DS, only results from a single realization were presented. In addition, all the downscaled values from 50 realizations were plotted against the single reference values for summer and winter in the scatter plots presented in Figures 7 and 8, respectively. The average values of the root-mean-square error (RMSE), bias, correlation coefficient (R2), and SD of downscaled values from all 50 realizations are presented in the figures. From the first row of Figure 7, it is clear that in the case of TSK and LH, all points lie around the reference line of slope 1 and intercept 0. However, there are a significant number of points spread away from the reference line for the SMOIS plot. The scatter plots show that the downscaled values are well correlated with the reference WRF datasets, with a small number of outliers. From Figures 7 and 8, it can also be observed that for TSK and SMOIS, the scatter remains quite consistent across the different seasons. However, the scatter plot for LH changes over the seasons. In summer, the plot shows wide scatter, indicating large spatial variation in the values of LH, which is also visible in Figure 6. The results for spring and autumn show similarly shaped distributions but different values (see Figures S3 and S4, Supporting Information). The scatter in LH is very low during the autumn and winter seasons. A wide scatter was observed in LH during spring, which can also be verified from Figure 6. The RMSE of TSK and SMOIS are almost the same in all seasons. For the LH, we observe variations in the RMSE and also in the biases over different seasons. These can be attributed to the fact that latitude and longitude may not be entirely sufficient to represent the nonstationarity, resulting in inadequate application of the method for certain locations. Additional information such as distance to the sea might capture the nonstationarity better. Another consideration is that the parameters of the DS have been adjusted for all seasons together. To address this, one could undertake a sensitivity analysis to establish the best weights and the distance threshold parameters to use for each season: an issue that requires further investigation and is the topic of current work.
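 As an illustration of the summary statistics reported in the scatter plots, the following sketch computes RMSE, bias, R2, and SD averaged over realizations. It rests on assumptions not stated in the text: R2 is taken as the squared Pearson correlation, and SD as the cell-wise sample standard deviation across realizations averaged over the domain. The function name `error_metrics` is hypothetical.

```python
import numpy as np

def error_metrics(realizations, reference):
    """Average RMSE, bias, R2, and SD over a set of realizations.

    realizations : array of shape (n_real, ...) of downscaled fields
    reference    : array with the shape of a single field
    """
    real = np.asarray(realizations, dtype=float)
    real = real.reshape(real.shape[0], -1)           # one row per realization
    ref = np.asarray(reference, dtype=float).ravel()
    err = real - ref                                 # downscaled minus reference
    rmse = np.sqrt((err ** 2).mean(axis=1))          # RMSE per realization
    bias = err.mean(axis=1)                          # mean error per realization
    # Assumption: R2 is the squared Pearson correlation with the reference.
    r = np.array([np.corrcoef(row, ref)[0, 1] for row in real])
    # Assumption: SD is the spread across realizations, averaged over cells.
    sd = real.std(axis=0, ddof=1).mean()
    return {"rmse": rmse.mean(), "bias": bias.mean(),
            "r2": (r ** 2).mean(), "sd": sd}

# Tiny made-up example: two realizations offset by +1 and -1 from the reference.
reference = np.array([[1.0, 2.0], [3.0, 4.0]])
realizations = np.stack([reference + 1.0, reference - 1.0])
metrics = error_metrics(realizations, reference)
```

 Computing the same statistics separately for each season would expose the seasonal variation in RMSE and bias noted for LH.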
 The histograms of errors, estimated by subtracting the reference values from the mean of 50 realizations, are presented in the lower part of Figure 7. The errors are centered around 0 and mostly unbiased. For TSK, the errors can reach up to 5 K, whereas the errors for SMOIS vary between 0% and 2%. The errors in LH are of higher magnitude, reaching up to 50 W/m2. These errors fall within the expected range previously reported in terms of RMSE as 1–8 K for TSK [Ferguson and Wood, 2010; McCabe et al., 2008b; Wan et al., 2002], 3–7% for SMOIS [Drusch et al., 2004; Liu et al., 2011; McCabe et al., 2005a] and 20–100 W/m2 for LH [Kalma et al., 2008; Kalma, 2008; Kustas and Norman, 2000].
 In Figure 9, the correlation coefficients (R2) between reference and downscaled variables are presented for all the realizations and for all seasons. The values of R2 are generally high, although there are notable differences across variables and seasons. The value of R2 is more than 0.98 for TSK in all examined periods. For SMOIS and LH, the correlation is not as high but remains close to or above 0.8. The lowest R2 values are observed in summer SMOIS (0.791) and winter LH (0.855), which is consistent with previous figures.
 Correct reproduction of the spatial continuity is the most important validation criterion for downscaling. To evaluate this, we compare the experimental variogram of the reference variables with the experimental variograms from all 50 realizations. These are shown in Figure S5 (Supporting Information). The first feature to note is that the variograms of the downscaled models closely resemble those of the reference. Another aspect of interest is that these variograms are unbounded (i.e., they do not stabilize at the variance value), meaning that the variables considered are nonstationary. This further reinforces and motivates the necessity of using specific techniques to address nonstationarity, as detailed in section 3.1.
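 To make the variogram comparison concrete, the sketch below computes a simplified experimental variogram on a regular grid, gamma(h) = 0.5 * mean[(z(x + h) - z(x))^2], using axis-aligned lags only. This is an assumption made for brevity; the estimator used in the study is not specified here, and the function name `experimental_variogram` is hypothetical. The example field is a pure linear trend, chosen to show how a nonstationary field yields an unbounded variogram.

```python
import numpy as np

def experimental_variogram(field, max_lag):
    """Experimental semivariogram of a 2-D gridded field for lags 1..max_lag.

    Only pairs separated along the grid axes are used (a simplification).
    NaN cells (e.g. masked areas) are ignored in the averages.
    """
    z = np.asarray(field, dtype=float)
    gammas = []
    for h in range(1, max_lag + 1):
        dx = z[:, h:] - z[:, :-h]    # pairs separated by h cells along columns
        dy = z[h:, :] - z[:-h, :]    # pairs separated by h cells along rows
        sq = np.concatenate([dx.ravel(), dy.ravel()]) ** 2
        gammas.append(0.5 * np.nanmean(sq))
    return np.array(gammas)

# Made-up nonstationary field: a linear trend in the column direction.
trend_field = np.tile(np.arange(6.0), (6, 1))
gamma = experimental_variogram(trend_field, max_lag=3)
```

 For this trend field the semivariance keeps increasing with lag instead of levelling off at a sill, which is the behaviour described for the variables considered here. Applying the same function to each realization and to the reference field would give curves that can be overlaid for comparison.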
4.3. Relationships Between Downscaled Variables
 The accuracy of the multivariate dependencies between variables is examined using scatter plots for both WRF reference variables and downscaled variables, with results presented in Figures 10 and 11 for the summer and winter seasons, respectively. The results for spring and autumn are shown in Figures S6 and S7, respectively (Supporting Information). The first row of each figure shows the downscaled variables, while the second row plots the WRF reference variables. In Figure 10, comparison of the scatter plots between downscaled and reference values demonstrates that the three variables (TSK, SMOIS, and LH) present nonlinear relationships with each other, due to the complex interactions between the land and atmosphere. The reproduction of the various nonlinear dependencies is excellent: a consequence of the multivariate capability of the DS method considering patterns between all of the variables. Note that there are known analytical relationships between the different variables LH, TSK, and SMOIS in the literature; however, our approach reproduces them not by physical modeling but by data-driven, nonparametric sampling. The scatter plots vary over different seasons (see Figure 11 and Figures S6 and S7 in the Supporting Information) due to variation in the magnitudes of TSK, SMOIS, and LH. However, the reproduction of nonlinear dependencies remains excellent in all the cases.
 Downscaling of climate model data with the DS method was demonstrated through downscaling of WRF regional climate model outputs to reproduce seasonal values of temperature, SMOIS, and LH from 50 km resolution to a grid of 10 km over the MDB in southeast Australia. The DS algorithm samples patterns in a training image conditional on different covariates that are available at a coarser resolution. The training image is based on 20 years of data (1985–2005) at both coarse and fine scales. For each season, all of the variables are considered simultaneously in the training image. Given this rich inventory of spatial patterns, the DS approach is then used for downscaling for the following year 2006 but using only the coarse data. The nonstationarity in spatial features is considered in the simulation by providing the locational information in the training image. The spatial relationships among variables are not assumed to be the same over the different seasons, and the results show that these varying dependencies are correctly reproduced.
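 The conditional sampling idea summarized above can be illustrated with a heavily simplified, pointwise sketch. It is not the DS implementation used in this study: the actual method compares multivariate spatial patterns (neighborhoods), whereas this sketch treats each target cell independently and only shows the scan-and-accept logic. A random scan through the training set stops at the first candidate whose weighted distance to the conditioning covariates falls below a threshold. The function name, the covariate layout (one coarse value plus x and y location), the weights, and all parameter values are hypothetical.

```python
import numpy as np

def direct_sampling_downscale(train_coarse, train_fine, target_coarse,
                              weights, threshold, max_scan, rng):
    """Pointwise illustration of threshold-based conditional sampling.

    train_coarse  : (n_train, n_cov) coarse covariates, one row per training cell
    train_fine    : (n_train, n_var) fine-scale values paired with each row
    target_coarse : (n_target, n_cov) covariates where fine values are wanted
    """
    scale = train_coarse.std(axis=0)
    scale[scale == 0] = 1.0                       # avoid division by zero
    w = np.asarray(weights, dtype=float) / np.sum(weights)
    out = np.empty((len(target_coarse), train_fine.shape[1]))
    for i, cond in enumerate(target_coarse):
        best_d, best_k = np.inf, -1
        # Random scan path through the training data, limited to max_scan nodes.
        for k in rng.permutation(len(train_coarse))[:max_scan]:
            d = np.sum(w * np.abs(train_coarse[k] - cond) / scale)
            if d < best_d:
                best_d, best_k = d, k             # remember best candidate so far
            if d <= threshold:
                break                             # accept first match under threshold
        out[i] = train_fine[best_k]               # paste the paired fine-scale value
    return out

# Synthetic training data (made up): a coarse TSK-like value plus x, y location.
rng = np.random.default_rng(42)
n_train = 500
coarse_val = rng.uniform(280.0, 320.0, n_train)
xy = rng.uniform(0.0, 1.0, (n_train, 2))
train_coarse = np.column_stack([coarse_val, xy])
train_fine = (coarse_val + rng.normal(0.0, 1.0, n_train))[:, None]
target_coarse = train_coarse[:5].copy()

# Different seeds give different scan paths, hence different realizations.
realizations = np.stack([
    direct_sampling_downscale(train_coarse, train_fine, target_coarse,
                              weights=[1.0, 0.5, 0.5], threshold=0.05,
                              max_scan=200, rng=np.random.default_rng(seed))
    for seed in range(10)
])
```

 Including the location columns among the covariates mimics, in a crude way, how locational information is supplied to account for nonstationarity. The weights and the distance threshold play the role of the parameters that the text suggests tuning per season.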
 The downscaled results from DS show excellent agreement with the spatial distribution of WRF reference variables at a fine scale for the TSK and LH across all seasons. The downscaled results for SMOIS show that DS is not able to reproduce some of the small-scale land surface features, especially over water bodies such as canals, lakes, and reservoirs. It is expected that this could be improved by using a land/water mask. The error analysis of the downscaled variables for all seasons and correlations among the variables show the potential of our approach in simultaneously producing statistical downscaling of multiple variables. The results are presented for one single realization of 50 possible realizations over which DS runs were performed. Because all realizations are equiprobable, the multiple realizations are a representation of the uncertainty associated with the downscaling: a feature that other methods are generally unable to provide. Understanding the uncertainty in these reproductions is critical information for using the downscaled forcing in applications such as rainfall-runoff models, groundwater models, or other operational systems designed to characterize hydrological and environmental response.
 Apart from providing an explicit capacity to represent uncertainty in the downscaled product, the DS approach also presents an advantage of reduced computational cost compared to RCMs. One full simulation of an RCM takes approximately 6000 CPU hours, while DS takes about 48 hours of CPU time for 50 realizations. The DS approach developed here also has a wide scope of application in the area of remote sensing and catchment studies, especially in feature sharpening or interpretation through image fusion [Agam et al., 2007; Renzullo et al., 2008], and gap filling in spatial data collected from satellites [Mariethoz et al., 2012].
 One limitation of the approach is that training images must be available that contain both coarse and fine-scale variables. Building such training images requires that each coarse variable to be downscaled be informed at the fine scale for some previous time steps. In our case, this information derives from model runs corresponding to previous time steps, but in practical applications, it could be beneficial to use remote sensing measurements. Such application with real-world remote sensing data is a focus of ongoing research. Another area of further research will examine the utility of this method for downscaling GCM outputs to local scales, based on observations at a daily time resolution. The methodology will be applied at the catchment scale, with the aim to downscale GCM outputs to fine-enough resolutions to meet the forcing data requirements of surface water-groundwater models. Such a study will enable an evaluation of the effects of climate change on groundwater resources. In evaluating future climate scenarios, it will be important to consider seasonal variability associated with extreme conditions responding to El Niño and La Niña occurrences. Examining the capacity of the DS approach to incorporate and reproduce such internal dynamics will allow a more robust representation of the observed system.
 This work was supported through a postdoctoral research fellowship as part of the National Centre for Groundwater Research and Training (NCGRT), a joint initiative between the Australian Research Council and the National Water Commission.