In common climate model bias-correction procedures, temperature and precipitation are corrected separately, thereby degrading the dynamical link represented within the model. We propose a methodology that advances the state-of-the-art by correcting not just the 1D intensity distributions separately but the full two-dimensional statistical distribution. To assess the effectiveness of the proposed method, it is applied to the REMO regional climate model output using point measurements of hourly temperature and precipitation from 6 weather stations over Germany as observations. A standard cross-validation is performed by dividing the data into two nonoverlapping 15 year periods. Results show that the methodology effectively improves the temperature-precipitation copula in the validation period, unlike separate 1D temperature and precipitation corrections which, by construction, leave the copula unchanged. An unexpected result is that a relatively small number (<5) of temperature bins are required to achieve significant improvements in the copula. Results are similar for all stations.
 Although bias correction of climate forcing fields has become a necessary step in climate impact simulations, many recent studies have identified limitations and pitfalls associated with this process. In particular, separate bias correction of dynamically linked fields, such as temperature and precipitation (T&P), may lead to significant responses in some impact models [Hagemann et al., 2011; Chen et al., 2011]. However, the WBC has been applied successfully within large inter-comparison projects to examine the effects on both simulated climate and extreme hydrological events [Dosio and Paruolo, 2011; Rojas et al., 2011].
 In this paper we develop a bias correction method for the full 2D probability distribution function (PDF) of T&P and cross-validate it using a regional climate model together with observations from 6 weather stations over Germany. The cross-validation focuses on the ability of the corrected model output to reproduce the observed T&P copulas. We first discuss the data and methodology. Then the results for the actual model and observed data are presented. We finally conclude with a discussion of the improvements.
2. Methodology and Data
 The observational T&P data were provided by the German Weather Service (DWD) in the form of hourly-mean measurements from precipitation Hellmann-gauge stations at several locations in Germany. T&P measurements were originally recorded by five-minute aggregation and then further aggregated into hourly time steps. Station locations and names are shown in Figure 1. The station datasets span a 30 year period from 1950 to 1980. The first 15 years are used for the calibration of the bias correction parameters, while the last 15 years are used for the cross-validation. We stress that a 30 year period may not be enough to represent climate time scale changes in the bias.
 Model data consist of hourly average T&P intensities from a 20th century time slice simulation; hence model dates are purely nominal. The adopted model is the Max Planck Institute for Meteorology Regional Climate Model (REMO) with 10 km resolution [Jacob et al., 2012], driven by the ECHAM5-MPI-OM C20_1 run for the IPCC 4th assessment report [IPCC, 2007]. For each gauge station the corresponding model data are taken from a 4 × 4 grid of the nearest neighbors to the locations of the hourly rain gauge measurements.
3. Bias Correction Methodology
 The methodology presented here is the natural extension into 2 dimensions of the WBC. Here we will give a brief description of the 1D methodology and then describe the 2-dimensional extension.
 Consider two time series $P^o$ and $P^s$, where the superscripts 'o' and 's' stand for 'observed' and 'simulated' respectively. We will assume that 'P' stands for precipitation. We will also assume that the time series are of equal length: $P^o_i$ and $P^s_i$, with $i = 1, 2, \ldots, N$, where $N$ is the total number of data points. If this is not the case, it is straightforward to obtain two time series of equal length by random sub-sampling. This is an acceptable procedure since it is only the intensity spectrum and not the temporal structure of the time series that matters. The goal of the bias correction effort is to identify a transfer function (TF) such that the new time series $P^c_i$, where 'c' stands for 'corrected', obtained by $P^c_i = \mathrm{TF}(P^s_i)$, has an intensity histogram as close as possible to the intensity histogram of the observed time series, $P^o$.
 The TF can be readily obtained by sorting the $P^o_i$ and $P^s_i$ time series according to intensity from the lowest to the highest values. If the sorted observed time series is plotted against the sorted simulated one, the points on the emerging graph can be joined to obtain a piecewise continuous TF with the aforementioned characteristics (Figure 2).
 The thick red line in Figure 2 is generally referred to as the perfect transfer function (PTF) because, if applied to modeled values, the resulting $P^c_i$ time series would, by construction, have exactly the same statistical characteristics as the observed time series $P^o_i$. This is a trivial result and does not lend any value to the PTF or to any close-fitting interpolation obtained at the expense of a high number of parameters. Transfer functions with a high number of parameters perform well, by construction, when applied to model output in the calibration period. However, they are likely to do poorly in the cross-validation period because of the variation of the climate bias over long time scales. To minimize the number of parameters in our TF we chose a linear fit (dashed line in Figure 2). Whether or not one chooses higher-order TFs, as in Piani et al., depends on the quantity of data available to constrain the parameters.
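 The sorting and linear-fit steps described above can be sketched in a few lines of Python. This is a minimal illustration only, not the code used in this study: the function names `fit_linear_tf` and `apply_linear_tf` are hypothetical, an ordinary least-squares fit is assumed, and no special treatment of dry hours or of negative corrected precipitation is included.

```python
def fit_linear_tf(obs, sim):
    """Fit a linear transfer function obs ~ slope * sim + intercept.

    Both series are sorted by intensity first, so the fit is made to the
    (simulated, observed) points that trace out the perfect transfer
    function. Ordinary least squares is an assumption of this sketch.
    """
    if len(obs) != len(sim):
        raise ValueError("series must have equal length (sub-sample first)")
    x = sorted(sim)
    y = sorted(obs)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("simulated series has no spread; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_linear_tf(values, slope, intercept):
    """Apply the fitted transfer function to simulated values."""
    return [slope * v + intercept for v in values]


# Toy (hypothetical) data: the temporal order does not matter, only intensity.
sim = [0.0, 1.0, 2.0, 3.0]
obs = [3.0, 1.0, 5.0, 7.0]
slope, intercept = fit_linear_tf(obs, sim)
corrected = apply_linear_tf(sim, slope, intercept)
```

 With only two parameters, such a fit cannot reproduce the observed histogram exactly in the calibration period, which is the intended trade-off against over-fitting.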
 When correcting both T&P, a common approach is to correct the two fields separately. This potentially downgrades the representation of the dynamical link between the two. Ideally, a bias correction method for T&P should correct the two-dimensional intensity histogram as a whole. Also, the merit of a two-dimensional bias correction method should be measured in terms of the added value relative to the separate correction approach. Finally, the proper tool for distilling the added value of the fully 2D-approach from the 2D-histogram is to examine the copula of the corrected, simulated and observed fields.
4. Brief on Copulas and Two-Dimensional Bias Correction
 A copula is a joint cumulative distribution function with uniform marginals. The derived PDF is referred to as the copula density (CD). Given a 2D-PDF of T&P, the 1D-PDF of temperature, obtained from the same dataset by integrating over all precipitation values, is the temperature marginal. If the 2D histogram is the best estimate of the 2D PDF, the best estimate of the CD is extracted simply by substituting every value of T&P with its intensity rank value and dividing by the size of the sample. For instance, if we have 5 temperature values, [296 K, 293 K, 292 K, 294 K, 295 K] becomes [5, 2, 1, 3, 4]/5. A rigorous derivation of copulas is given by Laux et al.
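 As an illustration of the rank substitution just described, the following Python sketch reproduces the five-value temperature example and bins the rank-transformed pairs into a 2D histogram. The function names, the tie-breaking rule (order of appearance), and the normalization (a value of 1 corresponds to independence) are assumptions of the sketch rather than details taken from this study.

```python
def ranks(values):
    """Integer intensity ranks 1..N; ties are broken by order of appearance."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def rank_transform(values):
    """Replace each value by its rank divided by the sample size."""
    n = len(values)
    return [r / n for r in ranks(values)]


def copula_density(t, p, bins=5):
    """Estimate the copula density as a 2D histogram of the rank pairs.

    Rows index temperature rank bins, columns index precipitation rank bins.
    Counts are scaled so that independent variables give values close to 1.
    """
    n = len(t)
    counts = [[0] * bins for _ in range(bins)]
    for rt, rp in zip(ranks(t), ranks(p)):
        i = (rt - 1) * bins // n  # equal-count bins along each axis
        j = (rp - 1) * bins // n
        counts[i][j] += 1
    return [[c * bins * bins / n for c in row] for row in counts]


u = rank_transform([296, 293, 292, 294, 295])

# Perfectly dependent toy data: all mass falls on the diagonal.
toy = list(range(10))
cd = copula_density(toy, toy, bins=5)
```

 Because only ranks enter the estimate, each marginal of the resulting histogram is uniform by construction.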
 The first step in the 2D bias correction is to apply the standard 1D WBC separately to model temperatures. Next the T&P pairs are grouped into temperature intensity quantiles. Finally a standard 1D WBC for precipitation is carried out within each temperature quantile. This methodology is more readily illustrated than described.
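 The three steps can also be written as a short Python sketch. It is hedged in several ways: the names `linear_tf`, `quantile_bins`, and `correct_2d` are hypothetical, the observed and simulated series are assumed to have equal length, the temperature quantiles are taken as equal-count bins by rank in each data set, a linear TF is used throughout, and the correction is fitted and applied on the same (calibration) sample. The fallback to a pure shift when a bin has no spread in simulated precipitation is also an assumption.

```python
def linear_tf(obs, sim):
    """Least-squares line through the sorted (sim, obs) points."""
    x, y = sorted(sim), sorted(obs)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        # Assumption: with no spread in the simulated values, only shift.
        return 1.0, mean_y - mean_x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantile_bins(values, n_bins):
    """Group sample indices into n_bins equal-count bins by intensity rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    return [order[k * n // n_bins:(k + 1) * n // n_bins] for k in range(n_bins)]


def correct_2d(t_obs, p_obs, t_sim, p_sim, n_bins=5):
    """Correct T in 1D, then correct P separately within each T quantile."""
    # Step 1: standard 1D correction of simulated temperature.
    a, b = linear_tf(t_obs, t_sim)
    t_cor = [a * t + b for t in t_sim]
    # Step 2: group the T&P pairs into temperature quantiles.
    obs_bins = quantile_bins(t_obs, n_bins)
    sim_bins = quantile_bins(t_cor, n_bins)
    # Step 3: 1D precipitation correction inside each temperature quantile.
    p_cor = list(p_sim)
    for ob, sb in zip(obs_bins, sim_bins):
        a_p, b_p = linear_tf([p_obs[i] for i in ob], [p_sim[i] for i in sb])
        for i in sb:
            p_cor[i] = a_p * p_sim[i] + b_p
    return t_cor, p_cor


# Toy (hypothetical) data: observed P jumps at high T, simulated P does not.
t_obs, p_obs = [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 10.0, 11.0]
t_sim, p_sim = [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, 1.0]
t_cor, p_cor = correct_2d(t_obs, p_obs, t_sim, p_sim, n_bins=2)
```

 In the toy data the simulated precipitation carries no dependence on temperature; after the binned correction the high-temperature bin is mapped onto the higher observed precipitation values, so a T&P link appears in the corrected pairs.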
 To illustrate the methodology it was applied to a synthetic T&P dataset. Figure 3a shows two histograms derived from synthetic 2D T&P datasets. The dashed color-filled contours represent the simulated T&P histogram while the solid contours represent the observed T&P histogram. The distribution shown in Figure 3e is the CD extracted, as explained above, from the synthetic observed T&P data set used to derive the 2D histogram shown in Figure 3a (solid non-colored contours). Figure 3e reveals properties of the dynamical link between T&P far more clearly than Figure 3a. For example, there is a clear discontinuity between the median temperature in the top 2 versus bottom 3 precipitation quintiles (Figure 3e). One may perceive that there is a difference between the temperature medians in high versus low intensity precipitation from Figure 3a but the abrupt nature of the discontinuity is evident only in Figure 3e.
 The CD extracted from the simulated T&P data set (not shown) is flat, that is, there is no link in the simulated T&P. Figure 3b differs from Figure 3a in that the simulated, colored, 2D histogram has been bias corrected using linear 1D bias corrections separately for T&P. Close inspection of Figure 3b will reveal that the mean and variance of the colored histogram are closer to those of the solid contour histogram along both the T and P axes. Of course, skewness and other higher order moments have not changed, since we have applied a linear correction only. The CD of the corrected data set in Figure 3b is still flat, although the colored histogram in Figure 3b is a much better representation of the observed data set. Figure 3c is, again, obtained by correcting the 1D marginals of T&P separately but, unlike for Figure 3b, using the PTF instead of a linear interpolation. Simply put, this means that the corrected data set (color-filled contours) has marginals that exactly match the marginals of the observed data set (solid black contours).
 As in Figure 3b, the CD extracted from the corrected data set is flat (not shown). Two bold horizontal lines are traced across Figure 3c. Careful visual inspection will show that along both lines, representing different precipitation values, the median temperature is the same. Finally, in Figure 3d, the full 2D bias correction is applied. In this case 10 quantiles are used. Visual inspection along the two red lines, representing the same precipitation values as in Figure 3c, shows that now the median temperature changes with precipitation and, hence, that the corrected data set finally presents a link between T&P. Figure 3f shows the CD of the corrected data from Figure 3d. Here the limits of the quantiles, which are by construction evenly spaced, are also shown as vertical lines. This is done purely to give an intuitive understanding of the concept of CD and uniform marginal. Finally, now that we have applied a fully 2D bias correction, we obtain some structure in the derived CD. The similarities between the two CDs, corrected and observed, are encouraging since we applied a simple linear correction.
 Figure 4 shows the results of applying the 2D bias correction, in a cross-validation setup, to the Aachen station T&P time series and the corresponding REMO dataset, sub-sampled as explained in the data section. Figure 4a is the 2D histogram obtained from REMO T&P output and Figure 4b is the corresponding CD.
 Unsurprisingly, the CD of the simulated data does have some structure; in particular, a noticeable ridge along the X = Y diagonal explains the realistically high T-P correlation in the model data. Figure 4c is the 2D histogram calculated from simulated and separately corrected T and P. Clearly the linear and separate 1D bias correction of T&P does a good job, since the 2D histogram in Figure 4c is much more similar to the observed 2D histogram in Figure 4g than the original in Figure 4a. However, because the corrections are done separately, the corresponding CD, shown in Figure 4d, has exactly the same structure as the original shown in Figure 4b.
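 The invariance of the CD under separate 1D corrections follows from the fact that the CD depends only on intensity ranks, and any increasing transfer function preserves ranks. A minimal Python check is sketched below; the sample values and the two linear corrections are purely hypothetical and serve only to illustrate the argument.

```python
def ranks(values):
    """Integer intensity ranks 1..N; ties are broken by order of appearance."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


# Hypothetical simulated temperature (K) and precipitation values.
t_sim = [281.2, 290.5, 275.0, 288.1, 284.7]
p_sim = [0.4, 2.1, 0.0, 0.9, 1.3]

# Separate 1D linear corrections with positive (hypothetical) slopes.
t_cor = [1.1 * t - 25.0 for t in t_sim]
p_cor = [0.8 * p + 0.1 for p in p_sim]

# The rank pairs, and hence the empirical copula, are unchanged.
same_t = ranks(t_sim) == ranks(t_cor)
same_p = ranks(p_sim) == ranks(p_cor)
```

 The same argument holds for any strictly increasing TF, not only a linear one, because only the ordering of the values matters.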
 A full 2D bias correction is now applied to the REMO data and the results are shown in Figure 4e. Five temperature quantiles were used. The 2D histogram shown in Figure 4e is not much of an improvement relative to that obtained through separate 1D corrections shown in Figure 4c. The significant improvement is evident only in the CD shown in Figure 4f. In particular, the prominent narrow maximum at the top right corner, depicting high temperatures associated with high precipitation events, and the much broader maximum at the opposite extreme, depicting low to medium temperatures associated with low to moderate precipitation, are now present. The Aachen station was chosen simply because it is the first in alphabetical order; all other stations produce similar results (not shown). It is worth noticing that both the simulated and the observed CDs show a dynamical link between T&P and that such links are in the same direction, that is, higher temperatures are associated with higher precipitation intensities (r > 0, where r is the correlation). However, the nature of the CDs is far more complex than can be described by correlation alone.
 Precipitation and temperature are dynamically coupled as described by the Clausius-Clapeyron relation. It is the subject of recent [e.g., Allen and Ingram, 2002; Allan and Soden, 2007; Lenderink and van Meijgaard, 2008] and current research whether such increases in moisture holding capacity directly relate to increases in precipitation, and at which observational time-scales [Haerter et al., 2010; Berg and Haerter, 2012]. However, significant correlations between surface temperature and precipitation intensity are found on various scales. When T&P data derived from climate model output are used as input for hydrological models, such correlations and higher order statistical links are likely important. However, when a combined correction of T&P is made, there is a clear concern that this could generate statistics that are not physically consistent with other model variables. Precipitation at high temperatures is bound to yield higher values of subsequent evaporation (i.e., leading to moisture recycling and reduced discharge) than when the associated temperatures are lower. Conversely, low values of precipitation at high temperatures may lead to severe drought, thereby threatening water resources and society, while at low temperatures they may remain inconsequential.
 In this study we developed and tested a full 2D bias correction. The fact that the 2D bias correction successfully corrects the simulated T&P copula in a cross-validation setup may be unsurprising. It is the natural extension of a 1D methodology that has been tried and tested in the relevant scientific literature. What is surprising is that the benefits of a fully 2D bias correction cannot be appreciated by simply looking at the 2D histograms or calculating the T&P correlation alone. Even when T&P are corrected separately, the two-dimensional histogram may appear very similar to the observed one. However, the dynamical link between the two is unchanged. Whenever possible, full 2D bias corrections should be used or, at the very minimum, the differences between the observed and corrected T&P CDs should be analysed.
 We are grateful to the German Weather Service (DWD) for providing the observational data for precipitation and temperature. We further acknowledge the use of model output data from the Max Planck Institute for Meteorology Regional Climate Model (REMO).