4.1. Time Stability of Model Parameters
[23] To characterize the general temporal behavior, the spatial averages of the model parameters for all catchments under study were analyzed. Figure 4 shows the spatial average of the snow correction factor SCF, the degree-day factor DDF, the maximum soil moisture storage FC, and the nonlinearity parameter B plotted against the calibration period. There are significant time trends. The snow correction factor SCF decreases with time. In the recent warmer years, less precipitation falls as snow, so one would expect the catch deficit of the precipitation gauges during snowfall to be smaller, which explains the decreasing trend of SCF. The degree-day factor DDF also tends to decrease with time. The average DDF in all catchments, calibrated to runoff data from the late 1970s, is about 1.8, while it is about 1.65 if calibrated to the most recent period. A possible interpretation of this is that the snowpacks, accumulated in the winter, tend to be larger in the colder years and snow melt starts later in spring. Rain-on-snow events are more likely, and more radiation is available for melting snow and ice. The melt rates (relative to air temperature) then tend to be larger, which explains the larger DDF in the late 1970s. No trend in the DDF values is found for the drier catchments, although the mean values vary strongly between the different periods. The runoff regimes in the drier catchments, which are mainly located in the flatter eastern part of Austria, are rainfall dominated, and rain-on-snow events are less frequent. The occurrence of single rain-on-snow events in the different calibration periods therefore influences the calibrated DDF values, which increases the observed temporal variability of DDF. The average values of the maximum soil moisture storage capacity FC strongly increase with time. Evapotranspiration would be expected to increase because of the significant increase in air temperatures. The interpretation of this is that the soils tend to be drier, and on average, more water can be stored in the soils. This is reflected by larger calibration values of FC in the more recent warmer years. Similarly, the nonlinearity parameter B increases with time. B doubles from about 3 to almost 6 in the 3 decades. This increase is apparently related to more linear runoff generation and a lower fraction of rain that becomes runoff in the more recent years (Figure 2). This is plausible as the drier catchments have larger values of B (Figure 4), so as the catchments get slowly drier, B increases. The spatial coefficients of variation of the four model parameters are around 0.15, 0.3, 0.6, and 1.1 (not shown here) and do not change much in time. Also, the trends are consistent between all three catchment groups in Figure 4. This suggests that the trends of the spatial averages shown in Figure 4 represent changes in the hydrological conditions that take place in most catchments and are not an artifact of parameter uncertainty due to calibration. The other parameters (TM and LS_{UZ}) do not show any significant temporal trend (not shown here), while the parameter LP describing the limit for potential evaporation slightly decreases with time. The latter trend is related to increases in the evapotranspiration over the years.
[24] It is now of interest to understand whether these trends can be explained by climatic variability. For each catchment, the temporal correlation between the calibrated model parameters and one of a number of hydroclimatic indicators was estimated, each of them representative of one 5 year period. This means that six data points were used in each regression. As the model parameters and the climate indicators of the six calibration periods are not necessarily normally distributed, the Spearman rank correlation coefficient r_{s} was used here to measure the dependence of the model parameters on the climate indicators:
r_{s} = 1 - \frac{6 \sum_{i=1}^{n} [rk(x_{i}) - rk(y_{i})]^{2}}{n (n^{2} - 1)}
where rk(x_{i}) and rk(y_{i}) are the ranks of the model parameter x_{i} and the climate indicator y_{i} in period i, with the highest value having rank 1 and the lowest value having rank n, and n is the number of periods. Spearman's r_{s} varies between −1 and 1, where −1 represents a completely negative correlation and 1 represents a completely positive correlation. Completely uncorrelated pairs of data have a Spearman's r_{s} of 0. The spatial variability of these temporal correlation coefficients was then plotted as Box-Whisker plots in Figure 5. The hydroclimatic indicators are mean annual precipitation (P), mean annual air temperature (Temp), mean annual potential evapotranspiration (PET), mean annual runoff (Q), and the mean annual ratio of runoff and precipitation (Q/P). On average, the snow correction factor SCF is negatively correlated to the mean annual values of precipitation, temperature, and PET, with correlation coefficients of −0.25 to −0.5, and positively correlated to Q and Q/P, with correlation coefficients of up to 0.6. A similar pattern of a negative correlation to precipitation, air temperature, and PET and a positive correlation to Q and Q/P occurs for the degree-day factor, while the pattern is opposite for the maximum soil moisture storage capacity FC and the nonlinearity parameter B. The spatial median of the correlation coefficients of FC and B to precipitation is around 0.3, while it is around 0.5 for air temperature and potential evapotranspiration. It is interesting that FC and B are positively correlated to precipitation. This is because of the increasing trend in precipitation (Figure 2), which corresponds to the increasing trends in FC and B.
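The rank correlation described above can be sketched in a few lines. This is a minimal illustration, assuming the standard rank-difference form of Spearman's coefficient (exact only when there are no ties); the function names and the six sample values are hypothetical and are not data from the study.

```python
def ranks_descending(values):
    """Rank values so the highest gets rank 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rank correlation from squared rank differences (no-ties formula)."""
    n = len(x)
    rx, ry = ranks_descending(x), ranks_descending(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


# Hypothetical values for six 5 year calibration periods in one catchment.
b_param = [3.2, 3.6, 4.1, 4.0, 4.9, 5.2]   # calibrated B per period
air_temp = [6.1, 6.3, 6.9, 7.2, 7.4, 7.9]  # mean annual air temperature per period
rs = spearman_rs(b_param, air_temp)
```

With only six points per catchment, a single swapped pair of ranks already moves the coefficient noticeably, which is one reason the spread between catchments is examined rather than any individual value.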
[25] There is a large variability in the correlations between the catchments for the same parameter–climate indicator combinations. This large variability may be partly related to parameter identifiability issues [Beven and Binley, 1992; Montanari, 2005]. The large variability may also be related to differences in the hydrological processes in the catchments. For example, for most catchments the B parameter is positively correlated with air temperatures because warmer years have less runoff because of increased evapotranspiration. However, there are some catchments where the correlations are negative. These are mostly catchments in the alpine parts of Austria where snow or glaciers play an important role. In these catchments, years with above-average air temperatures are associated with above-average runoff, which then translates into negative correlations between air temperature and B. Overall, the grey range (25%–75% quantiles) of most of the Box-Whisker plots in Figure 5 is either completely positive or negative, suggesting that most model parameters are indeed meaningfully correlated to the climate indicators.
[26] To provide more insight into the plausibility of the temporal trends of the model parameters, independent data sets were analyzed. In Figure 6 (top) the average calibrated DDF and the average percentage of rain-on-snow days are plotted against the calibration period. A day was considered a rain-on-snow day when more than 30% of the catchment area was covered by snow (as estimated by the snow depth data), air temperature was above 1°C, and precipitation was more than 1 mm/d. The percentage of rain-on-snow days is therefore information independent of the model results. The percentage of rain-on-snow days tends to decrease in the warmer, more recent, periods, which is consistent with the decrease of the DDF, as one would expect larger DDF on rain-on-snow days than on sunny days because of the latent heat and long wave radiation. For the drier catchments (Figure 6, top right) the trends are not consistent, but here the DDF is not well defined as there is little snow in these catchments.
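The day classification given above translates directly into a small test. The sketch below uses the three thresholds stated in the text; the function names, the input layout, and the choice to take the percentage over all supplied days are assumptions made for illustration.

```python
def is_rain_on_snow_day(snow_cover_fraction, air_temp_c, precip_mm):
    """Apply the three criteria: >30% snow-covered area, >1 degC, >1 mm/d."""
    return snow_cover_fraction > 0.30 and air_temp_c > 1.0 and precip_mm > 1.0


def rain_on_snow_percentage(days):
    """Percentage of days meeting all criteria.

    `days` is a list of (snow_cover_fraction, air_temp_c, precip_mm) tuples.
    Taking the percentage over all supplied days is an assumption.
    """
    if not days:
        return 0.0
    hits = sum(1 for day in days if is_rain_on_snow_day(*day))
    return 100.0 * hits / len(days)


# Hypothetical daily records for one catchment.
sample_days = [
    (0.80, 2.5, 4.0),   # snow covered, mild, wet: counts
    (0.80, -3.0, 4.0),  # too cold
    (0.10, 5.0, 10.0),  # too little snow cover
    (0.50, 1.5, 0.4),   # too little precipitation
]
share = rain_on_snow_percentage(sample_days)
```

Because none of the inputs comes from the runoff model, an indicator built this way stays independent of the calibrated DDF it is compared with.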
[27] The change in runoff generation over the years, indicated by a trend in the B values of the model, is compared with an independent analysis of event runoff coefficients. Merz et al. [2006] back calculated event runoff coefficients from hourly runoff data, hourly precipitation data, and estimates of snowmelt. This means their analysis is different from the one in this paper in terms of the model used (event model versus continuous model) and in terms of the time scale of the data (hourly versus daily). In this paper we adopted the methodology of Merz et al. [2006] to estimate event runoff coefficients for a total of 39,700 events in the period 1976 to 2006 and averaged them for the same 5 year periods used here. The mean event runoff coefficient (Figure 6) decreases from 0.42 for the period 1976–1981 to 0.38 for the period 2001–2006 for all catchments under study, while the mean B value increased from 3.2 to 5.2. The interpretation of this is that, due to increasing mean air temperatures, evapotranspiration has increased and catchments have become drier. More rainfall can be stored in the soils, so a smaller portion of rainfall contributes to direct runoff. The model accounts for this change in runoff generation by a change in the calibrated values of the B parameter.
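The link between a larger B and a smaller runoff fraction can be illustrated numerically. The sketch below assumes the common HBV-type formulation, in which the fraction of rain and snowmelt that becomes runoff equals the relative soil moisture raised to the power B. This section does not state the model equation, so the functional form, the function name, and the soil moisture values are assumptions; only the two B values are taken from the text.

```python
def runoff_fraction(soil_moisture_mm, fc_mm, b):
    """Fraction of rain that becomes runoff, assuming an HBV-type power law."""
    relative_moisture = min(max(soil_moisture_mm / fc_mm, 0.0), 1.0)
    return relative_moisture ** b


# Hypothetical soil state: 140 mm stored out of a 200 mm capacity.
frac_early = runoff_fraction(140.0, 200.0, 3.2)  # mean B reported for 1976-1981
frac_late = runoff_fraction(140.0, 200.0, 5.2)   # mean B reported for 2001-2006
```

Under this assumed form, the same soil state yields a smaller runoff fraction with the larger B, which is the direction of change seen in the event runoff coefficients. The sketch holds soil moisture and FC fixed, so it isolates the effect of B only.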
[28] The spatial mean of four routing parameters, the three storage coefficients K_{0}, K_{1}, and K_{2} and the parameter C_{P} controlling the percolation to the lower zone, are plotted against the period of calibration in Figure 7. K_{1} and C_{P} slightly decrease, while for K_{0} and K_{2} no trend is apparent. The decrease in K_{1} implies that the hillslope runoff response has slightly accelerated in recent years (9.5 as opposed to 11 days). The decrease in C_{P} in the most recent period implies that the groundwater recharge has decreased (1.5 as opposed to 2 mm/d), which is consistent with the lower catchment soil moisture to be expected in a warmer climate. The spatial coefficient of variation of the routing parameters is around 0.26, 0.3, 0.45, and 0.4 (not shown here) and does not change much in time, which indicates that the trends of the spatial averages in Figure 7 are consistent for most catchments.
[29] The spatial medians of the correlation coefficients between the routing parameters and the climate indicators vary from −0.25 to 0.1 (Figure 8). These are much smaller values than those for the four soil moisture parameters, which have values of up to 0.6 (Figure 5). While the parameters of the soil moisture routine are obviously linked to changes in soil moisture over the last 30 years, driven by climate, the relationships for the routing parameters are less clear. This would be expected, as runoff routing is mainly controlled by the topography, the river network, geology, and soil type and, to a lesser degree, by the soil moisture state.
4.2. Trading Space for Time
[30] It is interesting to link the temporal relationship between model parameters and climatic forcing to the spatial variability of the model parameters for a given time period. If the increase in, e.g., B values for the last 30 years is caused by higher air temperatures, a similar change of the B values should be found in space if one moves from cold to warm catchments for the same time period. This is the idea of trading space for time. To analyze this, the spatial Spearman rank correlation coefficients of the model parameters and climate indicators were calculated for each of the six calibration periods separately, and the temporal variability over the six periods is given as Box-Whisker plots in Figures 9 and 10. Furthermore, to provide a general idea of the controls on the model parameters, the spatial correlation coefficients between model parameters and catchment attributes that do not change with time are also given. Merz and Blöschl [2009b] term this type of catchment attributes “static” to reflect their temporal stability within the time scale of the analysis. There are, of course, longer-term interactions of climate and these catchment characteristics related to landform and soil evolution [Merz and Blöschl, 2008a, 2008b], but they are considered small for the 30 years considered here. The static catchment attributes are catchment area, mean elevation, river network density (RND), average topographic slope, and, as the forested area has only slightly increased in the last decades [Jonas et al., 1998], percent forest cover.
[31] For most model parameters and climate indicators the spatial correlation coefficients are, on average, in a range similar to or slightly higher than the temporal correlation coefficients. For example, the mean temporal correlation coefficient between B and air temperature is about 0.5 (Figure 5), while the mean spatial correlation coefficient is about 0.7 (Figure 9). This may be attributed to the larger range of climate indicators in space than in time. The mean annual air temperatures have increased by almost 2°C during the time period of this analysis, while the mean annual air temperatures range from about 0°C in the high Alps to more than 10°C in the eastern flatlands, i.e., a difference of more than 10°C. Similarly, mean annual precipitation has increased, on average, by about 100 mm/yr, while the differences in space are more than 2500 mm/yr (400 mm/yr in the east and 3000 mm/yr in the west). The temporal variability of the spatial correlation coefficients is rather small, i.e., the spatial correlations do not change much between the six calibration periods. The main regional patterns of the catchment characteristics are therefore represented rather well by the parameters. The distinct regional patterns of the parameters in Austria are consistent with the results of Merz and Blöschl [2004], who compared the calibrated parameters of two 11 year calibration periods for the same catchments as here (their Figures 4–7). For example, for both calibration periods, B values were high in the eastern flatlands and low in the alpine environment of the west, which is also borne out in the regional calibration of Parajka et al. [2007b].
[32] The spatial correlations of SCF, DDF, and B with most climate-related variables (air temperature, PET, runoff, and runoff-rainfall ratio) are consistent with the corresponding temporal correlations. SCF and DDF are negatively correlated with temperature and PET and positively correlated with runoff and the runoff-rainfall ratio, both in the temporal (Figure 5) and spatial (Figure 9) correlations. Similarly, B is positively correlated with temperature and PET and negatively correlated with runoff and the runoff-rainfall ratio, both in the temporal (Figure 5) and spatial (Figure 9) correlations. This lends additional credence to the interpretations above. However, for some parameters and climate indicators the temporal and spatial correlations are inconsistent, but at least part of this inconsistency can be explained on hydrological grounds. For example, the temporal correlation of FC and air temperature is positive; that is, with increasing mean air temperature in the more recent years, the maximum storage capacity tends to increase. In warmer years, more soil moisture evaporates, the soils are drier, and more water can be stored, resulting in a higher maximum storage capacity. In the spatial correlation analysis, FC and air temperature are negatively correlated, meaning that FC tends to increase as one moves to colder catchments. This negative correlation can be explained by the regional pattern of FC [see, e.g., Merz and Blöschl, 2004, Figure 5]. In Austria, FC tends to be high in the warm and dry flatlands of the east and in the dry and cold inner-alpine catchments in the west. The lowest FC values are found in the wet catchments at the northern rim of the high Alps, where orographic effects enhance precipitation. The mean annual temperature of this region is between those of the high Alps and the flatlands of the east. As more inner-alpine catchments than warm flatland catchments are included in the spatial analysis, the correlation of FC and temperature is negative. A similar reasoning may explain the differences in the correlations of other parameters and other climate indicators.
[33] For the snow routine (SCF and DDF) and runoff generation (FC and B) parameters, the correlation coefficients with climate indicators are higher than those with the static catchment attributes such as RND and catchment area. This suggests that the variability in snow and runoff generation processes at the scale of Austria is more strongly controlled by climatic variability than by the differences in static catchment attributes. This is in line with the analyses of event runoff coefficients in Austria by Merz and Blöschl [2009a]. They concluded that event runoff coefficients were most strongly correlated to indicators representing climate, such as mean annual precipitation, through controlling the seasonal soil moisture variability. Land use, soil types, and geology did not seem to exert a major control on the runoff coefficients with the data they had available and at the regional scale of the analysis. Clearly, if more detailed soils and geology data were available for smaller catchment scales, one would expect very strong controls.
[34] For the four parameters (K_{0}, K_{1}, K_{2}, and C_{P}) of the routing component, the correlation coefficients to climate indicators are always small and of the same order as, or slightly lower than, those of the static catchment attributes (Figure 10). This is consistent with the results of Merz and Blöschl [2004], who found the highest correlations of K_{0}, K_{1}, K_{2}, and C_{P} with the mean slope, the river network density, the percentage of tertiary and quaternary geological units, and catchment area, respectively. For a similar model type and 11 catchments in Sweden, Seibert [1999] found the best correlation of the flow routing parameters with catchment area, lake percentage, and forest percentage. Interestingly, in Figure 10 there is a consistent correlation between the percolation rate C_{P} and percent forest cover, which would be expected because of more permeable soils in the forest than for other land uses.
4.3. Implications for Climate Impact Analysis
[35] The previous analyses have shown that some of the calibrated parameters consistently vary with time and can be related to climate fluctuations. Given these parameter changes, an obvious question now is whether they matter for hydrologic prediction. The effects of the parameter changes on three flow indicators are analyzed here: the Q_{95} low-flow quantile (the discharge exceeded 95% of the time), the Q_{50} median flow, and the Q_{5} high-flow quantile that is exceeded 5% of the time. Q_{95} is widely used in Europe and is relevant for numerous problems in water resources management [e.g., Smakhtin, 2001; Laaha and Blöschl, 2007]. The median is a measure of the water balance. Q_{5} was used instead of peak discharges as it can be more robustly estimated from the 5 year periods used here.
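The three flow indicators are exceedance quantiles of the flow series, so the discharge exceeded p percent of the time is the (100 − p)th percentile of the sorted flows. The sketch below shows one way to compute them; the linear interpolation between order statistics, the function name, and the synthetic flow series are assumptions, since the text does not specify how the quantiles were estimated.

```python
def flow_exceeded(flows, exceedance_pct):
    """Discharge exceeded `exceedance_pct` percent of the time.

    Uses linear interpolation between order statistics (an assumed estimator).
    """
    ordered = sorted(flows)  # ascending, so low flows come first
    # Exceeded p% of the time corresponds to the (100 - p)th percentile.
    pos = (100.0 - exceedance_pct) * (len(ordered) - 1) / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    weight = pos - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


# Synthetic daily flows of 1, 2, ..., 101 (hypothetical units).
daily_flows = [float(q) for q in range(1, 102)]
q95 = flow_exceeded(daily_flows, 95)  # low-flow quantile
q50 = flow_exceeded(daily_flows, 50)  # median flow
q5 = flow_exceeded(daily_flows, 5)    # high-flow quantile
```

Note the ordering q95 < q50 < q5: the index refers to the exceedance frequency, so a larger index means a smaller discharge.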
[36] The error of runoff predictions consists of two parts. The first stems from the imperfect fit of the model to the runoff data during the calibration period and is mainly related to data and model structure errors [Di Baldassarre and Montanari, 2009]. The second arises when moving from the calibration to the verification (or any other) period and is mainly related to less than optimum parameters. Verification performance therefore tends to be lower than the calibration performance [see, e.g., Merz et al., 2009]. Assuming that the model parameters remain constant with time is expected to increase the second part of the error if nonstationarities are present. Both errors are shown in Figure 11. The dotted black lines in Figure 11 show the average of the errors in the calibration period (i.e., the first part of the error due to an imperfect calibration):
e_{i,i} = \frac{\hat{Q}_{i,i} - Q_{i}}{Q_{i}}
where \hat{Q}_{i,i} is the simulated flow characteristic in period i using the parameters calibrated for the same period, and Q_{i} is the observed flow characteristic in period i. For example, the last segment of the dotted black line in Figure 11 (top right) shows 0.08, which is the relative difference of the simulated and observed Q_{5} for the period 2001–2006 using the parameters calibrated for 2001–2006. The thick solid black lines in Figure 11 show the average of the errors in verification period i (i.e., both parts):
e_{i,j} = \frac{\hat{Q}_{i,j} - Q_{i}}{Q_{i}}
where \hat{Q}_{i,j} is the simulated flow characteristic in period i using the parameters calibrated for a different period j, and Q_{i} is the observed flow characteristic in period i. The horizontal axes of the panels relate to the periods i, while the different rows relate to different periods j. For example, the last segment of the thick solid black line in Figure 11 (top right) shows 0.30, which is the relative difference of the simulated and observed Q_{5} for the period 2001–2006 using the parameters calibrated for 1976–1981. The spatial averages of the wetter and drier catchments are shown as solid and dashed grey lines, respectively, as in Figure 2.
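The two error measures differ only in which parameter set produced the simulation. A minimal sketch follows, assuming the error is the signed relative difference of simulated and observed flow and that the spatial average is a plain mean over catchments; all names and numbers are hypothetical.

```python
def relative_error(simulated, observed):
    """Signed relative difference; positive values mean overestimation."""
    return (simulated - observed) / observed


def mean_relative_error(simulated_flows, observed_flows):
    """Average the relative errors over catchments (plain mean, an assumption)."""
    errors = [relative_error(s, o) for s, o in zip(simulated_flows, observed_flows)]
    return sum(errors) / len(errors)


# Hypothetical Q5 values for three catchments in the period 2001-2006.
observed = [10.0, 20.0, 40.0]
sim_same_period = [11.0, 21.0, 44.0]   # parameters calibrated for 2001-2006
sim_other_period = [13.0, 27.0, 52.0]  # parameters calibrated for 1976-1981

calibration_error = mean_relative_error(sim_same_period, observed)
verification_error = mean_relative_error(sim_other_period, observed)
```

The gap between the two values is the part of the error that comes from transferring parameters in time, as opposed to the imperfect fit that is present even in the calibration period.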
[37] Figure 11 indicates that there are significant errors in the simulated flow quantiles if one assumes that the model parameters are time stable. When using the parameters calibrated to the period 1976–1981 for predicting the flows in 2001–2006, the Q_{95} low flows are overestimated by about 12%, Q_{50} is overestimated by about 15%, and the Q_{5} high flows are overestimated by about 35% (Figure 11, top, thick solid black lines). The parameters from the period 1976–1981 represent colder periods with less evapotranspiration and relatively higher runoff generation rates (lower B values; see Figure 4) and smaller soil moisture storages (lower FC values; see Figure 4). The model hence produces too much runoff when applying the parameters to the drier period 2001–2006. Similarly, using parameters calibrated to recent periods for simulating flow in the earlier periods tends to underestimate the flow characteristics (Figure 11, bottom). As expected, the differences between simulated and observed flows increase with increasing time lag between the period used to calibrate the parameters and the period for which flows are simulated.
[38] Interestingly, the differences between simulated and observed flows tend to be smaller for the Q_{95} low flows and the Q_{50} median flows than for the Q_{5} high flows. High-flow situations may differ substantially between the different calibration periods. Apparently, the model only represents high flows well if it is calibrated to a period in which similar high-flow conditions were observed, thus reducing the model performance as one moves away from the calibration period. Mean flows and low flows are more stable in time, so mean and low-flow conditions of one period may be more representative of other periods [Laaha and Blöschl, 2005]. Low flows in the Austrian lowlands are caused by long periods of no or little rainfall in summer, while low flows in the Alps are caused by snow and freezing processes in winter. These processes are associated with longer time scales than flood processes, and aquifer storage additionally increases time scales [Skøien et al., 2003]. This explains the lower time dependence. Also, errors in the input rainfall data will be less important in periods of no or little rainfall.
[39] To analyze the predictive errors due to the time lag between the calibration and prediction periods in more detail, the cumulative distribution functions (CDFs) of the absolute differences between simulated and observed flows have been plotted in Figure 12. The absolute errors of the Q_{50} median flows for time lags of 0, 5, 15, and 25 years are at least 1%, 5%, 9%, and 16%, respectively, for half the catchments (CDF = 0.5). The errors of low and high flows are larger as it is more difficult to represent the extremes well. As expected, the absolute errors are smallest for a time lag of zero, as this reflects the calibration case where the model performance is better than in the validation case. For all three flow indices, Q_{95}, Q_{50}, and Q_{5}, the errors increase with increasing time lag. This means that the longer one extrapolates to the future (or the past), the more the model errors will increase. Only for the Q_{5} high flows are the errors of 20 and 25 years similar. In the case of the low and high flows, the contribution of the calibration error (6% and 9%, respectively) to the total error is much larger than in the case of the median flows (1%) (CDF = 0.5). This is partly related to the choice of the objective function, which gives more weight to median flows than to high flows, and partly to the time scales. However, in all instances the error due to the time trends is very important. The differences are even larger for the large errors, i.e., if one examines the errors exceeded in one quarter of the catchments (CDF = 0.75), as indicated in Table 2.
Table 2. Absolute Values of the Relative Validation Errors (%) of Simulated Low (Q_{95}), Median (Q_{50}), and High Flows (Q_{5}) for Different 5 Year Periods, When Time Stability of the Parameters Is Assumed^{a}

T (years)  CDF = 0.50                CDF = 0.75
           Q_{95}  Q_{50}  Q_{5}     Q_{95}  Q_{50}  Q_{5}
0          6       1       9         10      2       18
5          9       5       13        15      9       24
10         10      7       15        16      12      28
15         10      9       20        18      16      35
20         11      12      27        19      19      44
25         14      16      25        25      24      47
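The CDF = 0.50 and CDF = 0.75 values are the median and upper quartile of the absolute relative errors across catchments. A minimal sketch of that step follows; the interpolation rule, the function name, and the five error values are assumptions for illustration and are not data from the study.

```python
def error_at_cdf(abs_errors, cdf_level):
    """Absolute error at a given empirical CDF level (linear interpolation)."""
    ordered = sorted(abs_errors)
    pos = cdf_level * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    weight = pos - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


# Hypothetical signed relative errors (%) of Q50 in five catchments.
errors_pct = [-4.0, 12.0, -7.0, 3.0, 20.0]
abs_errors = [abs(e) for e in errors_pct]  # the sign is dropped before ranking

median_error = error_at_cdf(abs_errors, 0.50)
upper_quartile_error = error_at_cdf(abs_errors, 0.75)
```

Taking absolute values first means that over- and underestimation in different catchments do not cancel, unlike in a spatial average of signed errors.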