4.1 Model-data comparison: estimated and observed station normals
The methods discussed above were evaluated individually in terms of their ability to estimate observed temperature normals at station level. Specifically, the 1961–1990 normals of monthly temperatures were estimated by each method for the 1361 stations shown in Figure 1, and then compared with the observed values. In all cases we used a leave-one-out approach, in which the station whose monthly normals were being reconstructed was removed to avoid ‘self-influence’ of the station data under consideration. The only, minor, exception to this approach was made for RK for reasons of computational time. Here, the kriging weight of the station to be reconstructed was simply set to 0 and the weights of the remaining stations were re-normalized accordingly, while the MLR coefficients and the covariance matrix were obtained from the full station data set.
The results of the comparison between estimated and observed station normals are given month-by-month in Table 1, where the accuracy of each method is measured by the mean error (bias), the mean absolute error (MAE), and the RMSE.
Table 1. Accuracy of the monthly climatologies obtained with the three methods
The three methods show a very small bias, with values within ±0.05 °C in any month. This indicates that none of these methods is seriously affected by systematic errors, at least globally, i.e. when the mean bias over all the stations is taken. On the other hand, the MAE and RMSE turned out to be smaller for LWLR (with monthly averages of 0.65 and 0.84 °C, respectively) than for RK (0.69 and 0.88 °C, respectively) and MLRLI (0.80 and 1.01 °C, respectively).
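The three accuracy scores used here can be sketched as follows for a generic leave-one-out evaluation. This is a minimal illustration and not the code used in the study: the `idw_predict` estimator is a hypothetical stand-in for MLRLI, RK or LWLR, and the four stations are toy data.

```python
import math

def idw_predict(target, others, power=2.0):
    """Hypothetical stand-in estimator: inverse-distance-weighted mean."""
    num = den = 0.0
    for x, y, value in others:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

def leave_one_out_scores(stations, predict=idw_predict):
    """Return (bias, MAE, RMSE) of leave-one-out estimates.

    stations: list of (x, y, observed_normal) tuples.
    """
    errors = []
    for i, (x, y, observed) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]  # drop the target station
        errors.append(predict((x, y), others) - observed)
    n = len(errors)
    bias = sum(errors) / n                          # mean error
    mae = sum(abs(e) for e in errors) / n           # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return bias, mae, rmse

# Toy data (illustrative values only)
stations = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (0.0, 1.0, 11.0), (1.0, 1.0, 13.0)]
bias, mae, rmse = leave_one_out_scores(stations)
```

In this toy case positive and negative errors cancel, so the bias is zero while the MAE and RMSE are not, which is why a small global bias alone does not rank the methods.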
For the latter method we also studied the error reduction when the residuals are subjected to an additional interpolation method (see Spinoni (2010) for more details on this approach). We used both inverse distance weighting (IDW, with weights given by distance and elevation differences) and kriging, again with the leave-one-out approach; we obtained MLRLI errors slightly smaller than RK errors, although still larger than those of LWLR (results not shown).
We also analysed monthly LWLR residuals in order to check whether we could further reduce errors by applying a spatial interpolation method such as IDW or kriging. We considered the same distance intervals used for the kriging variograms and plotted the semivariance of the temperature residual differences versus distance: these plots (not shown) highlight that the LWLR station residuals do not depend on distance; therefore, it is useless to subject them to spatial interpolation methods such as IDW or kriging.
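The residual check described above can be sketched as a binned empirical semivariogram. The function name, the bin edges and the toy residuals are assumptions made for illustration; the distance intervals actually used for the kriging variograms are not given in this section.

```python
import math

def empirical_semivariogram(points, residuals, bin_edges):
    """Half the mean squared residual difference of station pairs, per distance bin.

    Returns one value per bin, or None for bins without station pairs.
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            for k in range(n_bins):
                if bin_edges[k] <= d < bin_edges[k + 1]:
                    sums[k] += 0.5 * (residuals[i] - residuals[j]) ** 2
                    counts[k] += 1
                    break
    return [s / c if c else None for s, c in zip(sums, counts)]

# Toy example: four stations on a line with alternating residuals
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
residuals = [1.0, -1.0, 1.0, -1.0]
gamma = empirical_semivariogram(points, residuals, [0.5, 1.5, 2.5, 3.5])
```

A semivariance that does not increase with distance, as reported for the LWLR residuals, means that nearby residuals are no more alike than distant ones, so a distance-based interpolator has nothing to exploit.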
We additionally tested the importance of station weighting in LWLR by performing an ordinary regression with the closest stations and all weights set equal to 1. In this case the errors turned out to be equal to those of RK, with monthly average MAE and RMSE of 0.69 and 0.88 °C, respectively.
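The effect of station weighting can be illustrated with a weighted least-squares fit of temperature against elevation, in which setting every weight to 1 gives the ordinary regression used in this test. This is a sketch under assumptions: the weights below are arbitrary illustrative numbers, not those of Equation (12), and the station values are invented.

```python
def weighted_linear_fit(elevations, temperatures, weights):
    """Weighted least-squares fit T = intercept + slope * h."""
    sw = sum(weights)
    h_mean = sum(w * h for w, h in zip(weights, elevations)) / sw
    t_mean = sum(w * t for w, t in zip(weights, temperatures)) / sw
    sxx = sum(w * (h - h_mean) ** 2 for w, h in zip(weights, elevations))
    sxy = sum(w * (h - h_mean) * (t - t_mean)
              for w, h, t in zip(weights, elevations, temperatures))
    slope = sxy / sxx
    return t_mean - slope * h_mean, slope

# Invented data: three stations follow T = 14.2 - 0.006 * h,
# while the fourth (e.g. a less representative station) does not.
elevations = [200.0, 600.0, 1000.0, 1800.0]
temperatures = [13.0, 10.6, 8.2, 6.0]

# Hypothetical weights that down-weight the fourth station
_, slope_weighted = weighted_linear_fit(elevations, temperatures, [1.0, 1.0, 1.0, 0.05])
# All weights equal to 1: ordinary regression
_, slope_unweighted = weighted_linear_fit(elevations, temperatures, [1.0, 1.0, 1.0, 1.0])
```

With these numbers the weighted slope stays much closer to the local value of −0.006 than the unweighted one, which is the kind of difference the test above quantifies in terms of MAE and RMSE.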
We then tested whether the three methods produce systematic errors at a local level when selected station clusters are considered. To this end, we evaluated the station bias separately for different 1° latitude belts, including (1) all the stations, (2) only those stations with elevation above 800 m a.s.l. Figures 6 and 7 report the bias distribution in January and July, for case (1) and (2) respectively. Figure 6 clearly shows that at some latitude belts MLRLI is affected by a large bias during winter, especially in Southern Italy (positive bias), and to a lesser extent in the northern part of Central Italy (negative bias). Errors are generally smaller in summer, even though MLRLI again shows the largest errors. Figure 7 (case 2) shows that the large winter bias of MLRLI in the southernmost part of Italy mainly concerns high-elevation stations. For these stations RK also shows a marked bias, whereas the LWLR errors remain small in every latitude belt. In this case too, the differences between the competing methods are smaller in summer than in winter, and MLRLI again performs worst.
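The clustering by latitude belt can be sketched as below. The function name and the sample values are hypothetical, and the sketch returns the mean error per belt, whereas the figures show the full error distribution as box-plots.

```python
import math
from collections import defaultdict

def bias_by_latitude_belt(latitudes, elevations, errors, min_elevation=None):
    """Mean station error per 1-degree latitude belt.

    If min_elevation is given, only stations above that elevation are kept.
    """
    belts = defaultdict(list)
    for lat, elev, err in zip(latitudes, elevations, errors):
        if min_elevation is not None and elev <= min_elevation:
            continue
        belts[math.floor(lat)].append(err)  # belt 37 covers 37 <= lat < 38
    return {belt: sum(v) / len(v) for belt, v in sorted(belts.items())}

# Invented station errors (estimated minus observed, in deg C)
latitudes = [37.2, 37.8, 38.5, 45.1]
elevations = [900.0, 100.0, 1200.0, 850.0]
errors = [1.0, 0.2, 0.6, -0.4]

all_stations = bias_by_latitude_belt(latitudes, elevations, errors)          # case (1)
high_stations = bias_by_latitude_belt(latitudes, elevations, errors, 800.0)  # case (2)
```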
Figure 6. January and July box-plots of the errors of the three methods, clustering the stations within 1° latitude belts. All stations are considered here. The boxes range from the lowest quartile to the highest one and are centred on the median; whiskers represent the minimum and the maximum errors.
The difficulty of MLRLI (and partially RK) in correctly estimating temperature normals of the high-elevation stations in Southern Italy is even more apparent from Figure 8, which shows, for each month, the bias distribution restricted to stations with elevation ≥800 m a.s.l. and latitude ≤40°N.
Figure 8. Monthly box-plot of the errors of the three methods from stations with latitude <40°N and elevation >800 m a.s.l. The boxes range from the lower to the higher quartiles and are centred on the median; whiskers represent the minimum and the maximum errors.
Further insights into limitations of a global MLR involving all stations—at the base of both MLRLI and RK—can be gained by comparing the full MLR results with those of an additional MLR restricted to those stations located at a latitude below 44°N. The regression coefficients relative to each independent variable are listed month-by-month in Table 2, for (a) the full MLR and (b) the restricted MLR as above. Here, elevation coefficients (lapse rates) are markedly different, especially during winter. This discrepancy can be explained by the strong influence that Alpine stations have on the full MLR: they are the bulk of mountain stations and cover a wider altitude range than the Apennine stations. The contribution of the Alpine stations, in particular, leads to weaker than observed lapse rates during winter in large parts of Italy, thereby producing the MLRLI positive bias at high-elevation stations in Southern Italy (Figures 6-8). Hence, the lapse rates estimated by MLR over all stations do not reflect the real temperature-elevation relationship across the entire region. Actually, both local improvements and RK after the initial MLR yield a more realistic spatial distribution of the lapse rates (especially for kriging), albeit these subsequent re-adjustments cannot completely remove the above bias.
Table 2. Coefficients obtained applying the temperature versus elevation-latitude-longitude MLR to (a) all the 1484 stations of the area and (b) only to the 572 stations with latitude <44°N
| Month | (a) Elevation | (a) Latitude | (a) Longitude | (b) Elevation | (b) Latitude | (b) Longitude |
|---|---|---|---|---|---|---|
| 1 | −0.440 | −1.15 | −0.17 | −0.628 | −0.82 | −0.14 |
| 2 | −0.514 | −0.91 | −0.15 | −0.636 | −0.66 | −0.13 |
| 3 | −0.573 | −0.64 | −0.11 | −0.599 | −0.49 | −0.06 |
| 4 | −0.611 | −0.43 | −0.03 | −0.567 | −0.33 | 0.02 |
| 5 | −0.606 | −0.37 | 0.04 | −0.521 | −0.30 | 0.09 |
| 6 | −0.607 | −0.46 | 0.01 | −0.516 | −0.37 | 0.07 |
| 7 | −0.600 | −0.59 | −0.05 | −0.497 | −0.39 | 0.02 |
| 8 | −0.592 | −0.66 | −0.03 | −0.515 | −0.46 | 0.01 |
| 9 | −0.566 | −0.67 | −0.03 | −0.561 | −0.53 | −0.02 |
| 10 | −0.514 | −0.79 | −0.07 | −0.597 | −0.64 | −0.09 |
| 11 | −0.477 | −1.02 | −0.07 | −0.608 | −0.79 | −0.07 |
| 12 | −0.435 | −1.20 | −0.15 | −0.618 | −0.86 | −0.11 |
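The seasonal pattern of the discrepancy between the two sets of elevation coefficients can be checked directly from Table 2. The short sketch below copies the elevation coefficients of the full and restricted MLR from the table and ranks the months by the size of the gap; the variable and function names are illustrative.

```python
# Elevation coefficients for months 1-12, copied from Table 2
full_mlr = [-0.440, -0.514, -0.573, -0.611, -0.606, -0.607,
            -0.600, -0.592, -0.566, -0.514, -0.477, -0.435]        # (a) all stations
restricted_mlr = [-0.628, -0.636, -0.599, -0.567, -0.521, -0.516,
                  -0.497, -0.515, -0.561, -0.597, -0.608, -0.618]  # (b) latitude < 44 N

def coefficient_gaps(full, restricted):
    """Restricted minus full coefficient for each month (1 = January)."""
    return {month: round(r - f, 3)
            for month, (f, r) in enumerate(zip(full, restricted), start=1)}

gaps = coefficient_gaps(full_mlr, restricted_mlr)
# Months ordered by decreasing absolute gap
largest = sorted(gaps, key=lambda m: abs(gaps[m]), reverse=True)[:4]
print(largest)  # [1, 12, 11, 2]
```

The four largest gaps fall in January, December, November and February, and in each of them the restricted MLR gives the more negative coefficient, in line with the weaker winter lapse rates attributed to the full MLR in the text.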
Similar discrepancies between global and restricted MLR appear in the latitude coefficients, as seen in Table 2(a) and (b), and underline the fact that the dependence of temperature on latitude is also poorly captured by a global, linear approach. The increase of temperature along the north-to-south direction is, in fact, strongly emphasized by the Alps and the northern parts of the Apennines. These west-to-east-oriented orographic barriers cause important climatic discontinuities with strong north-to-south temperature gradients within rather small distances; the Alpine temperature discontinuity is more important in summer, whilst the Apennine one is mainly evident in winter.
The above discussion suggests that the reason for larger errors in MLRLI (and partially RK) than in LWLR mainly lies in the assumption that the same relations of temperature with elevation, latitude and longitude hold over the entire study area. In fact, in both MLRLI and RK the effect of these variables is estimated globally by means of a unique MLR computed over all the station temperature normals. In previous studies (Hiebl et al., 2009; 2011), the limitations of a unique MLR over complex regions were overcome by splitting the area of interest into smaller, more homogeneous sub-domains and performing a separate MLR for each of them. A drawback of this approach, however, is that artificial discontinuities frequently arise at the boundaries between neighbouring sub-domains. The LWLR proposed here, on the other hand, estimates temperature with a local approach using at most 35 stations and avoids these boundary discontinuities.
A final remark concerns the role of major local improvements to MLR in special station clusters, e.g. sea and Po-Plain stations. In the case of the Po-Plain effect, when the error analysis is restricted to those stations for which the local improvement was tailored, both MLRLI and LWLR present a very low bias; RK, on the contrary, is affected by an important positive bias in winter (Figure 9(b)). In the case of the sea effect, no bias is found for MLRLI, whereas a negative winter bias affects both RK and LWLR (Figure 9(a)), while in summer RK is positively biased and LWLR is almost unbiased. A deeper analysis, however, shows that even though MLRLI is unbiased, it exhibits larger MAEs and RMSEs than the other two methods, and LWLR again performs best in every month (Figure 9(a) and (b)). This is because MLRLI captures the modelled effect well only on average and overlooks its variability from point to point. For instance, coastal effects of the Adriatic and Ligurian Seas are modelled by MLRLI in the same way without taking into account the different geographic features of the two regions.
Figure 9. Monthly average errors of the three methods from (a) coastal (stations within the first 1.3 km from the sea) and (b) Po-Plain stations. Filled symbols indicate RMSE, empty symbols indicate BIAS.
In view of the above results LWLR exhibits the best performances at the station level, if we consider both the entire study area and smaller clusters of stations, especially during winter months when the complex orography of the Italian territory strongly influences mean temperatures and gradients. Therefore, since LWLR convincingly reproduces local variations of temperature with elevation, it stands out as the most suitable approach for producing Italian climatologies on a high-resolution grid, as discussed in detail below.
4.2 High-resolution climatologies
A detailed comparison between the climatologies produced with the three methods is illustrated in Figure 10(a)–(d) for January and July. Here, the differences between monthly normals are reported (MLRLI minus LWLR and RK minus LWLR), rather than their absolute values, to emphasize even small discrepancies.
The most pronounced differences appear in the MLRLI-LWLR comparison, where the MLRLI difficulties previously detected in the estimates of station normals are distinctly reflected at the grid-point level. Noteworthy is the large positive difference observed in January (Figure 10(a)) between MLRLI and LWLR climatologies of high elevation grid points in southern Italy, with values that exceed 3 °C and peak at the highest mountains (e.g. Mount Etna).
As seen in Figure 10(b), similar discrepancies, though with smaller values, also appear in the January comparison between RK and LWLR. The RK and MLRLI overestimations at these grid points have their common root in the large error affecting MLR elevation coefficients during winter in Southern Italy, which causes a relevant positive bias at high-elevation sites, as discussed in the previous section (Table 2).
Remarkable winter differences also concern the Po-Plain basin (Figure 10(a)) where, as previously discussed (Figure 9), it turned out that LWLR estimates the station normals more accurately than MLRLI does (LWLR has lower RMSE than MLRLI). Indeed, in the latter case a single monthly correction to MLR coefficients for the whole Po-Plain basin hardly captures the complexity of the continentality effect of this area. This can be seen, for instance, in the north-eastern part of the Po-Plain basin (to the north of the Venice lagoon) where the MLRLI grid-point estimates of temperature normals appear to be overly corrected (i.e. negatively biased) by the Po-Plain local improvement. Instead, MLRLI estimates suffer from a positive bias in the area south of the Po river, where a larger correction would be required to account for thermal inversion effects. Likewise, MLRLI normals appear to be under-corrected in the western part of the basin, and especially in some areas not classified as Po-Plain points on the basis of the elevation criteria (Figure 3) but nonetheless affected by the cold air pool which characterizes the Po Plain in winter.
It can be seen again in Figure 10(a) that the MLRLI normals in the coastal areas are either positively or negatively biased with respect to LWLR, where the detailed effect of the sea, especially during winter, is not well described by a global regression performed over the Italian coastal stations altogether.
The unsatisfactory representation of the latitude effects provided by MLR and, in particular, the overestimation of the north-to-south temperature gradient (significantly affected by the temperature discontinuity across the northern part of the Apennines), as discussed in the previous section (Table 2), is clearly reflected in winter climatologies (Figure 10(a)) in the cold biased MLRLI normals of the northern part of Central Italy and the warm biased MLRLI normals of the southernmost part of the country. These biases persist to a lesser degree during summer (Figure 10(c)), when MLRLI-LWLR discrepancies are generally less marked. During summer the MLRLI normals show a warm bias in the northern part of the Alps (Switzerland and Austria), induced by the poor MLR representation of the Alpine temperature gradient (also in this case there is an important discontinuity due to a geographic barrier).
As for the RK-LWLR comparison (Figure 10(b) and (d)), the most pronounced discrepancies are again found in the winter climatologies. Besides the above-discussed high-elevation warm bias in Southern Italy, RK normals are up to 3 °C colder than LWLR normals in the North-Eastern Alps. In this area the valleys are characterized by a very strong thermal inversion during winter, which is not well accounted for by the MLR performed over the full station data set. Hence, the corresponding station normals have strong negative residuals, which kriging extends to the surrounding points regardless of their elevation. Note that when the highest-elevation stations are excluded from LWLR, the resulting normals of high-elevation grid points in the above region get colder and the RK-LWLR differences gradually reduce. That is, a potentially negative bias in LWLR vanishes with increasing availability of high-elevation data, whereas RK is less sensitive and its bias persists.
Finally, absolute values of gridded monthly normals for LWLR are shown in Figures 11 and 12 for January and July, respectively, which represent the coldest and warmest periods of the year. Figure 13 shows the LWLR climatology for the yearly average temperature.
4.3 LWLR prediction interval
Besides a better performance in terms of station errors, LWLR has the additional advantage of estimating a prediction interval for any grid point of the considered domain. This estimation was performed as in Daly et al. (2008). The procedure consists in estimating the variance of the temperature (T) of a grid point at elevation $h_{new}$ as
$$s^2\{T_{new}\} = s^2\{\hat{T}_h\} + \mathrm{MSE}$$
where MSE is the mean square error of the observed station temperatures compared to those obtained with the regression model.
This estimation takes into account both the variation in the possible location of the expected temperature for a given elevation ($s^2\{\hat{T}_h\}$) and the variation of the individual station temperatures around the regression line (MSE). The former term depends on the regression coefficient errors; the latter reflects the fact that the temperature versus elevation linear regression on which LWLR is based describes only a part of the variability of the station temperature normals.
Expressing $s^2\{\hat{T}_h\}$ in terms of MSE, station weights [$w_i$, as defined in Equation (12)] and station elevations ($h_i$), we get:
$$s^2\{\hat{T}_h\} = \mathrm{MSE}\left[\frac{1}{\sum_i w_i} + \frac{(h_{new} - \bar{h})^2}{\sum_i w_i (h_i - \bar{h})^2}\right]$$
where i ranges over the stations involved in the grid-point reconstruction and $\bar{h} = \sum_i w_i h_i / \sum_i w_i$ is their weighted mean elevation.
The problem of defining a prediction interval (with confidence α) for the grid point with elevation $h_{new}$ can easily be solved considering the interval:
$$\hat{T}_h \pm t \, s\{T_{new}\}$$
where $\hat{T}_h$ is the regression estimate at elevation $h_{new}$ and t is the value of a Student distribution with df degrees of freedom corresponding to cumulative probability (1 − α)/2. As in Daly et al. (2008), df was simply identified with the number of stations considered in the regression, although this assumption has some limitations, which are discussed in detail by Daly et al. (2008).
In order to get prediction intervals easily comparable with the station leave-one-out RMSE, we selected α = 0.68; so, we called these intervals PI68. Their half widths, calculated for the grid points closest to the stations, turn out to be in excellent agreement with the leave-one-out station RMSE (Table 1), with monthly differences generally within 1% and with the average of the monthly differences within 0.1%.
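The variance decomposition described above can be sketched in code as follows. This is an illustration under stated assumptions, not the procedure of the study: the regression-line variance uses the textbook weighted least-squares expression, the MSE is taken as the weighted mean of the squared residuals, the Student value is passed in by the caller, and the station data are invented. The value `t_value=1.0` is only a placeholder; for a few tens of degrees of freedom the Student value for a 68% interval is close to 1.

```python
import math

def prediction_half_width(elevations, temperatures, weights, h_new, t_value):
    """Half width of a prediction interval at elevation h_new (sketch)."""
    sw = sum(weights)
    h_mean = sum(w * h for w, h in zip(weights, elevations)) / sw
    t_mean = sum(w * t for w, t in zip(weights, temperatures)) / sw
    sxx = sum(w * (h - h_mean) ** 2 for w, h in zip(weights, elevations))
    sxy = sum(w * (h - h_mean) * (t - t_mean)
              for w, h, t in zip(weights, elevations, temperatures))
    slope = sxy / sxx
    intercept = t_mean - slope * h_mean
    # Assumption: MSE is the weighted mean of squared regression residuals.
    mse = sum(w * (t - (intercept + slope * h)) ** 2
              for w, h, t in zip(weights, elevations, temperatures)) / sw
    # Uncertainty of the regression line at h_new (textbook WLS form, assumed)
    var_fit = mse * (1.0 / sw + (h_new - h_mean) ** 2 / sxx)
    # Add the scatter of individual stations around the line
    var_new = var_fit + mse
    return t_value * math.sqrt(var_new)

# Invented stations: elevation (m), temperature normal (deg C), unit weights
elevations = [100.0, 400.0, 800.0, 1200.0, 1600.0]
temperatures = [14.2, 12.1, 10.3, 7.6, 5.4]
weights = [1.0] * 5

near = prediction_half_width(elevations, temperatures, weights, 800.0, t_value=1.0)
far = prediction_half_width(elevations, temperatures, weights, 3000.0, t_value=1.0)
```

The half width is smallest near the weighted mean station elevation and grows as the grid point moves outside the elevation range of the contributing stations, because the regression-line term increases with the squared elevation offset.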
Figure 14 shows the LWLR PI68 half widths for January and July. As expected from the results in Table 1 (LWLR RMSEs), winter months are those with the largest PI68. In winter the area with the largest prediction intervals is located in the Ligurian Alps: this area has very strong temperature gradients within rather short distances, with extremely mild conditions on the southern slopes facing the Ligurian Sea, and a rather continental climate on the northern slopes. Therefore, in this area, slope exposure plays a very important role, and stations at the same elevation but with different exposures can have very strong temperature differences. Other areas with rather large errors are located in the central and eastern part of the Alpine region: they probably reflect the difficulty of performing the fit in the presence of stations located in very cold valleys. In summer, the area with the highest errors is in Calabria: these rather large errors are due to the short distance of the Apennine chain from the sea. Here the results are affected by rather cool stations at the lowest elevations (the coastal stations are mitigated by the sea), which cause a significant deviation from linear behaviour in the station temperature-elevation regression.
Naturally, all of the above problems would be greatly reduced by a denser station network. Station density, in fact, turns out to be the most important issue for LWLR: whereas MLRLI (and, in part, RK) can also be used with a rather sparse station network, LWLR requires a dense station network in all parts of the study domain.