3.1. Data description
In this numerical application, the accuracy and feasibility of the proposed multi-site multivariate SD approach are assessed using the 30-year daily extreme temperature records available from a network of 10 weather stations located in the southwest region of Quebec and the southeast region of Ontario (Table I and Figure 1). The station data are partitioned into a calibration period (1961–1975) and a validation period (1976–1990).
Figure 1. Study area with the location of weather stations from Environment Canada, and NCEP/NCAR and CRCM grid points. The NCEP/NCAR grid points correspond to the interpolated values onto the CGCM3 global Gaussian grid (around 3.75° longitude × 3.75° latitude, see DAI CGCM3 predictors, 2008), and the CRCM grid points are on the polar stereographic grid using a horizontal resolution of 45 km. This figure is available in colour online at wileyonlinelibrary.com/journal/joc
Table I. Coordinates (latitude, longitude, and altitude) of the weather stations (i.e. daily Tmin and Tmax from Environment Canada National Archive) used in this study. The location of these stations in Quebec and Ontario provinces (Canada) is shown in Figure 1
|Station name||Province||Latitude (°N)||Longitude (°W)||Altitude (m)|
|Ottawa CDA||Ontario||45.38||−75.72||79.2|
|St Alban||Quebec||46.72||−72.08||76.2|
The atmospheric predictor variables used in this study originate from the NCEP/NCAR reanalysis data (Kalnay et al., 1996; Kistler et al., 2001). These predictors have been linearly interpolated to match the Gaussian grid of the third version of the Canadian Centre for Climate Modelling and Analysis Coupled Global Climate Model (CGCM3) (DAI CGCM3 Predictors, 2008). The value in each grid cell corresponds to the value over the centre of the cell, defined over an area of 3.75° longitude and approximately 3.75° latitude. As shown in Figure 1, there is one NCEP/NCAR grid point in the middle of the target region and two others nearby. A set of 25 daily predictors covering the period from 1961 to 2003 (Table II) is available at each of the three grid points.
Table II. The significant predictors selected for each month for the computation of Tmax predictand at St Alban station
|Predictor variable (25 in total)||Jan||Feb||Mar||Apr||May||Jun||Jul||Aug||Sep||Oct||Nov||Dec|
|Mean sea level pressure||√||√||√||√||√||√||√||√||√||√||√||√|
|500 hPa airflow strength|| || || || || || || || || || || || |
|500 hPa zonal velocity|| || || || || || || || || || || || |
|500 hPa meridional velocity|| || || || || || || ||√|| || || || |
|500 hPa vorticity|| || || ||√|| || || || || ||√||√|| |
|500 hPa wind direction|| || || || || || || || || || || || |
|500 hPa divergence||√|| || || || || || || || || ||√||√|
|850 hPa airflow strength||√|| ||√|| ||√||√||√||√||√|| ||√|| |
|850 hPa zonal velocity|| ||√|| || || || || || || || || ||√|
|850 hPa meridional velocity|| || || || || || || || || ||√|| || |
|850 hPa vorticity|| || || || || || || || || || || || |
|850 hPa wind direction|| || || ||√|| || || || || || || || |
|850 hPa divergence||√|| || || || || || || || ||√|| ||√|
|500 hPa geopotential height|| || || || || || || || || || || || |
|850 hPa geopotential height||√||√||√||√||√||√||√||√||√||√||√||√|
|Surface airflow strength|| || ||√|| ||√||√||√|| ||√|| || || |
|Surface zonal velocity|| || || || || || || || ||√|| || || |
|Surface meridional velocity|| || || || || || || || || || || || |
|Surface vorticity|| ||√||√|| || || || || || || || || |
|Surface wind direction|| || || || || || || || || || || || |
|Surface divergence|| || || || || || || || || || || || |
|Specific humidity at 500 hPa|| ||√|| || || ||√||√||√|| || || || |
|Specific humidity at 850 hPa|| || || ||√||√|| || || || || || || |
|Near-surface specific humidity|| || || || || || || || || || || || |
|Mean temperature at 2m|| || || || || || || || || || || || |
In addition, data from the limited-area nested CRCM (Music and Caya, 2007; Caya and Laprise, 1999) were also evaluated for their accuracy as compared to the observations. The CRCM data used in this study come from the CRCM 4.1.1 version using the Canadian Land Surface Scheme (CLASS) (Verseghy, 1991). The computational domain of this model covers North America (AMNO) with 45-km horizontal resolution (on a polar stereographic grid), and 29 vertical levels. This model is driven by 6-hourly NCEP/NCAR reanalysis with the sea surface temperature and sea ice coming from the Atmospheric Model Intercomparison Project II (Fiorino, 1997 for the AMIP II version) database. This CRCM allows a suitable comparison with the proposed SD approach, because the two downscaling models use the NCEP/NCAR data as inputs.
Despite the scale mismatch between the local observed or SD point-scale information and the grid-cell values from the CRCM (smooth and uniform values over a grid-cell area), the idea behind this comparison is to assess roughly the added value of the SD approach relative to the direct use of RCM grid-cell values. Up-scaling the point-scale information to the grid-cell area by spatial interpolation would smooth the variability of the local climatic variables and introduce errors or uncertainties related to the subjectively selected interpolation scheme. Hence, it would not allow the added value of the multi-site SD approach to be evaluated at the local scale, nor would it allow a clear assessment of how well this approach preserves the spatial and temporal climate information at that scale. Conversely, interpolating the CRCM grid values to the station locations can introduce other errors and dramatically affect the CRCM outputs. As shown in Figure 1, the CRCM grid points used for the present intercomparison study are therefore those of the grid boxes nearest to the 10 weather stations (10 green stars in Figure 1).
3.2. Predictor selection
In Equation (2), the selection of the significant atmospheric predictors is the critical factor that could affect the accuracy of the predictand estimation. Different combinations of atmospheric predictors may lead to different results (Huth, 1999; Wilby et al., 2004). In this study, backward stepwise regression (McCuen, 2003; Hessami et al., 2008) was used to select the significant predictors for both Tmax and Tmin for each month and for each of the 10 selected stations. Backward stepwise regression starts with all candidate variables, tests every variable for statistical significance, and deletes those that are statistically insignificant. A sequence of partial F-tests, with degrees of freedom 1 and n − p − 1, is used to decide the addition or elimination of these variables:
$$F = \frac{\left(R_p^2 - R_{p-1}^2\right)\,(n - p - 1)}{1 - R_p^2}$$
in which n is the number of observations; p is the number of predictors in Equation (2); Rp and Rp−1 are the correlation coefficients between the criterion variable and a prediction equation using p and p − 1 variables, respectively. For a given predictor, when F is greater than the critical value defined for a given level of significance, this predictor is retained. The procedure stops when no further predictor can be added or removed.
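As an illustration of the retention rule described above, the following minimal sketch evaluates the partial F statistic in its usual form, F = (Rp² − Rp−1²)(n − p − 1)/(1 − Rp²), and compares it with a critical value. The function names and the example critical value are assumptions made for illustration; the critical value for the chosen significance level is expected to come from an F table with 1 and n − p − 1 degrees of freedom.

```python
def partial_f(r_p, r_pm1, n, p):
    """Partial F statistic for the p-th predictor.

    r_p, r_pm1: multiple correlation coefficients of the regressions
    using p and p - 1 predictors; n: number of observations.
    """
    dof = n - p - 1
    if dof <= 0:
        raise ValueError("need more than p + 1 observations")
    return (r_p ** 2 - r_pm1 ** 2) * dof / (1.0 - r_p ** 2)


def keep_predictor(r_p, r_pm1, n, p, f_critical):
    """Retain the predictor when F exceeds the supplied critical value."""
    return partial_f(r_p, r_pm1, n, p) > f_critical


# Hypothetical numbers: 100 observations, 5 predictors in the full model.
f_value = partial_f(0.9, 0.8, n=100, p=5)
decision = keep_predictor(0.9, 0.8, n=100, p=5, f_critical=3.94)
```

A predictor that barely changes the correlation (for example Rp = 0.8001 against Rp−1 = 0.8) yields an F value far below any usual critical value and would be dropped.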
Furthermore, in this study, the selection of the best combination of predictor variables was carried out on a monthly basis to take into account the monthly variations in the predictor-predictand relationships, rather than on an annual basis, for which the combination of predictors is constant through the year, as suggested by some previous studies (Wilby et al., 2002; Hessami et al., 2008). Table II shows an example of the selection of the five most significant predictors identified by the backward stepwise regression for the computation of the Tmax predictand at St Alban station. Only the five most significant predictors are used in order to describe the local climate adequately while keeping the downscaling model parsimonious, and to prevent excessive collinearity between predictors when too many variables are selected (Gachon et al., 2005; Dibike et al., 2008; Hessami et al., 2008). Note that two predictors, near-surface specific humidity and mean temperature at 2 m, have been omitted in this study because observed temperature data were assimilated within these two NCEP/NCAR predictors. As shown in Table II, the mean sea level pressure and the 850 hPa geopotential height are always identified as significant predictors. However, the optimal combination of the five most significant predictors was found to differ from one month to another, except for June and July, for which the same predictors were selected. This difference indicates the need to consider a different combination of predictors for each month rather than a constant combination for the whole year, as mentioned previously.
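To assess the accuracy of the proposed multi-site multivariate SD approach, both numerical and graphical comparisons between observed and simulated results were considered. More specifically, for the numerical comparison, the coefficient of determination (R2), the Mean Absolute Error (MAE), and the Root Mean Square Error (RMSE) were used:
$$R^2 = 1 - \frac{\sum_{i=1}^{t}\left(X_{Obs,i,s} - X_{Sim,i,s}\right)^2}{\sum_{i=1}^{t}\left(X_{Obs,i,s} - \bar{X}_{Obs}\right)^2}$$
$$\mathrm{MAE} = \frac{1}{t}\sum_{i=1}^{t}\left|X_{Obs,i,s} - X_{Sim,i,s}\right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{t}\sum_{i=1}^{t}\left(X_{Obs,i,s} - X_{Sim,i,s}\right)^2}$$
in which t is the length of the time series; XObs, i, s and XSim, i, s are, respectively, the observed and simulated values for day i and station s; and X̄Obs is the mean of the observed data.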
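The three scores can be sketched in a few lines of Python. This is a minimal illustration for a single station, assuming that R2 is computed as one minus the ratio of the squared-error sum to the squared deviations of the observations from their mean (the form suggested by the definition of X̄Obs); the function names are illustrative.

```python
import math


def mae(obs, sim):
    """Mean absolute error between observed and simulated series."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def r2(obs, sim):
    """Coefficient of determination (assumed form: 1 - SSE / SST)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


# Hypothetical values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.5, 2.0, 2.5, 4.0]
scores = (r2(observed, simulated), mae(observed, simulated), rmse(observed, simulated))
```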
Table III shows the average values of R2, MAE, and RMSE based on 100 simulations of daily Tmax and Tmin time series for all stations and for both calibration and validation periods. In general, the high values of R2 (larger than 0.9976) and the low values of MAE (less than 0.54 °C) and RMSE (less than 0.66 °C) indicate the very good accuracy of the proposed multi-site multivariate SD procedure.
Table III. Coefficients of determination (R2), MAE, and RMSE between the monthly means of the observed and multi-site simulated daily Tmax and Tmin for both the calibration and validation periods at all stations
| ||R2||MAE ( °C)||RMSE ( °C)||R2||MAE ( °C)||RMSE ( °C)||R2||MAE ( °C)||RMSE ( °C)||R2||MAE ( °C)||RMSE ( °C)|
In terms of accuracy in reproducing the observed temporal dependence of the daily Tmax and Tmin series at each station, Table IV shows, as a typical example, the comparison between the observed and computed serial autocorrelations of lags 1–3 for Tmax and Tmin at Oka station. The proposed multi-site multivariate SD method captured the temporal dependence of the daily Tmax and Tmin time series at this location very well, as indicated by the comparable observed and computed serial autocorrelation values for both calibration and validation periods. Similar results were obtained for the other stations.
Table IV. Observed and multi-site simulated daily Tmax and Tmin serial autocorrelation of lags 1–3 over both the calibration and validation periods at station Oka
|Serial Autocorrelations||Station Oka|
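For reference, a lag-k serial autocorrelation of the kind compared in Table IV can be sketched as follows. The estimator is not specified in the text, so the common form (lagged covariance divided by the total variance about the series mean) is assumed here.

```python
def serial_autocorrelation(series, lag):
    """Lag-k autocorrelation using the overall mean and variance (assumed estimator)."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    variance_sum = sum((v - mean) ** 2 for v in series)
    lagged_sum = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return lagged_sum / variance_sum


# Hypothetical daily series; lags 1 to 3 as in Table IV.
example = [1.0, 2.0, 3.0, 4.0, 5.0]
lags_1_to_3 = [serial_autocorrelation(example, k) for k in (1, 2, 3)]
```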
Regarding the spatial dependence, Figures 2 and 3 show, respectively, the comparisons between the observed and simulated daily Tmax and Tmin interstation correlations in each month for the calibration and validation periods. Each graph represents the correlations of the 45 pairs of stations in each month. In general, a very good agreement between the observed and simulated interstation correlations was found, as indicated by the high R2 values for the calibration period (0.8707 for Tmax and 0.8546 for Tmin) and the validation period (0.7736 for Tmax and 0.7765 for Tmin). The monthly intervariable correlations between Tmin and Tmax at each station and at each pair of stations are presented in Figure 4. A very good agreement was found between the observed and simulated intervariable correlations, as indicated by the high R2 values for both the calibration (R2 = 0.8629) and validation (R2 = 0.8367) periods.
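The 45 interstation correlations per month follow from pairing the 10 stations (10 × 9 / 2 = 45). A minimal sketch, assuming Pearson correlation between daily series and a hypothetical dictionary keyed by station name:

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def interstation_correlations(series_by_station):
    """Correlation for every unordered pair of stations."""
    return {
        (a, b): pearson(series_by_station[a], series_by_station[b])
        for a, b in combinations(sorted(series_by_station), 2)
    }
```

Applying the same function to the observed and to the simulated series gives the two sets of 45 values that are compared against each other in each month.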
Moreover, a comparison has been made between the observed and simulated daily spatial autocorrelations of Tmax and Tmin computed by Moran's I (Equation (1)). For purposes of illustration, Figure 5 presents the results obtained for Tmin, with 365 points plotted in each graph. The observed daily spatial autocorrelations are averaged over the 15 years of each of the calibration and validation periods, and the simulated ones are averaged over the 100 simulations. A very good agreement was obtained between the observed and simulated spatial autocorrelations, as indicated by the high R2 values (0.8521 for the calibration and 0.8037 for the validation). Comparable results were found for Tmax (not shown), with similarly high R2 values (0.877 for the calibration and 0.8525 for the validation).
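Equation (1) is not reproduced in this section, so the sketch below uses the standard definition of Moran's I, I = (N / W) Σi Σj wij (xi − x̄)(xj − x̄) / Σi (xi − x̄)², where W is the sum of the weights. The weight matrix is an assumption for illustration (a simple adjacency matrix); the weights actually used in the paper may differ.

```python
def morans_i(values, weights):
    """Standard Moran's I for one day of station values.

    values: one value per station; weights: square matrix of spatial
    weights (diagonal entries are ignored).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    weight_sum = sum(weights[i][j] for i, j in pairs)
    numerator = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    denominator = sum(d * d for d in dev)
    return (n / weight_sum) * numerator / denominator


# Hypothetical example: four stations on a line, neighbours weighted 1.
adjacency = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(4)] for i in range(4)]
smooth_field = morans_i([1.0, 2.0, 3.0, 4.0], adjacency)
alternating_field = morans_i([1.0, -1.0, 1.0, -1.0], adjacency)
```

A smoothly varying field gives a positive value, whereas an alternating field gives a negative one.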
The performance of the proposed multi-site multivariate SD procedure was also assessed based on common temperature indices for each month, namely the means and standard deviations (STD) of Tmax and Tmin, the 90th percentile of daily Tmax (Tmax90p), and the 10th percentile of daily Tmin (Tmin10p). Figures 6–11 display the boxplots of these indices computed from the observed data, from the proposed multi-site multivariate SD model, and from the CRCM for both calibration and validation periods. In general, the multi-site multivariate SD model outperformed the CRCM in reproducing these observed statistical properties. More specifically, for the mean values of Tmax and Tmin (Figures 6 and 8), the suggested model gave a very good agreement with the observations for both calibration and validation periods. For the standard deviations of Tmax and Tmin (Figures 7 and 9), a slight overestimation of the median values of the Tmax STD was noted for summer (Jun, Jul, Aug) and September, while for the median values of the Tmin STD an underestimation was observed for winter (Dec, Jan, Feb) and March, and an overestimation for April, May, summer and autumn (Sep, Oct, Nov). Larger discrepancies of the CRCM with respect to the observations were found for the means and STDs of both Tmax and Tmin for almost every month. The bias appears either as an overestimation or an underestimation of the median value and its variability. Furthermore, regarding the results of the Tmax90p and Tmin10p indices (Figures 10 and 11), a slight overestimation of Tmax90p was observed with the multi-site multivariate SD model for summer, an overestimation of Tmin10p for winter and March, and an underestimation of Tmin10p for April, May, and summer for both calibration and validation periods. However, the multi-site multivariate SD model shows better skill than the CRCM. In other words, the CRCM tends to underestimate or overestimate the median and variability of cold and warm extremes of Tmin and Tmax.
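The monthly indices discussed above can be computed from the daily values of one month as in the following sketch. The percentile rule is not stated in the text; linear interpolation between order statistics is assumed here, as is the use of the sample standard deviation.

```python
import statistics


def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def temperature_indices(tmax, tmin):
    """Mean, STD and extreme percentiles for one month of daily data."""
    return {
        "Tmax_mean": statistics.mean(tmax),
        "Tmax_std": statistics.stdev(tmax),
        "Tmin_mean": statistics.mean(tmin),
        "Tmin_std": statistics.stdev(tmin),
        "Tmax90p": percentile(tmax, 90),
        "Tmin10p": percentile(tmin, 10),
    }
```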
Figure 6. Box plots of the mean of Tmax for the observed data, the multi-site model, and the CRCM for (a) the calibration period (1961–1975), and (b) the validation period (1976–1990). The boxes correspond to the interquartile range (IQR), the band in the middle of each box to the median value, and the whiskers to the 1.5 × IQR. Outliers are represented by the crosses. Each box plot represents the daily time series obtained at all stations (observation and multi-site model values) or at all used grid points (CRCM values) aggregated over each period of 15 years.
Finally, it was noted that outliers appeared more frequently in the results of the proposed multi-site multivariate SD method. This behaviour could be attributed to the exclusive use of coarse-scale NCEP/NCAR predictors in this procedure, without explicitly taking into account regional-scale variables, such as surface conditions or diabatic fluxes from the surface, that are needed to capture the full range of variability related to the occurrence of temperature extremes. These extremes can arise from nonlinear processes and feedbacks linked with frost and thaw conditions of the soil, and/or the presence or absence of snow on the ground, as indicated by Gachon and Dibike (2007) for northern Canada. As also noted in Hessami et al. (2008) for the simulation of Tmax90p over eastern Quebec in Canada, outliers appear more often under NCEP/NCAR-driven conditions than under GCM-driven ones.
Tables V–VII show the RMSEs of the standardized seasonal mean and STD of Tmin, and the standardized seasonal Tmin10p, respectively. The RMSE values were computed between the observations and the multi-site multivariate simulations, and between the observations and the CRCM data, for both calibration and validation periods for all stations. The seasonal standardization was carried out by subtracting the climatological mean and dividing by the climatological standard deviation of a given variable (the mean or STD of Tmin, or the Tmin10p) in a given season at a given station, over the 15-year calibration or 15-year validation period. The RMSEs for many stations in the study region are generally smaller for the multi-site multivariate SD than for the CRCM. In particular, over the entire region, the values of the mean RMSE for all seasons were consistently smaller for the multi-site multivariate SD than for the CRCM. Hence, it can be concluded that the proposed multi-site multivariate SD procedure reproduces the statistical properties of Tmin more accurately than the CRCM. Similar results were found for the RMSEs of the standardized seasonal mean and STD of Tmax, and the standardized seasonal Tmax90p index (not shown). The multi-site multivariate SD model therefore also outperformed the CRCM in describing the statistical properties of Tmax.
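The standardization and RMSE described above can be sketched as follows for one station and one season, with one index value per year. The text does not state whether the simulated series is standardized with its own climatology or with the observed one, so both options are exposed through a hypothetical `use_obs_climatology` flag; the sample standard deviation is also an assumption.

```python
import math
import statistics


def standardize(series, reference=None):
    """Subtract the climatological mean and divide by the climatological STD.

    The climatology is taken from `reference` when given, else from `series`.
    """
    ref = series if reference is None else reference
    mean = statistics.mean(ref)
    std = statistics.stdev(ref)
    return [(v - mean) / std for v in series]


def standardized_rmse(obs, sim, use_obs_climatology=False):
    """RMSE between standardized observed and simulated seasonal values."""
    z_obs = standardize(obs)
    z_sim = standardize(sim, reference=obs if use_obs_climatology else None)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z_obs, z_sim)) / len(obs))
```

With each series standardized by its own climatology, the score measures how well the year-to-year anomalies agree, independently of any mean bias or scaling error.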
Table V. RMSEs of the standardized seasonal Tmin mean computed from the observed data and the multi-site simulations (Obs_Sim), and from the observed data and the CRCM (Obs_CRCM) for both calibration and validation periods
|Stations||Standardized Tmin Mean RMSE_Calibration period|
| ||Standardized Tmin Mean RMSE_Validation period|
Table VI. Same as Table V but for the standardized seasonal Tmin STD
|Stations||Standardized Tmin STD RMSE_Calibration period|
| ||Standardized Tmin STD RMSE_Validation period|
Table VII. Same as Table V but for the standardized seasonal Tmin10p
|Stations||Standardized Tmin10p RMSE_Calibration period|
| ||Standardized Tmin10p RMSE_Validation period|
For purposes of illustration, Figures 12, 13, and 14 show the interannual anomalies of the standardized seasonal mean and STD of Tmax, and the standardized seasonal Tmax90p index, respectively, for both calibration and validation periods. The standardization is the same as that used for the spatial variation, but in this case the figures show, for each of the 15 years on the x-axis, the mean of the standardized variables computed over all 10 stations. The multi-site multivariate SD method accurately reproduces the temporal variability of these Tmax statistical properties over the entire 15-year period of both calibration and validation. No significant differences are noted, except for the summer of the 13th year of the calibration period (Figure 14(a)), which shows an overestimation of the standardized Tmax90p index, and for the autumn of the 8th year of the validation period, which shows an underestimation of the standardized Tmax STD (Figure 13(b)). The CRCM-simulated values described the general trend of these interannual anomalies, but displayed a larger discrepancy with respect to the observed values than the multi-site multivariate SD model. Similar results were found for the interannual anomalies of the mean and STD of Tmin, and the Tmin10p index (not shown).
Finally, in addition to the evaluation of the proposed multi-site multivariate SD model based on a number of common temperature indices that are important for various climate-related impact assessment studies, the present paper also includes an assessment of a summer heat spell index. Many studies have been carried out on this extreme phenomenon, especially after the disastrous European summer heat wave of 2003 (WHO, 2003; Beniston and Diaz, 2004; Beniston and Stephenson, 2004; Schär et al., 2004; Gachon et al., 2005; Khaliq et al., 2005, 2006, 2007). In particular, as mentioned by Khaliq et al. (2007), the assessment of heat spells depends upon the type and length of the available data, the time of the year, and the sector impacted by the heat spells. The present analysis focuses on extreme summer temperatures, which particularly affect public health in the study area. The heat wave duration index (HWDI3days) has therefore been calculated for the summers of both the calibration and validation periods. The HWDI3days index is the total number of days in sequences longer than three days during which Tmax exceeds, by more than 3 °C, the calendar-day mean calculated over a five-day window centred on each calendar day during the calibration or the validation period (Gachon et al., 2005; Drouin et al., 2005). Figure 15 shows the interannual anomalies of the standardized HWDI3days for both the calibration and validation periods. The mean of the standardized HWDI3days is computed over all 10 stations and presented for each year of the respective 15-year periods. These results confirm that the proposed SD model is able to reproduce adequately the occurrence and temporal variability of the observed heat waves over both calibration and validation periods, except for the 13th year of the calibration period, in which the standardized HWDI3days was overestimated because of the overestimation of the Tmax90p index for the summer of this year, as noted previously (Figure 14(a)). The CRCM-simulated values also reproduced the interannual anomalies, but with less skill than the SD results, especially for the calibration period.
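The HWDI3days definition used above can be illustrated with a short sketch. It assumes that "sequences longer than three days" means runs of at least four consecutive days, that the five-day window is simply truncated at the start and end of the season, and that the input is one list of daily summer Tmax values per year; all names are illustrative.

```python
def calendar_day_normals(tmax_by_year, half_window=2):
    """Calendar-day mean over all years within a centred window of days."""
    n_days = len(tmax_by_year[0])
    normals = []
    for day in range(n_days):
        first = max(0, day - half_window)
        last = min(n_days, day + half_window + 1)  # window truncated at the edges
        window = [year[k] for year in tmax_by_year for k in range(first, last)]
        normals.append(sum(window) / len(window))
    return normals


def heat_wave_days(tmax, normals, excess=3.0, min_run=3):
    """Total days in runs longer than min_run with Tmax > normal + excess."""
    total = 0
    run = 0
    for value, normal in zip(tmax, normals):
        if value > normal + excess:
            run += 1
        else:
            if run > min_run:
                total += run
            run = 0
    if run > min_run:  # close a run that reaches the end of the season
        total += run
    return total
```

For one summer, `heat_wave_days(tmax_summer, calendar_day_normals(all_summers))` would then give the index value for that year, where `tmax_summer` and `all_summers` are hypothetical inputs.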