Interpolation of monthly mean temperatures using cokriging in spherical coordinates

Abstract

Paleoclimate reconstructions are generally validated on recent periods. To obtain a set of instrumental records at the regional scale, a time series of monthly mean temperatures in Northeastern Canada was interpolated for the 1961–2000 period. Records were provided by 202 meteorological stations. Temperatures derived from a Canadian regional climate model (Climate Model CRCM 4.2.3 from the AMNO domain produced at ∼50 km resolution) were added as secondary information to take into account local heterogeneity and temporal dependencies. Geostatistical interpolation of the measured temperature was calculated using CRCM modelled data as a covariable and then compared to an ordinary kriging performed on a time series of mean temperature anomalies. Spherical distances between locations were calculated taking into account the curvature of the Earth, with monthly semivariances being modelled using Cauchy variograms. Mean absolute error values (1.5 ± 1.2 °C) were calculated for the whole period using cross-validation procedures. Errors were found to have the same order of magnitude in the central part of the study area where few recorded temperatures were available. Monthly mean temperature grids are publicly available through the Institut National de la Recherche Scientifique (http://url.in.rs/3P). Copyright © 2012 Royal Meteorological Society

1. Introduction

Global temperatures are predicted to rise and large changes are expected in precipitation patterns and temperatures (IPCC, 2007). Many studies have been conducted to anticipate potential climate change effects on landscapes, living organisms, and humans. Temperature is an important climatological variable for various scientific disciplines (Vogt et al., 1997). For example, air temperature is one of the main input data for water balance monitoring and crop yield prediction models (Monestiez et al., 2001).

When instrumental temperature data are sparse, climate simulations or indirect reconstructions using proxies recording time-related climatic oscillations can be used; however, instrumental data are still required to validate these proxies (Jacoby and Darrigo, 1989; Moberg et al., 2005).

When measured instrumental data are available, the first approach is to interpolate directly from these data. Several open access gridded datasets for total annual precipitation and mean temperatures are produced at the world scale (e.g., the CRU TS 2 dataset; Mitchell and Jones, 2005). These data are suitable for studies focusing on large spatial patterns, but not for investigating the subtle geographic differences expressed at intermediate or local spatial levels. This is particularly true when meteorological stations are scarce and/or clustered throughout large territories such as Northeastern Canada. At the world scale, local heterogeneities introduced by various factors are not taken into consideration and contrasting climate conditions may thus be masked.

To perform interpolations at a regional scale, sparse meteorological station data can be complemented by the addition of high-resolution indirect information in order to take into account local heterogeneities (Hevesi et al., 1992; Phillips et al., 1992; Goovaerts, 2000; Carrera-Hernandez and Gaskin, 2007). In this approach, instrumental data support the local means whereas additional information takes into account the spatial variability of the target climatic parameter. Different approaches (cokriging (COK), collocated cokriging, Bayesian kriging, etc.) can be chosen depending on the spatial distribution of the indirect variables.

For example, covariables derived from digital elevation models are widely used to adjust for topographic conditions (Lloyd, 2005). However, many other predictors could also be included, such as distance to the sea (continentality), snow and vegetation cover, or leaf area index (Wilson and Gallant, 2000; Tapsoba et al., 2005; Ustrnul and Czekierda, 2005). Moreover, air temperature is not solely determined by local factors (e.g., elevation and landcover), but also by atmospheric circulation patterns in the northern hemisphere (Trenberth and Hurrell, 1994; Hurrell and VanLoon, 1997; Arguez et al., 2009). Climate dynamics may also obscure the relationships of specific variables. For example, it has been reported that in some areas, precipitation was not related to elevation (Daly et al., 1994). At the local level, the relationship between temperature and topography may be nonlinear and dependent on landcover type, which could change through time with a fluctuating climate (Dousset and Gourmelon, 2003).

A reliable estimate of the spatiotemporal variability of climatic variables can be obtained using climate models. Climate model simulations can be run for past periods. They reflect the regional atmospheric circulation patterns that largely determine local variability. Simulated climatic parameters are widely available at the global scale. Moreover, data provided by climate models are not influenced by biased records. In contrast, when various geospatial data sources are used in the manner of station data, or when several satellite images are compiled, some records are bound to deviate significantly from the real values. In addition, for recent periods, the spatial patterns of climatic variables calculated using models can be compared to those provided by remote sensing imagery to test their validity.

When interpolating data, distances among points are generally calculated using planar projections of the Earth's surface and linear Euclidean distances. This assumption is suitable for small areas or areas with a high data density. In these cases, only the data closest to the location being estimated are retained. The low number of meteorological stations distributed throughout the Earth's large northern areas results in a dramatic increase in inter-pair distances, and consequently the difference between the linear Euclidean distance and the true spherical distance is not negligible. Thus, the Earth's sphericity must be considered and the Euclidean distance replaced by the length of the shorter arc between points (i.e., the geodesic). Surprisingly, this alternative method of distance calculation is rarely implemented in geostatistical estimation methods.

The aim of this study is to spatially interpolate a mean monthly temperature time series using spherical distance measurements and CRCM predictions as secondary information. We tested this approach in northern Québec, a geographic area where meteorological stations are scarce and irregularly distributed.

2. Methods

2.1. Data

The instrumental data used in this study are a compilation of all available temperature measurements recorded in the Canadian eastern taiga shield ecoregion that encompasses the boreal forest of Québec (Wiken et al., 1993). Meteorological stations are operated by different organisations including government agencies (Environment Canada, Québec government), universities (Centre d'Études Nordiques, Université Laval), and Hydro-Québec, a state agency. The majority of the data used were compiled and cleaned at the Centre d'Études Nordiques in order to ensure their quality and to remove suspicious measurements (Cournoyer et al., 2007). Both global and regional models were used to obtain the regional climate variability pattern. The final database comprised monthly temperatures recorded at 202 stations. The length of the records ranged from 1 to 480 months (Figure 1). Only a few long records are available for the area. If all available data are considered, including short records of only several months, the spatial density of measurements increases dramatically.

Figure 1.

Locations of the instrumental data used. Circle radii indicate the length of the record period expressed in number of months

While General Circulation Models (GCM) have a coarse resolution, they give an accurate representation of global circulation. In contrast, Regional Climate Models (RCM) focus on particular areas and allow a much better representation of spatial variations because of their higher resolution. Thus, nested regional climate models are the standard tool used to obtain fine resolution views of the atmospheric circulation pattern (de Elia et al., 2008).

The RCM data (time series of monthly mean temperatures, ST variable) used in this study were generated by the Canadian Regional Climate Model CRCM 4.2.3 run over the AMNO domain using a grid spacing of approximately 50 km (Caya et al., 1995; Caya and Laprise, 1999). The CRCM was driven by five members of the third generation of the Canadian Coupled Global Climate Model (CGCM 3.1 at a T47 truncation from the Canadian Centre for Climate Modelling and Analysis). The T47 truncation is projected on a Gaussian grid with grid points spaced about 300 km apart (Kharin et al., 2007). The five members corresponded to small differences in initial conditions applied in 1850. The greenhouse gas concentrations followed the SRES-A2 emission scenario, which corresponds to ‘business as usual’ with increasing CO2, CH4, and N2O emissions (Allen and Ingram, 2002).

The land surface temperature products from the Moderate Resolution Imaging Spectroradiometer (MODIS) onboard the Aqua and Terra satellites (http://modis.gsfc.nasa.gov) were used to describe the climatological monthly mean temperature. A total of 7476 instantaneous images were compiled for the March 2000–December 2006 period to obtain mean skin surface temperature values (MTs) for each month (m) and for each 1 km pixel (i) located in our study area (Hachem et al., 2009):

$$\mathrm{MT}_s(m, i) = \frac{1}{N} \sum_{n=1}^{N} T_s^{(n)}(m, i)$$

where Ts(m, i) is the skin surface temperature measured on a single image for a particular pixel, and N is the number of images used (N⩽7476). Averaging the temperature over the period allowed us to remove various artefacts such as clouds. The trade-off is that this procedure produced not a time series of monthly mean temperatures but a single temperature value for each month and each pixel. We calculated the spatial linear Pearson correlation between the MODIS skin temperature measurements and the climatological mean monthly temperatures provided by each CRCM member. We retained the member temperatures that exhibited the highest correlation value as a covariable. This choice was not based on the relationship with the measured data, as meteorological stations were highly clustered along particular climatic domains and were consequently unable to reconstruct the temperature pattern's entire spatial variability. While the mean temperatures reported by MODIS might differ from the temperatures registered at the meteorological stations, the variability patterns are closely related (Dousset and Gourmelon, 2003; Coll et al., 2005; Trigo et al., 2008). Reported differences between the temperatures measured at meteorological stations and skin surface temperatures depend on sensor characteristics such as view zenith angle, but are independent of surface air temperature, humidity, wind speed, and soil moisture (Wang et al., 2008). Thus, the MODIS imagery provides robust measurements that can be used to validate the spatial variability of temperatures reported by the climate models, although the data are limited to the time span of satellite activity (Ault et al., 2006; Crosman and Horel, 2009). Additionally, MODIS data were used to validate the cokriged temperature spatial patterns.
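The averaging and member-selection steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the use of `None` for cloud-masked observations, and the sample values are all hypothetical, and the sketch assumes that the MODIS means and the member fields have already been brought to the same set of locations.

```python
import math

def monthly_mean(images):
    """Average one pixel's skin temperatures over the images where it is valid.

    `images` holds one value per image; None marks a masked (e.g. cloudy) value.
    """
    valid = [t for t in images if t is not None]
    return sum(valid) / len(valid) if valid else None

def pearson(x, y):
    """Linear Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_member(modis_means, member_means):
    """Return (name, r) of the member whose field correlates best with MODIS."""
    scores = {name: pearson(modis_means, field)
              for name, field in member_means.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]

# Hypothetical values for four locations in one month (degrees Celsius).
pixels = ([-20.0, None, -22.0], [-15.0, -17.0], [-10.0], [None, -4.0, -6.0])
modis = [monthly_mean(p) for p in pixels]
members = {"m1": [-19.0, -15.0, -11.0, -6.0],
           "m2": [-12.0, -18.0, -9.0, -14.0]}
print(best_member(modis, members))
```

Selecting on the spatial correlation alone ignores any constant offset between skin and air temperature, which matches the argument in the text that only the variability pattern is being compared.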

2.2. Coregionalization models

We used the usual geostatistical variogram estimator expressed as:

$$\gamma(h) = \frac{1}{2N_h} \sum_{i=1}^{N_h} \left[ Z(x_i) - Z(x_i + h) \right]^2$$

where h is the lag distance between two points, Z(x) and Z(x + h) are the Z values of the pair, and Nh is the number of pairs at lag h. Cross variograms were calculated on pairs formed by closest locations (k-nearest neighbour algorithm with k = 1). The maximum distance between measured and CRCM data was 25 km.
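As an illustration of the estimator, the sketch below bins all point pairs by lag and returns the semivariance and pair count per lag. The function name, the binning rule (nearest multiple of the lag width), and the toy transect are assumptions made for the example; the distance function is passed in so that a spherical distance could be substituted.

```python
from collections import defaultdict

def experimental_variogram(coords, values, lag_width, n_lags, dist):
    """Return {lag index: (semivariance, pair count)} for lags 1..n_lags.

    A pair is assigned to lag k when its distance lies in
    [(k - 0.5) * lag_width, (k + 0.5) * lag_width).
    Lags without any pair are omitted from the result.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            k = int(dist(coords[i], coords[j]) / lag_width + 0.5)
            if 1 <= k <= n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    return {k: (sums[k] / (2 * counts[k]), counts[k]) for k in sorted(counts)}

# Toy one-dimensional transect with a linear gradient.
coords = [0.0, 1.0, 2.0, 3.0]
values = [0.0, 1.0, 2.0, 3.0]
gamma = experimental_variogram(coords, values, 1.0, 3,
                               lambda a, b: abs(a - b))
print(gamma)
```

For a pure linear gradient the semivariance grows with the square of the lag, which is the unbounded behaviour that detrending or a secondary variable is meant to absorb.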

Direct and cross variograms must be modelled subject to the constraint that the resulting COK matrix is positive definite (Isaaks and Srivastava, 1989). They must therefore be expressed as a linear combination of the same basic structures (Journel and Huijbregts, 1978; Isaaks and Srivastava, 1989). In the linear coregionalization models, variograms could be modelled simultaneously, but because the primary variable (Z1) was known at only a few locations, we first chose to model the secondary variable (Z2). The direct variograms of Z1 and the cross variograms were obtained by rescaling the variogram of Z2, applying a simple proportionality coefficient to the nugget and sill parameters. Variogram structures, anisotropy directions, and ranges of Z1 and Z2 were the same. This approach was suitable in our case because preliminary analysis showed that the measured and model temperatures were highly linearly correlated. We verified that the determinants of the coregionalization matrices that contained the parameters of the variogram structures were non-negative (Goovaerts, 1997). If both direct and cross variograms (γ11, γ22, and γ12) are modelled using two structures (g0 and g1), the linear model of coregionalization is expressed as:

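$$
\begin{aligned}
\gamma_{11}(h) &= c^{0}_{11}\, g_0(h) + c^{1}_{11}\, g_1(h) \\
\gamma_{22}(h) &= c^{0}_{22}\, g_0(h) + c^{1}_{22}\, g_1(h) \\
\gamma_{12}(h) &= c^{0}_{12}\, g_0(h) + c^{1}_{12}\, g_1(h)
\end{aligned}
$$

The positive semi-definite condition is expressed as:

$$c^{k}_{11} \geq 0, \qquad c^{k}_{22} \geq 0, \qquad c^{k}_{11}\, c^{k}_{22} - \left( c^{k}_{12} \right)^2 \geq 0, \qquad k = 0, 1$$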

where c0 and c1 are the parameters of the two basic structures.
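For two variables, the admissibility check reduces to three inequalities per basic structure. The helper below is a small sketch of that check; the function name and the tuple layout (direct coefficient of Z1, direct coefficient of Z2, cross coefficient) are hypothetical conventions chosen for the example.

```python
def is_valid_lmc(structures):
    """Check that every 2x2 coregionalization matrix is positive semi-definite.

    `structures` is a list of (c11, c22, c12) tuples, one per basic structure.
    A symmetric 2x2 matrix is positive semi-definite when both diagonal terms
    and the determinant are non-negative.
    """
    for c11, c22, c12 in structures:
        if c11 < 0 or c22 < 0 or c11 * c22 - c12 ** 2 < 0:
            return False
    return True

# Hypothetical coefficients: a nugget structure and one spatial structure.
print(is_valid_lmc([(0.1, 0.1, 0.05), (4.0, 9.0, 5.0)]))
# A cross coefficient larger than the geometric mean of the direct ones fails.
print(is_valid_lmc([(1.0, 1.0, 1.5)]))
```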

Modelling the cross variograms was not needed because we assumed that they were proportional to the semivariograms of Z2. This corresponded to the Markov model II hypothesis (Chilès and Delfiner, 1999; Journel and Shmaryan, 1999):

$$C_{12}(h) = \frac{C_{12}(0)}{C_{22}(0)}\, C_{22}(h)$$

where C12 is the covariance between Z1 and Z2, C22 is the covariance of Z2:

equation image

Then, only C22 has to be known. C11(0) is the variance of the primary variable. C12(0) is obtained with the simple correlation coefficient expression:

$$C_{12}(0) = R_{12} \sqrt{C_{11}(0)\, C_{22}(0)}$$

where R12 is the correlation between the two variables calculated on pairs formed by closest locations (k-nearest neighbour algorithm with k = 1). We did not take into account the temporal monthly mean temperature autocorrelation. In this time-slice approach, each monthly interpolation was done independently (Yuval et al., 2005).
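The Markov model II shortcut can be sketched in a few lines: once C22(h), the two variances, and the collocated correlation are known, the cross-covariance follows by rescaling. The function name and the exponential covariance used for Z2 are illustrative assumptions only.

```python
import math

def cross_covariance_mm2(c11_0, c22_0, r12, c22):
    """Build C12(h) from the secondary covariance C22(h) under Markov model II.

    c11_0, c22_0: variances of the primary and secondary variables.
    r12: collocated correlation between the two variables.
    c22: callable returning the covariance of Z2 at lag h.
    """
    c12_0 = r12 * math.sqrt(c11_0 * c22_0)  # collocated cross-covariance
    ratio = c12_0 / c22_0
    return lambda h: ratio * c22(h)

# Hypothetical secondary covariance: exponential with variance 9.
c22 = lambda h: 9.0 * math.exp(-h / 100.0)
c12 = cross_covariance_mm2(4.0, 9.0, 0.5, c22)
print(c12(0.0), c12(100.0))
```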

Experimental semivariograms were calculated on 30 lags for each month*year layer using the grid cell dimension (50 km) as the lag distance unit (h). We expected a strong anisotropy oriented E-W and N-S and calculated experimental variograms in four directions: 0, 45, 90, and 135°. As Z2 was distributed on a regular grid, the angle tolerance was fixed to a narrow 10° and the bandwidth parameter to half of the grid unit (25 km). Exponential and Cauchy (Chilès and Delfiner, 1999; Tscheschel et al., 2005) models were fitted to the monthly semivariance values (for the whole period):
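A directional variogram only uses pairs whose separation vector falls inside a direction class. The sketch below shows one common way to apply an angle tolerance and a bandwidth; it assumes planar offsets in kilometres, azimuths measured clockwise from north (so that 0° is N-S and 90° is E-W, as in the text), and a hypothetical function name.

```python
import math

def in_direction(dx, dy, azimuth_deg, tol_deg=10.0, bandwidth=25.0):
    """True if the separation vector (dx east, dy north) is in the direction class.

    Directions are axial: a vector pointing south belongs to the 0 degree class.
    The pair is kept when its angular deviation is within the tolerance and its
    perpendicular distance from the direction axis is within the bandwidth.
    """
    d = math.hypot(dx, dy)
    if d == 0.0:
        return False
    az = math.degrees(math.atan2(dx, dy)) % 180.0
    dev = abs(az - azimuth_deg % 180.0)
    dev = min(dev, 180.0 - dev)
    offset = d * math.sin(math.radians(dev))  # distance from the direction axis
    return dev <= tol_deg and offset <= bandwidth

print(in_direction(0.0, 100.0, 0.0))    # due north, N-S class
print(in_direction(100.0, 0.0, 0.0))    # due east, N-S class
print(in_direction(50.0, 500.0, 0.0))   # within 10 degrees but 50 km off axis
```

The bandwidth matters mostly at long lags, where a 10° cone would otherwise become wider than the grid spacing.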

equation image

The Cauchy model was chosen as it best fitted the experimental semivariogram. At larger lag distances, an asymptotic behaviour is expected. Automatic fitting using weighted least-squares nonlinear regression approximated the values of the nugget, slope, and range parameters. The models were then manually adjusted to obtain the final parameter estimates. Only the first 10 lags (inter-point distances up to 500 km) were considered, in order to obtain a better approximation of the part of the variogram that primarily influences the kriging weights and consequently the final estimations. Additionally, automatic fitting was conducted for each month*year for the specific period. Scatter plots of the semivariogram parameters against time indicated that the parameters did not show any temporal trends over the whole period. In addition to visually inspecting the parameter time series, we performed unit root tests (Phillips-Perron). Monthly variogram models for the primary variable were fitted manually by modifying the constant of proportionality between the variograms of the two variables.
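The model comparison can be illustrated with a small sketch. The exact parameterisations used by the authors are not given in the text, so the forms below are assumptions: an exponential model without a practical-range factor and a Cauchy model with exponent 1. The fitting routine is a plain grid search over candidate ranges weighted by pair counts, a simplified stand-in for nonlinear weighted least squares; all names and values are hypothetical.

```python
import math

def exponential(h, nugget, sill, a):
    """Assumed exponential variogram: nugget + sill * (1 - exp(-h / a))."""
    return nugget + sill * (1.0 - math.exp(-h / a))

def cauchy(h, nugget, sill, a):
    """Assumed Cauchy variogram: nugget + sill * (1 - 1 / (1 + (h / a)^2))."""
    return nugget + sill * (1.0 - 1.0 / (1.0 + (h / a) ** 2))

def rmse(model, params, lags, gamma_exp):
    """Root mean square error between a model and experimental semivariances."""
    sq = [(model(h, *params) - g) ** 2 for h, g in zip(lags, gamma_exp)]
    return math.sqrt(sum(sq) / len(sq))

def fit_range(model, nugget, sill, lags, gamma_exp, pairs, candidates):
    """Pick the candidate range minimising the pair-count-weighted squared error."""
    def wsse(a):
        return sum(n * (model(h, nugget, sill, a) - g) ** 2
                   for h, g, n in zip(lags, gamma_exp, pairs))
    return min(candidates, key=wsse)

# Synthetic experimental values on the first 10 lags of 50 km.
lags = [50.0 * k for k in range(1, 11)]
gamma_exp = [cauchy(h, 0.1, 10.0, 200.0) for h in lags]
pairs = [40] * len(lags)
best_a = fit_range(cauchy, 0.1, 10.0, lags, gamma_exp, pairs,
                   [100.0, 150.0, 200.0, 250.0, 300.0])
print(best_a, rmse(exponential, (0.1, 10.0, 200.0), lags, gamma_exp))
```

Restricting the fit to the first lags, as described above, simply means passing only those lags to the fitting function.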

2.3. Interpolation procedures

The interpolation method used in this study was a full COK with a limited neighbour number and without threshold distances. COK allows the incorporation of densely sampled, correlated secondary information to estimate the principal variable (Journel and Huijbregts, 1978). As pointed out by Goovaerts (1997), COK relates the primary and secondary variables on a correlation basis. This corresponds to our hypothesis that measured climatological mean temperatures are related to those produced by climate models over the whole period studied. Full COK takes into account all of the information and was preferred to other approaches that only consider the collocated secondary information (Chilès and Delfiner, 1999). Simple and ordinary COK differ in the methods used to model the means of the primary and secondary variables. Simple COK assumes that the primary and secondary means are constant over the study area (Goovaerts, 1998). This condition was not met by our data, where large latitudinal variations were expected. Estimated primary and secondary local means could differ from the means calculated on the whole dataset. Consequently, ordinary COK was preferred because it requires neither knowledge nor strict stationarity of the primary and secondary local means (Goovaerts, 1998).

The ordinary COK estimator Z* at location x can be expressed as follows:

$$Z^{*}(x) = \sum_{i=1}^{N_1} \lambda_{1i}\, Z_1(x_i) + \sum_{j=1}^{N_2} \lambda_{2j}\, Z_2(x_j)$$

where Z1 and Z2 are the primary and secondary variables, respectively, and λ1 and λ2 are the corresponding weights of the N1 and N2 points considered for each of the two variables. The means of Z1 and Z2 do not appear in the previous equation because the sums of their respective weights were fixed to 1 and 0. This introduces two supplementary constraints:

$$\sum_{i=1}^{N_1} \lambda_{1i} = 1, \qquad \sum_{j=1}^{N_2} \lambda_{2j} = 0$$

Solving the corresponding kriging system requires the use of Lagrange parameters. In matrix notation the ordinary COK system is written for two variables as:

equation image

where λ are the kriging weights assigned to each point, µ are the Lagrange parameters, and γij are the semivariance values calculated for the points xi and xj. γ11, γ22, and γ12 refer to the direct variograms of Z1 and Z2 and to the cross variogram, respectively. γ11ix and γ22ix are the semivariance values calculated between each Z1 and Z2 point and the estimated location Z*(x).
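To make the structure of the system concrete, the sketch below assembles and solves a textbook ordinary cokriging system in variogram form for a tiny one-dimensional example. It is not the COKRI code: the function names, the single exponential structure, the coefficients, and the data are hypothetical, and the right-hand side uses the cross variogram for the secondary points, which is the standard form and an assumption about the notation above.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ordinary_cokriging(x1, z1, x2, z2, x0, g11, g22, g12, dist):
    """Return (estimate, primary weights, secondary weights) at location x0."""
    n1, n2 = len(x1), len(x2)
    n = n1 + n2
    pts = list(x1) + list(x2)

    def gamma(i, j):
        h = dist(pts[i], pts[j])
        if i < n1 and j < n1:
            return g11(h)
        if i >= n1 and j >= n1:
            return g22(h)
        return g12(h)

    # Variogram block plus one Lagrange column per unbiasedness constraint.
    a = [[gamma(i, j) for j in range(n)]
         + ([1.0, 0.0] if i < n1 else [0.0, 1.0]) for i in range(n)]
    a.append([1.0] * n1 + [0.0] * n2 + [0.0, 0.0])  # primary weights sum to 1
    a.append([0.0] * n1 + [1.0] * n2 + [0.0, 0.0])  # secondary weights sum to 0
    b = ([g11(dist(p, x0)) for p in x1]
         + [g12(dist(p, x0)) for p in x2] + [1.0, 0.0])
    w = solve(a, b)
    lam1, lam2 = w[:n1], w[n1:n]
    est = (sum(l * z for l, z in zip(lam1, z1))
           + sum(l * z for l, z in zip(lam2, z2)))
    return est, lam1, lam2

# Hypothetical linear model of coregionalization with one exponential structure.
g = lambda h: 1.0 - math.exp(-h / 300.0)
g11 = lambda h: 4.0 * g(h)
g22 = lambda h: 9.0 * g(h)
g12 = lambda h: 5.0 * g(h)
dist = lambda p, q: abs(p - q)
x1, z1 = [0.0, 400.0, 900.0], [-10.0, -12.0, -15.0]
x2, z2 = [100.0, 300.0, 500.0, 700.0], [-14.0, -15.0, -16.0, -17.0]
est, lam1, lam2 = ordinary_cokriging(x1, z1, x2, z2, 450.0, g11, g22, g12, dist)
print(est, sum(lam1), sum(lam2))
```

Because the secondary weights sum to zero, the secondary variable contributes only through its spatial contrasts, so a constant offset between modelled and measured temperatures does not bias the estimate.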

Estimations were conducted at the same locations as Z2, limited to our study area and shifted by an epsilon distance (a few metres) to avoid the zero distance effect. A COK was then performed for each position using the 6 and 12 nearest neighbours of the variables Z2 and Z1, respectively. Because the secondary information was available around the estimated location, it was not necessary to retain a higher neighbour number (Goovaerts, 1998). For the primary information, the neighbour selection was not limited to a threshold distance due to the scarcity of meteorological stations, and consequently we fixed a large radius value (5000 km). Cross-validation procedures were used to estimate the model's accuracy. Measured values were discarded one at a time, with the COK being applied at the missing location using the remaining locations along with the Z2 data. Errors between measured and estimated data were calculated using the mean absolute error (MAE).
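The leave-one-out procedure can be written independently of the interpolator. In the sketch below, `predict` is a hypothetical callback standing in for the COK estimator; the example plugs in a trivial mean predictor only to show the mechanics.

```python
def loo_mae(locations, values, predict):
    """Leave-one-out mean absolute error.

    locations, values: lists of station locations and measured values.
    predict(train_locations, train_values, target) returns an estimate at
    `target` using only the remaining stations.
    """
    errors = []
    for i in range(len(values)):
        train_loc = locations[:i] + locations[i + 1:]
        train_val = values[:i] + values[i + 1:]
        estimate = predict(train_loc, train_val, locations[i])
        errors.append(abs(estimate - values[i]))
    return sum(errors) / len(errors)

# Stand-in predictor: the mean of the remaining stations.
mean_predictor = lambda locs, vals, target: sum(vals) / len(vals)
print(loo_mae([0.0, 1.0, 2.0], [1.0, 2.0, 3.0], mean_predictor))
```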

We compared our results with a common approach where temperatures were detrended (to remove the spatial gradients) and the residual component interpolated using ordinary kriging. The trend surface was modelled for each month on the entire study area and for the whole period using the CRCM values. These values were subtracted from the corresponding time series of measured mean monthly temperatures. Monthly omnidirectional variograms were fitted using semivariance values calculated from the residuals.

Ordinary kriging, COK, and cross validation were done using a modified version of the MATLAB ‘COKRI’ code (Marcotte, 1991). Other analyses were carried out using SAS V9 software (SAS Institute Inc., Cary, NC).

2.4. Distance measurements

Classical approaches measure the distance between pairs of points using linear Euclidean distance. At the local level, the resulting inter-point distances are not biased. However, at the continental scale, non-negligible departures from the true distances can occur. Therefore, regional kriging modelling may be improved by explicitly taking into account the Earth's curvature for distance calculations. In this study, distances between locations (dij) were determined using spherical distances computed using the Vincenty algorithm (Vincenty, 1975). The equatorial radius of the Earth was fixed to 6378.137 × 10³ m (6378.137 km) following the WGS84 reference system. Point coordinates were consequently expressed in decimal degrees.
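A minimal sketch of a spherical distance calculation is given below. It uses the numerically stable arctangent form of the great-circle formula on a sphere with the radius quoted above; it is an assumption that this matches the variant the authors ran, and the function and constant names are hypothetical.

```python
import math

R_EARTH = 6378137.0  # WGS84 equatorial radius in metres, as quoted in the text

def great_circle_distance(lat1, lon1, lat2, lon2, radius=R_EARTH):
    """Length in metres of the shorter great-circle arc between two points.

    Coordinates are given in decimal degrees.
    """
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    num = math.hypot(
        math.cos(p2) * math.sin(dl),
        math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl),
    )
    den = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dl)
    return radius * math.atan2(num, den)

# A quarter of the equator.
print(great_circle_distance(0.0, 0.0, 0.0, 90.0))
```

Passing this function as the distance argument of a variogram or kriging routine is enough to replace planar Euclidean distances throughout.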

3. Results

3.1. Relationships between the temperature datasets

Instrumental temperatures were higher than those calculated by the CRCM (by 4 °C; Figure 2). However, the strong spatial correlation coefficient (R = 0.94) indicated that the CRCM was able to closely describe the monthly anisotropic pattern of temperatures.

Figure 2.

Relationship between modelled (Tcrcm) and measured (Tmeasured) monthly mean temperatures

Data provided by the five CRCM scenarios (the five members) were highly correlated (R > 0.98 calculated for all records). The A2 scenario model member was chosen as the covariable for the kriging processes because it showed the highest spatiotemporal correlation value (R = 0.94) with the instrumental data throughout the whole period. This relationship was not temporally biased and high correlation values were encountered for the whole period (R = 0.89 ± 0.12 for all month*years). The lowest correlation coefficients were encountered for the winter temperatures (Figure 3(b)) and for only some years (Figure 3(a)). Spatially, the relationship between CRCM and measured temperatures also did not appear to be biased. Absolute error maps (modelled - measured temperatures) calculated for the whole period or generated for each month*year did not reveal any spatial trends (Figure 4) south of the 60th parallel. Errors increased at higher latitudes. The quasi-linear relationship between measured and modelled monthly mean temperatures was verified except for the lowest values.

Figure 3.

Correlation between measured and modelled temperatures. Annual (a) and monthly (b) evolution. Mean (bold line) and 95% confidence intervals

Figure 4.

Map of the absolute errors (MAE) calculated for the whole period between measured and modelled temperatures (CRCM)

3.2. Variograms

Omnidirectional experimental variograms calculated on the CRCM data were best fitted by the Cauchy model, as shown by RMSE values. The exponential model performed better for only 1 month, February 1977. For other month*year periods and for the whole period, the RMSE was always lower with the Cauchy model (mean RMSE difference = 7 ± 5; Figure 5). As expected, the directions of minor and major spatial continuity corresponded to the 0° (N-S) and 90° (E-W) directions, respectively (Figure 6). Higher semivariance values in the N-S direction indicated that temperatures varied rapidly along the latitudinal gradient. These two directions were thus kept for kriging. Variogram models were constructed for the 12 months using three structures (one nugget and two Cauchy):

equation image

where θ represents the anisotropy angle, and a1(θ) and a2(θ) the ranges associated with the two Cauchy structures. Automatic modelling performed for each month*year did not reveal any temporal trend in the parameter values over the period considered (Phillips-Perron unit root test: P values > 0.2 for all parameters). The temporal stationarity of the parameters showed that the anisotropic pattern was constant over the period studied and allowed us to calculate monthly mean semivariograms only (Figure 7; Table I). For Z1, the minimum pair number for each lag was 40, whereas it ranged from 2 to 26 when years were considered separately. Positive definiteness was respected for each month. Models closely fitted the experimental variograms for all months. The anisotropy pattern was more pronounced during winter. Variograms of the primary and secondary variables followed each other closely. Coefficients of proportionality K were close to 1 (Table I). The highest disparities were encountered during the August–October period, with K values ranging from 0.4 to 0.7. The monthly spatial correlation between Z1 and Z2 (r-values between 0.73 and 0.9) allowed the parameters of the cross variograms to be calculated. These calculated cross variograms adequately fitted the experimental cross variograms. Correlation coefficient values ranged between 0.43 and 0.61.
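The dependence of a range on direction can be sketched under the assumption of geometric (elliptical) anisotropy with principal axes at 0° and 90°. The text does not state which anisotropy model was used, so the formula and the function name below are illustrative only; the two ranges are the January a2 entries of Table I.

```python
import math

def directional_range(theta_deg, a_0, a_90):
    """Range in direction theta for an ellipse with semi-axes a_0 and a_90.

    a_0 is the range along 0 degrees and a_90 the range along 90 degrees.
    """
    t = math.radians(theta_deg)
    return a_0 * a_90 / math.hypot(a_90 * math.cos(t), a_0 * math.sin(t))

for theta in (0.0, 45.0, 90.0, 135.0):
    print(theta, round(directional_range(theta, 1300.0, 2000.0), 1))
```

With this model the 45° and 135° directions share the same intermediate range, which is one way to check the four-direction experimental variograms against the two directions retained for kriging.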

Figure 5.

Root mean square errors associated with the exponential and Cauchy variograms. Upper and lower lines show the 95% confidence intervals

Figure 6.

Semivariances computed in four directions (0, 45, 90, and 135°) for a particular month*year (April 2000)

Figure 7.

Models (lines) and experimental (points) variograms for the secondary variable. Upper and lower curves are relative to the two directions of anisotropy (0 and 90°, respectively). γ is the semivariance and the month is indicated with a number in the upper left corner

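Table I. Variogram parameters

| Month | Z2 c0 | Z2 c1 | Z2 c2 | Z1 c0 | Z1 c1 | Z1 c2 | K | Z12 MM2 c0 | Z12 MM2 c1 | Z12 MM2 c2 | a1 (angle = 0) | a2 (angle = 0) | a1 (angle = 90) | a2 (angle = 90) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.1 | 10 | 450 | 0.1 | 0.099 | 445.5 | 0.99 | 0.073 | 0.727 | 327 | 120 | 1300 | 120 | 2000 |
| 2 | 0.1 | 7 | 350 | 0.1 | 0.099 | 346.5 | 0.99 | 0.077 | 0.637 | 267 | 120 | 1300 | 120 | 2000 |
| 3 | 0.1 | 6 | 400 | 0.1 | 0.099 | 396 | 0.99 | 0.081 | 0.627 | 324 | 160 | 1600 | 290 | 2900 |
| 4 | 0.1 | 4 | 400 | 0.3 | 0.099 | 396 | 0.99 | 0.152 | 0.551 | 349 | 200 | 2000 | 390 | 3900 |
| 5 | 0.1 | 3 | 310 | 0.4 | 0.07 | 217 | 0.7 | 0.176 | 0.403 | 228 | 200 | 2000 | 350 | 3500 |
| 6 | 0.1 | 4.5 | 75 | 0.7 | 0.18 | 135 | 1.8 | 0.212 | 0.721 | 80.6 | 120 | 1200 | 130 | 2600 |
| 7 | 0.15 | 5.5 | 90 | 0.65 | 0.18 | 108 | 1.2 | 0.257 | 0.82 | 81.2 | 120 | 1200 | 130 | 2300 |
| 8 | 0.1 | 2 | 110 | 0.4 | 0.06 | 66 | 0.6 | 0.171 | 0.297 | 73 | 100 | 1150 | 120 | 2000 |
| 9 | 0.1 | 2 | 120 | 0.45 | 0.04 | 48 | 0.4 | 0.176 | 0.235 | 63.1 | 100 | 1150 | 120 | 2000 |
| 10 | 0.1 | 9 | 130 | 0.45 | 0.07 | 91 | 0.7 | 0.168 | 0.629 | 86.1 | 120 | 1200 | 120 | 2000 |
| 11 | 0.1 | 15 | 200 | 0.5 | 0.1 | 200 | 1 | 0.176 | 0.964 | 157 | 120 | 1200 | 110 | 1900 |
| 12 | 0.01 | 15 | 200 | 0.6 | 0.015 | 300 | 1.5 | 0.06 | 0.367 | 189 | 100 | 1100 | 110 | 1450 |

Z1 and Z2 are the primary and secondary variables; Z12 MM2 are the cross variogram parameters calculated under the Markov type II hypothesis. c0 is the nugget. c1 and c2 are the sills of the Cauchy structures. Ranges (a1 and a2) are given in the two anisotropy directions (0 and 90°).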

3.3. Cokriging

COK estimations were carried out for all locations, allowing us to obtain monthly temperature estimations (Figure 8). The differences between the predicted temperatures and those measured by meteorological stations were lower than 2 °C. MAE values calculated for the whole period with the cross-validation procedures were 1.5 ± 1.2 °C. Errors were higher in winter than in the summer months (Figure 9(a), solid line) and for the middle part of the period studied (Figure 9(b)). The higher winter MAE values may be related to the higher spatial variability of temperatures during these months. Variogram sills were higher (Figure 7) and the MAE divided by the total sill (normalized MAE) plotted against months did not exhibit the same seasonal trend (Figure 9(a), dashed line). Spatially, MAE values did not show any particular trends and were not related to the meteorological station density (Figure 10). Some stations located close together exhibited different errors, suggesting that measurements could be biased or influenced by particular climatic conditions. Kriging predictions respected the spatial pattern of variability reported by the MODIS skin surface temperatures for the year 2000 (Figure 8). Visual inspection of the January and July 2000 temperature maps showed that the spatial patterns were similar across the entire study area. Spatial correlation coefficients calculated between MODIS and cokriged temperatures were 0.8 and 0.6 for January and July, respectively. Cokriged temperatures did not reproduce the fine level (1 km resolution) temperature variability revealed by the MODIS imagery. This could explain the observed MAE values for geostatistical estimations.

Figure 8.

Cokriged mean temperatures (a) and MODIS skin surface temperatures (b) for 2 months. Monthly mean temperatures are calculated for the 2000–2005 period using the MODIS imagery

Figure 9.

Mean absolute errors (MAE) by (a) month and (b) year. Dashed line is the MAE normalized by the total sill variogram values for each month

Figure 10.

Map of the absolute errors (MAE) calculated for the whole period between measured and cokriged temperatures using cross-validation procedures

Ordinary kriging performed on the residual anomalies after detrending led to globally higher MAE values (1.7 ± 1.7 °C). For the central part of the area, where meteorological stations are scarce, the COK provided better estimations. In this area, errors obtained with the ordinary kriging were 1 to 2 °C higher.

4. Discussion

For regions with sparse meteorological stations, stochastic interpolation methods can be used to estimate the spatial distribution of climatic variables (Li et al., 2005). The COK interpolation used in this study has previously yielded good results for temperature, rainfall, solar radiation, relative humidity, and wind speed (Ishida and Kawashima, 1993; Phillips et al., 1997; Apaydin et al., 2004; Severino and Alpuim, 2005). As demonstrated in this study, climate model predictions can provide very useful secondary information for temperature COK. In contrast to elevation-based variables and other temporally constant variables, climate models reflect a given area's regional atmospheric circulation patterns, which may vary through time. Thus, the use of secondary information that takes into account the temporal variability of the primary variable improves geostatistical interpolations. As the interpolation is greatly influenced by the CRCM, the accuracy and reliability of the CRCM are crucial. The spatial variability of CRCM monthly mean temperatures must be related to the observed spatial patterns. This hypothesis has been verified. Strong spatial correlation with the measured mean temperatures indicated that the CRCM was able to closely describe the monthly spatial pattern of temperatures. Moreover, comparisons with the skin surface temperatures registered by the MODIS sensors show a good large scale relationship. However, MODIS skin surface temperatures highlight the fact that fine level spatial variability is not modelled by the COK interpolation. Climate models that provide global meteorology at mesoscale resolutions only (∼50 km in our study) do not take into account the sub-grid spatial variability (Spadavecchia and Williams, 2009). This could explain the remaining 1.5 °C error associated with these predictions.
A large part of the results may also be explained by the secondary information being available at a 50 km scale whereas the MODIS temperatures were measured at a 1 km resolution. Furthermore, this highlights the usefulness of including variables that reflect the variability at different spatial (or temporal) levels. Remote sensing, particularly within the infra-red spectrum, is used successfully to quantify the near surface temperatures with a high resolution (Goward et al., 1994). However, these data are time limited. Elevation and derived variables such as slope or aspect also have a high explanatory power (Goovaerts, 2000). An alternative approach could include elevation and skin surface temperatures as secondary information with climate model temperatures. Previous studies have shown an interest in including satellite-retrieved surface skin temperatures to interpolate the maximum air temperatures (Vogt et al., 1997). These interpolations would be conducted using kriging with covariables sampled at different scales. When numerous covariables are used, a single synthetic variable that encompasses the whole variability contained within the covariables could be calculated. For example, principal component analysis could be used to define a component that accounts for most of the variance in the data and is a linear combination of the original covariables (Davis and Kent, 1983).

A common approach to account for a variable with a trend is to decompose it into a trend and a residual component. The residuals are then stochastically modelled independently of the trend (Leuangthong and Deutsch, 2004). Several authors have used this approach successfully to interpolate climate variables (Hevesi et al., 1992; Phillips et al., 1992; Sevruk, 1997). Removing large trends ensures the first-order stationarity of the variables. Interpolating anomalies rather than actual temperatures can be viewed as a form of detrending where the surface removed is the mean temperature value calculated over a specific period and where the residual part is kriged (Jones et al., 1986; Gunst et al., 1993). However, some results have shown that detrending does not always improve interpolations (Weibel et al., 2002). In our COK approach, variograms exhibited a global exponential behaviour and did not reach a sill. Without detrending, semivariance increased continuously throughout the entire study area. COK without detrending allowed us to model the variogram of the primary variable and the cross variogram as a linear function of the secondary information. Adding the secondary information compensated for the lack of first-order stationarity.

Temperature records were available for the last 100 years in our study area and interpolations could be extended. However, potential pitfalls should be investigated. In this study, mean monthly variograms were calculated because we had previously verified that the variogram parameters were constant over the 40-year study period. However, trends could be expected if interpolations were conducted on longer periods. If this hypothesis is verified, semivariances and variograms could be calculated and modelled for each month and for each year. It would then be possible to calculate each monthly variogram using the predictions of several years centred on the month*year modelled. Predictions could be extended backward if climate model predictions were produced for past periods. A lower number of meteorological stations would be available, but we assume that the COK predictions would still be accurate. COK allowed us to produce an accurate 40-year time series of monthly mean temperatures in a region that possessed a low density of instrumental data.

Monthly mean temperature grids for Northeastern Canada produced in this study can be used in the future for climate simulation, spatial variation studies, or validation of paleoclimate reconstructions. Gridded datasets are publicly available through the Institut National de la Recherche Scientifique (http://url.in.rs/3P). Interpolation of monthly mean temperatures using COK in spherical coordinates provided good estimations and could be transposed to any part of the world because climate model data are widely available for all locations.

Acknowledgements

Professors D. Cluis and D. Marcotte provided helpful comments on earlier drafts of this manuscript. We thank M. Renaud and P. Jasinski for technical assistance and helpful collaboration for editing revisions. Funding for this research was provided by the Hydro-Québec Governmental Agency and the OURANOS Consortium. This work is a part of the Archives project.

The MATLAB codes are available upon request from the authors.
