A gridded land-only data set representing near-surface observations of daily maximum and minimum temperatures (HadGHCND) has been created to allow analysis of recent changes in climate extremes and for the evaluation of climate model simulations. Using a global data set of quality-controlled station observations compiled by the U.S. National Climatic Data Center (NCDC), daily anomalies were created relative to the 1961–1990 reference period for each contributing station. An angular distance weighting technique was used to interpolate these observed anomalies onto a 2.5° latitude by 3.75° longitude grid over the period from January 1946 to December 2000. We have used the data set to examine regional trends in time-varying percentiles. Data over consecutive 5 year periods were used to calculate percentiles which allow us to see how the distributions of daily maximum and minimum temperature have changed over time. Changes during the winter and spring periods are larger than in the other seasons, particularly with respect to increasing temperatures at the lower end of the maximum and minimum temperature distributions. Regional differences suggest that it is not possible to infer distributional changes from changes in the mean alone.
1. Introduction
 Long-term, global-scale, gridded monthly temperature data sets [e.g., New et al., 2000; Jones and Moberg, 2003] have been available to the research community for well over a decade. In contrast, no comparable products exist for the daily timescale. Gridded daily temperature observations are required for empirical analyses of global extremes, to validate the performance of climate models used to make future predictions of extreme events, as well as for other environmental modeling applications that require evenly spaced temperature data as input.
 A number of regional gridded daily temperature data sets exist, including those for China [Feng et al., 2004] and the USA [Janowiak et al., 1999]. Piper and Stewart created a global gridded data set consisting of daily maximum and minimum temperatures at a grid resolution of 0.5°, but it was based upon a limited period beginning in 1977. The data set presented in this paper, HadGHCND, offers an improvement on previous data sets as it contains daily maximum and minimum temperature fields for the entire period from 1946 to 2000, allowing analysis of changes over five decades. It also enables us to assess these changes on a near-global scale.
 First we discuss the observational data in section 2 and then describe the process of gridding these data in section 3, along with an evaluation of the data set in terms of interpolation errors and comparison with an existing global monthly mean temperature data set. Section 4 presents an assessment of changes in observed maximum and minimum temperatures between 1946 and 2000 with a particular focus on the changing distributional characteristics of the data. We discuss the results and conclusions in section 5.
2. Observational Data
 The primary source of station data is the U.S. National Climatic Data Center (NCDC) Global Historical Climatology Network-Daily (GHCND). This data set contains daily maximum and minimum temperatures for nearly 15,000 stations around the globe [Gleason et al., 2002] and is the most comprehensive data set of daily station observations available. Despite recent efforts to collate daily climate data on a regional basis [e.g., Klein Tank et al., 2002], a number of regions still display relatively sparse coverage of freely available station data, in particular Africa and South America. To supplement the coverage provided by GHCND we incorporate a total of 10 additional stations over Greenland and North Africa obtained from regional sources to provide additional, or more complete data in regions with poor coverage. At this stage, data for every 29 February were missing from the supplied GHCND data set. This is not a problem in relation to model comparison since climate models tend to have months of equal length (30 days), though future updates to GHCND, and the data set described in this paper, will rectify this omission.
 The GHCND data have undergone quality control checks [Gleason, 2002]. This procedure consisted of two parts: (1) simple datum checks, e.g., exceedance of known world extreme values, minimum temperature greater than maximum temperature on a given day, 10 or more consecutive days at the same value, and (2) statistical analysis of sets of observations to locate and identify outliers representing potentially erroneous data. These checks were also applied to the non-GHCND data. We were reluctant to exclude outliers, on the basis that these may represent genuine extremes, but an initial analysis suggested that many of the flagged values were indeed erroneous. We ran an additional test to identify values exceeding four, five and six biweight standard deviations [Lanzante, 1996] and excluded data exceeding six times the biweight standard deviation.
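As an illustration of the additional outlier test, the following minimal sketch computes a biweight mean and standard deviation in the form commonly attributed to Lanzante [1996] and flags values lying more than six biweight standard deviations from the biweight mean. The function names, the tuning constant c = 7.5 and the choice of the biweight mean as the center are assumptions made for this example, not details taken from the processing described above.

```python
import math
import statistics


def biweight_mean_std(values, c=7.5):
    """Return (biweight mean, biweight standard deviation) of a sample.

    c is the censoring constant; 7.5 is an assumed, commonly quoted value.
    """
    values = list(values)
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return med, 0.0  # degenerate sample: no spread to measure
    num_mean = den_mean = num_var = den_var = 0.0
    for v in values:
        u = (v - med) / (c * mad)
        if abs(u) >= 1.0:
            continue  # points this far out receive zero weight
        w = 1.0 - u * u
        num_mean += (v - med) * w ** 2
        den_mean += w ** 2
        num_var += (v - med) ** 2 * w ** 4
        den_var += w * (1.0 - 5.0 * u * u)
    mean = med + num_mean / den_mean
    std = math.sqrt(len(values) * num_var) / abs(den_var)
    return mean, std


def flag_outliers(values, n_std=6.0):
    """Flag values more than n_std biweight standard deviations from the center."""
    mean, std = biweight_mean_std(values)
    return [std > 0 and abs(v - mean) > n_std * std for v in values]
```

Because distant points receive zero weight, a single gross error barely inflates the estimated spread, so it remains detectable; an ordinary standard deviation would be inflated by the very value under test.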
 Stations containing at least 20 years of data between 1961 and 1990 were selected for gridding. The station network over the USA is much denser than in any other region, so we thinned the network to those stations corresponding to the daily United States Historical Climatology Network (USHCN [Williams et al., 2004]). The gridding method we use is limited to the closest 10 stations to a grid point, so a highly dense network would not aid the interpolation process. USHCN stations are selected on the basis of having a low potential for heat island bias, a relatively constant observation time, and reasonably homogeneous spatial distribution over the United States. The final station network is fairly dense over the Northern Hemisphere, particularly the United States, Europe, Japan and China. The Southern Hemisphere and tropics are poorly sampled in comparison. A total of 2936 stations were subsequently selected for use in the gridded data set (Figure 1).
 Data used for long-term climate research may be affected by inhomogeneities which can be related to urbanization and land use biases, or changing observing practices and instrumentation [Peterson et al., 1998]. We undertook an initial assessment of station data homogeneity based upon the methods described by Wijngaard et al. [2003], which use four tests of absolute homogeneity, i.e., testing for breakpoints at individual stations instead of with reference to neighboring stations. The tests indicated potential breakpoints at approximately 40% of stations. We looked more closely at a number of stations for which we had adequate metadata, and this suggested that breakpoints may be detected in the absence of any documented explanation. In some cases these breakpoints were coincident at neighboring stations, suggesting a possible genuine shift in the climate. Because of the large proportion of stations with detected breakpoints, many of which were located in data-sparse regions, we decided to include all stations to gain the greatest possible gridded coverage. An increase in the available daily station data would allow us to be more selective. A related issue is that methods of detecting and adjusting for inhomogeneities in monthly series are more advanced than those available for daily data [Wijngaard et al., 2003]. Robeson compared daily temperature data from Canada that had been homogenized [Vincent et al., 2002] with United States data that had not. Visual inspection along the U.S.-Canadian border showed no obvious difference in patterns of trends. This does not necessarily apply worldwide, and while the interpolation technique will help to reduce the impact of single inhomogeneities at individual stations, more so in data-rich regions, countrywide changes in observing practice or instrumentation may have a more significant impact upon observed trends.
3. Gridding the Observations
 Since we also wish to use the data set for model evaluation, we grid the data onto a 2.5° by 3.75° grid identical to the land mask of HadCM3 [Pope et al., 1999]. The interpolation method uses a modified version of Shepard's angular distance weighting algorithm [Shepard, 1968] as employed by New et al. [2000], who used it in favor of other methods because of its flexibility when gridding irregularly spaced station data. It has also been used by Kiktev et al. and by Piper and Stewart, who found it to be computationally efficient compared with other methods, while producing interpolation errors of a similar magnitude to alternative approaches.
 In order to avoid biases in the gridding, particularly over regions of sharply varying elevation, we grid the daily anomalies as opposed to the absolute values. First, climatological normals for 1961 to 1990 were calculated for each station's minimum and maximum temperature records using a five-day window centered on each day, assuming that the station had at least 20 years of data available within the reference period. Stations were required to have at least 350 daily normal values out of 366 calendar days, otherwise they were excluded from further consideration. The daily anomalies are simply calculated as the difference of each daily temperature from its daily normal value.
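The calculation of daily normals and anomalies can be sketched as follows. This is a minimal illustration under stated assumptions: each year is held as a list of 365 values (29 February being absent from the supplied data), missing values are represented by None, and the five-day window wraps around the ends of the calendar year. The function names and data layout are hypothetical, and the default threshold of 350 normals follows the requirement stated above.

```python
DAYS = 365  # 29 February is absent from the supplied data


def daily_normals(temps_by_year, first=1961, last=1990, half_window=2):
    """Mean temperature for each calendar day over a centered five-day window.

    temps_by_year maps year -> list of DAYS values (None where missing).
    """
    normals = []
    for day in range(DAYS):
        # Calendar days in the window, wrapping at the year boundary (assumption).
        window = [(day + off) % DAYS for off in range(-half_window, half_window + 1)]
        sample = [
            temps_by_year[year][d]
            for year in range(first, last + 1)
            if year in temps_by_year
            for d in window
            if temps_by_year[year][d] is not None
        ]
        normals.append(sum(sample) / len(sample) if sample else None)
    return normals


def has_enough_normals(normals, minimum=350):
    """Apply the requirement of at least `minimum` available daily normals."""
    return sum(n is not None for n in normals) >= minimum


def daily_anomalies(series, normals):
    """Difference of each daily temperature from its daily normal value."""
    return [
        None if t is None or n is None else t - n
        for t, n in zip(series, normals)
    ]
```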
3.2. Correlation Length Scales
 Our interpolation method, angular distance weighting (see section 3.3), requires an understanding of the spatial correlation structure of the station data. We investigate interstation correlations to determine the distances over which observed temperature anomalies are related. This allows us to define a distance weighting function and a maximum radius of influence for calculating grid point values. The spatial relationships between stations can vary with season and also differ between high and low latitudes. There is a weaker relationship between temperatures in the meridional direction than in the zonal direction [Jones et al., 1997]. We therefore split the globe into nonoverlapping zonal bands of 30° latitude and calculate interstation correlations for these bands independently. For the southernmost band we take the band from 30 to 90°S because of the sparse station coverage at these latitudes.
 For each pair of stations within the latitudinal bands and for each month, the correlation, r, was calculated and then binned according to the station separation in intervals of 100 km. Since there are a large number of stations we cut down the processing time by preselecting pairs of stations which fall within 2000 km of each other. The mean correlation was estimated over each 100 km interval and a second-degree polynomial function was fitted to these values, since the decay curves were not particularly smooth in the data-sparse southern bands. Figure 2 shows an example correlation decay curve for maximum temperature in the most northerly band (band 1) during July. Stations in Mexico were excluded at this point since we discovered that interstation correlations were particularly poor, and the GHCND documentation also notes potential unresolved quality issues with these data. We also excluded data from Hawaii and Puerto Rico, which displayed similarly poor interstation correlations. Other island stations, particularly those in the Pacific, were not incorporated into the final gridded data set in many cases because of their distance from HadCM3 grid points classified as land.
 The correlation length scale (CLS) is defined as the distance at which the mean correlation, represented by the fitted function, fell below 1/e [Belousov et al., 1971], where e = EXP(1). We estimated the distance at which this occurred to determine the CLS representing the radius of influence. The results are shown in Figure 3, indicating that CLSs are generally smaller in the summer and at lower latitudes.
 We also compared the interpolation errors associated with using a variable monthly CLS against those from a fixed annual mean value. Using a variable correlation length scale tended to give lower root mean square (RMS) interpolation errors (described in section 3.4) during the summer, and higher errors during the winter, relative to using a fixed annual CLS. Annual mean interpolation error was slightly lower using a fixed CLS. This, coupled with the convenience of a fixed grid mask obtained using the annual mean CLS, led us not to adopt variable CLSs. The annual mean correlation length scales for maximum and minimum temperatures are shown in Table 1. Although we correlated anomalies rather than absolute temperatures, seasonal differences are apparent between the zonal bands (Figure 3), although these are less apparent when viewing annual mean figures (Table 1).
Table 1. Annual Mean Correlation Length Scales (in Kilometers) for Each Latitude Band for Tmax and Tmin Anomalies
Band 1 (60°–90°N)
Band 2 (30°–60°N)
Band 3 (0°–30°N)
Band 4 (0°–30°S)
Band 5 (30°–90°S)
3.3. Interpolation Method
 Angular distance weighting uses two components to calculate the weighting of each station. The first component weights the station according to its distance from a grid point, with the CLS controlling the rate at which the weight decreases away from the grid point. We selected the exponential function as a reasonable representation of the observed correlation decay curves produced.
 Based upon the CLS, a correlation function can be defined [Jones et al., 1997], as shown in equation (1), where x is the distance of the station from the required grid point and x0 is the CLS appropriate to that grid point depending on its latitude.
Following New et al. [2000], we define a distance weight for a station, i, in equation (2). Weights decay more steeply for smaller CLSs, but the term m allows us to adjust the weighting function further, so that higher values of m also increase the rate at which the weight decays with distance.
As in the work of New et al., we tested different values of m (ranging from 1 to 10), and evaluated results based upon cross validation against withheld station data. We found that cross-validation RMS errors tended to decrease with increasing m, but that an m value of 4 offered a reasonable compromise between reducing the error and helping to reduce spatial smoothing, while still allowing more distant stations to influence the grid point value.
 Following New et al. [2000], the combined angular distance weight for the ith station (of a total of k stations contributing to a grid point value), Wi, is defined as:
where the position of the ith station is defined in terms of its distance, xi (equation (1)) and its angle to North, θi, relative to the specified grid point. The first term in the combined angular distance weight (equation (3)) weights the gridded value in favor of stations close to the grid point. The second term, in large brackets, weights the stations contributing to a grid point according to their directional (angular) isolation from each other and acts to increase the weight if the station is isolated in an angular sense.
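The displayed forms of equations (1) to (3) are not reproduced in this text. As a hedged reconstruction, based on the verbal descriptions in this section and on the angular distance weighting scheme as generally attributed to New et al. [2000], they would take the following form. Here the sums in equation (3) run over the other stations k contributing to the grid point; the exact notation is an assumption and should be checked against the typeset equations.

```latex
\begin{align}
r   &= e^{-x/x_0} \tag{1} \\
w_i &= r^{m} = e^{-m x_i / x_0} \tag{2} \\
W_i &= w_i \left\{ 1 + \frac{\sum_{k} w_k \left[ 1 - \cos(\theta_k - \theta_i) \right]}{\sum_{k} w_k} \right\}, \qquad k \neq i \tag{3}
\end{align}
```

With this form, the correlation in equation (1) equals 1/e at x = x0, consistent with the definition of the CLS in section 3.2, and the bracketed term in equation (3) ranges from 1 (all other stations in the same direction) to 3 (all other stations in the opposite direction).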
 One requirement is to define how we will select the stations that will contribute to each grid point value and avoid the use of unrepresentative stations. Instead of using an arbitrary globally constant search radius, we base the search radius upon the CLS, which varies with latitude. For the purposes of gridding, the CLS values for each zonal band were linearly interpolated to each grid point between the centers of adjacent bands so that there were no discontinuities along the band boundaries. Where the distance from a station to a grid point is greater than the CLS, the station is unlikely to provide any useful information for gridding [New et al., 2000]. Piper and Stewart and New et al. [2000] both use a variable search radius to include, respectively, the closest 4 to 10 stations and the closest eight stations to a grid point, with Dodson and Marks also suggesting eight as a good compromise. We use the weighted sum of the closest 3 to 10 stations to each grid point, provided that they fall within the CLS distance, to estimate our grid point temperature values. We use a minimum of three stations to allow for greater gridded coverage over data-sparse regions. If fewer than three stations with data are present within the search radius, the grid point value for that day is set to missing. If there are more than 10 stations within the CLS distance, then only the 10 closest to the grid point are used so as to increase computational efficiency, and therefore the actual radius of influence depends on the station density.
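The station selection and weighting rules described above can be sketched for a single grid point and a single day as follows. This is a minimal illustration, not the operational gridding code: the function name and the tuple layout (distance in km, bearing from North in degrees, anomaly) are hypothetical, and the weight formula assumes the exponential distance weight raised to the power m together with the angular isolation term in the form generally attributed to New et al. [2000].

```python
import math


def adw_estimate(stations, cls_km, m=4, min_stations=3, max_stations=10):
    """Angular distance weighted estimate of a grid point anomaly.

    stations: iterable of (distance_km, bearing_deg, anomaly) tuples, with
    anomaly set to None where the station has no data on this day.
    Returns None (missing) if too few stations lie within cls_km.
    """
    usable = [s for s in stations if s[2] is not None and s[0] <= cls_km]
    usable.sort(key=lambda s: s[0])
    usable = usable[:max_stations]  # keep only the closest stations
    if len(usable) < min_stations:
        return None

    # Distance weights: exponential correlation decay raised to the power m.
    w = [math.exp(-m * dist / cls_km) for dist, _, _ in usable]
    theta = [math.radians(bearing) for _, bearing, _ in usable]

    combined = []
    for i in range(len(usable)):
        others = [k for k in range(len(usable)) if k != i]
        if others:
            total = sum(w[k] for k in others)
            isolation = sum(
                w[k] * (1.0 - math.cos(theta[k] - theta[i])) for k in others
            ) / total
        else:
            isolation = 0.0  # a lone station has no angular term
        combined.append(w[i] * (1.0 + isolation))

    weighted_sum = sum(wt * s[2] for wt, s in zip(combined, usable))
    return weighted_sum / sum(combined)
```

For example, `adw_estimate([(100.0, 0.0, 0.0), (900.0, 120.0, 1.0), (900.0, 240.0, 1.0)], 1000.0)` returns a value much closer to 0 than to 1, because with m = 4 the nearby station dominates the two distant ones.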
 Daily data present greater problems than monthly data as there are more likely to be gaps in the record at the higher temporal resolution. Hence the group of stations that contribute to a grid point value has to be reassessed for each grid point on each day. To help cut down on the processing time we follow Piper and Stewart by creating lists of “nearest neighbor” stations for each grid point, which can be used to focus the search for nonmissing values.
 In addition to creating the anomaly grids, certain applications require the creation of an absolute temperature grid. We have gridded the daily normals using the same technique, which can then be added back onto the gridded anomalies to create absolute temperature grids. This does not completely address the issue of elevation dependence in the gridded normals [e.g., Willmott and Robeson, 1995], and we have begun investigating simple methods of accounting for elevation, which do lead to a reduction in interpolation error for the normals. This is an aspect we will investigate further for future versions of the data set.
3.4. Data Set Evaluation
 We evaluated the data set using cross validation [Cressie, 1993] to estimate errors associated with our chosen interpolation technique. This involved removing each station from the data set, and then using the interpolation technique to estimate the temperature anomaly time series for that station using data from the surrounding stations. We compute RMS errors on the basis of the differences between the actual station time series and the interpolated station time series. The results (Figure 4) show that, on average, the RMS errors are around 2°C. The annual average RMS error for all maximum temperature stations is 1.9°C, with highest values of 2.3°C in January, and lowest of 1.6°C during August. Errors are typically larger for minimum (2.0°C annual average) than for maximum temperatures, larger in the winter hemisphere, larger in coastal areas than inland locations, and larger in regions where the station density is lowest, hence resulting in greater spatial smoothing.
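The cross-validation procedure can be sketched as below. This is a minimal illustration: each station is withheld in turn, its series is estimated from the remaining stations, and the RMS difference is computed over days on which both values exist. The interpolator is passed in as a function; the stand-in used here (`mean_of_others`, a plain average of the other stations) is a placeholder assumed for the example and is not the angular distance weighting used for the data set.

```python
import math


def mean_of_others(target, others, day):
    """Placeholder interpolator: unweighted mean of the other stations' values."""
    vals = [series[day] for series in others.values() if series[day] is not None]
    return sum(vals) / len(vals) if vals else None


def leave_one_out_rmse(stations, interpolate=mean_of_others):
    """Return {station name: RMS error} from leave-one-out cross validation.

    stations maps name -> list of daily anomalies (None where missing).
    """
    errors = {}
    for name, series in stations.items():
        others = {k: v for k, v in stations.items() if k != name}
        squared = []
        for day, observed in enumerate(series):
            if observed is None:
                continue
            estimate = interpolate(name, others, day)
            if estimate is not None:
                squared.append((estimate - observed) ** 2)
        errors[name] = math.sqrt(sum(squared) / len(squared)) if squared else None
    return errors
```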
Figure 5 shows how the gridded coverage, represented by percentage of land cover, changes through time. In 1946 there is less than 40% coverage, rising to over 50% from the 1960s to the 1990s, when coverage drops slightly before declining rapidly as 2000 is approached. This reflects changes in the availability of stations, and spatial changes in coverage can be seen in Figure 6. For example, we do not have observations for China available in GHCND prior to 1950 or after 1998. The decadal averages require that less than 20% of data be missing for any particular grid point during the selected averaging period, hence grid points over China are missing in the first and last periods of Figure 6.
 As noted in the introduction, a number of global gridded data sets on monthly timescales are in existence. We have compared the variability of our daily data set on monthly timescales with that of CRUTEM2 [Jones and Moberg, 2003] for a number of regions around the globe. This gives us an initial appreciation of how our data set compares with established data sets, albeit at a lower temporal resolution. Daily Tmax and Tmin anomalies are averaged over each month, and a mean monthly temperature is calculated. Mean monthly temperatures for the two data sets are in good agreement over all regions (Figure 7). It is of note that maximum and minimum monthly temperatures over Australasia (Figures 7d and 7h) do not coincide as closely to the mean as they do over other regions, though the mean still correlates well with CRUTEM2. The correlation coefficients between the two series vary from 0.926 for central Asia in July, to 0.996 for Europe during January.
4. Observed Changes in Maximum and Minimum Daily Temperatures
 This data set allows us to study the observed global patterns of change based upon daily maximum and minimum temperatures. As an indication of global changes, Figure 6 shows maximum and minimum temperature anomalies relative to 1961–1990, averaged over each decade, or 1946–1960 in the case of the early part of the data set. Both maximum and minimum temperature anomalies have increased relative to the 1961–1990 period, particularly during the most recent two decades of the 1980s and 1990s. A key question concerns the potential change in not only the mean but also the variance and the shape of the daily temperature distributions [Meehl et al., 2000], and whether it is valid to use changes in mean temperature to infer changes in the extremes at a regional or local scale. Our data set enables us to investigate the full distribution of maximum and minimum temperatures and therefore investigate this assumption in more depth.
 We estimate percentiles on a monthly basis following Robeson. If percentiles are calculated on a seasonal or annual basis the lower percentiles are typically drawn from the colder months, and the higher percentiles from warmer months, and are therefore not representative of the entire season. For each month, we pool data from 5 consecutive years, which gives us a larger sample from which to calculate the percentiles. The percentiles for each month and nonoverlapping 5 year period were estimated by selecting the data value closest to the required percentile. The percentiles were then area averaged for a number of subregions of the globe and regional trends were estimated using least squares regression. The results discussed below are based on percentiles calculated from our absolute temperature data set, though we obtain similar results using our gridded anomaly data set.
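The percentile and trend calculations can be sketched as follows. This is a minimal illustration under stated assumptions: "the data value closest to the required percentile" is interpreted as the ordered value whose rank is nearest to p percent of the way through the pooled sample, and the trend is an ordinary least squares slope through the sequence of 5 year percentile values. The function names are hypothetical.

```python
def nearest_percentile(sample, p):
    """Return the observed value whose rank is closest to the p-th percentile.

    sample is the pooled, nonmissing daily data for one calendar month
    over one 5 year period; p lies between 0 and 100.
    """
    ordered = sorted(sample)
    index = int(p * (len(ordered) - 1) / 100 + 0.5)  # round to nearest rank
    return ordered[index]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx
```

Selecting an observed value, rather than interpolating between ranks, keeps each percentile tied to a temperature that actually occurred in the pooled sample.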
Robeson investigated time-varying percentiles for daily air temperature over North America and found that changes in the late winter and spring were particularly important. Our results for the United States (Figures 8a and 8b) indicate a broad agreement in the sign and magnitude of trends for all months and percentiles compared to those found by Robeson. Most of the daily minimum temperature distribution has experienced warming throughout the year, with maximum rates occurring during the winter months and at the mid to lower end of the distribution. Only the period around October indicates small negative trends in Tmin across all percentiles. One difference from the findings of Robeson is the location of the zero trend demarcation for maximum temperatures during the summer months, but the maximum trends, in terms of when they occur and at what percentile, are in very close agreement. Otherwise Tmax shows a similar pattern to Tmin, in that warming is concentrated during the winter and early spring months, centered on March. Again, slight cooling trends are centered on October with small negative trends throughout the rest of the year. We evaluated trend significance for each month and percentile interval using a nonparametric Mann-Kendall test. Over the USA trends were only coherently significant at the 5% level during March.
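A Mann-Kendall trend test of the kind mentioned above can be sketched as follows. This is a minimal illustration of the standard test using the normal approximation with a continuity correction; it assumes no correction for tied values or serial correlation, and the function name is hypothetical. It is not necessarily the exact variant applied to the percentile series.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for a monotonic trend in `series`."""
    n = len(series)
    # S counts concordant minus discordant pairs in time order.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    variance = n * (n - 1) * (2 * n + 5) / 18.0  # no tie correction (assumption)
    if s > 0:
        z = (s - 1) / math.sqrt(variance)
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return s, z, p_value
```

With 11 nonoverlapping 5 year periods between 1946 and 2000, a strictly increasing percentile series gives S = 55, which is significant at the 5% level under this approximation.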
 Next we look at Europe (Figures 8c and 8d), another region where observations are relatively dense. Similar to the USA, we see winter minimum temperatures significantly increasing, particularly at the lower end of the distribution. During November and December there are decreasing trends at the lower end of the distribution. Changes are small during the summer. Maximum temperatures suggest a similar pattern with greater trends at the lower percentiles during the winter. While median Tmax and Tmin during the European autumn show close to a zero trend, there is a negative trend at lower percentiles and a slight positive trend at higher percentiles, suggesting changes in the variance and/or skewness that would not be detected by considering only the median change.
 China (Figures 8e and 8f) also displays warming of both minimum and maximum temperatures during the winter, particularly at the lower percentiles, indicating a reduction in the range. As over the USA (Figure 8b), maximum temperatures have a negative trend during the spring and summer. Only the warming trends in minimum temperatures during winter are significant over China. Minimum and maximum temperatures increase throughout the year in Russia (Figures 8g and 8h). Large significant trends of around 6°C are observed during winter. Median Tmax and Tmin increase, particularly during the winter months. Finally, temperatures increase during all months and at all percentiles across Australasia (Figures 8i and 8j). Unlike the other regions, the greatest increase in temperatures is not during the austral winter, but is split between May and September. The largest changes are observed during September, but the warming trends throughout the year are significant at most percentiles.
 In addition to the intradistributional changes in maximum and minimum temperatures, it is clear that in most regions the warming trends in minimum temperatures are greater than for maximum temperatures. This results in a reduction in the diurnal temperature range (DTR) that has been highlighted in previous studies [e.g., Easterling et al., 1997]. Over the USA and China, the reduction in DTR due to warming minimum temperatures is exacerbated by decreasing trends in maximum temperatures which occur during the summer months. China during winter shows that trends in minimum temperatures are larger than for maximum temperatures, which contributes to a reduction of the DTR.
Frich et al. examined a number of extremes indices relating to temperature. They found that much of the Northern Hemisphere and Australia have warmed, exceptions being the south-central United States, eastern Canada and Iceland, as well as parts of central and eastern Asia. Most notably the proportion of warm nights (defined as the frequency of days where the 90th percentile of minimum temperatures is exceeded) has increased in most regions except over parts of Canada, Iceland, China, and around the Black Sea. Our findings support this and give a more detailed view of how temperatures are changing across the entire distribution and on a seasonal basis.
5. Summary and Conclusions
 A gridded daily temperature data set (HadGHCND) has been created based upon station observations of maximum and minimum temperature, covering the period from 1946 to 2000. We have compared the data set with an existing gridded data set at a monthly resolution, and the two agree well on monthly timescales when considered on a regional basis. Nevertheless, analysis of extremes requires a high level of quality control and homogeneity, for example when extracting single annual maximum values. Future considerations are likely to include station homogeneity, including neighbor checks and adjustments for break points, biases caused by different observation times around the globe, as well as sampling errors. Assessing the impact of the interpolation method on the underlying data will also be an important consideration. It is apparent that to gain a truly global picture of changing extremes we need to fill the remaining data gaps over regions such as Africa, South America, southern Asia and the Middle East, where the availability of daily climate observations is currently limited. A recent initiative by Alexander et al. has made considerable progress toward improving the coverage of available climate extremes indices data.
 Investigation of time-varying percentile trends shows that many regions indicate a coherent warming trend in both maximum and minimum temperatures during winter. It is clear that changes are not uniform across the seasons. While the dominant patterns of change are generally seen during winter, there are regional variations. The varying patterns of seasonal and regional percentile trends suggest that to infer changes in extreme temperatures from mean changes in temperature would not be appropriate.
 The data set was gridded onto a 2.5° by 3.75° grid to facilitate comparison with the Hadley Centre GCM. Future work will involve intercomparison with GCM simulations of the latter half of the twentieth century to evaluate the quality of GCMs with respect to their ability to simulate daily temperature distributions and extremes. A newly implemented automated gridding system will enable us to more easily produce versions of this data set on different grid resolutions. Improvements to the GHCND data set will allow us to extend HadGHCND from 2000 to the present day. The gridded data set can be obtained from the U.S. National Climatic Data Center (http://www.ncdc.noaa.gov) and from the Hadley Centre at the UK Met Office (http://www.hadobs.org).
 Thanks go to Byron Gleason for supplying the GHCN-Daily data set and to the two anonymous reviewers for helping to improve this paper. This work was largely funded by the U.K. Department for Environment, Food and Rural Affairs under contract PECD/7/12/37. This paper is British Crown copyright.