 This study first homogenizes time series of daily maximum and minimum temperatures recorded at 825 stations in China over the period from 1951 to 2010. Change points are detected using both metadata and the penalized maximum t test, with the first-order autocorrelation being accounted for, and the quantile-matching algorithm is used to adjust the data time series to diminish discontinuities. Station relocation was found to be the main cause for discontinuities, followed by station automation. The effects of discontinuities on estimation of long-term trends in the annual mean and extreme indices of temperature are illustrated. The data homogenization is shown to have improved the spatial consistency of estimated trends. Using the homogenized daily minimum and daily maximum temperature data, this study also analyzes trends in extreme temperature indices. The results show that the vast majority (85%–90%) of the 825 sites have experienced significantly more warm nights and fewer cold nights since 1951. There have also been more warm days and fewer cold days since 1951, although these trends are less extensive. About 62% of the 825 sites were found to have experienced significantly more warm days and about 50% significantly fewer cold days. None of the 825 sites were found to have significantly more cold nights/days or fewer warm nights/days. These results indicate that the warming is stronger in nighttime than in daytime and stronger in winter than in summer. Thus, the diurnal temperature range was found to have significantly decreased at 49% of the 825 sites, with significant increases being identified only at 3% of these sites.
 Extreme events, such as heat waves, have great impacts on society. Daily meteorological observations have increasingly been used to assess changes in extremes or unusually anomalous weather fluctuations [Zwiers and Kharin, 1998; Jones et al., 1999; Karl et al., 1999; Frich et al., 2002; Klein-Tank and Konnen, 2003]. However, long-term daily climate data time series are often affected by a number of nonclimatic factors that make these data unrepresentative of the actual climate variation over time. These factors include changes in instruments, observing practices, station location, and environment [Jones et al., 1985; Karl and Williams, 1987; Gullett et al., 1990; Heino, 1994]. Therefore, to be confident in the analysis of long-term changes in weather/climate extremes, it is necessary for the data to be homogenized to diminish nonclimatic influences.
 Instrument-type changes are usually accompanied by comparison measurements. In this case, overlapping records can be used to adjust for the discontinuities due to instrument-type changes. For example, Demaree et al. and Cocheo and Camuffo used overlapping records to construct long-term temperature series for central Belgium and Padova, Italy, respectively. Unfortunately, instrument comparisons are often conducted at a limited number of locations [Peterson et al., 1998], and the results have often been published as institute reports or other “gray literature” not easily available to the international research community [Nordli et al., 1997]. Usually, a statistical test is necessary to identify nonclimatic shifts [e.g., Wang, 2003, 2008a, 2008b; Reeves et al., 2007]. Most of the recent homogenized data sets are developed by using statistical approaches and using available metadata to verify detected discontinuities [DeGaetano, 2006; Li et al., 2009; Trewin, 2010; Vincent et al., 2012; Venema et al., 2012; Trewin, 2013; Wang et al., 2013]. In particular, Trewin reviewed daily data homogenization methods, and Venema et al. and DeGaetano discussed homogenization of monthly data.
 In the literature, techniques suitable for homogenization of daily temperature data are limited. This is because daily temperatures vary on relatively small spatial scales and are influenced by local processes that are complex and nonlinear, which can be difficult to capture using our conventional climatological networks. The typical decorrelation scales of daily temperatures are ~200 km [Jones and Trewin, 2002], and these scales are likely to be smaller in areas of complex topography. Therefore, while the methods used to homogenize annual and monthly data are well established [e.g., Peterson et al., 1998; Ducre-Robitaille et al., 2003], relatively few methods exist for homogenization of daily data.
Vincent et al.  used a simple approach to homogenize Canadian daily maximum and minimum temperature data. They applied standard techniques to monthly data to identify change points and estimate the shift size (monthly adjustments). Adjustments to each daily temperature are then derived from fitting a piecewise linear function to the monthly mean adjustments for the 12 calendar months such that the integrated magnitude of the daily adjustments preserves the monthly adjustments. They show that their method results in improved daily error estimates and greater spatial representation of extreme temperature trends. However, this method cannot handle regime-dependent shifts properly, because the adjustment for each shift is fixed for a calendar day, no matter how cold or warm the temperature is on that day. Similar techniques have been applied widely to homogenize daily climate data [Maugeri et al., 2002, 2004; Li et al., 2009; Jones and Lister, 2002; Moberg et al., 2002]. Unfortunately, none of these methods adjust the higher-order moments explicitly.
 In recent years, efforts have been made to develop specific methods for homogenization of daily data series with a focus on biases in the probability distribution of daily variables and in daily weather fluctuations. For example, Trewin and Trevitt  developed a percentile-matching adjustment method for homogenization of climate data, which matches the probability distribution of the data before and after a nonclimatic shift. By percentile-matching adjustment, a new daily homogenized temperature data set for Australia has been developed and released [Trewin, 2013]. Della-Marta and Wanner  developed a similar adjustment method. They used a nonlinear model to estimate the relationship between a target station and a highly correlated reference station data series. They showed that, given a suitably reliable reference station, this method produces reliable adjustments to the mean, variance, and skewness. Brandsma and Konnen  present a technique called “nearest neighbor resampling” for homogenization of daily mean temperatures, to diminish nonclimatic shifts caused by a change in the time and frequency of subdaily measurements. Yan and Jones  applied a wavelet method to adjust discontinuities in daily meteorological series. Wang and Feng  proposed the quantile-matching (QM) adjustment method, which is in principle similar to the methods of Trewin and Trevitt  and Della-Marta and Wanner , except that different methods are proposed and used to estimate the adjustments that are needed to match the probability distributions. This QM adjustment method has recently been used by Vincent et al.  and Wang et al.  to homogenize monthly and daily series of Canadian surface air temperatures. In this study, we use this method to homogenize daily series of Chinese surface air temperatures.
 Like many long-term climate data series, the Chinese temperature data time series also contain discontinuities caused by meteorological station relocations or other changes to observation practice [Liu and Li, 2003; Li et al., 2004a, 2004b; Li et al., 2009; Li and Yan, 2009; Li and Dong, 2009]. For the period 1951–2001, Li et al. [2004a, 2004b] have homogenized the surface air temperatures for 731 stations in China, using the two-phase regression [Easterling and Peterson, 1995] and the technique of Vincent et al., producing the first version of the Chinese homogenized historical temperature data set (CHHT, version 1.0) [Li et al., 2009]. For the period 1960–2008, Li and Yan applied the multiple analysis of series for homogenization method to homogenize the daily mean, maximum, and minimum temperatures for 549 stations in China, producing another representative homogenized temperature data set for China. Despite the different methods used, the resulting data sets show similar temperature trend patterns [Li and Li, 2007; Li and Dong, 2009; Li and Yan, 2009]. These studies have undoubtedly improved the reliability of the temperature data set. However, most surface observational stations have been automated, with the automation starting in 2000 but occurring mostly in the period 2003–2005. In addition, we have recently identified and corrected some errors that were introduced during digitization, which affect a considerable number of values during the period 2004–2007. It would be interesting to assess the effect of the station automation and the digitization errors on estimation of climatic trends and to update the homogenized temperature data set to cover the last decade.
 Thus, the first objective of this study is to use recently developed data homogenization methods (e.g., using QM adjustments) to homogenize daily maximum (Tmax) and daily minimum (Tmin) temperature time series for the period 1951–2010 for 825 National Reference and Basic Stations (NRBS) in China, to produce a larger data set of homogenized Tmax and Tmin for a longer period. Another objective of this study is to assess trends in the Chinese temperature extreme indices derived from the data set homogenized in this study.
 The rest of this paper is organized as follows. Section 2 describes the observed surface air temperature data and related metadata, as well as the methods used in this study. Section 3 reports on the statistics of detected change points and the related causes. Section 4 shows the effects of discontinuities on estimation of long-term trends. Section 5 presents the observed trends in Chinese extreme temperature indices as derived from the homogenized data set. Finally, we give a summary in section 6 to complete this article.
2 Data and Methods
2.1 Observed Surface Air Temperature Data and Metadata
 The temperature data set is collected and processed by the National Meteorological Information Center (NMIC) of the China Meteorological Administration (CMA). According to CMA's classification in 2012, the 825 NRBS stations, shown in Figure 1a, include 143 National Reference Stations (NRS) and 682 National Basic Stations (NBS). The NRSs are established according to the regional climate characteristics and the requirements of the Global Climate Observing System, to attain long-term, continuous, and representative climate data time series. These stations are the backbone of the national climate observing network. The NBSs are set up according to the needs of the national climate analysis and weather forecasting, mainly to serve for the regional or national weather information exchange.
 The data set covers the period from 1951 to 2010. Daily maximum and minimum temperatures were recorded according to the standard observation rules for China [CMA, 1979]. The observed values are quality controlled using the NMIC conventional procedures, including the climatological limit check, the station or regional extremes check, the internal consistency check, the temporal and spatial consistency checks, etc. Recently, NMIC has identified and corrected some random errors that were introduced during digitization of data in paper forms, which affect a considerable number of data in the period 2004–2007. But the corrections of these errors are approximately normally distributed (not shown), with the mean value of 0.06°C for both Tmax and Tmin. These digitization errors were not corrected in any of the existing homogenized China temperature data sets. This study will produce for the first time a homogenized data set that includes correction of the digitization errors.
 Related metadata are also used in this study to verify statistically detected change points. According to “the Rules of Surface Observations in China” [CMA, 1979], which are set out in the China Meteorological Industry Standards, any change at a national meteorological station shall be documented in a standard form, the “Meteorological Station History data File Format.” These station history data files (i.e., metadata) are checked and stored by the provincial meteorological information centers and also by NMIC.
 Table 1 reports the statistics of station relocations. Among the 825 stations analyzed in this study, a total of 669 stations have been moved one or more times. As mentioned earlier, most stations have been automated in the past decade or so. Figure 1b shows the number of stations that have been automated in each year during the period from 2000 to 2010. Clearly, the majority of automations took place during 2003–2005, with 274 stations being automated in 2004.
Table 1. The Statistics of Station Relocations (up to 31 December 2009)
≥ 3 Times
Number of stations relocated
2.2 Change Point Detection Methods
2.2.1 Methods Used to Detect Change Points
 In this study, we use the RHtestsV3 software package [Wang and Feng, 2010] to homogenize the daily temperature series. This package includes two algorithms: the PMTred algorithm is based on the penalized maximal t test (PMT) [Wang et al., 2007], which needs to use a reference series, and the PMFred algorithm is based on the penalized maximal F (PMF) test [Wang, 2008b], which can be used without a reference series. Both algorithms account for the effect of autocorrelated noise term empirically and deal with multiple change points using a recursive testing algorithm [Wang, 2008a]. The effects of unequal sample sizes on the detection power were also diminished using an empirical function in each algorithm [Wang et al., 2007; Wang, 2008b]. The RHtestsV3 package and its previous versions have been widely used to homogenize climate data [e.g., Wang et al., 2013; Kuglitsch et al., 2012; Vincent et al., 2012; Dai et al., 2011; Wan et al., 2010; Zhang et al., 2005; Alexander et al., 2006]. Using an early version of the RHtests package, You et al.  checked homogeneity of daily temperature and precipitation data before they calculated extreme indices for China.
 In particular, one can use the PMFred algorithm to test the homogeneity of a reference series and adjust it for significant change points when necessary. The homogenized reference series can then be used to test other data series using the PMTred algorithm. In this study, as described in the next subsection, we first use both the PMFred and PMTred algorithms to test the homogeneity of potential reference series, to find homogeneous series for use as reference series. Then, we use the PMTred algorithm along with a reference series to test the daily maximum and daily minimum temperature series and to test the corresponding annual and monthly mean series, separately. In this study, all the tests are conducted at the 5% significance level.
 The resulting change points are then synthesized and verified with available metadata. In general, we retain two types of change points for adjustment. (1) Change points that are supported by metadata and are also identified as significant documented (Type 0) change points in the monthly or annual mean series. In general, we consider a detected change point to be a documented one when metadata indicate a documented change within 1 year before or after the detected change point. In this case, the documented time of change is used to replace the estimated time of change when they are not identical (due to estimation error). (2) Change points that have no metadata support but are identified as significant undocumented (Type 1) change points in each and every data series of the three time scales, i.e., in the annual, monthly, and daily series. Here the maximum time distance for a change point to be considered the same one in the monthly, annual, and daily time series is 1 year. Although a daily data series is, in general, much noisier than the corresponding annual and monthly data series, its sample size is 365 times the sample size of the annual series and about 30 times that of the consecutive monthly series. As long as the autocorrelation, which is also higher in daily series than in monthly or annual series, is accounted for in the homogeneity test (as is done in the RHtests package), results of testing daily series are not necessarily less reliable than those of testing monthly or annual series. Requiring a Type 1 change point to be detected in data series of all three time scales reduces the small chance of adjusting for false positive change points, which is what we desire.
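The two retention rules can be illustrated with a minimal Python sketch. It is not part of the RHtests package: the `Detection` class, the function name, and the choice to anchor Rule 2 on the daily estimate are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical record of one statistically detected change point."""
    year: float  # estimated time of change, in decimal years
    scale: str   # "annual", "monthly", or "daily"


def retain_change_points(detections, documented_years, window=1.0):
    """Return sorted (year, kind) pairs retained for adjustment."""
    def nearest_documented(year):
        near = [y for y in documented_years if abs(y - year) <= window]
        return min(near, key=lambda y: abs(y - year)) if near else None

    kept = set()
    # Rule 1: detected in the annual or monthly series and supported by
    # metadata; the documented date replaces the estimated one.
    for d in detections:
        if d.scale in ("annual", "monthly"):
            doc = nearest_documented(d.year)
            if doc is not None:
                kept.add((doc, "documented"))

    # Rule 2: no metadata support, but detected at all three time scales
    # within `window` years (assumption: anchored on the daily estimate).
    for d in detections:
        if d.scale != "daily" or nearest_documented(d.year) is not None:
            continue
        scales = {e.scale for e in detections if abs(e.year - d.year) <= window}
        if {"annual", "monthly", "daily"} <= scales:
            kept.add((d.year, "undocumented"))
    return sorted(kept)
```

In this sketch a detection seen only in the daily series, with no metadata support, is dropped, which mirrors the intent of reducing false positives.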
2.2.2 Methods Used to Choose/Compose Reference Series
 In general, it is very important to find or compose a homogeneous reference series (which could be a homogeneous subperiod of a data record) that well represents the same climatic variations as in the base series (the series to be tested and homogenized). (Although pairwise comparison algorithms do not need to find a homogeneous period a priori, they implicitly require that, within the network of stations, some data series are homogeneous for some subperiods. They would not work well for detecting network-wide changes, i.e., changes shared by all data time series in the network.) However, it is often difficult to find a homogeneous representative reference data time series. At the monthly or yearly scale, Peterson and Easterling proposed to compose a reference series by spatially interpolating the data series from 3 to 4 nearby stations that are highly positively correlated with the base series. At the daily time scale, however, it would be more feasible to use a single-station data series as a reference series, because the decorrelation distance is much smaller for daily data due to the effect of local disturbances. Thus, in this study, we took the following two approaches to choose reference series.
 The first approach is for high station density areas, which consists of the following steps:
 We divide continental China into 2.5°-by-2.5° lat.-long. grid boxes (28 × 16 grid boxes, of about 250 km resolution). For each grid box that contains five or more stations, we obtain a grid box average annual data series by averaging individual station annual data series over all stations in the grid box.
 We calculate the correlation of the first difference series between the grid box average annual data series and each individual station annual data series in the grid box.
 When the grid box contains five or more stations for which the above correlation is 0.8 or higher, we average the daily Tmin (or Tmax) data series over the five stations of highest correlations, obtaining a five station average daily data Tmin (or Tmax) data series.
 We use the PMTred algorithm, with the five station average daily Tmin (or Tmax) data series as the reference series to test homogeneity of each individual station daily Tmin (or Tmax) data series in the grid box, to find the longest homogeneous single-station Tmin (or Tmax) data series as the final reference series for all Tmin (or Tmax) series in the grid box.
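The screening in the first three steps (grid box average, first-difference correlation, and the 0.8 threshold) can be sketched as follows. This is an illustrative sketch only: the function names and the input layout (a dict of gap-free annual series of equal length) are assumptions, and the subsequent PMTred test is not reproduced.

```python
from math import sqrt
from statistics import mean


def first_difference(series):
    """Year-to-year differences of a series."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def top_correlated_stations(annual_by_station, min_corr=0.8, n_top=5):
    """Return the n_top station IDs best correlated with the grid box mean.

    annual_by_station maps a station ID to its annual series (hypothetical
    layout). An empty list means the grid box does not qualify.
    """
    ids = list(annual_by_station)
    n_years = len(annual_by_station[ids[0]])
    box_mean = [mean(annual_by_station[s][t] for s in ids) for t in range(n_years)]
    d_box = first_difference(box_mean)
    corr = {s: pearson(first_difference(annual_by_station[s]), d_box) for s in ids}
    good = sorted((s for s in ids if corr[s] >= min_corr),
                  key=lambda s: corr[s], reverse=True)
    return good[:n_top] if len(good) >= n_top else []
```

The daily series of the returned stations would then be averaged to form the five station average series used as the reference in the PMTred step.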
 When a 2.5°-by-2.5° grid box contains fewer than five stations for which the above correlation is 0.8 or higher (including the case that there are fewer than five stations in the grid box), we choose reference series without using any grid boxes. This second approach consists of the following steps:
 We use the PMFred algorithm to test homogeneity of each daily temperature data series at each of the 2421 stations in China (without using a reference series), just to find homogeneous series to form a pool of potential reference series.
 For each base series for which a reference series has not been identified using the first approach, we choose, from within 500 km radius, up to 20 nearest homogeneous stations that also satisfy the following limits in altitude difference: the difference shall be within 200 m if the base station altitude Hb ≤ 2500 m, and within 500 m if Hb > 2500 m.
 We calculate the correlation of the first difference series between the base daily series and each of these (up to 20) homogeneous series and choose the homogeneous station series that has the highest correlation as the reference series for the base series in question.
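The distance and altitude screening in the second approach can be sketched in Python as below. The dictionary keys (`id`, `lat`, `lon`, `alt`), the function names, and the use of the haversine great-circle distance are assumptions for illustration; the source does not state how distances were computed. The final step, choosing the candidate with the highest first-difference correlation, is omitted.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def altitude_limit_m(base_alt):
    """Allowed altitude difference: 200 m if Hb <= 2500 m, else 500 m."""
    return 200.0 if base_alt <= 2500.0 else 500.0


def candidate_references(base, pool, max_km=500.0, max_n=20):
    """Return IDs of up to max_n nearest homogeneous stations that qualify.

    `pool` is assumed to contain only stations already found homogeneous.
    """
    limit = altitude_limit_m(base["alt"])
    within = []
    for st in pool:
        if st["id"] == base["id"]:
            continue
        dist = haversine_km(base["lat"], base["lon"], st["lat"], st["lon"])
        if dist <= max_km and abs(st["alt"] - base["alt"]) <= limit:
            within.append((dist, st["id"]))
    within.sort()  # nearest first
    return [sid for _, sid in within[:max_n]]
```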
 In general, the station density is lower in western China than in eastern China. However, since we are choosing reference series from 2421 stations, rather than the 825 stations, the correlation of first difference series between a base series and its reference series is 0.7 or higher, for all base series analyzed in this study, regardless of its station location (in western or eastern China).
 In addition, we also avoid using stations of obvious urban heat island effect, as defined in Li et al. [2004b], as a reference station, to avoid the effect of urban heat island on homogeneity test [Li and Dong, 2009]. Following Li et al. [2004b], here, an urban station is defined as a station that is located either in a city with population of more than 50,000 or in a municipal district with population of 500,000. There are about 120 urban stations of good long-term data record [Li et al., 2004b]. There have been several studies on assessing urban heat island effects on temperature trend estimates for China [e.g., Li and Dong, 2009; Jones et al., 2008; Li et al., 2010; Li and Huang, 2013; Li et al., 2013]. Thus, this will not be a focus of the present study. After all, urban heat island effects are human-induced changes.
 As a result of the above reference selection procedures, each and every chosen reference series is a single-station raw data series that is found to be homogeneous. None of the chosen reference series was subjected to any adjustments, no matter which of the two approaches above was used.
2.3 Change Point Adjustment Methods
 Adjusting the central tendency or mean state of a climate variable is sufficient to homogenize annual/monthly data time series to provide reliable estimates of trends and variability in the mean state of the climate variable [e.g., Houghton et al., 2001; Jones and Moberg, 2003]. However, a nonclimatic change could have different effects on the recorded values under different climate conditions (e.g., cold versus warm temperatures), as shown in Wang et al.  (see their Figure 4). It could affect several aspects of the data distribution (e.g., scale and shape), with or without shifting the mean state. In particular, adjusting artificial shifts in the mean state is often not sufficient to homogenize daily data time series for analysis of extremes. Thus, a few methods have been developed to match the distributions of the data before and after a change point [e.g., Trewin and Trevitt, 1996; Della-Marta and Wanner, 2006; Wang et al., 2010], which diminishes the effect of nonclimatic change on all aspects of the data distribution (variance, shape…). In this study, we use the QM adjustment method [Wang et al., 2010] to homogenize the Tmax and Tmin time series. As in Vincent et al.  and Wang et al. , up to 10 years of data before and after a change point are used to estimate the QM adjustments from the base-minus-reference series, with the cumulative distribution being evaluated for 12 quantile categories whenever there is enough data to do so [Wang et al., 2010; Wang and Feng, 2010]. The reference series were selected as described in section 2.2.2.
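The idea of matching distributions across a change point can be illustrated with a deliberately simplified Python sketch. It is not the QM algorithm of Wang et al. [2010] as implemented in RHtestsV3: it uses piecewise-constant category adjustments, whole segments rather than up to 10 years of data, and no smoothing across categories. All function names are hypothetical.

```python
def category_means(base, ref, n_cat):
    """Mean base-minus-reference difference per quantile category.

    Categories are defined by the rank of each base value within the
    segment. Requires len(base) >= n_cat so that no category is empty.
    """
    order = sorted(range(len(base)), key=lambda i: base[i])
    n = len(order)
    cat_of = [0] * n
    sums = [0.0] * n_cat
    counts = [0] * n_cat
    for rank, i in enumerate(order):
        c = rank * n_cat // n
        cat_of[i] = c
        sums[c] += base[i] - ref[i]
        counts[c] += 1
    return [s / k for s, k in zip(sums, counts)], cat_of


def qm_adjust(base, ref, change_index, n_cat=4):
    """Adjust the segment before change_index to match the segment after it."""
    before_b, before_r = base[:change_index], ref[:change_index]
    after_b, after_r = base[change_index:], ref[change_index:]
    m_before, cat_before = category_means(before_b, before_r, n_cat)
    m_after, _ = category_means(after_b, after_r, n_cat)
    # Each value is shifted by the difference, for its own quantile category,
    # between the post-change and pre-change mean departures from the reference.
    adjusted = [v + (m_after[c] - m_before[c]) for v, c in zip(before_b, cat_before)]
    return adjusted + list(after_b)
```

Because the adjustment depends on the quantile category of each value, a shift that affects only warm values (for example) is removed without altering the cold values, which a single mean adjustment could not do.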
3 Statistics of Detected Change Points and the Related Causes
 Table 2 reports the number of stations without any shift or with shifts that have been identified and adjusted for the daily maximum (Tmax) and minimum (Tmin) temperature series and the number of shifts due to different causes (relocation, automation…). At the 5% significance level, the Tmax and Tmin series are found to be homogeneous at 466 and 363 stations (56% and 44% of the 825 stations), respectively. This suggests that Tmin is more sensitive to the effect of discontinuities than is Tmax, which is consistent with the conclusion of Li and Dong. Sometimes, a shift is identified on the same date for both Tmax and Tmin series of the same station. For 239 out of the 825 stations, both the Tmax and Tmin series were found to be homogeneous. Note that the number of stations that have a statistically significant shift due to station relocation is much smaller (272 stations for Tmax and 316 stations for Tmin; see Table 2) than the total number of stations that have been moved one or more times (669 stations; see Table 1). This indicates that station relocation did not always cause a statistically significant shift; some of the relocations had negligible effects on the temperature observations.
Table 2. The Number of Stations With None, One, Two, Three, Four, and Five Shifts That Have Been Identified and Adjusted for the Daily Maximum (Tmax) and Minimum (Tmin) Temperature Series, as Well as the Number of Shifts due to Different Causes as Indicated
a. Number of Stations With
Total number of stations
b. Number of Shifts due to
Total number of shifts
 A total of 463 change points were detected in 359 Tmax time series and 635 change points in 462 Tmin time series (Table 2). Figure 2 shows the spatial and temporal distributions of the number of shifts in the Tmax and Tmin series recorded at the 825 stations for the period from 1951 to 2010. Clearly, artificial shifts are quite widespread across China and over the period since 1951. Stations with more than two shifts are mainly seen in northern China (Figures 2a–2b). The highest number of change points in the observing network is seen in years 2004 and 2005 for both Tmax and Tmin (Figures 2c–2d), which were caused by station automation according to metadata.
 About 84% and 74% of the change points detected in the Tmax and Tmin series, respectively, are supported by metadata (Table 2). These are also referred to as documented change points. The smaller portions of change points that do not have metadata support (Table 2) are significant Type 1 change points that are associated with shifts of non-negligible magnitude in the annual and monthly series and were also identified in the corresponding daily series. Thus, we consider them to be true artificial change points for adjustment, noting that metadata are often incomplete.
 As reported in Table 2, station relocation is the main cause of discontinuities in the daily temperature data series. It is responsible for about 50%–60% of the detected change points or about 70% of the documented change points. Station automation and changes in the observing environment are found to be the other main causes for discontinuities in daily temperature time series (Table 2).
 In general, the effect of relocation varies from case to case. Relocation with a big station elevation change usually has larger impacts than does a long-distance relocation with the old and new sites sharing similar environment and elevation.
 Whereas station relocations and other changes at observing sites can cause either a rise or a drop in the observed temperatures, the artificial changes are not necessarily distributed symmetrically about zero. Figures 3a–3b show the probability density functions of all QM adjustments applied respectively to the daily maximum and minimum temperatures that have been identified to have inhomogeneities. Most of the adjustments range from −1°C to 1°C for Tmax (50% within −0.39°C to 0.25°C; Figure 3a) and from −1.5°C to 1.5°C for Tmin (50% within −0.61°C to 0.38°C; Figure 3b). The median of the adjustments is −0.10°C for both Tmax and Tmin (Figures 3a–3b).
 The negative median values of the QM adjustments (Figures 3a–3b) arise from the fact that station relocations collectively imposed an overall positive mean bias in Tmin and Tmax, so that the median of the adjustments needed to diminish the effects of relocations is negative (−0.21°C for Tmax and −0.35°C for Tmin; Figures 3c–3d). In contrast, station automations collectively imposed an overall negative mean bias in Tmin and Tmax, so that the adjustments needed to diminish the effects of automations have a positive median value (0.10°C for Tmax, 0.12°C for Tmin; Figures 3e–3f).
 The magnitude of the relocation effects on Tmin is more variable (the distribution is wider) than those on Tmax (Figures 3c–3d). This is also true for the automation effects but to a lesser degree (Figures 3e–3f).
4 Inhomogeneity Impacts on Estimation of Long-Term Trends
4.1 Inhomogeneity Impacts on Estimation of Annual Mean Temperature Trend
 In this subsection, we first use Guiyang and Dachen stations as examples to show the effects of discontinuities on estimation of annual mean temperature trends. We then further illustrate the impacts by intercomparing the patterns of trends estimated from the raw and homogenized data series.
 According to metadata, Guiyang station (ID 57816) has been relocated twice since 1951. The first relocation was on 31 August 1953, moving 0.4 km away from its old location, and the new station elevation is 13.8 m higher than the old one. The second relocation was on 31 December 1999, moving 2.5 km away from the old site, and the new station elevation is 149.0 m higher than the old one.
 Dachen station (ID 58666) has been relocated once since 1951. It was moved 1.7 km away from its old location on 1 October 1982. The new station elevation is 18.7 m lower than the old one.
 The Tmin series from station 57707 and the Tmax series from station 57902 were chosen as the best reference series for the Guiyang Tmin and Tmax series, respectively, and the Tmin series from station 58752 and the Tmax series from station 58667, for the Dachen Tmin and Tmax series. With these reference series, the homogeneity test results suggest that the Guiyang Tmax series is significantly affected by both relocations, but its Tmin series is significantly affected by the second relocation only. For Dachen, both its Tmin and Tmax series are significantly affected by the relocation to a lower elevation site.
 The Guiyang annual mean series of the raw and homogenized (QM adjusted) Tmin and Tmax series are shown in Figures 4a–4b, respectively. For both the Tmin and Tmax series, a negative trend is estimated from the raw data series, while the homogenized series shows a positive trend. Here the negative trend is an artifact introduced by relocation of the station to a site of notably higher elevation. The elevation increase is suspected to be the main cause of the artificial decrease in the temperature records. The 1999 relocation introduced a decrease of about 1.26°C and 1.38°C in the mean of Tmax and Tmin, respectively, and the 1953 relocation also introduced a decrease of about 1.73°C in the mean of Tmax (these are the sizes of the mean shifts).
 The Dachen annual mean series of the raw and homogenized Tmin and Tmax series are shown in Figures 4c–4d, respectively. For both Tmin and Tmax, the raw data series show stronger warming trends than do the corresponding homogenized data series. The artificially enhanced warming trends are due to relocation of the station with an elevation decrease of 18.7 m.
 The examples in Figure 4 clearly show that inhomogeneities can artificially enhance or weaken the long-term trends. More realistic estimates of long-term trends can be obtained from homogenized data series.
 The data inhomogeneities also decrease the spatial consistency of estimated trends in annual mean temperatures. As shown in Figure 5, the patterns of trends derived from the homogenized data series have better spatial consistency than those derived from the raw data series. The improvement is particularly noticeable for Tmax in southwestern China (Figures 5a–5b) and for Tmin in central China (Figures 5c–5d).
4.2 Inhomogeneity Impacts on Estimation of Trends in Temperature Extreme Indices
 Inhomogeneities in daily temperature series could also have great impacts on the estimation of trends in temperature extreme indices. To illustrate such impacts, we calculate the following extreme indices: warm days (TX90P), cold days (TX10P), warm nights (TN90P), and cold nights (TN10P). Similar to Zhang and Feng , TX90P (TX10P) is defined as the number of days with temperature above the 90th (below the 10th) percentile of Tmax, and TN90P (TN10P), the number of nights with temperature above the 90th (below the 10th) percentile of Tmin, respectively. The period 1961–1990 is used as the base period for calculating the 90th and 10th percentiles.
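 The counting behind these percentile-based indices can be sketched as follows. This is a simplified illustration, not the code used in this study: the function name `percentile_index_counts` and the input layout (one row per year, one column per calendar day, no missing values) are assumptions, and the threshold is computed separately for each calendar day from the base period only, without the moving window around each calendar day that operational implementations often apply.

```python
import numpy as np

def percentile_index_counts(daily, years, base=(1961, 1990), q=90, above=True):
    """Annual counts of days beyond a base-period percentile threshold.

    daily : array of shape (n_years, n_days), e.g. Tmax or Tmin values
    years : array of shape (n_years,), calendar year of each row
    q     : percentile (90 for TX90P/TN90P, 10 for TX10P/TN10P)
    above : count days above the threshold if True, below if False
    """
    daily = np.asarray(daily, dtype=float)
    years = np.asarray(years)
    in_base = (years >= base[0]) & (years <= base[1])
    # One threshold per calendar day, estimated from base-period years only.
    threshold = np.percentile(daily[in_base], q, axis=0)
    beyond = daily > threshold if above else daily < threshold
    return beyond.sum(axis=1)

# Hypothetical input: each year's temperature equals its index (0..59) on every day.
years = np.arange(1951, 2011)
daily = np.repeat(np.arange(60.0)[:, None], 365, axis=1)
warm = percentile_index_counts(daily, years, q=90, above=True)
cold = percentile_index_counts(daily, years, q=10, above=False)
```

 With Tmax as input, `q=90, above=True` corresponds to TX90P and `q=10, above=False` to TX10P; with Tmin as input, the same calls correspond to TN90P and TN10P.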
 Considering that these indices are not Gaussian data, in this study, we estimate trends in each time series of these indices by the trend estimation method of Wang and Swail [2001]. This is a Kendall trend estimator [Kendall, 1955; Sen, 1968] that accounts for the first-order autocorrelation in the time series in question [Wang and Swail, 2001, Appendix A].
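 For orientation, the two basic ingredients of a Kendall-type trend analysis can be sketched as below: the Sen slope (the median of all pairwise slopes) and the Kendall S statistic (the sum of the signs of all pairwise differences). This is only a minimal sketch with hypothetical function names. It omits the treatment of first-order autocorrelation described by Wang and Swail and the significance test, so it should not be read as the estimator used in this study.

```python
import numpy as np

def sen_slope(t, y):
    """Median of the slopes between all pairs of points (Sen's estimator)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(t), k=1)   # all index pairs with i < j
    return float(np.median((y[j] - y[i]) / (t[j] - t[i])))

def kendall_s(y):
    """Kendall S statistic: sum of signs of all later-minus-earlier differences."""
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(y), k=1)
    return int(np.sign(y[j] - y[i]).sum())

# Hypothetical counts with one outlying final value.
t = np.arange(5)
y = np.array([1.0, 3.0, 5.0, 7.0, 100.0])
slope = sen_slope(t, y)
s_stat = kendall_s(y)
```

 Because the slope is a median over pairs, a single outlying year has little influence on it, which is one reason rank-based estimators suit count data such as the extreme indices.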
 Figure 6 shows the time series of annual counts of warm days, warm nights, cold days, and cold nights, as derived from the raw and homogenized (QM adjusted) Tmax and Tmin series for Guiyang station. For three of the four extreme indices (all except TN10P), the trend estimated from the homogenized data is of the opposite sign to that estimated from the corresponding raw data (Figures 6a–6c). For cold nights (TN10P), a significant decreasing trend is almost completely suppressed by the artificial changes in the raw data (Figure 6d). The artificial decreases in temperature records, if not diminished, would lead us to mistakenly report a decreasing trend for an actual increasing trend in the occurrence frequency of warm days and warm nights (Figures 6a–6b), and to mistakenly report an increasing trend for an actual decreasing trend, or fail to detect a significant decreasing trend, in the occurrence frequency of cold days and cold nights (Figures 6c–6d). The artificial effects on the frequency of warm nights and cold nights are larger than those on the frequency of warm days and cold days (Figure 6).
 Inhomogeneities could also decrease the spatial consistency of estimated trends. For example, Figures 7–10 show the maps of trends in the annual counts of cold nights, warm nights, warm days, and cold days, respectively, as derived from the raw and homogenized Tmin (Figures 7–8) and Tmax (Figures 9–10) data. Clearly, the spatial consistency of trends in the homogenized data (Figures 7b–10b) is better than that in the raw data (Figures 7a–10a), indicating that data homogenization could improve the spatial consistency of estimated trends.
5 Trends in Extreme Temperature Indices Derived From the Homogenized Data
 Figures 7b and 8b show the maps of trends in the occurrence frequency of cold nights (TN10P) and warm nights (TN90P), as derived from the homogenized Tmin data. Clearly, the frequency of cold nights has decreased almost everywhere across the country, with a significant decreasing trend being identified at 745 (90.3%) of the 825 sites (Figure 7b). A small increasing trend in the frequency of cold nights is estimated for five sites in central China, but none of them is significant at the 5% level (Figure 7b). The decreasing trend is smaller in central south China (from the middle Yangtze River valley to Guangxi province) than in other regions (Figure 7b). In contrast, warm nights have become more frequent almost everywhere across the country, with a significant increase in frequency being identified at 697 (84.5%) of the 825 sites (Figure 8b). A negative trend in the frequency of warm nights is found at 12 of the 825 sites, but none of these trends is significant at the 5% level (Figure 8b). Overall, continental China has experienced significantly more warm nights and fewer cold nights since 1951.
 Continental China has also experienced more warm days (TX90P) and fewer cold days (TX10P) during the last 60 years or so. However, trends in the occurrence frequencies of warm days and cold days are much less extensively significant than those of warm nights and cold nights (compare Figures 7b, 8b, 9b, and 10b). Among the 825 sites, 508 sites (61.6%) were found to have experienced significantly more warm days and 408 sites (49.5%) significantly fewer cold days (Figures 9b and 10b). None of the 825 sites was found to have a significant decrease in the frequency of warm days or a significant increase in the frequency of cold days (Figures 9b and 10b). Trends in the frequency of warm days are largest along the southeast coasts and along the lower Yangtze River valley, as well as over the Loess Plateau, while they are mostly insignificant in inland south China (Figure 9b). In the Huaihe River to lower Huanghe River valley, the frequency of warm days shows a small, insignificant decreasing trend (Figure 9b), which could be associated with a gradual increase in irrigation in this major agricultural area (personal communication with Dr. Xuchao Yang). Trends in the frequency of cold days are largest along the east coast and in western China, with most sites in inland south China showing insignificant trends (Figure 10b).
 As a result of the climatic changes in the Tmin and Tmax, the diurnal temperature range (DTR) was also found to have decreased significantly at 401 (48.6%) of the 825 sites but increased significantly at 30 (3.6%) of the 825 sites (Figure 11). The DTR increases are largest in northeastern and central-eastern China, as well as in northwestern China (Figure 11).
6 Summary and Discussion
 In this study, we have first homogenized time series of daily maximum and minimum temperatures recorded at 825 stations in China over the period from 1951 to 2010, using both metadata and the PMTred algorithm to detect change points, and using the QM adjustment method conducted with reference series to adjust the data time series to diminish inhomogeneities. This produces the largest Chinese data set of homogenized daily minimum and daily maximum temperatures, including 825 stations and covering a 60 year period (1951–2010), which is particularly useful for in-depth characterization of extreme surface air temperatures across China and for assessing changes therein. The raw and homogenized daily data for 198 international exchange stations will soon be made publicly available via the Global Land Surface Meteorological Databank of the International Surface Temperature Initiative (http://www.surfacetemperatures.org/home).
 We have found that station relocation is the main cause of the identified inhomogeneities, followed by station automation. The latter introduced obvious sudden increases in temperature between years 2004 and 2005. Our results show that, within the network of 825 stations, station relocations collectively imposed an overall positive mean bias in Tmin and Tmax, but station automations collectively imposed an overall negative mean bias in Tmin and Tmax (Figure 3). The magnitude of the effects of station relocations and automations on Tmin is more variable than that on Tmax (Figures 3c–3f). The adjustments for all identified inhomogeneities are not distributed symmetrically about zero, showing a small negative bias for both Tmax and Tmin (Figures 3a–3b).
 We have also assessed the effects of inhomogeneities on estimation of trends in the annual mean temperature and in extreme temperature indices. We have shown that the data homogenization has improved the spatial consistency of estimated trends (Figures 5 and 7–10). We have also shown that data homogenization is rather important at the station level, though the impact of inhomogeneities appears to be mostly random across the station network as a whole.
 Although the QM adjustment method used to homogenize daily temperature series in this study has been used in previous studies [e.g., Dai et al., 2011; Vincent et al., 2012; Wang et al., 2013], benchmarking of daily data homogenization methods is still lacking at present. Such benchmarking work would be valuable for characterizing the uncertainty associated with homogenization of daily climate data time series. Thus, this has become a planned activity of the International Surface Temperature Initiative (http://www.surfacetemperatures.org/home).
 Using the newly homogenized daily minimum and daily maximum temperature data, we have also analyzed trends in the extreme temperature indices for the 825 sites. Our results show that the vast majority (85%–90%) of the 825 sites have experienced significantly more warm nights and fewer cold nights since 1951. Also, there have been more warm days and fewer cold days since 1951, although these trends are less extensive. About 62% of the 825 sites were found to have experienced significantly more warm days and about 50% significantly fewer cold days. None of the 825 sites were found to have significantly more cold nights/days or fewer warm nights/days. These results indicate that the warming is stronger in nighttime (Tmin) than in daytime (Tmax) and stronger in winter than in summer, which is consistent with previous reports for other regions [e.g., Vincent et al., 2012; Wang et al., 2013]. Therefore, the diurnal temperature range was also found to have significantly decreased at about 49% of the 825 sites, with significant increases being identified only at 3% of these sites (Figure 11).
 This work was supported by State Key Development Program of Basic Research of China, 2010CB951600, Ministry of Science and Technology of China, GYHY201206012, National Science and Technology Supporting Program of the 12th Five-Year Plan Period, 2012BAC22B00, and China Meteorological Administration Special Foundation for Climate Change, CCSF201224. This work is conducted under the umbrella of the Canada-China Joint Working Group Project titled “R&D and Cooperation on Quality Control and Homogenization of Climate Observational Data”.