Probabilistic quality control of daily temperature data

Abstract

A procedure for the quality control of daily temperature data, designed to automatically identify potential anomalies, is presented. The procedure verifies whether the observed data lie within two different confidence intervals of fixed probability: the first derived from the probability distribution fitted to the historic dataset of the climatic station under consideration, and the second derived by means of multiple linear regressions using contemporaneous data observed at selected reference stations. Example applications of the proposed procedure are reported with reference to daily maximum and minimum temperature data observed from 1950 to 2004 at four thermometric stations in Sicily (Italy), selected within the meteorological monitoring network operated by the Water Observatory of the Sicily region. Results of the applications show that more than 80% of the daily temperature data are automatically validated. In addition, the performance of the procedure is assessed by introducing known errors into temperature datasets assumed to be correct, and by estimating the probabilities of correctly classifying data as validated or not validated. The results indicate that, as the percentage of errors increases, the probability that not validated data are indeed not correct increases, while the probability that validated data are correct decreases. In particular, if the percentage of errors is larger than about 35%, the procedure is less capable of recognizing correct data than of detecting incorrect ones. Copyright © 2012 Royal Meteorological Society

1. Introduction

The quality of meteorological data is of primary importance for the reliability of any study in the field of environmental sciences. As a matter of fact, meteorological data are affected by errors deriving from multiple causes, often unidentified, which can have severe and adverse effects on the analyses and decisions that result from them (Madsen, 1989; Daly et al., 2004).

With the widespread use of electronic interfaces in data collection, many meteorological networks have increased the sampling rate and have added more sensors. In principle, the increase in data volume implies a greater manual workload for the operators whose task is to assign quality flags to the data. To this end, automatic procedures for the quality control of meteorological data can significantly reduce the time necessary for a manual inspection, by preliminarily screening the data for potential anomalies in the observations (Abbott, 1986; Reek et al., 1992; You et al., 2008). Then, only a limited number of suspected incorrect data require an in-depth inspection by an operator who, based on experience, will eventually be able to decide on the quality of the information (Sciuto et al., 2009).

Although automatic procedures for data quality control were originally designed to identify typing errors that can occur when data in paper format are transferred to electronic format (Daly et al., 2004), nowadays they are mainly applied to detect errors that can occur in data acquisition, storage and transmission from automatic stations, either when data are transmitted to a collection centre or when they are read by an integrated data logger (Gandin, 1988). In the latter case, one of the biggest problems is corruption or loss of data in the logging unit. Other problems include loss of seal, internal condensation or water intrusion, corrosion of contacts, loss of power (battery) and inability to read data from the logging unit memory (Metcalfe et al., 1997).

In the literature, several procedures for the automatic control of hydrometeorological data have been proposed, mostly oriented to the detection of outliers in historical datasets. In particular, some procedures are based on a comparison between current data and the data observed in the past at the same station. Vose et al. (1992) proposed a procedure for the quality control of monthly series of climatic data (temperature, precipitation, sea-level pressure, etc.) oriented to identifying anomalous extreme values. In particular, temperature data are analysed by means of thresholds (e.g. mean monthly temperatures less than −73 °C or more than 58 °C are labelled as outliers). Similarly, Feng et al. (2004) applied a quality control to daily climatic data observed at a large number of stations in China, based on a comparison with appropriate thresholds for high and low extreme values.

Other methods are based on data observed at neighbouring stations, or on a combination of both approaches (Eischeid et al., 1995; Gandin, 1988; Campisano et al., 2002; Brunet et al., 2006). Peterson et al. (1998) presented practical guidelines for the quality control of digital data, mainly focused on avoiding problems (e.g. file truncations, formatting errors, unreadable records) that make large portions of data unusable or untrustworthy. Moreover, outlier identification based on the data recorded at the target station is supported by a spatial analysis, based on the consideration that if the climate of the region was exceptionally cold (or warm) that month, nearby stations should confirm that assessment. You and Hubbard (2006) show that spatial regressions can be successfully applied in quality assurance also for extreme values, and that these methods yield a robust quality control for automatic data networks.

Recent research relates the quality control of climatic data to the types of error that can occur when data are flagged as either correct or incorrect. Usually, a Type I error is the rejection of good data and a Type II error is the acceptance of bad data. The frequency of occurrence of these two types of error is a good indicator for evaluating the performance of the methods (You and Hubbard, 2007).

Besides detecting outliers, quality control can also be applied within data assimilation schemes, e.g. in weather forecasting models, to ensure that the underlying assumptions for the assimilated data are not violated (Ingleby and Lorenc, 1993; Qin et al., 2010).

In general, quality control methods can be oriented to detect anomalies in historical datasets or implemented for online use within acquisition procedures. Clearly, in the latter case, the procedure should enable a fast decision based on computer flagging.

In this article, an automatic quality control procedure for daily temperature data is presented, which extends a method previously developed for monthly temperature and precipitation datasets (Campisano et al., 2002). The procedure is specifically developed for an automatic monitoring network, and aims at flagging daily temperature data as they are acquired from the observation network. In particular, the procedure is based on checking new data through comparison with probabilistic confidence intervals derived from the historical datasets observed both at the target station and at selected reference stations. More specifically, a first control is carried out by verifying whether the observed data lie within a confidence interval of fixed probability, obtained from the historical series of the station under investigation. A second control is based on comparing the observed data with confidence intervals derived by means of multiple linear regressions, developed using contemporaneous data observed at selected reference stations.

The proposed procedure is intended as a further step of quality control, after common internal consistency and temporal coherency tests have been carried out, such as checking whether daily maximum temperature values are smaller than daily minimum values, whether several consecutive values are identical, or whether there are large differences between consecutive observations.
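For illustration, such preliminary checks can be implemented in a few lines; the following is a minimal Python sketch (the thresholds and the array layout are illustrative assumptions, not part of the proposed procedure):

```python
import numpy as np

def basic_consistency_checks(tmax, tmin, max_jump=15.0, max_repeats=5):
    """Flag indices failing simple internal consistency / temporal coherency tests.

    tmax, tmin : 1-D numpy arrays of daily maximum and minimum temperature (degC).
    max_jump   : largest plausible day-to-day change (illustrative threshold).
    max_repeats: number of identical consecutive values considered suspicious.
    """
    flags = np.zeros(len(tmax), dtype=bool)

    # Internal consistency: the daily maximum must not be below the daily minimum.
    flags |= tmax < tmin

    # Temporal coherency: unrealistically large jumps between consecutive days.
    flags[1:] |= np.abs(np.diff(tmax)) > max_jump
    flags[1:] |= np.abs(np.diff(tmin)) > max_jump

    # Persistence: runs of identical consecutive values (e.g. a stuck sensor).
    for series in (tmax, tmin):
        run = 1
        for i in range(1, len(series)):
            run = run + 1 if series[i] == series[i - 1] else 1
            if run >= max_repeats:
                flags[i] = True
    return flags
```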

It is worth underlining that quality control is usually the first step in compiling a high-quality database. Once quality control is completed, the second step is to perform homogeneity tests, to ensure that variations in the time series respond only to the forcing induced by climate variability and not to artificial causes related, for instance, to station relocation, installation of new sensor types, changes in observation time, and so on. However, as quality control procedures (such as the one presented here) generally rely on historical observations at the same station or at nearby ones, it is extremely important that the historical datasets of the considered stations are preliminarily submitted to an in-depth analysis aimed at the identification of possible non-homogeneities, and perhaps adjusted, before undertaking the quality control. Indeed, non-homogeneous datasets can compromise the performance of the quality control procedure, in terms of outlier detection, by leading to an unreliable estimation of the confidence intervals. Several papers document the appropriate order in which to proceed to compile a homogeneous temperature dataset (Karl and Williams, 1987; Alexandersson and Moberg, 1997; Vincent, 1998; DeGaetano, 2006; Toreti and Desiato, 2008). In what follows, we will assume that the historical datasets have already been tested for homogeneity.

Example applications of the proposed procedure are reported with reference to daily maximum and minimum temperature observed from 1950 to 2004 at some automatic stations in Sicily (Italy), operated by the former Sicilian Hydrographic Service, hereafter referred to as the Water Observatory.

Finally, following an approach similar to that adopted in a previous study on the quality control of daily rainfall data (Sciuto et al., 2009), the performance of the proposed procedure is verified by introducing known errors into the available series of daily temperature, assumed to be correct, and by estimating the probabilities of correctly and incorrectly classifying data in terms of the corresponding frequencies.

2. Quality control procedure of daily maximum and minimum temperature data

2.1. Generalities

The proposed procedure for the quality control of daily maximum and minimum temperature consists of two different phases, namely calibration and data control. The aim of the calibration phase is the definition of two confidence intervals of fixed probability for daily temperature: one based on the historical dataset of the target station only, the other based on contemporaneous data observed at the target station and at the selected reference stations.

In the first case, probabilistic confidence intervals of daily temperature are derived from the probability distribution preliminarily fitted to the historic dataset of the target station. Clearly, the periodic variability of daily temperature along the year has to be properly taken into account in the choice of the probabilistic model to be adopted.

On the other hand, when reference stations are also considered, probabilistic confidence intervals are derived from multiple linear regressions linking daily maximum and minimum temperature data of the target station to the contemporaneous data of reference stations.

As already stated, homogeneity tests should be applied to the historic temperature datasets before the confidence intervals are computed, in order to remove any biases due to artificial causes related, for instance, to changes in observational practices, which can compromise the results of the quality control. In particular, non-homogeneities in observed time series are usually abrupt changes or breakpoints, which can sometimes be misinterpreted as significant linear trends and thus incorrectly removed from the data. Therefore, appropriate methods for detecting non-homogeneities, which enable one first to discern between breakpoints and trends and then to properly adjust the data, must be applied to avoid an unreliable identification of outliers.

Once the procedure has been calibrated, the data control phase enables the quality of the currently observed temperature data to be assessed, by verifying whether the data lie within the previously determined confidence intervals.

In Table I, the possible outcomes of the quality control procedure are reported, together with the consequent actions to be undertaken. In particular, if the observed temperature value lies within both confidence intervals, the one based on the historical data of the target station and the one based on the data of the reference stations, it is classified as ‘validated’ (V) and thus accepted. If the value lies within just one of the two confidence intervals, it is classified as ‘suspect’; whereas, if it lies outside both confidence intervals, it is flagged as ‘highly suspect’, indicating a higher chance that the value is incorrect. Clearly, in the last two cases the value is ‘not validated’ (NV) and a manual control of the data is necessary, after which the data can either be reclassified as validated or finally rejected. The main steps of the procedure are described in Figure 1.

Figure 1.

Flow chart of the automatic quality control procedure of daily temperature data

Table I. Outcomes of the automatic quality control procedure of daily temperature data
| Verification 1: data within confidence interval based on target station data | Verification 2: data within confidence interval based on reference stations data | Data quality classification | Action |
|---|---|---|---|
| YES | YES | Validated (V) | Accepted |
| YES | NO | Suspect (NV) | Manual control |
| NO | YES | Suspect (NV) | Manual control |
| NO | NO | Highly suspect (NV) | Manual control |
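The decision logic of Table I reduces to a simple combination of the two verifications; a minimal sketch is given below (the boolean inputs are assumed to be the outcomes of the two controls described in Sections 2.2 and 2.3):

```python
def classify(within_target_ci: bool, within_reference_ci: bool):
    """Combine the two verifications of Table I into a quality flag and an action."""
    if within_target_ci and within_reference_ci:
        return "Validated (V)", "Accepted"
    if within_target_ci or within_reference_ci:
        return "Suspect (NV)", "Manual control"
    return "Highly suspect (NV)", "Manual control"
```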

2.2. Confidence intervals based on the data of the target station

Confidence intervals can be derived by fitting a probability distribution to the temperature series, one for each day of the year, observed at the considered station. For instance, if the temperature data at year $i$ and day $t$, $T_{i,t}$, with $t = 1, 2, \ldots, 365$ and $i = 1, 2, \ldots, n$, are normally distributed, the daily lower and upper bounds, $T_t^{\mathrm{low}}$ and $T_t^{\mathrm{up}}$, of the $100\cdot(1-\alpha)\%$ confidence interval, with $\alpha$ the fixed significance level, can be determined as:

$$T_t^{\mathrm{low}} = \mu_t + u_{\alpha/2}\,\sigma_t, \qquad T_t^{\mathrm{up}} = \mu_t - u_{\alpha/2}\,\sigma_t \qquad (1)$$

where $\mu_t$ and $\sigma_t$ are, respectively, the mean and the standard deviation of the temperature values at day $t$, and $u_{\alpha/2}$ is the standard normal quantile corresponding to a non-exceedance probability $\alpha/2$. Similar expressions can be derived for other distributions.
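For instance, under the normal assumption of Equation (1), the daily bounds can be computed as in the following sketch (the data are assumed to be arranged as a year-by-day array; $\alpha = 0.01$ corresponds to the 99% intervals used later in the applications):

```python
import numpy as np
from scipy.stats import norm

def daily_normal_bounds(T, alpha=0.01):
    """Per-day confidence bounds under a normal assumption (Equation (1)).

    T     : array of shape (n_years, 365), daily temperature by year and day.
    alpha : significance level (0.01 gives 99% confidence intervals).
    Returns (lower, upper), each of length 365.
    """
    mu = np.nanmean(T, axis=0)              # sample mean for each day t
    sigma = np.nanstd(T, axis=0, ddof=1)    # sample standard deviation for each day t
    u = norm.ppf(alpha / 2.0)               # standard normal quantile (negative)
    return mu + u * sigma, mu - u * sigma
```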

Owing to the periodicity of daily temperature, the mean $\mu_t$ and standard deviation $\sigma_t$ will exhibit a strong dependence on the day $t$. Here, three methods to compute the means and standard deviations of daily temperature have been compared, namely:

  • Method 1: the means and standard deviations of daily values are computed, for each day $t$ of the year, from their sample counterparts, namely $\hat{\mu}_t = \frac{1}{n}\sum_{i=1}^{n} T_{i,t}$ and $\hat{\sigma}_t = \left[\frac{1}{n-1}\sum_{i=1}^{n}\left(T_{i,t}-\hat{\mu}_t\right)^2\right]^{1/2}$, respectively;

  • Method 2: the variability along the year of the mean and of the standard deviation of daily values is modelled by means of Fourier analysis;

  • Method 3: the variability along the year of the mean and of the standard deviation of the mean of daily values, computed on a moving window of 11 consecutive days, is modelled by means of Fourier analysis.

The advantage of modelling the mean and standard deviation of the daily maximum and minimum temperature datasets through Fourier analysis stems from the reduction in the number of parameters (i.e. two parameters for each harmonic plus a constant). Methods 2 and 3 are based on two different applications of Fourier analysis; the difference lies in the choice of the underlying variable, namely the daily temperature value in Method 2, and the mean of the daily temperature values observed within a moving window of 11 days in Method 3. In particular, Method 3 is based on the consideration that the temperature observed on a generic day is not a feature of that specific day of the year alone, but can be similar for a group of days preceding and following the one considered. To this end, a moving window of 11 days has been chosen, as it may represent an appropriate trade-off between a shorter time interval, which would not encompass all the days with similar temperature, and a longer time interval, which would include different seasonal features.
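The 11-day windowing underlying Method 3 can be sketched as follows (the circular wrapping at the year boundary is a simplifying assumption of this sketch; Section 3.2.3 describes the boundary treatment actually adopted in the study):

```python
import numpy as np

def moving_window_day_means(T, half_width=5):
    """Mean over an 11-day window centred on each day of the year (Method 3).

    T : array of shape (n_years, 365). The window is wrapped circularly here
    for simplicity; the study instead excludes boundary years (Section 3.2.3).
    """
    day_means = np.nanmean(T, axis=0)       # sample mean for each day t
    window = 2 * half_width + 1
    padded = np.concatenate([day_means[-half_width:], day_means,
                             day_means[:half_width]])
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")   # length 365
```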

In the following, details about the Fourier analysis of the means of daily temperature are provided. According to Fourier analysis, the mean of daily temperature can be approximated by the following function $\hat{\mu}_t$:

$$\hat{\mu}_t = \bar{\mu} + \sum_{k=1}^{K_{max}} A_k \cos\left(\frac{2\pi k t}{365} + \theta_k\right) \qquad (2)$$

where $\bar{\mu}$ is the overall mean, $A_k$ and $\theta_k$ are respectively the amplitude and phase of the $k$th harmonic, $K_{max}$ is the total number of harmonics and $t = 1, 2, \ldots, 365$ is the day of the year. The amplitude $A_k$ and the phase $\theta_k$ can be computed as functions of the Fourier coefficients $a_k$ and $b_k$, by using the following expressions:

$$A_k = \sqrt{a_k^2 + b_k^2}, \qquad \theta_k = \arctan\left(-\frac{b_k}{a_k}\right) \qquad (3)$$

where the Fourier coefficients can be, in turn, computed as:

$$a_k = \frac{2}{365}\sum_{t=1}^{365} \mu_t \cos\left(\frac{2\pi k t}{365}\right) \qquad (4)$$

$$b_k = \frac{2}{365}\sum_{t=1}^{365} \mu_t \sin\left(\frac{2\pi k t}{365}\right) \qquad (5)$$

with $\mu_t$ the sample mean of the daily values at day $t$.

Experience in using Fourier analysis for estimating the periodic parameters of hydrologic time series shows that, for short time-interval series such as daily and weekly series, a few harmonics (i.e. from 4 to 6) are generally enough for a good modelling of the periodicity (Salas et al., 1985).
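A possible implementation of Equations (2)-(5) is sketched below (the fit is computed directly from the 365 daily means; function and variable names are illustrative):

```python
import numpy as np

def fourier_fit(day_means, k_max):
    """Fit Equation (2) to the 365 daily means and return the fitted curve
    together with the coefficients a_k, b_k of Equations (4)-(5)."""
    t = np.arange(1, 366)
    mu_bar = day_means.mean()
    fitted = np.full(365, mu_bar)
    a, b = np.zeros(k_max), np.zeros(k_max)
    for k in range(1, k_max + 1):
        w = 2.0 * np.pi * k * t / 365.0
        a[k - 1] = (2.0 / 365.0) * np.sum(day_means * np.cos(w))
        b[k - 1] = (2.0 / 365.0) * np.sum(day_means * np.sin(w))
        # a_k cos(wt) + b_k sin(wt) is equivalent to A_k cos(wt + theta_k),
        # with A_k = sqrt(a_k^2 + b_k^2) and theta_k = atan2(-b_k, a_k).
        fitted += a[k - 1] * np.cos(w) + b[k - 1] * np.sin(w)
    return fitted, a, b
```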

Nonetheless, in order to choose the minimum number of harmonics, one can use the graphical cumulative periodogram test, namely the plot of the explained variance of the variable under consideration versus the number of harmonics (Yevjevich, 1972; Rossi, 1974). In particular, let us define the mean squared deviation (MSD) of $T_t$ around its mean $\bar{T}$ (equivalent to the definition of variance in statistical terms) as:

$$\mathrm{MSD}(T) = \frac{1}{365}\sum_{t=1}^{365}\left(T_t - \bar{T}\right)^2 \qquad (6)$$

Let us now consider the Fourier series estimate $\hat{T}_t$ of Equation (2) with harmonics $k = 1, 2, \ldots, K_{max}$ and the corresponding Fourier coefficients $a_k$ and $b_k$. The mean squared deviation of $\hat{T}_t$ around its mean is composed of the contributions $\mathrm{MSD}(k)$ of the individual harmonics, which are determined by:

$$\mathrm{MSD}(k) = \frac{a_k^2 + b_k^2}{2} \qquad (7)$$

with $a_k$ and $b_k$ obtained from Equations (4) and (5).

Furthermore, let us indicate by $P_k$ the ratio of the sum of the first $k$ mean squared deviations $\mathrm{MSD}(j)$ to the mean squared deviation $\mathrm{MSD}(T)$ of Equation (6), namely:

$$P_k = \frac{\sum_{j=1}^{k} \mathrm{MSD}(j)}{\mathrm{MSD}(T)} \qquad (8)$$

The plot of $P_k$ versus $k$ is called the cumulative periodogram.

A graphical criterion using the cumulative periodogram to obtain the number of significant harmonics is given below. The criterion is based on the fact that the plot of $P_k$ versus $k$ is composed of two distinct parts: a first part where a marked increase of $P_k$ with $k$ can be observed, and a second part where $P_k$ does not change significantly with $k$, approaching an asymptote. Both parts are approximated by smooth curves, intersecting at a point which corresponds to the number of significant harmonics (Salas et al., 1985).
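The quantities of Equations (6)-(8) can be computed as in the following sketch, after which the number of significant harmonics is read off the breakpoint of the plotted curve:

```python
import numpy as np

def cumulative_periodogram(day_means, k_max=30):
    """Explained-variance ratios P_k of Equation (8) for k = 1..k_max."""
    t = np.arange(1, 366)
    msd_total = np.mean((day_means - day_means.mean()) ** 2)   # Equation (6)
    p = np.zeros(k_max)
    cumulative = 0.0
    for k in range(1, k_max + 1):
        w = 2.0 * np.pi * k * t / 365.0
        a_k = (2.0 / 365.0) * np.sum(day_means * np.cos(w))
        b_k = (2.0 / 365.0) * np.sum(day_means * np.sin(w))
        cumulative += (a_k ** 2 + b_k ** 2) / 2.0              # Equation (7)
        p[k - 1] = cumulative / msd_total                      # Equation (8)
    return p
```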

A similar procedure can be applied to model the periodicity of the standard deviation. Once a sufficient number of harmonics to describe both the mean and the standard deviation has been chosen, it is possible to estimate the $100\cdot(1-\alpha)\%$ confidence intervals. The quality control is then carried out by verifying that current temperature data fall within the considered confidence interval. Data falling outside such an interval are flagged as ‘suspect’ and submitted to a manual control.

2.3. Confidence intervals based on the data of reference stations

The second step of the calibration phase of the proposed methodology consists of a comparison of the datum currently observed at the target station with the contemporaneous temperature data observed at the reference stations. A preliminary choice of the reference stations can be carried out on the basis of their characteristics (e.g. distance from the target station, difference in altitude, hillslope orientation) and of the length of the record overlapping the time series of the target station. Among the candidate stations, those with the highest median daily correlation coefficients are selected. In principle, one reference station is enough to compute confidence intervals. Nonetheless, for practical applications, a larger number (e.g. at least three or four stations) should be selected, in order to ensure that, in the case of temporary malfunctions or maintenance at one or two stations, a minimum of two reference stations can still be used. However, the derivation of reliable confidence intervals mainly depends on the statistical dependence between the data of the target station and the data of the reference stations, here expressed in terms of correlation coefficients, rather than on the number of selected reference stations.
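The selection criterion based on median daily correlations can be sketched as follows (assuming, as above, contemporaneous year-by-day arrays for the target and candidate stations):

```python
import numpy as np

def median_daily_correlation(target, candidate):
    """Median over the year of the day-by-day correlation coefficients.

    target, candidate : arrays of shape (n_years, 365) of contemporaneous data.
    For each day t, the correlation is computed across years.
    """
    corr = np.empty(365)
    for t in range(365):
        x, y = target[:, t], candidate[:, t]
        ok = ~(np.isnan(x) | np.isnan(y))   # use years where both stations report
        corr[t] = np.corrcoef(x[ok], y[ok])[0, 1]
    return np.median(corr)
```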

Let $T_{i,t}$ be the sample series of daily temperature values observed at the target station at year $i$, with $i = 1, \ldots, n$, and day $t$, and $T^{(j)}_{i,t}$, $j = 1, \ldots, m$, the contemporaneous samples observed at the $m$ reference stations. Assuming a multiple linear regression between the daily temperature values $T_{i,t}$ observed at the target station and the contemporaneous data $T^{(j)}_{i,t}$ observed at the $m$ reference stations, the daily value $T_{i,t}$ can be expressed as:

$$T_{i,t} = b_{0,t} + \sum_{j=1}^{m} b_{j,t}\, T^{(j)}_{i,t} + \varepsilon_{i,t} \qquad (9)$$

where $\varepsilon_{i,t}$ is independent and normally distributed with mean 0 and variance $\sigma_\varepsilon^2$.

By defining the $(n \times 1)$ vector $\mathbf{Y}_t$ of the target station data and the $(n \times (m+1))$ matrix $\mathbf{X}_t$ related to the reference stations as:

$$\mathbf{Y}_t = \begin{bmatrix} T_{1,t} \\ T_{2,t} \\ \vdots \\ T_{n,t} \end{bmatrix} \qquad (10)$$

$$\mathbf{X}_t = \begin{bmatrix} 1 & T^{(1)}_{1,t} & \cdots & T^{(m)}_{1,t} \\ 1 & T^{(1)}_{2,t} & \cdots & T^{(m)}_{2,t} \\ \vdots & \vdots & & \vdots \\ 1 & T^{(1)}_{n,t} & \cdots & T^{(m)}_{n,t} \end{bmatrix} \qquad (11)$$

the parameter vector $\mathbf{b}_t$ can be estimated through the matrix relation (Kottegoda and Rosso, 1997):

$$\mathbf{b}_t = \left(\mathbf{X}_t^{\mathsf{T}}\mathbf{X}_t\right)^{-1}\mathbf{X}_t^{\mathsf{T}}\mathbf{Y}_t \qquad (12)$$

and the confidence interval that contains the observation $T_{i,t}$ of the target station with fixed probability $100\cdot(1-\alpha)\%$ can be computed. In particular, denoting by $\mathbf{x}_{i,t} = [1, T^{(1)}_{i,t}, \ldots, T^{(m)}_{i,t}]$ the vector of the data observed at the reference stations at day $t$ of year $i$, and by $\hat{T}_{i,t} = \mathbf{x}_{i,t}\mathbf{b}_t$ the corresponding value estimated through the multiple linear regression, the limits of such an interval can be expressed in terms of the regression parameters and of the observed data through the following equation (Kottegoda and Rosso, 1997):

$$T_{i,t}^{\mathrm{low}} = \hat{T}_{i,t} + t_{n-m-1,\,\alpha/2}\, s_\varepsilon \sqrt{1 + \mathbf{x}_{i,t}\left(\mathbf{X}_t^{\mathsf{T}}\mathbf{X}_t\right)^{-1}\mathbf{x}_{i,t}^{\mathsf{T}}}, \qquad T_{i,t}^{\mathrm{up}} = \hat{T}_{i,t} - t_{n-m-1,\,\alpha/2}\, s_\varepsilon \sqrt{1 + \mathbf{x}_{i,t}\left(\mathbf{X}_t^{\mathsf{T}}\mathbf{X}_t\right)^{-1}\mathbf{x}_{i,t}^{\mathsf{T}}} \qquad (13)$$

where $s_\varepsilon$ is the standard deviation of the regression residuals, and $t_{n-m-1,\,\alpha/2}$ is the quantile of Student's $t$ variable with $n-m-1$ degrees of freedom, corresponding to a non-exceedance probability $\alpha/2$. Again, observed values falling outside such an interval can be considered potentially affected by errors, thus requiring further controls.
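Equations (9)-(13) translate into a few lines of linear algebra; a minimal sketch for a single day $t$ is given below (the positive Student's $t$ quantile used in the code is equivalent to $-t_{n-m-1,\,\alpha/2}$ of Equation (13)):

```python
import numpy as np
from scipy.stats import t as student_t

def regression_interval(X, y, x_new, alpha=0.01):
    """Prediction interval for the target-station value (Equations (9)-(13)).

    X     : (n, m+1) design matrix for day t, first column of ones (Equation (11)).
    y     : (n,) contemporaneous target-station values (Equation (10)).
    x_new : (m+1,) row [1, T(1), ..., T(m)] of current reference-station data.
    """
    n, p = X.shape                           # p = m + 1
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                    # Equation (12)
    resid = y - X @ b
    s_eps = np.sqrt(resid @ resid / (n - p))   # residual standard deviation
    half = student_t.ppf(1 - alpha / 2, df=n - p) * s_eps * np.sqrt(
        1.0 + x_new @ XtX_inv @ x_new)
    y_hat = x_new @ b
    return y_hat - half, y_hat + half        # Equation (13)
```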

2.4. Assessment of the performance of the quality control procedure for maximum and minimum daily temperature series

The performance of the proposed procedure can be evaluated by introducing known errors into a dataset assumed to be correct, and by verifying the ability of the procedure to validate correct data, as well as to detect errors in the data (Sciuto et al., 2009).

Once the quality control procedure has been applied, it may be of interest to assess the probability that a datum is actually correct, given that it has been validated, and, conversely, the probability that a datum is not correct, given that it has not been validated. Such probabilities can be expressed as:

$$P\left\{\text{datum correct} \mid \text{datum validated}\right\} \qquad (14)$$

$$P\left\{\text{datum not correct} \mid \text{datum not validated}\right\} \qquad (15)$$

Equations (14) and (15) represent a measure of the performance of the procedure. In particular, Equation (14) gives a measure of the degree of trust with which validated data can actually be considered correct, while Equation (15) provides an indication of the fraction of not validated data for which the manual verification is properly activated, the datum being actually not correct.

An analytical derivation of the above conditional probabilities is not feasible; therefore, the previous equations are evaluated in terms of the corresponding frequencies computed on the available data sample.
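In practice, the two conditional probabilities are estimated as sample frequencies; for instance (boolean arrays are assumed, with the correctness of each datum known because the errors are introduced artificially):

```python
import numpy as np

def classification_probabilities(validated, correct):
    """Sample frequencies estimating Equations (14) and (15).

    validated : boolean array, True where the procedure validated the datum.
    correct   : boolean array, True where the datum is actually correct
                (known here because the errors were introduced artificially).
    """
    p_correct_given_v = np.mean(correct[validated])    # Equation (14)
    p_wrong_given_nv = np.mean(~correct[~validated])   # Equation (15)
    return p_correct_given_v, p_wrong_given_nv
```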

In order to evaluate the performance of the procedure for the quality control of temperature datasets, in the present study errors corresponding to a fixed positive variation of temperature have been randomly added to the sample series, so as to simulate a scale shift of the thermometer. In particular, the error is added, each time, to the value observed at a randomly chosen day t of a given year i.

Different error amounts have been considered, and a sensitivity analysis of the quality control procedure with respect to the value of the error has been carried out. In particular, for each day $t$ an error $\Delta T_t$, expressed in terms of the corresponding standard deviation, is introduced, namely:

$$\Delta T_t = \delta\,\sigma_t \qquad (16)$$

where $\delta$ is a multiplicative coefficient of the standard deviation $\sigma_t$, ranging from 0 to 5.

More specifically, $\delta$ is first fixed; then a particular day $t$ of a given year $i$ is randomly chosen (a Matlab function has been used to randomly sample, without replacement, data points from the array of daily temperature series), and an error $\Delta T_t$, as in Equation (16), is added to the corresponding maximum and minimum daily temperature data. Finally, the quality control procedure is run and, based on the results, the probabilities expressed by Equations (14) and (15) are computed. The procedure is then repeated by randomly sampling other data to which errors are added, and so on.
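The study employed a Matlab routine for the random sampling; an equivalent sketch of the error-injection step in Python might read as follows (the function name and the fraction argument are illustrative):

```python
import numpy as np

def inject_errors(T, sigma_t, delta, fraction, rng=None):
    """Add scale-shift errors Delta_T = delta * sigma_t (Equation (16)) to a
    random subset of the data, sampled without replacement.

    T        : (n_years, 365) array assumed correct.
    sigma_t  : (365,) daily standard deviations.
    fraction : share of data points to corrupt (0..1).
    Returns the corrupted copy and a boolean mask of the corrupted positions.
    """
    rng = np.random.default_rng() if rng is None else rng
    flat_idx = rng.choice(T.size, size=int(fraction * T.size), replace=False)
    corrupted = np.zeros(T.size, dtype=bool)
    corrupted[flat_idx] = True
    corrupted = corrupted.reshape(T.shape)
    T_err = T.copy()
    T_err[corrupted] += delta * np.broadcast_to(sigma_t, T.shape)[corrupted]
    return T_err, corrupted
```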

3. Applications

3.1. Data description

The procedure has been applied to temperature data observed at selected thermometric stations of the monitoring network operated by the Water Observatory of the Regional Department for Water and Waste in Sicily (Italy). Such data had previously been subject to an accurate manual inspection by qualified operators of the Water Observatory, before being published in the annual reports. In particular, the quality controls traditionally adopted are mainly internal consistency and temporal coherency tests, such as those oriented to detect improbable values (e.g. too high or too low) among the observed data, to check whether daily maximum temperature data are lower than the contemporaneous minimum values, or to verify whether the sign and the amount of change in the observed values are consistent with those that might be expected for the specific time of observation, and so on.

The proposed procedure has been applied to several temperature stations both in Sicily and in Germany (Rossi et al., 2006; Sciuto, 2008). In this study, only the results related to four Sicilian stations, selected within four climatic sub-regions of the island, are presented.

Following previous studies oriented to verify the presence of non-homogeneities in the data, which showed only the presence of climatic trends in the temperature series of the Sicily region (Noto et al., 2007; Liuzzo et al., 2008), and according to an inspection of the corresponding metadata, which does not document significant changes in the station characteristics, a trend analysis has been carried out on the available series to ensure the removal of biases in the data.

It should be stressed that, although this simple approach may be suitable for the particular case studies considered here, application of the procedure to other cases could require a preliminary homogeneity analysis of the available time series, consisting of the identification of abrupt changes, data adjustment, and trend detection and removal.

In order to test the procedure, each series has been divided into two samples: the first, from the beginning of the observations up to the last 6 available years, is used for calibration; the second, including the last 6 years, is used for quality control. The latter sample has also been used to assess the performance of the procedure, by introducing known errors into a dataset assumed to be correct and by computing the probabilities of correctly classifying data as validated or not validated. Note that assuming the last 6 years as correct is reasonable in light of the thorough manual control carried out by the Water Observatory.

3.2. Estimation of confidence intervals based on the data of the target station

First, a trend analysis has been carried out on the available datasets. Trend analysis was originally performed both at the annual time scale and at shorter time scales (e.g. months or quarters). However, trend removal at short time scales can sometimes be rather difficult. In fact, there may be some physical process that causes a rapid switch from one mode of behaviour (e.g. increasing trend) to another (e.g. decreasing trend), which becomes evident at shorter time scales. In such a case, the overall behaviour might be described by a linear-step trend model, namely a linear trend up to a change-point, a step change at this point, followed by a second linear trend portion. Nonetheless, step trend analysis can produce contradictory results since, for instance, jumps can be identified in different years in the series of maximum and minimum temperature, even if they are collected at the same station (Wigley, 2006).

Therefore, keeping in mind that the focus of this study is on quality control rather than on complex homogenization exercises, we preferred to look for annual linear trends only, which, used appropriately, provide the simplest and most convenient way to describe the overall change over time in a dataset.

In particular, Student's t test for linear trend detection has been applied at a significance level α equal to 5%. The trend test results for the mean annual maximum and minimum temperature are reported in Table II, together with the slope β1 of the linear trend detected in each dataset. As all the observed test statistics t are considerably greater than the critical values t_{N−2, 1−α/2}, it can be concluded that all the considered series are affected by linear trends at α equal to 5%.

Table II. Student's t test values and trend slopes β1 (°C/year) of the mean annual maximum and minimum temperature datasets of the considered stations (significance level α = 5%)

| Station | Period of observation | No. of years | t_{N−2, 1−α/2} | t (Tmax) | β1 (Tmax) | t (Tmin) | β1 (Tmin) |
|---|---|---|---|---|---|---|---|
| Trapani | 1951–2004 | 49 | 2.01 | 3.51 | 0.0351 | 4.58 | 0.0485 |
| Mazara del Vallo | 1951–2004 | 41 | 2.02 | 3.68 | 0.0427 | 6.16 | 0.0601 |
| Acireale | 1951–2004 | 48 | 2.01 | 5.79 | 0.0517 | 4.98 | 0.0589 |
| Caltanissetta | 1951–2003 | 29 | 2.05 | 4.44 | 0.0427 | 4.04 | 0.0601 |

Then, the daily temperature residuals $Z_{i,t}$ have been derived as:

$$Z_{i,t} = T_{i,t} - \left(\beta_0 + \beta_1\, i\right) \qquad (17)$$

where $\beta_0$ is the intercept of the trend line, $\beta_1$ its slope, and $i$ the year index.
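A sketch of the detrending step is given below (assuming, consistently with Table II, that the trend is fitted to the mean annual series and then removed from the daily values):

```python
import numpy as np

def detrend_daily(T):
    """Residuals of Equation (17): remove the annual linear trend from each
    daily series. T has shape (n_years, 365); the regressor is the year index."""
    n_years = T.shape[0]
    years = np.arange(1, n_years + 1, dtype=float)
    annual_mean = np.nanmean(T, axis=1)                    # mean annual temperature
    beta1, beta0 = np.polyfit(years, annual_mean, deg=1)   # slope and intercept
    trend = beta0 + beta1 * years                          # trend line per year
    return T - trend[:, None]                              # Z_{i,t}
```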

Once the statistically significant trends have been removed from the original datasets, the hypothesis H0 of normality of the temperature residuals has been verified by means of four tests, namely: Chi-square, Kolmogorov, Filliben and Anderson-Darling. For a detailed description of these tests, the reader may refer, for example, to D'Agostino and Stephens (1986).
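Three of the four tests are readily available in scipy; a sketch is given below (the Kolmogorov test with fitted parameters and the Filliben threshold are approximations, and the Chi-square test, which requires a binning choice, is omitted for brevity):

```python
import numpy as np
from scipy import stats

def normality_checks(z, alpha=0.05):
    """Approximate versions of the normality tests applied to one daily series
    of residuals z. Returns a dict of booleans: True if H0 is not rejected."""
    z = z[~np.isnan(z)]
    zs = (z - z.mean()) / z.std(ddof=1)
    results = {}
    # Kolmogorov(-Smirnov) test against the fitted normal distribution.
    results["Kolmogorov"] = stats.kstest(zs, "norm").pvalue > alpha
    # Anderson-Darling test: compare the statistic with the 5% critical value.
    ad = stats.anderson(z, dist="norm")
    results["Anderson-Darling"] = ad.statistic < ad.critical_values[2]
    # Filliben: correlation between ordered data and normal order statistics.
    (_, _), (_, _, r) = stats.probplot(z, dist="norm", fit=True)
    results["Filliben"] = r > 0.99   # indicative threshold, sample-size dependent
    return results
```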

In Figure 2, the percentages of the 365 daily series of temperature residuals for which the hypothesis H0 of normality is not rejected at the 5% significance level are presented. It can be observed that, for all the considered stations, the majority of the series can be considered normally distributed.

Figure 2.

Results of the Chi-square, Kolmogorov, Filliben and Anderson-Darling normality tests applied to temperature residuals.

Then, the three methods previously described to model the periodicity of daily temperature (see Section 2.2.) have been applied and compared, also with regard to parameter parsimony. For the sake of brevity, the various steps of the procedure, along with the results, are illustrated in detail hereafter for the Trapani station only.

3.2.1. Method 1

Confidence intervals are estimated by using the mean and standard deviation of the daily temperature residuals, corresponding to each day of the year, as model parameters. In Figure 3, a comparison between the maximum and minimum temperature daily residuals for the last 6 years of observations and the corresponding confidence intervals is reported. As can be observed, most of the recorded data fall inside the 99% confidence intervals. Maximum and minimum temperature data falling outside the confidence intervals can be observed mainly in summer 1999 and 2000 for maximum temperature, and in spring 1999 and summer 2000 for minimum temperature.

Figure 3.

Residuals of daily maximum (a) and minimum (b) temperature observed in the last 6 years at Trapani station versus 99% confidence intervals by using method 1.

It is worth pointing out that such confidence intervals are strictly related to the extreme values observed in the past. Therefore, if a temperature value observed at day t is larger than most of the past values observed for that day during the calibration period, it is likely to be classified as an outlier. This is the case for some of the outliers detected at Trapani station. In fact, the means of the daily maximum temperature recorded during summer 1999 and summer 2000 are about 31 °C and 31.6 °C respectively, greater than the corresponding mean for the calibration period, which is about 29 °C.

3.2.2. Method 2

The second method models the means and standard deviations of daily residuals by means of Fourier analysis. Once the harmonic coefficients have been computed, the significant number of harmonics has been determined from the cumulative periodogram. Concerning the mean of the temperature residuals, the periodogram in Figure 4(a) shows that the first two harmonics are able to explain about 99% of the variance. On the other hand, Figure 4(b) shows that, in the case of the standard deviations, at least seven harmonics are needed, since the explained variance increases very slowly for a larger number of harmonics.

Figure 4.

Cumulative periodogram for the mean (a) and the standard deviation (b) of daily maximum temperature residuals at Trapani station.

In Table III, the Fourier coefficients and the corresponding explained variances are reported for the mean (first two harmonics) and the standard deviation (first seven harmonics) of the daily maximum and minimum temperature residuals.

Table III. Results of Fourier analysis for the mean and standard deviation of daily maximum and minimum temperature residuals at Trapani station (method 2)

|  | Harmonic | Mean (Tmax) | A_k | θ_k | P_k | Mean (Tmin) | A_k | θ_k | P_k |
|---|---|---|---|---|---|---|---|---|---|
| Mean | 1st | 0.948 | 29.553 | −0.523 | 0.986 | 1.208 | 23.791 | −0.635 | 0.982 |
|  | 2nd |  | 0.209 | −1.441 | 0.993 |  | 0.254 | −1.403 | 0.993 |
| Standard deviation | 1st | 2.672 | 0.017 | 0.266 | 0.174 | 2.539 | 0.036 | −0.182 | 0.337 |
|  | 2nd |  | 0.018 | 1.538 | 0.455 |  | 0.018 | 1.531 | 0.509 |
|  | 3rd |  | 0.004 | −0.046 | 0.486 |  | 0.001 | −0.086 | 0.520 |
|  | 4th |  | 0.001 | −0.775 | 0.500 |  | 0.004 | −0.247 | 0.559 |
|  | 5th |  | 0.001 | −1.211 | 0.504 |  | 0.001 | −1.134 | 0.570 |
|  | 6th |  | 0.003 | 0.451 | 0.528 |  | 0.003 | 0.430 | 0.602 |
|  | 7th |  | 0.005 | 0.220 | 0.570 |  | 0.004 | 0.257 | 0.639 |

Figure 5 shows good agreement between the daily mean and standard deviation of the maximum and minimum temperature residuals observed at Trapani station and the corresponding values derived from Fourier analysis, with the number of harmonics selected according to the above-mentioned periodogram.

Figure 5.

Mean and standard deviation of residuals of daily maximum (a, c) and minimum (b, d) temperature at Trapani station and corresponding Fourier series based on method 2.

Again, 99% confidence intervals have been computed and compared with the data observed during the last 6 years (Figure 6). Data falling outside the confidence intervals are located in the same periods identified with the first method, and coincide with periods characterized by a general temperature increase over the region.

Figure 6.

Residuals of daily maximum (a) and minimum (b) temperature observed in the last 6 years at Trapani station versus 99% confidence intervals by using method 2.

3.2.3. Method 3

According to this method, Fourier analysis is applied to mean values computed on a moving window of 11 days, centred on each considered day. In order to deal with data points at the boundary, confidence intervals for the first and the last 10 days of the series are calculated excluding the first and last year of the calibration datasets, respectively. In Table IV, the Fourier coefficients of the first two harmonics of the mean and of the first seven harmonics of the standard deviation of the daily maximum and minimum temperature residuals are reported for method 3.

Table IV. Results of Fourier analysis for the mean and standard deviation of daily maximum and minimum temperature residuals at Trapani station (method 3)

|  | Harmonic | Mean (Tmax) | A_k | θ_k | P_k | Mean (Tmin) | A_k | θ_k | P_k |
|---|---|---|---|---|---|---|---|---|---|
| Mean | 1st | 0.948 | 29.376 | −0.523 | 0.989 | 1.208 | 23.791 | −0.635 | 0.982 |
|  | 2nd |  | 0.204 | −1.441 | 0.996 |  | 0.254 | −1.403 | 0.993 |
| Standard deviation | 1st | 1.994 | 0.020 | 0.033 | 0.011 | 1.959 | 0.001 | 1.152 | 0.297 |
|  | 2nd |  | 0.024 | 1.419 | 0.510 |  | 0.025 | 1.567 | 0.668 |
|  | 3rd |  | 0.001 | 0.021 | 0.604 |  | 0.005 | 0.161 | 0.678 |
|  | 4th |  | 0.006 | −0.196 | 0.642 |  | 0.002 | 0.578 | 0.765 |
|  | 5th |  | 0.003 | −1.041 | 0.652 |  | 0.001 | −0.883 | 0.812 |
|  | 6th |  | 0.003 | 0.605 | 0.747 |  | 0.005 | 0.206 | 0.858 |
|  | 7th |  | 0.005 | 0.605 | 0.894 |  | 0.007 | −0.084 | 0.938 |

In this case, the mean and standard deviation variability shown in Figure 5 is smoothed, as can be observed in Figure 7.

Figure 7.

Mean and standard deviation of residuals of daily maximum (a, c) and minimum (b, d) temperature at Trapani station and corresponding Fourier series based on method 3.

From the comparison between the daily data of the last 6 years and the confidence intervals corresponding to the latter method (Figure 8), it can be observed that the number of data points falling outside the confidence intervals is greater than that obtained by applying the previous methods.

Figure 8.

Residuals of daily maximum (a) and minimum (b) temperature observed in the last 6 years at Trapani station versus 99% confidence intervals by using method 3.

In Table V, the number and percentage of values from the last 6 years falling outside the confidence intervals, computed on the temperature data observed in the previous period, are reported for each method and for all the considered stations. From the comparison of the three methods, method 2 appears preferable, as it leads to a number of outliers lower than that detected by method 3 and only slightly higher than that detected by method 1. In addition, this method requires a substantially lower number of parameters, namely 20 parameters for each temperature series (five for the mean, i.e. two harmonics plus a constant, and 15 for the standard deviation, i.e. seven harmonics plus a constant), as opposed to the 730 parameters (365 × 2) required by method 1.

Table V. Number and percentage of values lying outside 99% confidence intervals based on data observed at the target station, according to the three methods adopted to model periodicity

| Method | No. of parameters for mean | No. of parameters for standard deviation | Station | Values outside interval (Tmax) | % (Tmax) | Values outside interval (Tmin) | % (Tmin) |
|---|---|---|---|---|---|---|---|
| 1 | 365 | 365 | Trapani | 299 | 1.67 | 228 | 1.27 |
|  |  |  | Mazara del Vallo | 219 | 1.46 | 248 | 1.66 |
|  |  |  | Acireale | 214 | 1.22 | 142 | 0.81 |
|  |  |  | Caltanissetta | 73 | 0.69 | 85 | 0.80 |
| 2 | 5 | 15 | Trapani | 322 | 1.80 | 250 | 1.40 |
|  |  |  | Mazara del Vallo | 253 | 1.69 | 274 | 1.83 |
|  |  |  | Acireale | 263 | 1.50 | 190 | 1.08 |
|  |  |  | Caltanissetta | 124 | 1.17 | 117 | 1.11 |
| 3 | 5 | 15 | Trapani | 552 | 3.09 | 240 | 1.34 |
|  |  |  | Mazara del Vallo | 247 | 1.65 | 510 | 3.41 |
|  |  |  | Acireale | 422 | 2.41 | 182 | 1.04 |
|  |  |  | Caltanissetta | 227 | 2.14 | 117 | 1.11 |

3.3. Estimation of confidence interval based on the data of the reference stations

This step of the quality control procedure requires the preliminary identification of the reference stations. For this purpose, a maximum distance Δl = 60–70 km and a maximum difference in altitude ΔH = ±300 m between the target station and the candidate reference stations have been considered. Among the stations matching these criteria, only those presenting the greatest median values of the daily correlation coefficients with the target station, estimated for each day, have been selected. As an example, the results for Trapani station are reported in Table VI. From the table, it can be inferred that Partinico and Mazara del Vallo should be chosen as reference stations for Trapani, for both daily maximum and minimum temperature.

Table VI. Minimum, maximum and median values of daily correlation coefficients between temperature series at Trapani and nearby stations

| Station | Median (Tmax) | Max (Tmax) | Min (Tmax) | Median (Tmin) | Max (Tmin) | Min (Tmin) |
|---|---|---|---|---|---|---|
| Partinico | 0.58 | 0.84 | 0.19 | 0.51 | 0.76 | −0.03 |
| Marsala | 0.45 | 0.76 | 0.02 | 0.43 | 0.77 | 0.01 |
| Mazara del Vallo | 0.51 | 0.83 | −0.02 | 0.61 | 0.85 | 0.31 |
| Castelvetrano | 0.46 | 0.77 | −0.09 | 0.38 | 0.74 | −0.15 |

After trend removal from the observed series, confidence intervals have been estimated by means of multiple regressions calculated on the basis of the reference station data. In Figure 9, the daily maximum (a) and minimum (b) temperature residual series observed at Trapani station in the last 6 years are compared with the confidence intervals estimated from the Partinico and Mazara del Vallo station series. In both cases, 99% confidence intervals are computed. Figure 9 shows that the number of values falling outside the confidence intervals is smaller than in Figure 8, where the confidence intervals are estimated only from the data of the target station.

Figure 9.

Residuals of daily maximum (a) and minimum (b) temperature observed in the last 6 years at Trapani station versus 99% confidence intervals by using multiple linear regressions based on the data of Partinico and Mazara del Vallo reference stations.

Finally, the results of the quality control carried out by means of the confidence intervals based on the data of the target station (using method 2) and of the confidence intervals based on the data of the reference stations, applied to the daily maximum and minimum temperature data recorded in the last 6 years of the observation period of each station, are summarized in Table VII.

Table VII. Results of the quality control of daily maximum and minimum temperature observed at the stations of Trapani, Mazara del Vallo, Acireale and Caltanissetta in the last 6 years

|  |  | Trapani (no.) | Trapani (%) | Mazara del Vallo (no.) | Mazara del Vallo (%) | Acireale (no.) | Acireale (%) | Caltanissetta (no.) | Caltanissetta (%) |
|---|---|---|---|---|---|---|---|---|---|
| Tmax | Validated | 1940 | 88.6 | 1942 | 88.7 | 2024 | 92.4 | 2052 | 93.7 |
|  | Suspect (NV) | 242 | 11.1 | 221 | 10.1 | 165 | 7.5 | 138 | 6.3 |
|  | Highly suspect (NV) | 8 | 0.4 | 27 | 1.2 | 1 | 0.05 | 1 | 0.05 |
| Tmin | Validated | 1961 | 89.5 | 1808 | 82.6 | 2080 | 95.0 | 2063 | 94.2 |
|  | Suspect (NV) | 222 | 10.1 | 332 | 15.2 | 109 | 5.0 | 126 | 5.8 |
|  | Highly suspect (NV) | 7 | 0.3 | 50 | 2.3 | 1 | 0.05 | 1 | 0.05 |

The application of the proposed procedure to the four stations operated by the Water Observatory in Sicily (Italy) has led to the validation of 88.6–93.7% of the maximum daily temperature data and of 82.6–95.0% of the minimum daily temperature data.

3.4. Analysis of the performance of the quality control procedure

The performance of the procedure is verified by simulating its application to data affected by errors. In particular, the last 6 years of the original dataset were modified by introducing known errors expressed as functions of the daily standard deviation, according to Equation (16), so that they differ from day to day.

In Figure 10, the percentage of errors detected by the quality control procedure is plotted versus the error parameter δ for Trapani station. Note that in this example the errors have been added to 100% of the data, with the only exception of the case δ ≅ 0, when no error is introduced. From the figure it can be inferred that the percentage of validated data when δ ≅ 0 is about 90% for both minimum and maximum daily temperature residuals. On the upper axis of the figure, the range of variability of the introduced error is shown. As the amount of error increases, the procedure detects an increasing percentage of errors; in particular, for δ ≅ 2, about 50% of the errors are detected.

Figure 10.

Performance of the quality control procedure for daily maximum (a) and minimum (b) temperature series at Trapani station versus the error parameter δ.

Moreover, in order to compare the performance of the quality control procedure based on the three different methods of estimating confidence intervals from the target station data, the probabilities of a datum being correct given that it is validated (Equation (14)), or not correct given that it is not validated (Equation (15)), are shown as functions of the percentage of erroneous data, for fixed δ ≅ 2. The results obtained for daily maximum temperature are reported in Figure 11, showing almost the same behaviour for the three methods of estimating confidence intervals. In particular, as the percentage of errors in the series increases, the probability that not validated data are actually not correct (Equation (15)) increases, while the probability that validated data are correct (Equation (14)) decreases. Furthermore, from the figure it can be observed that, when the percentage of errors in the dataset is about 35%, the two probabilities coincide, both being approximately equal to 80%. Similar results are obtained for daily minimum temperature.

Figure 11.

Performance of the quality control procedure for the three different methods to estimate confidence intervals versus the percentage of the errors introduced in daily maximum temperature series.

4. Conclusions

A procedure for the automatic quality control of daily maximum and minimum temperature data from a hydrometeorological monitoring network has been presented. The procedure is able to automatically flag observed data as ‘validated’ or ‘not validated’, thus significantly reducing the amount of data requiring a manual control.

The proposed procedure is based on a comparison between the observed data and confidence intervals of fixed probability. A first comparison is carried out between the observed value and a confidence interval estimated from the historic dataset of the analysed station, whose daily mean and standard deviation are modelled through a Fourier series expansion (in order to take annual periodicity into account), limited to a number of harmonics able to explain a high percentage of the variance. A further control is then carried out by means of confidence intervals based on multiple linear regressions between the temperature data of the target station and the contemporaneous data observed at properly selected reference stations.

As a consequence of the application of the proposed procedure to four stations operated by the Water Observatory in Sicily (Italy), more than 88% of the maximum temperature data and more than 82% of the minimum temperature data have been validated automatically. Thus, the number of observations requiring further manual controls is considerably reduced.

Finally, the performance of the proposed procedure has been evaluated by introducing known errors into the temperature datasets, assumed to be correct, and by computing the probabilities of correctly flagging data. In particular, the probability of a datum being correct, once validated, measures the reliability of validated data, while the probability of a datum being not correct, once not validated, measures the ability of the procedure to detect erroneous data. From the results, it can be concluded that, as the percentage of errors in the series increases, the probability that not validated data are not correct increases, while the probability that validated data are correct decreases.

The proposed quality control procedure for daily temperature data appears rather sensitive to the percentage of errors in the available dataset. For instance, if the percentage of errors is larger than about 35%, more than 80% of the not validated data are actually not correct, whereas the probability that a datum is correct, given that it has been validated, is always less than 80%, revealing a lower capability of the procedure to recognize correct data.

Ongoing research is oriented to improve the procedure for the estimation of confidence intervals, by making use of probability distributions conditioned on the observed values.

Acknowledgements

The financial support of the European projects SEDEMED and SEDEMED II, programme INTERREG III B MEDOCC, and of the national project MIUR-PRIN 2008, ‘Water resources assessment and management under climate change scenarios’, is gratefully acknowledged.
