Probabilistic quality control of daily temperature data



A procedure for the quality control of daily temperature data, designed to automatically identify potential anomalies, is presented. The procedure verifies whether each observed datum lies within two confidence intervals of fixed probability: the first is derived from the probability distribution fitted to the historical dataset of the station under consideration, and the second is derived by multiple linear regression on contemporaneous data observed at selected reference stations. The procedure is applied to daily maximum and minimum temperature data observed from 1950 to 2004 at four thermometric stations in Sicily (Italy), selected from the meteorological monitoring network operated by the Water Observatory of Sicily region. The applications show that more than 80% of daily temperature data are automatically validated. In addition, the performance of the procedure is assessed by introducing known errors into temperature datasets assumed to be correct and by estimating the probabilities of correctly classifying data as validated or not validated. The results indicate that, as the percentage of errors increases, the probability that not validated data are incorrect increases, while the probability that validated data are correct decreases. In particular, if the percentage of errors exceeds about 35%, the procedure is less able to recognize correct data than incorrect data. Copyright © 2012 Royal Meteorological Society
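
As an illustration only, the two interval checks described above might be sketched as follows. The sketch rests on assumptions that the abstract does not state: a normal distribution is fitted to the historical data, the regression interval is approximated from the residual standard deviation alone, and a datum is treated as validated only when it lies inside both intervals. The function names (`historical_interval`, `regression_interval`, `is_validated`) and the 0.95 probability level are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def historical_interval(history, prob=0.95):
    """Two-sided interval of probability `prob` from a normal fit to past data.

    Assumption: a normal distribution; the abstract does not name the
    distribution actually fitted to the historical dataset.
    """
    fitted = NormalDist.from_samples([float(v) for v in history])
    tail = (1.0 - prob) / 2.0
    return fitted.inv_cdf(tail), fitted.inv_cdf(1.0 - tail)


def regression_interval(ref_history, target_history, ref_today, prob=0.95):
    """Approximate interval for today's value, regressed on reference stations.

    `ref_history` has one row per day and one column per reference station.
    Assumption: the interval half-width uses only the residual standard
    deviation and ignores the uncertainty of the fitted coefficients.
    """
    X = np.column_stack([np.ones(len(ref_history)), ref_history])
    y = np.asarray(target_history, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares
    residuals = y - X @ coef
    dof = X.shape[0] - X.shape[1]  # observations minus fitted parameters
    sigma = float(np.sqrt(residuals @ residuals / dof))
    estimate = float(coef[0] + np.dot(coef[1:], ref_today))
    z = NormalDist().inv_cdf(1.0 - (1.0 - prob) / 2.0)
    return estimate - z * sigma, estimate + z * sigma


def is_validated(value, intervals):
    """Assumed rule: validated only if the value lies inside every interval."""
    return all(low <= value <= high for low, high in intervals)


# Synthetic data for demonstration only (not the Sicilian station records).
rng = np.random.default_rng(0)
ref = 20.0 + 5.0 * rng.standard_normal((200, 2))
target = 1.0 + 0.6 * ref[:, 0] + 0.4 * ref[:, 1] + 0.5 * rng.standard_normal(200)

intervals = [
    historical_interval(target),
    regression_interval(ref, target, ref_today=[20.0, 20.0]),
]
for observed in (21.0, 25.0):
    print(observed, "validated" if is_validated(observed, intervals) else "not validated")
```

With the synthetic data above, a value close to the regression estimate passes both checks, whereas a value several degrees higher can still fall inside the wider historical interval while failing the regression check. This shows why the reference stations add information beyond the station's own record.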