Nonparametric algorithm for identification of outliers in environmental data
Abstract
Outliers that can significantly affect data analysis are frequently present in environmental data sets. Most methods proposed for outlier detection impose restrictions on the distribution of the analysed variables. However, in many environmental applications the observed variable is influenced by many different factors, and its distribution is often difficult to identify or cannot be estimated. Therefore, an approach for the identification of outliers in environmental time series based on nonparametric statistical techniques is presented. The core principle of the algorithm is to smooth the data using nonparametric regression with variable bandwidth and subsequently to analyse the residuals by nonparametric statistical methods. For the case in which the analysed variable is normally distributed, an efficient statistical method based on the normality assumption is presented as well. The proposed procedure is applied to the identification of outliers in hourly concentrations of particulate matter and verified by simulations.
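The two-stage principle described above (smooth first, then examine the residuals without distributional assumptions) can be illustrated by a minimal sketch. It is not the authors' algorithm: the abstract does not specify the smoother, the bandwidth selection or the residual test, so the choices below are assumptions made only for illustration. The sketch assumes a Nadaraya-Watson kernel smoother with a Gaussian kernel, a variable bandwidth set to the distance to the k-th nearest observation in time, and an interquartile-range rule on the residuals. The function names `smooth_variable_bandwidth` and `flag_outliers` and the parameters `k` and `c` are hypothetical.

```python
import numpy as np


def smooth_variable_bandwidth(t, y, k=15):
    """Nadaraya-Watson estimate with a Gaussian kernel.

    Assumption for this sketch: the bandwidth at each time point is the
    distance to its k-th nearest neighbour in time, so it widens where
    observations are sparse (for example at the ends of the series).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    dist = np.abs(t[:, None] - t[None, :])   # pairwise time distances
    h = np.sort(dist, axis=1)[:, k]          # column 0 is the point itself
    w = np.exp(-0.5 * (dist / h[:, None]) ** 2)
    return (w @ y) / w.sum(axis=1)


def flag_outliers(t, y, k=15, c=3.0):
    """Flag observations whose residual lies outside the IQR fences.

    Assumption for this sketch: residuals beyond Q1 - c*IQR or
    Q3 + c*IQR are treated as outliers; no normality is assumed.
    """
    y = np.asarray(y, dtype=float)
    resid = y - smooth_variable_bandwidth(t, y, k)
    q1, q3 = np.percentile(resid, [25, 75])
    iqr = q3 - q1
    return (resid < q1 - c * iqr) | (resid > q3 + c * iqr)
```

Calling `flag_outliers(t, y)` on an hourly series returns a boolean mask of the same length as `y`. Quantile-based fences are used here because they do not require the residuals to follow any particular distribution. The values of `k` and `c` are arbitrary defaults and would need tuning for real measurements.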