Volume 32, Issue 5
RESEARCH ARTICLE

Nonparametric algorithm for identification of outliers in environmental data

Martina Čampulová

Corresponding Author

E-mail address: martina.campulova@mendelu.cz

E-mail address: martina0126@seznam.cz

Faculty of Military Leadership, Department of Econometrics, University of Defence, Kounicova 65, 662 10 Brno, Czech Republic

Faculty of Business and Economics, Department of Statistics and Operation Analysis, Mendel University in Brno, Zemědělská 1, 61300 Brno, Czech Republic

Correspondence

Martina Čampulová, Faculty of Business and Economics, Department of Statistics and Operation Analysis, Mendel University in Brno, Zemědělská 1, 61300 Brno, Czech Republic.

Jaroslav Michálek

Faculty of Military Leadership, Department of Econometrics, University of Defence, Kounicova 65, 662 10 Brno, Czech Republic

Pavel Mikuška

Czech Academy of Sciences, Institute of Analytical Chemistry, v. v. i., Veveří 97, 602 00 Brno, Czech Republic

Drago Bokal

Faculty of Natural Sciences and Mathematics, Department of Mathematics and Computer Science, University of Maribor, Koroška cesta 160, 2000 Maribor, Slovenia

First published: 25 January 2018
Citations: 2

Abstract

Outliers that can significantly affect data analysis are frequently present in environmental data sets. Most methods proposed for outlier detection impose restrictions on the distribution of the analysed variables. However, in many environmental applications the observed variable is influenced by many different factors, and its distribution is often difficult to identify or cannot be estimated. Therefore, an approach for identifying outliers in environmental time series based on nonparametric statistical techniques is presented. The core principle of the algorithm is to smooth the data using nonparametric regression with variable bandwidth and then to analyse the residuals using nonparametric statistical methods. For the case in which the analysed variable is normally distributed, an efficient statistical method based on normality assumptions is presented as well. The proposed procedure is applied to the identification of outliers in hourly concentrations of particulate matter and is verified by simulations.
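
The smooth-then-inspect-residuals principle described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the smoother, the bandwidth rule, or the residual test, so everything below is an assumption chosen for illustration. The sketch uses a Nadaraya-Watson kernel smoother with a k-nearest-neighbour bandwidth (one simple form of variable bandwidth) and an interquartile-range fence on the residuals. The function names `nw_smooth` and `flag_outliers`, the parameters `k` and `c`, and the synthetic series are all hypothetical.

```python
import numpy as np


def nw_smooth(t, y, k=7):
    """Nadaraya-Watson smoother with a Gaussian kernel.

    Assumption: the bandwidth at each point is the distance to its
    k-th nearest neighbour in time (requires k < len(t) and distinct t).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    dist = np.abs(t[:, None] - t[None, :])   # pairwise time distances
    h = np.sort(dist, axis=1)[:, k]          # local bandwidth per point
    w = np.exp(-0.5 * (dist / h[:, None]) ** 2)
    return (w @ y) / w.sum(axis=1)           # kernel-weighted local mean


def flag_outliers(t, y, k=7, c=3.0):
    """Flag points whose residual lies outside an IQR fence.

    Assumption: the fence [Q1 - c*IQR, Q3 + c*IQR] stands in for the
    nonparametric residual analysis, which the abstract leaves open.
    """
    y = np.asarray(y, dtype=float)
    fit = nw_smooth(t, y, k)
    resid = y - fit
    q1, q3 = np.percentile(resid, [25, 75])
    iqr = q3 - q1
    flags = (resid < q1 - c * iqr) | (resid > q3 + c * iqr)
    return flags, fit


# Synthetic hourly-style series (illustrative data, not from the paper):
# a slow trend plus noise, with one injected spike at index 100.
rng = np.random.default_rng(0)
t = np.arange(200)
y = np.sin(t / 20.0) + rng.normal(scale=0.1, size=t.size)
y[100] += 5.0

flags, fit = flag_outliers(t, y)
print("flagged indices:", np.flatnonzero(flags))
```

A kernel mean of this kind is biased near the ends of the series and is pulled toward large spikes, so points at the boundaries or next to a true outlier may occasionally be flagged as well; a local linear fit or a robust smoother would be a natural refinement.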
