Image processing and machine learning-based classification method for hyperspectral images

Hyperspectral images are used to produce effective solutions in many areas due to the amount of information they contain. However, this wealth of information does not always bring an advantage; sometimes it causes confusion. In this study, we propose a machine learning-based classification method on reduced bands using median and mean filters. The proposed method is tested on the Indian Pines, Pavia University, and Salinas datasets. After applying normalization, a median filter, and mathematical morphology to each band of an image, a feature matrix with one column per band is obtained. Then, the most significant features are selected using the relief feature selection algorithm. The selected features are used for classification with support vector machine (SVM) and K-nearest neighbour (KNN) classifiers. For all methods and datasets, a success rate above 99% in terms of accuracy is achieved. The proposed method has two significant contributions. First, when the proposed method is compared with similar studies in the literature, it is clearly seen that the features selected by the relief algorithm significantly increase the success rate of the classification algorithms. Second, by reducing the number of hyperspectral bands, the relief algorithm yields faster methods.


Background
Hyperspectral imaging is a new generation of remote sensing technology that has been developing rapidly in recent years. Unlike multispectral optical detection systems, hyperspectral technology contains hundreds of bands. This imaging method is based on passive optical remote sensing: the method of utilizing the reflection of rays coming from the sun off objects on the surface is called the passive detection method. The optical region of the electromagnetic spectrum used in optical remote sensing covers wavelengths between 0.4 and 14 µm. The visible light region is between 0.4 and 0.7 µm, near infrared (NIR) between 0.7 and 1.5 µm, short-wavelength infrared between 1.5 and 3 µm, medium-wavelength infrared between 3 and 5 µm, and long-wavelength infrared between 5 and 14 µm; the optical region is thus divided into five sub-regions in total. Generally, multispectral images contain between 4 and 7 discrete bands, and each bandwidth is between 300 and 400 nm. Hyperspectral images consist of hundreds of bands with bandwidths ranging from 10 to 20 nm. Hyperspectral images are generally obtained with satellites, aircraft equipped with special sensors, or portable surface sensors. In hyperspectral images, a continuous spectrum graph is drawn for each image cell.

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2021 The Authors. The Journal of Engineering published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology.

Motivation
Today, hyperspectral imaging technology is used in a wide range of fields such as geology, medicine, chemistry, agriculture, forestry, space exploration, biomedicine, and defence. Recently, the processing of hyperspectral images has been increasing in remote sensing. In agriculture, hyperspectral images are used for many tasks such as classification of product types, analysis of vegetation, perception of crop and soil conditions, planning of soil structure, assessment of product damage due to environmental factors, and analysis of product quality. Apart from agricultural analysis, they are also used in many special areas such as the determination of food quality. As hyperspectral imaging technologies develop, the number of studies in the field of hyperspectral image processing has also increased. The fact that these images contain many bands causes some difficulties in image processing. Recently, deep learning-based methods have been developed to overcome these difficulties. However, running deep learning models needs high computing power, which means high cost and energy consumption. Our motivation in this study is to propose a classification method for hyperspectral datasets that can run on computers with low processing power. When studies in the literature are analysed, no study is found that achieves more than 95% accuracy using the SVM and KNN algorithms. In this study, approximately 99% accuracy is achieved using the SVM and KNN algorithms with the help of pre-processing steps.

Literature review
There are many studies in the literature on multispectral and hyperspectral images [1][2][3]. Seniha et al. proposed two methods for detecting shadow areas in hyperspectral data [4]. In the first method, LiDAR sensor data and hyperspectral data are used together; they developed a line-of-sight algorithm that takes into account the sun's arrival angles and the height of the environment. The second method detects shadow areas using only hyperspectral images. They then examined the effects of the detected shadows on real data. Today, image processing methods are developed for agricultural fields and agricultural products using RGB cameras [5][6][7]. Studies such as the diagnosis of agricultural diseases using the Internet of Things and unmanned aerial vehicles [5], the detection of weeds in bird's-eye views of agricultural fields [6], and measuring the quality of collected products with image processing techniques [7] are some of them. However, RGB cameras may not be sufficient for capturing all useful information in a scene. For this reason, multispectral and hyperspectral cameras are used. Hyperspectral cameras are widely used in agricultural areas as well as in many other areas [8,9]. Zang et al. classified hyperspectral images using original spectral features, extended multi-attribute profiles (EMAPs), and a hybrid of original spectral features and abundance information. In their study, success rates of 80.46%, 94.27%, and 88.87% are achieved on the Indian Pines dataset using original spectral features, EMAPs, and hybrid features, respectively [10]. Harikiran et al. used PCA and bidimensional empirical mode decomposition (BEMD) techniques for feature extraction, and deep residual networks (ResNet) for classification. It is reported that the PCA-BEMD-ResNet model achieves 99.30% accuracy for the Indian Pines dataset [11]. Okwuashi et al. used a method based on a deep support vector machine (DSVM) for the Indian Pines and University of Pavia datasets.
DSVM is compared with some other known classification methods on the Indian Pines dataset [12].
Chu et al. proposed a method using three-dimensional convolutional neural networks (3D CNNs) with spectral partitioning. It is reported that using spatial-spectral features extracted with the 3D CNN contributes to the smooth operation of the system with small data. The method, which achieves 98.0% classification performance on the Indian Pines dataset, needs high computational power because it contains many 3D CNN layers [13]. Pathak and Kalita used features derived from spectral and spatial information to classify hyperspectral images with SVM. The method achieved 91.37% accuracy for the Indian Pines dataset [14]. When the methods in the literature are analysed, it is seen that the studies generally have low success rates, while the studies with high success rates require high computing power. In this study, we aim for a method that requires less computing power and achieves a higher success rate. The datasets, methods, and success criteria used in the studies in the literature are summarized in Table 1.

Our method
In this study, SVM- and KNN-based methods are proposed using the Indian Pines, Pavia, and Salinas datasets. In the proposed method, each band of the hyperspectral datasets is converted into a grayscale (0-255) image. To increase classification success, image pre-processing steps are applied to the grayscale images. First, a median filter is applied to make nearby pixels in the image more regular. Then, to make the values of pixels with the same class label converge to each other, mathematical morphology and a mean filter are applied. After the pre-processing steps, the 3D hyperspectral matrix is converted to 2D, and the most meaningful features are selected with the help of the relief algorithm. Finally, the most significant features are used in the classification of segments using the SVM and KNN algorithms.

Contributions
The advantages of the proposed method are given below.
• In the literature, the success of hyperspectral image classification using the KNN and SVM algorithms is lower than that of CNN-based algorithms. In this study, better results are obtained with the KNN and SVM algorithms by applying the pre-processing steps and the relief algorithm.
• Using the relief algorithm, only the most meaningful bands are selected instead of using all bands. Thus, the computational complexity of the proposed method is reduced.
• In the literature, deep learning architectures and computers with high computational power are used for the classification of hyperspectral images, while in this study, better results are obtained with lightweight computing methods and a laptop with low computational power.

Study outline
This study continues as follows. In the second part, the datasets used are detailed: the number of spectral bands, the way they were obtained, and the number of classes for each dataset are explained. In the third part, the applied image pre-processing steps, the relief feature selection algorithm, and the SVM and KNN classification algorithms are clarified. In the fourth section, the results obtained from the proposed method are given and compared with the studies in the literature.

MATERIALS
In this study, three different datasets were used: the Indian Pines, Pavia University, and Salinas datasets (Figure 1).

Pavia University dataset
This dataset was obtained by capturing hyperspectral images of the University of Pavia in Pavia, Italy, using the ROSIS-03 (Reflective Optics System Image Spectrometer) hyperspectral sensor. The hyperspectral bands of the Pavia University dataset have a spectral range of 430 to 860 nm, and each data cube has a size of 610 × 340 pixels with 103 bands.

Salinas dataset
The Salinas dataset was created by capturing hyperspectral images of the Salinas Valley in California with the AVIRIS hyperspectral sensor. This hyperspectral dataset consists of examples with 512 × 217 size and 204 bands [17,18]. Since the spatial resolution of the dataset is about 3.7 meters, it is a high-resolution dataset with 16 different labelled classes. The artificial RGB image, reference image, and labelled data types of the Salinas dataset are given in Figure 3.

THE PROPOSED METHOD
In this study, a method providing high classification success for three different hyperspectral datasets is proposed. The method consists of obtaining images from the hyperspectral dataset, image pre-processing, feature extraction, feature selection, and classification steps. The block diagram of the proposed method is given in Figure 4.

Obtaining images from the hyperspectral dataset
Hyperspectral images contain wavelengths between 0.4 and 14 µm, called the optical region of the electromagnetic spectrum. Among these wavelengths, hyperspectral datasets were created by taking remote sensing data from many different bands. First, each band in the dataset is normalized between 0 and 255 using (1):

X_new = ((X − X_min) / (X_max − X_min)) × 255   (1)

where X is the current pixel value, and X_min and X_max are the smallest and largest pixel values in the band. The normalization process is applied to each band of each image in the dataset. This process results in a new dataset with values in the range 0-255.
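The band-by-band min-max normalization described above can be sketched as follows; the function names and the handling of constant bands are our own choices, not part of the original method.

```python
import numpy as np

def normalize_band(band):
    """Min-max normalize one hyperspectral band to the 0-255 grayscale range, as in (1)."""
    x_min, x_max = band.min(), band.max()
    # Guard against a constant band, where x_max == x_min would divide by zero (assumption).
    if x_max == x_min:
        return np.zeros_like(band, dtype=np.uint8)
    scaled = (band - x_min) / (x_max - x_min) * 255.0
    return scaled.astype(np.uint8)

def normalize_cube(cube):
    """Apply the normalization band by band to an (H, W, K) hyperspectral cube."""
    return np.stack([normalize_band(cube[:, :, k]) for k in range(cube.shape[2])], axis=2)
```

Each band is scaled independently, so every grayscale band image spans the full 0-255 range regardless of the raw sensor values.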

Image pre-processing
After converting the hyperspectral images to grayscale images, image pre-processing steps are applied. First of all, a median filter, which is widely used for noise removal, is applied to the grayscale images to obtain more regularly distributed images. The median filter is a non-linear filter; besides removing noise, it helps to suppress small pixel-value changes within a neighbourhood. This paper uses it to equalize pixels with close values. To apply the median filter, a mask with a 3 × 3 window size is shifted over the image. The pixel values covered by this mask are sorted, and the middle value is assigned to the centre pixel. In Figure 5, the application of the median filter with a 3 × 3 window size is given. The process visualized in Figure 5 is applied to each grayscale band image obtained from the hyperspectral dataset. Then, one of the most important operations of mathematical morphology, dilation, is applied to the image. Dilation is a morphological process that enlarges or thickens objects [19][20][21]. The purpose of the dilation process is to fill the holes and gaps formed on the image due to noise and to soften the corner points. In this study, a 3 × 3 structural element in which all pixels are 1 is used. After the dilation process, a mean filter is applied to the image. The aim of applying the mean filter is to reduce the amount of change between one pixel and the other pixels in the image. In the mean filter, first, the window size is determined; then, the average of the values within the window is calculated; finally, the calculated mean value is assigned to the centre pixel. For this convolutional method, we choose a mask with a 5 × 5 window size. In Figure 6, the mean filter application is shown with a 5 × 5 window size.
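The three pre-processing steps above (3 × 3 median filter, grayscale dilation with a 3 × 3 all-ones structuring element, and 5 × 5 mean filter) can be sketched with plain numpy sliding windows; the edge-padding mode is an assumption on our part, since the paper does not specify border handling.

```python
import numpy as np

def _window_reduce(img, size, reduce_fn):
    """Slide a size x size window over img (with edge padding) and reduce each window."""
    pad = size // 2
    padded = np.pad(img, pad, mode='edge')  # border handling is an assumption
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return reduce_fn(windows, axis=(-2, -1))

def preprocess_band(band):
    """Median filter (3x3), grayscale dilation (3x3 all-ones structuring element,
    i.e. a local maximum), then mean filter (5x5), following the order in the text."""
    band = _window_reduce(band, 3, np.median)  # suppress isolated noisy pixels
    band = _window_reduce(band, 3, np.max)     # dilation: fill small holes and gaps
    band = _window_reduce(band, 5, np.mean)    # pull neighbouring pixel values together
    return band
```

With a 3 × 3 all-ones structuring element, grayscale dilation reduces to taking the local maximum over the window, which is why `np.max` stands in for the morphological operation here.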

Feature extraction
In this study, grayscale images with values between 0 and 255 are obtained for each band in the hyperspectral datasets. Each data cube in a dataset consists of many bands. For feature extraction from these multiple bands, all bands should be combined into a single feature matrix. Consider a data cube with K bands, where each band has N × M size. To obtain the feature matrix, the columns of a band are aligned consecutively; consequently, the band matrix is transformed into a column vector of length N × M. After applying this process to all bands, K column vectors, each of length N × M, are obtained. By ordering these vectors horizontally, a feature matrix of size (N × M) × K is obtained. This process is shown in Figure 7.
As shown in Figure 7, 3D hyperspectral images consisting of multiple bands are transformed into a 2D feature matrix. For example, for a data cube with 145 × 145 size and 200 bands in Indian Pines, a feature matrix with 21,025 × 200 size is created. Similarly, for each data cube in the Pavia University dataset, whose size is 610 × 340 × 103, a feature matrix with dimension 207,400 × 103 is created; for each data cube in the Salinas dataset, whose size is 512 × 217 × 204, a feature matrix with dimension 111,104 × 204 is created.
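The column-wise vectorization described above corresponds to a Fortran-order (column-major) reshape; a minimal sketch:

```python
import numpy as np

def cube_to_feature_matrix(cube):
    """Flatten an (N, M, K) hyperspectral cube into an (N*M, K) feature matrix.

    order='F' stacks the columns of each band consecutively, matching the
    column-by-column vectorization described in the text; each of the K bands
    becomes one feature column."""
    n, m, k = cube.shape
    return cube.reshape(n * m, k, order='F')
```

For example, a 145 × 145 × 200 Indian Pines cube becomes a 21,025 × 200 matrix. A plain C-order reshape would also produce a valid (N·M) × K matrix, only with the pixels enumerated row by row instead of column by column.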

Feature selection
In this study, the relief algorithm is used to find the most significant features [15]. The relief feature selection algorithm, commonly used in the literature, was developed by Kira and Rendell. According to this algorithm, the weight value is initially set to 0 for each feature. The weights of the features are updated by selecting random samples at each step. Thus, the features with the highest weights are determined to be the most significant features at the end of the iterations.
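The weighting loop above can be sketched as a minimal basic Relief implementation; the Manhattan distance, the number of iterations, and the helper names are our assumptions, not details taken from the paper.

```python
import numpy as np

def relief_weights(X, y, n_iter=200, seed=0):
    """Basic Relief: start all feature weights at 0, repeatedly pick a random
    sample, find its nearest hit (same class) and nearest miss (other class),
    and reward features that differ at the miss but agree at the hit."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        dists = np.abs(X - X[i]).sum(axis=1)   # Manhattan distance (assumption)
        dists[i] = np.inf                      # exclude the sample itself
        same = np.flatnonzero(y == y[i])
        other = np.flatnonzero(y != y[i])
        hit = same[np.argmin(dists[same])]
        miss = other[np.argmin(dists[other])]
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w

def select_top_features(X, y, k):
    """Keep the k most-weighted features, as the proposed method does with bands."""
    idx = np.argsort(relief_weights(X, y))[::-1][:k]
    return np.sort(idx)
```

Applied to a feature matrix whose columns are hyperspectral bands, `select_top_features` plays the role of the band-reduction step: only the k highest-weighted bands are kept for classification.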

Classification
After determining the most significant features, the SVM and KNN algorithms are used to classify the samples in the datasets. SVM and KNN are frequently preferred, successful classification algorithms. The SVM algorithm is used to examine the connection between variables; it aims to separate the classes by constructing a decision boundary in an N-dimensional space. This paper prefers the cubic SVM algorithm. The KNN algorithm classifies a sample by looking at the class of its nearest neighbour in terms of Euclidean distance. We prefer the fine KNN version, which is an implementation of the KNN algorithm [16]. In this study, classification is performed with the cubic SVM and fine KNN algorithms using the MATLAB Classification Learner application. The parameters used in the classifiers are given in Table 2.
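The paper itself uses MATLAB's Classification Learner; a rough scikit-learn analogue is sketched below, assuming that "cubic SVM" corresponds to a degree-3 polynomial kernel and "fine KNN" to a single Euclidean nearest neighbour (both assumptions about the MATLAB presets), with a synthetic two-class dataset standing in for the selected band features.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

# Two well-separated blobs standing in for the selected band features (synthetic).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.0, 0.3, (50, 2)), rng.normal(4.0, 0.3, (50, 2))])
y = np.repeat([0, 1], 50)

# Rough analogue of MATLAB's "SVM Cubic" preset (assumption: degree-3 polynomial kernel).
svm = SVC(kernel='poly', degree=3).fit(X, y)
# Rough analogue of MATLAB's "KNN Fine" preset (assumption: 1 Euclidean neighbour).
knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)

print(svm.score(X, y), knn.score(X, y))
```

In practice the `X` here would be the (N·M) × k matrix of relief-selected bands, and the labels `y` the per-pixel ground-truth classes.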

Performance analysis
By running the classification algorithms 100 times with the settings given in Table 2, accuracy, precision, recall, geometric mean, and F-measure values are calculated. The equations for calculating these metrics are given by (2)-(6), where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
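The metrics can be computed from the confusion-matrix counts as sketched below. Note that the geometric mean has more than one convention in the literature; the square root of sensitivity times specificity is assumed here, since the paper does not reproduce the exact formula.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Compute the evaluation metrics from binary TP/TN/FP/FN counts."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                     # also called sensitivity
    specificity = tn / (tn + fp)
    g_mean = np.sqrt(recall * specificity)      # convention assumed here
    f_measure = 2 * precision * recall / (precision + recall)
    return {'accuracy': accuracy, 'precision': precision, 'recall': recall,
            'g_mean': g_mean, 'f_measure': f_measure}
```

For the multi-class datasets used in the paper, these quantities would be computed per class and averaged.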

EXPERIMENTAL RESULTS
In this study, a hyperspectral image classification method is proposed and tested on the Indian Pines, Pavia University, and Salinas datasets. The MATLAB Classification Learner toolbox is used, and the results are obtained with 10-fold cross validation. The resulting images belonging to steps such as obtaining grayscale images, applying the median filter, applying dilation, and applying the mean filter are given in Figures 8-10. It can be seen that sequential use of the median filter, dilation, and mean filter makes the images segmented. As a result of the operations shown in Figures 8-10, the 3D hyperspectral dataset is converted into a 2D feature matrix. The most significant features are selected from the feature matrix by using the relief algorithm. Afterwards, classification is performed using the SVM and KNN algorithms. The classification performances of the proposed method on the Indian Pines, Pavia, and Salinas datasets are given in Table 3.
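The 10-fold cross-validation protocol can be reproduced roughly in scikit-learn as below; the synthetic blobs and the choice of a 1-nearest-neighbour classifier are stand-ins for the paper's selected-band features and MATLAB presets, not the actual experimental setup.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the relief-selected feature matrix and labels.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.4, (100, 3)), rng.normal(3.0, 0.4, (100, 3))])
y = np.repeat([0, 1], 100)

# 10-fold cross validation, mirroring the evaluation protocol in the text.
scores = cross_val_score(KNeighborsClassifier(n_neighbors=1), X, y, cv=10)
print(scores.mean())
```

Each of the 10 folds holds out a tenth of the samples for testing, so the reported score is an average over 10 train/test splits rather than a single partition.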
The results given in Table 3 were obtained after running 100 iterations. When the classification results for the Indian Pines dataset are examined, we can see that the best value for cubic SVM is 99.69%, and the best value for fine KNN is 99.76%.

DISCUSSION
In this study, hyperspectral image classification is performed using the Indian Pines, Pavia, and Salinas datasets. Image processing is performed on the multi-band hyperspectral images. After the image processing steps, the bands most significant for classification are selected among the hyperspectral bands using the relief algorithm. For the Indian Pines dataset, only 65 hyperspectral bands selected by the relief algorithm among 200 hyperspectral bands in total are used. For the Pavia dataset with 103 hyperspectral bands, the 40 most significant hyperspectral bands selected by the relief algorithm are used. Similarly, for the Salinas dataset with 204 hyperspectral bands, only the 55 most significant hyperspectral bands are selected. By selecting the most significant bands, unnecessary bands are eliminated and a performance increase is achieved in terms of time. Furthermore, the classification algorithms give better results without the bands that are useless for classification. This paper uses the cubic SVM and fine KNN algorithms. The performance results of the proposed method and the results of the studies in the literature are given in Table 4.
When the overall accuracy (OA), average accuracy (AA), and Kappa coefficient (K) values given in Table 4 are examined, it is seen that the proposed method gives successful results. Its performance in terms of success and speed is better understood when compared with the current methods. As can be seen in the literature studies, the accuracy rates of the SVM and KNN algorithms are low for the mentioned datasets. In this study, we aimed to increase the performance of SVM and KNN by eliminating unnecessary bands. In this way, results with high accuracy are obtained in a practical and faster way. It is seen that the accuracies of the SVM and KNN methods, which were around 80% before, are increased to 99% with this study. The proposed method focuses on increasing the success of current methods rather than proposing a completely new method. The pre-processing steps and relief algorithm were also tested with a decision tree and a Bayesian classifier, but the highest accuracies are obtained for KNN and SVM. Moreover, the highest accuracy is achieved with cubic SVM among the SVM variants and fine KNN among the KNN variants.
The accuracies and running times of the methods are calculated for both cases: with and without the image pre-processing steps. The results obtained for the different datasets are given in Table 5.
When the image pre-processing steps are not applied, both the SVM and KNN classifiers give results with lower accuracy. Furthermore, when the pre-processing steps are not applied, the relief algorithm chooses more features, and the classification process needs more time.
The advantages of the proposed hyperspectral image classification algorithm are given below:
• Image pre-processing before classification increases the success of the method.
• The most significant bands are selected using the relief algorithm on hyperspectral datasets with tens of bands. Thus, the proposed method works faster.
• By using the most significant bands obtained with feature selection instead of all hyperspectral bands, the proposed method requires fewer resources for classification.

CONCLUSIONS
In this study, a method based on image processing and machine learning is proposed for hyperspectral image classification, which is important for the analysis of agricultural areas. In the proposed method, first, the hyperspectral images, which have multiple bands, are converted into grayscale images. Then, image pre-processing methods are applied to the grayscale images. Later, features are extracted, and the most significant features are selected using the relief algorithm. The selected features are used in the classification algorithms. In this study, two different classification methods, SVM and KNN, are employed; the cubic SVM and fine KNN versions of these algorithms are chosen. The methods are tested on three different hyperspectral datasets: Indian Pines, Pavia, and Salinas. For the three datasets, the best cubic SVM classification result is 99.96%, and the best fine KNN result is 99.95%. When the studies in the literature are examined, it is clearly seen that the image pre-processing and feature selection steps help to improve the success rate.