PrismPatNet: Novel prism pattern network for accurate fault classification using engine sound signals

Engines are prone to various types of faults, and it is crucial to detect and classify them accurately. However, manual fault type detection is time-consuming and error-prone. Automated fault type detection promises to reduce inter- and intra-observer variability while ensuring time-invariant attention during the observation period. We propose an automated fault-type detection model based on sound signals to realize these advantageous properties. We have named the detection model prism pattern network (PrismPatNet) to reflect the fact that its design incorporates a novel feature extraction algorithm inspired by a 3D prism shape. Our prism pattern model achieves high accuracy with low computational complexity. It consists of three main phases: (i) prism-pattern-inspired multilevel feature generation with a maximum pooling operator, (ii) feature ranking and selection using neighbourhood component analysis (NCA), and (iii) support vector machine (SVM) based classification. The maximum pooling operator decomposes the sound signal into six levels. The proposed prism pattern algorithm extracts parameter values from both the signal itself and its decompositions. The generated parameter values are merged and fed to the NCA algorithm, which selects 512 features from this input. The resulting feature vectors are passed to the SVM classifier, which labels the input as belonging to one of 27 classes. We validated our model with a newly collected dataset containing the sound of (1) a normal engine and (2) 26 different types of engine faults. Our model reached an accuracy of 99.19% and 98.75% using 80:20 hold-out validation and 10-fold cross-validation, respectively. Compared with previous studies, our model achieved the highest overall classification accuracy even though it was tasked with identifying significantly more fault classes.
This performance indicates that our PrismPatNet model is ready to be installed in real‐world applications.


| INTRODUCTION
Nowadays, engine fault detection plays an important role in modern engineering systems (Gelgele & Wang, 1998; Zhang & Deng, 2018). Several techniques have been proposed in the literature to detect and rapidly resolve abnormalities in motors or engines. Incorporating artificial intelligence (AI) usually results in good solutions for automated fault detection. Moreover, AI techniques are widely used in robotic applications, vehicle control, and medical systems (Gundogdu et al., 2018; Oniani et al., 2021; Yan, 2006).
Automatic engine fault detection systems are crucial for the automotive and other vehicle industries (Zhang & Deng, 2020). Therefore, computer-aided fault diagnosis models have been used in recent automobiles. However, these models lack the ability to detect and, indeed, discriminate between a range of faults. Despite this shortcoming, intelligent systems for automatic fault detection in automobiles have greatly increased reliability and extended mobility (Wang et al., 2007). By implementing machine learning-based fault diagnosis systems, faults can be detected in real time. For example, signal-based fault detection systems detect anomalies by continuously measuring and analysing engine parameters (Chen et al., 2015). These parameters can come from vibration and pressure measurements (Adaileh, 2013; Wu & Chuang, 2005). A fault is detected when measured signals deviate from nominal values. However, the algorithms that detect faults are computationally complex. Moreover, they exploit only linear relationships, which makes them error-prone.
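The deviation-from-nominal scheme described above can be illustrated with a minimal sketch. The parameter name, nominal value, and tolerance below are purely illustrative and are not taken from any of the cited systems:

```python
# Minimal illustration of signal-based fault detection: a fault is flagged
# when a measured engine parameter deviates from its nominal value by more
# than a fixed tolerance. All values here are hypothetical.
def detect_fault(measured, nominal, tolerance):
    """Return True when the measurement deviates from nominal beyond tolerance."""
    return abs(measured - nominal) > tolerance

# Example: monitor oil pressure (hypothetical nominal value and tolerance).
nominal_pressure = 4.0   # bar
tolerance = 0.5          # bar

readings = [4.1, 3.9, 4.2, 2.8]  # the last reading is anomalous
faults = [detect_fault(r, nominal_pressure, tolerance) for r in readings]
print(faults)  # -> [False, False, False, True]
```

A threshold check like this is linear in nature, which is exactly the limitation the paper attributes to such systems.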

| Motivation
To address these shortcomings (see literature review), we propose a new method for engine fault detection based on the prism shape algorithm.
Our work was motivated by three essential considerations: (i) Graph-based feature engineering methods have been used extensively in established research work. In this research, we used a graph-based feature extraction function that uses a prism shape (a well-known 3D shape).
Our initial motivation was to investigate the feature extraction capability of the prism shape-based function. (ii) We established a new feature engineering architecture using the proposed prism shape-based feature extraction function. Herein, we mimicked the architecture of deep learning (especially CNN) models to create a multileveled feature generation model. Thus, this model was named PrismPatNet. (iii) Fault detection models are very valuable for Industry 4.0 applications to establish the required quality of service. Engine sound acquisition and analysis is cost-effective and non-invasive. Therefore, we have used engine sound signals as the basis for our fault classification system. During the development phase, we collected a large engine sound dataset that contains one normal and 26 fault categories, comprising 8614 sound observations in total. We applied the proposed PrismPatNet to this dataset to achieve high classification performance.

| Contributions
The main contributions of this work are:
a. To the best of our knowledge, this is the first work to classify 27 different engine sounds. Earlier work focused on fewer engine fault classes. In this respect, the collected sound dataset is the most comprehensive dataset for thoroughly evaluating engine fault diagnosis systems.
b. We developed a novel 27-class engine fault classification dataset, which is publicly available at: http://web.firat.edu.tr/turkertuncer/gasoline_engine_fault_dataset.rar
c. The proposed model mimics popular state-of-the-art deep learning models. The recommended prism pattern (a local histogram-based feature generator), maximum pooling, neighbourhood component analysis (NCA) selector, and support vector machine (SVM) were combined to create a new and effective machine learning model.
To support our claim about the efficacy of the PrismPatNet system, we have structured the remainder of the paper as follows. Section 2 presents the literature review. Section 3 describes the materials and methods. Section 4 reports the results of the proposed PrismPatNet feature engineering model on the collected dataset. Section 5 discusses the calculated results and findings, and Section 6 presents conclusions.

| LITERATURE REVIEW
In the literature, most studies focus on systems that can detect and classify a limited number of faults. Furthermore, some studies did not develop automated systems. The absence of multi-class fault detection and, in some cases, the absence of automation constitutes a research gap that creates a need for an automated system covering many fault classes. Our PrismPatNet addresses that need by automating the classification of 27 engine sounds. Li and Zhao presented a fault detection model for an aircraft engine. They used kernel extreme learning machine methods and calculated an accuracy value of 98.68% for eight classes with 10-fold CV. In their study, the computational complexity is high. Zhao et al. (Zhao et al., 2020) used an extended least squares support vector machine for aircraft engine fault detection. They attained an accuracy rate of 98.07%. In their study, the number of classes is small. Wang et al. (Wang et al., 2014) proposed an engine fault detection approach based on wavelet packet denoising and the Hilbert-Huang transform. They attained an accuracy rate of 96.00% for 7 classes. However, the number of classes was low and the dataset was small. Ahmed et al. (Ahmed et al., 2014) used artificial neural networks to detect faults. Their method achieved an accuracy of 97.00%, but it had a high computational complexity and the data volume was small. Basir and Yuan (Basir & Yuan, 2007) developed an engine fault detection method based on the Dempster-Shafer evidence theory. For 3 classes they obtained 94.00% accuracy. The number of classes was insufficient for practical applications and the dataset was small. Yong-Sheng et al. (Yang et al., 2017) utilized discriminative non-negative matrix factorization for diesel engine fault diagnosis. They selected 8 classes for their experiments and obtained an accuracy of 93.38%. The number of classes and the amount of data used were small in their study.
Jung (Jung, 2019) used machine learning techniques for data-driven fault diagnosis. For 9 classes, the accuracy rate was calculated as 90.00%. In their study, both the number of labelled data sets and the number of classes were small. Wu et al. (Wu et al., 2010) proposed an engine fault classification method using artificial neural networks. They selected 10 classes and attained an accuracy of 99.00%. The computational complexity is high for their proposed method.
Existing fault detection models have been developed for fewer fault classes. Some of the proposed engine fault diagnosis models have used vibrations or currents to detect/classify faults. When compared to engine sounds, these signals are more difficult to measure in real-world scenarios. We developed a new sound dataset with 27 groups to overcome these problems. The data was collected using a smartphone microphone.
The dataset was used to train and test our new PrismPatNet machine learning model in a second step. The next section provides further details about the new model.

| MATERIALS AND METHODS
Machine learning methods have been used in many areas of real-life applications to detect machine faults accurately and thereby reduce the running cost (Tomasik et al., 2021; Xu et al., 2021). This work proposes an automated engine fault detection model to classify 27 engine sounds. Figure 1 shows a graphical layout of the proposed engine sound classification model.
FIGURE 1 Layout of the proposed engine sound classification model.

| Material
Our first action was to collect an engine fault dataset comprising 26 faulty engine sounds and one healthy control (27 classes overall). The sounds were collected from a Tofaş 131 gasoline engine. This gasoline engine was designed for automobiles and is used for educational purposes. The Tofaş 131 has 65 HP (48 kW) of power, 1297 cc of capacity, and 102 Nm of torque. The sound signals were collected in a workshop; the photos in Figure 2 document that process.
All the engine sound signals were collected using a smartphone (Figure 2). The sampling frequency was set to 44.1 kHz, and the sound signals were encoded in the M4A format. The collected sound signals were segmented into one-second segments to develop a novel machine learning model. Table 1 shows the sound attributes of the 26 faults and the normal control. After recording the dataset, we were ready to design the information extraction algorithms.
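The segmentation step can be sketched as follows. This is a minimal illustration; how the paper handles trailing samples that do not fill a whole segment is not stated, so discarding them here is our own assumption:

```python
import numpy as np

def segment_signal(signal, fs=44100, seg_seconds=1.0):
    """Split a 1-D recording into non-overlapping one-second segments.

    Trailing samples that do not fill a whole segment are discarded
    (an assumption; the paper does not state how remainders are handled).
    """
    seg_len = int(fs * seg_seconds)
    n_segments = len(signal) // seg_len
    return signal[: n_segments * seg_len].reshape(n_segments, seg_len)

# A 10.5-second synthetic recording yields ten one-second segments.
recording = np.random.randn(int(10.5 * 44100))
segments = segment_signal(recording)
print(segments.shape)  # -> (10, 44100)
```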
As can be seen from Table 1, there are 27 categories. Twenty-six categories contain faulty sounds, and one category (the 22nd class) contains sounds of the healthy engine.

| PrismPatNet method
The main objective of the proposed PrismPatNet was to create a lightweight (low time burden) and efficient classification model for engine sounds. That was accomplished by structuring the PrismPatNet method into three phases: (i) feature extraction using a prism pattern and maximum pooling network, (ii) selecting the most significant 512 features with NCA (Goldberger et al., 2004), and (iii) classification using the SVM classifier (Vapnik, 1998). The proposed prism pattern algorithm extracts parameters from the sound signal and six decomposition levels. Subsequently, the parameters were merged and processed with NCA. The 512 most significant parameters were selected from the NCA output to form a feature vector. These feature vectors were input to the SVM classifier for automated classification.
The pseudocode of the proposed PrismPatNet classification model is given in Algorithm 1.
The main characteristics of the proposed PrismPatNet method are presented in Table 2.

| Iterative decomposer and prism pattern-based feature generation
In the initial design phase of the proposed PrismPatNet method, we established the prism pattern and created the feature extractor based on maximum pooling. The primary feature generation function is established through a novel prism shape pattern. Various graphs have been used as patterns for local histogram-based feature generators (Tuncer, Dogan, & Subasi, 2021; Tuncer, Dogan, & Acharya, 2021; Aydemir et al., 2021). The proposed feature generation function uses the prism shape as a pattern. To model the prism shape as a pattern, overlapping blocks with a length of 50 were used. These blocks were mapped into two 5 × 5 matrices. The prism shape is modelled by deploying these matrices, as shown in Figure 4.
FIGURE 2 Photos of the Tofaş 131 gasoline engine and the data collection environment.
The prism shape, shown in Figure 4, has 18 edges. Binary features are generated by deploying these edges together with a signum function.
To be specific, deploying an edge supplies the first and second parameters of the signum function. Therefore, 18 bits are generated using this shape (prism pattern). The generated 18 bits are divided into two groups, named left and right. Two feature map signals are created using the left and right bits. Finally, histograms of these map signals are extracted, and these histograms are merged to calculate the prism pattern feature vector. The recommended iterative feature generator combines the presented prism pattern with the maximum pooling decomposer to extract both low-level and high-level features. The steps of the iterative feature generator are given below:
Step 1: Decompose the signal using maximum pooling with 1 × 3 sized non-overlapping blocks. In this work, we present a six-level feature generator. The number of levels is calculated using Equation 1.
L = ⌊log_b(ln / w)⌋ (1)
In Equation 1, L is the number of levels, b defines the reduction factor, ln represents the length of the sound signal, and w is the length of the used overlapping block.
where D_t is the t-th decomposed signal and maxp(·, ·) is the maximum pooling function.
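The decomposition in Step 1 can be sketched as follows. This is a minimal Python illustration; the function and variable names, and the discarding of leftover samples that do not fill a block, are our own assumptions:

```python
import numpy as np

def max_pool(signal, block=3):
    """Non-overlapping maximum pooling: keep the maximum of each 1 x 3 block."""
    n = (len(signal) // block) * block          # drop leftover samples (assumption)
    return signal[:n].reshape(-1, block).max(axis=1)

def decompose(signal, levels=6):
    """Iteratively max-pool the signal, returning [D1, ..., D_levels].

    Together with the raw signal, this yields the seven signals that the
    prism pattern is applied to.
    """
    decomps = []
    current = np.asarray(signal, dtype=float)
    for _ in range(levels):
        current = max_pool(current)
        decomps.append(current)
    return decomps

x = np.random.randn(44100)          # one second at 44.1 kHz
levels = decompose(x)
print([len(d) for d in levels])     # each level is about 1/3 the previous length
```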
Step 2: Apply the prism pattern feature generation function to the decomposed sound signals for feature extraction.
Equations 4 and 5 indicate that seven feature vectors (feat) are generated using the recommended prism pattern (prism-pat(·)). The recommended prism pattern is defined below.
Step 2.1: Create overlapping blocks with a length of 50.
Step 2.2: Map each block into two 5 × 5 matrices.
Step 2.3: Generate 18 bits by deploying the signum (basic comparison) function, as shown in the prism edge schematic presented in Figure 5. The basic bit extractor is defined in Equation 8.
where S(·, ·) is a comparison function for bit generation, and f and s are the first and second input parameters of the signum function. The mathematical notation of the proposed bit generation is explained below.
FIGURE 5 Schematic diagram of the prism edges. The edge labels have been defined according to the bit order, and Equation (10) provides a mathematical description of how this pattern was used for binary feature extraction.
Step 2.4: Generate left and right bits using the 18 bits created in the previous step.
Step 2.5: Calculate map signals using right and left bits.
map_left(i) = Σ_{g=1}^{9} left(g) · 2^{g−1}, i ∈ {1, 2, …, ln − 49} (13)
In this work, map_left and map_right define the left and right map signals, and ln defines the length of the used sound signal.
Step 2.6: Extract histograms of the calculated map left and map right signals. According to Equations 11 and 12, these signals are coded with 9 bits. Thus, each signal has a histogram length of 512.
Step 2.7: Merge the two generated histograms and obtain a feature vector of length = 1024.
Steps 2.1–2.7 define the proposed prism-pat(·) feature generation function. Furthermore, the MATLAB code of this feature extractor is given in Appendix A.
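The steps above can be sketched in Python. Note that the true 18 edge comparisons are defined by the prism schematic in Figure 5, which is not reproduced here; the edge pairing below is a placeholder assumption, so this is a structural sketch of Steps 2.1–2.7 rather than a faithful reimplementation:

```python
import numpy as np

def prism_pat(signal):
    """Structural sketch of the prism pattern feature generator.

    Each overlapping block of 50 samples is mapped into two 5 x 5 matrices;
    18 signum-style comparisons over the prism edges yield 18 bits, split
    into left and right 9-bit groups. The edge pairing used here is a
    placeholder -- the true pairing comes from the prism schematic (Figure 5).
    """
    signal = np.asarray(signal, dtype=float)
    ln = len(signal)
    hist_left = np.zeros(512)
    hist_right = np.zeros(512)
    for i in range(ln - 49):
        block = signal[i : i + 50]
        m1 = block[:25].reshape(5, 5)   # first 5 x 5 matrix
        m2 = block[25:].reshape(5, 5)   # second 5 x 5 matrix
        # Placeholder edge pairing: 18 assumed comparisons between the matrices.
        firsts = np.concatenate([m1.ravel()[:9], m2.ravel()[:9]])
        seconds = np.concatenate([m2.ravel()[9:18], m1.ravel()[9:18]])
        bits = (firsts >= seconds).astype(int)        # signum-style comparison
        left, right = bits[:9], bits[9:]
        map_left = int((left * 2 ** np.arange(9)).sum())    # 9-bit code, 0..511
        map_right = int((right * 2 ** np.arange(9)).sum())
        hist_left[map_left] += 1                       # Step 2.6: histograms
        hist_right[map_right] += 1
    return np.concatenate([hist_left, hist_right])     # Step 2.7: length 1024

features = prism_pat(np.random.randn(500))
print(features.shape)  # -> (1024,)
```

Applied to the raw signal plus the six decompositions, this yields 7 × 1024 = 7168 features, matching the dimensionality reported in the next subsection.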

| Feature selection
The proposed multileveled prism pattern feature generator extracted 7168 features for each sound signal, yielding a feature matrix with 7168 columns. To choose the top 512 of the 7168 generated features, an NCA (Goldberger et al., 2004) based feature selector was used. NCA assigns a weight to each feature; the optimal weights were calculated using the city block distance and a stochastic gradient descent (SGD) optimizer. By deploying the generated weights, the top features were selected. The two primary objectives of feature selection were: (i) increasing the performance of the model by using the most informative features and eliminating redundant ones, and (ii) reducing the time complexity during classification (Dogan et al., 2020).
The following steps were used to choose the top features:
Step 4: Apply NCA to the feature vectors to obtain the feature weights.
Step 5: Sort the obtained weights in descending order.
Step 6: Choose the top 512 out of 7168 features.
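The rank-and-select logic of Steps 4–6 can be sketched as follows. scikit-learn does not expose NCA per-feature weights directly, so ANOVA F-scores stand in for the NCA weights here; only the selection mechanics (sort descending, keep the top 512) follow the paper:

```python
import numpy as np
from sklearn.feature_selection import f_classif

def select_top_features(X, y, k=512):
    """Rank features by a per-feature weight and keep the top k.

    The paper computes the weights with NCA; ANOVA F-scores are used as a
    stand-in here. The selection logic (sort weights in descending order,
    keep the top-k indices) follows the paper's Steps 4-6.
    """
    weights, _ = f_classif(X, y)            # stand-in for NCA weights
    order = np.argsort(weights)[::-1]       # descending weight order (Step 5)
    top = order[:k]                         # top 512 of 7168 (Step 6)
    return X[:, top], top

# Synthetic stand-in for the 7168-feature matrix.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 7168))
y = rng.integers(0, 27, size=200)
X_sel, idx = select_top_features(X, y)
print(X_sel.shape)  # -> (200, 512)
```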

| Classification
SVM (Jain et al., 2021) was utilized as a classifier in this model. In this work, we have used a shallow classifier to obtain high classification accuracy using the presented PrismPatNet with hold-out validation (the split ratio is 80:20) and a 10-fold cross-validation strategy. The SVM attributes are: kernel = polynomial of order 3, kernel scale = automatic, coding = one-versus-all, and box-constraint (C) level = 1.
Step 7: Classify the chosen features using the 3rd-order polynomial kernel SVM classifier with 10-fold cross-validation and 80:20 hold-out validation techniques.
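Step 7 can be illustrated with scikit-learn. The data below is synthetic (the paper's implementation is in MATLAB, and its inputs are the 512 NCA-selected features), but the SVM settings mirror those stated above: 3rd-order polynomial kernel, automatic kernel scale, one-versus-all coding, and C = 1:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split, cross_val_score

# SVM configured as in the paper: 3rd-order polynomial kernel,
# automatic kernel scale, one-vs-rest coding, box constraint C = 1.
clf = SVC(kernel="poly", degree=3, gamma="scale", C=1.0,
          decision_function_shape="ovr")

# Synthetic stand-in: 27 balanced classes, 512 features per observation.
rng = np.random.default_rng(1)
X = rng.standard_normal((540, 512))
y = np.repeat(np.arange(27), 20)

# 80:20 hold-out validation.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=1)
clf.fit(X_tr, y_tr)
holdout_acc = clf.score(X_te, y_te)

# 10-fold cross-validation.
cv_acc = cross_val_score(clf, X, y, cv=10).mean()
print(holdout_acc, cv_acc)  # near chance level on this random data
```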

| EXPERIMENTAL RESULTS
This section details the experimental setup that was used to obtain the PrismPatNet performance results. The GEFC27 dataset was collected from an automotive workshop of Arapgir Vocational School, and the sound signals were acquired using a smartphone microphone. The collected sounds were segmented into blocks of one-second duration. The PrismPatNet method was implemented using MATLAB 2020b on a standard personal computer. We did not use parallel processing to execute the PrismPatNet algorithm. The model quality was evaluated with the following seven performance measures: accuracy, overall precision, overall recall, overall F1-score, Cohen's kappa, geometric mean, and Matthews correlation coefficient (Tuncer et al., 2019). Table 3 shows the seven performance measures that were obtained by assessing the developed PrismPatNet model with hold-out and ten-fold cross-validation strategies. Figure 6 shows the performance measures for each of the 27 classes using: (a) hold-out and (b) 10-fold cross-validation strategies.
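The seven measures can be computed as in the sketch below. The paper does not state the averaging conventions explicitly, so treating "overall" precision/recall/F1 as macro averages and the geometric mean as the geometric mean of per-class recalls are our assumptions (both are common choices):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, cohen_kappa_score, matthews_corrcoef)

def overall_metrics(y_true, y_pred):
    """Compute the seven performance measures used in the paper.

    Macro averaging for precision/recall/F1 and a geometric mean over
    per-class recalls are assumed conventions, not stated in the paper.
    """
    recalls = recall_score(y_true, y_pred, average=None, zero_division=0)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro", zero_division=0),
        "recall": recall_score(y_true, y_pred, average="macro", zero_division=0),
        "f1": f1_score(y_true, y_pred, average="macro", zero_division=0),
        "kappa": cohen_kappa_score(y_true, y_pred),
        "geometric_mean": float(np.prod(recalls) ** (1 / len(recalls))),
        "mcc": matthews_corrcoef(y_true, y_pred),
    }

# Tiny worked example with three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
m = overall_metrics(y_true, y_pred)
print(round(m["accuracy"], 3))  # -> 0.833
```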
It can be noted from Figure 6 that the developed model achieved over 94% for F1, precision, and recall. Furthermore, Figure 6 indicates that this performance threshold holds for both hold-out validation (80:20) and ten-fold cross-validation. Figure 7 shows the fold-wise accuracy obtained for the developed PrismPatNet method.
The model achieved the highest accuracy of 99.77% in the 10th fold and the lowest accuracy of 97.45% in fold 4.

| DISCUSSION
Automated and accurate fault classification is very useful for transportation and manufacturing systems. Therefore, many automated fault classification methods have been developed (Aljemely et al., 2021; Yaman, 2021). To extend current knowledge and progress the technology, we developed a new engine sound dataset which contains 8614 observations. Each observation was labelled as belonging to one of 27 distinct engine sounds. This sound signal dataset was easy to collect at a low cost. In the second phase, a new hand-modelled feature-based PrismPatNet was developed. The prism shape was modelled to generate local features, and maximum pooling was employed to decompose signals. Both low-level and high-level features were generated by deploying both the presented prism pattern and maximum pooling. NCA was used to select the 512 most significant features. The resulting feature vectors were fed to a shallow SVM classifier which labels each input as belonging to one of 27 classes. The main objective of the presented prism pattern is to extract hidden patterns using a 3D shape relation. The presented PrismPatNet is inspired by deep networks, which can extract multilevel features from a signal. Therefore, a multileveled feature generator is employed in this work. The proposed PrismPatNet model is simple and efficient. Our model was evaluated using both hold-out validation and 10-fold cross-validation. We obtained an accuracy of 99.19% and 98.75% using 80:20 hold-out and 10-fold cross-validation, respectively. The confusion matrices in Figure 8 present a result summary.
TABLE 3 Performance measures to assess the presented PrismPatNet model based on hold-out and ten-fold cross-validation strategies.
In this model, NCA was used as a feature selector. Without NCA, the presented model would need to process 7168 features. We selected the top 512 out of the generated 7168 features using NCA. This decreases the computational cost of the SVM and increases its classification ability. Figure 9 documents the positive effect NCA has on classification accuracy.
By using NCA, our model attained 99.19% classification accuracy; without NCA and with the larger feature vector, the accuracy was 96.44%. Thus, we used the NCA selector in this research to obtain higher classification accuracy with a lower running time.
We selected the SVM algorithm as the engine sound classification method. The selection process involved training and testing six classifiers: linear discriminant (LD), naïve Bayes (NB), support vector machine (SVM), k nearest neighbours (kNN), bagged tree (BT), and artificial neural network (ANN). Classification accuracies of these classifiers are depicted in Figure 10. We have shown that there is no need to use wavelet or other frequency transformations to obtain high classification accuracies. Pooling functions are good solutions for classification problems that require multiple feature levels. Furthermore, we tested the feature extraction capability of the prism pattern and showed good performance for acoustic signal classification. Also, the proposed PrismPatNet can easily be implemented in embedded systems.
The main advantages of the proposed method are as follows:
• Developed a novel and accurate PrismPatNet model to classify 27 types of engine sounds.
• Obtained an accuracy of 99.19% and 98.75% using hold-out and 10-fold cross-validation, respectively.
• Created a big gasoline engine fault GEFC27 dataset of engine sounds to validate the developed model.
• The prism pattern has a 3D shape that can extract subtle features from the sound signal and yield high classification performance.
• Achieved an efficient, simple, economical, and lightweight learning network (PrismPatNet).
• The proposed model is the first to classify 27 different types of engine sounds.
Figure 10 demonstrates that the best classifier among the six tested algorithms is SVM, since it reached 99.19% classification accuracy. The second best algorithm is kNN, with an accuracy of 97.81%. Finally, NB showed the weakest performance, with an overall accuracy of 71.42%.
A limitation of this research is:
• We used only a single motor type. To obtain a larger dataset, different motor types and different environmental conditions should be used.
FIGURE 9 Classification accuracy with and without NCA.
FIGURE 10 Accuracies of the six tested classification algorithms.
In the future, we plan to collect data related to diesel, electricity, and gasoline motors/engines. Our model can also be used in various vehicles to automate motor fault detection.
Moreover, PrismPatNet is simple to use and based on mobile phone technology, which enables a layman to detect and subsequently repair engine faults. It can also be employed to assess engine faults in used and modified cars.

| CONCLUSIONS
In this work, we have developed a novel PrismPatNet model to automate the classification of 27 engine sounds. The generated system was validated with the newly collected GEFC27 dataset, which comprises 8614 observations, each belonging to one of 27 classes. On this dataset, the presented model reached 99.19% and 98.75% accuracy using 80:20 hold-out validation and 10-fold cross-validation, respectively. Our model is simple, accurate, fast, and easy to implement. It can also be used to classify other types of one-dimensional signals. In the future, we intend to explore new 3D shape feature generators based on deep models to classify sound signals.
The sounds used were collected from a noisy environment. Therefore, the test results reflect the expected performance for real-world applications. In the future, a big dataset can be collected from various motors and a big, labelled dataset can be created using the deep network models or the recommended PrismPatNet. By using these big, labelled datasets, automatic fault diagnosis applications can be developed. Moreover, these applications can be used by members of the public.

CONFLICT OF INTEREST STATEMENT
The authors declare that they have no conflict of interest.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly available in gasoline_engine_fault_dataset.rar at http://web.firat.edu.tr/turkertuncer/gasoline_engine_fault_dataset.rar.