A fused CNN‐LSTM model using FFT with application to real‐time power quality disturbances recognition

With the progress of renewable energy generation and energy storage technologies, more and more renewable sources and devices are being integrated into the power system. As the power system grows more complex, single and multiple power quality disturbances (PQDs) occur more frequently. Hence, real‐time detection of PQDs is a primary requirement for mitigating the risk of distortions. This study presents real‐time PQD classification using a fused architecture that combines convolutional neural networks (CNN) with long short‐term memory (fused CNN‐LSTM) and operates on time‐ and frequency‐domain features. The frequency‐domain data are obtained from the time‐series data using the fast Fourier transform (FFT). Time‐domain and frequency‐domain features are extracted by separate CNN‐LSTM structures and then concatenated to classify the PQD through fully connected layers. The proposed method was trained and tested on 16 types of noisy synthetic PQD data generated by mathematical models in accordance with the IEEE‐1159 standard. Moreover, to further verify the performance of the approach, a simulated distributed power system is used to produce various PQDs. We compare against three advanced neural network approaches: deep CNN, CNN‐LSTM, and multifusion CNN (MFCNN). The fused CNN‐LSTM model takes only 0.64 ms to classify each PQD signal and achieves accuracies of 98.95% on synthetic data and 98.89% on the simulated power system, outperforming the compared methods.


| INTRODUCTION
In recent decades, the demand for energy has increased explosively in the domestic, industrial, and commercial sectors. However, traditional power generation from oil, coal, and natural gas causes large greenhouse gas emissions, which runs against sustainable development. Hence, renewable energy (RE) resources, which generate energy without carbon emissions, have gained attention as a way to reduce environmental effects. The penetration of renewable power generation is increasing gradually with the declining costs of devices and the advancement of technology. 1 The intermittency, fluctuation, and randomness of most RE sources (e.g., solar and wind) affect power system stability. 2 With the extensive use of RE sources, nonlinear loads, and other sensitive devices, power quality disturbances (PQDs) occur more frequently. Moreover, the application of bidirectional charging technology, such as vehicle-to-grid and building-to-grid, will lead to more frequent and complex power quality (PQ) issues. 3 The term PQD refers to a deviation of the voltage, current waveform, or frequency from the standard. Poor PQ may cause severe hazards, affect power generation, damage sensitive equipment, cause protection devices to misoperate, or even cause power outages. The IEEE-1159 standard characterizes the major categories of PQDs 4 : sag, swell, interruption, harmonics, flicker, notch, spike, and so forth. Considering the potential risks that different PQDs pose to the power system, real-time detection and classification of PQD types is the first priority for improving PQ. Normally, PQDs manifest as complex disturbances rather than single disturbances, meaning that multiple disturbances occur simultaneously, especially in distributed energy resource systems. 5 The presence of multiple disturbances increases the number of PQD types, potentially reducing classification accuracy.
Generally, PQD classification methods are divided into two categories: model-based and data-driven approaches. 6 Model-based methods are developed from a deep understanding of the power system and the PQD definitions, so their classification accuracy relies on the engineer's domain experience and knowledge. 7 The complex topology of distributed energy systems makes model-based methods harder to develop. Data-driven methods, in contrast, allow the model to learn from sufficient data and do not require detailed domain or energy system knowledge. Figure 1 shows a typical distributed energy system, in which distributed RE replaces central power generation. With the implementation of smart grids, advanced metering infrastructure, and internet of things technologies, a considerable amount of data is being stored in cloud databases, which enables the development of data-driven methods on large volumes of measured data.
Traditional PQD classification methods comprise three stages: feature extraction, feature selection, and disturbance classification. 8 For feature extraction, various signal processing techniques have been applied to extract distinctive features from the original PQD signal. Xiao et al. 9 and Borges et al. 10 applied the fast Fourier transform (FFT) to extract frequency-domain information for disturbance classification. The FFT is an efficient algorithm for computing the discrete Fourier transform (DFT); it yields the same result as a direct DFT, up to floating-point rounding, at much lower computational cost. 11 The wavelet transform (WT) and its modifications, which decompose the signal into different levels, have also been widely used in PQD classification. 12 Kanirajan and Suresh Kumar 13 adopted a 5-level Daubechies 4 (db4) wavelet to extract decomposed features. However, the main disadvantages of the WT are noise sensitivity and the difficulty of selecting an appropriate mother wavelet, which may degrade classification accuracy. Hence, several modifications of the WT have been proposed. Kaushik et al. 14 used the Hilbert transform and the Stockwell transform for feature extraction on 16 types of synthetic PQD data.

FIGURE 1 The typical distributed energy system.
In machine learning, feature selection is an indispensable technique for improving classification performance. Dash et al. 15 used the mean, energy, standard deviation, entropy, skewness, and maximum amplitude as statistical features for microgrid disturbance detection. Borges et al. 10 extracted 16 statistical features from time- and frequency-domain data based on the original signal and FFT features. A similar statistical feature selection method based on decomposed wavelet coefficients is presented in Kanirajan and Suresh Kumar. 13 Liu et al. 16 selected eight eigenvalues from each intrinsic mode function. However, these hand-engineered feature selection methods may remove critical features. Hence, it is necessary to select the optimal features automatically. An artificial bee colony optimization technique was applied to find the optimal features in Khokhar et al. 17 Ahila et al. 18 applied particle swarm optimization to reduce redundant features.

For disturbance classification, different machine learning algorithms have been proposed to classify the disturbance types. Liu et al. 16 adopted an adaptive K-nearest neighbor method based on selected features to classify PQDs. Dash et al. 19 implemented a support vector machine to recognize single and multiple PQDs based on features tuned with the whale optimization algorithm. Markovska et al. 20 applied random forest to identify 21 types of PQDs under different noise levels based on a hand-engineered feature selection method and verified the results via LabVIEW.

Because of the limitations of selected features, more researchers have proposed diverse deep learning (DL) structures, which automatically extract features through hidden layers, to recognize different disturbance patterns. A large number of studies have used convolutional neural networks (CNNs) to extract features. 21, 22 Wang and Chen 23 proposed a deep one-dimensional (1D) CNN to classify PQDs and verified its classification performance on simulation and multi-microgrid systems. Qiu et al. 24 presented a multifusion CNN (MFCNN) architecture, which fuses the raw signal and FFT features, to recognize PQD types under various noise levels. However, PQD data are time series, whereas the CNN structure is commonly used for spatial data. 25 Hence, many researchers have used recurrent neural network (RNN) architectures for disturbance classification. Gu et al. 26 proposed a label-guided attention network to classify PQD types; it comprises convolutional layers, an attention mechanism, and a bidirectional RNN. An enhanced long short-term memory (LSTM) network was presented to classify six events using active power. 27 Xiao and Li 9 proposed an integrated DL method consisting of CNN-GRU, ResNet-GRU, and Inception-GRU networks to identify PQDs. Moreover, Sindi et al. 28 proposed a hybrid convolutional network that combines 1D and 2D features from the original PQD signal and the PQD image to identify the disturbance.

To address real-time PQD classification, we propose a DL structure named fused CNN-LSTM, which incorporates CNN and LSTM structures to identify disturbance types. The main contributions of this paper are summarized as follows:

The rest of this article is structured as follows: Section 2 describes the PQD types and the generation of synthetic data from mathematical models. Section 3 describes the architecture of the fused CNN-LSTM model and the implementation process. In Section 4, the experiments and comparisons with other methods are reported. Section 5 presents the results obtained in the simulated distributed system environment. Finally, the conclusion and future work are presented in Section 6.

| SYNTHETIC DATA GENERATION
To validate the proposed model, sufficient PQD data covering various disturbances are required. IEEE-1159 defines different PQD types and their parameter variations. 4 In this work, 16 types of single and multiple synthetic PQD signals are generated in Python according to mathematical models. The PQD mathematical models are shown in Tables 1 and 2, which contain 10 single types and 6 multiple types. In the equations, $A$ is a constant, normally equal to 1, that represents the waveform amplitude, and $\alpha$ describes the intensity of the disturbance in different events. $u(t)$ is a step function that controls the duration of the disturbance: the term $u(t - t_1) - u(t - t_2)$ means that the disturbance occurs from $t_1$ to $t_2$. The sign function $\operatorname{sgn}(\cdot)$ is a real-valued step function that can be defined as:

$$\operatorname{sgn}(x) = \begin{cases} 1, & x > 0 \\ 0, & x = 0 \\ -1, & x < 0 \end{cases}$$

One thousand waveforms are generated for each disturbance type by randomly varying the parameter values. Each PQD waveform records 10 cycles of the signal. The fundamental frequency and sampling rate are set to 50 Hz and 10 kHz, respectively. Under these conditions, each waveform spans 0.2 s and contains 2000 sample points. The single and multiple PQD waveforms are shown in Figures 2 and 3. The total of 16,000 PQD signals is split into training, validation, and test sets in the ratio 6:2:2. However, real PQD data are collected from sensors and usually contain noise. 29 Hence, different levels of Gaussian white noise are added to the training and validation sets to improve the accuracy of the proposed model. 30,31 For the test set, Gaussian white noise is added at signal-to-noise ratios of 20, 30, and 40 dB, respectively. A brief description of each dataset is shown in Table 3.
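As a hedged illustration of the generation procedure described in this section, the following Python sketch builds one sag waveform with the stated settings (50 Hz fundamental, 10 kHz sampling, 10 cycles) and adds Gaussian white noise at a chosen signal-to-noise ratio. The sag expression is the commonly used rectangular-window form and is an assumption here, as are the helper names `make_sag` and `add_noise` and the parameter values `alpha=0.5`, `t1=0.06`, and `t2=0.14`; none of them is taken from Tables 1 and 2.

```python
import numpy as np

FS = 10_000      # sampling rate in Hz (from the text)
F0 = 50          # fundamental frequency in Hz (from the text)
CYCLES = 10      # cycles recorded per waveform (from the text)


def make_sag(alpha, t1, t2, amplitude=1.0):
    """Assumed sag model: the amplitude drops by `alpha` between t1 and t2 (seconds)."""
    n = int(FS * CYCLES / F0)                        # 2000 samples
    t = np.arange(n) / FS                            # time axis covering 0.2 s
    window = ((t >= t1) & (t < t2)).astype(float)    # u(t - t1) - u(t - t2)
    return t, amplitude * (1.0 - alpha * window) * np.sin(2 * np.pi * F0 * t)


def add_noise(signal, snr_db, rng):
    """Add Gaussian white noise scaled to the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    return signal + rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)


rng = np.random.default_rng(0)
t, clean = make_sag(alpha=0.5, t1=0.06, t2=0.14)     # illustrative parameter values
noisy = add_noise(clean, snr_db=20, rng=rng)
```

Drawing `alpha`, `t1`, and `t2` at random on each call would mimic the random parameter variation used to produce one thousand waveforms per disturbance type.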

| PROPOSED PQDS CLASSIFICATION METHODOLOGY
This section introduces the proposed fused CNN-LSTM architecture, which fuses time and frequency features extracted from time- and frequency-domain inputs for PQD classification. As mentioned in Section 2, the PQD data are time-series signals of 10 cycles, each containing 2000 sampling points. In addition, the frequency-domain input data are obtained by FFT. The proposed method consists of three main phases. The first phase obtains the frequency-domain information by applying the FFT to the time-series data. Subsequently, Z-score normalization is applied to rescale the time- and frequency-domain data. Finally, the normalized data are passed to the fused CNN-LSTM model for classification. The details of the proposed method are as follows.

CEN ET AL.

TABLE 1 Mathematical models of single power quality disturbances. Columns: Disturbances, Mathematical equation, Parameters.

TABLE 2 Mathematical models of multiple power quality disturbances. Columns: Disturbances, Mathematical equation, Parameters. Rows include Interruption with harmonics (C13) and Flicker with sag (C15).

FIGURE 2 Single power quality disturbances waveforms.

| Fast Fourier transform
The first phase aims to obtain the frequency-domain information from PQD signals. Time-series signals alone are not well suited to recognizing some disturbance types, especially harmonic distortions. The FFT is therefore adopted to obtain frequency-domain information from the PQD signals by decomposing each signal into sinusoids of different frequencies. The transform can be expressed as follows:

$$X(n) = \sum_{k=0}^{N-1} x(k)\, W_N^{nk}, \quad n = 0, 1, \ldots, N - 1$$

where $W_N = e^{-j2\pi/N}$ and $k$ is the sampling point from 0 to $N - 1$. Applying the FFT yields one frequency output, denoted $X_f$, where $m$ is the number of frequency features obtained from the FFT.
Typical PQD waveforms in a noisy environment and their frequency component plots are illustrated in Figure 4A-D from top to bottom. Each frequency component plot shows that the fundamental frequency is 50 Hz. The frequency components in Figure 4B indicate that third, fifth, and seventh harmonic components exist in the waveform. These harmonic components, however, are difficult to distinguish from the original PQD waveform alone.
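The observation that harmonics are easier to see in the frequency domain can be reproduced with a short NumPy sketch. It assumes the sampling settings from Section 2 (10 kHz, 2000 points) and uses hypothetical harmonic amplitudes chosen only for illustration; `np.fft.rfft` is used because the input signal is real-valued.

```python
import numpy as np

FS, F0, N = 10_000, 50, 2000          # sampling rate, fundamental, samples (from the text)
t = np.arange(N) / FS

# Hypothetical amplitudes for the fundamental and the 3rd, 5th and 7th harmonics.
harmonics = {1: 1.0, 3: 0.15, 5: 0.10, 7: 0.05}
x = sum(a * np.sin(2 * np.pi * h * F0 * t) for h, a in harmonics.items())

spectrum = np.fft.rfft(x)                  # one-sided FFT of a real signal
freqs = np.fft.rfftfreq(N, d=1 / FS)       # bin frequencies, spaced FS / N = 5 Hz apart
magnitude = 2 * np.abs(spectrum) / N       # rescale bins to sinusoid amplitudes

peaks = freqs[magnitude > 0.01]            # bins that carry noticeable energy
print(peaks)
```

Because the 0.2 s window holds a whole number of cycles of every component, the energy falls on single bins at 50, 150, 250, and 350 Hz, and each bin magnitude matches the amplitude assumed for that component.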

| Data normalization
Data normalization, which rescales each feature to a common scale, is essential before training the classification model. Common methods include decimal scaling normalization, min-max normalization, Z-score normalization (ZSN), 32 and so forth. In this work, ZSN is applied to the original signal $X_t$ and the frequency features $X_f$. 33 The corresponding normalized features $x'_{itn}$ and $x'_{ifm}$ are calculated as follows:

$$x'_{itn} = \frac{x_{itn} - \mu_{tn}}{\sigma_{tn}}, \qquad x'_{ifm} = \frac{x_{ifm} - \mu_{fm}}{\sigma_{fm}}$$

where $\mu_{tn}$, $\mu_{fm}$ and $\sigma_{tn}$, $\sigma_{fm}$ are the means and standard deviations of the corresponding features in the time- and frequency-domain data, respectively. After rescaling, each time- and frequency-domain variable has a mean of 0 and a standard deviation of 1. Although data normalization removes the differences in numerical scale between features, it cannot guarantee the absence of irrelevant features. These unwanted features could degrade the classification accuracy and increase the computational complexity. Thus, eliminating irrelevant features is necessary for improving classification accuracy.

FIGURE 3 Multiple power quality disturbances waveforms.

TABLE 3 Generated synthetic data for the proposed method.
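A minimal sketch of Z-score normalization follows. It assumes the statistics are computed per feature (column) across the training signals and then reused for other data; the helper names `zscore_fit` and `zscore_apply` and the random stand-in data are hypothetical, and the guard against zero standard deviation is an added safety measure not described in the text.

```python
import numpy as np


def zscore_fit(train):
    """Per-feature mean and standard deviation over the training signals (rows)."""
    mu = train.mean(axis=0)
    sigma = train.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)   # avoid division by zero for constant features
    return mu, sigma


def zscore_apply(data, mu, sigma):
    """Rescale each feature to zero mean and unit standard deviation."""
    return (data - mu) / sigma


rng = np.random.default_rng(0)
X_t = rng.normal(3.0, 2.0, size=(100, 8))      # stand-in for time-domain data
mu_t, sigma_t = zscore_fit(X_t)
X_t_norm = zscore_apply(X_t, mu_t, sigma_t)
```

The same two calls would be applied separately to the frequency-domain data, with the statistics fitted on the training set reused for the validation and test sets.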

| Fused CNN-LSTM model
Normalized time and frequency data each contain only part of the features that characterize different PQDs. This study therefore proposes a fused CNN-LSTM model based on time- and frequency-domain data for disturbance classification. The CNN and LSTM modules are used to extract spatial and sequential information, respectively. The architecture of the proposed model is shown in Figure 5. The fused CNN-LSTM framework processes the time-domain and frequency-domain signals separately and simultaneously. It contains three stacked CNN modules, denoted Conv1, Conv2, and Conv3, and one LSTM layer. Each CNN module consists of two 1D convolutional layers, a max-pooling layer, and batch normalization, similar to the VGG network. 34 As shown in Figure 5, every convolutional layer uses the same stride, kernel size, and activation function, while the number of filters increases with depth. The LSTM layer follows the three stacked CNN modules. The features from the two CNN-LSTM modules are concatenated and fed to two fully connected (FC) layers. To improve model generalization, 35 a dropout ratio of 0.2 is applied in the FC layers to prevent overfitting. Finally, the softmax activation function in the output layer maps the outputs to classification probabilities. The convolutional layer, which extracts features using convolution filters with shared weights and biases, is the most significant difference between a CNN and a conventional neural network. Compared with conventional approaches, weight sharing in convolutional kernels dramatically reduces the number of parameters while achieving good performance.

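The output of the 1D convolutional layer can be expressed as:

$$X_j^{l} = \sigma\left(\sum_{i} X_i^{l-1} * w_j^{l} + b^{l}\right)$$

where $X_j^{l}$ is the $j$th output feature map of layer $l$, $X_i^{l-1}$ is the 1D input from layer $l - 1$, $w_j^{l}$ is the filter kernel, $b^{l}$ represents the bias, $*$ denotes the convolution operation, and $\sigma(\cdot)$ indicates the activation function. A fixed convolution kernel with a size of 2 and a stride of 1 is applied, as shown in Figure 5.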
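To make the 1D convolution concrete, here is a small NumPy sketch with a kernel size of 2 and a stride of 1, followed by a ReLU activation. It assumes no padding and, as is usual in deep learning frameworks, computes a cross-correlation; the function name `conv1d_relu`, the tensor layout, and the toy kernel are illustrative assumptions rather than the configuration in Figure 5.

```python
import numpy as np


def conv1d_relu(x, w, b, stride=1):
    """x: (length, in_channels); w: (kernel, in_channels, filters); b: (filters,)."""
    kernel = w.shape[0]
    out_len = (x.shape[0] - kernel) // stride + 1          # no padding
    out = np.empty((out_len, w.shape[2]))
    for p in range(out_len):
        patch = x[p * stride : p * stride + kernel]        # (kernel, in_channels)
        # Sum over kernel positions and input channels for every filter.
        out[p] = np.tensordot(patch, w, axes=([0, 1], [0, 1])) + b
    return np.maximum(out, 0.0)                            # ReLU activation


x = np.array([[1.0], [2.0], [4.0], [3.0]])    # 4 samples, 1 channel
w = np.array([[[-1.0]], [[1.0]]])             # kernel size 2: first difference
y = conv1d_relu(x, w, b=np.zeros(1))
```

With this toy kernel the layer outputs the positive first differences of the input, which shows how a short kernel responds to local changes such as the edges of a sag or swell.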
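The rectified linear unit (ReLU) activation function is used after each convolutional layer to add nonlinearity.

Long short-term memory (LSTM), a variant of the RNN architecture, is used to solve sequential data problems, such as machine translation and speech recognition. The unit in an LSTM is called a "cell", which is more complicated than a traditional neuron. An LSTM cell comprises three parts: the input gate, the output gate, and the forget gate. Figure 6 shows the standard LSTM architecture. These cells connect to each other, enabling the LSTM to distinguish useful long-term and short-term features. The equations in the LSTM cell can be defined as:

$$\begin{aligned}
i_t &= \sigma_s(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma_s(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma_s(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \sigma_{\tanh}(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \sigma_{\tanh}(c_t)
\end{aligned}$$

where $x_t$ is the input vector, $h_t$ is the hidden state vector, and $i_t$, $f_t$, and $o_t$ are the input, forget, and output gates in the LSTM cell. $\tilde{c}_t$ and $c_t$ are the cell input vector and cell state vector. $\odot$ is the elementwise product, and $\sigma_s$ and $\sigma_{\tanh}$ are the sigmoid and tanh activation functions, respectively. The parameters $W_*$, $U_*$, and $b_*$ are the weights and biases in different gates.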
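The following NumPy sketch implements one time step of a standard LSTM cell with input, forget, and output gates. The layer sizes and random weights are hypothetical and serve only to show the data flow; it is not the trained layer of the proposed model.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U and b are dicts keyed by 'i', 'f', 'o', 'c'."""
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])     # input gate
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])     # forget gate
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])     # output gate
    c_in = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])    # cell input
    c_t = f_t * c_prev + i_t * c_in                            # new cell state
    h_t = o_t * np.tanh(c_t)                                   # new hidden state
    return h_t, c_t


n_in, n_hidden = 3, 4                       # hypothetical sizes
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(n_hidden, n_in)) for k in "ifoc"}
U = {k: rng.normal(size=(n_hidden, n_hidden)) for k in "ifoc"}
b = {k: np.zeros(n_hidden) for k in "ifoc"}

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):      # run five time steps
    h, c = lstm_step(x_t, h, c, W, U, b)
```

The forget gate scales the previous cell state while the input gate scales the new candidate, which is how the cell keeps or discards long-term information from one step to the next.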

| Training and evaluating the model
To verify the effectiveness of the fused CNN-LSTM, the 16 types of synthetic PQD data described in Section 2 are used to train the model. The model is trained with the categorical cross-entropy loss and the Adam optimizer. Categorical cross-entropy is a loss function commonly used for classification. It computes the loss for a single signal between the category label and the prediction, and the cost function can be calculated by:

$$J = -\frac{1}{M} \sum_{i=1}^{M} \sum_{c=1}^{C} y_{ic} \log \tilde{y}_{ic}$$

where $y_i = [y_{i1}, y_{i2}, \ldots, y_{iC}]$ represents the label corresponding to the $i$th signal, $\tilde{y}_i = [\tilde{y}_{i1}, \tilde{y}_{i2}, \ldots, \tilde{y}_{iC}]$ is the predicted score vector for signal $i$, $M$ is the number of signals, and $C$ is the number of classes.
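As a small worked example of categorical cross-entropy, the sketch below averages the loss over two toy signals with three classes. The labels and probabilities are made up for illustration, and the `eps` clamp is an added numerical safeguard rather than part of the definition.

```python
import math


def categorical_cross_entropy(labels, predictions, eps=1e-12):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    total = 0.0
    for y, y_hat in zip(labels, predictions):
        # Only the true class contributes because the label is one-hot.
        total -= sum(y_c * math.log(max(p_c, eps)) for y_c, p_c in zip(y, y_hat))
    return total / len(labels)


labels = [[1, 0, 0], [0, 1, 0]]                        # toy one-hot labels
predictions = [[0.8, 0.1, 0.1], [0.25, 0.5, 0.25]]     # toy softmax outputs
loss = categorical_cross_entropy(labels, predictions)
```

Here the loss equals the mean of $-\ln 0.8$ and $-\ln 0.5$, so a more confident correct prediction contributes a smaller term.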
Through several comparative experiments, the optimal configuration and hyperparameters of the proposed network were determined; they are presented in Table 4. The batch size is set to 32, so each backpropagation update of the model parameters uses 32 signals randomly selected from the training set. The model is trained for a total of 70 epochs. The initial learning rate is set to 0.001 and is automatically reduced by 1/10 if the loss on the validation set does not decrease for 10 epochs. The proposed method is trained and evaluated in the TensorFlow framework in a Python 3.7 environment with an Intel Core Xeon CPU (2.3 GHz), 16 GB DDR4 RAM, and an NVIDIA Tesla P100 GPU.
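The learning-rate rule can be sketched in plain Python as follows. The sketch assumes that "reduced by 1/10" means the rate is multiplied by 0.1 after 10 consecutive epochs without improvement in validation loss; the class name `ReduceOnPlateau` is hypothetical. TensorFlow ships a comparable callback, `tf.keras.callbacks.ReduceLROnPlateau`, although the text does not say whether it was used.

```python
class ReduceOnPlateau:
    """Multiply the learning rate by `factor` after `patience` epochs without improvement."""

    def __init__(self, lr=0.001, factor=0.1, patience=10):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.best = float("inf")
        self.wait = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best, self.wait = val_loss, 0     # improvement: reset the counter
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr *= self.factor             # assumed meaning of "reduced by 1/10"
                self.wait = 0
        return self.lr


scheduler = ReduceOnPlateau()
# Worst case for illustration: the validation loss never improves after epoch 1.
history = [scheduler.step(1.0) for _ in range(70)]
```

In this worst case the rate stays at 0.001 for the first ten epochs and drops by a factor of ten at epoch 11 and every ten epochs thereafter.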
The detailed results of our method for each disturbance type under different noise environments are listed in Table 5. As shown in Table 5, the accuracies under the 20 dB, 30 dB, 40 dB, and noise-free (pure) environments are 97.75%, 99.34%, 99.37%, and 99.34%, respectively, which averages to 98.95%. The accuracy on high-noise (20 dB) data is lower than in the low-noise and noise-free environments (30 dB, 40 dB, and pure).

| Analysis of experimental results with model comparison
To demonstrate the advantages of the fused CNN-LSTM model, we compare it with three advanced neural network architectures that do not use handcrafted feature selection methods. The details of the comparison methods are introduced as follows.

| CNN-LSTM
A hybrid architecture was adopted by Mohan et al. 36 It contains two convolutional layers with 64 and 128 filters and a stride of 3. One LSTM layer with 50 cells follows the convolutional layers, and an FC layer outputs the PQD type.

| Multifusion CNN
The MFCNN model 24 uses the raw PQD signal and its FFT as inputs. Within the convolutional layers, the time- and frequency-domain features are merged into one layer. The combined features are then passed through three dense layers to classify the disturbance types.
Considering the PQD signal length in our dataset, directly implementing the above models may lead to unsatisfactory results. Thus, to compare the model performance fairly, we modified some model parameters and configurations based on our experience. The details of the above-mentioned models for PQD classification are shown in Tables 6 and 7. Figure 7 shows the performance of these algorithms after hyperparameter tuning. It clearly shows that the proposed model outperforms the other models under different noise environments. Deep CNN and CNN-LSTM are more accurate than MFCNN: their average accuracy is about 98.5%, whereas MFCNN achieved the lowest result, 97.38%. The performance of the different methods under the 30 dB, 40 dB, and pure conditions is similar. Under the 20 dB noise environment, however, the accuracy of deep CNN, CNN-LSTM, and fused CNN-LSTM decreased by 1.5%, while that of MFCNN decreased by more than 2.5% to 95.5%, which indicates that MFCNN is more susceptible to noise.

| COMPARISON OF METHODS ON SIMULATED SYSTEM
The proposed method was further validated on a typical distributed system. Figure 8 shows the detailed distributed system, which consists of two wind generators, one photovoltaic generator, one synchronous generator, a high-voltage network, and different load types. The 0.6 kV distribution network is connected to the 115 kV high-voltage network through a step-up transformer. The fundamental and sampling frequencies are the same as in the synthetic data configuration.
TABLE 6 The structures of compared models. Columns: Models; Layers (not including max-pooling layers). Entries include Deep CNN (1D convolutional layer: 9).

The different types of PQD signals were obtained by manually creating various faults, special operations, and fault durations, such as line faults, nonlinear loads, and so forth. Because the voltage signals are monitored and recorded at locations with different voltage levels, the PQD amplitudes were converted to per unit according to the respective bus voltage. Typical PQD voltage waveforms from the simulated system are shown in Figure 9. The three waveforms in different colors in Figure 9A-D represent three-phase fault waveforms, and Figure 9E-H shows single-phase disturbance waveforms. Table 8 summarizes the disturbance identification results of the different methods on the simulated data. As shown in Table 8, the average accuracy of fused CNN-LSTM is 98.89%, which is better than deep CNN, CNN-LSTM, and MFCNN. The deep CNN, CNN-LSTM, and MFCNN models cannot accurately recognize flicker disturbances, whereas the proposed method achieves 90% accuracy on them. The confusion matrices in Figure 10A-D show the predicted results of the four neural network models.
As shown in Figure 10A, the deep CNN model cannot accurately identify the flicker and harmonics disturbances. Of the 50 flicker signals, 40 are detected correctly and 10 are misclassified as flicker with swell. Among the 100 harmonics signals, only 34 are identified correctly. The rest are classified as flicker with harmonics, oscillatory transient, sag with harmonics, or swell with harmonics, which means that deep CNN cannot precisely identify harmonics disturbances. The confusion matrix of CNN-LSTM in Figure 10B shows that most PQD data are classified correctly except for the flicker and normal types. From the 50 [...] Figure 10C shows that the MFCNN model cannot distinguish flicker from its multiple disturbances: only four flicker disturbances are classified correctly. In contrast, the recognition result in Figure 10D demonstrates that the proposed model achieved the best performance.

FIGURE 8 The simulated distribution system.
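Per-class accuracies such as those quoted for the deep CNN can be read off the rows of a confusion matrix. The sketch below uses only the counts stated in the text; the function name `per_class_accuracy` and the "other" bucket, which lumps together the 66 misclassified harmonics signals whose exact split is not given, are illustrative assumptions.

```python
def per_class_accuracy(confusion):
    """confusion[true][predicted] -> count; returns the fraction correct per true class."""
    return {
        true: row.get(true, 0) / sum(row.values())
        for true, row in confusion.items()
    }


# Deep CNN counts quoted in the text (Figure 10A).
confusion = {
    "flicker": {"flicker": 40, "flicker with swell": 10},
    "harmonics": {"harmonics": 34, "other": 66},
}
accuracy = per_class_accuracy(confusion)
print(accuracy)
```

For these counts the deep CNN identifies 80% of the flicker signals and 34% of the harmonics signals correctly.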

| CONCLUSION
This paper proposes a fused CNN-LSTM to identify PQD types using time- and frequency-domain features. The original PQD signal and its frequency data are processed by CNN-LSTM individually. The extracted time- and frequency-domain features are then fused into one layer. Finally, two FC layers and a softmax classifier are used to predict the disturbance category. In particular, the fusion of time and frequency features enhances the accuracy of PQD detection. A comparison of classification time and accuracy with other methods on synthetic and simulated data demonstrates that the proposed method is more suitable for practical applications, including noisy environments. Future work will focus on improving classification accuracy in noisy environments and adding more complex PQD types.