Intelligent diagnosis of cascaded H-bridge multilevel inverter combining sparse representation and deep convolutional neural networks

Effective fault diagnosis for the cascaded H-bridge multilevel inverter (CHMLI) can reduce the failure rate and prevent unscheduled shutdown. Nevertheless, traditional signal-based feature extraction and feature selection methods show poor distinguishability because the fault features available in a one-dimensional space are insufficient. Shallow learning models are prone to local extrema, slow convergence, and overfitting. To cope with these problems, a novel image-oriented fault diagnosis strategy based on sparse representation (SR) and deep convolutional neural network (DCNN) is proposed for the CHMLI. Initially, the Hilbert–Huang transform (HHT) is applied to obtain the HHT spectral images of the original monitoring signals, where these images comprehensively represent the features with detailed information of multiple domains on the time-frequency plane. Furthermore, an image fusion method based on the SR algorithm is employed on these spectral images of the same fault category to


INTRODUCTION
The cascaded H-bridge multilevel inverter (CHMLI) has been widely used in the fields of energy, transportation, communication, industrial manufacturing, and aerospace [1]. Compared to the traditional two-level inverter, the CHMLI possesses the following advantages: low voltage stress on the power switches, low harmonic content of the output voltage waveform, less switching loss, and high working efficiency. However, the number of power semiconductor devices in the multilevel inverter (MLI) circuit increases exponentially as the circuit structure becomes more complicated, and the probability of a power semiconductor device fault increases accordingly [2], which directly leads to abnormal working conditions or emergency shutdown. In this case, efficient and accurate fault diagnosis of the CHMLI is of utmost importance to ensure reliability and security and to avoid financial losses and casualties.
Generally, the faults of the CHMLI can be divided into short-circuit faults (SCFs) and open-circuit faults (OCFs) [3]. Because an SCF lasts an extremely short time yet has severe impacts on the CHMLI, hardware protection is adopted for SCFs [4]. More precisely, when an SCF happens, the fast-acting fuse disconnects and the SCF is converted into an OCF. Among the existing research on OCFs in MLI circuits, Hao et al. [5] utilised wavelet analysis to extract fault features and then applied a support vector machine (SVM) to diagnose OCFs for the CHMLI. In [6], Kuraku et al. proposed a diagnosis strategy based on probabilistic principal component analysis (PPCA) and SVM for controlled power semiconductor devices in a single-phase CHMLI. In [3], Wang et al. utilised principal component analysis (PCA) and the multiclass relevance vector machine (PCA-mRVM) to diagnose the OCFs of the CHMLI, where the fast Fourier transform (FFT) is used to extract fault features from the original monitoring signals, and PCA-mRVM is used to reduce the dimension of the high-dimensional features and classify the OCF samples. OCF diagnosis methods for the CHMLI are mainly divided into current-based and voltage-based methods. The current-based fault diagnosis method locates the fault by measuring the output current and applying coordinate transformation, signal processing, or pattern recognition technology. The commonly used methods include the current trajectory method [7], the average current Park's vector method [8], and the current pattern recognition method [9]. The selection of the fault signal is very important because it directly affects the result of fault diagnosis. The diagnosis time of the current-based method is usually one fundamental period, and the method is easily affected by noise, disturbance, and other factors.
Therefore, misdiagnosis can occur when the current signal is selected as the fault signal for the CHMLI, as in the current trajectory method and the average current Park's vector method [8]. By contrast, the output phase voltage signal is closely related to the fault type and location, and different OCFs yield different fault features. The voltage-based fault diagnosis method [10] combines the output voltage waveform of the CHMLI with the diagnosis model to locate the fault. The diagnosis time is relatively short, and the reliability is high. Therefore, this paper uses the output voltage as the fault diagnosis signal.
In general, the diagnosis approaches are roughly classified into model-based [11] approaches and data-driven [12][13][14][15][16][17] approaches. Section 2 comprehensively reviews the state-of-the-art fault diagnosis approaches for circuits in recent years, analysing the strengths and limitations of these approaches. However, there are several limitations in feature extraction and fault diagnosis of the CHMLI, which are mainly reflected in the following three aspects:
1. In the aspect of signal acquisition, the monitoring signals of the CHMLI are weak and susceptible to interference, and the signal-based feature extraction methods are unable to provide enough information in specific frequency bands. In addition, the power semiconductor devices in the CHMLI are closely coupled. The hybrid signal formed by the crosstalk of adjacent power semiconductor devices makes the original monitoring signal easy to mask or distort, resulting in a low signal-to-noise ratio.
2. In the aspect of fault feature extraction, the feature extraction result of the signal-based method is a high-dimensional matrix, which contains not only relevant features but also redundant information. If all the features are imported into a classifier directly without further processing, the computational complexity increases. The traditional signal-based feature extraction methods usually compress the amount of data by sampling or directly discard some signal details to generate medium- or small-scale feature datasets. However, many important fault features are lost in the process of feature dimensionality reduction.
3. In the aspect of fault diagnosis, the shallow learning models cannot reveal the complex inherent relationships between the root cause of faults and the signal signatures, and they often suffer from invalid learning and weak generalisation when learning with massive fault features.
With the increase of the number of power semiconductor devices and the complexity of the circuit, the data to be diagnosed in the long-term monitoring process is generally massive. Although the optimisation algorithms can improve the diagnosis accuracy, the computation cost and diagnosis time are increased.
Therefore, this paper proposes a novel image-oriented diagnosis strategy based on sparse representation (SR) and deep convolutional neural network (DCNN) to solve the above problems of feature extraction and fault diagnosis of the CHMLI. The main contributions of this paper are described as follows:
1. The time-frequency graph analysis realises the effective feature extraction of massive fault signals and overcomes the problems of time-frequency coupling of monitoring signals and poor separation of fault features. Specifically, the output voltage is converted into two-dimensional image features, highlighting the time-domain and frequency-domain features. Time-frequency joint analysis reveals the transient information distributed in each frequency band, so that local features can be extracted in both the time domain and the frequency domain and a fine feature characterisation of CHMLI faults is realised.
2. The SR algorithm is applied to fuse multiple time-frequency images of the same fault category. The fused images not only contain abundant fault features but also uncover the difference between samples of different fault categories, which improves the accuracy and robustness of fault classification. Additionally, the SR of the time-frequency graphs of the massive fault signals reduces the difficulty of training the DCNN for subsequent fault classification and reduces the required storage space.
3. This paper uses a variety of DCNN models to classify the fused images and identify the fault type, which achieves substantial improvements in diagnosis accuracy through the superior ability of CNNs to determine fault types and alleviate over-fitting problems.
The remainder of this paper is organised as follows. Section 2 reviews the related work of fault diagnosis methods. Section 3 describes the proposed method and theoretical techniques. Section 4 introduces the experimental platform of the CHMLI. Section 5 presents the experimental results of feature extraction, feature fusion, and fault diagnosis. The conclusion and future work are drawn in Section 6.

RELATED WORK OF FAULT DIAGNOSIS METHODS
The state-of-the-art model-based and data-driven fault diagnosis methods are reviewed in the following subsections.

Model-based approaches
Most model-based approaches [18] depend on empirical knowledge of the operation conditions, material characteristics, and failure mechanism to build mathematical models, among which the state estimation method, the parameter identification method, and the analytical model method are representative. The state estimation method [19] uses the mathematical model of the circuit and the observation signals to design a state observer and analyses the residuals between the observed value and the true value of the circuit to realise fault diagnosis. The parameter identification method [20] obtains physical or model parameters for fault diagnosis based on the known topological relationship, input, and output of the circuit. However, the mixed signal formed by intense noise and crosstalk from the other power semiconductor devices makes the original monitoring signal of the CHMLI relatively easy to distort, resulting in a low signal-to-noise ratio. The analytical model method [21] establishes the mathematical model of the circuit and analyses the residual change between the model and the measured data to realise fault diagnosis. Nevertheless, the analytical model method can hardly build systematic and precise mathematical models in practice owing to the uncertainty and sensor noise of complex systems. Under such circumstances, data-driven methods have emerged with the advantage that they can automatically infer causality hidden in the data and directly model the fault features of the CHMLI. Nevertheless, the effective construction of the feature set and the design of a high-accuracy prediction model are the two difficulties that hamper the popularity of data-driven methods [22].

Data-driven approaches
Data-driven approaches [23,24] [27]. Nevertheless, these time-domain methods are unable to provide sufficient information in specific frequency bands. For frequency-domain feature extraction, the features of the circuit are obtained by spectral analysis of the output signal via FFT [3], sweep frequency response analysis, or wavelet analysis [28,29]. In [30], the wavelet transform is used to decompose the three-phase output voltage of the inverter into high-frequency and low-frequency coefficients, and the square sum of the three-phase low-frequency coefficients is taken as the characteristic vector of the phase output voltage so as to carry out fault diagnosis for the inverter. However, frequency-domain methods discard the structure information embedded in the original monitoring signal. Under these circumstances, time-frequency analysis is proposed to solve the above problems. It represents how the frequency components of the signal evolve over time [24]. Hence, in this paper, the Hilbert-Huang transform (HHT) algorithm is utilised to transform the original monitoring signal into a spectral image and capture comprehensive features involving both time-domain and frequency-domain characteristics.

Feature selection
Feature selection methods apply suitable projections to map the feature matrices into a feature subspace that captures highly discriminative fault information. In [31], PCA is used to extract the fault features of the MLI circuit, and a BP neural network is used to realise OCF diagnosis. Thereafter, a variety of approaches, such as independent component analysis, kernel PCA, two-dimensional non-negative matrix factorisation, and two-directional two-dimensional linear discriminant analysis, have been implemented to increase the discrimination between different fault categories by further obtaining lower-dimensional feature vectors. However, fault information is lost during the dimensionality reduction process. For these reasons, unlike the above feature selection methods, this paper introduces a feature fusion algorithm based on SR to capture high-order relations between fault features and class labels by fusing the HHT spectral images of the same fault category.

Fault diagnosis
Many shallow learning models, such as artificial neural networks [32], SVM [5], mRVM [3], and the extreme learning machine [33], have been widely implemented in fault diagnosis. Moreover, various optimisation algorithms, such as genetic algorithm, quantum-behaved, chaos theory, particle swarm optimisation [27], and crow search algorithm [13], have been applied to optimise the hyper-parameters of the above shallow learning models. After that, deep learning models have emerged as an effective approach because of their powerful generalisation ability, which comes from learning the mapping relationship between the available fault features and the corresponding fault category. In recent years, several effective deep learning models have been applied in fault diagnosis, for example, the deep belief network (DBN) [34] and the sparse auto-encoder (SAE) [35]. For instance, Sun et al. [13] presented a novel DBN model optimised by the crow search algorithm to realise fault diagnosis for a DC-DC circuit. In [35], Long et al. investigated a new deep transfer learning method for fault classification, which is a supervised transfer learning method based on a three-layer SAE. In addition, the DCNN is emerging as a highly effective neural network architecture for fault diagnosis. In the existing literature, Ince et al. [36] applied a one-dimensional CNN to real-time fault diagnosis. DCNN models use convolution operations to extract features at different levels of the image from shallow to deep, and the entire network can automatically adjust the parameters of the convolution kernels, thereby generating the most suitable classification features without supervision. Compared to shallower networks, the DCNN has a strong ability to capture basic features. Therefore, in this paper, DCNN models are applied to classify the HHT spectral images and acquire accurate diagnosis results.
[Figure: The overview of the proposed image-based fault diagnosis method]

PROPOSED DIAGNOSTIC METHOD AND THEORETICAL TECHNIQUES
The proposed image-based fault diagnostic strategy is represented in Figure 1, which is described as follows: Step1: Data acquisition. The

Hilbert-Huang transform
The HHT algorithm, proposed by Huang et al. [37] at NASA, is useful for non-linear and non-stationary time-series analysis. The feature extraction methodology based on the HHT algorithm involves two steps: empirical mode decomposition (EMD) and Hilbert spectrum analysis (HSA).

Empirical mode decomposition
The EMD algorithm extracts a series of intrinsic mode function (IMF) components and a residual component according to the extreme points of the original signal. It decomposes the fluctuations or trends of different scales that are present in the signal step by step, from high frequency to low frequency. The new features are denoted as

$$x(t) = \sum_{j=1}^{n} C_j(t) + r(t) \qquad (1)$$

where $x(t)$ is the original signal, $C_j(t)$ represents the $j$th IMF component, $n$ is the number of IMF components, and $r(t)$ represents the residual component.
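The decomposition can be illustrated with a deliberately simplified sifting loop. This is only a sketch under stated assumptions: the envelopes are built with linear interpolation (`np.interp`) instead of the cubic splines normally used in EMD, the number of sifting iterations is fixed, and the function names `emd` and `_envelope_mean` are hypothetical. It is not the implementation used in the paper, but it does show that the IMF components and the residual add back up to the original signal.

```python
import numpy as np

def _envelope_mean(x):
    """Mean of the upper and lower envelopes, or None if there are too few extrema.

    ASSUMPTION: envelopes use linear interpolation; standard EMD uses cubic splines.
    """
    idx = np.arange(len(x))
    d = np.diff(x)
    # Interior local maxima/minima found from sign changes of the first difference.
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    if len(maxima) < 2 or len(minima) < 2:
        return None
    upper = np.interp(idx, maxima, x[maxima])
    lower = np.interp(idx, minima, x[minima])
    return 0.5 * (upper + lower)

def emd(x, max_imfs=6, sift_iters=10):
    """Return (imfs, residual); imfs are ordered from high to low frequency."""
    residual = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        if _envelope_mean(residual) is None:
            break  # residual is (nearly) monotonic: stop decomposing
        h = residual.copy()
        for _ in range(sift_iters):
            m = _envelope_mean(h)
            if m is None:
                break
            h = h - m  # sifting: remove the local mean
        imfs.append(h)
        residual = residual - h
    return np.array(imfs), residual

# Usage: a 50 Hz component plus a weaker 1 kHz component (illustrative signal only).
t = np.linspace(0.0, 0.04, 2000)
x = np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 1000 * t)
imfs, r = emd(x)
```

Because each extracted component is subtracted from the running residual, the sum of all components plus the final residual reproduces the input regardless of how the envelopes are interpolated.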

Hilbert spectrum analysis
HSA is performed on the extracted IMF components to explore the time-frequency distribution of the signal and to obtain the instantaneous frequency and the instantaneous amplitude of each IMF. For the IMF component $C_j(t)$ of Equation (1), the Hilbert transformation is represented as

$$H[C_j(t)] = \frac{1}{\pi} P \int_{-\infty}^{+\infty} \frac{C_j(\tau)}{t - \tau} \, d\tau$$

where $P$ is the Cauchy principal value. The amplitude function $a_j(t)$ and phase function $\varphi_j(t)$ are represented as

$$a_j(t) = \sqrt{C_j^2(t) + H^2[C_j(t)]}, \qquad \varphi_j(t) = \arctan \frac{H[C_j(t)]}{C_j(t)}$$

Hereafter, the instantaneous frequency of the IMF component, describing the fluctuation frequency at a certain moment, is represented as

$$\omega_j(t) = \frac{d\varphi_j(t)}{dt}$$

The values of the instantaneous frequency and the instantaneous amplitude depend on the time $t$. The signal amplitude in the time-frequency distribution $a_j(t) \sim \omega_j(t) \sim t$ is called the Hilbert time-frequency image.
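As a rough illustration of this step, the instantaneous amplitude and frequency of a single component can be computed from its analytic signal. The sketch below builds the analytic signal with an FFT (the usual discrete counterpart of the Hilbert transform) and reports the frequency in Hz, that is, the phase derivative divided by 2π. The function names are hypothetical, and the 20 µs sampling time and 50 Hz test tone are only borrowed from the experimental settings for the usage example.

```python
import numpy as np

def analytic_signal(c):
    """Analytic signal c(t) + j*H[c(t)] computed via the FFT (one-sided spectrum)."""
    n = len(c)
    spectrum = np.fft.fft(c)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC term
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist term
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def instantaneous_features(c, fs):
    """Return (amplitude a(t), frequency in Hz) of one IMF-like component."""
    z = analytic_signal(np.asarray(c, dtype=float))
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    frequency_hz = np.diff(phase) * fs / (2.0 * np.pi)  # one sample shorter than c
    return amplitude, frequency_hz

# Usage: one full period of a 50 Hz tone sampled every 20 microseconds.
fs = 50_000.0
t = np.arange(1000) / fs
amp, freq = instantaneous_features(0.8 * np.cos(2 * np.pi * 50 * t), fs)
```

Plotting the amplitude as colour over the (time, frequency) plane for every component would give an image of the kind described above.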

Sparse representation for image fusion
Because SR handles an image globally, it cannot be applied to image fusion directly. In the SR algorithm, the source image $I$ is therefore divided into small blocks, and each block is represented over the over-complete dictionary $D$. As shown in Figure 2, the $j$th patch is lexicographically ordered as a vector $v_j$, which is

$$v_j = \sum_{t=1}^{T} s_j(t) \, d_t$$

where $d_t$ represents an atom from a given over-complete dictionary $D = [d_1, d_2, \ldots, d_T]$ and $s_j = [s_j(1), s_j(2), \ldots, s_j(T)]^{\mathrm{T}}$ is the sparse coefficient vector. If the vectors of all the patches in an image $I$ are constituted into one matrix $V$, it is expressed as

$$V = [v_1, v_2, \ldots, v_J] = [d_1, d_2, \ldots, d_T] \, [s_1, s_2, \ldots, s_J]$$

where $S = [s_1, s_2, \ldots, s_J]$ and $J$ represents the number of image patches. Hereafter, matrix $V$ can be expressed as

$$V = DS$$

where $S$ is a sparse matrix.

Proposed fusion scheme
There are k registered HHT time-frequency images $I_1, I_2, \ldots, I_k$ with the size of M × N. As shown in Figure 3, the proposed fusion scheme based on the SR algorithm is described as follows:
Step1: The sliding window technique is used to divide each HHT time-frequency image $I_k$, from top-left to bottom-right, into patches of size n × n, that is, the size of the atom in the dictionary. Hence, the source images will be segmented into N patches, which are denoted as
Step2
Step4: With the "max-L 1 " rule, i m is fused to obtain the fused sparse vector The fused sparse coefficient of $V_F^i$ is obtained by
Step5: Eventually, the fused image is obtained from the fusion coefficients and dictionary reconstruction. The above process is iterated for all image patches in to obtain all the fused vectors . Each $V_F^i$ is reshaped into a patch $Z_F^i$, and $Z_F^i$ is substituted into its original position in $S_F$. As patches are overlapped, each pixel value in $S_F$ is averaged over its accumulation times.
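The patch-wise fusion can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the sparse coefficients of every source image are assumed to be already available (for example from a pursuit algorithm over the trained dictionary, which is not implemented here), the "max-L1" rule is read as "for each patch position, keep the coefficient vector with the largest L1 norm", and the function names are hypothetical.

```python
import numpy as np

def extract_patches(image, n, step=1):
    """Slide an n x n window from top-left to bottom-right; one column per patch."""
    M, N = image.shape
    cols = [image[r:r + n, c:c + n].reshape(-1)
            for r in range(0, M - n + 1, step)
            for c in range(0, N - n + 1, step)]
    return np.stack(cols, axis=1)                      # shape (n*n, J)

def fuse_max_l1(D, coeffs):
    """coeffs: list of k arrays of shape (T, J). Returns fused vectors and coefficients."""
    stack = np.stack(coeffs)                           # (k, T, J)
    l1 = np.abs(stack).sum(axis=1)                     # (k, J): L1 norm per patch
    winner = l1.argmax(axis=0)                         # (J,): source with largest L1 norm
    S_F = stack[winner, :, np.arange(stack.shape[2])].T  # (T, J)
    return D @ S_F, S_F

def reconstruct(V_F, shape, n, step=1):
    """Put fused patches back and average every pixel over its accumulation count."""
    M, N = shape
    acc = np.zeros(shape)
    count = np.zeros(shape)
    j = 0
    for r in range(0, M - n + 1, step):
        for c in range(0, N - n + 1, step):
            acc[r:r + n, c:c + n] += V_F[:, j].reshape(n, n)
            count[r:r + n, c:c + n] += 1
            j += 1
    return acc / count

# Usage with a toy identity dictionary, so each patch is its own coefficient vector
# (ASSUMPTION for illustration only; a trained over-complete dictionary would be used).
n = 2
A = np.full((4, 4), 2.0)
B = np.ones((4, 4))
D = np.eye(n * n)
S_A, S_B = extract_patches(A, n), extract_patches(B, n)
V_F, S_F = fuse_max_l1(D, [S_A, S_B])
fused = reconstruct(V_F, A.shape, n)
```

In this toy case every patch of A has the larger L1 norm, so the fused image equals A; with real time-frequency images the winning source would vary from patch to patch.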

Deep convolutional neural networks
Many famous CNN models have been proposed, for example, LeNet [38], AlexNet, ResNet, VGGNet, and GoogLeNet [39]. Table 1 compares the important parameters of LeNet, AlexNet, ResNet, VGGNet, and GoogLeNet. In 1998, LeCun et al. [38] in New York University developed a LeNet model that can recognise handwritten digits. This was the first time that a CNN model had been successfully applied in image recognition to solve practical problems. In the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), AlexNet achieved a top-five error rate of 15.3%, while the runner-up achieved a top-five error rate of 26.2%. Subsequently, more complex and better-performing DCNN models experienced explosive growth.
ResNet proposed a residual network that uses the difference between output and input for optimisation training. ResNet reduces the problem of gradient disappearance in deep neural networks by introducing the 'shortcut' module and identity connections.
[Figure: The cascaded H-bridge multilevel inverter experimental system setup]
GoogLeNet [39] won the ILSVRC competition in 2014, with VGGNet in second place. GoogLeNet proposed the concept of the 'Inception module', breaking the tradition of connecting CNN layers one by one. The main idea of Inception is to find the optimal local sparse structure of the network and approximate it with dense components. It can achieve effective dimensionality reduction, which allows the width and depth of the network to be increased under the same computing resources. Moreover, GoogLeNet can reduce the number of parameters that need to be trained and alleviate the problem of overfitting. In GoogLeNet, each Inception module consists of multiple parallel convolutional layers with sizes of 1 × 1, 3 × 3, and 5 × 5, and a max pooling layer, for the simultaneous extraction of different features.
The training and testing sets are inputted to the input layer of the above-mentioned DCNN models. Next, a useful feature is obtained from the feature extraction layer, which is composed of convolution and pooling layers. The input of the network is a 2D array data, which represents the spectra images that will be classified in DCNN models. At last, the softmax layer is used as an OCF classifier for the CHMLI. There are nine categories of faults, that is, healthy condition, S 11 , S 12 , S 13 , S 14 , S 21 , S 22 , S 23, and S 24 OCF.
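The final classification step can be illustrated with a small softmax example. It is only a sketch: the ordering of the nine categories follows the order in which they are listed above and is an assumption, the one-hot target encoding is the usual convention for a softmax classifier, and the names `CATEGORIES`, `softmax`, `one_hot`, and `classify` are hypothetical.

```python
import numpy as np

# ASSUMED ordering of the nine categories (index 0 = healthy condition).
CATEGORIES = ["healthy", "S11 OCF", "S12 OCF", "S13 OCF", "S14 OCF",
              "S21 OCF", "S22 OCF", "S23 OCF", "S24 OCF"]

def softmax(logits):
    """Numerically stable softmax over the last network layer's outputs."""
    z = np.asarray(logits, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def one_hot(index, n_classes=len(CATEGORIES)):
    """Target label vector with a single 1 at the category index."""
    label = np.zeros(n_classes, dtype=int)
    label[index] = 1
    return label

def classify(logits):
    """Return the predicted category name and the class probabilities."""
    probs = softmax(logits)
    return CATEGORIES[int(probs.argmax())], probs

# Usage: hypothetical network outputs in which the second class dominates.
logits = np.zeros(9)
logits[1] = 5.0
name, probs = classify(logits)
```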

Configuration of the fault diagnosis system
As shown in Figure 4, the experimental setup consists of an experimental circuit, a DC source, and an FPGA Xilinx SPARTAN-3E XC3S250E controller board. The CHMLI covers the dead-zone, driving, and main circuits. A 17N80C3 MOSFET module has been selected for the power switch transistors in the main circuit. The driver circuit consists of IR21844 integrated power modules. The fault diagnosis experimental parameters are listed in Table 2. The output DC voltage of the CHMLI is 50 V, and the amplitude modulation factor m a of the sinusoidal reference signal is 0.8. The load resistance R load is 10 kΩ, the sampling time T s is 20 µs, and the fundamental frequency is 50 Hz. Gaussian noise of 10% is added to the input data to test the performance of the proposed fault diagnosis technique. In this paper, a single-phase CHMLI is controlled via the phase-shift pulse-width modulation (PSPWM) technique [40]. PSPWM is a type of PWM control mode that is suitable for MLIs [41]. The considered parameters are listed in Table 3. For an l-level inverter, l−1 carriers have the same frequency f c and amplitude V c . A reference signal with amplitude V r and frequency f r has its zero centred in the middle of the triangular carrier set.
Sinusoidal and triangular waveforms are usually taken as the reference and carrier waveforms, and every module has the same reference signal. The amplitude and timing of the reference and carrier signals are determined by the amplitude modulation index m a and the frequency modulation index m f . The implementation process of the PSPWM technique for a CHMLI is illustrated in Figure 5. The reference is compared continuously with each of the carrier signals: the output voltage level is set high when the reference is greater than the carrier and low when the reference is lower than the carrier.
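The comparison rule can be sketched numerically for a two-bridge (five-level) CHMLI. The DC voltage, modulation index, fundamental frequency, and sampling time are taken from the text; everything else is an assumption made for illustration: the carrier frequency of 1 kHz, the unipolar comparison per bridge, the quarter-period carrier shift between the two bridges, and the very simplified S 11 open-circuit model in which bridge 1 can no longer contribute a positive output. The function names are hypothetical.

```python
import numpy as np

V_DC = 50        # DC source voltage per H-bridge in volts (from the text)
M_A = 0.8        # amplitude modulation index (from the text)
F_R = 50.0       # fundamental frequency in Hz (from the text)
T_S = 20e-6      # sampling time in seconds (from the text)
F_C = 1000.0     # carrier frequency in Hz: ASSUMED, not given in the text

def triangle(t, f_c, shift):
    """Unit triangular carrier in [-1, 1]; shift is a fraction of the carrier period."""
    frac = (f_c * t + shift) % 1.0
    return 2.0 * np.abs(2.0 * frac - 1.0) - 1.0

def bridge_state(ref, carrier):
    """Unipolar comparison: a leg is high when its reference exceeds the carrier."""
    leg_a = ref > carrier
    leg_b = -ref > carrier
    return leg_a.astype(int) - leg_b.astype(int)     # -1, 0 or +1

def chmli_output(n_samples=1000, s11_open=False):
    """Output voltage over n_samples (1000 samples = one fundamental period)."""
    t = np.arange(n_samples) * T_S
    ref = M_A * np.sin(2 * np.pi * F_R * t)
    b1 = bridge_state(ref, triangle(t, F_C, 0.0))
    b2 = bridge_state(ref, triangle(t, F_C, 0.25))   # ASSUMED quarter-period shift
    if s11_open:
        b1 = np.minimum(b1, 0)   # simplified fault model: no positive level from bridge 1
    return V_DC * (b1 + b2)

healthy = chmli_output()
faulty = chmli_output(s11_open=True)
```

Under these assumptions the healthy waveform visits the five levels from -100 V to +100 V, while the simplified fault case never rises above +50 V but still reaches -100 V.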
In an l-level MLI, the amplitude modulation index, m a , and the frequency modulation index, m f , are defined as
[Figure 5: The implementation process of the phase-shift pulse-width modulation technique]

Fault signal analysis
This paper focuses on OCF diagnosis in the CHMLI. Since multiple power semiconductor devices are unlikely to break down simultaneously, this paper only considers the fault of one power semiconductor device. Table 4 lists the fault modes, classification labels, and fault codes. More precisely, the classification label [0,1,0,0,0,0,0,0,0] T indicates that an OCF occurs at bridge-1 switch-1 (S 11 OCF). The output voltage of the CHMLI, which is obtained by the voltage measurement sensor, is used as the original monitoring signal. The output current signals of the CHMLI are independent of OCFs. Hence, the output voltage signal is taken as the original signal and input to the classifier after feature extraction. Figure 6(a) shows the output voltage signals of the CHMLI under the healthy condition. Figure 6(b) depicts the output voltage under the S 11 OCF, where no positive power is supplied from the source to the load through the power semiconductor device S 11 . When the OCF occurs in S 11 , it can be seen from the output voltage waveform that the power semiconductor devices of the second H-bridge are fault-free: the positive half-cycle has only one step, that is, +50 V, and the negative half-cycle has two steps, that is, -50 and -100 V. The output voltage signal under the S 13

Image-oriented feature extraction

Empirical mode decomposition
EMD is suitable for decomposing non-stationary and non-linear fault signals, and the IMF components can reveal the intrinsic and essential information of the fault features. Different IMF components contain different time scales, so the characteristics of the signal are displayed at different resolutions. An in-depth analysis of the IMF components can extract the fault feature information of the monitoring signals more accurately and effectively. Figures 7(a) and (b) show the IMF1-IMF6 components under the healthy condition and the S 11 OCF, which highlight the local features of the output voltage signals. The frequency of noise is usually much higher than that of the fundamental frequency component, so the average value of the instantaneous frequency of the noise is also much greater than the fundamental frequency. The IMF1 component is a high-frequency component. From the last noise IMF to the first fundamental frequency IMF, the instantaneous frequency drops sharply from high to low.

Hilbert spectrum analysis
The instantaneous amplitude and instantaneous frequency of the IMF components obtained by HSA can be used as fault features. Hereafter, all results are expressed as a distribution diagram of  Figure 8(a). By contrast, Figure 8(b) shows the instantaneous amplitude and instantaneous frequency of the IMF1 component for the S 11 OCF. There are substantial differences in the instantaneous amplitude and frequency between the healthy condition and the S 11 OCF. Because the OCF changes the amplitude, the envelope of the signal also changes, so the signal envelope carries more fault information. In this experiment, the instantaneous amplitude and the instantaneous frequency of the IMF1 component contain useful fault information. The HHT time-frequency images for the healthy condition and the S 11 OCF are shown in Figures 8(c) and (d). However, if only the HHT algorithm is used to extract the time-frequency images of each fault category, the discrimination between the HHT time-frequency images of the different fault categories is not significant.
To solve this problem, this paper uses an image fusion algorithm based on the SR algorithm to fuse the multiple images for the same fault category, and the fused image contains more fault features.

Feature fusion
In the SR algorithm, the over-complete dictionary D is trained by the K-SVD algorithm. During the training process, the length of the dictionary is set to 256, and the number of iterations is set to 1000. The block size is fixed to 8 × 8. Subsequently, the "max-L 1 " rule is used to obtain the fused sparse vector. As shown in Figures 9(a) to (i), the fused images are obtained under the healthy condition, S 11 OCF, S 13 OCF, S 21 OCF, S 23 OCF, S 14 OCF, S 12 OCF, S 24 OCF and S 22 OCF, respectively. Compared with the source images in Figure 8, the fault features of the fused images are more clearly distinguished. Ultimately, there are 1500 fused images for each fault category and a total of 13,500 fused images for the nine fault categories. As its depth and width increase, the DCNN model can extract features from more complex images. Its powerful feature recognition capability can then be fully exploited, which converts the problem of fault diagnosis into one of image classification. Before the fault diagnosis experiment, 1000 fused images are randomly selected as the training set for each fault category, and the remaining 500 fused images are used as the testing set. That is, the training set of the nine fault categories has a total of 9000 fused images, and the testing set has a total of 4500 fused images.
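The per-category random split described above can be sketched in a few lines. The image identifiers are placeholder integers and the function name and seed are hypothetical; only the counts (nine categories, 1500 images each, 1000 for training) come from the text.

```python
import random

def split_per_category(n_categories=9, per_category=1500, n_train=1000, seed=0):
    """Randomly pick n_train images per category for training; the rest are for testing."""
    rng = random.Random(seed)
    train, test = [], []
    for label in range(n_categories):
        ids = list(range(per_category))   # placeholder image indices for this category
        rng.shuffle(ids)
        train += [(label, i) for i in ids[:n_train]]
        test += [(label, i) for i in ids[n_train:]]
    return train, test

train_set, test_set = split_per_category()
print(len(train_set), len(test_set))
```

Splitting inside each category, rather than over the pooled 13,500 images, keeps the training and testing sets balanced across the nine fault categories.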

Experimental parameter configuration
The experimental platform is a server with Intel (R) Core (TM) i7-4790 CPU @ 3.60 GHz. The GPU is an NVIDIA GeForce GTX 750. The programming language is Python 3.

Diagnosis results of the proposed image-based methods
In this section, we applied different pre-trained DCNNs to train the fused images and achieve the intelligent classification of  Figure 10; all these DCNN models have realised high accuracy. Table 5 lists the accuracy, the loss value, the number of iterations, the calculation time for each iteration, and the iteration at which the verification accuracy first reaches its maximum value. Among these DCNN models, the GoogLeNet model realises the best classification performance. The GoogLeNet reaches the maximum accuracy in 160 iterations, with a validation accuracy of 100%. The ResNet-50 reaches the maximum accuracy in 50 iterations, with a validation accuracy of 100%. The VGG-19 reaches the maximum accuracy in 198 training iterations, with a validation accuracy of 95.14%. The average fault diagnosis accuracy of all DCNN models is 91.74%. The verification results of GoogLeNet and ResNet-152 are stable, but the LeNet-5 and AlexNet models perform poorly: their results are unstable and fluctuate strongly.
To highlight the advantages of feature fusion, the diagnosis result (after image fusion) for different DCNN models in terms of classification accuracy is shown in Figure 11. The specific verification results are listed in Table 6. Because the SR algorithm can fuse time-frequency images of the same fault, the fused image contains more fault features. It can be seen from Table 6 that after feature fusion, the diagnosis accuracy of different DCNN models is increased, which demonstrates that feature fusion based on the SR algorithm can improve the accuracy and efficiency of fault diagnosis.

Comparison of different DCNN models
A single comparison experiment cannot represent the classification accuracy of multi-label images. Therefore, to demonstrate the robustness of the proposed image-based diagnosis strategy, Figures 12(a)-(h) represent the diagnostic accuracy for LeNet-5,     Figure 13, and the plot consists of five numerical points: a minimum value (Min), a lower quartile (Q1), a median (Med), an upper quartile (Q3), and a maximum value (Max). The most significant advantage of the box plot is that the dispersion of the data can be described in a relatively stable way without being influenced by abnormal values. According to the results presented in Figure 13, the diagnosis results of LeNet-5 and AlexNet are the worst, with an accuracy of less than 90%. In contrast, the maximum values of ResNet-18 and ResNet-152 reach 100%. However, GoogLeNet's multiple diagnosis results are more stable than those of ResNet-18 and ResNet-152. The overall classification accuracy of GoogLeNet is more than 96%. Moreover, considering the basic network parameters in Table 1, GoogLeNet has fewer parameters than the other DCNN models, and its model size is also small. Consequently, we choose GoogLeNet as the comparative DCNN model for CHMLI circuit fault diagnosis in this paper.
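The five numerical points of a box plot can be computed directly from repeated accuracy measurements. The sketch below uses NumPy's default (linear-interpolation) percentile definition, which is an assumption since the paper does not state how its quartiles were computed, and the input values in the usage line are arbitrary numbers, not results from the paper.

```python
import numpy as np

def five_number_summary(values):
    """Minimum, lower quartile, median, upper quartile and maximum of repeated runs."""
    v = np.asarray(values, dtype=float)
    lower_q, median, upper_q = np.percentile(v, [25, 50, 75])
    return {
        "min": float(v.min()),
        "lower_quartile": float(lower_q),
        "median": float(median),
        "upper_quartile": float(upper_q),
        "max": float(v.max()),
    }

# Usage with arbitrary illustrative numbers (NOT accuracies reported in the paper).
summary = five_number_summary([1, 2, 3, 4, 5])
```

Because the quartiles and the median depend on rank rather than magnitude, a single abnormal run shifts them far less than it would shift a mean.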

Comparison with signal-based shallow learning model
Several traditional signal-based shallow learning models (back propagation (BP) neural network in [42], SVM in, and mRVM in [3]) are utilised for fault diagnosis of the CHMLI to demonstrate the superiority of the proposed image-based diagnosis strategy.  is the kernel function and is the kernel parameter for several methods. c is the penalty factor of SVMs, CL is the predetermined limit for PCA. The Gaussian kernel function is used in mRVM, and the range of used parameters is (0, 1). The mRVM kernel parameter of the fault diagnosis model is initialised to 0.5. The parameter configuration for these methods is shown in Table 7.
According to Table 8, the achieved results show that the proposed image-based methods in this paper are superior to the FFT-PCA-BP, FFT-PCA-SVM, and FFT-PCA-mRVM approaches.
Among these signal-based shallow learning models, the FFT-PCA-mRVM method achieves the best classification accuracy. Among the above-mentioned DCNN models, the GoogLeNet realises the best classification performance. Hence, the fault diagnosis results based on FFT-PCA-mRVM and HHT-SR-GoogLeNet are plotted in Figures 14 and 15. As shown in Figures 14 and 15, the red mark indicates a diagnosis error. The abscissa represents the true fault category. For instance, the fault codes between 0-1 belong to fault category 0, namely, the healthy condition. The fault codes between 1-2 belong to fault category 1, namely, the S 11 OCF. For FFT-PCA-mRVM, the mean diagnosis accuracy of all fault categories is 90% (405 of 450; 45 errors). More precisely, the classification accuracy for FFT-PCA-mRVM under healthy  For HHT-SR-GoogLeNet, the mean diagnosis accuracy of all fault categories is 99.73% (4488 of 4500; 12 errors). The classification accuracies for HHT-SR-GoogLeNet under the healthy condition and S 11 , S 12 , S 13 , S 14 , S 21 , S 22 , S 23 , S 24 OCFs are 100% (500 of 500, zero errors), 99.4% (497 of 500, three errors), 100% (500 of 500, zero errors), 99.8% (499 of 500, one error), 99.6% (498 of 500, two errors), 99.8% (499 of 500, one error), 100% (500 of 500, zero errors), 99% (495 of 500, five errors), 100% (500 of 500, zero errors), respectively. In addition to the classification accuracy, the number of training samples is an important measure for evaluating the proposed scheme. Our proposed method outperforms the other methods on a large set of fault feature data and realises higher accuracy.
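The reported HHT-SR-GoogLeNet figures can be cross-checked with a few lines of arithmetic. The error counts and the 500 test images per category are taken from the text above; the variable names are hypothetical.

```python
# Misclassified test images per category, as reported for HHT-SR-GoogLeNet.
errors = {"healthy": 0, "S11": 3, "S12": 0, "S13": 1, "S14": 2,
          "S21": 1, "S22": 0, "S23": 5, "S24": 0}
TEST_PER_CATEGORY = 500

# Per-category accuracy: correctly classified images divided by test images.
per_class_accuracy = {name: (TEST_PER_CATEGORY - e) / TEST_PER_CATEGORY
                      for name, e in errors.items()}

total_images = TEST_PER_CATEGORY * len(errors)
total_errors = sum(errors.values())
overall_accuracy = (total_images - total_errors) / total_images

print(total_errors, total_images - total_errors, round(100 * overall_accuracy, 2))
```

The per-category counts add up to the stated totals, so the overall accuracy equals the mean of the nine per-category accuracies because every category has the same number of test images.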

CONCLUSION
This work has proposed an image-oriented diagnosis scheme based on the SR algorithm and DCNN for the CHMLI. The conclusions of this study are summarised as follows: (1) the fault feature extraction experimental results demonstrate that time-frequency joint analysis can extract the localised features in both the time and frequency domains; (2) the feature fusion experimental results reveal that the SR algorithm fuses multiple time-frequency images of the same fault category to obtain more fault feature data, which can enhance fault features and improve the accuracy of fault diagnosis; (3) the experimental results of the fault diagnosis demonstrate that the image classification method can effectively and accurately classify various fault categories.
The results of the comparison study demonstrate that the proposed image-oriented diagnosis strategy has high potential in the field of data-driven fault diagnosis.