Feature‐clustering‐based single‐line‐to‐ground fault section location using auto‐encoder and fuzzy C‐means clustering in resonant grounding distribution systems

Correspondence
Mou-Fa Guo, Department of Electric Power Engineering, Fuzhou University, Fuzhou 350108, China. Email: gmf@fzu.edu.cn
Duan-Yu Chen, Department of Electrical Engineering, Yuan Ze University, Chung Li 32003, Taiwan. Email: dychen@saturn.yzu.edu.tw

Abstract
Many sensors, such as digital fault indicators (DFIs), have been applied and promoted in distribution systems. These sensors provide a technical means for single-line-to-ground (SLG) fault section location, but problems of feature extraction and fault diagnosis remain. A novel SLG fault section location method utilizing auto-encoder (AE) and fuzzy C-means (FCM) clustering is presented in this work. Taking advantage of the abundant information provided by DFIs, salient features are extracted by the AE network, in contrast to artificially designed features that rely on prior knowledge. Compared with learning-based methods that require massive training data, the proposed method only requires the data from one SLG fault. By applying the AE network to the zero-sequence currents measured by DFIs, the salient features for SLG fault section location are obtained. Through feature classification by FCM clustering, without setting a threshold, the positional relationship between each detection node and the fault point is distinguished to locate the fault section. Experiments that take abnormal DFI communication into account show that the proposed method works effectively under various fault conditions.

the SLG fault section location, which is an essential aspect of enhancing self-healing and automated fault management systems [5].
Existing fault section location methods can be roughly divided into several categories, such as the traveling wave method [6], the injection signal method [7], and the impedance-based method. Each has its advantages and disadvantages. The method in [6] measures the traveling wave to locate the fault position precisely. Traveling wave methods can be sorted into single-ended and double-ended methods, both of which require dedicated devices to measure the traveling waves. For single-ended methods, wave-head detection is difficult because of the complex network structure with multiple branches. For double-ended methods, even a slight measurement error affects the time synchronization. The injection signal method [7] uses specially designed equipment to inject a characteristic signal into the distribution system and usually works offline to avoid affecting regular operation. Both of these methods need new devices, with high investment, installed in primary substations. Some researchers have focused on impedance-based techniques [8,9], which generally calculate the equivalent impedance to locate the fault point based on the voltage and current measured at the primary substation. The method in [8] uses the bus impedance matrix to express the substation voltage and current as a function of the fault location and fault resistance (FR) to yield the fault point. In [9], which uses high-frequency transient information, the fault location is identified by comparing the calculated reactance between the fault point and the critical node with the known line parameters. However, impedance-based methods can rarely resolve multiple fault sections or fault points with the same electrical distance in industrial applications.
Although the above techniques use information from the primary substation for SLG fault section location, their accuracy and speed are affected by the limited availability of data, because lines in urban distribution systems have complicated laterals and differing line parameters [10]. Therefore, in addition to the information from the primary substation, more distribution system data is conducive to SLG fault section location.
Recently, many sensors, such as digital fault indicators (DFIs), have been widely installed in distribution systems owing to their low cost [11][12][13]. Although the traveling wave-based method and the injection-based method can work well during an SLG fault, both have limitations. The traveling wave-based method requires a high sampling rate and is difficult to adapt to the multi-branch structure of distribution systems. The injection-based method needs additional signal injection equipment, and the cooperation of the detection equipment on the line is required to locate the fault section. In theory, the more detection nodes (DNs) are installed on a line, the higher the accuracy of fault section location. However, it is neither economical nor necessary to install too many DNs on each line, since the cost of customer service outages must be balanced against the device investment cost [14]. DFIs with an optimal configuration reduce operating costs and service interruptions by identifying the fault section. The electrical quantities recorded at the DNs provide abundant information from which fault section location can benefit. In general, sensors such as DFIs indicate the line section through which the fault current flows, depending on whether the fault current exceeds a threshold [15]. Because a single criterion leaves blind zones and causes misjudgements, two methods are fused based on fuzzy theory to realize fault section location in [16], which requires manual feature extraction and threshold setting. In [11], the fault section is identified by constructing a relationship matrix between possible fault locations and line currents, which may be affected by incomplete information due to communication loss. Threshold-based methods cannot provide satisfactory results under the various conditions of industrial applications.
Therefore, extracting useful features from redundant information for SLG fault section location is the primary problem for further research.
Many methods [17,18] use the transient signals of an SLG fault to extract features manually through signal processing technologies. The artificial features are then used to establish the judgment criterion for fault diagnosis. In [18], the dynamic time warping (DTW) distance is calculated as an artificial feature, measuring the similarity of the zero-sequence currents from two adjacent DNs. The fault section is located depending on whether the pre-set DTW threshold is reached. Although a suitable threshold can be determined from massive data on a simulation platform, high accuracy is hard to achieve in the complicated industrial environment. The data-driven techniques [19][20][21][22] extract features adaptively and realize fault location simultaneously. In [20], the fault point is predicted by a trained support vector machine, which maps low-dimensional data into a high-dimensional space through a kernel function to separate the samples. An SLG fault indicator is proposed in [21], estimating the SLG fault using a k-nearest neighbour regression model. Because the transient zero-sequence current waveforms of the fault line and the sound lines differ, continuous wavelet transformation and a convolutional neural network (CNN) were adopted in [22] to extract useful features and locate the fault line simultaneously, an approach that could also be used for SLG fault section location when combined with sensors such as DFIs. Even so, a large number of samples with hand-crafted labels are needed for model construction, and the high computational cost has become a bottleneck of such methods. Besides, it is difficult for electric utilities to collect enough SLG fault data to build a suitable sample library for training models in practical applications.
The fault line detection method in [23] uses the correlation between the outlet zero-sequence currents of the fault line and the sound lines, an idea that can also be applied to SLG fault section location. When an SLG fault occurs in a distribution system, the transient zero-sequence currents at the two ends of a sound line section are almost the same, whereas those at the two ends of the fault line section are quite different. DFIs with communication interfaces can transfer electric quantities to the control centre, by which the zero-sequence currents at different detection nodes (DNs) can be measured. However, the communication unit of a DFI often suffers from blocked channels, lost packets, and retransmission, and thus cannot cooperate with the control centre in real time for better fault location performance [24].
To overcome these drawbacks, we propose a novel fault section location method using auto-encoder (AE) and fuzzy C-means (FCM) clustering in this work. Using the abundant information provided by DFIs, the AE network extracts useful features from the fault signal adaptively, unlike the artificially designed features of conventional methods. The proposed method only requires data from the current SLG fault event, thereby [...]

The remainder of this work is organized as follows. Section 2 presents an analysis of the SLG fault characteristics and the proposed method, including feature extraction and fault classification. Section 3 provides a simulation model based on PSCAD/EMTDC and verification using simulation data and on-site recorded data. In Section 4, the proposed algorithm is compared with other existing methods to assess its performance. Finally, the conclusion is given in Section 5.

Figure 1 shows a resonant grounding distribution system with lines divided into sections by DNs, where $r_L$ and $L$ are the equivalent resistance and inductance of the Peterson coil. Supposing that an SLG fault occurs at the m-th section between DN$_m$ and DN$_{m+1}$ of the line $L_j$, the zero-sequence network is depicted in Figure 2.

PRINCIPLE AND METHOD
After an SLG fault occurs, the zero-sequence current at the fault point consists of the capacitance currents and the inductive current through the Peterson coil. In the transient process after the fault occurs, the zero-sequence virtual voltage source at the fault point can be considered to charge the Peterson coil, while the grounding capacitance is in a state of rapid impulse discharge. At this time, the frequencies of the transient capacitance current and the transient inductor current are very different, so the two cannot compensate each other. The transient process generally lasts for several cycles, during which the role of the Peterson coil can be ignored [25]. Figure 2 shows the zero-sequence network of the fault line, which neglects the Peterson coil. $u_f$ and $R_f$ indicate the virtual source at the fault point and the FR, respectively. $G_{jk}$ and $B_{jk}$ are the ground conductance and ground admittance of the kth section. $i_{0jm}$ is the zero-sequence current measured from DN k in the mth section of the line $L_j$. $i_{0i}$ ($i \neq j$) is the zero-sequence current measured at the bus outlet of a sound line, that is, the leakage current generated by the total ground conductance and ground admittance of the line $L_i$. $\sum i_{0i}$ is the sum of the zero-sequence currents flowing through the sound lines.

The zero-sequence current measured from the DNs downstream of the fault point is determined mainly by the capacitance currents. The zero-sequence current measured from the DNs upstream of the fault point depends on the capacitance currents in the downstream line and the leakage currents in all sound lines, as shown in Figure 2. It follows that the zero-sequence currents of the upstream DNs differ from those of the downstream DNs, whereas the transient zero-sequence currents of DNs located on the same side of the fault point are almost the same. This characteristic of the SLG fault is the basic principle used to determine the fault section in this work.

The flow chart of the proposed method is shown in Figure 3. When an SLG fault occurs, the zero-sequence currents at all DNs are recorded by DFIs. The AE network is established to extract the features of the normalized transient zero-sequence currents automatically. Feature classification by FCM clustering, without a set threshold, then divides the DNs into two groups to determine whether each DN is downstream or upstream of the fault point. Because the amplitude of the zero-sequence current upstream of the fault point is larger than that downstream, the positional relationship between the DNs and the fault point can be obtained and used to locate the section in which the SLG fault occurs.

Measurement and pre-processing
When an SLG fault occurs in resonant grounding distribution systems, the zero-sequence voltage $U_0$ is adopted as the triggering criterion, which is met when $U_0$ exceeds 15% of the phase voltage. Then, the transient zero-sequence currents at all DNs are recorded by sensors such as DFIs. The first half-cycle waveforms of the transient zero-sequence currents are taken for normalization. Because the transient zero-sequence currents differ between DNs in actual distribution systems, they should be normalized as [...]
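The normalization formula itself is not reproduced in this text. As a minimal sketch of the pre-processing steps described above, the following assumes peak-absolute-value scaling (chosen here because it preserves polarity), a 50 Hz power frequency, and the 10 kHz sampling rate used later in the simulation test. The function names are hypothetical and are not taken from the original work.

```python
def is_fault_triggered(u0, phase_voltage, ratio=0.15):
    """Triggering criterion: zero-sequence voltage above 15% of the phase voltage."""
    return u0 > ratio * phase_voltage


def first_half_cycle(samples, fs_hz=10_000, f_hz=50):
    """Keep the first half power-frequency cycle, assuming index 0 is fault inception."""
    n = int(fs_hz / f_hz / 2)
    return samples[:n]


def normalize(samples):
    """Scale a waveform into [-1, 1] by its peak absolute value (assumed scheme)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)
    return [s / peak for s in samples]
```

With these assumptions, each DN contributes a 100-point normalized waveform to the AE network, and the sign of every sample is unchanged by the scaling.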

Feature extraction with AE
It is challenging to extract compelling features from the SLG fault to establish a suitable judgment criterion for fault section location. Generally speaking, existing methods take a single characteristic quantity of the fault signals, such as amplitude/polarity, energy, or waveform correlation coefficient, as an artificial feature on which to build their criterion. Methods based on supervised learning algorithms can extract the features of fault signals adaptively and realize intelligent classification for fault diagnosis simultaneously. However, they require many samples to train the model, which is impractical because on-site recorded data is hard to obtain. Therefore, AE, an unsupervised learning algorithm, is used to learn the fault signal features and may be an effective solution for fault diagnosis.

As shown in Figure 4, the basic structure of the AE network is composed of an encoder and a decoder. The input layer and output layer have the same dimension, and the network is trained so that the reconstructed output approaches the original input as closely as possible. Meanwhile, the hidden layer is smaller than these two layers in order to reduce the dimension of the input data. The output of the hidden layer is regarded as the compressed representation of the input data [26]. The encoder maps the original signal $x = [x_1, x_2, \ldots, x_n]^T$ from the input layer into the hidden layer, and the output of the hidden layer is

$$h = f(wx + b)$$

Then, the decoder uses the hidden vector $h$ to get the reconstructed signal

$$\hat{x} = g(w'h + b')$$

where $w$, $w'$ and $b$, $b'$ are the weight coefficients and bias coefficients of the AE network in the encoder and decoder processes, respectively. The weights and biases of the network are adjusted through repeated forward passes and backpropagation. $f(\cdot)$ and $g(\cdot)$ are the activation functions of the encoder and decoder, for which the tanh function is used instead of the common sigmoid or ReLU functions, whose non-negative outputs cannot reflect the polarity of the transient zero-sequence currents.
Training of the AE network terminates when the number of epochs exceeds 100 or when the output of the loss function in (4) falls below the threshold of $10^{-3}$. In (4), $E^{(j)}$ represents the loss of the jth training sample, measuring the difference between the original signal $x^{(j)}$ and the reconstructed signal $\hat{x}^{(j)}$, and $N$ is the number of training samples. The second term $D$ is the L2 regularization, and $\lambda$ is the weight attenuation coefficient, set to [...]
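The training procedure can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the loss is taken to be the mean over samples of the squared reconstruction error plus $\lambda$ times the sum of squared weights, the value `lam=1e-4` is hypothetical because the coefficient is not given in this text, and the hidden size, learning rate, and use of plain batch gradient descent are likewise assumptions. The tanh activations and the two stopping conditions (100 epochs, loss below $10^{-3}$) follow the description above.

```python
import math
import random


def train_autoencoder(samples, hidden=2, lam=1e-4, lr=0.05,
                      max_epochs=100, tol=1e-3, seed=0):
    """Train a one-hidden-layer tanh auto-encoder by batch gradient descent."""
    rng = random.Random(seed)
    n_in, n_s = len(samples[0]), len(samples)
    # Encoder weights w (hidden x n_in), decoder weights w2 (n_in x hidden).
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(hidden)]
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(n_in)]
    b, b2 = [0.0] * hidden, [0.0] * n_in

    def encode(x):
        return [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + bi)
                for row, bi in zip(w, b)]

    def decode(h):
        return [math.tanh(sum(wi * hi for wi, hi in zip(row, h)) + bi)
                for row, bi in zip(w2, b2)]

    history = []
    for _ in range(max_epochs):
        # Gradients start from the assumed L2 penalty lam * sum(weights ** 2).
        gw = [[2 * lam * v for v in row] for row in w]
        gw2 = [[2 * lam * v for v in row] for row in w2]
        gb, gb2 = [0.0] * hidden, [0.0] * n_in
        loss = lam * (sum(v * v for r in w for v in r)
                      + sum(v * v for r in w2 for v in r))
        for x in samples:
            h = encode(x)
            y = decode(h)
            loss += sum((yi - xi) ** 2 for yi, xi in zip(y, x)) / n_s
            # Back-propagate through the decoder tanh, then the encoder tanh.
            dy = [2 * (yi - xi) * (1 - yi * yi) / n_s for yi, xi in zip(y, x)]
            dh = [sum(dy[i] * w2[i][k] for i in range(n_in)) * (1 - h[k] ** 2)
                  for k in range(hidden)]
            for i in range(n_in):
                gb2[i] += dy[i]
                for k in range(hidden):
                    gw2[i][k] += dy[i] * h[k]
            for k in range(hidden):
                gb[k] += dh[k]
                for i in range(n_in):
                    gw[k][i] += dh[k] * x[i]
        history.append(loss)
        if loss < tol:  # second termination condition
            break
        for k in range(hidden):
            b[k] -= lr * gb[k]
            for i in range(n_in):
                w[k][i] -= lr * gw[k][i]
        for i in range(n_in):
            b2[i] -= lr * gb2[i]
            for k in range(hidden):
                w2[i][k] -= lr * gw2[i][k]
    # The hidden-layer outputs are the extracted features, one vector per DN.
    return [encode(x) for x in samples], history
```

Here each training sample would be the normalized first half-cycle waveform of one DN from the current fault event, so the network is trained on that single event only, and the returned hidden-layer vectors are what would be passed to the clustering stage.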

Feature classification with FCM
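FCM clustering is a useful pattern recognition method that assigns each input sample to the cluster centres according to a membership matrix, realizing flexible fuzzy partitioning. Supposing $Z = [z_1, z_2, \ldots, z_n]$ is the set of $n$ samples, the membership degree of the jth sample in $Z$ to the ith class is represented by $u_{ij}$, which satisfies

$$\sum_{i=1}^{c} u_{ij} = 1, \quad j = 1, 2, \ldots, n \quad (5)$$

FCM minimizes the objective function

$$J = \sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{m}\,\lVert z_j - v_i \rVert^{2}$$

where $m$ is the fuzzy weighted index, set here as $m = 2$; $v_i$ is the cluster centre of the ith class; and $c$ is the number of classes, set here as $c = 2$. The calculation formula for $v_i$ and the iterative formula for $u_{ij}$ that minimize the objective function are

$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} z_j}{\sum_{j=1}^{n} u_{ij}^{m}}$$

$$u_{ij} = \left[\sum_{k=1}^{c}\left(\frac{\lVert z_j - v_i \rVert}{\lVert z_j - v_k \rVert}\right)^{2/(m-1)}\right]^{-1}$$

After iterative convergence, the optimal clustering centre $V = [v_i]$ and the membership matrix $U_{c \times n} = [u_{ij}]$ are obtained. The features extracted from the fault data are input into FCM clustering to achieve feature classification.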
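The alternating update of cluster centres and memberships can be sketched in a few lines. The following is a minimal standard-library illustration of the textbook FCM iteration with $c = 2$ and $m = 2$ as set above. The random initialization, the convergence tolerance, and the function name `fuzzy_c_means` are assumptions, not details from the original work.

```python
import math
import random


def fuzzy_c_means(points, c=2, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Cluster `points` (a list of equal-length lists) with fuzzy C-means."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalized so that each column sums to 1.
    u = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for j in range(n):
        s = sum(u[i][j] for i in range(c))
        for i in range(c):
            u[i][j] /= s
    centres = []
    for _ in range(max_iter):
        # Centre update: membership-weighted mean of the samples.
        centres = []
        for i in range(c):
            wts = [u[i][j] ** m for j in range(n)]
            total = sum(wts)
            centres.append([sum(wts[j] * points[j][d] for j in range(n)) / total
                            for d in range(dim)])
        # Membership update from the distances to every centre.
        shift = 0.0
        for j in range(n):
            dist = [math.dist(points[j], centres[i]) for i in range(c)]
            if min(dist) == 0.0:
                # A sample sitting exactly on a centre belongs fully to it.
                new = [1.0 if d == 0.0 else 0.0 for d in dist]
                s = sum(new)
                new = [v / s for v in new]
            else:
                new = [1.0 / sum((dist[i] / dist[k]) ** (2.0 / (m - 1.0))
                                 for k in range(c)) for i in range(c)]
            for i in range(c):
                shift = max(shift, abs(new[i] - u[i][j]))
                u[i][j] = new[i]
        if shift < tol:
            break
    return centres, u
```

Each column of the returned matrix `u` holds the membership degrees of one sample, so a DN can be assigned to the category with the larger degree without any user-set threshold.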

Fault section location based on AE and FCM
In engineering applications, the AE network is applied to the transient zero-sequence currents to extract useful features. Feature classification is then achieved by FCM clustering, which yields a membership matrix used to determine whether each DN is upstream or downstream of the fault point. The membership matrix is as follows:

$$U = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1N} \\ u_{21} & u_{22} & \cdots & u_{2N} \end{bmatrix}$$

where $u_{ij}$ indicates the membership degree with which DN$_j$ belongs to category $i$, and $N$ equals the number of DNs. Whether each DN belongs to category 1 or category 2 can be confirmed through the membership matrix $U$. FCM clustering divides the DNs into two categories, but by itself it cannot determine which category lies upstream and which downstream of the fault point. Because the amplitude of the transient zero-sequence current upstream of the fault point is higher than that downstream, the average amplitude of the transient zero-sequence current of each category is obtained by (9) to reveal which side each category represents, where $Ave(j)$ ($j = 1, 2, \ldots, c$) indicates the average amplitude of the zero-sequence current measured by the DNs grouped in the jth category, $x_{ik}$ ($i = 1, 2, \ldots, N$; $k = 1, 2, \ldots, n$) indicates the kth data point of the zero-sequence current of DN$_i$, $C_j$ represents the zero-sequence currents measured by the DNs grouped in the jth category, $|\cdot|$ means the absolute value operation, and $N_j$ indicates the total number of DNs grouped in the jth category.

In this work, using sensors such as DFIs installed in the distribution system, a method for SLG fault section location is presented that combines AE and FCM clustering. When an SLG fault occurs, the data of the transient zero-sequence currents are collected. After normalization, the first half-cycle waveforms of the transient zero-sequence currents are taken as the original input of the AE network. Through the encoder and the decoder, a reconstructed waveform is obtained that approximates the original waveform as closely as possible. Once the training termination condition is satisfied, the features of the transient zero-sequence currents are extracted from the hidden layer of the AE network. FCM clustering is then applied to the extracted features, which are sorted into two categories, and the clustering result shows the relationship between the fault point and the DNs, from which the fault section is located. The visualization process of the proposed method is shown in Figure 5. Given problems such as data shortage and communication failures in real environments, the proposed method can overcome the difficulty of model training that requires massive training data and [...]
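The last step, turning the membership matrix into a fault section, can be sketched as below. This is an illustration under several assumptions: the DNs are ordered from the bus toward the line end along a single radial path (branches are ignored), the average amplitude of a category is the mean of the absolute sample values over all DNs in that category (one plausible reading of (9), whose exact form is not reproduced here), and the function name and the 0-based index convention are hypothetical.

```python
def locate_fault_section(u, waveforms):
    """Return (upstream_flags, section) for DNs ordered from bus to line end.

    u         -- 2 x N membership matrix from FCM clustering
    waveforms -- N recorded (un-normalized) zero-sequence current waveforms
    """
    n_dn = len(waveforms)
    # Assign every DN to the category with the larger membership degree.
    labels = [0 if u[0][j] >= u[1][j] else 1 for j in range(n_dn)]

    def average_amplitude(cat):
        vals = [abs(v) for j in range(n_dn) if labels[j] == cat
                for v in waveforms[j]]
        return sum(vals) / len(vals) if vals else 0.0

    # The category with the larger average amplitude is taken as upstream.
    upstream_cat = 0 if average_amplitude(0) >= average_amplitude(1) else 1
    upstream = [lab == upstream_cat for lab in labels]
    # The fault lies just beyond the last DN flagged as upstream.
    last_up = max((j for j, flag in enumerate(upstream) if flag), default=-1)
    nxt = last_up + 1 if last_up + 1 < n_dn else None
    return upstream, (last_up, nxt)
```

For example, with four DNs of which the first two carry the larger currents, the function returns the section between indices 1 and 2 regardless of which row of `u` FCM happened to assign to the upstream group.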

Simulation test
A simulation model of a resonant grounding distribution system is established in PSCAD/EMTDC® to verify the proposed method, where $O_L$ indicates the length of the overhead line, $C_L$ the length of the cable line, and $R_f$ the FR. The parameters of the main transformer, overhead line, and cable line can be found in Tables 2 and 3. Moreover, the load of each line is represented by an equivalent impedance $Z$, set to 100 + j40 Ω. The compensation degree of the Peterson coil is initially set to 10%, and the corresponding equivalent inductance $L$ is 0.5265 H. Because the active loss of the Peterson coil is about 2.5%-5% of the inductive loss, 3% of the inductive loss is taken as the active loss for calculation [...] Figure 6. The sampling frequency is set to 10 kHz.

The SLG faults are set to occur on the line $L_1$ and are simulated under various conditions. Table 4 shows typical faults with different fault points, fault initial angles (FIAs), and FRs. The fault sections are all located correctly. Under each varied fault condition, for instance a randomly changing FR, a certain difference always remains between the transient zero-sequence current waveforms, and the AE network can extract adaptive features from the waveforms. Therefore, the proposed method is suitable for typical SLG faults with different FIAs and FRs.

Besides, the influence of noise is [...] Table 4 is taken as an example to illustrate the result in detail. After adding 5 dB of noise, the original and polluted waveforms of the transient zero-sequence current are shown in Figure 7(a) and (b), respectively. The extracted features of the transient zero-sequence current are shown in Figure 7(c). The clustering result based on FCM is shown in Figure 7(d), which gives good discrimination between the fault line section and the sound line sections. Tables 5 and 6 show the fault section location results for different voltage levels and compensation degrees. They show that fault line sections can be correctly identified over a wide range of voltage levels and compensation degrees.
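A noise test of this kind can be reproduced by polluting a recorded waveform with noise at a prescribed signal-to-noise ratio. The sketch below assumes zero-mean white Gaussian noise and defines the SNR from the mean power of the clean waveform. The noise model used in the original simulation is not specified in this text, and the function name `add_white_noise` is hypothetical.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Add zero-mean Gaussian noise whose expected power gives the target SNR in dB."""
    rng = random.Random(seed)
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = p_signal / (10 ** (snr_db / 10))
    sigma = math.sqrt(p_noise)
    return [s + rng.gauss(0.0, sigma) for s in signal]
```

Because the noise is random, the SNR realized on a short record only approximates the target value. Fixing the seed keeps a test case repeatable.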
With the rapid development of distribution systems, changes in topology and parameters should be considered. The topology changes are simulated by adding and deleting one line in the distribution system model, as illustrated in Table 7, which shows that the proposed method is suitable for different network structures.

The zero-sequence parameters of the system are changed to verify the robustness of the proposed method. When an SLG fault occurs at $f_2$, the FIA, FR, and SNR are 60°, 100 Ω, and 20 dB, respectively. In Case 1 of Table 8, the zero-sequence parameters are unchanged. In Case 2 and Case 3, the zero-sequence inductance of the system is set to 90% and 110% of the initial inductance. Meanwhile, the zero-sequence capacitance is set to 90% and 110% of the initial capacitance in Case 4 and Case 5. The results in Table 8 show that the location results are not affected when the zero-sequence parameters vary between 90% and 110% of their initial values.

The effect of different load types is evaluated in Table 9. In Case 1, all loads are three-phase balanced constant loads. In Case 2, the loads on the line $L_1$ are removed. In Case 3, the loads connected to the end of the first branch of the line $L_1$ are replaced with one-phase constant loads, and the loads connected to the end of the second branch are replaced with two-phase constant loads. As the location results in Table 9 show, the proposed method works correctly for different load types, even if unbalanced loads are connected to the system.

In engineering applications, SLG fault signals such as the zero-sequence currents are obtained from sensors through wired or wireless transmission. However, asynchronous sampling is unavoidable due to low communication quality. Table 10 shows two cases used to verify the proposed method. In Case 1, the signals from DN$_1$ and DN$_2$ lag the signals from DN$_3$ and DN$_4$ by five sampling points. In Case 2, the signals from DN$_1$ and DN$_2$ lead the signals from DN$_3$ and DN$_4$ by ten sampling points. The location results in Table 10 show that the proposed method can overcome the influence of slight asynchronous sampling and effectively locate the fault section.
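The asynchronous-sampling cases can be emulated by shifting some records by a whole number of samples before feature extraction. The helper below is a hypothetical sketch. Padding the vacated positions with zeros is an assumption, since the text does not say how the shifted records were completed.

```python
def shift_samples(signal, lag):
    """Delay (lag > 0) or advance (lag < 0) a record by whole samples, zero-padded."""
    n = len(signal)
    if lag >= 0:
        return ([0.0] * lag + list(signal))[:n]
    return (list(signal)[-lag:] + [0.0] * (-lag))[:n]
```

Under this convention, Case 1 corresponds to `shift_samples(x, 5)` applied to the records of the first two DNs, and Case 2 to `shift_samples(x, -10)`.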

On-site recorded data test
Nowadays, many DFIs are installed in real resonant grounding distribution systems in China. Once an SLG fault occurs, the fault signal is detected and measured by the detection unit. The communication unit then transfers the data to the control centre via general packet radio service (GPRS) [24]. A slight communication problem, asynchronous sampling, has been discussed in the simulation test. When the problem becomes worse, the fault signals of some DNs cannot be sent to the control centre at all. According to the on-site recorded data, missing data is the more common phenomenon. Therefore, the valid detection node is introduced to denote a DN whose DFI captures the fault signals and sends them to the control centre successfully. The range of the fault zone is expanded by this problem, but the fault section can still be located correctly by the proposed method.
The fault data of various SLG faults were collected in a real 10 kV distribution system in China. The structure of the system is shown in Figure 8(a). As shown in Figure 8(b), each DFI consists of a fault detection unit and a communication unit. Four cases are taken as examples to illustrate the robustness of the proposed method. In Case 1, the SLG fault occurs at the fault point $f_{12}$, and only eight EDNs send data back. Figure 9(a) shows the on-site recorded waveforms for Case 1. Figure 9 [...] Table 11. The results show that although the fault zone range has been expanded, the fault sections are still located correctly, indicating that the proposed method is effective on on-site recorded data.

COMPARISONS WITH OTHER METHODS
The proposed method based on AE and FCM adaptively extracts the essential features without manual construction and extraction. Besides, the AE is an unsupervised learning method, which is practical for scenarios with limited samples. FCM clustering also realizes the feature classification automatically, without human intervention. To demonstrate the reliability of the proposed method, we compare it with the two methods introduced in [27] and [18], which have been verified in distribution systems.

Method based on 1-D CNN
The authors in literature [27]

Method based on DTW
The authors in [18] adopted the DTW distance as a feature to locate the fault section. DTW uses dynamic programming to measure the difference between two time series, finding the optimal warping path that gives the smallest DTW distance between the two sequences. The DTW distance is taken as an artificially designed feature, and a judgment threshold is established based on massive simulation experiments.
When an SLG fault occurs, the transient zero-sequence currents measured at the DNs are used to calculate the DTW distances as the features that represent the line sections. The resulting matrix $DTW^*$ is illustrated as [...] where $d_{ij}$ indicates the DTW distance between two adjacent DNs such as DN$_i$ and DN$_j$. The line section $S_{ij}$ between DN$_i$ and DN$_j$ is determined to be a fault section if the matrix component exceeds the set threshold $D_{TW}^{set}$, which is generally taken as 0.4 based on massive simulation experiments in [18]. Because the DTW distance is extracted manually as an artificial feature relying on prior knowledge, and the threshold needs to be set based on simulation experiments, the method is difficult to adapt to all fault conditions, especially extreme ones.
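For comparison, the threshold test on DTW distances can be sketched as follows. This is the textbook dynamic-programming DTW with an absolute-difference cost. Any waveform scaling or distance normalization applied in [18] is not described in this text, so the default threshold of 0.4 is only meaningful under that method's own scaling. The function names and the 0-based section indices are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] is the smallest accumulated cost of aligning a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def faulty_sections(waveforms, threshold=0.4):
    """Flag section (i, i + 1) when adjacent DNs differ by more than `threshold`."""
    return [(i, i + 1) for i in range(len(waveforms) - 1)
            if dtw_distance(waveforms[i], waveforms[i + 1]) > threshold]
```

The sketch makes the dependence on the threshold explicit: the decision for every section changes with a single hand-set number, which is the sensitivity discussed above.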
The three SLG fault section location methods mentioned above are used to identify the fault section when an SLG fault occurs. Because the method based on 1-D CNN needs many samples to train the model, the training samples were provided by simulating SLG faults on the lines shown in Figure 8, except the line $L_1$, with different parameters, including FR, fault angle, and fault point. Due to limited space, the training process of the 1-D CNN is not shown here. Since the methods are mainly used to locate the fault section rather than the fault line, the simulation is mainly performed on the line $L_1$. The results of the three methods under regular conditions (e.g. different FRs, fault angles, and fault points) are shown in Table 12, where $\varphi$ is the accuracy of fault section location, that is, the ratio of the number of correctly located samples ($C_n$) to the total number of samples ($T_n$), defined as

$$\varphi = \frac{C_n}{T_n} \times 100\% \quad (11)$$
The comparison results on the simulation data confirm that the proposed method based on AE-FCM and the method based on 1-D CNN perform better than the method based on DTW, which directly calculates the similarity between fault signals. Because of the complexity of SLG faults, the harsh environment, and the variety of parameters, a conventional method with artificial features, such as the method based on DTW, cannot locate the fault section correctly in some special conditions. The proposed method and the method based on 1-D CNN both belong to machine learning. Although machine learning algorithms can extract fault features automatically, the method based on 1-D CNN requires a large amount of historical data to train the model. The accuracy of the model is related to the amount of training data: the more training data, the higher the model accuracy. In actual applications, the on-site recorded data may not be sufficient to train a model well. The same on-site recorded data as before is used to test the three methods, as illustrated in Table 13. Because the threshold $D_{TW}^{set}$ of the DTW-based method used in the simulation test cannot adapt to the on-site recorded data we collected, the threshold is set to 0.6 for the on-site recorded data, which is also a drawback of methods that select features based on experience. Table 12 shows that the proposed method performs better than the other comparison methods. The proposed method based on AE-FCM only needs the data of the current SLG fault event for self-learning and extracts useful features for discriminating the fault section. Therefore, the proposed method is more suitable and practical in engineering applications than the other two comparison methods.

CONCLUSION
A novel SLG fault section location method based on the AE network and FCM clustering has been proposed in this work. Our main contributions are threefold.