Stockwell-transform and random-forest based double-terminal fault diagnosis method for offshore wind farm transmission line

Because locating faults on the short-distance transmission lines of a deep-sea offshore wind farm (DOWF) is difficult and time-consuming, this paper proposes a novel double-terminal fault location method using Stockwell-transform (ST) and random-forest (RF). Once the fault type and fault branch are accurately determined, the fault position on the transmission line is located. Firstly, the Stockwell-transform is employed to extract fault eigenvalues from the collected wind turbine (WT) current signals, which reduces the sensitivity of the eigenvalues to noise, and the Pearson correlation coefficient (PCC) is introduced to remove redundant eigenvalues. Secondly, the retained fault eigenvalues are taken as inputs to separate random-forest models to classify the fault type and identify the fault branch, respectively. Finally, the double-terminal fault location principle is established in the fault negative sequence network (only ABCG uses positive sequence components). The Newton-Raphson method (NRM) is used to eliminate the influence of asynchronous data, which yields an accurate transmission line fault location for the deep-sea offshore wind farm. More than 4000 fault cases obtained by Simulink simulation verify the feasibility and performance of the proposed method. The results show that the


INTRODUCTION
Deep-sea wind energy is a rapidly growing renewable energy source owing to its abundant reserves [1][2][3]. A wind farm is a multi-branch transmission system with a complex topology and short-distance transmission lines. The transmission lines of onshore wind farms are mainly overhead lines [4], for which manual inspection is relatively easy. A deep-sea offshore wind farm (DOWF), however, mainly uses submarine cables [5]. Owing to harsh sea conditions and the geological environment, submarine cables are vulnerable to damage from ocean currents and seabed organisms [6]. Because of the poor maintenance environment, manual inspection of a submarine cable is more costly and more time-consuming than that of an overhead line. The traditional approach is to replace the whole faulty branch, which makes maintenance expensive. When the fault location is accurately determined, only the section of line near the fault needs to be repaired, which reduces the cost of manual inspection [7][8][9]. Therefore, fast and accurate fault location for the deep-sea offshore wind farm transmission line (DWTL) plays a very important role in the safe operation of the power grid.
In recent research, methods for diagnosing transmission line faults fall mainly into three classes: impedance-based [10][11][12], traveling-wave-based [13][14][15] and learning-algorithm-based methods [16][17][18][19][20][21]. The impedance method generally uses single-terminal or double-terminal voltage and current information to extract the fault impedance and establish a fault criterion. However, the location accuracy largely depends on the fault type, fault inception angle (A f ) and fault resistance (R f ). Although the relationship between fault distance and the extreme points of the system impedance spectrum proposed in [12] removes the influence of R f , A f and multi-branch systems are still not considered. Traveling wave methods determine the fault location from the time difference between the first traveling wave and the first reflected wave, or between the first wave fronts at the two terminals. A traveling wave method based on wavelet coefficients is proposed in [13] to improve the location accuracy, and a method based on homogeneous transmission lines is proposed in [14]. However, the influence of A f , R f and measurement noise (MN) on the performance of these methods is not considered, and no basis for judging the fault type and fault branch is given. Fault branch identification and fault location are carried out successfully in [15]. However, the above traveling-wave-based methods can only locate faults effectively on long-distance transmission lines. The smart grid is the future development trend, and learning-algorithm-based methods have been widely adopted in the field of fault diagnosis. These intelligent algorithms do not interfere with the action of relay protection, such as the circuit breaker. On the contrary, combining them with relay protection greatly improves the intelligence of the power grid.
The wavelet transform is often combined with various machine learning algorithms, including ANN, FLS, SVM and ELM, to locate faults in transmission lines [16][17][18][19]. However, the wavelet transform usually loses the phase information when it decomposes the fault signal, which reduces accuracy. A combination of an adaptive convolutional neural network and double-terminal fault location is proposed in [20], but A f is not considered and the original method cannot be adapted to the fault location of DWTLs. Reference [21] combines ST and particle swarm optimization with different machine learning algorithms to identify the fault branch and locate the fault. However, its accuracy for fault branch identification in DOWFs is limited by the short-distance transmission lines. Although reference [22] successfully locates faults on the transmission line of an onshore wind farm, the bus is ignored and regarded as having no impedance. DWTLs are buried deep in the seabed, where the working environment is worse and, as mentioned earlier, the failure rate is high. Therefore, the bus is also studied by the proposed method.
It can be seen from the above analysis that the main factors restricting diagnostic performance include the fault inception angle (A f ), fault resistance (R f ), fault location (L f ), measurement noise (MN) and the length of the transmission lines. All of the above methods concentrate on transmission lines of the distribution grid and pay little attention to wind farms. Meanwhile, most of them cannot be directly applied to DOWF for the following reasons:
• Comparing line lengths in distribution networks and wind farms shows that most wind farms are composed of short-distance transmission lines. The shortest transmission line in distribution network research is not less than 6.7 km [21,23,24], whereas in wind farm studies the transmission lines are about 1 km long [22]. At the same time, the closer two fault points are, the more similar their fault signals are. When faults occur in two adjacent branches, the closer the fault location is to the common node, the more similar the fault signal characteristics are [25,26]. Because the transmission lines in wind farms are short, the fault signals are more similar, so the accuracy of the existing methods will be slightly reduced when they are directly applied to DOWF.
• The distribution network is mainly composed of the power grid, transmission lines, transformers and loads. A wind farm is mainly composed of the power grid, transmission lines, transformers and doubly-fed wind turbines. The biggest difference between the two systems therefore lies in the load and the wind turbine. When the system fails, the crowbar and internal control strategy of the wind turbine can restrain the change of voltage and current. Therefore, the existing fault methods cannot be directly applied to DOWF fault research [27,28].
Therefore, a novel double-terminal fault location method based on ST and RF (STRF) is proposed to classify the fault type, identify the fault branch and locate the fault point on short-distance DWTLs, with the above shortcomings taken into account. The proposed method does not conflict with traditional relay protection and will not affect the circuit breaker or other line protection methods. On the contrary, combining the proposed method with relay protection benefits the operation and maintenance of offshore wind farms: it increases the reliability of line fault protection and reduces operation and maintenance time. The proposed method starts by using ST to extract fault eigenvalues from the faulty current signals collected from different branches of the DOWF. The fault eigenvalues are then taken as the input of a random-forest (RF) to classify the DWTL fault type. After the fault type is determined, the fault eigenvalues are sent to the corresponding RF to identify the fault branch. On this basis, the fault is located with the double-terminal fault location principle, combined with Kirchhoff's voltage law (KVL) and the Newton-Raphson method (NRM).
The main contribution is a novel double-terminal fault location method based on STRF for the short-distance transmission lines of DOWF. The accuracy of DWTL fault classification, branch identification and location is clearly improved. The influence of the uncertain A f , R f and L f is addressed by combining ST with RF. ST itself has good noise resistance, which suppresses the influence of MN on the fault eigenvalues. NRM is used to eliminate the asynchrony between the two terminals. Satisfactory performance is maintained even for different transmission line lengths.
This paper is organized as follows: A novel fault location method based on STRF is proposed in Section 2. In Section 3, four DWTL fault characteristic matrices and the selection process of fault eigenvalues are given. In Section 4, the fault classification and branch identification method based on STRF and the principle of double-terminal fault location by positive and negative sequence components are proposed. The effectiveness and competitiveness of the method are demonstrated by simulation results in Section 5. Section 6 concludes this paper.

PROPOSED DOUBLE-TERMINAL FAULT LOCATION METHOD BASED ON STRF
The flow chart of the STRF-based double-terminal fault location for the deep-sea offshore wind turbine transmission line is shown in Figure 1. The accuracy of fault classification is determined by the quality of the fault characteristics, and high-precision fault classification is the fundamental requirement of fault branch identification. Meanwhile, accurate fault branch identification provides the information needed for double-terminal fault location. In this paper, a double-terminal fault location method based on STRF is proposed for the short-distance DWTLs of DOWF. The proposed method can be divided into three parts: eigenvalue selection, fault classification and branch identification, and double-terminal fault location.
In the first part (eigenvalue selection), the WT currents (i a , i b and i c ) are measured dynamically. To select useful eigenvalues, i a , i b and i c are transformed by ST, which produces four fault eigenvalue matrices (C stock , S row , S column and P stock ). Different statistical methods are applied to C stock , S row , S column and P stock to generate the fault eigenvalues, and the correlation between eigenvalues is obtained with the Pearson correlation coefficient (PCC). MN can seriously distort the changing trend of the fault current; because ST is robust to MN, the accuracy under noise is maintained. By retaining only the weakly correlated eigenvalues, the amount and complexity of the fault data are reduced without affecting the accuracy.
In the second part (fault classification and branch identification), the retained fault classification eigenvalues are fed to the fault classification RF model to obtain the fault type of the DWTL. Ten branch identification RF models are built, one for each of the 10 fault types. Based on the classified fault type, the fault eigenvalues are then fed to the corresponding fault branch identification RF model to identify the fault area. High-precision fault classification is the basis of branch identification and effectively improves the accuracy of branch identification for DOWF.
In the third part (double-terminal fault location), the fault type has been classified and the fault branch identified by the first two parts. Although fault classification, fault branch identification and fault location can be regarded as three independent parts, the result of fault classification underpins the high accuracy of fault branch identification, and the results of both serve the double-terminal fault location principle. The fault voltage and current at the two terminals of the faulty DWTL are extracted and converted into sequence components, which are then fed to the fault location algorithm. The fault voltage equation is obtained by applying KVL to the DWTL so that the fault can be located accurately. However, asynchronous data seriously affect the accuracy of fault location, so a synchronization angle is introduced and solved with the NRM.
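The conversion of three-phase phasors into sequence components can be illustrated with the standard symmetrical-component (Fortescue) transform. The following is a minimal sketch rather than the authors' implementation; the function name `sequence_components` and the example phasors are assumptions made for illustration.

```python
import cmath

# Fortescue operator a = 1 at an angle of 120 degrees
A_OP = cmath.exp(2j * cmath.pi / 3)

def sequence_components(xa, xb, xc):
    """Return (zero, positive, negative) sequence phasors of a three-phase set."""
    x0 = (xa + xb + xc) / 3
    x1 = (xa + A_OP * xb + A_OP**2 * xc) / 3
    x2 = (xa + A_OP**2 * xb + A_OP * xc) / 3
    return x0, x1, x2

# Hypothetical balanced (symmetrical) three-phase set, phase order a-b-c
va, vb, vc = 1 + 0j, A_OP**2, A_OP
v0, v1, v2 = sequence_components(va, vb, vc)
```

For this balanced set only the positive sequence phasor is non-zero, which is consistent with using positive sequence quantities for the symmetrical ABCG fault.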

Stockwell-transform based fault matrices generation
Finding good fault type eigenvalues is the key to accurate fault classification. The time-domain characteristic of a signal describes how its strength changes with time, and the frequency-domain characteristic describes the single-frequency components from which it is synthesized. Time-domain signals can be transformed into the frequency domain by the Fourier transform (FT) to analyse harmonics [29]. However, the discrete, continuous and short-time Fourier transforms cannot provide good frequency resolution and good time resolution simultaneously [30,31], because a fixed-width sampling window cannot offer better time-frequency resolution. The wavelet transform resolves the resolution problem. However, MN is inevitable in practice, and the wavelet transform is sensitive to MN [32]. Meanwhile, the wavelet transform loses the phase information of non-stationary signals [32].
The extraction of fault eigenvalues is the main factor affecting the accuracy of the model, and feature extraction is a very important part of machine learning. In order to extract effective fault eigenvalues, the ST is introduced. The ST combines the STFT and the wavelet transform and retains the advantages of both [33]. The ST effectively preserves the phase information and improves the robustness to MN [27]. In the proposed method, the ST is defined by
$$S(\alpha, f) = \int_{-\infty}^{\infty} g(t)\,\frac{|f|}{\sqrt{2\pi}}\, e^{-\frac{(\alpha - t)^{2} f^{2}}{2}}\, e^{-i 2\pi f t}\, dt \tag{1}$$
where f is frequency; α and t are time variables; g(t) is the input signal.
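The Fourier spectrum is obtained by averaging the local spectrum over time [33]. G(f) is the Fourier transform of g(t), which can be written as:
$$G(f) = \int_{-\infty}^{\infty} S(\alpha, f)\, d\alpha \tag{2}$$
so g(t) can be extracted from Equation (1). g(t) can be written as
$$g(t) = \int_{-\infty}^{\infty} \left\{ \int_{-\infty}^{\infty} S(\alpha, f)\, d\alpha \right\} e^{i 2\pi f t}\, df \tag{3}$$
Then Equation (1) can be written as operations on the Fourier spectrum G(f) of g(t):
$$S(\alpha, f) = \int_{-\infty}^{\infty} G(\beta + f)\, e^{-\frac{2\pi^{2}\beta^{2}}{f^{2}}}\, e^{i 2\pi \beta \alpha}\, d\beta \tag{4}$$
where β is the frequency-shift variable of integration and f ≠ 0. The discrete analogue of Equation (4) is used to compute the discrete ST. In this paper, the fault current signal g[nT] is a discrete-time series. Then the discrete Fourier transform is defined as [33,34]:
$$G\left[\frac{n}{MT}\right] = \frac{1}{M} \sum_{m=0}^{M-1} g[mT]\, e^{-\frac{i 2\pi n m}{M}} \tag{5}$$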
where n, m = 0, 1, 2, …, M-1 and M is the number of samples. The fault current data extracted in the proposed method cover the cycle before and the cycle after the fault occurrence time, that is, M = 335; T is the sampling period, which is 0.0001 s. By considering Equations (4), (5) and (6), the Stockwell transform of the fault current signal g[nT] is written as [33,34]:
$$S\left[jT, \frac{n}{MT}\right] = \sum_{m=0}^{M-1} G\left[\frac{m+n}{MT}\right] e^{-\frac{2\pi^{2} m^{2}}{n^{2}}}\, e^{\frac{i 2\pi m j}{M}}, \quad n \neq 0 \tag{7}$$
where j = 0, 1, 2, …, M-1 is the time index, and the zero-frequency voice (n = 0) can be written as:
$$S[jT, 0] = \frac{1}{M} \sum_{m=0}^{M-1} g[mT] \tag{8}$$
By considering Equations (7) and (8), a fault matrix is generated from the DWTL fault current through the Stockwell transformation. The rows and columns of the fault matrix contain the frequency and time information of the fault current, respectively. The fault matrix is usually known as the ST matrix, which is a 168 × 335 matrix.
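The discrete ST described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the authors' code: it follows the frequency-domain formulation of [33], wraps the Gaussian window symmetrically about zero frequency (a common implementation choice), and keeps only the frequency indices 0 to M//2, which gives M//2 + 1 = 168 rows for M = 335. The function names and the short test signal are hypothetical, and the direct summation is only practical for short records.

```python
import cmath
import math

def dft(g):
    """Normalised DFT: G[n] = (1/M) * sum_m g[m] * exp(-i*2*pi*n*m/M)."""
    M = len(g)
    return [sum(g[m] * cmath.exp(-2j * math.pi * n * m / M) for m in range(M)) / M
            for n in range(M)]

def stockwell(g):
    """Discrete ST; rows are frequency indices 0..M//2, columns are time indices."""
    M = len(g)
    G = dft(g)
    rows = [[complex(sum(g) / M)] * M]          # zero-frequency voice: signal mean
    for n in range(1, M // 2 + 1):
        # Gaussian window in the frequency domain, wrapped to be symmetric about 0
        w = [math.exp(-2 * math.pi**2 * min(m, M - m)**2 / n**2) for m in range(M)]
        shifted = [G[(m + n) % M] * w[m] for m in range(M)]
        rows.append([sum(shifted[m] * cmath.exp(2j * math.pi * m * j / M)
                         for m in range(M))
                     for j in range(M)])
    return rows

M = 32
signal = [math.cos(2 * math.pi * 4 * k / M) for k in range(M)]  # 4 cycles per record
S = stockwell(signal)
magnitude = [[abs(v) for v in row] for row in S]
```

For this test tone the largest magnitudes lie in row 4, and the time average of each row reproduces the corresponding Fourier coefficient, which is the averaging property stated for the Fourier spectrum.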
The C matrix is an indirect fault matrix that is used to generate the fault characteristic matrices; it is calculated as described in [27]. In this paper, four fault eigenvalue matrices (C stock , S row , S column and P stock ) are generated from the C matrix or the ST matrix. The meaning and size of the fault eigenvalue matrices are shown in Table 1.

Pearson correlation coefficient based fault eigenvalues selection
Different statistical methods are applied to the four fault eigenvalue matrices in the proposed method. Standard statistics such as the standard deviation, maximum value, minimum value and kurtosis are widely used to detect distribution network ground faults [34,35]. Kurtosis, mean value, amplitude and entropy are used for the analysis of power quality transients [36]. Seven of these statistics are applied to the C stock , S row , S column and P stock matrices in order to find distinctive fault eigenvalues: kurtosis, mean value, minimum value, maximum value, standard deviation, deviation and entropy. The process of generating the unscreened eigenvalues from the C stock matrix is shown in Table 2.
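As an illustration of how seven statistics can be computed from one matrix, the sketch below operates on a flattened list of values. The exact definitions used in Table 2 are not reproduced here, so several choices are assumptions: "deviation" is taken as the population variance, kurtosis as the ratio of the fourth central moment to the squared variance, and entropy as the Shannon entropy of the normalised absolute values. The function name `matrix_statistics` is hypothetical.

```python
import math

def matrix_statistics(values):
    """Seven summary statistics of a flattened matrix (definitions are assumptions)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n        # population variance
    std = math.sqrt(variance)
    m4 = sum((v - mean) ** 4 for v in values) / n
    kurtosis = m4 / variance**2 if variance > 0 else 0.0       # Pearson kurtosis
    total = sum(abs(v) for v in values)
    probs = [abs(v) / total for v in values if v != 0] if total > 0 else []
    entropy = -sum(p * math.log(p) for p in probs)             # Shannon entropy (nats)
    return {"kurtosis": kurtosis, "mean": mean, "min": min(values),
            "max": max(values), "std": std, "variance": variance,
            "entropy": entropy}

stats = matrix_statistics([1.0, 2.0, 3.0, 4.0])
```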
The unscreened eigenvalues of the S row , S column and P stock matrices are generated in the same way as those of the C stock matrix. The fault eigenvalues of S row , S column and P stock are numbered T8-T14, T15-T21 and T22-T28, respectively. Thus, 28 eigenvalues are generated from each phase after a DWTL fault occurs at the deep-sea wind farm. More eigenvalues are not necessarily better: redundant eigenvalues lead to useless calculation and increase the running time of the model. PCC is therefore introduced to reduce the number of eigenvalues while maintaining the accuracy. PCC is defined as:
$$\mathrm{PCC}(T_a, T_b) = \frac{\sum_{j=1}^{c} \left(T_{aj} - \bar{T}_a\right)\left(T_{bj} - \bar{T}_b\right)}{\sqrt{\sum_{j=1}^{c} \left(T_{aj} - \bar{T}_a\right)^{2}}\, \sqrt{\sum_{j=1}^{c} \left(T_{bj} - \bar{T}_b\right)^{2}}}$$
where c is the length of the eigenvalue; T a and T b are two different fault eigenvalues of DWTL for DOWF; $\bar{T}_a = \sum_{j=1}^{c} T_{aj}/c$, and $\bar{T}_b$ is defined in the same way. PCC is the most commonly used linear correlation coefficient and reflects the correlation between two fault eigenvalues well [37]. PCC ranges from −1 to +1: −1 indicates a complete negative correlation between the two fault eigenvalues, +1 indicates a complete positive correlation, and 0 means there is no linear correlation between them.
100 A-phase ground (AG) fault cases are randomly generated, resulting in a 28 × 100 fault matrix. The fault eigenvalues are fed into Equation (12) in columns to calculate PCC. It is worth mentioning that some fault eigenvalues are constant (T1, T15 and T22) or zero (T7, T14, T21 and T28) in all DWTL fault cases. These unchanged fault eigenvalues are removed by the proposed method since they carry no information for DWTL fault classification. The PCC between all remaining eigenvalues is shown in Figure 2. The trend of some fault eigenvalues over the 100 fault cases is shown in Figure 3. As shown in Figures 2 and 3, some fault eigenvalues behave similarly when a DWTL fault occurs at DOWF. Deleting eigenvalues with high PCC therefore reduces the number of eigenvalues without reducing the accuracy of DWTL fault classification. Table 3 summarizes the basis for the above fault eigenvalue selection. Among eigenvalues whose PCC is greater than 0.95, only one is retained and the others are deleted. As can be concluded from Table 3, 11 eigenvalues are finally preserved: T2, T5, T6, T8, T9, T10, T12, T20, T23, T26 and T27. In this paper, two cycles of current data recorded around the DWTL fault time, namely the cycle before and the cycle after the fault, are sent into ST after a fault on the DWTL. ST generates the ST matrix and the four fault eigenvalue matrices (C stock , S row , S column and P stock ). The fault eigenvalues are calculated from C stock , S row , S column and P stock for each phase of the fault current data. A total of 3 × 11 = 33 eigenvalues are obtained at each deep-sea wind turbine side. By combining ST and PCC, the eigenvalues that clearly distinguish faults are extracted and a large number of redundant eigenvalues are discarded, which removes unnecessary calculation without affecting the accuracy. The fault eigenvalues are then sent to the RF model for DWTL fault classification.
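The screening procedure described above (drop eigenvalues that never change, then keep one representative of each highly correlated group) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the greedy scan order, the function names and the toy data are assumptions, while the 0.95 threshold is the one stated in the text.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def select_eigenvalues(features, threshold=0.95):
    """Return the indices of the retained features.

    features: list of per-feature value lists (one value per fault case).
    Constant features are dropped first; then, scanning in order, a feature is
    dropped if its PCC with an already retained feature exceeds the threshold.
    """
    varying = [i for i, f in enumerate(features) if max(f) != min(f)]
    kept = []
    for i in varying:
        if all(pearson(features[i], features[k]) <= threshold for k in kept):
            kept.append(i)
    return kept

# Toy data: four hypothetical features observed over four fault cases
features = [
    [1.0, 2.0, 3.0, 4.0],      # retained
    [2.0, 4.0, 6.0, 8.1],      # almost a scaled copy of feature 0, dropped
    [5.0, 5.0, 5.0, 5.0],      # constant in every case, dropped
    [1.0, -1.0, 1.0, -1.0],    # weakly correlated with feature 0, retained
]
kept = select_eigenvalues(features)
```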

Random-forest based fault classification
Machine learning and statistics are closely related fields [38]. Statistical inference is an important foundation of machine learning, and from methodological principles to theoretical tools, statistics has run through the development of machine learning [39]. The selection of a classification algorithm is the key point in constructing a fault classification model. Artificial neural networks (ANN) and support vector machines (SVM) are the main methods in the field of classification [40]. An ANN adjusts the input and output weights and biases to determine the mapping between input and type [35]. However, an ANN has many parameters to adjust, easily falls into local minima and cannot guarantee that the result is the global optimal solution. SVM is based on statistical theory and the structural risk minimization principle [21]. SVM overcomes the local minima and overfitting problems, but the adjustment space of the kernel is limited. RF is one of the algorithms distilled from statistical thinking during the development of machine learning [41]. RF is a parallel ensemble learning model that uses the decision tree (DT) as its classifier [42]. RF is widely used in classification because of its strong generalization ability, resistance to overfitting, simple model and good classification performance [42]. In the proposed method, the DTs are constructed with the CART algorithm, an improved algorithm based on ID3 and C4.5. CART has excellent anti-noise performance and is easy to understand and use, which makes it very suitable for real-time fault classification [27]. CART selects fault eigenvalues on the basis of the Gini coefficient, defined in Equation (11), where D is the total number of samples and P(d/I) is the probability that a randomly extracted sample belongs to class d when the test variable is I. Ten fault types are considered; therefore, I has 10 categories, which is I = 10 in Equation (11).
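To make the role of the Gini coefficient in CART concrete, the sketch below uses the standard Gini impurity (one minus the sum of squared class proportions) and searches one eigenvalue for the threshold that minimises the weighted impurity of the two child nodes. It is a textbook illustration under that assumption, not a reproduction of the paper's Equation (11); the function names and the toy values are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2 over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Threshold on one feature that minimises the weighted Gini of both children."""
    pairs = sorted(zip(values, labels))
    best = (float("inf"), None)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                      # no threshold between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[0]:
            best = (score, (pairs[i - 1][0] + pairs[i][0]) / 2)
    return best

# Toy eigenvalue that separates two fault types perfectly
score, threshold = best_threshold([1.0, 2.0, 10.0, 11.0], ["AG", "AG", "BG", "BG"])
```

In this toy case the split at 6.0 yields two pure child nodes, so the weighted impurity is zero.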
To construct the RF, Z fault training sets of equal size are drawn from the original fault eigenvalue samples by the bootstrap method. The RF model is composed of Z DT models, each generated by the CART algorithm from one training set. The final classification result is obtained by voting on the Z results. The value of Z is one of the factors that directly influence the RF accuracy. After a large number of experiments, it is found that RF maintains good performance when Z is set to 10 in this paper.
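The bootstrap and voting steps can be sketched independently of how each tree is trained. In the minimal sketch below, each "tree" is any callable that maps a feature vector to a label; the three stub trees, the stand-in sample list and the function names are hypothetical and only serve to show the mechanics.

```python
import random
from collections import Counter

def bootstrap_sets(samples, z, rng):
    """Draw z training sets, each the size of the original, with replacement."""
    n = len(samples)
    return [[samples[rng.randrange(n)] for _ in range(n)] for _ in range(z)]

def majority_vote(predictions):
    """The final class is the label predicted by the most trees."""
    return Counter(predictions).most_common(1)[0][0]

def forest_predict(trees, x):
    """Each tree is any callable mapping a feature vector to a class label."""
    return majority_vote([tree(x) for tree in trees])

rng = random.Random(0)
samples = list(range(20))                 # stand-in for labelled fault cases
training_sets = bootstrap_sets(samples, z=10, rng=rng)

# Three hypothetical stub "trees" to show the voting step
trees = [lambda x: "AG", lambda x: "BG", lambda x: "AG"]
result = forest_predict(trees, x=None)
```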
In order to illustrate the effectiveness and robustness of the proposed fault eigenvalues in fault classification, T2 and T6 are taken as examples, and the performance of the two fault eigenvalues under some fault types is shown in Figure 4. As shown in Figure 4:
• The three fault types (AG, BG and CG) can be clearly divided into two groups by fault eigenvalue T2. The first group includes AG and BG, and the second group includes CG. The two fault types (BG and CG) can be clearly divided into two groups by fault eigenvalue T12. The first group includes BG, and the second group includes CG. Therefore, the proposed fault eigenvalues can be used for accurate fault classification through RF.
• Comparing the performance of the eigenvalues in Figure 4 under different noise levels (0 dB and 20 dB) shows that, although some eigenvalues change significantly when the noise level is 20 dB, different fault types can still be clearly distinguished. Therefore, the influence of MN can be completely eliminated by the proposed RF based fault classification method.
• It is worth noting that because S-transform has good noise immunity, all fault eigenvalues perform well in most cases. However, as shown in Figure 4(b), the fault eigenvalue T6 inevitably changes slightly under the influence of noise. The judgment of RF is a top-down decision process, so once the fault eigenvalue T6 exceeds the interval value, it will lead to misjudgement.

Random-forest based fault branch identification
The DWTL of DOWF includes collecting lines and multiple wind turbine branch lines, which is a typical multi-branch structure. Because of this multi-branch structure, the accuracy of conventional fault branch identification methods cannot meet the requirements of DOWF. Besides, fault branch identification is made more difficult because the DWTLs are generally short.
Fault branch identification can also be treated as a classification problem. The fault current data of six wind turbines are recorded for fault branch identification in DOWF. Each wind turbine generates 33 fault eigenvalues, so 33 × 6 = 198 eigenvalues are generated for fault branch identification of DWTLs, which is D = 198 in Equation (11).
Once the fault type of the DWTL is determined, the fault branch can be identified. One RF fault branch identification model is established for each fault type in this paper, and the fault eigenvalues are sent to the RF corresponding to the classified fault type to identify the fault branch of DWTLs. In order to illustrate the advantages of the proposed method, a total of 10 branches are selected to distinguish the fault branch, which is I = 10 in Equation (11). After a large number of experiments, it is found that RF in fault branch identification maintains good performance when Z is set to 50 in this paper.
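The two-stage structure (classify the fault type first, then route the eigenvalues to the branch model trained for that type) can be sketched as a simple dispatch. The sketch makes several assumptions: the models are represented by arbitrary callables, both stages receive the same concatenated 198-element vector, and the stub models and labels are hypothetical placeholders rather than trained forests.

```python
def diagnose(features_per_turbine, type_model, branch_models):
    """Two-stage diagnosis: classify the fault type, then use the matching branch model.

    features_per_turbine: list of per-turbine eigenvalue lists (6 x 33 in the paper).
    type_model: callable returning a fault-type label.
    branch_models: dict mapping each fault type to a branch classifier.
    """
    x = [v for turbine in features_per_turbine for v in turbine]   # concatenate
    fault_type = type_model(x)
    branch = branch_models[fault_type](x)
    return fault_type, branch, len(x)

# Hypothetical stub models, used only to show the routing
features = [[0.0] * 33 for _ in range(6)]
type_model = lambda x: "AG"
branch_models = {"AG": lambda x: "L1", "BG": lambda x: "L2"}
fault_type, branch, n_features = diagnose(features, type_model, branch_models)
```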
In order to illustrate the effectiveness and robustness of the proposed fault eigenvalues in fault branch identification, T4 and T20 are taken as examples, and the performance of the two fault eigenvalues under some fault branches is shown in Figure 5. As shown in Figure 5:
• The four fault branches (L1, L5, L6 and L7) can be clearly divided into two groups by fault eigenvalue T4. The first group includes L1 and L7, and the second group includes L5 and L6. The two fault branches (L2 and L6) can be clearly divided into two groups by fault eigenvalue T11. The first group is L6, and the second group is L2. Therefore, the proposed fault eigenvalues can be used for accurate fault branch identification through RF.
• Comparing the performance of the eigenvalues in Figure 5 under different noise levels (0 dB and 20 dB) shows that, although some eigenvalues change significantly when the noise level is 20 dB, different fault branches can still be clearly distinguished. Therefore, the influence of MN can be completely eliminated by the proposed RF based fault branch identification method.
• As in fault classification, because S-transform has good noise immunity, all fault eigenvalues perform well in most cases. But once the fault eigenvalue T6 exceeds the interval value, it will lead to misjudgement of the fault branch.

Double-terminal fault location
Rapid and accurate fault location after a DWTL fault can reduce wind power curtailment, which greatly improves the development and utilization of wind power. To locate the fault quickly and accurately, double-terminal fault location is introduced. The schematic diagram of the double-terminal location is shown in Figure 6. The negative sequence network is often used for transmission line fault analysis. However, ABCG differs from the other fault types in that it is a symmetrical fault, and it is well known that ABCG only produces positive sequence components. Therefore, only in the analysis of ABCG is the positive sequence component used in place of the negative sequence component. The fault voltage and current obtained at the two terminals of the faulty line are converted into negative or positive sequence components. According to KVL, the voltage components V 1y and V 2y measured at nodes B1 and B2 can be determined as follows [43]:
$$V_{1y} = l\, Z_y I_{1y} + V_f \tag{12}$$
$$V_{2y} = (L - l)\, Z_y I_{2y} + V_f \tag{13}$$
where L is the total length of the DWTL; l is the fault distance from bus1 on the DWTL; Z y is the sequence impedance per unit length; I 1y is the current component measured at node B1; I 2y is the current component measured at node B2; y = 1, 2, where 1 represents positive sequence (ABCG) and 2 represents negative sequence (other fault types); V f is the fault voltage at the fault point. Based on Equations (12) and (13), l can be calculated by:
$$l = \frac{V_{1y} - V_{2y} + L\, Z_y I_{2y}}{Z_y \left(I_{1y} + I_{2y}\right)} \tag{14}$$
According to Equation (14), the fault point voltage cancels, so the proposed method can calculate the fault location preliminarily without it [43,44]. To achieve data synchronization, a synchronization angle is introduced and solved with the NRM, which turns Equation (14) into Equation (15). The synchronization angle is expressed in polar form in Equation (16). Based on Equation (15), Equation (16) can be rewritten as Equation (17), where Re(*) and Im(*) are the real and imaginary parts of the fault signal. The coefficients W 1 , W 2 and W 3 in Equation (17) are defined as in [43].
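The combination of the KVL-based distance expression and the Newton-Raphson search for the synchronization angle can be sketched as follows. This is an illustrative derivation under explicit assumptions, not the authors' formulation from [43]: both terminal currents flow towards the fault, the terminal-1 phasors are rotated by exp(jδ) to align them with terminal 2, the impedance is per unit length, and the coefficients `w1`, `w2`, `w3` are derived here by requiring the distance to be real. The function name and all numerical values are made up for the check.

```python
import cmath
import math

def locate_fault(v1, i1, v2, i2, z, length, delta0=0.0, tol=1e-12, max_iter=50):
    """Return (fault distance from terminal 1, synchronization angle in rad).

    Assumes terminal-1 phasors must be rotated by exp(j*delta) to align with
    terminal 2, and that z is the sequence impedance per unit length.
    """
    a1, c1 = v1, z * i1                       # terminal-1 terms (rotated by delta)
    b2, e2 = length * z * i2 - v2, z * i2     # terminal-2 terms (reference)
    # The distance l = (a1*r + b2) / (c1*r + e2), r = exp(j*delta), must be real:
    # Im[(a1*r + b2) * conj(c1*r + e2)] = w1*sin(delta) + w2*cos(delta) + w3 = 0
    ae, bc = a1 * e2.conjugate(), b2 * c1.conjugate()
    w1 = ae.real - bc.real
    w2 = ae.imag + bc.imag
    w3 = (a1 * c1.conjugate() + b2 * e2.conjugate()).imag
    delta = delta0
    for _ in range(max_iter):                 # Newton-Raphson on F(delta)
        f = w1 * math.sin(delta) + w2 * math.cos(delta) + w3
        df = w1 * math.cos(delta) - w2 * math.sin(delta)
        step = f / df
        delta -= step
        if abs(step) < tol:
            break
    r = cmath.exp(1j * delta)
    return ((a1 * r + b2) / (c1 * r + e2)).real, delta

# Synthetic check with made-up values: 1 km line, fault at 0.3 km, 0.2 rad offset
z, length, l_true, delta_true = 1 + 1j, 1.0, 0.3, 0.2
i1, i2 = 10 + 0j, 4 + 3j
vf = 1.0 * (i1 + i2)                          # 1 ohm fault resistance
v1 = l_true * z * i1 + vf
v2 = (length - l_true) * z * i2 + vf
shift = cmath.exp(-1j * delta_true)           # terminal 1 recorded out of sync
dist, delta = locate_fault(v1 * shift, i1 * shift, v2, i2, z, length)
```

The trigonometric equation generally has two roots per period, so the initial guess matters; starting from zero is reasonable when the synchronization error is small. A purely reactive line impedance makes the true root a double root in this formulation, in which case the Newton-Raphson iteration converges slowly.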

Simulation environment
A DOWF transmission line model includes a power source, a large number of wind turbines and several short-distance transmission lines [22,45]. A DOWF is taken as an example to construct a DWTL simulation model for offshore doubly-fed WTs in real-time simulation software Matlab/Simulink, and its system wiring diagram is shown in Figure 7.
As mentioned earlier, the bus is also studied by the proposed method. 10 fault branches (6 faulty transmission lines and 4 faulty buses) and 10 fault types are considered in order to verify the proposed method (see Table 4). The value range of A f must cover at least one full cycle (0-360°) in order to fully consider each fault inception angle, and the A f interval is set to 0.1 s to meet this requirement. R f is in Ω and L f is in km. The ranges of R f and L f are 0.001-15 Ω and 0.1-0.9 km, respectively. A f , R f and L f are randomly changed to generate a large number of random fault cases.
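Random fault cases over the stated parameter ranges can be drawn as in the sketch below. The sampling distribution is not specified in the text, so uniform sampling is an assumption, as are the function name, the dictionary keys, the fixed seed and the choice to express A f directly as an angle.

```python
import random

def random_fault_case(rng):
    """One random fault case; uniform sampling is an assumption of this sketch."""
    return {
        "A_f_deg": rng.uniform(0.0, 360.0),   # fault inception angle
        "R_f_ohm": rng.uniform(0.001, 15.0),  # fault resistance
        "L_f_km": rng.uniform(0.1, 0.9),      # fault location along the line
    }

rng = random.Random(42)                       # fixed seed for repeatability
cases = [random_fault_case(rng) for _ in range(400)]
```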

Performance for fault classification
In order to test the efficacy of the proposed fault classification method, different DWTL faults are studied. The results are reported in Table 5 and Table 6.
As can be concluded from Tables 5 and 6, at each noise level, 1500 cases are used to test the effectiveness of the proposed fault classification method. The proposed fault classification method performs best without the influence of MN. The accuracy of fault classification can also reach 100%. The accuracy is slightly reduced when the noise is added to the fault current signal. But the accuracy of fault classification is still as high as 99.53% even under the influence of 20 dB noise. Meanwhile, the accuracy of the proposed method is 99.73% and 99.80% at 40 dB and 30 dB, respectively.
Consequently, the effectiveness of the proposed fault classification method in DWTL fault classification is verified by the above results. Meanwhile, the method is not influenced by fault inception angle, fault resistance, fault location and measurement noise, and the overall accuracy is as high as 99.77%.
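The overall classification accuracy can be cross-checked from the per-noise-level figures quoted above. Because each noise level is tested with the same number of cases (1500), the overall accuracy is simply the mean of the four values; the short sketch below performs that arithmetic, with the dictionary keys chosen only for readability.

```python
# Reported classification accuracy (%) at each noise level, 1500 test cases each
accuracy = {"noise-free": 100.0, "40 dB": 99.73, "30 dB": 99.80, "20 dB": 99.53}

# Equal case counts per level, so the overall accuracy is the plain mean
overall = sum(accuracy.values()) / len(accuracy)
```

The mean is approximately 99.765%, in agreement with the reported overall accuracy of 99.77%.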

Performance for fault branch identification
After accurate classification, the fault type of the DWTL in DOWF is determined. According to the different fault types, 10 corresponding fault branch identification models are generated. Then, the fault eigenvalues generated from the three-phase fault current data recorded in the previous step are fed to the corresponding fault branch identification model. 400 AG fault cases are generated on each DWTL fault branch. Thus, a total of 4000 AG fault cases are randomly generated to verify the effectiveness of the proposed branch identification method when AG occurs in the DWTL. 70% of the fault cases (2800 AG fault cases) are used for training and the remaining 30% (1200 AG fault cases) are used for testing. The 10 × 10 AG confusion matrix obtained from the proposed branch identification method under a noise-free condition is shown in Table 7.
As with the fault classification problem, A f , R f and L f are randomly selected when all AG fault cases are generated. Different white Gaussian noise is also added to the fault current signal. The SNR of MN includes 20, 30 and 40 dB. Results and accuracy of fault branch identification under different noise conditions are shown in Table 8.
As can be concluded from Tables 7 and 8, in the case of AG faults on the DWTLs, the effectiveness of the fault branch identification method is verified by 1200 fault cases. The accuracy of the fault branch identification method for DWTL reaches 99.83% without the influence of MN. Even under the influence of noise, the accuracy of the proposed method is 99.75%, 99.58% and 99.17% at 40 dB, 30 dB and 20 dB, respectively. Consequently, the proposed fault branch identification method is robust to A_f, R_f, L_f and MN. The overall accuracy is as high as 99.58%, which verifies the effectiveness of the fault branch identification method when AG faults occur on DWTLs.
As with the AG fault cases, 4000 different fault cases are generated for each of the other fault types. Results and accuracy of fault branch identification under different noise conditions for all fault types are shown in Table 9.
As can be concluded from Table 9, the proposed fault branch identification method is verified by 4000 fault cases under each DWTL fault type. When the noise level is 20 dB, the accuracy of fault branch identification is as low as 99.17% for AG faults. Under noise-free conditions, the accuracy of fault branch identification is as high as 100% after

Performance for fault location
The fault type and fault branch of the DWTLs are determined by the methods proposed above. The fault on a DWTL is then located by substituting the fault data recorded at both terminals of the fault branch into Equation (17). 200 fault cases are randomly generated on fault branch L1, where A_f, R_f and L_f are randomly set as in the fault classification and fault branch identification problems. Location results and errors of 50 fault cases on L1 are shown in Figure 8.
As can be seen from Figure 8, when a DWTL fault occurs on branch L1, the proposed method locates the fault accurately once the fault line and fault type have been determined. The maximum error is 0.25% and the minimum error is 0.01%. In the detection of a 1 km transmission line, the location error can be controlled [...] Table 10.
As can be concluded from Table 10, the proposed method locates the fault accurately no matter where the DWTL fault occurs in the DOWF. The overall error for phase-to-phase short-circuit faults (AB, AC and BC) is slightly higher than that for single-phase-to-ground faults (AG, BG and CG), but the proposed method locates all applied faults with an error below 1%. The fault location error also stays within 6.1 m when the fault conditions (A_f, R_f and L_f) are uncertain, which verifies the effectiveness of the method.
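The percentage and metre figures quoted above can be related by a short calculation. The sketch assumes that the location error is expressed relative to the total line length, which is a common convention. The defining equation is not given in this section, so this is an assumption, and the estimated and actual distances in the example are hypothetical.

```python
def location_error_percent(estimated_km, actual_km, line_length_km):
    """Location error as a percentage of total line length (assumed definition)."""
    return abs(estimated_km - actual_km) / line_length_km * 100.0


def error_in_metres(error_percent, line_length_km):
    """Convert a percentage error on a line of given length into metres."""
    return error_percent / 100.0 * line_length_km * 1000.0


LINE_LENGTH_KM = 1.0  # 1 km line, as in the text

# Maximum and minimum errors reported for branch L1, converted to metres.
print(error_in_metres(0.25, LINE_LENGTH_KM))
print(error_in_metres(0.01, LINE_LENGTH_KM))

# Hypothetical case: actual fault at 0.5 km, estimate off by 6.1 m.
print(location_error_percent(0.5061, 0.5, LINE_LENGTH_KM))
```

Under this definition, an error of 6.1 m on a 1 km line corresponds to 0.61%, which agrees with the statement that all errors are below 1%.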

Influence of system parameters on accuracy
The length of the DWTL and the sampling frequency of the system are important factors that influence the accuracy of fault classification and fault branch identification and the error of double-terminal fault location. In line fault research, there is no uniform choice of sampling frequency, and different references use different values. The most common sampling frequencies are 3.2, 6.4, 10, 12.6 and 20 kHz [23][46][47][48]. To illustrate the basis for selecting the sampling frequency used in the proposed method, these sampling frequencies (3.2, 6.4, 10, 12.6 and 20 kHz) are studied. With only the sampling frequency of the system changed and all other fault conditions kept the same, the accuracy and error of the proposed method are shown in Table 11.
As can be concluded from Table 11, the overall accuracy is close to its peak value of about 99.8% at 10, 12.6 and 20 kHz. The maximum error levels off at about 0.6% at 6.4, 10, 12.6 and 20 kHz. The accuracy improves and the error decreases as the sampling frequency increases. The results show that the proposed method is suitable for sampling frequencies above 10 kHz.
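One way to read these sampling frequencies is in terms of samples per power-frequency cycle and the highest frequency that can be represented (the Nyquist frequency). The sketch below assumes a 60 Hz system, as mentioned in the timing discussion of this paper. It is an illustration only and is not part of the proposed method.

```python
SYSTEM_FREQUENCY_HZ = 60.0  # assumed system frequency
SAMPLING_KHZ = [3.2, 6.4, 10.0, 12.6, 20.0]  # sampling frequencies studied in the text


def samples_per_cycle(fs_khz, f_sys_hz=SYSTEM_FREQUENCY_HZ):
    """Number of samples recorded in one power-frequency cycle."""
    return fs_khz * 1000.0 / f_sys_hz


def nyquist_khz(fs_khz):
    """Highest frequency representable at the given sampling frequency."""
    return fs_khz / 2.0


for fs in SAMPLING_KHZ:
    print(f"{fs} kHz: {samples_per_cycle(fs):.1f} samples/cycle, Nyquist {nyquist_khz(fs)} kHz")
```

A higher sampling frequency gives the time-frequency decomposition more samples per cycle and a wider usable band, which is one plausible reason for the trend reported in Table 11.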
To evaluate the proposed method further, different DWTL lengths (1, 2, 5 and 10 km) are studied. With only the length of the DWTL changed and all other fault conditions kept the same, the accuracy and error of the proposed method are shown in Table 12.
As can be concluded from Table 12, the accuracy of fault classification improves and the error of fault location decreases as the DWTL length increases. The overall accuracy of fault classification is close to its peak value of 100% at 5 and 10 km. The overall accuracy of fault branch identification is basically unchanged. The maximum error of fault location is close to the peak value at 10 km, which is 0.55. The results show that the proposed method is suitable not only for short-distance transmission line fault detection but also for long-distance transmission line fault detection.

Comparison of performance under different methods
To verify the superiority of the proposed STRF-based double-terminal fault location method, different reference methods are applied to fault classification, branch identification and location of DWTLs in DOWF. The reference methods are compared with the proposed method in terms of overall accuracy and maximum error. The comparisons of the fault classification, branch identification and location methods are shown in Figure 9. As can be concluded from Figure 9, when the reference methods are applied to DOWF fault diagnosis, some of them still maintain high performance, but their overall accuracy is low in the fault branch identification problem. The proposed method outperforms the others in fault classification, branch identification and location. Meanwhile, the influence of A_f, R_f and L_f on the accuracy of the proposed method is negligible. Thus, the proposed method can locate faults effectively, with accuracy better than or competitive with the referenced methods.
The speed of fault classification, branch identification and location is briefly demonstrated in this section. The speed determines whether the method can meet real-time requirements, and hence whether it can be implemented online. Any fault monitoring problem for lines with complex structure can be roughly divided into four steps: fault detection, classification, branch identification and location. This paper studies the steps that follow fault detection, so the proposed method does not need to be implemented online. However, to demonstrate the feasibility and reliability of the proposed method, the speed of fault classification, fault branch identification and fault location is shown in Table 13.
As can be seen from Table 13, the proposed fault classification method requires an average time of 7.8541 s to generate the final model. After receiving the three-phase fault current signal, the average time required for fault classification is 0.0156 s, which is less than one cycle of a 60 Hz system. The proposed fault branch identification method requires an average time of 8.1523 s to generate the final model, and the average time required for fault branch identification is 0.0177 s. The proposed fault location method needs 0.0248 s to estimate the fault location. As mentioned earlier, the proposed method operates after the fault occurs, so it does not need to be implemented online. Nevertheless, each step of the proposed method completes within about one and a half cycles of a 60 Hz system.
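The timing figures can be expressed in cycles of a 60 Hz system with a short calculation. The per-step times are those quoted above. Summing them assumes that the three steps run one after another, which is an assumption made here for illustration.

```python
CYCLE_S = 1.0 / 60.0  # duration of one cycle of a 60 Hz system, in seconds

# Average execution times quoted in the text (seconds), excluding model training.
step_times_s = {
    "fault classification": 0.0156,
    "fault branch identification": 0.0177,
    "fault location": 0.0248,
}


def in_cycles(time_s, cycle_s=CYCLE_S):
    """Express a duration as a number of power-frequency cycles."""
    return time_s / cycle_s


for step, t in step_times_s.items():
    print(f"{step}: {t} s = {in_cycles(t):.2f} cycles")

# Total time if the three steps are executed sequentially (assumption).
total_s = sum(step_times_s.values())
print(f"total: {total_s:.4f} s = {in_cycles(total_s):.2f} cycles")
```

Only fault classification finishes within a single cycle. Branch identification takes slightly more than one cycle and fault location about one and a half cycles.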

CONCLUSIONS
This paper proposes an innovative double-terminal fault location method based on Stockwell-transform (ST), random-forest (RF) and the Newton-Raphson method (NRM) for deep-sea offshore wind farm (DOWF). Through a large number of simulations, the feasibility and effectiveness of this method in fault type classification, fault branch identification and fault location have been verified for short-distance transmission lines. The following conclusions have been reached:
1. ST is used to decompose the fault current signals, and a large number of fault eigenvalues are generated from the resulting fault eigenvalue matrices. The introduction of PCC removes many duplicate eigenvalues without affecting the performance of the remaining eigenvalues.
2. After the eigenvalues are filtered, the basis for selecting RF is given according to how the eigenvalues represent different fault types and fault branches. The correctness of RF has also been verified through many simulations.
3. After the fault type and branch are accurately determined, the double-terminal fault location equation is established. NRM is introduced to solve the calculation errors caused by asynchronous data. Accounting for the synchronization angle greatly improves the accuracy of fault location.
4. More than 4000 fault cases show that the accuracy of fault classification and branch identification for DWTL has been significantly improved by this method. The overall accuracy of fault classification and branch identification is as high as 99.77% and 99.58%, respectively, both close to 100%. The location error can be controlled within 6.1 m for short-distance transmission line fault location in DOWF.
5. The proposed method overcomes the influence of transmission line length, uncertainty in fault conditions (fault inception angle, fault resistance, fault location, measurement noise) and complex wind farm topology, which ensures good accuracy of fault classification, branch identification and location for short-distance transmission lines in DOWF.
6. As a future extension, considering the integration of renewable energy, the sensitivity of the developed method will be investigated by applying it to a microgrid. In addition, the feasibility of the method will be studied under wind speed changes, load changes and different line parameters, so as to resolve the difficulty of locating faults on short-distance submarine transmission lines and reduce the operation and maintenance time of the smart grid.