Stacked denoising autoencoder based fault location in voltage source converters-high voltage direct current

High voltage direct current transmission has become increasingly popular in modern power systems. Accurate fault location helps fault clearance and fast recovery of the faulted system. A stacked denoising autoencoder based fault location method for high voltage direct current transmission systems is proposed. The local measurements are analysed, and an end-to-end stacked denoising autoencoder-based fault location is realised. Representative features are extracted with unsupervised learning and used as the input of the regression network, which is fine-tuned in a supervised manner with labelled data. The trained network can precisely map the local measurements to their corresponding fault distance. The performance of the proposed method is tested on a point-to-point high voltage direct current transmission system, which is modelled on the platform of PSCAD/EMTDC. Faults on both overhead lines and cables are considered, and the location performance in different scenarios is discussed. The simulation results show that the proposed method is effective in pinpointing fault locations in various cases.


INTRODUCTION
Power electronic techniques have developed rapidly over the past few decades. The increased capacity of insulated gate bipolar transistors has accelerated the rise of voltage source converter (VSC) based high voltage direct current (HVDC) transmission, which offers merits such as larger transmission capacity and lower construction cost [1]. Many VSC-HVDC transmission projects have been installed worldwide, for example, the Aland link in Finland and the Zhangbei multi-terminal grid in China. Depending on the application requirements of each project, either cables or overhead lines are selected for power delivery. The most common fault on transmission lines is the single-pole-to-ground (SPG) fault, which is usually permanent for cables; the electrical measurements recorded both before and after protection operation can be used for fault location. The location methods for VSC-HVDC transmission lines include the traveling wave-based method, the fault analysis method and the machine learning-based method. The traveling wave-based method usually has high accuracy and has been widely used in practical applications. Both single-end and double-end variants are used. The single-end method first records the arrival times of the incident wave and the reflected wave, and then calculates the time difference between the two waves to locate the fault [2][3][4][5]. However, it is difficult to discriminate whether the reflected wave comes from the fault point or from the opposite line end [6]. The double-end method first records the arrival times of the wave heads at both terminals of the faulted transmission line, and then combines their time difference with the total length of the line to find the fault point [7,8]. The double-end method requires the global positioning system to synchronise the data of the two terminals [9].
The performance of traveling wave-based methods depends heavily on both the accuracy of capturing the arrival time of wave heads and the calculated propagation velocity of the traveling waves. However, capturing the wave heads exactly is quite difficult, as overlapping or attenuation occurs when faults are close to or far away from the measuring devices. The calculation of the propagation velocity is not easy either, as the velocity varies with line parameters and the surrounding environment. Since the propagation velocity of the wave approaches the speed of light, a slight inaccuracy in arrival time or velocity might generate a significant location error. Other than time-domain features, the frequency spectrum of the traveling wave, which is related to fault distance, is also used [10]. Some research locates faults by extracting the natural frequency of traveling waves, and high accuracy was reported [11][12][13]. But this kind of location method is ineffective if the natural frequency is higher than the sampling frequency [14]. Furthermore, some transient interferences, such as lightning strokes, can also produce large amplitude coefficients in frequency spectrums, which are difficult to discriminate from natural frequencies.
The fault analysis method adopts the electrical measurements and the distributed line model of HVDC transmission lines [15]. The fault point can be found according to the distribution characteristics of transient voltages and currents along the line [16,17]. Liang et al. combined the traveling wave theory with the Bergeron time-domain fault location method. The method was reported to solve the problems that the measurement at both ends cannot be synchronised and the line parameters cannot be determined [18]. The fault analysis method has high accuracy and excellent stability, but the lack of accurate line parameters imposes a challenge on location accuracy, especially when the line is quite long.
Thanks to the wide use of fault recorders installed at the ends of HVDC transmission lines, a large amount of fault data is collected. In addition, advanced simulation platforms can model practical power systems and generate high-quality simulated data, which are quite similar to actual electrical measurements. These advantages make fault location with data-driven approaches possible. Some research discussed the application of machine learning models in fault location, as a regression mapping can be modelled between electrical measurements and fault locations [19,20]. Vasanth et al. processed the fault current with wavelet transform (WT) and then used artificial neural networks to learn the relationship between wavelet coefficients and distances [21]. Lan et al. extracted the high-frequency components of unsynchronised two-end voltages with empirical mode decomposition (EMD). They set up two convolutional neural networks to realise rough fault segmentation and accurate fault location, respectively [22]. Johnson and Yadav presented a scheme for detecting, classifying and locating faults using a support vector machine [23]. The generalisation capability of machine learning models is also improved by adopting advanced learning algorithms to explore their potential practical applications [24]. These machine learning-based methods were reported to have excellent performance in finding fault locations.
However, properly-extracted features, for instance, wavelet coefficients, frequency spectrum and so on, are required and crucial for improving machine learning-based location accuracy. When these features cannot fully reveal the difference between traveling waves from different locations, the location accuracy may be reduced.
Unsupervised learning models that are trained layer by layer can learn a representation from high dimensional data [24,25]. This learning procedure can be regarded as the feature extraction of raw data. Because no expert intervention is involved, the learned features might be more stable and general than the feature representation based on time-frequency analysis, such as WT and EMD [26]. The stacked denoising autoencoder (SDAE) is one of the popular unsupervised learning models. It is a type of neural network that takes the influence of noise into account during the training process. Such machine learning models learn the mapping between electrical measurements and fault location directly, and can avoid the difficult issues of location methods based on electrical analysis, for instance, wave velocity calculation and line parameter estimation.
Therefore, to explore the capability of unsupervised learning in pinpointing faults on HVDC transmission lines, an SDAE based method is proposed to provide an end-to-end solution for fault location. The main contributions of this paper are listed below:
1) An end-to-end SDAE-based fault location method is proposed to pinpoint the SPG fault on VSC-HVDC transmission lines. The feature extraction, which is contained in traditional machine learning-based methods, is replaced by unsupervised learning to ensure better characterisation of raw data.
2) The relationship between transient waveforms of traveling waves and fault location is analysed and mapped with an SDAE network. It is found that the normalised transient waveform can withstand the effects of grounding resistance and line parameters.
3) The location accuracy is improved under different simulation scenarios, for instance, heavy noise pollution and low sampling rates. The location error can be reduced to around 1.6% when the sampling rate is 10 kHz and the signal-to-noise ratio (SNR) is only 30 dB.
The rest of this paper is organised as follows: Section II analyses the post-fault process of VSC-HVDC transmission systems and the characteristics of their traveling waves. Section III introduces the fundamentals of SDAE. Section IV describes the procedure of the proposed SDAE-based location method. The performance of the proposed method is tested with a simulation model under different scenarios in Section V. Comparisons between the proposed method and some traditional methods are presented in Section VI. Section VII concludes this research.

Post-fault process on high voltage direct current transmission lines
The most common fault that occurs on HVDC transmission lines is the SPG fault. When the fault occurs, it can be modelled as a step voltage excitation that is switched on at the fault point. As only the phenomenon of traveling waves is discussed in this paper, the symmetrical mono-polar VSC-HVDC is adopted to simplify the post-fault analysis. The converter is grounded directly with capacitors at the DC side and a large resistor at the AC side. The transient post-fault process of its SPG fault is essentially a capacitor discharging process, and a new stable state is reached after this transient process [27]. Ground faults generate traveling waves, which propagate along the transmission lines at nearly the speed of light. When the first traveling wave reaches the converters, the capacitor discharging starts and leads to a quick drop of the faulted pole potential and a fast rise of the fault current. When the potential of the faulted pole decays to zero, the potential of the operating pole reaches the DC bus voltage. This transient process finally settles into a post-fault steady state after tens of milliseconds.

Characteristics of traveling waves
The waveform of traveling waves contains a lot of fault information such as location, grounding resistance and line type. The following contents study the characteristics of traveling waves in order to find the relationship between waveform features and faults. Here, the waveform attenuation, shape, magnitude and the effect of line types are discussed. Attenuation of traveling waves occurs when they propagate along transmission lines. The original excitation of a grounding fault is a step waveform. The longer the traveling wave propagates, the smoother the wave head becomes. Figure 1 shows the attenuation of wave heads after propagating over different distances. To ensure that only the first wave head is compared, a data window of 60 μs is used. The attenuation of wave heads clearly becomes more significant as the fault distance increases. The stair width equals the time the reflected traveling wave takes to return from the fault point, $w = 2d/v$, where $w$ is the stair width, $d$ is the fault distance and $v$ denotes the propagation velocity, which is around $3 \times 10^{8}$ m s$^{-1}$. So, the traveling waves from nearby fault points have narrow stairs; for example, the stair width is only around 66.7 μs when the fault is 10 km away, but it is larger than 200 μs when the fault is as far as 180 km.
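As a quick check of the stair-width relation w = 2d/v, the following minimal sketch evaluates it for the two distances quoted above. The function name `stair_width` is illustrative only, and the velocity is the approximate value given in the text rather than a measured line parameter.

```python
# Propagation velocity quoted in the text (approximately the speed of light).
V_PROP = 3.0e8  # m/s


def stair_width(distance_m: float, velocity: float = V_PROP) -> float:
    """Return the stair width w = 2d/v in seconds."""
    return 2.0 * distance_m / velocity


for d_km in (10, 180):
    w_us = stair_width(d_km * 1e3) * 1e6
    print(f"d = {d_km:3d} km -> w = {w_us:7.1f} us")
```

With these assumptions the 10 km fault gives about 66.7 μs and the 180 km fault gives 1200 μs, which is consistent with the statement that the latter exceeds 200 μs.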
When fault distances are the same, the magnitude differences in traveling waves are mainly due to the grounding resistance. As shown in Figure 3, the amplitude of the traveling wave decreases as the grounding resistance increases. All the faults are located 10 km away, and a short data window is used to study the wave heads only. Although a larger resistance leads to a smaller magnitude, the waveforms are similar, so the effect of grounding resistance can be eliminated by magnitude normalisation.
The effect of the transmission line type is also discussed. Based on the above analysis, traveling waves vary with fault locations, grounding resistances, and line parameters. Magnitude normalisation can be used to avoid the effect of grounding resistance and line types. It is therefore possible to map the relationship between the normalised waveforms of traveling surges and the fault locations, and to pinpoint faults with an end-to-end solution.

Stacked denoising autoencoder
The autoencoder is a neural network that reconstructs its original input. It has two processes: encoding and decoding. The encoding process extracts features with lower dimensions, so the encoding of the input is a type of data compression [28]. Ordinary autoencoders can only learn from clean input. To deal with corrupted input data, the ordinary autoencoder is extended to the denoising autoencoder (DAE) [29], which is more robust and learns more representative features. The DAE corrupts the data by randomly setting a small part of the elements of x to zero in each epoch. Because the selection is random, different elements are masked in different epochs. The DAE encodes the corrupted input into the hidden code as in Equation (1).
$$h = s(W\tilde{x} + b) \tag{1}$$
where $\tilde{x}$ is the corrupted input, in which part of the data is forced to be zero, $s$ is the activation function, $W$ is the weight matrix, $b$ is the bias vector and $h$ is the output of the hidden layer. Then, the code $h$ is mapped back to $\hat{x}$, the reconstructed version of $x$, through the decoding process, as presented in Equation (2).
$$\hat{x} = s(W'h + b') \tag{2}$$
where $W'$ is the reconstruction weight matrix and $b'$ is the reconstruction bias vector. The structure of a DAE is shown in Figure 5. Incomplete data sets can also be processed by the DAE, but narrowing the gap between $x$ and $\hat{x}$ is still the focus. The SDAE is made up of several stacked DAEs, and its training process is as follows: First, after the first DAE is trained, the output of its hidden units is randomly set to zero in a certain proportion. Then, the modified hidden code is sent as the input of the second DAE. Finally, the above process is repeated until all DAEs are trained [30].
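The corruption, encoding and decoding steps of a single DAE can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sigmoid activation for both layers, the mean squared reconstruction error, the 10% masking fraction and the 50-to-48 layer sizes used in the demonstration are all assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def corrupt(x, mask_fraction, rng):
    """Masking noise: force a random subset of the elements of x to zero."""
    keep = rng.random(x.shape) >= mask_fraction
    return x * keep


def dae_forward(x, W, b, W_rec, b_rec, mask_fraction, rng):
    """Corrupt x, encode it to h, decode h to x_hat, and score the result."""
    x_tilde = corrupt(x, mask_fraction, rng)   # corrupted input
    h = sigmoid(W @ x_tilde + b)               # hidden code
    x_hat = sigmoid(W_rec @ h + b_rec)         # reconstruction
    loss = np.mean((x - x_hat) ** 2)           # compared with the CLEAN input (assumed loss)
    return h, x_hat, loss


rng = np.random.default_rng(0)
n_in, n_hidden = 50, 48                        # illustrative sizes
x = rng.random(n_in)                           # stand-in for one normalised input vector
W = rng.normal(0.0, 0.1, (n_hidden, n_in))
b = np.zeros(n_hidden)
W_rec = rng.normal(0.0, 0.1, (n_in, n_hidden))
b_rec = np.zeros(n_in)

h, x_hat, loss = dae_forward(x, W, b, W_rec, b_rec, mask_fraction=0.1, rng=rng)
print(h.shape, x_hat.shape, float(loss))
```

Note that the loss is measured against the clean input rather than the corrupted one, which is what forces the network to learn features that are robust to the masked elements.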

Network based on stacked denoising autoencoder
Fault location is a regression problem. The output layer of the SDAE, at the top of the network, has only one neuron, whose activation is the sigmoid function. Training the constructed SDAE includes two steps: pre-training and fine-tuning. The pre-training uses an unsupervised algorithm, and the fine-tuning uses the stochastic gradient descent method [31]. Figure 6 illustrates the construction and the parameter initialisation of an SDAE network. First, the l-th hidden layer of the SDAE network is initialised with the weights and biases of the l-th DAE. Second, the m samples are divided into n mini-batches so that each mini-batch contains m' samples (m' = m/n). Finally, the mini-batches are sent to the network one by one, and the back-propagation of the computed loss is used to update the parameters. This update is given in detail by Equation (6), where k refers to the k-th iteration, and ε is the learning rate. Momentum is used to accelerate the learning process. Equation (7), which is deduced from Equation (6), shows that the descent direction is decided by the gradients of both previous and current iterations, but the influence of previous gradients decays exponentially over iterations.
The network parameters are not static: they are updated n times, once per mini-batch, within each epoch. With a suitable selection of hyper-parameters, the training process converges rapidly and the relationship between the inputs and the targets can be successfully established.
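The momentum update described above can be illustrated with the textbook form of stochastic gradient descent with momentum. Since Equations (6) and (7) are not reproduced here, the exact form below is an assumption, and the function names and the toy quadratic loss are hypothetical; the sketch only shows that the accumulated velocity equals a sum of past gradients weighted by exponentially decaying powers of the momentum α.

```python
import numpy as np


def momentum_step(theta, velocity, grad, lr, alpha=0.5):
    """One update: v <- alpha * v - lr * grad, then theta <- theta + v."""
    velocity = alpha * velocity - lr * grad
    return theta + velocity, velocity


def unrolled_velocity(grads, lr, alpha=0.5):
    """Velocity after len(grads) steps from v = 0, written as a weighted sum.

    The gradient from i iterations ago is weighted by alpha**i, so older
    gradients contribute exponentially less to the descent direction.
    """
    k = len(grads)
    return -lr * sum(alpha ** (k - 1 - j) * g for j, g in enumerate(grads))


# Toy problem (assumed): minimise 0.5 * ||theta||^2, whose gradient is theta.
theta = np.array([1.0, -2.0])
velocity = np.zeros_like(theta)
grads = []
for _ in range(5):
    grad = theta.copy()
    grads.append(grad)
    theta, velocity = momentum_step(theta, velocity, grad, lr=0.05)

print(np.allclose(velocity, unrolled_velocity(grads, lr=0.05)))
```

The learning rate 0.05 and the momentum 0.5 are the pre-training values reported later in the paper; any other positive learning rate and momentum in [0, 1) would demonstrate the same identity.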

STACKED DENOISING AUTOENCODER BASED FAULT LOCATION
The flowchart of the SDAE-based location method is shown in Figure 7. The proposed method includes three steps: raw data pre-processing, SDAE training and fault location calculation. The details of each step are described below.

Raw data pre-processing
The post-fault transients contain much traveling-wave information related to fault locations. The data segment of a few milliseconds will be collected and used, and the collection is started once the initial wave head is detected. The post-fault transients are usually decoupled to avoid the induced influences between two transmission lines. Here, the Karenbauer pole-mode transformation shown in Equation (8) is adopted for decoupling.
The electrical measurements of the positive and negative poles are decoupled into 0-mode and 1-mode components via the Karenbauer transform, as demonstrated in Equation (9).
Here, $i_{+}$ and $i_{-}$ are the currents collected from the positive and negative poles, respectively, $i_0$ stands for the 0-mode current, and $i_1$ denotes the 1-mode current. The 1-mode signal is regarded as more stable and is unlikely to be affected by frequency variation and line surroundings. Thus, the 1-mode component is adopted for fault location calculation.
To avoid the effects of magnitude, min-max normalisation is used. The definition of min-max normalisation is shown in Equation (10),
$$y = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{10}$$
where $x$ is the measured signal, $x_{\max}$ is its maximum value, $x_{\min}$ is its minimum value and $y$ is the normalised vector, which lies in the range [0, 1]. Both the input and the output target are normalised.
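The two pre-processing operations, pole-mode decoupling and min-max normalisation, can be sketched as below. The transformation matrix is an assumption: it is a commonly used two-pole form in which the 0-mode is the scaled sum and the 1-mode is the scaled difference of the pole quantities, and it may differ in scaling from Equation (8). The function names are illustrative.

```python
import numpy as np

# Assumed pole-mode transformation matrix (commonly used two-pole form).
S = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2.0)


def pole_to_mode(i_pos, i_neg):
    """Decouple positive- and negative-pole signals into 0-mode and 1-mode."""
    i0, i1 = S @ np.vstack([i_pos, i_neg])
    return i0, i1


def min_max_normalise(x):
    """Scale x to [0, 1]; a constant signal is mapped to zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0.0:
        return np.zeros_like(x)
    return (x - x.min()) / span


# Toy pole currents (hypothetical values, not simulation data).
i_pos = np.array([1.0, 2.0, 3.0])
i_neg = np.array([-1.0, -2.0, -3.0])
i0, i1 = pole_to_mode(i_pos, i_neg)
print(i0, min_max_normalise(i1))
```

Because normalisation removes the overall scale, the constant factor of the assumed matrix has no influence on the normalised 1-mode waveform that is fed to the network.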

Stacked denoising autoencoder training
One fault location sample is composed of one transient measurement input and one fault location target. Faults at different locations of transmission lines constitute the sample set, which is randomly divided into the training set and the test set. The training set is used to train the network, and the test set is employed to test the network performance. The structure and parameters of SDAE network are adjusted to achieve better performance. The training of SDAE is realised by layerwise pre-training and back-propagating fine-tuning of the whole network.
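The layerwise pre-training stage can be sketched as a loop that trains one DAE at a time and feeds its hidden output to the next one. This is a simplified illustration under stated assumptions, not the authors' code: it uses full-batch gradient descent without momentum, sigmoid activations, a squared reconstruction error and a 10% masking fraction, and the random matrix `X` is only a placeholder for the normalised training inputs. Supervised fine-tuning of the whole network is omitted.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def pretrain_dae(X, n_hidden, rng, epochs=50, lr=0.05, mask_fraction=0.1):
    """Train one DAE on X (rows are samples) and return its encoder (W, b)."""
    m, n_in = X.shape
    W = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b = np.zeros(n_hidden)
    W_rec = rng.normal(0.0, 0.1, (n_hidden, n_in))
    b_rec = np.zeros(n_in)
    for _ in range(epochs):
        # Masking noise: a fresh random subset is zeroed in every epoch.
        X_tilde = X * (rng.random(X.shape) >= mask_fraction)
        H = sigmoid(X_tilde @ W + b)
        X_hat = sigmoid(H @ W_rec + b_rec)
        # Back-propagate the squared reconstruction error against the clean X.
        d_out = (X_hat - X) * X_hat * (1.0 - X_hat)
        d_hid = (d_out @ W_rec.T) * H * (1.0 - H)
        W_rec -= lr * H.T @ d_out / m
        b_rec -= lr * d_out.mean(axis=0)
        W -= lr * X_tilde.T @ d_hid / m
        b -= lr * d_hid.mean(axis=0)
    return W, b


def pretrain_stack(X, hidden_sizes, rng):
    """Greedy layerwise pre-training: each DAE learns from the previous code."""
    params, H = [], X
    for n_hidden in hidden_sizes:
        W, b = pretrain_dae(H, n_hidden, rng)
        params.append((W, b))
        H = sigmoid(H @ W + b)   # this code is corrupted again inside the next DAE
    return params, H


rng = np.random.default_rng(0)
X = rng.random((512, 50))        # placeholder for 512 normalised input vectors
params, features = pretrain_stack(X, (48, 36), rng)
print([W.shape for W, _ in params], features.shape)
```

The returned weights and biases would then initialise the hidden layers of the regression network before fine-tuning with the labelled fault distances.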

Fault location calculation
The trained network with properly-selected structure and parameters can be used for fault location. Once a fault occurs, the transient electrical measurements are recorded and used as the input vector of SDAE network. The corresponding fault location can be calculated by the trained model.

Simulation model and parameter settings
A point-to-point VSC-HVDC transmission system, as shown in Figure 8, is modelled on the platform of PSCAD/EMTDC. This model will be used to prove the validity of the proposed SDAE-based fault location method. Two-level AC/DC converters are adopted in VSC stations. The DC bus voltage is set to ±200 kV, and the neutral point of the supporting capacitor is directly grounded. The length of the transmission line is 210 km.
As the propagation characteristics of transients in overhead transmission lines and cables are different, both kinds of transmission line are modelled to improve the generalisation performance of the SDAE network. The simulation process of

Raw data acquisition and pre-processing
The current transients are collected and sampled at a rate of 10 kHz. The time window is selected to be 5 ms to ensure that enough transient information is measured for location. As demonstrated in Figure 9, data collection is started once the initial wave head of the traveling wave is identified. The raw data within the window are decoupled and normalised, and are then used for training and testing the SDAE network.

Network training and hyper-parameter selection
The samples generated by the simulated model are randomly divided into two groups: 80% (512 samples) for training, and 20% (128 samples) for testing. The size of the input layer is the same as the dimension of the input vector (5 ms × 10 kHz = 50), while the output layer has only one neuron, which gives the fault location.
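The sample counts and the input dimension quoted above follow from simple arithmetic, as the short sketch below shows. The function names are illustrative, and the random permutation is only one possible way to realise the random division described in the text.

```python
import numpy as np


def input_dimension(window_s, fs_hz):
    """Number of samples in a data window: window length times sampling rate."""
    return int(round(window_s * fs_hz))


def split_indices(n_samples, train_fraction, rng):
    """Randomly divide sample indices into a training part and a test part."""
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]


rng = np.random.default_rng(0)
train_idx, test_idx = split_indices(640, 0.8, rng)
print(input_dimension(5e-3, 10e3), len(train_idx), len(test_idx))
```

The same `input_dimension` helper shows how the input layer would grow with a longer window or a higher sampling rate, for example 5 ms at 250 kHz gives 1250 inputs.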
Other parameters are selected according to the performance demonstrated in the test of the proposed model. To reduce the impact of the random initialisation of weights and biases, the average performance of 10 trained networks is used. The details of the DAE network during training are listed in Table 1. Since the momentum, α, has quite a small effect on the accuracy, the default value, 0.5, is adopted here.
Number of hidden layers: Although a deeper network can generate more complex feature representations, too many hidden layers lead to increased computation and over-fitting problems. According to the mean errors of different networks shown in Table 2, two hidden layers are enough to obtain a satisfactory result for this research. Therefore, the number of hidden layers is set to 2, which together with the input layer and output layer form a four-layer network.
Learning rate: The learning rate determines how far the weights move in each step of the gradient descent algorithm, and it affects both the training duration and the performance. It should be neither so small that convergence is slowed down nor so large that training diverges. Figure 10 shows the mean errors of networks with different learning rates in both the pre-training and fine-tuning stages. According to Figure 10, the learning rate ε is set to 0.05 in pre-training and 2.6 in fine-tuning to achieve minimum errors.
Size of hidden layers: The feature vector extracted by an SDAE is a reduced-dimensional representation of the data. The size of the 1st hidden layer should therefore be no larger than that of the input layer and no smaller than that of the 2nd hidden layer, so $50 \ge S_1 \ge S_2 \ge 1$, where $S_1$ and $S_2$ denote the sizes of the 1st and 2nd hidden layers, respectively. Figure 11 illustrates the distributions of mean location errors. The average error of the network with an architecture of 50-48-36-1 is the smallest, so this architecture is chosen for fault location.

Location results
The location performance of the proposed SDAE-based method is evaluated by location errors, as defined by Equation (11).
$$er = \frac{\left| l_{cal} - l_{act} \right|}{l_{total}} \times 100\% \tag{11}$$
Here, $l_{cal}$ is the calculated location, $l_{act}$ is the actual one, and $l_{total}$ is the total length of the transmission line. For the same scenario where multiple samples are used for testing, the mean error is used to evaluate the overall performance. Its definition is shown in Equation (12).
$$er_{mean} = \frac{1}{N} \sum_{i=1}^{N} er_i \tag{12}$$
where $N$ is the number of test samples, and $er_i$ stands for the error of the i-th sample. Figure 12 shows the location results of test samples with the proposed method. According to the box plot of the distribution of location errors in Figure 12(a), almost all mean errors at different locations are smaller than 2%. Only the mean error of samples at 190 km is slightly higher than 2%. The overall mean location error of both overhead transmission lines and underground cables is 1.22%. As shown by the probability distribution of the location errors in Figure 12(b), 77.5% of the errors are in the range between 0.7% and 1.5%. Only 4.17% of the errors are larger than 2%. The details of test samples with a location error larger than 2% are listed in Table 3. Since the number of cable samples used for training (192 samples) is far smaller than that of overhead line samples (320 samples), the location errors of cables are larger.
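The error metrics of Equations (11) and (12) can be computed as in the sketch below, which assumes the usual absolute-error form relative to the total line length. The calculated and actual locations in the example are hypothetical values chosen for illustration, not results from the paper; only the 210 km line length is taken from the simulation model.

```python
import numpy as np


def location_error(l_cal, l_act, l_total):
    """Percentage location error of each sample, relative to the line length."""
    l_cal = np.asarray(l_cal, dtype=float)
    l_act = np.asarray(l_act, dtype=float)
    return np.abs(l_cal - l_act) / l_total * 100.0


def mean_error(errors):
    """Mean of the per-sample percentage errors."""
    return float(np.mean(errors))


L_TOTAL_KM = 210.0                       # line length of the simulated system
l_act = [10.0, 100.0, 190.0]             # hypothetical actual locations (km)
l_cal = [12.1, 98.0, 194.2]              # hypothetical calculated locations (km)

errors = location_error(l_cal, l_act, L_TOTAL_KM)
print(np.round(errors, 2), round(mean_error(errors), 2))
```

Because the error is normalised by the total line length, a 1% error on this system corresponds to 2.1 km regardless of where the fault occurs.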
The fault location is performed on a computer with an Intel i7-5500U processor (main frequency: 2.4 GHz), 8 GB of memory, and a 64-bit operating system. The simulation platform is MATLAB. The training time of the SDAE model is 4.064 s, and the location process needs only 4 ms. As the data window is 5 ms, the total duration of fault location is 9 ms. The proposed method is used to pinpoint the fault after primary protection. Such a duration is short enough to make full use of the transients recorded before breaker operation and to provide a quick response for maintenance staff.

Effect of grounding resistance
To demonstrate the performance of the proposed model with different grounding resistances, the samples with the same resistance are gathered, regardless of fault locations and line parameters. The mean error of these samples is calculated and listed in Table 4. For each resistance value, the number of test samples is 16 for overhead lines and 8 for cables. According to the data in Table 4, the mean error changes only slightly as the grounding resistance increases. The location results are still accurate when the grounding resistance reaches 80 or 100 Ω. The data show that the proposed method can eliminate the influence of grounding resistance.

Effect of line parameters
The location errors of different kinds of transmission lines are statistically analysed to evaluate the effect of line parameters.

Effect of feature vectors
For VSC-HVDC transmission systems, the local measurements include voltage and current, both of which are often used to locate faults. The performance of the proposed method with other local measurements is therefore discussed. Here, four decoupled feature vectors are considered: 1-mode current, 0-mode current, 1-mode voltage and 0-mode voltage. The SDAE model is trained separately with each kind of input. The same dataset generated in Section 5.1 is used for training and testing, and all the input samples are prepared with the same procedure. The mean errors are listed in Table 5. Table 5 clearly shows that 1-mode measurements have lower errors than 0-mode measurements, and currents perform better than voltages. The 1-mode component, which focuses more on the differences between the faulted pole and the normal pole, is suitable for dealing with SPG faults. The VSC-HVDC system controls voltage rather than current. The currents can reveal more transient characteristics, for example, the

Effect of time window length
The window length determines the dimension of the data when the sampling frequency is fixed. Due to the different response speeds of protection devices, the data window length will vary. Longer input vectors usually contain more abundant transient information. Therefore, time window lengths from 2 ms to 7 ms are discussed, and their location errors are illustrated in Figure 14.
The same dataset generated in Section 5.1 is used. As shown in Figure 14, a longer window produces a smaller error. In particular, the error decreases considerably when the window length increases from 2 ms to 3 ms. When the window is longer than 5 ms, the errors drop below 1%. But when the window is further extended to 7 ms, the additional error decrease is quite small. As a longer data window requires more response time, a time window of 5 or 6 ms is suitable for fault location when the sampling frequency is 10 kHz in this research.

Effect of sampling frequencies
The sampling frequency is also an important factor for transient-based analysis. A higher sampling frequency can provide more transient information if the length of the time window is fixed, but it places higher requirements on devices and communications. Here, the effect of sampling frequency is discussed with the same simulation model as in Section 5.1, and the samples are processed in the same way. The sampling frequency varies from 10 kHz to 250 kHz. The mean location errors are listed in Table 6. As shown in Table 6, the location error decreases as the sampling frequency increases. The error with the sampling frequency of 250 kHz is almost half of the error when 10 kHz is

Effect of noises
In practical applications, background noise is the major source of interference for fault location. Among all kinds of noise, Gaussian white noise is the one most commonly discussed. Different levels of Gaussian white noise are added to the original transient signal to evaluate the performance of the proposed method. The noise level is reflected by the SNR. Here, the SNR varies from 60 dB to 30 dB with a step of 10 dB. Figure 15 shows the mean location errors of the proposed method under different SNRs. Here, 640 samples are considered.
The mean location error slightly increases with the decrease of SNRs. It is only 1.6% even when the SNR is as low as 30 dB. The SDAE-based location method can effectively withstand the influences from background noises.
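One way to pollute a transient with Gaussian white noise at a prescribed SNR is sketched below. It assumes that the signal power is the mean square of the signal over the data window, which the text does not specify; the function names and the sinusoidal test signal are illustrative only.

```python
import numpy as np


def add_white_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian white noise so that the result has the given SNR."""
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)                    # assumed power definition
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def measured_snr_db(clean, noisy):
    """SNR in dB estimated from a clean signal and its noisy version."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noisy, dtype=float) - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))


rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 20.0 * np.pi, 100_000))   # long test signal
for snr_db in (60, 50, 40, 30):
    noisy = add_white_noise(clean, snr_db, rng)
    print(snr_db, round(measured_snr_db(clean, noisy), 2))
```

For a 50-sample window the realised SNR of a single noisy record will scatter around the target value, so averaging over many samples, as done with the 640 samples here, gives a more representative picture.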

Effect of close faults
For transient-based fault location methods, especially traveling-wave-based ones, it is hard to distinguish faults that occur close to the measuring equipment. To demonstrate the performance of the proposed method in this case, some close SPG faults are simulated. For each kind of line, 10 faults are equally spaced from 1 km to 10 km. Figure 16 shows the box chart of the location errors of the proposed method.

FIGURE 16 Mean location error of different close faults
According to Figure 16, the mean errors of the proposed method decrease as the fault distance increases. When faults occur near 10 km, the location errors are small, at around 1%. But when faults occur at 1 km, the mean error reaches more than 6%. The location performance decreases because no close fault samples were included in the training set. If more close faults are included when training the SDAE network, the location errors may drop.

COMPARISONS
This section compares the performance of the proposed method with some existing location methods: the time-domain traveling-wave-based method, the natural-frequency-based method, the impedance-calculation-based method and the artificial intelligence (AI)-based method.
1) Time-domain traveling-wave-based method: The time-domain traveling-wave-based location is the most popular method in practical applications [5]. It is critical to calibrate the arrival times of the incident wave ($t_i$) and the reflected wave ($t_r$). The fault distance can be calculated with the equation $l = (t_r - t_i)\,v_{tw}/2$, where $v_{tw}$ is the propagation velocity of the traveling wave, taken as $3 \times 10^{8}$ m s$^{-1}$.
2) Natural-frequency-based method: The traveling wave recorded at the ends of the transmission line is the natural response to the fault excitation [32]. The natural-frequency-based method finds the frequency $f_n$ in the frequency spectrum of the traveling wave and calculates the fault distance with the equation $l = v_{tw}/(2 f_n)$.
3) Impedance-calculation-based method: The fault distance can be estimated by calculating the line impedance of the post-fault equivalent circuit [33]. An R-L representation of the transmission line is often used.
4) AI-based method: AI techniques can map the relationship between measured data and fault location [24]. As deep networks have been shown to perform better than traditional shallow ones, an SAE model is compared in this paper.
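The two closed-form distance formulas above can be evaluated directly, and they also give a rough feel for how the sampling frequency limits the time-domain method. The sketch below is illustrative only: the function names are hypothetical, and `distance_resolution` is a simplified assumption that the time difference is uncertain by one sampling period, which is offered as one plausible contributor to location error rather than the paper's own analysis.

```python
V_TW = 3.0e8  # propagation velocity used in the text, m/s


def distance_from_arrival_times(t_incident, t_reflected, v_tw=V_TW):
    """Single-end traveling-wave formula: l = (t_r - t_i) * v_tw / 2."""
    return (t_reflected - t_incident) * v_tw / 2.0


def distance_from_natural_frequency(f_n, v_tw=V_TW):
    """Natural-frequency formula: l = v_tw / (2 * f_n)."""
    return v_tw / (2.0 * f_n)


def distance_resolution(fs_hz, v_tw=V_TW):
    """Distance equivalent of a one-sample error in the time difference."""
    return v_tw / (2.0 * fs_hz)


LINE_LENGTH_M = 210e3  # line length of the simulated system
for fs in (10e3, 500e3):
    res = distance_resolution(fs)
    print(f"fs = {fs / 1e3:5.0f} kHz -> {res / 1e3:6.2f} km "
          f"({res / LINE_LENGTH_M * 100.0:.2f}% of the line)")
```

Under this assumption one sample at 10 kHz corresponds to 15 km, about 7.1% of the 210 km line, whereas at 500 kHz it corresponds to 0.3 km, which illustrates why methods that rely on arrival times benefit strongly from high sampling rates.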
Quantitative comparisons between the existing methods and the proposed method are performed under different scenarios, for instance, different sampling frequencies and SNRs. Table 7 shows the location errors of different methods with the same dataset generated in Section 5.1. The sampling frequency is 10 kHz. As shown in Table 7, the proposed method and the AI-based method have higher accuracy than the other existing methods. The maximum location error of the proposed method is only 1.25%, which is the lowest among all methods. The location errors of the traveling-wave-based methods and the impedance-based method are much larger. The average location error of the time-domain traveling-wave-based method is 13.68%, which is more than ten times that of the proposed method.
Since the sampling frequency is crucial for traveling-wave-based methods, the location performance of all these methods is discussed under different sampling frequencies. Here, four different sampling frequencies are considered. The mean location errors are listed in Table 8.
With the increase of sampling frequency, the location performance of the traveling-wave-based methods and the impedance-based method improves dramatically. Most location errors fall below 1% when the sampling frequency reaches 500 kHz. The location error of the time-domain traveling-wave-based method drops to 0.22%, which is around one third of that of the proposed method.
To demonstrate the denoising capability of the proposed SDAE-based method, the effect of noise is also considered. The measured dataset is polluted by Gaussian white noise with different SNRs, from 60 dB to 30 dB. The sampling frequency is 10 kHz. The mean location errors are shown in Table 9. Table 9 indicates that the location performance of the discussed methods decreases as the SNR decreases. When the SNR is 30 dB, the mean location errors of the time-domain traveling-wave-based method and the AI-based method reach 19.46% and 2.13%, respectively, while the error of the proposed method is only 1.6%. The increment of the mean location error of the proposed method is less than 0.4% when the condition changes from no noise to an SNR of 30 dB. This suggests the proposed method has a stronger denoising capability than the compared methods.

CONCLUSIONS
This paper presents a fault location method based on SDAE, which has the ability to extract representative features from unlabelled data and to fine-tune the network with labelled data. Different from traditional fault location methods that need wave speed estimation, carefully selected characteristics, or accurate line parameter calculation, the proposed method handles time-domain information directly and provides an end-to-end fault distance solution. The simulation results show that the proposed SDAE-based method can withstand the effects of noise, system parameters, grounding resistances and line parameters. Its performance can be further improved by increasing the length of the data window or the sampling frequency. At a sampling frequency of 10 kHz and under noisy conditions, the proposed SDAE-based method performs considerably better than the traditional ones. The proposed method has promising prospects for practical application.