Non-parametric statistics-based predictor enabling online transient stability assessment

Online transient stability assessment (TSA) is necessary for fast awareness of transient instability caused by fault contingencies. In this paper, a non-parametric statistics-based scheme is proposed for response-based online TSA. A critical clearing time-based stability margin index is defined as the predictive output, and 14 kinds of severity indicators are proposed as input features for the TSA predictor. With no prior knowledge of the correlation structure, the non-parametric additive model is used as the basis of the predictor. To screen out the weakly correlated indicators and reduce the dimensionality of the input space, two-stage feature selection is carried out by non-parametric independence screening followed by group Lasso penalised regression. The predictor is then learnt by least-squares regression in the reduced multi-feature space. With phasor measurement unit measurements at generator buses, the severity indicators can be computed in real time, and the post-fault stability margin can be evaluated quickly by the offline-trained predictor. The effectiveness of the proposed scheme is demonstrated on a modified New England 39-bus system and a practical 756-bus transmission system in China.


Introduction
Transient stability refers to the ability of power systems to maintain synchronism when subjected to severe disturbances such as faults [1]. In large-scale interconnected power systems, a fault that occurs in a subsystem may cause some synchronous generators to fall out of step unexpectedly, network splitting along critical transmission corridors and, ultimately, blackouts. Online transient stability assessment (TSA) is necessary for system operation since it predicts the post-fault stability status in real time, providing the opportunity for timely execution of remedial action schemes to prevent system collapse.
Time-domain simulation (TDS) is the classical method for TSA. However, TDS is not practical for online prediction of transient stability due to the extensive computational burden of solving the differential algebraic equations. Currently, the prospective methods for online TSA are the transient energy function (TEF) methods, the hybrid methods, the Lyapunov exponents (LEs) methods and the data-mining methods:
(1) The TEF methods construct Lyapunov functions to evaluate the transient energy and predict the post-fault stability by comparing the energy at the fault clearing time with the maximum potential energy that can be absorbed by the post-fault system. TEF models that incorporate high-voltage DC transmission [2] and stochastic renewable generation [3] have been proposed to accommodate the integration of these novel components and to provide a more accurate assessment. However, the main difficulty of TEF-based real-time TSA is the online computation of the controlling unstable equilibrium point (CUEP) and, correspondingly, the maximum potential energy. In [4], a lookup table of potential modes of disturbance is built offline to assist real-time identification of the CUEP. However, the computation of the CUEP is only accelerated, not eliminated.
(2) Hybrid methods originally referred to the combination of TDS and TEF methods. However, with the deployment of phasor measurement units (PMUs), trajectories from TDS are replaced by online measurements, so that hybrid methods can be adapted to real-time application. The concept of potential energy in the critical corridors is used to detect loss of synchronism in [5]. The pair-wise relative energy function is proposed in [6] for fast identification of the critical generators, after which the single machine equivalent (SIME) and the equal-area criterion are employed to qualify the transient stability. Both of the above-mentioned hybrid methods rely on the prediction of the SIME's unstable equilibrium point since they make use of the TEF concepts.
(3) On the basis of the ergodic theory of dynamic systems, LEs can indicate the divergence or convergence of the generator angle trajectories, and thus the synchronism of generators can be detected by tracking the sign of the maximal LE (MLE) [7]. A data-driven approach for LE computation from time-series PMU measurements is proposed in [8], so that the model dependence of [7] is overcome. Nevertheless, the post-fault stability status is still difficult to determine because the MLE may oscillate between positive and negative values before the post-fault system settles down. To address this problem, a recursive least-squares-based method is proposed in [9] for fast estimation of the MLE, which shortens the monitoring window. So far, the connection between the sign change of the MLE and the system's passing through the CUEP has not been studied. The optimal time for remedial control is thus not clear and further research is needed.
(4) As for data-mining-based TSA, post-fault stability can be assessed promptly by feeding the online measurements into the offline-learned predictor. With the advent of the data-rich and processor-rich smart grid environment, data-mining techniques can promote response-based TSA, enabling real-time awareness of transient stability and also response-based remedial control schemes [10]. Some promising results of response-based TSA by data mining have been reported in [10–22]. Compared with neural networks (NNs), decision trees (DTs) offer better transparency [11] and do not rely on the backpropagation (BP) training process [12]. To reduce the input dimension without compromising the predictor's accuracy, a novel DT-based TSA scheme is proposed in [13] by introducing the characteristic ellipsoid (CELL) theory to extract the key features from limited PMU measurements. Another DT-based framework is proposed in [14] to predict the unstable generator grouping pattern in power systems with renewable generation. On the basis of stability prediction, a response-based one-shot control scheme is also developed by using PMU measurements and DTs in [15]. Support vector machine (SVM) and its derivatives such as core vector machine (CVM) have also been applied to develop TSA classifiers. In [16], the SVM classifier takes the proximity of the actual voltage variations to pre-identified templates as inputs, and satisfactory predictions can be made within a six-cycle observation window. Another SVM classifier using TEF-based features as inputs is investigated in [17], and accurate estimations can be provided for multiple contingencies with a maximum load/generation deviation of ±20%. In [18], case studies on two practical power systems in the USA and China have also validated the effectiveness of CVM classifiers.
Apart from training a single conventional predictor, ensemble methods have been proposed as well. Catastrophe predictors based on the random forest (RF) are built on the basis of wide-area severity indices in [10], and these RF-based predictors have shown robustness to different network dynamics when compared with a single DT. An intelligent system (IS) is developed for post-disturbance TSA by using an ensemble of extreme learning machines (ELMs) in [19]. Owing to the fast learning speed of ELMs, the ensemble classifiers can be updated by online pre-disturbance TSA results, thus improving the self-adaptiveness of the IS. Emerging deep learning techniques have also been applied in recent years. Following the framework of [19], the long short-term memory network is used to develop a temporal self-adaptive TSA system in [20]. To develop a scalable TSA framework for large-scale power grids, MapReduce-based parallelised NNs are used for instability prediction and critical unstable generator identification in [21].
All the above-mentioned predictors only classify the post-fault stability status, and none of them provides a quantitative assessment. Multi-variate adaptive regression splines are used in [22] to assess the transient stability margin based on real-time measurements. However, the literature on quantitative TSA predictors such as [22] is very limited.
In this paper, a non-parametric statistics-based scheme is proposed for response-based online assessment of the transient stability margin. The stability margin based on critical clearing time (CCT) is first defined as the predictive response, and 14 kinds of severity indicators are proposed as input features. With no prior knowledge of the correlation structure between the severity indicators and the corresponding stability margin, the non-parametric additive model is used as the basis of the predictor. After a knowledge base is formed, weakly correlated severity indicators are screened out by a novel two-stage feature selection based on non-parametric analytics. The predictor for online TSA is then learned by least-squares regression in the reduced multi-feature space. Since the severity indicators can be computed from post-fault PMU measurements, the stability margin can be estimated by the predictor in real-time operation. An illustrative case study on a modified New England 39-bus system and an application of the proposed scheme to a practical 756-bus transmission system in China are provided.
The rest of this paper is organised as follows. The CCT-based stability margin index and the severity indicators are defined in Section 2. The technical details of the proposed scheme and the fundamentals of non-parametric statistics are introduced in Section 3. A case study on the New England 39-bus system is presented to illustrate the effectiveness of the proposed scheme in Section 4. Application to a practical transmission system is provided in Section 5. Finally, conclusions are drawn in Section 6.

Stability margin and severity indicators
Data mining is performed to establish a projection between the response and the features. For the data-mining-based TSA in this paper, a stability margin index based on CCT is defined as the predictive response, while 14 kinds of response-based severity indicators are proposed as the inputs of the predictor.

CCT-based stability margin index
CCT represents the boundary of transient stability from the perspective of fault clearing time. The margin of fault clearing time, as shown in (1), can be used to evaluate the post-fault stability of power systems, where $t_{cr}$ and $t_{cl}$ are the CCT and the fault clearing time of a fault contingency, respectively.

Response-based severity indicators
Post-fault responses of power angles, rotor speeds and accelerating powers of generators, and voltage magnitudes of buses contain key information relating to transient stability. Numerous severity indicators have been proposed for stability assessment [18, 23–26].
In this paper, 14 kinds of indicators are proposed to form the multidimensional space of input features. Definitions of these indicators are shown in Table 1.
The nomenclature is as follows. Here, $t_0$ and $t_{cl}$ are the fault occurrence time and the fault clearing time, respectively, while $t_{end}$ is the moment at which the observation window ends. $g$ denotes the serial number of a generator while $G$ represents the set of generators. $\delta$, $\omega$, $P_m$ and $P_e$ are the power angle, rotor speed, mechanical power input and electrical power output of a generator. $V$ is the voltage magnitude of the generator bus and $V_N$ is its rated value. In addition, $M_g$ ($g \in G$) is the inertia coefficient of a generator while $M$ is the aggregated inertia coefficient of the power system. $\delta_{COI}$, $\omega_{COI}$ and $P_{COI}$ are the phase angle, the angular frequency and the accelerating power of the centre of inertia (COI). A superscript COI indicates that a quantity is expressed in the COI reference frame.
These indicators capture the post-fault response of a single generator. To evaluate the impact of a fault contingency on the whole system, five statistics of each indicator over the generators are used: the maximum, the minimum, the maximum separation, the average and the standard deviation. Therefore, an input space of 14 × 5 = 70 indicators is formed. These statistics are denoted by $\rho_i^{max}$, $\rho_i^{min}$, $\rho_i^{diff}$, $\rho_i^{mean}$ and $\rho_i^{std}$ in this paper.
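The COI reference and the five statistics can be illustrated with a minimal sketch. It assumes the usual inertia-weighted COI definitions ($\delta_{COI} = \frac{1}{M}\sum_{g} M_g \delta_g$ with $M = \sum_{g} M_g$, and likewise for $\omega_{COI}$), takes the maximum separation to be the maximum minus the minimum, and uses the population standard deviation. These choices, and the function names `to_coi_frame` and `five_statistics`, are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def to_coi_frame(delta, omega, inertia):
    """Express generator angles and speeds in the COI reference frame.

    delta, omega: arrays of shape (T, G), one column per generator.
    inertia: array of shape (G,) holding the inertia coefficients M_g.
    """
    m_total = inertia.sum()                  # aggregated inertia M
    delta_coi = delta @ inertia / m_total    # inertia-weighted angle, shape (T,)
    omega_coi = omega @ inertia / m_total    # inertia-weighted speed, shape (T,)
    return delta - delta_coi[:, None], omega - omega_coi[:, None]

def five_statistics(values):
    """Collapse one per-generator indicator into five system-level statistics."""
    v = np.asarray(values, dtype=float)
    return {
        "max": v.max(),
        "min": v.min(),
        "diff": v.max() - v.min(),   # assumed meaning of 'maximum separation'
        "mean": v.mean(),
        "std": v.std(),              # population standard deviation (assumption)
    }
```

Applying `five_statistics` to each of the 14 per-generator indicators yields the 70-dimensional input vector described above.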

Proposed scheme
The proposed scheme for online TSA consists of four stages: data preparation, feature reduction, predictor training and online application. The technical details of these four stages are introduced in this section along with the fundamentals of non-parametric statistics.

Data preparation
The performance of a predictor relies heavily on the comprehensiveness of the knowledge base. For TSA, the uncertainties that affect post-fault stability are the pre-fault operating condition (OC), the fault location and the fault clearing time. A database of OCs should first be produced from historical or forecasted OC data. To enrich this database, more stochastic OCs can be generated by random variation of loads and generations. For each OC, a transmission line is selected as the faulted device according to its average failure rate, and the fault clearing time is sampled from its probability distribution. For each fault contingency, TDS is performed to compute the post-fault responses, the severity indicators and the CCT-based stability margin. Each instance of fault contingency is then represented by a row vector of the severity indicators and the corresponding stability margin. Simulation and recording of fault contingencies are repeated until a knowledge base of a given scale is generated.

Feature reduction by two-stage non-parametric analytics
During the stage of offline learning, feature reduction and predictor training are the two major issues. Feature reduction screens out weakly correlated severity indicators, which do not contribute significantly to the performance of the trained TSA predictor but increase the computational burden of predictor training. Predictor training, on the other hand, learns the marginal correlation structure and the related parameters.
There is little prior knowledge indicating that the correlation takes a linear form or belongs to any other finite-dimensional parametric family. To address this difficulty, the non-parametric additive model shown in (10) can be used for better flexibility and approximation accuracy:
$$\eta = M(X) + \varepsilon = \mu + \sum_{i=1}^{N} f_i(x_i) + \varepsilon \qquad (10)$$
where $X = [x_1, x_2, \ldots, x_N]$ and $N$ is the number of severity indicators. For simplicity, the severity indicators $\rho$ are rewritten as $x$, and 'feature $x$' always refers to a severity indicator in the rest of this paper. $M(\cdot)$ denotes the correlation between the stability margin $\eta$ and the severity indicators $X$, $\mu$ is the intercept, $f_i(\cdot)$ is the non-parametric component relating to the $i$th indicator $x_i$ and $\varepsilon$ is an unobserved regression error. For feature reduction in the high-dimensional non-parametric additive model, a two-stage feature selection technique is proposed by combining non-parametric independence screening (NIS) and group Lasso penalised regression.

Feature reduction based on NIS:
The principle of NIS [27] is to identify and screen out the weakly correlated features by fitting the univariate correlations and thresholding the regression error of the fitting functions.
Supposing there is a correlation between the stability margin $\eta$ and the severity indicator $x_i$, this correlation can be estimated by a fitting function as shown in (11):
$$\eta = f_i(x_i) + \varepsilon_i \qquad (11)$$
where $f_i(x_i)$ denotes the fitting function and $\varepsilon_i$ is the regression error, which is assumed to be normally distributed. The goodness of each fitting function can be evaluated by the root mean square error (RMSE) as shown in (12):
$$\mathrm{RMSE}_i = \sqrt{\frac{1}{K}\sum_{k=1}^{K}\bigl(\eta_k - f_i(x_{i,k})\bigr)^2} \qquad (12)$$
where $k$ denotes the serial number of instances and $K$ is the number of instances. Without any prior knowledge of the parametric model for correlation fitting, the basis-spline (B-spline) function is often used for non-parametric regression because of its ability to fit both linear and non-linear correlations. The B-spline function can be represented by a linear combination of basis functions as shown in (13):
$$f_i(x_i) = \sum_{j=1}^{d} \beta_j\,\phi_{j,\gamma}(x_i) \qquad (13)$$
where the basis function $\phi_{j,\gamma}(x_i)$ is a piecewise polynomial function of order $\gamma$, $d$ denotes the degree of freedom and $\beta_j$ represents the contribution coefficient of the basis function $\phi_{j,\gamma}(x_i)$. Third-order B-splines are used in this paper, so $\gamma$ is chosen as 3. To fit non-linear correlations smoothly, the marginal degree of freedom has to be searched by repeatedly performing univariate regression under different degrees of freedom. In this paper, the degree of freedom takes a value in the set $\{d \mid 3 \le d \le 5,\ d \in \mathbb{Z}\}$.
All the severity indicators are ranked by RMSE, and features with an RMSE higher than a given threshold are then screened out. The procedure of feature reduction based on NIS is summarised by Algorithm 1 (see Fig. 1). Each row of the data matrix s represents an instance. The last column of s is the stability margin η while the remaining columns are the severity indicators.
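The screening step can be sketched as follows. This is only an illustration under stated assumptions, not the paper's Algorithm 1: the B-spline basis is built with the Cox-de Boor recursion using cubic pieces, interior knots are placed at quantiles, the first basis column is dropped so that an intercept is fitted separately, and the degree of freedom with the smallest training RMSE is retained. The function names and the knot placement are hypothetical.

```python
import numpy as np

def bspline_basis(x, df, degree=3):
    """Return a (len(x), df) B-spline design matrix via the Cox-de Boor recursion."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    n_inner = df - degree
    if n_inner > 0:
        inner = np.quantile(x, np.linspace(0.0, 1.0, n_inner + 2)[1:-1])
    else:
        inner = np.empty(0)
    knots = np.concatenate([np.full(degree + 1, lo), inner, np.full(degree + 1, hi)])
    m = len(knots)
    # Degree-0 basis: indicator of each half-open knot interval.
    B = np.zeros((len(x), m - 1))
    for j in range(m - 1):
        B[:, j] = (knots[j] <= x) & (x < knots[j + 1])
    B[x == hi, m - degree - 2] = 1.0  # close the last interval on the right
    for k in range(1, degree + 1):
        B_next = np.zeros((len(x), m - k - 1))
        for j in range(m - k - 1):
            left = knots[j + k] - knots[j]
            right = knots[j + k + 1] - knots[j + 1]
            if left > 0:
                B_next[:, j] += (x - knots[j]) / left * B[:, j]
            if right > 0:
                B_next[:, j] += (knots[j + k + 1] - x) / right * B[:, j + 1]
        B = B_next
    return B[:, 1:]  # drop one column; the intercept is fitted separately

def univariate_rmse(x, eta, df):
    """RMSE of the least-squares B-spline fit of eta on a single feature x."""
    design = np.column_stack([np.ones(len(x)), bspline_basis(x, df)])
    coef, *_ = np.linalg.lstsq(design, eta, rcond=None)
    resid = eta - design @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

def nis_screen(X, eta, threshold, dfs=(3, 4, 5)):
    """Keep the features whose best univariate RMSE does not exceed threshold."""
    rmse = np.empty(X.shape[1])
    best_df = np.empty(X.shape[1], dtype=int)
    for i in range(X.shape[1]):
        scores = [univariate_rmse(X[:, i], eta, d) for d in dfs]
        best = int(np.argmin(scores))
        rmse[i], best_df[i] = scores[best], dfs[best]
    keep = np.flatnonzero(rmse <= threshold)
    return keep, rmse, best_df
```

Here `X` plays the role of the indicator columns of s and `eta` its last column; `best_df` records the marginal structure that the later stages reuse.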

Feature reduction based on group Lasso:
Group Lasso [28] is an enhanced penalised method for feature selection and non-parametric regression. In the traditional Lasso, the components $f_i$ of the additive model are restricted to a linear form, which makes it incapable of accurately estimating non-linear correlations. In contrast, feature reduction based on group Lasso is achieved by penalising the non-parametric components of the additive model. Assuming that the non-parametric components share the same structures as the marginal B-spline univariate functions determined in the NIS stage, the non-parametric additive model can be rewritten as (14):
$$\eta = \mu + \sum_{i}\sum_{j=1}^{d_i} \beta_{ij}\,\phi_{ij}(x_i) + \varepsilon \qquad (14)$$
where $d_i$ is the degree of freedom selected for the $i$th indicator. The estimation of (14) by group Lasso can be expressed as the optimisation problem in (15), where $\lambda$ is the penalty factor and $w_i$ is the weighting factor for the $i$th severity indicator. As $\lambda$ increases, the coefficients $\beta$ of some features may shrink to 0. This characteristic of group Lasso ensures that ineffective indicators can be removed from the non-parametric additive model.
Usually, the weighting factors cannot be determined in advance. To avoid the extra computational burden of tuning the weighting factors, the adaptive group Lasso is used. Its procedure is as follows:
(1) Initialise all the weighting factors as 1 and set the iterator l to 0.
(2) Solve the group Lasso problem (15) with the current weighting factors to obtain μ and β.
(3) Assign new values to the weighting factors w according to (16).
(4) Compute the maximum difference Δ between the weighting factors at the current stage and those at the previous stage according to (17).
In summary, the two-stage feature reduction is carried out by successively performing the NIS and the group Lasso. NIS helps determine the marginal structure of the non-parametric components and screens out weakly correlated features in the high-dimensional non-parametric additive model. Whereas the NIS evaluates the features independently, the group Lasso performs feature reduction in a joint multi-variate space, providing a comprehensive evaluation of multiple features.
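The adaptive loop, including the stopping test on Δ, can be sketched as below. Since (15) and (16) are not reproduced here, the sketch assumes the objective $\frac{1}{2K}\lVert \eta - \mu - \sum_i \Psi_i \beta_i \rVert_2^2 + \lambda \sum_i w_i \lVert \beta_i \rVert_2$ and the common adaptive update $w_i = 1/\lVert \beta_i \rVert_2$; the proximal-gradient solver, the cap `w_cap` on the weights of eliminated groups and all function names are likewise assumptions rather than the authors' implementation.

```python
import numpy as np

def group_lasso(blocks, y, lam, weights, n_iter=2000):
    """Proximal-gradient solver for the assumed group Lasso objective.

    blocks: list of (K, d_i) basis matrices, one per feature.
    """
    K = len(y)
    Psi = np.hstack(blocks)
    col_mean = Psi.mean(axis=0)
    Psi_c = Psi - col_mean            # centring lets mu be recovered in closed form
    y_c = y - y.mean()
    edges = np.cumsum([0] + [b.shape[1] for b in blocks])
    step = K / np.linalg.norm(Psi_c, 2) ** 2   # 1 / Lipschitz constant of the gradient
    beta = np.zeros(Psi.shape[1])
    for _ in range(n_iter):
        z = beta - step * (Psi_c.T @ (Psi_c @ beta - y_c)) / K
        for i, (a, b) in enumerate(zip(edges[:-1], edges[1:])):
            norm = np.linalg.norm(z[a:b])
            # Group soft-thresholding: shrink the whole block or set it to zero.
            shrink = max(0.0, 1.0 - step * lam * weights[i] / norm) if norm > 0 else 0.0
            z[a:b] *= shrink
        beta = z
    mu = y.mean() - col_mean @ beta
    return mu, [beta[a:b] for a, b in zip(edges[:-1], edges[1:])]

def adaptive_group_lasso(blocks, y, lam, tol=1e-3, max_outer=20, w_cap=1e6):
    w = np.ones(len(blocks))                                        # step (1)
    for _ in range(max_outer):
        mu, betas = group_lasso(blocks, y, lam, w)                  # step (2)
        norms = np.array([np.linalg.norm(b) for b in betas])
        w_new = np.minimum(1.0 / np.maximum(norms, 1e-12), w_cap)   # step (3), assumed rule
        delta = np.max(np.abs(w_new - w))                           # step (4)
        w = w_new
        if delta < tol:                                             # stopping test on delta
            break
    return mu, betas, w
```

Features whose coefficient block comes back as exactly zero are the ones removed by the second stage.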

Predictor training
After two-stage feature reduction, the indicators that are preserved in the non-parametric additive model are selected to form a reduced multi-feature space. With this reduced input space, the TSA predictor can be trained by component-wise least-squares regression. The non-parametric additive model is rewritten as (18):
$$\eta = \mu + \sum_{i=1}^{N_R}\sum_{j=1}^{d_i} \alpha_{ij}\,\phi_{ij}(x_i) + \varepsilon \qquad (18)$$
In (18), the coefficients are represented by $\alpha$ to distinguish them from those solved by group Lasso, $d_i$ is the degree of freedom of the $i$th component and $N_R$ denotes the number of severity indicators remaining after reduction.
Given the reduced data matrix $S_R$, the B-spline basis matrix $\Psi$ is formed from the basis functions $\phi_{ij}$, and the coefficients $\alpha$ are then obtained by least squares. Once the coefficients are solved, the non-parametric additive model in (18) can be used for fast prediction of the transient stability margin.
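A minimal sketch of this training step is given below. It assumes that the per-feature basis matrices are stacked column-wise into Ψ together with a column of ones for the intercept, and that the coefficients are the ordinary least-squares solution, which coincides with the normal-equation form $(\Psi^{T}\Psi)^{-1}\Psi^{T}\eta$ when Ψ has full column rank. The function names are hypothetical.

```python
import numpy as np

def fit_additive_ls(blocks, eta):
    """Least-squares fit of eta = mu + sum_i Psi_i alpha_i.

    blocks: list of (K, d_i) B-spline basis matrices of the retained features.
    """
    K = len(eta)
    design = np.hstack([np.ones((K, 1))] + list(blocks))
    coef, *_ = np.linalg.lstsq(design, eta, rcond=None)
    return coef[0], coef[1:]          # intercept mu, stacked coefficients alpha

def predict_additive(blocks, mu, alpha):
    """Evaluate the trained additive predictor on new basis matrices."""
    return mu + np.hstack(blocks) @ alpha
```

In online use, `blocks` would be rebuilt from the freshly computed severity indicators with the same knots as in training, so that prediction reduces to one matrix-vector product.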

Online application
Although the predictor for TSA is trained offline, online application is only possible if the severity indicators can be computed from real-time measurements. Provided that PMUs are installed at the HV buses of power plants, the power angle $\delta$ and the rotor speed $\omega$ of generators can be approximated by the phase angle and angular frequency of these buses. The electrical power output of generators $P_e$ can be approximated by the active power flow of the step-up transformer. Considering the delay of speed governors in adjusting the mechanical power input, $P_m$ is assumed to remain at its pre-fault value, thus $P_m \simeq P_{m0} = P_{e0}$. Moreover, the voltage magnitude $V$ of the generators' terminal buses can be measured directly by PMUs. On the basis of these assumptions, the severity indicators can be computed from real-time PMU measurements.
With the above-mentioned assumptions, once a fault is detected in real-time operation, the post-fault PMU measurements within the observation window are used to compute all the severity indicators. After that, the post-fault stability margin can be estimated by feeding the severity indicators to the predictor.
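As a small illustration of the measurement-based approximation, the accelerating power of each generator can be estimated from the transformer active power flow alone. The sketch assumes that the flow samples are arranged as a (T, G) array and that the pre-fault flow stands in for $P_m$; the function name is hypothetical.

```python
import numpy as np

def accelerating_power_from_pmu(p_flow, p_flow_prefault):
    """Approximate P_m - P_e for every generator and time sample.

    p_flow: (T, G) post-fault active power flows of the step-up transformers.
    p_flow_prefault: (G,) pre-fault flows, used as P_m (P_m ~ P_m0 = P_e0).
    """
    p_e = np.asarray(p_flow, dtype=float)
    p_m = np.asarray(p_flow_prefault, dtype=float)
    return p_m - p_e   # broadcasts the pre-fault values over the time axis
```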

Illustrative case study
The proposed scheme is first illustrated by a case study on a modified New England 39-bus system with centralised wind power integration. The synchronous generator at Bus #37 is replaced by 200 × 3.6 MW doubly-fed induction generators (DFIGs). All the DFIGs operate in constant power factor mode with the power factor set to 1.0.

Data generation
Operating conditions of nine different loading scenarios (varying from 80 to 120% of the base condition in increments of 5%) are used to form the database of OCs. To simulate the uncertainty of real-time operation and enrich the knowledge base, stochastic OCs are also generated by random sampling. Wind power generation at Bus #37 is sampled according to the Weibull distribution, while the active power outputs of synchronous generators vary from 50 to 150% of their base condition. The unbalanced power is met by the synchronous generator at Bus #39.
The fault location is randomly chosen at 0, 50 or 100% of the length of an arbitrary transmission line. The fault clearing time is modelled as a normal distribution with a mean value of 0.2 s and a standard deviation of 0.02 s. TDS is performed to simulate the post-fault responses, where the simulation time is 10 s.
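The contingency sampling described above can be sketched as follows. The number of candidate lines `n_lines`, the optional `failure_rates` weighting and the function name are hypothetical; a uniform choice of line is assumed when no failure rates are supplied.

```python
import numpy as np

def sample_contingency(rng, n_lines, failure_rates=None):
    """Draw one fault contingency: (line index, fault location, clearing time)."""
    if failure_rates is None:
        p = None                                  # uniform line choice (assumption)
    else:
        p = np.asarray(failure_rates, dtype=float)
        p = p / p.sum()                           # normalise rates to probabilities
    line = int(rng.choice(n_lines, p=p))
    location = float(rng.choice([0.0, 0.5, 1.0])) # 0, 50 or 100% of the line length
    t_cl = float(rng.normal(0.2, 0.02))           # clearing time in seconds
    return line, location, t_cl
```

Each sampled triple would then be passed to the TDS tool together with a sampled OC to produce one instance of the knowledge base.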
About 2000 instances are generated. About 80% of them are randomly chosen to form the training set while the others serve as testing data. With PMUs installed at the synchronous generator buses, the severity indicators are computed from the relevant post-fault trajectories, and the observation window is set to 0.1 s after fault clearing. The CCT-based stability margin index is searched by recursive simulations.
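One common way to organise such a repeated-simulation search is bisection on the clearing time, sketched below. Whether the authors used bisection is not stated, so this is an assumption; `is_stable` is a hypothetical callback that runs a TDS with the given clearing time and reports whether synchronism is maintained.

```python
def critical_clearing_time(is_stable, t_low=0.0, t_high=1.0, tol=1e-3):
    """Bisection search for the CCT between a stable and an unstable clearing time."""
    if not is_stable(t_low):
        raise ValueError("lower bound must be a stable clearing time")
    if is_stable(t_high):
        raise ValueError("upper bound must be an unstable clearing time")
    while t_high - t_low > tol:
        mid = 0.5 * (t_low + t_high)
        if is_stable(mid):
            t_low = mid      # still stable: the CCT lies above mid
        else:
            t_high = mid     # unstable: the CCT lies below mid
    return t_low             # largest clearing time known to be stable
```

The number of simulations per contingency grows only logarithmically with the required resolution, roughly ten runs for a 1 ms tolerance on a 1 s bracket.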

NIS
After data generation, NIS is performed to determine the structure of the additive model and identify the severity indicators that are weakly correlated to the stability margin. The goodness of the univariate fitting functions is evaluated by RMSE and the results are shown in Fig. 2.
As can be seen from Fig. 2, the 7th and the 14th severity indicators, which are the coherency index of generator rotor speeds ($\rho_7$) and the dot-product index of generator power angle and rotor speed ($\rho_{14}$), are ranked as the top two, making them the features most correlated with the stability margin $\eta$. In addition, for all the severity indicators in general, the minimum statistic is less correlated than the other four statistics.

Tuning of the parameters
The threshold for NIS and the penalty factor for group Lasso are two key parameters to be tuned. The marginal threshold for NIS is first studied while the penalty factor is set to 0.5. The goodness of the non-parametric predictor and the number of input features in this predictor are shown in Fig. 3. For both simplicity and accuracy of the TSA predictor, the threshold for NIS is chosen as 0.333, so the first 50 indicators are selected.
After determining the threshold for NIS, the marginal penalty factor for group Lasso regression is then studied. The goodness of the non-parametric predictor and the number of input features in this predictor are shown in Fig. 4. As can be seen from Fig. 4, the number of severity indicators in the trained predictors decreases as the penalty factor increases, which in turn reduces the predicting accuracy.

Testing data validation
With the NIS threshold and the penalty factor set to 0.333 and 0.5, respectively, a non-parametric predictor is trained on the reduced set of severity indicators. The TSA predictor takes the non-parametric additive form of (18). To demonstrate the non-parametric components of this functional predictor, the B-spline component relating to the 68th input feature ($\rho_{14}^{diff}$) is shown in (22) as an example. To understand the non-linear correlation between $\rho_{14}^{diff}$ and the stability margin $\eta$, this B-spline component is also plotted in Fig. 5.
After predictor training, the testing data are used to assess the goodness of the trained predictor. The stability margin is estimated by feeding the selected severity indicators into the non-parametric predictor. The regression between the stability margin $\eta$ and the estimated value $\bar{\eta}$ is plotted in Fig. 6. Each point in Fig. 6 represents an instance in the testing set. As can be seen from Fig. 6, all the points lie close to the line $\bar{\eta} = \eta$, indicating that the predictor has properly fitted the correlation between the stability margin and the severity indicators.
Statistical analysis is performed to investigate the probability distribution of the predicting error. The histogram of the predicting error is shown in Fig. 7. The predicting error follows a normal distribution. The mean μ of the predicting error is 0 while its standard deviation σ is 0.0496. According to the characteristics of the normal distribution, the μ ± 2σ interval includes about 95% of all the instances. Therefore, when applying the predictor to unseen instances, the predicting error is expected to lie within ±0.0992 at the 95% confidence level.
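The arithmetic behind this bound can be checked in a few lines. The sketch only assumes that the error is exactly normal with the reported mean and standard deviation; the function name is hypothetical.

```python
import math

def two_sigma_bound(sigma, mu=0.0):
    """Return the mu ± 2*sigma interval and its coverage under a normal law."""
    half_width = 2.0 * sigma
    coverage = math.erf(2.0 / math.sqrt(2.0))   # P(|Z| <= 2) for standard normal Z
    return (mu - half_width, mu + half_width), coverage

interval, coverage = two_sigma_bound(0.0496)
print(f"{interval[1]:.4f}")   # 0.0992
print(f"{coverage:.4f}")      # 0.9545
```

The exact two-sigma coverage is therefore slightly above the 95% quoted in the text.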

Impact of measurement length
Both the 7th and the 14th severity indicators depend on the length of the observation window, so the impact of measurement length on predictor performance cannot be ignored. Predictors of different measurement lengths are trained on an identical set of input features. The performance of these predictors is evaluated by RMSE and the comparative result is shown in Fig. 8. As the observation window is lengthened, the predicting error decreases and the performance of the predictors improves. For fast evaluation of the stability margin while keeping the expected maximum predicting error below 0.1, the measurement length is chosen as five cycles. Notably, the measurement length in the existing literature varies from four cycles to ten cycles [16, 17, 19, 20]. Therefore, the observation window of the proposed TSA predictor is comparable with those of existing predictors based on other data-mining techniques.

Impact of measurement error
According to the 'IEEE Standard for Synchrophasors for Power Systems', PMUs with level 1 compliance should have a total vector error <1% [29]. To study the impact of measurement error on predictor performance, a random error between −1 and 1% is added to the simulated post-fault trajectories in the testing data. With an observation window of five cycles, the indicators are then recomputed from these noisy trajectories. The histogram of the predicting error for the noisy testing data is shown in Fig. 9. The predicting error follows a normal distribution. The mean and the standard deviation of the predicting error are −0.0138 and 0.1843, respectively. Clearly, the predicting performance on the noisy testing data is worse than that on the noise-free testing data.
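The noise injection can be sketched as follows, assuming that the error is relative (multiplicative) and uniformly distributed within ±1%. The distribution of the error is not stated in the text, so the uniform choice and the function name are assumptions.

```python
import numpy as np

def add_measurement_noise(trajectories, rng, max_rel_error=0.01):
    """Perturb every sample by an independent relative error within ±max_rel_error."""
    traj = np.asarray(trajectories, dtype=float)
    rel = rng.uniform(-max_rel_error, max_rel_error, size=traj.shape)
    return traj * (1.0 + rel)
```

Applying the same function to the training trajectories gives the noisy training set used when the predictor is re-trained.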
However, if the TSA predictor is re-trained by using the noisy training data, the predicting performance is improved. In this case, the mean and the standard deviation of the predicting error are 0.0035 and 0.0584, respectively. The histogram of the predicting error in this case is shown in Fig. 10. A similar conclusion about the impact of measurement errors can be found in [30].

Impact of topology change
Topology change has a significant impact on the post-fault stability of power systems. Therefore, the robustness of TSA predictors should be tested under scenarios of topology change. For this purpose, 1000 instances are generated under stochastic N − 1 pre-fault OCs. The predictions for these instances and the distribution of the prediction error are shown in Figs. 11 and 12, respectively. The RMSE of these instances is 0.0498, so it can be concluded that the predictor performs consistently under N − 1 conditions.

Comparison of different predictors
A comparison is performed among the proposed method, multilinear regression (MLR), NN, regression tree (RT) and support vector regression (SVR). An identical set of instances is utilised for predictor training, with the rest used for testing. The predicting errors of these techniques are given in Table 2. The NN-based predictor has the best overall performance among all the techniques. The RT-based predictor is an unpruned tree with 1471 child nodes and thus suffers badly from overfitting. There is a trade-off between the accuracy and the transparency of data-mining technologies [11]. The proposed non-parametric predictor has an explicit formulation rather than a 'black-box' projection, which provides better interpretability while maintaining an evaluation accuracy comparable with that of the NN.

Application in a practical system
The proposed scheme is also applied to a practical 756-bus transmission system in China [14]. The 500 kV backbone network is shown in Fig. 13. Detailed dynamic models of synchronous generators, turbines, speed governors, excitation systems and power system stabilisers are used for simulation.

Data generation
Following the proposed TSA scheme and the previous New England 39-bus system case, a database of OCs is first generated from the OCs of different loading scenarios and the stochastic OCs. The fault location is randomly selected among the 500 kV transmission lines, while the fault clearing time is again modelled as a normal distribution with a mean value of 0.2 s and a standard deviation of 0.02 s. The length of the observation window is set to five cycles. TDS is performed to simulate the post-fault responses and to compute the severity indicators and the transient stability margin. About 2500 TSA instances under normal OCs and 500 TSA instances under N − 1 OCs are generated.

Feature selection
About 2000 instances of normal OCs are used for predictor training, while the others, including the remaining instances of normal OCs and all the instances of N − 1 OCs, serve as testing data. On the training data, NIS is performed to determine the structure of the additive model and identify the severity indicators that are weakly correlated to the stability margin. The result of correlation evaluation is shown in Fig. 14. The threshold for NIS and the penalty factor for group Lasso are tuned to be 0.25 and 0.5, respectively. The group Lasso algorithm is employed to carry out the second stage of feature selection. Eventually, 27 indicators are selected to form the input space for predictor training.

Predictor training and application results
A TSA predictor is then trained with the selected severity indicators. The stability margins of all the testing instances are estimated by the trained predictor, and the RMSE of these instances is 0.0437. The regression between the stability margin and its estimation for the testing instances is shown in Fig. 15, and the distribution of the predicting error is given in Fig. 16. It can be concluded that the proposed non-parametric statistics-based scheme can also provide a promising assessment of the post-fault stability margin in the larger-scale practical transmission system.

Conclusion
A non-parametric statistics-based scheme is proposed for response-based online TSA in power systems. To train a TSA predictor, the CCT-based transient stability margin and 14 kinds of severity indicators are defined as the predictive response and the input features, respectively. To address the lack of prior knowledge of the correlation structure, the non-parametric additive model is used as the basis of the TSA predictor. To screen out the indicators that do not significantly improve the performance of the predictor, two-stage feature selection is carried out by NIS followed by group Lasso penalised regression. After that, the predictor is learnt by component-wise least-squares regression in the reduced multi-feature space. With PMU measurements at generator buses, the severity indicators can be computed in real time and the post-fault stability margin can be evaluated quickly by the offline-trained predictor.
Case studies on the modified New England 39-bus system and the practical 756-bus transmission system in China are provided to illustrate the effectiveness of the proposed scheme. Numerical results demonstrate that: (1) with an observation window of five cycles, the RMSEs of the well-trained predictors in these test systems are 0.0496 and 0.0437, which indicates an expected maximum error <0.1 and satisfies the need for accurate estimation of the post-fault stability margin; (2) the proposed two-stage non-parametric statistics-based feature selection scheme can identify the weakly correlated indicators and thus reduce the dimension of the input features; and (3) the proposed non-parametric TSA predictor is able to adapt to normal and possible N − 1 pre-fault OCs.
Future research will focus on two aspects. First, PMU measurements at all the generator buses are assumed to be available in this paper. For practical application, the absence of PMU measurements at some generator buses and the problem of missing data should be addressed, so the robustness of the TSA predictor under incomplete observation should be studied and improved. Second, remedial control schemes should be taken when the post-fault system is about to lose synchronism. Hence, a data-mining-based scheme for response-based control should be studied and combined with the proposed TSA predictor.

Acknowledgment
This work was supported by the National Natural Science Foundation of China (No. 51437003).

(5) If the maximum difference Δ of the weighting factors in the adaptive group Lasso is less than a given threshold, stop the loop; the values of w, μ and β at the current stage are then the final result of predictor training. Otherwise, return to step (2) and continue the computation.

Fig. 2 Goodness of the univariate fitting functions evaluated by RMSE

Fig. 9 Histogram of the predicting error for the noisy testing data (predictor trained by noise-free training data)

Fig. 13 500 kV bulk grid of the practical transmission system

Fig. 15 Regression between the stability margin and its estimation by predictor for the testing data in the practical transmission system case