Real‐time risk assessment of cascading failure in power system with high proportion of renewable energy based on fault graph chains

The integration of a high proportion of renewable energy has increased the randomness and complexity of cascading failures (CFs) in power systems. To address this, a real-time risk assessment method for CFs in power systems with a high proportion of renewable energy is proposed. First, drawing on historical statistical data and relevant national standards, a CF simulation model is proposed that considers the off-grid protection action of renewable energy units during a grid fault. The model is based on the continuous steady-state power flow model and simulates the spread of CFs via successive power flow calculations. Second, the concept of a fault graph chain is introduced to describe the electrical and topological characteristics of the continuously changing power system state during a CF. Then, through repeated CF simulation and a replay buffer, a data-driven method is used to calculate the CF risk index corresponding to each fault graph. Finally, a cascaded graph neural network is employed to fit the nonlinear mapping between fault graphs and CF risk indices. Simulation results on the IEEE 39-bus system show that the proposed method can evaluate the risk of CFs accurately and in real time.

leading to a cascading failure (CF) of a larger section of the network. Typically, power systems are designed to be resistant to CF, but it may be unavoidable.
A CF in a power system is a process that starts with a sudden local failure and spreads through the grid, causing successive failures of other components and potentially ending in a blackout. 1,2 In recent years, large blackouts caused by CFs in power systems with a high proportion of new energy have occurred all over the world.
Simulation modeling of the CF process is an important basis for accurate assessment of CF risk. Existing CF modeling methods fall mainly into two categories: those based on continuous quasi-steady power flow and those based on time domain simulation. Models based on time domain simulation 3,4 consider the transient process of generators and other components during CF propagation and use a dynamic model to simulate the CF. However, the transient process is computationally expensive, which seriously reduces simulation efficiency. Models based on continuous quasi-steady power flow calculation mainly include the OPA model, the Manchester model, and so forth. 5-7 These models mainly describe line overload tripping and its further propagation. However, many large power outages have shown that the off-grid protection action of new energy units is also an important cause of CF propagation. Because power electronic devices are voltage sensitive, modeling without this factor cannot accurately reflect the CF propagation process in a power system with a high proportion of new energy. Thus, the influence of voltage should be considered in the analysis of CFs.
On the other hand, how to analyze CF simulation data effectively and derive CF risk assessment methods has attracted extensive attention from scholars at home and abroad. Reference 8 quantifies the interaction between faults of power system components from statistical fault data to determine the key lines and key components in the CF process; Reference 9 identifies key components based on Markov models and influence diagrams. However, these statistics-based models are too simple to reflect the propagation process of CFs accurately. References 10-13 model CF as a search problem over Markov fault chains and then conduct risk assessment. However, with the increasing proportion of new energy sources, a discrete Markov state space often cannot accurately reflect the state changes of the power system during CF propagation. A continuous state space can reflect these characteristics more accurately, but it faces a trade-off between model accuracy and computational complexity.
To address the above problems, this article first establishes a CF model that considers the off-grid protection action of new energy units and simulates CF propagation through continuous quasi-steady power flow calculation. Second, by introducing the fault graph chain (FGC), the operating state of the system under a given power flow section during fault propagation is described as graph-structured data. Drawing on the experience replay buffer used in reinforcement learning, the fault graph under each power flow section is replayed, and its risk index, the value at risk (VaR), is calculated in a data-driven manner. Then, the cascaded graph neural network is trained offline on a large amount of FGC data generated by the simulation model.
The main contributions of this article can be summarized as follows: (i) The FGC is introduced to describe the characteristics of fault propagation in the form of a graph structure, where the system adjacency matrix and feature matrix describe the topological and electrical changes of the power system during the CF. The data capture continuous changes of system states rather than discrete ones. (ii) Based on the resulting system fault graph, a novel CF risk analysis framework is proposed to compute the risk index. Using techniques borrowed from reinforcement learning, a neural network is trained to fit the nonlinear mapping between the system fault graph and its VaR index.
The rest of this article is organized as follows. Section 2 describes the CF model for systems with a high proportion of new energy, and Section 3 introduces the procedure of CF simulation and risk index calculation and proposes an online CF risk assessment framework. In Section 4, the IEEE 39-node system is used to verify the effectiveness of the method, followed by the conclusion in the last section.

CASCADING FAILURE SIMULATION MODEL OF POWER SYSTEM WITH HIGH PROPORTION OF NEW ENERGY
In recent years, OPA and its derivative models based on continuous power flow calculation 5,6 have been widely used in CF analysis because they balance the physical fidelity of the model against simulation speed. 14,15 Building on the OPA model, this article further considers the off-grid protection action of power electronic equipment due to insufficient tolerance during CFs. The proposed model includes the following improvements.

Processing and power balancing of islands
When the initial fault occurs or the cascading fault spreads, a line may go out of service due to overload, natural disasters, and so forth, which changes the power system topology and may form power islands. When an island contains no load or no generator, its power cannot be balanced, and the island is taken out of operation. If an island contains both loads and generators, power is balanced either by shedding load in equal proportions or by reducing the active output of the generators.
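The island rule above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `balance_island` is hypothetical, network losses and generator limits are ignored, and equal proportional scaling is assumed on the generator side as well as the load side.

```python
def balance_island(gen_mw, load_mw):
    """Balance one island; returns (new_gen_mw, new_load_mw).

    gen_mw and load_mw are lists of active power (MW) per generator and
    per load. Losses and generator limits are ignored (assumption).
    """
    total_gen, total_load = sum(gen_mw), sum(load_mw)
    # No generation or no load: the island cannot be balanced and is dropped.
    if total_gen <= 0 or total_load <= 0:
        return [0.0] * len(gen_mw), [0.0] * len(load_mw)
    if total_gen >= total_load:
        # Surplus generation: scale all generator outputs down equally.
        k = total_load / total_gen
        return [g * k for g in gen_mw], list(load_mw)
    # Generation deficit: shed every load by the same proportion.
    k = total_gen / total_load
    return list(gen_mw), [d * k for d in load_mw]
```

For example, an island with 150 MW of generation and 120 MW of load keeps all load and scales generation to 120 MW, while an island with 50 MW of generation and 100 MW of load sheds half of every load.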

Off-grid protection action of new energy unit
When a line goes out of operation due to an initial fault or a preceding-stage fault, the power flow is redistributed, which may lead to over- or undervoltage at the nodes where new energy units are connected to the grid. When the terminal voltage exceeds the threshold, the new energy unit may be taken out of operation by its protection. Combining the existing literature and the current high and low voltage ride-through standards for new energy units, this article considers a random off-grid protection action for new energy units, as shown in formula (1).
where $U_g$ is the voltage at the grid connection point of the new energy unit under a given power flow section; $U_l^{t}$ and $U_h^{t}$ are the low and high voltage thresholds, specified by the ride-through standards, within which the unit is not allowed to disconnect from the grid; $U_l^{\min}$ and $U_h^{\max}$ are the low and high voltage values, specified by the ride-through standards, at which a running unit is immediately disconnected from the grid; and $p_{g\text{-trip}}$ is the probability that the new energy unit performs the off-grid protection action when the voltage at the grid connection point fluctuates.
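A rule of this kind can be sketched in code from the definitions above. The sketch assumes that the trip probability is zero inside the ride-through band, one beyond the immediate-disconnection limits, and rises linearly in between; the linear ramp is an assumption, and the exact form of formula (1) may differ. The function name is hypothetical, and the default thresholds are the per-unit values used later in the case study.

```python
def unit_trip_probability(u_g, u_t_l=0.9, u_t_h=1.1, u_min_l=0.3, u_max_h=1.2):
    """Probability that a new energy unit trips at terminal voltage u_g (pu)."""
    # Inside the ride-through band the unit must stay connected.
    if u_t_l <= u_g <= u_t_h:
        return 0.0
    # Beyond the hard limits the unit disconnects immediately.
    if u_g <= u_min_l or u_g >= u_max_h:
        return 1.0
    # Assumed linear ramp between the threshold and the hard limit.
    if u_g < u_t_l:
        return (u_t_l - u_g) / (u_t_l - u_min_l)
    return (u_g - u_t_h) / (u_max_h - u_t_h)
```

During a simulation step, a unit would then be tripped when a uniform random draw falls below this probability.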

Line overload tripping
An initial failure or a preceding-stage failure can trigger a power flow transfer in the system, putting lines at risk of overloading. This article considers a random line tripping rule, 16 as shown in formula (2).
where $p_{l\text{-trip}}$ is the probability of the line tripping due to overload under a given power flow section; $f_l$ is the transmission power of the line under that power flow section; and $f_l^{c}$ and $f_l^{\max}$ are the rated transmission power and the upper limit of the transmission power of the line, respectively.
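The random tripping rule can be illustrated with a short sketch. It assumes that a line never trips below its rated power, always trips above its upper limit, and trips with a linearly increasing probability in between; this shape is an assumption consistent with the symbols defined above and not necessarily the exact formula (2). Function names are hypothetical.

```python
import random

def line_trip_probability(f_l, f_rated, f_max):
    """Overload trip probability for a line carrying f_l (same unit as limits)."""
    if f_l <= f_rated:
        return 0.0
    if f_l >= f_max:
        return 1.0
    # Assumed linear growth between rated power and the upper limit.
    return (f_l - f_rated) / (f_max - f_rated)

def sample_tripped_lines(flows, rated, limits, rng):
    """Return indices of lines that trip in this power flow section."""
    tripped = []
    for i, (f, c, m) in enumerate(zip(flows, rated, limits)):
        # Flow direction does not matter for overload, so use the magnitude.
        if rng.random() < line_trip_probability(abs(f), c, m):
            tripped.append(i)
    return tripped
```

Passing a seeded `random.Random` instance as `rng` keeps repeated simulations reproducible.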

Chain of fault graph
Power systems have a typical graph structure. 17 In this article, the concept of the FGC is introduced to replace the traditional Markov fault chain and to describe the continuously varying operating state of the power system during the propagation of cascading faults. During CF propagation, the system topology and electrical characteristics under a given power flow section can be described by the adjacency matrix $A$ and the feature matrix $H$, respectively, as shown in Equations (3) and (4).
$$A = \left[ A_{ij} \right]_{n \times n} \tag{3}$$
$$H = \begin{bmatrix} P_g^{(1)} & Q_g^{(1)} & P_l^{(1)} & Q_l^{(1)} & U^{(1)} & \theta^{(1)} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ P_g^{(n)} & Q_g^{(n)} & P_l^{(n)} & Q_l^{(n)} & U^{(n)} & \theta^{(n)} \end{bmatrix} \tag{4}$$
where $n$ is the number of nodes in the system; the dimensions of the adjacency matrix $A$ and the feature matrix $H$ are $n \times n$ and $n \times 6$, respectively; $A_{ij}$ represents the topological connection between node $i$ and node $j$, with 1 meaning that the two nodes are connected and 0 meaning that they are not; and $P_g^{(i)}$, $Q_g^{(i)}$, $P_l^{(i)}$, $Q_l^{(i)}$, $U^{(i)}$, $\theta^{(i)}$ represent the generator active output, generator reactive output, load active power, load reactive power, voltage amplitude, and voltage phase angle at node $i$, respectively.
The system state under a given power flow section is recorded as the fault graph $G = (A, H)$. The propagation process of a cascading fault in a power system can be recorded as the FGC in Figure 1.
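As an illustration of this data structure, the sketch below assembles a fault graph from the lines still in service and the per-node electrical quantities. The function name, the dictionary keys, and the use of nested lists are assumptions made for the example; a chain is simply stored as a list of successive fault graphs.

```python
FEATURES = ("p_gen", "q_gen", "p_load", "q_load", "v_mag", "v_ang")

def build_fault_graph(n, in_service_lines, node_state):
    """Return (A, H) for one power flow section.

    in_service_lines: iterable of (i, j) node index pairs still connected.
    node_state: list of n dicts holding the six features of each node.
    """
    A = [[0] * n for _ in range(n)]
    for i, j in in_service_lines:
        A[i][j] = A[j][i] = 1  # undirected connection
    H = [[float(node_state[i][k]) for k in FEATURES] for i in range(n)]
    return A, H

# A two-step chain on a toy 3-node system: line (1, 2) trips in step 2.
flat = {"p_gen": 0.0, "q_gen": 0.0, "p_load": 1.0,
        "q_load": 0.2, "v_mag": 1.0, "v_ang": 0.0}
states = [dict(flat) for _ in range(3)]
chain = [build_fault_graph(3, [(0, 1), (1, 2)], states),
         build_fault_graph(3, [(0, 1)], states)]
```

In a real simulation the node states would of course change from one section to the next as the power flow is recomputed.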
In Figure 1, the link marked in red is a bifurcation of the fault graph, that is, the current fault graph $G_{2,m_2}$ may evolve into fault graph $G_{3,1}$ or $G_{3,m_3}$ through the propagation of cascading faults; similarly, the link marked in blue is a confluence of fault graphs, that is, the current fault graph $G_{4,0}$ may be derived from the propagation and evolution of its predecessor fault graph $G_{3,0}$ or $G_{3,m_3}$.
It should be pointed out that the traditional Markov fault state can usually be regarded as a combination of binary discrete variables (outage or normal) of multiple components, whereas the fault graph proposed in this article describes a combination of multiple continuous variables on multiple nodes of the system.

Cascading failure simulation and risk index calculation
Inspired by the experience replay buffer used in reinforcement learning, this article adopts fault graph replay to calculate the VaR index of the fault graph of a power flow section. The CF simulation consists of two parts, namely, CF simulation under initial random failures and CF simulation under fault graph replay. The flow chart is shown in Figure 2. In stage 1, different initial faults are randomly set to obtain different FGCs, and the fault graphs under different power flow sections are stored in the experience replay buffer $D$. In stage 2, the fault graphs $G$ in the replay buffer $D$ are taken out in sequence, each fault graph $G$ is used as the initial state of a CF, and multiple experience replays are performed. The system outage scale $L_G^{(t)}$ corresponding to fault graph $G$ in the $t$th replay is calculated according to formula (5).
where $P_{load}$ is the initial total system load and $P_{load-loss}^{(t)}$ is the total system load loss in the $t$th experience replay. The VaR value of a CF represents the maximum loss of the system at a certain confidence level during a certain period of time. At a given confidence level, the corresponding VaR value can be solved by Equation (6) [10,27], where $\alpha$ is the confidence level, usually a given value; $\mathrm{VaR}_{\alpha}$ is the VaR value at confidence level $\alpha$; and $p(x)$ is the probability density function of the system loss of load. However, because the mechanism of CF propagation is complex, it is usually very difficult to obtain $p(x)$ analytically. Therefore, according to Equation (7), this article approximates the VaR value of a system fault graph in a data-driven manner by replaying $G$, where $T$ is the total number of replays of fault graph $G$, and $L_{G,T,\alpha}$ is the system outage scale at the rank determined by $\alpha$ when all replay results of fault graph $G$ are sorted by risk from severe to slight.
When $T$ is large enough, the value given by formula (7) satisfies the definition of VaR: 6,18 Prob(ΔV
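The replay-based estimate described in this section can be sketched as an empirical quantile. The sketch assumes that the outage scale is the ratio of lost load to initial total load and that the VaR at confidence level $\alpha$ is the smallest observed outage scale whose empirical cumulative probability reaches $\alpha$; `simulate_loss` is a hypothetical stand-in for one CF simulation started from the fault graph.

```python
import math
import random

def empirical_var(outage_scales, alpha):
    """Smallest observed value whose empirical CDF is at least alpha."""
    ranked = sorted(outage_scales)
    # round() guards against floating point noise such as 94.99999999999999.
    k = max(1, math.ceil(round(alpha * len(ranked), 9)))
    return ranked[k - 1]

def replay_var(fault_graph, simulate_loss, total_load, alpha, replays, rng):
    """Replay one fault graph `replays` times and estimate its VaR."""
    scales = [simulate_loss(fault_graph, rng) / total_load
              for _ in range(replays)]
    return empirical_var(scales, alpha)
```

Sorting ascending and taking rank $\lceil \alpha T \rceil$ is equivalent to ranking the replays from severe to slight and counting in from the severe end.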

DATA-DRIVEN REAL-TIME ASSESSMENT TECHNOLOGY OF CASCADING FAILURE RISK
CF risk assessment can be regarded as a regression problem, that is, obtaining the CF risk index from the given operating state of the system. The real-time assessment of CF risk proposed in this article is divided into two stages, namely, the offline training stage and the online evaluation stage. In the offline training stage, the neural network is trained to fit the nonlinear mapping between the fault graph $G$ obtained in Section 1.3 and its corresponding risk index VaR; the trained neural network is then applied in the online stage to evaluate the CF risk of the system in real time.

Design of cascaded graph neural network model
The power system network has an obvious topological structure and is typical graph-structured data. Because a graph neural network can capture the feature relationships between nodes and also reflect the topology of the system, it has advantages in processing graph-structured data. 17 As shown in Figure 3, this article designs a cascaded graph neural network that consists of cascaded graph convolutional network (GCN) layers followed by multi-layer perceptron (MLP) layers. As a feature extractor, the GCN layers capture the topological and electrical features of the fault graph $G$ and pass the extracted features to the MLP layers, which output the CF risk assessment result. In Figure 3, the GCN layer extracts the topological features of the system by aggregating information from each node and its neighboring nodes. The forward propagation of the GCN layer is given in Equations (9)-(11).
where the inputs of the $l$th GCN layer are the adjacency matrix $A$ and the feature matrix $H^{(l)}$, and the output is $H^{(l+1)}$; $D$ is the degree matrix of the system topology; $W$ is the weight matrix of the GCN layer, which is a trainable parameter; $I$ is the identity matrix; and $f(\cdot)$ is the activation function of this layer.
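For concreteness, a single GCN layer can be sketched in plain Python. The sketch assumes the widely used symmetric-normalization rule $H^{(l+1)} = f(\tilde{D}^{-1/2} (A + I) \tilde{D}^{-1/2} H^{(l)} W)$, where $\tilde{D}$ is the degree matrix of $A + I$; this matches the symbols defined above, but the paper's exact Equations (9)-(11) may differ. ReLU is an assumed activation, and all function names are hypothetical.

```python
import math

def matmul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def normalized_adjacency(A):
    """D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    n = len(A)
    A_tilde = [[A[i][j] + (1 if i == j else 0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in A_tilde]
    return [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def relu(X):
    return [[max(0.0, v) for v in row] for row in X]

def gcn_layer(A, H, W, activation=relu):
    """One forward pass: aggregate neighbor features, then transform."""
    return activation(matmul(matmul(normalized_adjacency(A), H), W))
```

Stacking two such calls, with the output of the first used as the feature matrix of the second, gives a two-layer GCN. A production model would use a tensor library rather than nested lists.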

FIGURE 3 The structure of the cascaded graph neural network.
This article uses a two-layer GCN whose forward pass is given in Equations (12) and (13).
where $Y_{GCN}$ is the output of the cascaded GCN layers, and $W_1$ and $W_2$ are the trainable parameter matrices of the first and second GCN layers, respectively. The output of the cascaded GCN is fed into the MLP layers to obtain the final CF risk assessment result for the fault graph $G$. The forward propagation of the MLP is given in formula (14).
where $X_{MLP}^{(l)}$ and $Y_{MLP}^{(l)}$ are the input and output of the $l$th MLP layer, respectively, and $W_{MLP}^{(l)}$ and $b^{(l)}$ are the weight matrix and bias of the $l$th MLP layer, respectively, which are trainable parameters. In addition, to reduce the risk of overfitting, a dropout layer is added to the second MLP layer.
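The MLP stage can be sketched as a stack of dense layers of the form $Y = f(WX + b)$. Several details below are assumptions rather than the paper's design: the GCN output is flattened row by row into one vector, hidden layers use ReLU, the last layer is linear and returns a scalar risk value, and dropout is applied to the output of the second layer during training only.

```python
import random

def relu(v):
    return v if v > 0.0 else 0.0

def dense(x, W, b, activation=None):
    """y = f(W x + b); each row of W holds the weights of one output unit."""
    y = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return [activation(v) for v in y] if activation else y

def dropout(x, rate, rng, training):
    """Inverted dropout; acts as the identity at inference time."""
    if not training or rate <= 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]

def mlp_head(y_gcn, layers, rate=0.5, rng=None, training=False):
    """Map the GCN output matrix to a scalar risk estimate."""
    x = [v for row in y_gcn for v in row]  # flatten (assumption)
    last = len(layers) - 1
    for idx, (W, b) in enumerate(layers):
        x = dense(x, W, b, None if idx == last else relu)
        if idx == 1 and idx != last:  # dropout after the second layer
            x = dropout(x, rate, rng or random.Random(0), training)
    return x[0]
```

Calling `mlp_head` with `training=False` corresponds to the online evaluation stage, where dropout is disabled and the forward pass is deterministic.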

Training and evaluation metrics for neural network models
In terms of model accuracy, the mean square error (MSE) between the VaR index computed by the neural network for the fault graph $G$ under a power flow section and the simulated value is used as the loss function, as shown in Equation (15).
where $M$ is the sample size, and $\widehat{\mathrm{VaR}}_m$ and $\mathrm{VaR}_m$ are the neural network output and the simulated value of the $m$th risk index, respectively. The trainable parameters of the proposed cascaded neural network are adjusted with the Adam optimizer to minimize the loss function.
In terms of computation time, training an accurate neural network model requires a large amount of CF simulation data, but both CF simulation and neural network training are performed offline. Therefore, only the average forward computation time of the neural network model (average calculation time, ACT) is used to evaluate the computational complexity of the model, as shown in Equation (16).
where $T_m$ is the time taken by the $m$th forward calculation of the neural network.
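Both metrics are simple averages over the $M$ samples and can be sketched as follows, assuming the usual definitions $\mathrm{MSE} = \frac{1}{M}\sum_m (\widehat{\mathrm{VaR}}_m - \mathrm{VaR}_m)^2$ and $\mathrm{ACT} = \frac{1}{M}\sum_m T_m$. The `model` argument is a hypothetical callable that performs one forward pass.

```python
import time

def mse(predicted, simulated):
    """Mean square error between network outputs and simulated VaR values."""
    return sum((p - s) ** 2 for p, s in zip(predicted, simulated)) / len(simulated)

def average_calc_time(model, samples):
    """Average wall-clock time (seconds) of one forward pass per sample."""
    times = []
    for sample in samples:
        start = time.perf_counter()
        model(sample)
        times.append(time.perf_counter() - start)
    return sum(times) / len(times)
```

Timing each forward pass separately, rather than timing one batch, mirrors the online setting in which fault graphs arrive one at a time.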

CASE STUDIES
IEEE 39-bus system

Case system description
In this article, a modified IEEE 39-node system is used as a simulation example to verify the effectiveness of the proposed method. The total load of the standard IEEE 39-node system is 6254 MW. To analyze CFs in a power system with a high proportion of new energy, 175 wind turbine units with an installed capacity of 1.5 MW are installed at nodes 10-16, 19-20, and 22-24 of the 39-node system. The relationship between the actual wind power output and the wind speed follows the typical formula, and the wind speed is assumed to follow the Weibull distribution. 19 The shape parameter of the Weibull distribution is set to 3 and the scale parameter is set to 5. At the same time, the generator output at the original nodes 31-39 is reduced to 50%, so that the proportion of new energy is about 50%. Following the existing literature and related standards, $U_l^{t}$ and $U_h^{t}$ of the wind turbines are set to 0.9 and 1.1 pu, respectively, and $U_l^{\min}$ and $U_h^{\max}$ are set to 0.3 and 1.2 pu, respectively. The system passes the N-1 check. CFs are simulated according to the model described in Sections 1.1-1.3. With random N-2 and N-3 three-phase short-circuit faults as the initial faults, 10,000 groups of initial faults are randomly selected in the first simulation stage in Figure 2, yielding 10,000 FGCs that contain 76,032 fault graphs. For each fault graph $G$, 1000 experience replays are performed in the second simulation stage of Figure 2 to calculate the VaR index under different confidence levels, and the VaR index is updated every ten replays. For all 76,032 fault graphs, the VaR index has converged after 1000 experience replays.
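The wind model in this setup can be illustrated with a short sketch that samples wind speed from a Weibull distribution with shape 3 and scale 5 and converts it to turbine output. The piecewise cubic power curve and its cut-in, rated, and cut-out speeds (3, 12, and 25 m/s) are illustrative assumptions, since the text only refers to a typical formula.

```python
import random

def sample_wind_speed(rng, shape=3.0, scale=5.0):
    """Draw one wind speed (m/s); weibullvariate takes scale first, then shape."""
    return rng.weibullvariate(scale, shape)

def wind_power_mw(v, rated_mw=1.5, v_in=3.0, v_rated=12.0, v_out=25.0):
    """Assumed piecewise power curve of a single 1.5 MW turbine."""
    if v < v_in or v >= v_out:
        return 0.0  # below cut-in or above cut-out: no output
    if v >= v_rated:
        return rated_mw
    # Cubic growth between cut-in and rated speed.
    return rated_mw * (v ** 3 - v_in ** 3) / (v_rated ** 3 - v_in ** 3)

rng = random.Random(1)
outputs = [wind_power_mw(sample_wind_speed(rng)) for _ in range(1000)]
```

Each sampled output would then set the active power of a wind unit before the initial fault is applied.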
For power systems with a high proportion of renewable energy, the propagation of CFs has a certain randomness. To balance the accuracy and speed of the simulation, an appropriate number of simulation repetitions must be set. Taking the N-2 fault of line 23-24 and line 26-27 as a typical example, the variation of the VaR index with the number of experience replays under different confidence levels is shown in Figure 4A-D. As the number of repeated simulations increases, the calculated failure risk gradually stabilizes, but the time consumption also increases. As shown in the figure, with 1000 experience replays the VaR indices under different confidence levels have clearly converged, that is, the VaR fluctuation is less than 0.1%. For comparison, the VaR index under different confidence levels is also calculated without considering the off-grid protection action of new energy units.
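A convergence check of this kind can be sketched as follows. The VaR estimate is recomputed every ten replays, and convergence is declared when the last few estimates all stay within 0.1% of the latest one; how the fluctuation is measured is not specified in the text, so the relative-change criterion and the window length are assumptions.

```python
import math

def var_trace(outage_scales, alpha, step=10):
    """VaR estimate after every `step` replays (empirical alpha-quantile)."""
    trace = []
    for t in range(step, len(outage_scales) + 1, step):
        ranked = sorted(outage_scales[:t])
        k = max(1, math.ceil(round(alpha * t, 9)))
        trace.append(ranked[k - 1])
    return trace

def has_converged(trace, tol=1e-3, window=5):
    """True if the last `window` updates stay within tol of the final value."""
    if len(trace) <= window:
        return False
    recent = trace[-(window + 1):]
    ref = recent[-1]
    if ref == 0:
        return all(v == 0 for v in recent)
    return all(abs(v - ref) / abs(ref) < tol for v in recent)
```

With a criterion like this, the number of replays per fault graph could be chosen adaptively instead of being fixed in advance.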

Cascading failure risk assessment based on graph neural network
The 76,032 fault graphs obtained in Section 3.1, together with their VaR indices at the corresponding confidence levels, are randomly divided into a training set, a test set, and a validation set in proportions of 70%, 20%, and 10%. The hyperparameters of the proposed cascaded GCN are shown in Table 1. For comparison, the hyperparameters of a cascaded convolutional neural network (CNN) and a multilayer perceptron (MLP) are also set as shown in Table 1. Among the remaining hyperparameters, the learning rate of the Adam optimizer is set to 0.01, the batch size is set to 128, and the number of epochs is set to 1000. At a confidence level of 99%, the change of MSE on the test set with epoch during the training of the different neural networks is shown in Figure 5. Overall, as the number of training epochs increases, the performance of all three neural networks on the test set improves. More specifically, as shown in the figure, by epoch = 200 the MSE values of the three neural network models have converged. At convergence, the MSE of the proposed cascaded GCN is about 0.006, while those of the two baseline networks are about 0.009 and 0.011. Compared with the CNN and the MLP, the graph neural network has an advantage in model accuracy because it can reflect the topological characteristics of the power system. The three neural network models are also compared with algorithms such as support vector machine (SVM) and polynomial regression (PR) in fitting the nonlinear mapping between fault graph $G$ and VaR at different confidence levels; the resulting MSE and ACT values are shown in Table 1.
As shown in Table 1, as the confidence level increases, the accuracy of the models decreases slightly, because the machine learning algorithm needs to learn more extreme outage cases from the fault graph $G$. The proposed cascaded GCN model still maintains high accuracy at a confidence level of 99% (MSE = 0.006 after training convergence), while the corresponding MSEs of the CNN model and the MLP model are 0.011 and 0.009, respectively. The proposed model therefore outperforms the other machine learning algorithms in the table. While maintaining high accuracy, the proposed algorithm is also fast: a single feedforward calculation takes only 3.7 ms on average. Therefore, the proposed method can accurately and quickly map the fault graph $G$ to the risk index VaR, so it can be applied to online real-time assessment of CF risk.

Case system description
A larger and more complex power system, the 2868-bus system, is used to validate the scalability of the proposed risk assessment method. The system contains 2868 buses, 599 generators, and 4414 transmission lines, and represents the size and complexity of the French power system. Further information can be found in Reference 20. To bring the system closer to the critical steady state and obtain more CF scenarios, the line lengths of all branches are extended to 1.1 times their original values. At the same time, 50% of the original generators in the system are replaced with renewable energy generators to meet the condition of a high proportion of renewable energy. As in the previous case, 50,000 N-2 three-phase-to-ground short-circuit faults are used as initial faults, and CF simulations are used to generate the corresponding sample sets. The hyperparameters of the neural networks are the same as in the previous case and are not repeated here.

Test results
Similarly, the three neural network models, SVM, and PR are compared in fitting the nonlinear mapping between fault graph $G$ and VaR at different confidence levels. MSE and ACT are again used as the indicators. The calculation results are shown in Table 2. Compared with the 39-node system, the other methods show varying degrees of performance degradation, whereas the proposed method still performs well.

CONCLUSIONS
This article proposes a CF analysis model based on a cascaded graph neural network, and the following conclusions are drawn: (1) The off-grid protection action of new energy units caused by insufficient tolerance of power electronic equipment is an important cause of CF propagation in power systems with a high proportion of new energy. If it is not considered, the blackout risk of CFs may be underestimated; (2) Compared with the Markov fault chain in the traditional discrete state space, the proposed FGC can accurately describe the continuously changing electrical and topological characteristics of the power system during the propagation of cascading faults. It has advantages in representing off-grid protection actions of new energy units, which are driven by continuously changing physical quantities; (3) Compared with other machine learning algorithms, the proposed cascaded graph neural network can capture the topological and electrical characteristics of the power system at the same time, so it has advantages in fitting the mapping between the fault graph and the CF risk index.
It should be pointed out that the CF process has the Markov property, that is, the occurrence of a failure depends only on the current system state and not on previous states. The risk assessment method in this article is based on this property and treats the CF risk as depending only on the current state. However, the fault chain in the CF process records rich temporal and spatial information about the faults, so constructing a suitable data-driven model that extracts the spatiotemporal correlation in the fault chain to provide faster and more accurate risk assessment is a direction for further work.

FIGURE 5 Cumulative distribution function of VaR under different confidence levels.
TABLE 1 Comparison results on the IEEE 39-bus system.
TABLE 2 Comparison results on the polished French 2868-bus system.