Surrogate-assisted optimal re-dispatch control for risk-aware regulation of dynamic total transfer capability

To enable reliable power delivery through transmission tie-lines, total transfer capability (TTC) must be calculated and regulated to accommodate the transferred amount. However, the traditional optimal power flow (OPF)-based TTC calculation is computationally expensive for efficient TTC control due to the inclusion of a large set of differential-algebraic equations (DAEs) to verify transient stability constraints. To enable practicable TTC regulation, a novel risk-aware deep learning-assisted paradigm is proposed here. First, a deep belief network (DBN) is employed to establish the TTC predictor and surrogate the computation-intensive DAEs in the original OPF formulation, simplifying the high-dimensional and intractable constraints without loss of nonlinearity. In particular, to account for the control risk arising from the predictive error of the DBNs, ensemble-improved prediction intervals (PIs) are produced and used to disclose the probability of insufficient actions, further guaranteeing sufficient and cost-effective control by balancing the tradeoff between cost and risk. Symbiotic organisms search (SOS) is then applied to solve the proposed risk-aware DBN-assisted TTC control problem globally. Numerical studies testify that the proposed method enables economical, reliable, and fully nonlinearity-retaining dynamic TTC regulation control within a risk-free surrogate-assisted and tractable physical model-driven hybrid framework.


INTRODUCTION
As the scale of interconnected power grids keeps growing, it is critically important for system operators to be aware of transmission security and promptly determine preventive strategies that enable optimal inter-area power transfers. As a metric quantifying a transfer path's margin, total transfer capability (TTC) is becoming increasingly crucial for operation scheduling and plays a critical role in uncovering the security boundary from the tie-line transmission aspect.
According to previous studies, several TTC calculation models have been presented; among them, the trajectory sensitivity method has been widely adopted, including the direct sensitivity method [6] and the numerical sensitivity method [7,8]. Although it consumes less time to optimize the transient stability-constrained OPF (TSC-OPF) problem than other methods, this method is likely to sacrifice accuracy due to local linearization [9]. Towards an online and accurate TSC-OPF, data-driven techniques, such as nonparametric estimation [10] and machine learning [11], have demonstrated their feasibility. Moreover, based on data-driven methods, online proactive protection schemes for power system security can be exploited without sacrificing solution accuracy or computing efficiency. For the first time, this paper describes the development of a data-driven strategy for regulating TTC.
In recent years, surrogate-assisted algorithms (SAAs) have emerged as promising techniques for efficiently dealing with nonlinear, non-convex, and computationally intractable problems [12][13][14][15][16][17][18][19][20]. SAAs deploy learning techniques to replace costly or time-consuming constraints in decision-making problems, thereby reducing the complexity of these constraints. Ref. [15] proposed generic SAA evolutionary optimization frameworks, creating synergies between machine learning, optimization, and data science. Different categories of real-world applications have demonstrated the applicability of such optimization techniques. In power systems, SAA methods have been introduced to resolve security-oriented problems [13,14,20,21]. Specifically, a support vector machine was deployed to infer transient stability in [13], while a similar framework was adopted in [14] based on XGBoost. Ref. [20] proposed a double decision tree (DT)-based heuristic scheme to enable reliable preventive control. A commercial nonlinear solver was successfully applied to handle shallow neural network-inferred security-constrained OPF in [21]. Generally speaking, the applicability of shallow learning in power system security control has been proved. However, many investigations also indicate that the robustness, accuracy, and generalizability of shallow learning are limited in practical tasks. To determine more reliable control actions, the error of learning-based security inference must be carefully studied, and better learning methods (e.g. deep learning) should be introduced.
In detail, security control usually aims at cost minimization, which drives the operating region (as predicted by surrogate learners) towards the security boundary. In this situation, even a tiny error of learning-based security estimation might incur an incident, where an actually insecure operating condition is inferred as secure. In other words, the imperfect nature of surrogate learners may mislead the preventive actions. The uncertainties inside learners must be investigated and adequately quantified so that a scheme to de-risk the erroneous actions induced by surrogates can be designed. However, only a limited amount of research has been carried out to circumvent such situations. Ref. [12] proposed a novel ensemble-based approach to assess the control risk probabilistically, indicating whether the SAA may mislead the preventive actions. Nevertheless, this approach is oriented to classifiers, and the risk assessment of regressors is yet to be performed. Ref. [20] proposed an asymmetric weighting approach to approximate the security boundary more conservatively; however, this method is coarse and over-conservative. In [22], the confidence interval was estimated using an extra learner that predicts the primary learner's error in the prediction task. Similarly, a critical decision boundary for classification was studied in [23] to show how learners make classification errors, and a designated scheme was then deployed to lessen misclassifications. However, these ideas may be hard to integrate into control tasks due to their intricate structure.
To address these issues, this paper proposes a novel deep learning-assisted method to enable the risk-aware TTC regulation considering the compromise between insufficient control risk and the control cost. The significant contributions of this paper are summarized as follows.
(i) For the first time, nonlinear relations of TTC against critical operational variables are learned based on DBNs. The relations are demonstrated to estimate TTC accurately and efficiently and to provide straightforward rules for regulating TTC by tuning operational variables. (ii) Benefiting from the DBN-based model-free practice, TTC security inference becomes computationally tractable and can be embedded into security control problems in a surrogate-assisted way. (iii) To prevent imperfect inference by the DBN surrogate from misleading the control towards insecure actions, the risk of an erroneous surrogate-assisted premise is studied. The risk is successfully parameterized through the prediction interval technique and modelled to be controllable by operational variables. This allows the imperfect nature of surrogates to be considered in the control model, further enabling a tradeoff between cost and risk for data-driven control.

Problem description
To secure the operating margin, preventive actions should be enabled to ensure that the power flow transferred through tie-lines is smaller than the value of TTC. TTC changes with the variation of operating conditions, especially generator outputs. Considering the operation uncertainty, the time-varying TTC may decrease below the power flow, causing an insufficient margin of secure power transfer. In that case, dispatchers need to regulate the transmission limit. Mathematically, the process can be built as a bi-level model: the upper level is a cost-oriented generation re-dispatch problem, and the lower level is the TTC calculation model. The overall structure of the TTC regulation model is visualized in Figure 1.

Mathematical formulation
Following the aforementioned description and the structure shown in Figure 1 (the overall structure of the TTC regulation; DSCOPF: dynamic security-constrained OPF), the general formulation of the TTC regulation model can be represented as Equation (1), where Equations (1a)-(1d) formulate the upper-level model, while Equation (1e) is the lower-level model and denotes the TTC calculation model; x and y are the vectors of control variables (i.e. power generation) and state variables (e.g. voltage, phase angle), respectively; F(x) is the action cost of re-dispatching generator outputs; Z(x, y) and G(x, y) are the equality constraints (i.e. power flow equations) and inequality constraints (i.e. steady-state constraints), respectively; PF_l(x, y) is the function for computing the power flow through line l; and Equation (1d) defines the TTC security margin, in which the optimal λ* is decided by solving the lower-level model Equation (1e). Equation (1d) implies that the power transfers should be operated without exceeding the TTC. In the implementation, Equations (1d) and (1e) are monitored online. When a violation of Equation (1d) is identified, model (1) must be carried out to control TTC towards an objective margin.
Regarding the formulation of the lower-level TTC calculation model, two general categories of TTC models have been proposed in the existing literature. One considers the worst-case OC scenario during TTC calculation, so that an extremely conservative TTC is yielded [24]. This model results in a minimal and fixed TTC value, which is too conservative under the cost-minimization goal and economically inappropriate for TTC regulation [25]. To avoid such conservative scheduling, the TTC model proposed in [10] and [26] is employed in this study, as shown in Equation (2), where λ is a scalar for increasing generation in the power-sending area or load in the power-receiving area, and the increase direction is specified by a user or calculated to deteriorate security [10]; x(t) and y(t) are the algebraic and state variables during the transient period (t_0, t_end]; in particular, [x(t_0), y(t_0)] represents the initial condition for TTC calculation, which is equal to the operating state in Equation (1); C denotes the set of contingencies; D_c denotes the set of differential-algebraic equations (DAEs) expanded in the transient time domain, and S_c(⋅) denotes the security constraints induced under contingency c. In this paper, we mainly concern transient stability, whose criterion is normally adopted as Equation (3): the rotor-angle deviation of each generator in the generator set must remain within a preset threshold throughout the transient period. The above model can be addressed by Algorithm 1 proposed in [10]: λ* = arg max_λ [Bisection]. λ* implies the maximum inter-area power transfer that the system can withstand without violating any security verification. Note that the major terms determining a TTC are the initial operating condition and λ*; thereby, TTC calculation can be modelled by Equation (4), where L denotes the set of tie-lines, and T_l([x(t_0), y(t_0)], λ*) is the function for calculating the transfer capability of line l, that is, the power transfer of line l when the generation and load increase by λ* upon the initial operating state [x(t_0), y(t_0)].
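The bisection search for the critical scaling factor λ* can be sketched as follows. This is a toy illustration of Algorithm 1 referenced above; the hypothetical `is_secure` callback stands in for the full DAE-based transient-stability verification, which in practice is the expensive step.

```python
def bisect_lambda(is_secure, lam_lo=0.0, lam_hi=10.0, tol=1e-4):
    """Return the largest loading scalar in [lam_lo, lam_hi] that still
    passes the (surrogate of the) security verification."""
    if not is_secure(lam_lo):
        raise ValueError("initial operating condition is already insecure")
    while lam_hi - lam_lo > tol:
        mid = 0.5 * (lam_lo + lam_hi)
        if is_secure(mid):
            lam_lo = mid      # secure: the critical loading lies above mid
        else:
            lam_hi = mid      # insecure: the critical loading lies below mid
    return lam_lo

# toy security rule: the system withstands loading up to lambda = 2.5
lam_star = bisect_lambda(lambda lam: lam <= 2.5)
```

In the actual model, each `is_secure` evaluation requires time-domain simulation of the DAE set under every contingency, which is what motivates the surrogate approach in the next section.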
The introduction of multiple sets of DAEs and stability constraints raises the dimension, complexity, and analytical difficulty of the TTC regulation problem. Especially when high-order generator or load dynamics are taken into consideration, TTC regulation becomes unable to meet the efficiency demand of preventive control.

DEEP LEARNING-BASED TTC PREDICTION
As noted above, the TTC-related problem is a computationally expensive optimization problem that might not be applicable for online use. This issue will be mitigated by using deep learning to capture TTC-oriented operational rules. In doing so, the computation efficiency of both calculation and regulation of TTC would be substantially raised.

Algorithm (ensemble-enhanced PIs learning)
Output: the optimal ensemble weights ω*; the trained upper-bound DBNs and lower-bound DBNs, namely U-DBNs and L-DBNs, respectively.
Start
1. Generate a small perturbation on the target vectors of Ψ^t(χ^t). For the y^t fed into the U-DBNs, the perturbation is positive. Generate the initial velocity v_1 of each particle under a normal distribution, where v ∈ ℝ^{N×2M}.
5. Ensemble-enhanced PIs procedure: for g = 1:G do
5.1 Calculate the upper bound and the lower bound corresponding to each particle, where the j-th element of the i-th particle at the g-th generation weights Φ^{DBN,L}_j and Φ^{DBN,U}_j, the j-th L-DBN and the j-th U-DBN, respectively.
5.2 Combine Equations (10)-(14) to compute the metrics of the PIs.
5.3 Use Equation (16) to calculate the fitness of each particle.

Restricted Boltzmann machine and deep belief network
Among the many deep learning models, the deep belief network (DBN) has demonstrated several salient merits, for example automatic feature selection, raw-data filtering, strong nonlinear representation, high accuracy, and robustness, in non-sequential, non-imaging regression tasks whose data structure is identical to that of the concerned TTC problem. Therefore, DBN is highly suitable for learning the numerical correlations between the featured variables and TTC.
A DBN is constructed from several stacked restricted Boltzmann machines (RBMs) [27]. Each RBM is pre-trained by applying contrastive divergence and greedy unsupervised learning. All the well-trained RBMs are then unfolded and sequentially connected to build a well-initialized parametric DBN. Afterwards, a supervised fine-tuning process (i.e. backpropagation, BP) is activated to train the DBN.
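The greedy layer-wise pre-training described above can be sketched in a few dozen lines. This is a minimal illustration, not the paper's implementation: a Bernoulli RBM trained with one-step contrastive divergence (CD-1), stacked into a miniature two-layer DBN; the layer sizes and learning rate are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli RBM pre-trained with one-step contrastive divergence."""
    def __init__(self, n_vis, n_hid, lr=0.05):
        self.W = 0.01 * rng.standard_normal((n_vis, n_hid))
        self.b_vis = np.zeros(n_vis)
        self.b_hid = np.zeros(n_hid)
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_hid)

    def cd1_update(self, v0):
        h0 = self.hidden_probs(v0)
        h_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h_sample @ self.W.T + self.b_vis)   # reconstruction
        h1 = self.hidden_probs(v1)
        # positive phase minus negative phase gradients
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
        self.b_vis += self.lr * (v0 - v1).mean(axis=0)
        self.b_hid += self.lr * (h0 - h1).mean(axis=0)

# greedy layer-wise pre-training of a 2-RBM stack (a miniature DBN)
X = rng.random((64, 8))
layers = [RBM(8, 6), RBM(6, 4)]
for epoch in range(20):
    v = X
    for rbm in layers:
        rbm.cd1_update(v)
        v = rbm.hidden_probs(v)
```

After such pre-training, the stack would be unfolded into a feed-forward network with a regression output layer and fine-tuned with backpropagation, as the text describes.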

Sample production based on Good Point Set theory
To begin the construction of the training dataset for the DBN, the input feature space is defined as χ = [P_Gen, Q_Gen, U_Gen, P_Load, Q_Load, V_Bus], which enables a full representation of an operating condition. As the critical index, TTC is naturally selected as the output feature y.
The rescheduling region Θ(χ) of a power system is usually restricted by the safety limits of electrical facilities, for example the power generation limits P_Gen ∈ [P_Gen^min, P_Gen^max]. The principle of producing data samples is to generate a complete sampling space Ψ^s(χ^s) covering Θ(χ) as much as possible, where the superscript s denotes a sampled vector from the corresponding feature space. To this end, a sampling method that enables fully expanded coverage is necessary for generating the needed samples.
Compared with conventional sampling methods, a good point set (GPS) features some superiorities, such as stabilized sampling quality and fewer samples for a well-performing predictor [22]. Therefore, GPS is applied to generate samples by Equation (5):

χ^s_{i,j} = χ_min,j + {i · r_j} (χ_max,j − χ_min,j), i = 1, …, m, j = 1, …, n (5)

where r_j is the "good point" value of GPS; m is the number of good points, that is, the sample size; {⋅} is an operator that extracts the decimal fraction of its contained component; and χ_max and χ_min denote the operational upper and lower bounds of the variables from χ. Among these variables, the bounds of load can be determined based on historical statistics, and n represents the dimension of the sampled points. Additionally, SMOTE [28] is used to ensure the distribution uniformity of the large set of samples. After the input is sampled, the target y^s (i.e. TTC) of every sampled point can be calculated by Equations (2)-(4).
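The GPS mapping of Equation (5) can be sketched as below. Note that the construction of the good-point values r_j is an assumption here: fractional parts of square roots of distinct primes are one standard choice, but the paper does not specify which construction it uses.

```python
import numpy as np

def gps_samples(m, lo, hi):
    """Map m good points into the box [lo, hi]^n, as in Equation (5).

    The good-point values r_j are taken as fractional parts of square
    roots of distinct primes (one standard construction; illustrative).
    """
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    n = len(lo)
    r = np.sqrt(primes[:n]) % 1.0          # good-point values in (0, 1)
    i = np.arange(1, m + 1)[:, None]
    frac = (i * r) % 1.0                   # {i * r_j}: decimal fraction
    return lo + frac * (hi - lo)

# e.g. 100 operating points over three generator limits [0.5, 1.5] p.u.
pts = gps_samples(100, lo=[0.5, 0.5, 0.5], hi=[1.5, 1.5, 1.5])
```

Each row of `pts` is one candidate operating condition; the corresponding TTC target would then be computed by the physical model of Equations (2)-(4).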
In the end, collecting χ^s and y^s, we have a sample set Ψ^s(χ^s) = {χ^s → y^s}, where the symbol → denotes the exact mapping from χ^s to y^s.

DBN-based TTC predictor
After collecting the sampled data, several steps are needed to construct the TTC predictor. First, the sample set Ψ^s(χ^s) is split into training samples by 5-fold cross-validation; in this way, the overfitting problem of the DBN can be effectively prevented [29]. Second, the collected variables χ^t of Ψ^t(χ^t) are fed to the stacked RBMs for pre-training; automatic feature selection and raw-data filtering are executed in this step, and a deep feature representation is obtained. Third, the training target y^t is fed to the DBN to fine-tune the weights. As a consequence, a TTC predictor Φ is well trained. Before the predictor is implemented online, different test datasets are used to justify its generalization ability. Denote the test dataset as (χ^e, y^e); then use widely adopted metrics (e.g. mean square error, MSE, or squared correlation coefficient, SCC) to measure the error between y^e and the prediction ŷ^e = Φ(χ^e). Once the performance is acceptable (for example, prediction MSE < 20 MW), the predictor can be forwarded to the online phase; otherwise, it is retrained. In the online stage, we collect the measurement of χ, then estimate TTC by ŷ = Φ(χ).
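The validation-and-gate workflow above can be sketched as follows. To keep the example self-contained, a ridge-regularized linear model stands in for the DBN (a stated simplification); the structure of the 5-fold loop and the MSE acceptance gate is what the sketch illustrates.

```python
import numpy as np

rng = np.random.default_rng(1)

def five_fold_mse(X, y, fit, predict):
    """5-fold cross-validation loop used to vet the TTC predictor."""
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, 5)
    errs = []
    for k in range(5):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        model = fit(X[train], y[train])
        errs.append(np.mean((predict(model, X[test]) - y[test]) ** 2))
    return float(np.mean(errs))

# stand-in predictor: ridge-regularized least squares in place of the DBN
fit = lambda X, y: np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]), X.T @ y)
predict = lambda w, X: X @ w

X = rng.random((200, 6))                          # toy operating features
y = X @ rng.random(6) + 0.01 * rng.standard_normal(200)   # toy TTC targets
mse = five_fold_mse(X, y, fit, predict)
deploy = mse < 20.0   # acceptance gate before the online phase
```

Only when `deploy` holds would the predictor be forwarded to online estimation; otherwise retraining is triggered, exactly mirroring the offline/online split in the text.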

Surrogate-assisted TTC regulation model
The properly built predictor Φ provides a fast way to calculate TTC and a straightforward rule linking TTC to the controllable variables. Therefore, an efficient TTC regulation model can be constructed in a surrogate-assisted way that embeds Φ into Equation (1) as an instant inference of TTC security. Notably, in Equation (1), the lower-level Equation (1e) is computationally intractable; thus, following the above idea, Equation (1e) is surrogated by the DBN-based TTC predictor Φ. Then the DBN surrogate-assisted bi-level model (DBN-SABM) can be built as Equation (6), where η̂_l denotes the security margin calculated based on surrogates, Φ_l([x, y]) predicts the TTC of tie-line l, and [x, y] implies that the input for Φ is collected from the controllable variables and state variables of the optimization model.
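The way a trained surrogate enters the control problem can be sketched as a penalized fitness function, suitable for the population-based solver introduced next. Everything here is illustrative: `ttc_surrogate` is a hypothetical stand-in for the trained DBN, and the quadratic cost and penalty weight are toy choices.

```python
import numpy as np

# hypothetical surrogate standing in for the trained DBN TTC predictor;
# x collects the controllable generator set-points
def ttc_surrogate(x):
    return 80.0 + 10.0 * np.tanh(x).sum()      # MW, toy response

def sabm_fitness(x, power_flow=95.0, penalty=1e3):
    """Penalized objective in the spirit of the DBN-SABM: re-dispatch
    cost plus a penalty whenever the surrogate margin
    eta_hat = PF - TTC_hat is positive (transfer exceeds predicted TTC)."""
    cost = np.sum((x - 1.0) ** 2)              # toy re-dispatch cost
    eta_hat = power_flow - ttc_surrogate(x)
    return cost + penalty * max(eta_hat, 0.0)
```

Because the surrogate is just a cheap function call, evaluating this fitness thousands of times inside a heuristic search is tractable, whereas each evaluation of the original lower-level DAE model is not.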

Solver for DBN-SABM
Notably, a DBN is a composite nonlinear system with "black-box" features, whose convexity is hard to determine. Thus, gradient-based solvers struggle to solve DBN-SABM globally.
Metaheuristic research has proposed optimization methods that can outperform gradient-based approaches on challenging engineering problems [30]. In this paper, a recent heuristic algorithm, symbiotic organisms search (SOS), is employed to solve the DBN-SABM of TTC regulation. To apply SOS to the proposed model, several modifications are made to improve computing stability and robustness, detailed as follows. (i) Improved initialization: to avoid slow convergence caused by a randomly generated initial ecosystem, GPS-based initialization of the ecosystem is applied based on Equation (5), where m is the number of organisms. (ii) Ecosystem parallelization: a classical parallel strategy is deployed to improve SOS-based optimization performance and maintain the diversity of the ecosystem [31]. (iii) Modified evolution with self-adaptive mutualism: at the early stage of evolution, the fitness of organisms is generally poor; thus, a global search should be carried out to efficiently explore the near-optimal solution space. As the evolution proceeds, SOS needs to search for the optimal solution by enhancing its local search ability. To this end, the mechanism of Equation (7) is deployed in this phase, where κ ∈ [0, 1] is an adaptive scaling factor, o is the mutual-vector equivalent, and b_1 and b_2 are benefit factors randomly set to 1 or 2.

Random differential perturbation terms introduced commensalism
Adopting self-adaptive mutualism increases the convergence rate; however, it decreases the global search ability of SOS. To balance convergence rate and global search capability, random differential perturbation terms are introduced into the commensalism phase, as Equation (8), where γ is a random number within the interval (−1, 1) and X_k is a randomly selected organism that serves as a perturbation to enlarge the diversity of the ecosystem.

Parasitism
In the original SOS algorithm, mutualism and commensalism cannot prevent the search from falling into local optima.

FIGURE 2 Illustration of insufficient control risk derived from imperfect predictors: the regulation control from margin 3 to margin 2 is the so-called "insufficient control"

Thus, parasitism is introduced. First, a "Parasite Vector" (PV) is created by revising randomly selected dimensions of an organism X_i in the initialized ecosystem. The PV is then compared with X_i. If the fitness of the PV is better than that of X_i, the former replaces X_i in the ecosystem; otherwise, the PV is abandoned and X_i keeps its current position.
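The three phases described above can be assembled into a compact minimization loop. This sketch uses simplified, illustrative operator forms (the mutual vector, the self-adaptive scaling schedule, and the parasitism mutation rate are assumptions, and the parallel sub-ecosystems and ring migration are omitted), so it shows the flavour of the modified SOS rather than the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)

def sos_minimize(f, lo, hi, n_org=20, n_gen=60):
    """Simplified sketch of the modified symbiotic organisms search."""
    dim = len(lo)
    X = lo + rng.random((n_org, dim)) * (hi - lo)
    fit = np.array([f(x) for x in X])
    for g in range(n_gen):
        best = X[fit.argmin()]
        for i in range(n_org):
            j = rng.choice([k for k in range(n_org) if k != i])
            # mutualism: i and j move toward the best via a mutual vector,
            # with a scaling factor decaying over generations (self-adaptive)
            mv = 0.5 * (X[i] + X[j])
            b1, b2 = rng.integers(1, 3, size=2)       # benefit factors 1 or 2
            scale = 1.0 - g / n_gen
            cand_i = X[i] + scale * rng.random(dim) * (best - b1 * mv)
            cand_j = X[j] + scale * rng.random(dim) * (best - b2 * mv)
            # commensalism with a random differential perturbation term
            k = rng.integers(n_org)
            gamma = rng.uniform(-1, 1)
            cand_c = X[i] + gamma * (best - X[j]) + gamma * (X[k] - X[i])
            # parasitism: mutate randomly selected dimensions of organism i
            pv = X[i].copy()
            mask = rng.random(dim) < 0.5
            pv[mask] = lo[mask] + rng.random(mask.sum()) * (hi - lo)[mask]
            for cand, idx in ((cand_i, i), (cand_j, j), (cand_c, i), (pv, i)):
                cand = np.clip(cand, lo, hi)
                fc = f(cand)
                if fc < fit[idx]:                     # greedy replacement
                    X[idx], fit[idx] = cand, fc
    return X[fit.argmin()], float(fit.min())

x_best, f_best = sos_minimize(lambda x: np.sum((x - 0.3) ** 2),
                              lo=np.zeros(4), hi=np.ones(4))
```

Swapping the toy sphere objective for the penalized surrogate fitness of the DBN-SABM yields the solver configuration the paper describes.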

Problem description
As aforementioned, the unavoidable error of traditional point prediction may lead to insufficient regulation control and may even cause erroneous preventive actions. For simplicity, we assume that the feasible region during TTC regulation is mainly defined by the TTC security margin, so that the shift of the security margin can be visualized as in Figure 2. In Figure 2, the initial system is identified as operating at the insecure OC A, since A is located outside the initial security boundary 3. The DBN-SABM model is then carried out to drive the system into an optimized OC B within the DBN-predicted security margin 2 (indicated by the purple dotted area). Under such circumstances, OC B is identified as secure based on the TTC predictor. However, due to the unavoidable TTC prediction error ζ of the DBN, the actual security margin shrinks to margin 1 (indicated by the green solid area), such that OC B is in fact insecure. In that case, although the DBN indicates that the system already operates at a secure OC, the power system is potentially endangered because of insufficient transmission margin. From an engineering aspect, the conservative control principle must be followed in power systems to ensure absolute robustness; the price of robustness, in turn, is paid in slight conservativeness and extra cost in the solutions. As shown in Figure 2, the prediction error must be considered in DBN-SABM so that the system can be directly driven into the secure OC C.
Toward this end, one feasible way is to include a positive threshold ε in the TTC constraint, that is, η̂_l ≤ −ε, to drive the prediction towards a larger value that offsets ζ. The issue is that finding an ε equal to ζ is hard, owing to the uncertainty of ζ. Setting ε large enough to dominate ζ is an alternative; however, the increase of ε will intuitively incur an increase of control cost. Also, for different OCs, ε must be determined by repeated trials, causing an extra computational burden. It is therefore difficult to stipulate a moderate ε value. This idea is known as the fixed-ε approach, and it is adopted as a comparative study in Section 6. In this paper, we propose a more feasible and generic approach to achieve a balance between the control cost and the control risk.

DBN based prediction intervals
Many probabilistic machine learning algorithms have been developed to learn the uncertainties of learning errors [22,23,32]. Among them, the prediction interval (PI) technique shows the best interpretability due to its interpretable loss function. PI also has great scalability and is easy to implement in practice. These appealing features motivate us to deploy PIs in this paper; moreover, the proposed risk-aware SA scheme is extensible to other probabilistic learning algorithms. A PI is a predicted interval within which actual targets fall with a specified probability, called the confidence level. It is composed of a lower bound L̂^(c)(x^t), an upper bound Û^(c)(x^t), and the specified probability 100 × (1−c)% [33], where c is the significance level and 100 × (1−c)% is the confidence level.

PIs metrics
Assume a training sample set Ψ^t(χ^t) including m_t samples, that is, Ψ^t(χ^t) = {(χ^t_i, y^t_i) | i = 1, 2, …, m_t}. A specified PI Î^(c)(χ^t) can then be defined as Equation (9):

Î^(c)(χ^t_i) = [L̂^(c)(χ^t_i), Û^(c)(χ^t_i)] (9)

Generally, three indices are employed to evaluate the quality of Î^(c)(χ^t): PI coverage probability (PICP), PI normalized average width (PINAW), and accumulated width deviation (AWD).

PICP
PICP indicates the probability that targets y^t are covered by Î^(c)(χ^t), calculated via Equation (10); the higher the PICP, the higher the quality of the PI:

PICP = (1/m_t) Σ_{i=1}^{m_t} β_i (10)

where β_i is a Boolean variable defined as Equation (11): β_i = 1 if y^t_i ∈ Î^(c)(χ^t_i), and β_i = 0 otherwise.

PINAW
Indeed, for a given Ψ^t, if L̂^(c)(χ^t) and Û^(c)(χ^t) are chosen as extreme values, for example negative and positive infinity, the PICP can easily reach 100%; however, such a PI provides no useful information. Therefore, to construct an informative PI, PINAW is proposed; the narrower the PINAW, the higher the quality of the PI:

PINAW = (1/(m_t Z)) Σ_{i=1}^{m_t} (Û^(c)(χ^t_i) − L̂^(c)(χ^t_i)) (12)

where Z is a normalization factor (e.g. the range of the targets).

AWD
AWD is introduced to assess the extent to which targets deviate from the upper or lower bound of the PI, calculated through Equations (13) and (14); the lower the AWD, the higher the quality of the PI:

AWD_i = (L̂^(c)(χ^t_i) − y^t_i) / (Û^(c)(χ^t_i) − L̂^(c)(χ^t_i)) if y^t_i < L̂^(c)(χ^t_i); AWD_i = 0 if y^t_i ∈ Î^(c)(χ^t_i); AWD_i = (y^t_i − Û^(c)(χ^t_i)) / (Û^(c)(χ^t_i) − L̂^(c)(χ^t_i)) if y^t_i > Û^(c)(χ^t_i) (13)

AWD = (1/m_t) Σ_{i=1}^{m_t} AWD_i (14)
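The three PI quality metrics can be computed together in a few lines. This sketch follows the standard definitions of PICP, PINAW, and AWD described above; the choice of the target range as the normalization factor Z is an assumption.

```python
import numpy as np

def pi_metrics(y, L, U):
    """Compute PICP, PINAW, and AWD for an interval [L, U] vs targets y."""
    y, L, U = map(np.asarray, (y, L, U))
    covered = (y >= L) & (y <= U)          # Boolean indicator beta_i
    picp = covered.mean()
    Z = y.max() - y.min()                  # normalization factor (assumed)
    pinaw = np.mean(U - L) / Z
    width = U - L
    awd_i = np.where(y < L, (L - y) / width,
                     np.where(y > U, (y - U) / width, 0.0))
    return picp, pinaw, awd_i.mean()

# toy example: the last target misses the interval badly
picp, pinaw, awd = pi_metrics(y=[1.0, 2.0, 3.0, 10.0],
                              L=[0.5, 1.5, 2.5, 3.5],
                              U=[1.5, 2.5, 3.5, 4.5])
```

In the toy case, three of four targets are covered (PICP = 0.75), and the large miss of the last target dominates the AWD, illustrating why AWD complements PICP: it measures how badly the misses fall outside the bounds, not just how many occur.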

PIs learning
In this research, the PIs are produced by ensemble learning; hence, they are calibrated by tuning the weights for assembling a set of base learners. PIs learning is theoretically a multi-objective optimization problem whose decision variables are the ensemble weights ω ∈ ℝ^{1×2M}, where 2M denotes the total number of base learners. The objective of PIs learning is to construct high-quality PIs for the given c; from the optimization perspective, PIs learning can be mathematically defined as Equation (15). This multi-objective optimization problem can be converted into a single-objective training task, whose loss function is shown as Equation (16), by introducing weighting factors λ^c_i, i = 1, 2, 3. To determine the optimal decision variables, PIs learning is executed via the particle swarm optimization (PSO) algorithm. Moreover, ensemble learning is adopted to enhance the robustness and quality of the PIs; the pseudo-code of the ensemble-enhanced procedure was outlined above.
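The scalarized weight-tuning step can be sketched as follows. The loss shape and the weighting factors are assumptions in the spirit of Equation (16) (whose exact form is not reproduced in this excerpt), and a plain random search stands in for the PSO; the toy base learners simply offset the targets by random amounts.

```python
import numpy as np

rng = np.random.default_rng(3)

def pi_loss(picp, pinaw, awd, c=0.05, lam=(50.0, 1.0, 5.0)):
    """Single-objective PIs loss in the spirit of Equation (16): penalize
    coverage below the nominal (1-c) level, wide intervals, and large
    deviations. The weighting factors lam are illustrative."""
    cov_gap = max((1.0 - c) - picp, 0.0)
    return lam[0] * cov_gap + lam[1] * pinaw + lam[2] * awd

# toy ensemble: bounds are weighted averages of 2M base-learner outputs
M = 3
y = rng.random(50) * 10
base_U = y[None, :] + rng.random((M, 50)) * 3    # upper-bound learners
base_L = y[None, :] - rng.random((M, 50)) * 3    # lower-bound learners

def evaluate(w):
    U = w[:M] @ base_U / w[:M].sum()
    L = w[M:] @ base_L / w[M:].sum()
    picp = np.mean((y >= L) & (y <= U))
    pinaw = np.mean(U - L) / (y.max() - y.min())
    return pi_loss(picp, pinaw, awd=0.0)         # AWD omitted in this toy

# stand-in for the PSO search: random search over the 2M ensemble weights
best_w, best_loss = None, np.inf
for _ in range(200):
    w = rng.random(2 * M) + 1e-6
    loss = evaluate(w)
    if loss < best_loss:
        best_w, best_loss = w, loss
```

The search trades coverage against width exactly as the text describes: widening the ensemble bounds raises PICP but also raises PINAW, and the scalarized loss arbitrates between the two.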

PIs based risk-aware TTC regulation
In the PIs-based TTC regulation model, TTC security is inferred according to the learning confidence of the DBNs, which allows the regulation to avoid being misled by inaccurate predictions. From the risk-analysis aspect, PIs can evaluate the risk that the DBNs misidentify TTC security; based on this risk model, more conservative security actions can be conducted. The mechanism of the PIs-based surrogate-assisted control is visualized in Figure 3: the real value of TTC falls into the PI with probability 100 × (1−c)%. Although the TTC prediction is steered by the DBN-SABM optimization to a value larger than the power flow, the actual TTC still risks falling into the insecure area with probability R. From Figure 3, the calculation formula of R can be deduced as Equation (17). To evaluate the tradeoff between the risk and the cost, R(x, y) is included in the objective function of DBN-SABM:

min α R(x, y) + (1 − α) F̂(x) (18)

where α ∈ [0, 1] is a user-defined factor and F̂(x) is the normalized re-dispatch cost of generators. For a conservative TTC regulation control, DBN-SABM sets α > 0.5. Theoretically, setting α to 1 can guarantee the TTC regulation control; however, it may compromise the control cost.
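A minimal sketch of the risk term and the weighted objective is given below. Equation (17)'s exact form is not reproduced in this excerpt, so the risk here assumes a uniform spread of the actual TTC over the PI, which is a stated simplification; the cost and bound values are toy numbers.

```python
def insufficiency_risk(pf, L, U, c=0.05):
    """Sketch in the spirit of Equation (17): probability that the actual
    TTC falls below the scheduled power flow pf, given it lies in [L, U]
    with confidence 100*(1-c)% (uniform spread over the PI assumed)."""
    if pf <= L:
        return 0.0
    if pf >= U:
        return 1.0
    return (1.0 - c) * (pf - L) / (U - L)

def risk_aware_objective(cost_norm, risk, alpha=0.7):
    """Equation (18): weighted tradeoff between normalized cost and risk."""
    return alpha * risk + (1.0 - alpha) * cost_norm

# toy evaluation: transfer of 95 MW against a PI of [90, 110] MW
obj = risk_aware_objective(cost_norm=0.4,
                           risk=insufficiency_risk(pf=95.0, L=90.0, U=110.0))
```

Raising α makes the optimizer prefer operating points whose power flow sits well below the PI's lower bound, reproducing the conservative behaviour the text prescribes for α > 0.5.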
Notably, since the control risk has been included in the objective function, the original DBN-SABM model (6) should be rebuilt into a new model that no longer includes the lower-level model.

FIGURE 4 The flowchart of the proposed risk-aware TTC regulation approach

Therefore, the risk-aware SA model can be represented as Equation (19). In summary, the following pseudo-code of the PIs-based risk-aware TTC regulation, together with the flowchart of the proposed method (see Figure 4), illustrates the overall procedure.
To make sure the surrogates generalize to unseen operating conditions, the DBNs should thoroughly learn the prior knowledge and patterns of historical dispatches. Thus, a sampling space spanning all dispatches that appear in historical/nominal data should be elaborately devised. This paper proposes the GPS- and SMOTE-based methods for this purpose; the generalization ability of surrogates trained on the produced samples can thereby be ensured.
Although the TTC surrogates can be properly initialized by the aforementioned scheme, there are still some scenarios unseen by the preliminary surrogates, for example maintenance planning. Since such scenarios can be foreseen, the related database can be generated in day/month/year-ahead operation planning. Afterwards, the surrogates should be updated to enable generalization under the unseen scenarios. The whole process is executed offline and is expected to be accomplished in time before entering the online phase; furthermore, high-performance computing clusters can speed up the offline process due to the high parallelism that offline procedures exhibit.

Algorithm (PIs-based risk-aware TTC regulation)
Output: the optimal generation planning x*_G.
Start:
1. Initialize the ecosystem, where n_G is the number of generators.
2. Randomly divide the initial ecosystem into P sub-ecosystems.
3.7 Calculate the risk R(x^q_{G,k+i−1}) using Equation (17).
3.8 Validate the steady-state constraints (1b) and (1c). If violations of constraints exist, include them in the fitness in the form of penalty functions: Δ = λ_G ΔG + λ_Z ΔZ, where ΔZ and ΔG denote the over-limit values of Equations (1b) and (1c), respectively, and λ_Z and λ_G denote the penalty coefficients for violated constraints of Equations (1b) and (1c), respectively.
3.9 Calculate the fitness: fitness = α R(x^q_{G,k+i−1}) + (1 − α) F̂(x^q_{G,k+i−1}) + Δ
end for
3.10 Perform SOS operations by Equations (7), (8), and parasitism.
end par for
3.11 Deploy ring migration to update each sub-ecosystem.
end for
End: obtain the optimal generation corresponding to the lowest fitness.

NUMERICAL STUDIES

Test system and initial settings
The IEEE 39-bus system is used to validate the proposed method; in the initial system, area 1 is the importing area. The DBNs were set to four layers, with 112-56-28-1 neurons per layer. To ensure the generalization ability of the surrogates, the L2 norm was employed in the loss function of the DBNs. In the improved SOS algorithm, the number of organisms was 60 and the maximum number of iterations was 50. For the PSO-based PIs optimization, the maximum number of generations was set to 200, each generation included 50 individuals, and c = {0.05, 0.01}; in the subsequent procedure, the averaged significance level c = 0.03 was used.

TTC prediction and DBN-SABM optimization
The sample generator introduced in Section 3.2 was used to produce 17,500 samples, in which 112 features (e.g. bus voltages, bus loads, and generators' active outputs) were considered. Set Ψ was composed of 15,138 samples, and the remaining 2,362 samples formed the test sample set Ψ^e. Furthermore, 5-fold cross-validation was deployed to split Ψ into the training sample set Ψ^t and the validation sample set Ψ^v. The results of TTC prediction on Ψ^e are given in Table 1. In addition, Table 1 shows comparisons among different algorithms (e.g. BPNN and multiple linear regression (MLR)) used to predict and regulate TTC. Table 1 shows that DBN outperforms the others; evidently, a larger prediction error incurs more rescheduling cost, and insufficient regulation control may also occur.
With the constructed TTC predictors, DBN-SABM can be solved through the improved SOS. The results of the approaches applied to the devised TCs are presented in Figure 5.
As for TC 1, the pre-control rotor-angle trajectories diverge; thus, the system is unstable. By employing the DBN-SABM optimization, the system becomes stable. Its angle trajectories before and after regulation are shown in Figure 5(b).

PIs based risk awareness
In the remainder of this paper, only the original IEEE 39-bus system is used to verify the validity of the proposed risk-aware TTC regulation method. The proposed DBN ensemble-based PIs method and the BPNN- and MLR-based PIs methods are examined for comparison. Specifically, four PIs methods were tested on the same sample set; for each algorithm, PIs learning was executed ten times, and the performance was quantified by the average of the PIs metrics. The results are shown in Figure 6 and Table 2, where the metric ACE [34] was used to appraise the PIs' performance. From Figure 6 and Table 2, it can be noticed that the DBN ensemble-based method is superior to the other traditional methods; therefore, it was adopted in the risk-aware TTC regulation.

Tradeoff between risk and rescheduling cost
Although the DBN-SABM has proven capable of regulating TTC under the test conditions in Section 6.2, proper control cannot be fully guaranteed when the approach is extended to other security-violated OCs. Therefore, two conservative methods, namely the PIs-based risk-aware method and the fixed-ε method, are introduced to determine an appropriate compromise between risk and cost. The insufficient control index (ICI), calculated by Equation (20), was used to appraise the applicability of the proposed approaches.
where SF_i denotes insufficient control after the SA risk-aware TTC regulation in the i-th unstable OC. After the regulation control, if the actual security margin and the SA security margin are both less than 0, SF_i = 1; otherwise, SF_i = 0. Before the TTC security criterion is checked, the steady-state constraints (e.g. generation limits) are validated; if at least one of them is violated, the control is also deemed to fail and SF_i = 1. S_U denotes the number of unstable OCs used for testing. In this research, an extra 200 unstable OCs were produced through Monte Carlo sampling, so that S_U = 200.
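The ICI rule described above (Equation (20)) can be written directly as code. The example cases below are hypothetical post-control outcomes, included only to exercise the three branches of the SF_i definition.

```python
def insufficient_flag(actual_margin, sa_margin, steady_ok):
    """SF_i: 1 when a steady-state constraint is violated, or when both the
    actual and the SA security margins are negative after control; else 0."""
    if not steady_ok:
        return 1
    return 1 if (actual_margin < 0 and sa_margin < 0) else 0

def ici(cases):
    """ICI = sum(SF_i) / S_U over the tested unstable operating conditions."""
    flags = [insufficient_flag(a, s, ok) for a, s, ok in cases]
    return sum(flags) / len(cases)

# four hypothetical OCs: (actual margin, SA margin, steady constraints satisfied)
cases = [(0.2, 0.1, True),    # controlled successfully
         (-0.1, -0.2, True),  # both margins negative -> insufficient
         (0.3, 0.2, False),   # steady-state constraint violated -> fail
         (0.1, -0.1, True)]   # actual margin positive -> not insufficient
```

For these four cases, two controls fail, giving ICI = 0.5; in the paper S_U = 200 unstable OCs are used instead.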
By increasing the user-defined factor α from 0 to 1, the relation among α, cost, and risk can be characterized. The results of the risk-aware method are shown in Figure 7(a). Specifically, the cost F̄ is the average of the action costs over all successfully controlled OCs. From Figure 7(a), insufficient control can be effectively averted by increasing α, albeit at the expense of a higher action cost. It is worth noticing that when α = 0, the risk-aware method is equivalent to the cost-dominated DBN-SABM. As such, the regulation with the lowest cost (i.e. $700.12) can be devised; however, the highest ICI of 11.5% implies that such a regulation control is highly risky for operating a TTC security-violated OC. When α increases to 0.25, ICI reaches an acceptable value, that is, ICI = 1.5% < 2%, and F̄ increases to $1,888.99. By setting α to 1, the model specifies a risk-averse action, leading the system to operate in an over-conservative condition: although ICI again reaches the acceptably low value of 1.5%, the action is overly costly, as F̄ = $8,017.86. Thus, towards a cost-risk compromised TTC regulation, [0.2, 0.5] is a moderate interval for setting α. However, finding an optimal α remains challenging. One possible alternative is to employ multi-objective programming to provide Pareto solutions [8], which allows dispatchers to inspect the regulation strategies for all α from 0 to 1. Combined with a point estimate of TTC, the TTC level of each Pareto strategy can be exhibited intuitively, and dispatchers can then select the best strategy according to the specified demand. More work on optimizing the α setting is ongoing. Moreover, it should be noted that ICI cannot reach 0 because, in some extreme OCs, the available generator capacity is severely insufficient to maintain stability.

FIGURE 7
Control cost and ICI of (a) the risk-aware approach and (b) the fixed-ε approach. The dashed square markers in (a) and (b) indicate the cost and ICI metrics when the two methods first attain their minimum risks, respectively.
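The role of α can be illustrated with a toy composite objective. This is a hypothetical sketch, not the paper's formulation: the additive form and the `risk_scale` factor that converts a PI-derived risk probability into a cost-comparable term are assumptions made for illustration; the two candidate costs echo the $700.12 and $1,888.99 figures reported above.

```python
def risk_aware_objective(cost, risk_prob, alpha, risk_scale=1e5):
    """Hypothetical composite objective: rescheduling cost plus an
    alpha-weighted penalty on the PI-derived probability of insufficient control.

    cost       -- action cost in $ for a candidate re-dispatch
    risk_prob  -- probability of insufficient control exposed by the PIs (0..1)
    alpha      -- user-defined tradeoff factor in [0, 1]
    risk_scale -- assumed scaling from risk probability to dollars
    """
    return cost + alpha * risk_scale * risk_prob

# two candidate actions: cheap-but-risky versus costly-but-safe
cheap = risk_aware_objective(700.12,  0.115, alpha=0.25)
safe  = risk_aware_objective(1888.99, 0.015, alpha=0.25)
```

With α = 0.25 the penalty flips the ranking: the safer, costlier action scores better than the cheap one, which is the qualitative behaviour Figure 7(a) reports as α grows.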
Similarly, the fixed-ε method was tested by increasing ε from 0 to 0.5 p.u.; the obtained results are shown in Figure 7(b). Increasing ε raises the cost but does not cause a monotonic reduction of ICI. When ε increases from 0 to 0.1 p.u., ICI decreases to a minimum of 3%, which is higher than the 1.5% achieved by the risk-aware method. Nevertheless, when ε is increased from 0.1 to 0.5 p.u., ICI rises from 3% to 8.5% instead of decreasing. It is also found that most optimization runs with ε larger than 0.1 p.u. fail to converge. In other words, setting a large ε value leads to non-convergence of the fixed-ε approach, whereas, as aforementioned, a small ε value cannot guarantee risk-averse TTC regulation. The optimal ε value must be determined by additional procedures or experiments, which incurs an extra computational burden. Even once the optimal ε value has been determined, the fixed-ε approach is still unable to produce a more risk-free TTC regulation action than the risk-aware approach. As a consequence, the risk-aware approach outperforms the fixed-ε one with respect to conservative regulation. Figure 8 compares the two approaches in terms of economic performance and illustrates the sensitivity of cost to risk.
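The qualitative difference between the two feasibility rules can be sketched as follows. This is an illustrative simplification: the fixed-ε rule demands a static buffer above the point-predicted margin, while the risk-aware rule requires the PI lower bound of the margin to be non-negative; the numeric values are made up for the example.

```python
def fixed_eps_feasible(predicted_margin, eps):
    """Fixed-eps rule: accept when the point-predicted security margin
    exceeds a static, user-chosen buffer eps (in p.u.)."""
    return predicted_margin >= eps

def risk_aware_feasible(margin_lower_bound):
    """Risk-aware rule: accept only when the PI lower bound of the
    security margin is non-negative, i.e. the worst plausible margin is safe."""
    return margin_lower_bound >= 0.0

# illustrative OC: point-predicted margin 0.08 p.u., PI lower bound -0.02 p.u.
accept_fixed = fixed_eps_feasible(0.08, eps=0.05)  # static buffer satisfied
accept_risk  = risk_aware_feasible(-0.02)          # worst case still unsafe
```

Here the fixed-ε rule with ε = 0.05 p.u. accepts an action whose PI lower bound is still negative, which matches the observation above that a small ε cannot guarantee risk-averse regulation.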
From Figure 8, when the two approaches first reach the same ICI of 3%, the control cost of the risk-aware approach is lower than that of the fixed-ε approach. Consequently, it is verified that the risk-aware approach also outperforms the fixed-ε one in terms of economic regulation.

Discussion on computational overhead
The computational overhead statistics of all algorithms are given in Table 3. It is apparent from Table 3 that the model-based method is quite time-consuming for both TTC calculation and regulation. By contrast, the proposed DBN-based methodologies achieve significant improvements in computational speed. These results reveal that the proposed surrogate-assisted methods provide an efficient and reliable way to flexibly regulate TTC.
It is also worth noticing that the applied solver was executed on a personal computer with a single CPU. By deploying the algorithm on parallelizable high-performance computing clusters, the computational efficiency can be further improved. Besides, the proposed framework is highly scalable: many heuristic solvers designed for large-scale problems, such as the algorithms proposed in [18] and [35], can be readily incorporated. These directions are beyond the scope of this paper and will be studied in future works.

CONCLUSION
This paper proposes a new surrogate-assisted (SA) risk-aware method for TTC regulation, enabling probabilistic exposure of the constraint-violation risk incurred by inherently imperfect surrogates. In addition, this research validates the feasibility of the proposed method on a power system application, namely TTC regulation, and explores the tradeoff between control cost and conservativeness. The study has demonstrated that: (i) compared with the traditional point-prediction method, the prediction intervals (PIs) method mines richer endogenous information from the surrogates; (ii) by incorporating the PIs-exposed probabilistic information into the SA global search, the optimized solution is more reliable than that of regular SA optimization; (iii) the proposed risk-aware SA method outperforms the fixed-ε approach with respect to convergence and applicability; and (iv) a risk-averse, conservative, and economic TTC regulation strategy can be realized through the risk-aware SA method. In addition, the merits of the proposed methodology can be summarized as follows: (i) the DBN-driven TTC regulator provides an effective way for operators to manage power transmissions online; (ii) the PIs-based method discloses the insufficient-control risks arising from the inescapable error of surrogates, so that the actions assisted by these surrogates can be ensured to be risk-averse; and (iii) a generic risk-free machine learning-assisted decision framework can be realized via the proposed method. Moreover, the proposed scheme is not limited to TTC-oriented studies; it can be extended to other computationally expensive and conservativeness-demanding decision-making tasks in power systems.

NOMENCLATURE
The set of tie-lines
The set of synchronous generators
∕c_0  The set of contingencies / initial operating condition without contingency
Ω_{m×n}  The sampling space with m observations and n features
x∕y  Controllable/state variables for the TTC regulation problem
l  Actual TTC-based security margin
i  The rotor angle of the i-th generator
∕ s  Determined input features / collected input features from sampled operating conditions
min∕max  The allowed minimum/maximum of the determined input features
Θ  The sample space formed from the input features and their limitations
s  Collected target feature (i.e. TTC) corresponding to the input features
Ψ_s  Sampled dataset
Φ  Properly built DBN-based predictor
l  The estimated TTC security margin