Power system resilience assessment considering critical infrastructure resilience approaches and government policymaker criteria

The electric power system is one of the most important critical infrastructures of a country. The number of natural and man-made disasters has recently increased, and such disasters can impose extensive damage and costs on the power system. A resilient power system can withstand, adapt to and recover from these disasters. Power system resilience is quantified by mathematical tools called “resilience metrics”, and many such metrics have been proposed in the power system literature. In this paper, based on extensive research in the critical infrastructure resilience literature that specifically concentrates on the “area-based” resilience metrics, a new area-based resilience metric is proposed which can measure the power system resilience considering the government policymaker criteria, which have rarely been addressed before. The proposed and conventional area-based resilience metrics are evaluated using real data from the 2012 Superstorm Sandy in the USA, which led to significant damage to the power distribution system. The simulation results show that the proposed area-based resilience metric is very simple, can successfully address actual power system performance curves and is more meaningful and tangible for the government policymaker than the conventional area-based metrics. The proposed area-based resilience metric also has a general form and can be used for other critical infrastructures.


INTRODUCTION
In recent years, the number of natural and man-made disasters has gradually increased, which can lead to extensive damage and costs in power systems. For instance, Superstorm Sandy hit the USA in October 2012 and its estimated damage cost was more than $70 billion. Superstorm Sandy affected more than 8 million customers (i.e. more than 20 million people) and the electricity outage lasted approximately 10 days. The transmission network suffered little damage, but the distribution network damage was significant [1]. In another case, the Ukrainian electric power system suffered a cyber-attack in December 2015, which left 225,000 people without electricity for up to 6 h [2]. Accordingly, the "resilience" concept has been extensively developed and used in the power system literature in order to cope with these disasters. While a wide range of definitions have been proposed for power system resilience, there is currently no consensus on it. In this paper, based on the definitions presented in [18,20,30], the following definition is applied (which is not necessarily comprehensive) [31]: "The power system resilience is the ability of this system to withstand disasters (low-frequency high-impact incidents) efficiently while ensuring the least possible interruption in the supply of electricity, sustain critical social services, and enabling a quick recovery and restoration to the normal operation state." In addition, three resilience components can be considered based on the above definition: withstand, adaptation, and recovery [31]. The "withstand" component is the ability of the power system, when a disaster occurs, to withstand the disaster in such a way that the system performance is almost not reduced.
The "adaptation" component is the ability of the power system during the disaster to implement some tools to minimise the effect of the disaster on the system performance, with special attention to the critical loads. The "recovery" component is the ability of the power system after the disaster to return the system performance to the normal state as soon as possible using the restoration activities. A resilience enhancement strategy may improve one or more resilience components. These resilience enhancement strategies are discussed in detail in many well-established references [5,6,14].
In [31], a "conceptual framework" is presented which divides the power system resilience metrics into "non-performance-based" and "performance-based" groups, based on the independence/dependence on the "power system performance", which is the direct output quantity of a power system. The performance-based metrics are also divided into "performance" and "consequence (outcome)" groups, where the former is directly related to the power system performance and the latter is related to the effect of the power system on the diverse features of the society. The performance metrics are divided into "power", "duration", "frequency", "probability" and "curve" groups, whereas the consequence (outcome) metrics are divided into "economic", "social", "geographic" and "safety and health" groups. The current power system resilience metrics are then assigned to the framework's groups.
Since power system resilience metrics which are used in the literature mostly belong to the "performance-based" group, we here concentrate on these metrics, although there are other references in the literature that implement the "non-performance-based" resilience metrics [50,53,54]. Our extensive research in the power system resilience literature shows that there are two main sources for proposing the "performance-based" resilience metrics:

Reliability Metrics
Some well-established references have used these metrics [28, 33, 38, 40, 43-46, 48-51, 55-57], which can belong to different groups in the "conceptual framework", such as "power", "duration", "frequency", "probability" and "economic". However, the resilience and reliability of power systems are completely different concepts. Power system resilience is related to disasters (low-frequency high-impact incidents), whereas power system reliability is related to normal events (high-frequency low-impact incidents) [16,19,22,25,29]. In addition, reliability metrics usually cannot consider all temporal aspects of the disaster effect on the power system. Thus, we believe that using reliability metrics to quantify power system resilience is controversial and is not recommended [31].

Metrics based on the critical infrastructure resilience approaches
These metrics usually use the power system performance curves and belong to the "curve" group in the "conceptual framework", where the system performance is usually the supplied load or the number of customers with power. For instance, the slope of different parts of the performance curve can be used as a resilience metric. A special type of these metrics, which we call "area-based" resilience metrics, is of prominent importance. It comprises a set of similar metrics that calculate the area beneath the performance curve, the area beneath a special part of the performance curve, or the area between the real and ideal performance curves (the so-called "resilience triangle" or "resilience trapezoid"), and that can be calculated in normalised forms [26,36,38,42,47,58,59]. Since we concentrate on these metrics, more clarification regarding critical infrastructure resilience is needed.
Critical infrastructures are those systems and assets (both physical and virtual) and emergency response systems that the modern economy has become increasingly dependent on to sustain our daily lives [60]. In [4], the U.S. government classifies critical infrastructures into 16 groups: "Chemical", "Commercial Facilities", "Communications", "Critical Manufacturing", "Dams", "Defense Industrial Base", "Emergency Services", "Energy", "Financial Services", "Food and Agriculture", "Government Facilities", "Healthcare and Public Health", "Information Technology", "Nuclear Reactors, Materials, and Waste", "Transportation Systems", and "Water and Wastewater Systems". Under this classification, power systems belong to the "Energy" group.
Although resilience is a relatively new concept in the power system literature, it has long been used extensively in the critical infrastructure literature. The evaluation and quantification of critical infrastructure resilience have a widespread literature, which is reviewed in [61]. In our opinion, a thorough study and analysis of the most important references concerning critical infrastructure resilience quantification is required in order to obtain an in-depth view of power system resilience quantification and to propose new power system resilience metrics. This paper concentrates on the "area-based" resilience metrics which are used in the critical infrastructure resilience literature, and their application in power system resilience assessment. After an extensive and in-depth literature survey of these metrics, which are now extensively used in the power system resilience literature, we identified some important drawbacks. These drawbacks are of great importance for resilience evaluation in actual power systems and can pose essential obstacles to considering government policymaker expectations. Government policymakers have expectations which are determined on behalf of the community's benefit and are more important than the electric utility's benefit. From the government policymaker's viewpoint, the power system resilience metric must be defined in such a way that a violation of these expectations can be seen in a tangible and meaningful manner, which can lead to funding and implementing proper resilience enhancement strategies. We summarise these expectations in five points, which we call the "government policymaker criteria".
Then, we propose a new "area-based" resilience metric for the power systems. The proposed resilience metric, which is based on the critical infrastructure resilience approaches, is very simple, can effectively consider the government policymaker criteria and can also be used for quantifying the resilience in other critical infrastructures due to its general form. The proposed and conventional area-based resilience metrics are then evaluated based on the real data from the 2012 Superstorm Sandy in the USA, which shows the usefulness of the new resilience metric in addressing the actual power system problems.
The rest of this paper is organised as follows. In Section 2, the application of the area-based resilience metrics in the critical infrastructure resilience literature is reviewed and analysed, and the identified drawbacks of these metrics are described. In Section 3, the proposed area-based resilience metric is presented in detail. The metric's advantages compared with the existing area-based metrics and its effectiveness in applying the government policymaker criteria are also discussed. In Section 4, the proposed and conventional area-based resilience metrics are evaluated using the real data from Superstorm Sandy, and different scenarios (base case, sensitivity analysis) are analysed. Finally, the conclusion and references are presented.

THE AREA-BASED RESILIENCE METRICS FOR THE CRITICAL INFRASTRUCTURES: A REVIEW AND ANALYSIS
In this section, the "area-based resilience metrics" application in the critical infrastructure resilience literature is surveyed first. Then, the drawbacks and limitations of those metrics are analysed. Based on this analysis, we will be able to propose our new area-based resilience metric in Section 3.

Literature review
The "area-based" resilience metrics were first introduced in [62], where the performance (quality, functionality) of the system after an earthquake is defined according to Figure 1 and is expressed in percent. In this figure, the sudden decrease in the system performance is related to the nature of the earthquake phenomenon.
FIGURE 1 The performance (quality) of a critical infrastructure versus time and the concept of "resilience triangle" [62,63]
Then, the resilience metric is defined by Equation (1), which means the area between the real and ideal performance curves in the recovery period.
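Consistent with this description, and assuming the ideal performance is a constant 100%, Equation (1) can be written in the following form (a reconstruction from the surrounding definitions, not a verbatim copy of the expression in [62]):

```latex
R = \int_{t_0}^{t_1} \left[ 100 - Q(t) \right] \, dt \tag{1}
```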
Where R is the resilience metric (loss of resilience), Q is the performance (quality) of the critical infrastructure, and t_0 is the beginning time of the recovery period. Besides, t_1 is the ending time of the recovery period, at which the system performance returns to the normal performance (i.e. the performance before the earthquake). It should be noted that this resilience metric is general and can be used for disasters other than earthquakes.
In [63], the area between the real and ideal performance curves is approximated by a triangle, the so-called "resilience triangle", which is shown in Figure 1. This resilience triangle represents the loss of performance after the disaster and the pattern of recovery over the time. The resilience triangle sides include the loss of performance and the recovery time. Thus, the resilience enhancement strategies must reduce those sides, or in other words, the size (area) of the resilience triangle.
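As a minimal sketch of this idea, assume a hypothetical disaster that instantly reduces the performance by `loss` percent, followed by a linear recovery over `recovery_time` days. The area of the resilience triangle can then be compared with a numerical integration of 100 − Q(t). The function names and numbers are illustrative assumptions, not values from [62,63].

```python
def triangle_area(loss: float, recovery_time: float) -> float:
    """Area of the resilience triangle: half of (performance loss x recovery time)."""
    return 0.5 * loss * recovery_time


def loss_of_resilience(times: list[float], quality: list[float]) -> float:
    """Trapezoidal integral of 100 - Q(t) over the sampled recovery period."""
    total = 0.0
    for i in range(1, len(times)):
        gap_prev = 100.0 - quality[i - 1]
        gap_next = 100.0 - quality[i]
        total += 0.5 * (gap_prev + gap_next) * (times[i] - times[i - 1])
    return total


# Hypothetical case: 40% sudden loss, linear recovery over 10 days.
times = [0.0, 2.5, 5.0, 7.5, 10.0]
quality = [60.0, 70.0, 80.0, 90.0, 100.0]

print(triangle_area(40.0, 10.0))           # 200.0 (percent-days)
print(loss_of_resilience(times, quality))  # 200.0 (percent-days)
```

Halving either side of the triangle (a smaller loss or a faster recovery) halves the area, which is why enhancement strategies target both sides.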
One important drawback of the aforementioned metric is that the effect of the system size is not considered, so the resilience of different systems, or of one system with different resilience strategies, cannot be compared. However, these aspects are essential for the resilience evaluation. Thus, "normalisation" is applied to account for them. The first effort was made in [64,65], where it is assumed that the critical infrastructure performance is defined according to Figure 2. Then, the resilience metric (R) is defined by Equation (2), which means the "normalised" area beneath the performance curve.
Where Q is the system performance (in percent), T_LC is the control time (which will be introduced later in this section), and t_0E is the time of disaster occurrence. This means that the resilience metric is the average performance in the control time period. It is worth noting that in [66], the above formula is used without "normalisation", i.e. the term T_LC is eliminated from the denominator of Equation (2), which means that the resilience metric is defined merely as the area beneath the performance curve.
FIGURE 2 The performance of a critical infrastructure and its associated resilience metric [64,65]
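A minimal numerical sketch of this normalised metric follows, assuming the performance curve is available as sampled points (time in days, performance in percent) that cover the control time from the disaster occurrence onwards. The function names and sample values are hypothetical; the unnormalised function corresponds to the variant attributed to [66] above.

```python
def area_under_curve(times: list[float], performance: list[float]) -> float:
    """Trapezoidal approximation of the integral of Q(t) over the samples."""
    return sum(
        0.5 * (performance[i - 1] + performance[i]) * (times[i] - times[i - 1])
        for i in range(1, len(times))
    )


def normalised_resilience(times: list[float], performance: list[float]) -> float:
    """Average performance over the control time T_LC = times[-1] - times[0]."""
    control_time = times[-1] - times[0]
    return area_under_curve(times, performance) / control_time


# Hypothetical curve: disaster at day 0, control time T_LC = 10 days.
times = [0.0, 1.0, 4.0, 10.0]
performance = [100.0, 50.0, 80.0, 90.0]  # final performance below the initial one

print(area_under_curve(times, performance))       # 780.0 percent-days
print(normalised_resilience(times, performance))  # 78.0 percent
```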
In Equation (2), it is assumed that only one disaster happens in the control time. However, in [66,67], it is assumed that multiple disasters may happen in the control time, which can be of different types or the same type with different intensities. These disasters can also be in uncoupled (one disaster happens when the recovery from the previous disaster is completed) or coupled (one disaster may happen when the recovery from the previous disaster is not completed) forms. In these situations, the resilience metric is defined as the average (or the weighted linear combination of) normalised area related to all disasters (based on Equation (2)), considering the number of disasters during the control time, the number of different disaster intensities during the control time and the probability that a disaster with a given intensity happens in the control time. Besides, in [68], it is assumed that the system consists of several interdependent critical infrastructures. Then, the resilience metric for each critical infrastructure is defined by Equation (2), and the resilience metric for the whole system is defined as the weighted linear combination of each critical infrastructure resilience metric.
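The weighted linear combination for a system of several critical infrastructures can be sketched as follows. The infrastructure names, weights and per-infrastructure metric values are hypothetical, and dividing by the sum of the weights is an assumption of this sketch rather than a detail taken from [68].

```python
def combined_resilience(metrics: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted linear combination of per-infrastructure resilience metrics."""
    total_weight = sum(weights[name] for name in metrics)
    return sum(weights[name] * metrics[name] for name in metrics) / total_weight


# Hypothetical normalised metrics (percent), one per critical infrastructure.
metrics = {"power": 78.0, "water": 90.0, "communications": 84.0}
weights = {"power": 0.5, "water": 0.3, "communications": 0.2}

print(combined_resilience(metrics, weights))
```

The same pattern applies to multiple disasters within the control time, with one normalised area per disaster and weights reflecting, for example, the probability of each disaster intensity.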
It is obvious that the above normalised resilience metric depends on the control time (T_LC), whether it is used considering a single disaster or multiple (uncoupled or coupled) disasters. In [64,65], it is stated that T_LC usually is the life cycle or life span of the system, which can be determined by the system owner or the society. We however believe that this quantity is very important and needs more clarification. In our opinion, the "control time" is used for two reasons:
1) As can be seen from Figure 1, the final performance is equal to the initial performance. Thus, the upper limit of the integral in Equation (1) is the time at which the system performance reaches the initial performance. However, Figure 2 shows that the final performance may be smaller than the initial performance, so the upper limit of the integral cannot be defined as before. Thus, for the resilience evaluation in these situations, an appropriate time must be considered at which all of the recovery efforts have been finished. This quantity, which is also used in Equation (2) for determining the denominator and the upper limit of the integral, must be defined by the system owner or the society.
2) When multiple disasters (uncoupled or coupled) must be considered, an appropriate time duration is needed, and the disasters occurring within it are used for the resilience evaluation. This concept is implemented in the resilience metric equations in [67,69].
Thus, there is a need for a "duration" or "period" in order to consider the above two points, which is called the "control time" in the critical infrastructure resilience literature, and can alternatively be called the "period of study".
In Equation (2), the area beneath the system performance curve is normalised by the control time. However, in [70], the number 100 (the constant desired performance in percent, which means the performance after the disaster returns to the performance before the disaster) is included as a factor in the denominator of Equation (2). Thus, the resilience metric is defined by Equation (3), which means that the area beneath the real performance curve is "normalised" by the area beneath the ideal performance curve, which is more meaningful from a dimensional point of view.
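Based on this description, Equation (3) can be written as follows (a reconstruction from the surrounding text, assuming the same integration limits as the normalised metric of [64,65]):

```latex
R = \frac{\displaystyle\int_{t_{0E}}^{t_{0E}+T_{LC}} Q(t)\, dt}{100 \, T_{LC}} \tag{3}
```

With Q in percent, the denominator 100 T_LC is the area beneath a constant ideal performance of 100% over the control time, so R is dimensionless and lies between 0 and 1 as long as Q(t) stays between 0% and 100%.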
In [71,72], considering multiple disasters in the control time, Equation (4) is presented as the area-based resilience metric, which is more general than Equation (3).
Where TQ(t) is the ideal (target) performance curve, which usually is a horizontal line (Figure 3), but it may have another form and vary with time. The shape of the area between the real and ideal performance curves can also have a form different from the triangle. In fact, according to Figure 3, a "resilience triangle" can be converted to a "resilience trapezoid". This shape can be more complicated if multiple disasters occur in the control time, or if the degradation and rise of the performance curve are not linear. This formula can also be used in an expected form in order to calculate the net effect of disasters [71,73]. Equation (4) can also be calculated separately for the performance degradation period and the recovery period [74].
FIGURE 3 The concept of "resilience trapezoid" for a critical infrastructure [73]
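A minimal sketch of this ratio of areas follows, assuming the real and target curves are sampled at the same time points over the control time. The function names and sample data are hypothetical; with a horizontal target of 100%, the result coincides with normalising the area beneath the real curve by 100 times the control time.

```python
def area(times: list[float], values: list[float]) -> float:
    """Trapezoidal approximation of the area beneath a sampled curve."""
    return sum(
        0.5 * (values[i - 1] + values[i]) * (times[i] - times[i - 1])
        for i in range(1, len(times))
    )


def resilience_ratio(times: list[float], real: list[float], target: list[float]) -> float:
    """Area beneath the real curve divided by the area beneath the target curve TQ(t)."""
    return area(times, real) / area(times, target)


# Hypothetical trapezoid-shaped loss: degrade, stay low, then recover.
times = [0.0, 1.0, 2.0, 5.0, 8.0, 10.0]
real = [100.0, 100.0, 40.0, 40.0, 100.0, 100.0]
flat_target = [100.0] * len(times)

print(resilience_ratio(times, real, flat_target))  # 0.7
```

A time-varying target only requires passing a different `target` list, for example one that rises over the control time.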
There are other area-based resilience metrics in the critical infrastructure literature. In [75,76], the area between the performance curves with and without recovery actions is defined as "dynamic resilience". In [60,77-79], the area between the real and ideal performance curves (systemic impact) and the area beneath the recovery effort curve (total recovery effort) are computed. Then, the normalised weighted linear combination of these two quantities is defined as a resilience metric. This method is used for a SISO (single input single output) system, where the input is the recovery effort curve and the output is the system performance curve. Then, in [80,81], this method is extended for MIMO (multiple input multiple output) systems, where the system has multiple recovery effort curves (inputs) and multiple performance curves (outputs). Thus, the systemic impact is first computed for each output, and then the weighted sum of these quantities is defined as the final "systemic impact". In addition, the total recovery effort is first computed for each input, and then the weighted sum of these quantities is defined as the final "total recovery effort". For each output, the systemic impact can be computed as before, or based on the square of the difference between the real and ideal performance curves. For each input, the total recovery effort can be defined as before, or based on the square of the recovery effort curves.
In [82-86], the previous area-based resilience metrics are converted into time-dependent metrics which are called the "space-time dynamic" resilience metrics. This means that in both the numerator and the denominator of Equation (4), the upper limit of the integral is not the end of the "period of study (control time)" but the "current time", so the metric depends on the current time. It is assumed that one disaster has several effects on the system, each of which is described by a distinct performance curve. The "space-time dynamic" resilience metric is then computed for each performance curve, and the geometric mean of these quantities is presented as the final resilience metric of the system. Then, in [69,87-90], this metric is generalised to consider interdependent infrastructure systems and multiple disasters, where the resilience metric may be deterministic or probabilistic, and the disasters may occur simultaneously or sequentially.
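The time-dependent idea can be sketched as follows, assuming piecewise-linear sampled curves and a constant target performance of 100%. The function names, the two example curves (standing for two effects of the same disaster) and the target value are hypothetical, and the details of the metrics in the cited references may differ.

```python
import math


def area_until(times: list[float], values: list[float], t_now: float) -> float:
    """Trapezoidal area beneath a piecewise-linear curve from times[0] to t_now."""
    total = 0.0
    for i in range(1, len(times)):
        t_a, t_b = times[i - 1], times[i]
        if t_now <= t_a:
            break
        v_a, v_b = values[i - 1], values[i]
        if t_now < t_b:  # interpolate the value at the current time
            v_b = v_a + (v_b - v_a) * (t_now - t_a) / (t_b - t_a)
            t_b = t_now
        total += 0.5 * (v_a + v_b) * (t_b - t_a)
    return total


def dynamic_resilience(
    times: list[float], curves: list[list[float]], t_now: float, target: float = 100.0
) -> float:
    """Geometric mean of the per-curve area ratios evaluated up to t_now (t_now > times[0])."""
    elapsed = t_now - times[0]
    ratios = [area_until(times, c, t_now) / (target * elapsed) for c in curves]
    return math.prod(ratios) ** (1.0 / len(ratios))


times = [0.0, 2.0, 6.0, 10.0]
curves = [
    [100.0, 50.0, 80.0, 100.0],  # e.g. supplied load
    [100.0, 70.0, 90.0, 100.0],  # e.g. customers with power
]
for t_now in (2.0, 6.0, 10.0):
    print(t_now, round(dynamic_resilience(times, curves, t_now), 3))
```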

Literature analysis
After an extensive and in-depth survey of the critical infrastructure resilience literature, we infer that the area-based resilience metrics have some important drawbacks that are rarely considered in the critical infrastructure resilience literature and, to the best of the authors' knowledge, have never been considered in the power system resilience literature. Our remarks about these types of resilience metrics are given in the following subsections.

2.2.1 The shape of the performance curve
Since critical infrastructures provide vital services for society, their loss of service, which means loss of performance in our terms, has extensive negative impacts on various aspects of society, such as the political, social and economic aspects. Governments usually want to prevent those negative impacts or compensate for them as soon as possible. When a disaster occurs, the government expects that the critical infrastructure services are maintained for as much of society as possible. In addition, for those parts of society that the critical infrastructure fails to serve, the government expects that the loss of service is short and the service is restored as soon as possible. Thus, the government policymaker wants any resilience analysis and evaluation to consider these expectations. To this end, some parameters must be defined for the performance curve of an actual critical infrastructure. A "degraded performance limit" is first defined for the performance curve; if it is violated, the critical infrastructure will be in serious danger. This parameter reflects the government expectation that the critical infrastructure service is provided to as much of society as possible. In addition, a "critical time" is defined for the performance curve, which means the period during which the critical infrastructure performance is below the "degraded performance limit". This parameter reflects the government expectation that the loss of service is as short as possible for those parts of society that the critical infrastructure fails to serve. Finally, a predefined period called the "desired recovery time" is considered, after which the critical infrastructure performance must be recovered at least to the "desired recovered performance". These parameters reflect the government expectation that the service is restored as soon as possible for those parts of society that the critical infrastructure fails to serve.

FIGURE 4 Two distinct performance curves with different shapes and the same resilience metrics [64]

The government determines the aforementioned parameters based on the impact of the loss of service in each critical infrastructure on the political, social and economic aspects of the society. Thus, the government policymaker expects that the aforementioned parameters must be considered in the critical infrastructure resilience evaluation.
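These parameters can be read from a sampled performance curve. The sketch below assumes a piecewise-linear curve (time in days, performance in percent) and uses hypothetical limit values. It reports the minimum performance, the critical time spent below the degraded performance limit, and the performance reached at the desired recovery time.

```python
def value_at(times: list[float], perf: list[float], t: float) -> float:
    """Linear interpolation of the performance curve at time t."""
    for i in range(1, len(times)):
        if t <= times[i]:
            frac = (t - times[i - 1]) / (times[i] - times[i - 1])
            return perf[i - 1] + frac * (perf[i] - perf[i - 1])
    return perf[-1]


def critical_time(times: list[float], perf: list[float], limit: float) -> float:
    """Total duration for which the piecewise-linear curve is below the limit."""
    below = 0.0
    for i in range(1, len(times)):
        t_a, t_b, p_a, p_b = times[i - 1], times[i], perf[i - 1], perf[i]
        if p_a >= limit and p_b >= limit:
            continue
        if p_a < limit and p_b < limit:
            below += t_b - t_a
            continue
        # The segment crosses the limit once: find the crossing time.
        t_cross = t_a + (limit - p_a) * (t_b - t_a) / (p_b - p_a)
        below += (t_b - t_cross) if p_b < limit else (t_cross - t_a)
    return below


# Hypothetical curve and limits (not taken from the paper's data).
times = [0.0, 1.0, 2.0, 4.0, 6.0, 10.0]
perf = [100.0, 100.0, 60.0, 60.0, 100.0, 100.0]
degraded_limit = 80.0        # "degraded performance limit"
desired_recovery_time = 5.0  # days after the disaster occurrence

print(min(perf))                                     # 60.0
print(critical_time(times, perf, degraded_limit))    # 3.5
print(value_at(times, perf, desired_recovery_time))  # 80.0
```

In this example the curve violates the degraded performance limit by 20 percentage points, and with an assumed "desired recovered performance" of 95% it would also fall short at the desired recovery time.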
In the conventional area-based resilience metrics (Section 2.1), the shape of the performance curve is completely neglected, i.e. violations of the "degraded performance limit" and the "desired recovered performance" are not considered. These important drawbacks have been noticed in only a few references [64,90-94]. For example, Figure 4 shows that two distinct performance curves with different initial losses and different recovery times have the same area between the real and ideal performance curves. This means that, according to Equation (2), these performance curves have the same resilience metrics [64]. Thus, the conventional area-based resilience metrics ignore the extent of limit violations and the corresponding damage and costs to society, which is unacceptable from the viewpoint of the government policymaker. For instance, Figure 4(a) shows a condition where most parts of society experience a loss of service but the service is restored very soon, which may cause severe impacts in only a few sections of society (e.g. large industries). By contrast, Figure 4(b) shows a condition where some limited parts of society experience a loss of service for a long time, which may cause very severe consequences (e.g. social unrest or riots) in those parts of society. Thus, it is obvious that these two performance curves are completely different in political, social and economic terms from the government policymaker's viewpoint, although their conventional area-based resilience metrics are the same.
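The point made with Figure 4 can be reproduced numerically. The two curves below are hypothetical (they are not the data of [64]): curve (a) has a deep but short loss, curve (b) a shallow but long one, and both give the same normalised area even though only curve (a) falls below an assumed degraded performance limit of 80%.

```python
def normalised_area(times: list[float], perf: list[float]) -> float:
    """Trapezoidal area beneath the curve divided by the control time."""
    area = sum(
        0.5 * (perf[i - 1] + perf[i]) * (times[i] - times[i - 1])
        for i in range(1, len(times))
    )
    return area / (times[-1] - times[0])


# Both curves drop instantly at t = 0 and recover linearly; control time = 10 days.
times_a, curve_a = [0.0, 2.0, 10.0], [20.0, 100.0, 100.0]  # 80% loss, 2-day recovery
times_b, curve_b = [0.0, 8.0, 10.0], [80.0, 100.0, 100.0]  # 20% loss, 8-day recovery

print(normalised_area(times_a, curve_a))  # 92.0
print(normalised_area(times_b, curve_b))  # 92.0
print(min(curve_a), min(curve_b))         # 20.0 80.0
```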
In the critical infrastructure resilience literature, only a few references have tried to address the aforementioned problems in the area-based resilience metrics. The most important effort was made in [92], where violations of the "degraded performance limit" and the "desired recovery time" are considered. The resilience metric is first defined using Equation (2), where the resilience triangle is multiplied by a first variable multiplier. Then, if a "degraded performance limit" violation exists, the resilience metric is computed using the real and limited performances, and the difference of these quantities is added to the final resilience metric using a second variable multiplier. Finally, if a "desired recovery time" violation exists, the resilience metric is computed using the real and limited recovery times, and the difference of these quantities is added to the final resilience metric using a third variable multiplier. The three variables behind these multipliers must also meet some constraints and are computed by solving an optimisation problem. Thus, this method is very complicated, and simpler methods are needed for considering the government policymaker expectations in the area-based resilience metrics.
There are other resilience metrics in the critical infrastructure resilience literature that consider some of the aforementioned drawbacks but do not belong to the area-based resilience metrics. In [64], the ratio of the "critical time" to the "control time" is presented as a resilience metric. In [95], the ratio of the desired to real recovery times is used for defining a resilience metric. In [96], the joint probability of the performance limit violation and the recovery time violation is presented as a resilience metric, although the extent of violation is not considered. In [97], the critical time is presented as a resilience metric. In [98], the "marginal performance" is defined as the difference between the performance and its limit, and then the resilience metric is defined as the ratio of the post-disaster marginal performance to the pre-disaster marginal performance.

2.2.2 The final versus initial performances
In most of the area-based resilience metrics, an ideal performance curve is defined which usually has the form of a horizontal line. This means that the final performance after the recovery is equal to the initial performance before the disaster. However, this is not true for actual critical infrastructures. The ideal performance curve may have a shape different from a horizontal line and may be hard to estimate for a complex system [71,99]. In addition, the final performance may be smaller than, equal to or greater than the initial performance [64,75,82,100,101]. For example, 5 months after Superstorm Sandy, 10% of customers in Rockaway (a peninsula of Long Island, New York) were without power [102], which means that the final performance was smaller than the initial performance (more examples will be shown in Section 4). By contrast, the state of infrastructures after the 2010 Haiti earthquake was improved [103], which means that the final performance was greater than the initial performance. If the initial performance of the system is relatively low and a disaster occurs, it may be necessary to consider the future system requirements and increase the final performance beyond the initial one, which may be temporary or permanent [62,77,79]. The full service recovery of the critical infrastructure is of great importance for governments, and if the service (i.e. performance) cannot be fully recovered within the control time (period of study), this causes extensive negative consequences (political, social and economic) in those parts of society that experience a loss of service for a long time. Thus, the government policymaker expects that this important factor must be included in the critical infrastructure resilience evaluation. In addition, if the critical infrastructure is enhanced and the final performance is greater than the initial performance, this must also be included in the critical infrastructure resilience evaluation. Thus, the relation between the final and initial performances must be added to the government policymaker expectations mentioned in Section 2.2.1.
In the critical infrastructure resilience literature, only a few resilience metrics consider this problem. The most important work is [104], which uses the ratio of the final to initial performances in a composite resilience metric. Another notable work is [101], where the difference between the final and worst performances is divided by the difference between the initial and worst performances and is used for building a resilience metric. In [95], the ratio of the recovered performance to the demand performance is used for defining a resilience metric. In [105], the time at which the performance curve returns to a specified percentage of the initial performance is presented as a resilience metric, which can be used when the final and initial performances are different. In [38], the time of recovery to a specified percentage of the initial performance is compared with a specified limit, which can be used for the resilience evaluation.
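Two of the ratios mentioned above can be sketched in a few lines. The function names and numbers are hypothetical, and the composite metrics that [101,104] actually build from these ratios are not reproduced here.

```python
def final_to_initial_ratio(initial: float, final: float) -> float:
    """Ratio of the final to the initial performance (the quantity used in [104])."""
    return final / initial


def recovery_ratio(initial: float, worst: float, final: float) -> float:
    """(final - worst) / (initial - worst), the quantity described for [101]."""
    return (final - worst) / (initial - worst)


# Hypothetical case: performance falls from 100% to 40% and recovers to 90%.
print(final_to_initial_ratio(100.0, 90.0))  # 0.9
print(recovery_ratio(100.0, 40.0, 90.0))    # about 0.833
```

Both ratios equal 1 for a full recovery, fall below 1 when the final performance stays below the initial one, and exceed 1 when the infrastructure ends up better than before the disaster.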

PROPOSING THE NEW AREA-BASED RESILIENCE METRIC
In the previous section, the application of the area-based resilience metrics in the critical infrastructure literature was analysed from the government policymaker's viewpoint. In this section, we propose a new and simple area-based resilience metric for power system resilience evaluation, based on the critical infrastructure resilience approaches and considering the aforementioned government expectations. Before presenting the mathematical description of the new resilience metric, it is convenient to illustrate the government expectations using a tangible example related to power systems. It should be noted that all the values mentioned in the following example are hypothetical and for illustration purposes only.
Assume that a power system in a country is exposed to a disaster (such as a hurricane). The government's main goal is that most of the customers have power during the disaster, and that if some customers lose their service, their power is restored as soon as possible. This is an important government policy that prevents or reduces the extensive negative consequences (political, social and economic) for society. Thus, the government wants most of the customers (at least 80%) to have power during the disaster. In addition, the government expects that the outage duration for the customers without power (at most 20% of the customers) is as short as possible. The government also wants the power to be restored to most of the customers in a short period of time (at least 95% of the customers have power within 3 days). Finally, the government expects that all of the customers who had power before the disaster have power after the restoration activities are finished. These are the government expectations, and all disaster management activities must be carried out with the aim of meeting them. However, this may be impossible in some cases, and the above-mentioned limits may be violated. Thus, the government policymaker wants these limit violations to be considered in the resilience evaluation and analysis process.
Thus, the government expectations regarding a critical infrastructure, which are expressed in Section 2.2, can be considered using five criteria related to the critical infrastructure performance curve, which we call the "government policymaker criteria". These criteria, which will be used for the critical infrastructure resilience evaluation (e.g. for the power system resilience evaluation), are summarised as follows:
Criterion 1: The relation between the final and initial performances must be considered in the resilience evaluation, because the final performance may be smaller than, equal to or greater than the initial performance.
Criterion 2: The minimum performance must be greater than or equal to the "degraded performance limit".
Criterion 3: The "critical time", which means the period during which the critical infrastructure performance is below the "degraded performance limit", must be included in the resilience evaluation.
Criterion 4: At the "desired recovery time", the critical infrastructure performance must be greater than or equal to the "desired recovered performance". The "desired recovery time", which is determined by the government policymaker, is the end of a period starting at the disaster occurrence time, by which most of the recovery activities are expected to be completed.
Criterion 5: If a limit is violated, the extent of the limit violation must be considered in the resilience evaluation.
However, regarding the application of criterion 1 in actual power systems, it should be noted that the final performance is equal to or smaller than the initial performance, and is usually not greater than it. This paper mainly concentrates on the fact that the final performance may be smaller than the initial performance, which can be seen from the actual power system performance curves during the Superstorm Sandy (see Section 4).
Now, the mathematical representation of the proposed area-based resilience metric is presented. It is assumed that the system performance curve for a critical infrastructure (e.g. a power system) has a typical form similar to Figure 5, where the disaster occurs at t = 0. After the disaster occurrence, the system performance is at first not reduced (withstand component of the resilience). Then, the system performance is considerably reduced in a short time and remains low for some duration, during which the critical loads must be supplied with the available tools (adaptation component of the resilience). Finally, after the disaster, the restoration activities are implemented in order to return the system performance to a stabilised and normal operating state, which may be lower than the system performance before the disaster (recovery component of the resilience). In Figure 5, the real performance curve variations are assumed to be linear, but they can take other, nonlinear forms. Then, our proposed area-based resilience metric (R) is defined according to Equations (5) to (10).
FIGURE 5 A typical system performance curve for a critical infrastructure (e.g. a power system)
In Equation (5), similar to Equation (4), the ratio of the area beneath the real performance curve (Q(t)) to the area beneath the ideal performance curve (TQ(t)) in the "control time" period (T_LC) is used (see our points regarding the control time in Section 2.1). However, this quantity is multiplied by a coefficient C, which is added to correct the previous area-based resilience metric by considering all government policymaker criteria. The coefficient C is the product of the coefficients C_1 to C_4, which characterise government policymaker criteria 1 to 4, respectively. The coefficients C_1 to C_4 also consider criterion 5. All coefficients are defined in normalised forms and are dimensionless. The coefficients C_2 to C_4 are smaller than or equal to one, but the coefficient C_1 may be smaller than, equal to or greater than one. Thus, the coefficient C may also be smaller than, equal to or greater than one. It is worth noting that in power systems, as mentioned before, the final performance is smaller than or equal to the initial performance and therefore the coefficients C_1 and C are smaller than or equal to one.
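As a sketch of the form described above, the metric and its correction coefficient can be written as follows. The integration limits (0 to T_LC) are an assumption based on the disaster occurring at t = 0; setting C = 1 gives the conventional area-based metric.

```latex
R = C \cdot \frac{\int_{0}^{T_{LC}} Q(t)\,dt}{\int_{0}^{T_{LC}} TQ(t)\,dt},
\qquad
C = C_1 \, C_2 \, C_3 \, C_4
```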
The coefficients C_1 to C_4 are defined as follows:

The first coefficient (C_1)
This coefficient characterises "criterion 1" and "criterion 5" and is the ratio of the final to the initial performance. In this coefficient, Q_f is the final performance after the recovery and Q_i is the initial performance before the disaster. If the final performance is smaller than, equal to or greater than the initial performance, this coefficient will be smaller than, equal to or greater than one, respectively (as mentioned before, this coefficient will be smaller than or equal to one for power systems).

The second coefficient (C_2)
This coefficient characterises "criterion 2" and "criterion 5" and shows the extent to which the degraded performance limit is violated. If the worst performance after the disaster (Q_min) is greater than or equal to the degraded performance limit (Q_min,l), the limit is not violated, no correction is needed and the coefficient is equal to one. However, if the degraded performance limit is violated, this coefficient is defined as the ratio of the permissible performance loss to the real performance loss. The more the limit is violated, the smaller C_2 becomes.

The third coefficient (C_3)
This coefficient characterises "criterion 3" and "criterion 5" and represents the time duration during which the degraded performance limit is violated. If the worst performance after the disaster (Q_min) is greater than or equal to the degraded performance limit (Q_min,l), the limit is not violated, no correction is needed and the coefficient is equal to one. However, if the limit is violated, the duration of the degraded performance limit violation, or the violated duration time (ΔT_cr), is computed first. Then, C_3 is defined as the ratio of the non-violated duration time (the control time minus the violated duration time) to the control time. The longer the violated duration time, the smaller C_3 becomes.

The fourth coefficient (C_4)
This coefficient characterises "criterion 4" and "criterion 5" and represents the speed of performance recovery after a predefined period. In this coefficient, T_rd is the desired recovery time, Q_d(T_rd) is the desired performance at the desired recovery time (desired recovered performance), and Q(T_rd) is the real performance at the desired recovery time. If the performance at the "desired recovery time" (T_rd) has recovered to at least the "desired recovered performance", no correction is needed and the coefficient is equal to one. However, if this requirement is not met, C_4 is defined as the ratio of the real to the desired performance at the "desired recovery time".
It can be seen from Equations (5) to (10) that the proposed area-based resilience metric depends on the following three parameters:
1) degraded performance limit (Q_min,l);
2) desired recovery time (T_rd);
3) desired recovered performance (Q_d(T_rd)).
These parameters must be determined in each country by the government policymaker considering the various aspects of the society (political, social, economic etc.) and the disaster properties. Thus, any disaster management activity and any resilience evaluation study in the different parts of the country must be based on coordinated and identical goals (parameter values) set by the government policymaker at the national level.
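The four coefficients described above can be summarised in the following sketch, which is reconstructed from the verbal definitions. It assumes that the permissible and real performance losses in C_2 are both measured from the initial performance Q_i; this choice of reference level is an assumption and is not stated explicitly in the text.

```latex
\begin{align*}
C_1 &= \frac{Q_f}{Q_i} \\[4pt]
C_2 &= \begin{cases}
  1, & \text{if } Q_{\min} \ge Q_{\min,l} \\[4pt]
  \dfrac{Q_i - Q_{\min,l}}{Q_i - Q_{\min}}, & \text{otherwise}
\end{cases} \\[4pt]
C_3 &= \begin{cases}
  1, & \text{if } Q_{\min} \ge Q_{\min,l} \\[4pt]
  \dfrac{T_{LC} - \Delta T_{cr}}{T_{LC}}, & \text{otherwise}
\end{cases} \\[4pt]
C_4 &= \begin{cases}
  1, & \text{if } Q(T_{rd}) \ge Q_d(T_{rd}) \\[4pt]
  \dfrac{Q(T_{rd})}{Q_d(T_{rd})}, & \text{otherwise}
\end{cases}
\end{align*}
```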
The proposed resilience metric can consider the resilience enhancement strategies indirectly through their effect on the system performance curve. Thus, it can be used to compare the resilience of a power system with and without applying a specific resilience enhancement strategy, or to compare the resilience of a power system when alternative resilience enhancement strategies are implemented. The proposed resilience metric combines the advantages of the area-based resilience metrics with the advantages of considering the government policymaker criteria, which is a new idea. This metric is very simple compared with the area-based resilience metric presented in [92], and considers the variations of the performance curve with time, which are not considered in [38, 64, 95-98, 104, 105]. To the best of the authors' knowledge, no reference in the critical infrastructure resilience literature or the power system resilience literature has summarised and considered all of those government policymaker criteria, whereas all of them are applied in the proposed resilience metric. Although the new area-based resilience metric is essentially developed for power system resilience evaluation, the metric formulation is completely general and can be used for quantifying the resilience of other types of critical infrastructures.
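A minimal Python sketch of how the metric might be evaluated on a sampled, piecewise-linear performance curve, and how two curves (for example, without and with an enhancement strategy) could be compared. The function names and both sample curves are hypothetical and are not taken from the paper. The sketch also assumes that the ideal performance TQ(t) is constant and equal to the initial performance, and that the performance losses in C_2 are measured from the initial performance.

```python
from typing import Sequence, Tuple


def area(t: Sequence[float], q: Sequence[float]) -> float:
    """Trapezoidal area under a piecewise-linear curve."""
    return sum((t[k + 1] - t[k]) * (q[k] + q[k + 1]) / 2 for k in range(len(t) - 1))


def time_below(t: Sequence[float], q: Sequence[float], limit: float) -> float:
    """Total time the piecewise-linear curve spends below `limit` (Delta T_cr)."""
    total = 0.0
    for k in range(len(t) - 1):
        t0, t1, q0, q1 = t[k], t[k + 1], q[k], q[k + 1]
        if q0 >= limit and q1 >= limit:
            continue  # segment never drops below the limit
        if q0 < limit and q1 < limit:
            total += t1 - t0  # whole segment is below the limit
        else:
            # exactly one crossing; locate it by linear interpolation
            tc = t0 + (limit - q0) * (t1 - t0) / (q1 - q0)
            total += (tc - t0) if q0 < limit else (t1 - tc)
    return total


def value_at(t: Sequence[float], q: Sequence[float], x: float) -> float:
    """Linearly interpolated performance at time x."""
    for k in range(len(t) - 1):
        if t[k] <= x <= t[k + 1]:
            return q[k] + (q[k + 1] - q[k]) * (x - t[k]) / (t[k + 1] - t[k])
    raise ValueError("x lies outside the sampled time range")


def resilience(t, q, q_min_limit, t_rd, q_desired) -> Tuple[float, float]:
    """Return (conventional metric with C = 1, corrected metric)."""
    q_i, q_f, q_min = q[0], q[-1], min(q)
    t_lc = t[-1] - t[0]
    # Assumption: ideal performance TQ(t) is constant and equal to q_i.
    r_old = area(t, q) / (q_i * t_lc)
    c1 = q_f / q_i
    violated = q_min < q_min_limit
    # Assumption: permissible and real losses are measured from q_i.
    c2 = (q_i - q_min_limit) / (q_i - q_min) if violated else 1.0
    c3 = (t_lc - time_below(t, q, q_min_limit)) / t_lc if violated else 1.0
    q_rd = value_at(t, q, t_rd)
    c4 = 1.0 if q_rd >= q_desired else q_rd / q_desired
    return r_old, c1 * c2 * c3 * c4 * r_old


# Hypothetical curves (hours, percent of customers with power); not data from the paper.
t = [0.0, 10.0, 20.0, 100.0, 200.0, 235.0]
q_base = [100.0, 100.0, 60.0, 60.0, 90.0, 90.0]
q_hardened = [100.0, 100.0, 85.0, 85.0, 100.0, 100.0]

r_old_base, r_new_base = resilience(t, q_base, 80.0, 120.0, 95.0)
r_old_hard, r_new_hard = resilience(t, q_hardened, 80.0, 120.0, 95.0)
print(f"base:     old={r_old_base:.4f} new={r_new_base:.4f}")
print(f"hardened: old={r_old_hard:.4f} new={r_new_hard:.4f}")
```

In this illustration the hardened curve never drops below the degraded performance limit, so only C_4 reduces its corrected metric, whereas all four coefficients reduce the metric of the base curve.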

SIMULATION RESULTS
In order to compare the new and conventional area-based resilience metrics in power systems, a power system performance curve is needed. This performance curve may be obtained from the historical data of previous disasters (for resilience evaluation in the past), or may be predicted using modelling and simulation (for resilience evaluation in the future). We select the first approach here for the following two reasons: 1) The simulation results for an actual power system which is exposed to an actual disaster are more tangible for the reader. 2) Although our study is dedicated to the resilience evaluation of the power system, which is only one of the critical infrastructures of the society, the actual power system performance curve is the net effect of the various factors which exist before, during and after the disaster, including the disaster management activities and the interdependence with the other critical infrastructures.
Thus, we evaluate the proposed area-based resilience metric using a prominent actual case: the U.S. power system during the Superstorm Sandy. The Superstorm Sandy hit the US east coast from October 28 to November 7, 2012 and led to significant damage to the power distribution system. The path of the Superstorm Sandy was completely different from that of the previous hurricanes since 1851, and the weather predictions regarding Sandy's path and intensity were inaccurate. Thus, the utilities were not sufficiently prepared for the Superstorm Sandy, which led to extensive power outages in 21 U.S. states, including Connecticut, Delaware, District of Columbia, Illinois, Indiana, Kentucky, Maine, Maryland, Massachusetts, Michigan, New Hampshire, New Jersey, New York, North Carolina, Ohio, Pennsylvania, Rhode Island, Tennessee, Vermont, Virginia and West Virginia. During the Superstorm Sandy, approximately 8 million customers (20 million people) were without power within those 21 states, and the worst situation was in states such as New Jersey, New York, Connecticut and West Virginia [1,106,107]. The resilience of the U.S. power system during the Superstorm Sandy has been analysed in previous works [106,108,109] from different viewpoints, and we present our resilience evaluation from an independent and new viewpoint. We first present our resilience evaluation for a base case (Section 4.1), and then perform a sensitivity analysis for the three parameters which are related to the government policymaker criteria (Section 4.2).

4.1 The resilience evaluation for the Superstorm Sandy: Base case
For the U.S. power system, we define the ratio of "the number of customers with power" to "the total number of customers served" as the system performance, which is expressed in percent. In other words, the system performance is the percentage of customers with power. The customer outage data for all 21 impacted states are taken from the official U.S. DOE reports issued during the Superstorm Sandy (from October 28 to November 7, 2012), where two reports are available for each day [107]. The above-mentioned system performance is defined based on the data available in these reports. However, if the required data are available, the system performance may be defined in other forms, such as the ratio of "the supplied load" to "the total demand", or the ratio of "the weighted sum of the supplied loads" to "the total demand" where the weight coefficients are computed based on the load importance. The duration of the control time, or the period of study (T_LC), is 235 h. The total number of customers served in each of the 21 impacted states is taken from the EIA data for 2012 [110].
Figure 6 shows the performance curve for New Jersey, which is one of the most severely impacted states during the Superstorm Sandy. It can be seen that at the end of the period of study, the final performance is approximately 90% (i.e. 10% of customers have not been restored yet) and the worst performance is approximately 33%. Figure 7 shows the performance curve for New York, which is another severely impacted state during the Superstorm Sandy. It can be seen that at the end of the period of study, the final performance is approximately 97% (i.e. 3% of customers have not been restored yet) and the worst performance is approximately 74%. Figure 8 shows the performance curve for Connecticut, which is another severely impacted state during the Superstorm Sandy. It can be seen that at the end of the period of study, the final performance is 100% (i.e. all customers are fully restored) and the worst performance is approximately 61%.
FIGURE 6 The system performance curve during the Superstorm Sandy in New Jersey
FIGURE 7 The system performance curve during the Superstorm Sandy in New York
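The customer-based performance definition used in this section can be sketched as follows. The function name and the customer counts are hypothetical and serve only to illustrate the ratio; they are not values from the DOE reports or the EIA data.

```python
def performance_percent(customers_served: int, customers_out: int) -> float:
    """Percentage of customers with power at one reporting instant."""
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    if not 0 <= customers_out <= customers_served:
        raise ValueError("customers_out must lie between 0 and customers_served")
    return 100.0 * (customers_served - customers_out) / customers_served


# Hypothetical outage counts from successive reports for one state.
customers_served = 4_000_000
outage_reports = [0, 250_000, 1_000_000, 400_000, 50_000]

curve = [performance_percent(customers_served, out) for out in outage_reports]
worst_performance = min(curve)   # Q_min
final_performance = curve[-1]    # Q_f
print(curve, worst_performance, final_performance)
```

Each reported outage count yields one point of the performance curve, from which the worst and final performances are read directly.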
However, there are other states with limited or minor damage. Figure 9 shows the performance curve for the total of 21 impacted states. It can be seen that at the end of the period of study, the final performance is approximately 99% (i.e. 1% of customers have not been restored yet) and the worst performance is approximately 87%. Table 1 shows the most important characteristics of the system performance curves for each impacted state and for the total of 21 impacted states. It can be seen that when an actual power system is exposed to an actual disaster, some practical issues appear in the system performance curves. In fact, the final performance may be different from the initial performance, and the worst performance after a disaster may be very low (i.e. the performance loss may be very large). Thus, these practical issues (i.e. the government policymaker criteria in Section 3) must be considered in the resilience evaluation, as proposed in our new resilience metric.
FIGURE 8 The system performance curve during the Superstorm Sandy in Connecticut
FIGURE 9 The system performance curve during the Superstorm Sandy for all of 21 impacted states
For the resilience evaluation, as a base case, it is assumed that Q_min,l = 80%, T_rd = 5 days (120 h) and Q_d(T_rd) = 95%. These quantities are selected based on in-depth analysis of the data which are presented in [15,38,92,96,102,106,108,109]. In Table 2, the old resilience metric (RI_Old, based on Equation (5) with C = 1) and the new resilience metric (RI_New, based on Equations (5) to (10)) are computed for each of the 21 states and for the total of 21 states. It can be seen that the new and old resilience metrics are the same for 15 states, but they differ for the other 6 states and for the total of 21 states. The difference is related to the states with the most severe situation, where the government policymaker criteria are violated. The new resilience metric is always smaller than or equal to the old one (i.e. the coefficient C is smaller than or equal to one), and the amount and percentage of the difference are also provided. The largest differences are in states such as New Jersey, Connecticut, New York and West Virginia. However, the old resilience metric varies within a relatively narrow interval across the 21 states. The value of RI_Old in New Jersey and New York, as the most impacted states during the Superstorm Sandy, is 0.698451 and 0.881341, respectively, whereas in Illinois, as a state with minor damage, RI_Old is 0.99999. This small difference between the severely impacted and nearly intact states may be less meaningful and tangible for the government policymakers who are concerned with the power system resilience evaluation, and their criteria cannot be applied using RI_Old. By contrast, RI_New varies within a relatively broad interval across the 21 states. The value of RI_New in New Jersey and New York is 0.041875 and 0.458073, respectively, whereas in Illinois RI_New is 0.99999. This large difference between the severely impacted and nearly intact states is more meaningful and tangible for the government policymakers, and they can find their criteria applied using RI_New.
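Because the new metric is the old metric multiplied by the coefficient C, the values quoted above imply the overall correction applied in each state. The short sketch below recovers it from the reported numbers; the variable names are illustrative, and only the three states quoted in the text are included.

```python
# (RI_Old, RI_New) pairs as quoted in the text for the base case.
reported = {
    "New Jersey": (0.698451, 0.041875),
    "New York": (0.881341, 0.458073),
    "Illinois": (0.99999, 0.99999),
}

# RI_New = C * RI_Old, so the implied overall coefficient is their ratio.
implied_c = {state: new / old for state, (old, new) in reported.items()}

for state, c in implied_c.items():
    print(f"{state}: C = {c:.4f}")
```

The implied coefficient is roughly 0.06 for New Jersey and 0.52 for New York, while it equals one for Illinois, where no criterion is violated.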

4.2 The resilience evaluation for the Superstorm Sandy: Sensitivity analysis for the government policymaker criteria
As mentioned before, the government policymaker criteria are considered in the new resilience metric. This metric uses three important parameters, namely the degraded performance limit (Q_min,l), the desired recovery time (T_rd) and the desired recovered performance (Q_d(T_rd)), which must be determined by the government policymaker. In this section, we perform a sensitivity analysis for these parameters to show their effect on the proposed area-based resilience metric, and the flexibility provided for the government policymakers to consider and apply their criteria. We consider three policies for selecting those parameters: 1) easy policy; 2) normal policy; 3) hard policy. The base case analysis presented in Section 4.1 is based on the "normal policy" for the three aforementioned parameters. The parameter values for the three policies are shown in Table 3. These quantities are selected based on in-depth analysis of the data which are presented in [15, 38, 92, 96, 102, 106, 108, 109]. For each sensitivity analysis, it is assumed that only one parameter is changed and the other two parameters follow the normal policy (base case). The sensitivity analysis results for the parameters Q_min,l, T_rd and Q_d(T_rd) are presented in Tables 4-6, respectively. It can be seen that RI_New for the easy policy is greater than or equal to RI_New for the normal policy, which in turn is greater than or equal to RI_New for the hard policy. In addition, if the normal policy is replaced by the easy policy, it is possible that some or all of the limit violations in the normal policy are removed, which means that RI_New may be the same as RI_Old. This case can be seen in Table 4 for Rhode Island. By contrast, if the normal policy is replaced by the hard policy, it is possible that some or all of the limits are violated, although they are not violated in the normal policy. This means that RI_New may be the same as RI_Old for the normal policy, but different from it for the hard policy. This can be seen in Table 4 (Maine, Maryland and New Hampshire), Table 5 (New Hampshire) and Table 6 (Ohio). However, there are other states where RI_New is always equal to RI_Old under the easy, normal and hard policies for all three parameters, including Delaware, District of Columbia, Illinois, Indiana, Kentucky, Massachusetts, Michigan, North Carolina, Tennessee, Vermont and Virginia. In these states, the final performance is the same as the initial performance, and the worst performance is usually high, as can be seen from Table 1.
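The effect of the degraded performance limit can be illustrated with a small sketch that evaluates only the two coefficients depending on Q_min,l (C_2 and C_3) for limits of 70%, 80% and 90%, the illustrative easy, normal and hard values of this parameter. The performance curve, the function names and the assumption that performance losses are measured from the initial performance are hypothetical and are not taken from the paper's tables.

```python
def time_below(t, q, limit):
    """Total time a piecewise-linear curve spends below `limit`."""
    total = 0.0
    for k in range(len(t) - 1):
        t0, t1, q0, q1 = t[k], t[k + 1], q[k], q[k + 1]
        if q0 >= limit and q1 >= limit:
            continue  # segment stays at or above the limit
        if q0 < limit and q1 < limit:
            total += t1 - t0  # segment is entirely below the limit
        else:
            # single crossing inside the segment, found by linear interpolation
            tc = t0 + (limit - q0) * (t1 - t0) / (q1 - q0)
            total += (tc - t0) if q0 < limit else (t1 - tc)
    return total


def c2_c3(t, q, limit):
    """Return (C_2, C_3) for a given degraded performance limit."""
    q_i, q_min, t_lc = q[0], min(q), t[-1] - t[0]
    if q_min >= limit:
        return 1.0, 1.0  # limit not violated, no correction
    # Assumption: permissible and real losses are measured from q_i.
    c2 = (q_i - limit) / (q_i - q_min)
    c3 = (t_lc - time_below(t, q, limit)) / t_lc
    return c2, c3


# Hypothetical curve (hours, percent of customers with power).
t = [0.0, 10.0, 20.0, 100.0, 200.0, 235.0]
q = [100.0, 100.0, 75.0, 75.0, 100.0, 100.0]

policies = {"easy": 70.0, "normal": 80.0, "hard": 90.0}
products = {}
for name, limit in policies.items():
    c2, c3 = c2_c3(t, q, limit)
    products[name] = c2 * c3
    print(f"{name}: Q_min,l={limit:.0f}%  C2={c2:.3f}  C3={c3:.3f}  C2*C3={c2 * c3:.3f}")
```

For this curve the easy policy removes the violation entirely, so the corrected and conventional metrics would coincide, while the normal and hard policies reduce the metric progressively.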
In actual power systems, determining the three parameters of the proposed area-based resilience metric (Q_min,l, T_rd and Q_d(T_rd)) is not straightforward. For example, if Q_min,l is selected according to the easy, normal and hard policies (70%, 80% and 90%, respectively), all disaster management activities must be carried out considering the selected policy. An easy policy requires lower disaster management costs, but social unrest may be more probable. By contrast, a hard policy requires higher disaster management costs, but social unrest may be less probable. Thus, a normal policy that compromises between these two aspects (and the other relevant aspects) may be convenient. However, it should be noted that the parameters given in Table 3 are hypothetical and for illustration only, and the values assigned to the normal policy must not be interpreted as our recommendations. In fact, determining these parameters is a complicated problem which needs extensive studies by the national government agencies considering the various aspects of the society (political, social, economic etc.) and the disaster properties.
According to the simulation results, it can be inferred that the proposed area-based resilience metric is very simple, can address the practical issues which are related to the actual power systems and disasters, and is more tangible and meaningful for the government policymakers since their criteria can be applied in the resilience evaluation in an easy and flexible manner.

CONCLUSION
In this paper, it is shown that the most important power system resilience metrics are taken from the critical infrastructure resilience literature, which has a very long history. The paper concentrates on the "area-based" resilience metrics, which are first reviewed based on the critical infrastructure resilience literature. Then, some drawbacks of those metrics are discussed, which are related to the practical issues that exist in actual critical infrastructures (e.g. in actual power systems) and are very important from the government policymaker's viewpoint. Considering a critical infrastructure performance curve, the government policymaker expectations mean that the system performance curve must be limited in terms of the worst performance and the time spent below the permissible performance. In addition, the relation between the final and initial performances must be considered, and after a predefined duration from the disaster occurrence, a minimum performance must be recovered. These government expectations are summarised in five points for the power system resilience evaluation, which we call the "government policymaker criteria". Then, a new area-based resilience metric is proposed to consider the aforementioned government policymaker criteria in the power system resilience evaluation. In this metric, the conventional area-based resilience metric is multiplied by four coefficients that correspond to the government policymaker criteria. The proposed area-based resilience metric is evaluated using real data that show the effect of the Superstorm Sandy on the power system in 21 U.S. states. The new and old (proposed and conventional) area-based resilience metrics are then computed and compared with each other. A sensitivity analysis is also carried out for the three parameters which are related to the government policymaker criteria (degraded performance limit, desired recovery time, desired recovered performance) by considering three policies: easy, normal and hard. The simulation results show that the new area-based resilience metric is very simple, can address the performance curve issues of an actual power system which is faced with an actual disaster, and is more tangible and meaningful for the government policymakers since their criteria can be applied easily and flexibly in the power system resilience evaluation.
Although the new area-based resilience metric is proposed for the power system resilience quantification, it has a general form that is not restricted to this infrastructure type. Thus, the proposed area-based resilience metric can be used for the resilience quantification in other types of critical infrastructures.