A review: Monitoring situational awareness of smart grid cyber-physical systems and critical asset identification

Alrowaili, Yazeed, Saxena, Neetesh, Srivastava, Anurag, Conti, Mauro and Burnap, Peter ORCID: https://orcid.org/0000-0003-0396-633X 2023. Review: Monitoring situational awareness of smart grid cyber-physical systems and critical asset identification. IET Cyber-Physical Systems: Theory & Applications 10.1049/cps2.12059

structures. Then, the information collected can be analysed to identify problems inside these systems before they occur, providing comprehensive knowledge about the system and helping to build advanced alert capabilities.
Energy systems are one type of CPS. They are considered extremely critical for two reasons: (1) a cyber incident on these systems can directly affect the safety of human life and (2) cyber threats to these systems can harm the utility and/or the nation economically.
Motivation. We highlight several case studies of cyber threats against different CPSs, which show why protecting them is essential. Musleh et al. [3] discussed significant cyberattacks that occurred in the energy sector. In 2007, in Idaho, USA, a diesel generator exploded after an Aurora attack manipulated its circuit breaker. In 2008, in Turkey, 30,000 barrels of oil spilled into the water after an explosion caused by an attack that manipulated the control system parameters of an oil pipeline. In 2012, in Saudi Arabia, the generation and delivery of energy were severely affected by a malware injection that targeted the Aramco company. These case studies show that securing such systems is paramount: they are not just regular systems but can be considered a backbone of a nation's economy, and the increase in cyber threats targeting them has raised these concerns. Likewise, in 2015, a cyberattack in Ukraine targeted three distribution substations through unauthorised entry into the company's Supervisory Control and Data Acquisition (SCADA) system. This attack was highly severe because it caused a blackout affecting 225,000 customers for several hours in 103 cities [4]. A cybersecurity threat to these systems can therefore affect both the economy and many lives. Lastly, in a more recent incident in 2021, an attacker gained unauthorised access to the Human-Machine Interface (HMI) of a water treatment plant in Florida and tried to raise the sodium hydroxide level from 100 ppm to 11,100 ppm, which could easily have endangered the people served by the attacked water supply network [5]. To conclude, these systems need appropriate mechanisms to capture and analyse real-time data through SA, identify visible and invisible attacks and apply the best remediation plan.
Context. In this study, we consider energy systems as a case study. Currently, the energy sector uses Smart Grids (SGs) to deliver electricity. SG systems have critical procedures to generate, transmit and distribute energy, offer a mutual flow of electricity and provide information that can help create automated and distributed energy delivery networks [6]. Moreover, SGs use two-way communication in all their processes, from generation to distribution, providing numerous advantages to both consumers and producers. Nevertheless, procedures performed on these systems need to be monitored and controlled by devices specially designed for Industrial Control Systems (ICSs). Colbert and Kott [7] define ICSs as a set of various control systems and equipment that contain hardware, software and networks that operate and automate industrial processes. Still, the primary security goals of OT and IT are not the same: OT devices are time-critical, so the primary security goal of such a system is availability whenever it is needed. Therefore, SG systems are considered CPSs, and there is a high demand to secure such systems to avoid any adverse human or economic consequences. Furthermore, in this study, we mainly covered survey papers that delve into tools, techniques and simulations applied to the SG. These papers focus on asset discovery, identification of cybersecurity issues, and assessment of risks and failures where the impact on power components (physical assets) is illustrated. These points are highly relevant to the aim of our paper. We have explored journal articles, peer-reviewed international conferences, security blogs and books to refine our work and support it with constructive arguments. We are aware that additional studies have been conducted on other CPSs, such as oil and gas, and transportation network systems. However, due to the different physical functionalities and requirements within these CPSs, these studies are considered outside the scope of this study.
Challenges. Cyber-Physical Systems differ from traditional systems because they use ICSs to perform specific operational and industrial physical tasks. These systems have attracted the attention of adversaries, as shown by the high frequency of targeted cyber threats. Monitoring such systems for cyber threats poses numerous challenges. One significant challenge is that ICS assets differ from IT assets, making them hard to secure, explore and identify. Moreover, some of these assets are legacy systems that are incredibly difficult to secure; for instance, Programmable Logic Controllers (PLCs), Intelligent Electronic Devices (IEDs) or even sensors can be undetectable because they are so out of date [8]. Another challenge is that these assets are time-critical, have limited processing capacity and use specific communication techniques, making them highly sensitive to any discovery technique (passive, active or hybrid). Unlike with traditional IT assets, any unnecessary use of a discovery tool can affect the entire system's performance [9].
It is also essential to classify all identified assets based on their criticality. Such classification is needed since a CPS is built to satisfy a specific need, and the assets in SG systems have their own characteristics. For example, transformers, which step voltage up or down, are one type of SG asset, and a failure to protect such an asset cannot be tolerated [10]. A further challenge is identifying all possible cyber threats in such a system. The identified threats should then be classified based on the criticality of the assets they target, so that the most critical are handled immediately. For instance, even a simple attack on a critical asset should not be acceptable [11]. Such a system needs complete knowledge of all its procedures, devices, users, resources and policies, which can be used to build an advanced alert capability. Moreover, it is challenging to analyse massive data from different systems using different approaches.
Analysing data and indicators from different sources can greatly enhance the SA of these systems, which in turn helps build a system with advanced alert capabilities [12].
Existing Review Papers. Before conducting this review, we explored related papers to ensure that our work provides a comprehensive overview of the state-of-the-art studies in CPS security. Several informative works on securing CPSs have been carried out. Firstly, Refs. [13][14][15][16][17] provide well-defined surveys on SG security and challenges, giving a deep understanding of security issues and solutions in the SG. Secondly, some survey papers provide a comprehensive overview specific to SG networks [18][19][20]. These papers illustrate different network architectures that can operate ICSs, the available protocols that need to be implemented and their limitations. Thirdly, one important aspect is to explore the tools available in this field to help monitor SA in these systems. Some of the work mentioned above has discussed the steps of monitoring SA. However, only a few works have covered tools and techniques for constructing SA. The authors in Refs. [21][22][23] focussed on listing tools used in CPSs for many purposes. Yet, these papers do not explore the tools available at each step to identify, monitor, alert and assess.
Our Contributions. While the survey papers mentioned above have covered SG aspects such as security, network, tools and threats, several limitations have been identified. One issue is the need for comprehensive work that covers and links the available tools and techniques so that they can be used to improve security and robustness for effective SA. Another issue is that, while various papers have evaluated critical assets in the IT field from different perspectives, such as complex communication networks, vulnerabilities and physical security [24][25][26][27][28][29], there is little focus on evaluating OT assets, an important gap that needs to be covered to ensure that CPSs are equally secured from both OT and IT aspects. Additionally, while conducting this research, we found no relevant existing review papers exploring studies for evaluating asset criticality in different CPSs. Therefore, the main contributions of this review paper are as follows.
• The paper provides a comprehensive background of available research and techniques for building an SA platform by exploring asset discovery in ICSs, their available tools, techniques and limitations. The evaluation deeply explores asset identification, ICS communication protocols deployed and vulnerability detection, which can help recognise and utilise the appropriate tools available in this field.
• It evaluates existing techniques that can be used to classify these assets based on their criticality and lists the most significant cyber threats that can occur in these systems.
• The paper reviews the solutions and simulation environments used for critical asset identification in power system CPSs. More specifically, the paper discusses available power system, network simulation and integrated CPS simulation tools that are used to create a co-simulation to mount cyberattacks and evaluate the consequences on assets and on the physical system.

2 | SYSTEM AND RELATED TERMINOLOGIES
This section provides a unified base that can be used to illustrate the CPS's infrastructure. It also introduces the terminology that appears when exploring related work, tools and solutions that have been implemented in such systems.
• Cyber-Physical Systems can be defined as the integration between computation and physical processes. CPSs use computer hardware, software and a communication network to control physical processes in manufacturing and other fields [30]. Moreover, based on the National Institute of Standards and Technology Framework for CPSs [31], the SG is considered one of the many implementations of a CPS due to its heterogeneous environment and the need to determine positive emergent behaviours.
• Situational Awareness can be described as the perception of the current state and its consequences for the present and the future [32]. In terms of CPS cybersecurity, SA is used to capture and understand the threat information of both IT and OT infrastructures, identify a comprehensive, real-time view of cyber threats on key components and gain knowledge on potential actions an adversary can take to target assets [33]. SA can be vital to propose the best remediation plan to avoid possible cybersecurity consequences that can occur in such systems.
• Operational Technology can be characterised as all resources used to monitor any physical process. OT contains ICSs or consists of industrial resources such as factories, machines and networks [34]. Moreover, in recent years, IT has been integrated with OT to monitor and control physical processes inside CPSs.
• Smart Grid is an electricity network that can be operated in an automated manner by enabling digital technology to efficiently supply electricity to consumers, as illustrated in Figure 1 [35]. Moreover, based on its definition, the SG is considered an implementation of a CPS where a combination of OT equipment is integrated with IT infrastructure.
• Industrial Control Systems: as shown in Figure 2, ICSs are a combination of several control systems that work together to achieve an industrial objective, such as energy distribution and manufacturing [36]. Additionally, these types of control systems can be listed as follows:
  • Supervisory Control and Data Acquisition (SCADA) is the main element of the ICS network. Moreover, these systems receive measurement data and monitor and control field equipment in real-time based on predefined control commands [37].
  • Human-Machine Interface (HMI) can be considered the main link between the operators and the ICS process.

3 | ASSET DISCOVERY AND IDENTIFYING CYBERSECURITY ISSUES
Comprehensive information about OT and IT resources and physical access control systems will help identify problems and issues inside CPSs. Additionally, Multi-Intelligent Body Systems (MIBS) can be used in CPSs to enhance their performance and efficiency. In SA, MIBS can offer real-time monitoring and analysis of various data streams, and applying advanced analytics will allow multiagent systems to identify behaviour and anomalies in the data [38,39]. The data collected from OT, IT, physical access control, and MIBS can be used to have a comprehensive architecture that provides SA to analyse, capture and store real-time or near real-time data. According to the ICS Information Sharing and Analysis Centre (ICS-ISAC), the four components that can be applied to build SA for ICS are as follows: identify (the objectives, structure and skills of an organisation), inventory (the available hardware/software assets of an organisation), activity (of assets owned by the organisation) and sharing (internal and external communications) [40]. Moreover, all four components described above are categorised into the main three stages of the SA [41] as shown in Figure 3.
The main scope of this study is to provide a comprehensive review of available literature and tools for identifying critical assets, which is considered an important step for constructing a valuable SA. Furthermore, our approach is motivated by studies that covered critical asset identification in cyber-physical SG [42][43][44], also by including asset interdependencies between different layers inside the SG (e.g., cyber, and physical layers) or between connected CPSs (e.g., SG, and oil and gas) [45][46][47][48]. Therefore, this section is divided into four subsections based on the flow chart presented in Figure 4. The first subsection is a comprehensive review of the literature and tools used to identify assets, system components and interdependencies for constructing SA. The second subsection explores the techniques used to assess risks (vulnerabilities, threats and consequences). The third subsection examines different methods to prioritise assets by classifying them based on their criticality. The fourth subsection covers future directions, open issues in this field, and lessons learnt. Lastly, Section 4 discusses the final step illustrated in the flow chart, which utilises simulations to evaluate impacts once a critical asset has been compromised/identified.

3.1 | Identify assets, systems and networks
To gain comprehensive knowledge for developing SA, one essential step in any environment is identifying all assets, components and networks that exist in its systems. Moreover, asset discovery can help systems ensure a robust recovery process, maintain security configuration and manage patches for software and hardware [49]. Figure 5 shows an example of asset inventory within an SG environment. However, discovering assets in OT resources inside SG systems is not the same as in IT: OT assets are time-critical, and some are not connected to the network and cannot be discovered with traditional discovery tools [50]. Hence, it is essential to choose appropriate tools available in the market with asset discovery techniques that do not affect the process [51]. Failure to operate these systems is not tolerated; the inability to use them can result in severe complications, including blackouts, serious injuries to individuals, service interruption and power overloading. The Smart Grid shares common components with other CPSs, such as PLCs, Remote Terminal Units (RTUs), SCADA and HMIs. Table 1 illustrates three techniques used to identify and discover the common assets found in SG and other CPSs: active, passive and hybrid scanning.
Each technique has its own characteristics and approach for finding different system assets, and these features can sometimes affect the system's main mission. For instance, the active scanning technique expects full system information from the device as a response, which sometimes distracts the device from performing its main task, causing latency in the overall procedure [60]. Several studies argue that passive scanning is the safest option to discover ICS/OT resources in such an environment [55,61,62,63], because it only listens, without intercepting traffic or asking assets for a response. However, passive scanning can only take limited information into account and cannot discover an asset that is not connected directly to the network system [59]. Moreover, passive scanning cannot identify non-IP assets in such an environment [21]. Other studies argue that active scanning can be implemented in CPSs without affecting the system functionality [52,59]. Additionally, they maintain that it is possible to discover assets actively by using native ICS protocols such as Siemens s7comm (S7 Communication) and Modbus, but this approach is not suitable for handling any sensitive assets [59]. Other providers [64,65] argue for a third option, a combination of active and passive scanning, which uses passive scanning to gather data from network traffic while sending out active queries as needed. This technique, if applied appropriately, can provide useful information without interrupting any ICS assets from performing their main tasks.
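The hybrid option described above can be illustrated with a minimal sketch. It is not taken from any of the reviewed tools: the `Asset` record, the `sensitive` flag, the `max_queries` budget and the `query` callback are hypothetical names introduced only for illustration, and no real network traffic is generated. The sketch builds an inventory from already observed traffic (passive step) and then sends a bounded number of follow-up queries, skipping assets marked as sensitive (active step).

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical inventory record for one discovered device."""
    address: str
    protocols: set = field(default_factory=set)
    details: dict = field(default_factory=dict)
    sensitive: bool = False  # e.g. a legacy device that must never be polled


def passive_discover(observed_frames):
    """Passive step: build the inventory only from traffic already on the wire.

    `observed_frames` is an iterable of (source_address, protocol_name) pairs.
    """
    inventory = {}
    for source, protocol in observed_frames:
        asset = inventory.setdefault(source, Asset(address=source))
        asset.protocols.add(protocol)
    return inventory


def active_enrich(inventory, query, max_queries=2):
    """Active step: query at most `max_queries` assets, never sensitive ones.

    `query` is a caller-supplied function (an assumption of this sketch) that
    takes an address and returns a dict of device details.
    """
    sent = 0
    for asset in inventory.values():
        if sent >= max_queries:
            break
        if asset.sensitive or asset.details:
            continue  # skip fragile devices and devices already described
        asset.details = query(asset.address)
        sent += 1
    return sent


if __name__ == "__main__":
    frames = [("10.0.0.5", "modbus"), ("10.0.0.9", "s7comm"), ("10.0.0.7", "modbus")]
    inventory = passive_discover(frames)
    inventory["10.0.0.9"].sensitive = True
    active_enrich(inventory, lambda address: {"queried": address})
    for asset in inventory.values():
        print(asset)
```

Capping the number of active queries and excluding sensitive assets reflects the concern raised in the text that unnecessary polling can disturb time-critical devices; the actual limits would depend on the environment.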
Lastly, choosing an execution technique for asset discovery depends on the organisation's goals and the requirements or information needed to be collected from the system. These discovery techniques are applied by deploying them in specific tools. One main challenge is the need to differentiate between asset discovery tools applied to OT and IT. Therefore, Table 2 provides a broad review of the asset discovery tools used for ICSs and maps them based on available features such as vulnerability detection, any knowledge provided (e.g., manufacturer, system OS, last update, etc.), ICS protocol applied and system visualisation. Furthermore, some listed tools can be integrated easily with other tools, which helps provide comprehensive asset information for organisations. This integration is vital for collecting information to build SA. Additionally, it should be highlighted that some tools mentioned in Table 2 are marked as hybrid because they can use active and passive scanning in conjunction or perform each method independently each time they are deployed.
Based on the steps identified by ICS-ISAC for building SA, this section has so far discussed the first two components, identify and inventory. Activity and sharing remain to be covered for developing a complete SA architecture. These two remaining elements are vital for understanding connections and activities between assets identified in CPSs, which in the cybersecurity context can help recognise which direct and indirect elements are most likely to be compromised and provide the ability to prioritise key components. One approach for demonstrating connectivity between assets is dependency modelling. According to the authors in Ref. [106], a dependency can be defined as a correlation between two or more elements in which modifying one component's state can cause changes to other components' conditions. Dependency modelling can be either component-driven or system-driven, depending on the levels of abstraction defined in the chosen risk management technique [107]. Dependencies can be constructed on several relationships that components have, and according to the authors in Refs. [108,109], some of these can be listed as follows.
• Input/Output Dependency: This can arise when a component needs to request information from, or deliver information to, another component.
Various works have been presented for dependency modelling in several sectors of CPSs [110][111][112][113]. As this review focusses on defining critical components in SG CPSs, Table 3 lists studies that use component-based dependency modelling based on component connections. Furthermore, the table categorises each work based on the approach implemented, the type of relationships deployed and their limitations.
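To make the definition of dependency concrete, the following minimal sketch models components as nodes of a directed graph and computes which components may be affected when one component's state changes. It is an illustration only and does not reproduce any of the approaches in Table 3; the component names and the `dependents` mapping are hypothetical.

```python
from collections import deque


def affected_by(failed, dependents):
    """Return every component whose state may change if `failed` changes.

    `dependents` maps a component to the components that directly depend on it.
    The search is breadth-first and tolerates cycles.
    """
    seen = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, ()):
            if child != failed and child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def rank_by_reach(dependents):
    """Rank components by how many other components they can affect."""
    nodes = set(dependents)
    for children in dependents.values():
        nodes.update(children)
    reach = {node: len(affected_by(node, dependents)) for node in nodes}
    # Highest reach first; ties are broken alphabetically for stable output.
    return sorted(reach.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical cyber-to-physical chain: router -> SCADA -> HMI/RTU -> breaker.
    dependents = {
        "router": ["scada"],
        "scada": ["hmi", "rtu"],
        "rtu": ["breaker"],
    }
    for component, reach in rank_by_reach(dependents):
        print(component, reach)
```

Counting reachable dependents is only one possible indicator; the reviewed studies use richer measures, such as impact or weight of a dependency, which this sketch does not attempt to capture.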

3.2 | Assess risks and failures
Another step is to assess the risks and failures associated with all types of assets identified within an organisation. Assessing risks and possible failures is vital for building and moving to the next stage of SA. It enables security analysts, business decision-makers or operators to perceive and comprehend the current situation [121], which helps analysts prioritise their resources according to the desired goals and objectives. Therefore, this subsection reviews possible techniques for identifying risks and failures in CPSs. Table 4 illustrates several works that apply different risk/failure techniques inside the SG system. It should be noted that risk is the product of likelihood (threats exploiting vulnerabilities or potential failures) and consequences (the impact specified for an organisation, e.g., business, operations, environment, economic and safety) [142]. However, one challenge is reducing the uncertainty of a consequence. One solution is to include as much information as possible, such as professional judgement, models, expectations and datasets, in the risk computation. Another approach to reducing uncertainty was proposed by the authors in Ref. [133], who considered several scenarios (including their description, likelihood and potential impact) that could be used to answer three questions: What can happen? How likely is it? What is the impact if the scenario occurs? The illustrated techniques were evaluated based on their application, background knowledge, scenarios included and the ability to create undiscovered scenarios.
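The relation between risk, likelihood and consequence, together with the three scenario questions, can be sketched as follows. This is a minimal illustration under stated assumptions: likelihood is treated as a probability in [0, 1], impact is scored on a hypothetical 0 to 10 scale, and the scenario names and values are invented for the example rather than drawn from the cited works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    description: str   # What can happen?
    likelihood: float  # How likely is it? (probability in [0, 1])
    impact: float      # What is the impact? (hypothetical 0-10 scale)

    @property
    def risk(self):
        """Risk as likelihood multiplied by consequence."""
        return self.likelihood * self.impact


def rank_scenarios(scenarios):
    """Validate the scenarios and return them ordered from highest to lowest risk."""
    for scenario in scenarios:
        if not 0.0 <= scenario.likelihood <= 1.0:
            raise ValueError(f"likelihood out of range: {scenario.description}")
        if scenario.impact < 0.0:
            raise ValueError(f"negative impact: {scenario.description}")
    return sorted(scenarios, key=lambda scenario: scenario.risk, reverse=True)


if __name__ == "__main__":
    # Invented example values, for illustration only.
    scenarios = [
        Scenario("Unauthorised HMI access changes a setpoint", 0.5, 8.0),
        Scenario("Malware disrupts a substation RTU", 0.1, 10.0),
        Scenario("Active scan causes latency on a PLC", 0.9, 2.0),
    ]
    for scenario in rank_scenarios(scenarios):
        print(f"{scenario.risk:.1f}  {scenario.description}")
```

A single product hides the uncertainty discussed above; in practice the likelihood and impact inputs would themselves come from expert judgement, models or datasets.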

Active scanning [52][53][54]
Description: A technique that uses active network communications to identify devices in the environment.
Advantages: Identifies assets with more valuable information for ICSs. It can be implemented with stand-alone tools or even commands into the network.
Limitations: Can lock up network interfaces and exhaust CPU resources. It can introduce additional latency into the environment.

Passive scanning
Advantages: Does not actively poll network-connected devices. It avoids making devices unstable or crashing them, which would lead to a denial of service and pose an immediate safety risk for humans and the environment.
Limitations: Any device not communicating with the passive discovery sensor will go undiscovered. The usefulness of a deployed passive scanner would be greatly reduced if other communications (e.g., radio connections, modems) were used.

Hybrid scanning [58,59]
Description: Combines active and passive methods, using active techniques as an enabler for passive techniques. An example of this would be to use the address resolution protocol (ARP) technique to force all traffic on a network through a central host, allowing the host to be detected and classified by a passive sensor.
Limitations: It is possible to inherit the same disadvantages that exist in active scanning if implemented wrongly.

[114] The work proposes an approach for modelling the intra- and inter-dependency of a micro-distribution network. Additionally, four parameters were proposed: the impact of dependency, the susceptibility of dependency, the weight of dependency and the criticality of dependency for quantitative assessment of the characteristics of dependencies.

✓ ✓
This work considers buses only as electrical nodes in addition to routers and multiplexers as information and communications technology (ICT) nodes.
[115] This work presents a security model that can show the privilege states in a large architecture and evaluates possible paths that attackers could exploit. Moreover, the quantitative information produced from the model is used to identify information dependencies to enhance the risk management processes.

✓ ✓
The proposed work focusses on privileged states. There is little focus on identifying critical nodes, and the main goal is identifying information dependencies only.
[116] The work investigates the cause-and-effect dependency between the SG and ICT components by categorising and defining the state of the SG assets and their impact once an ICT component fails.

✓ ✓
The main aim of this work is to show the dependencies between ICT and power components if ICT nodes fail. There is little effort in modelling power components, and no information was introduced regarding compromised nodes from cyberattacks.
[117] This work provides an overview of different techniques for modelling dependencies between various critical infrastructure systems. Moreover, it delves into the interdependency approaches at transmission and distribution levels to outline the validity of using these dependency approaches on real-time systems.

✓ ✓
This work covers interdependence between critical infrastructure and shows interdependencies at transmission and distribution levels. Yet, this work is focussed on covering electric nodes only, and there is little focus on cyber nodes.
[118] The work proposes a framework to assess the impact of cyberattacks on SG. Furthermore, this study presents a cause-effect relationship between cyber and physical components.

✓ ✓
This work focusses on showing the physics of the interaction for power nodes and uses the functionality for cyber grid elements. However, in terms of cybersecurity assessments, the evaluation was conducted using unauthorised access only.
[119] This study proposes a wide area measurement system (WAMS) model in an SG combined with graph-based dependency to show the dependency between communication and measurement layers to enhance the SG WAMS resilience.

✓ ✓
This work discusses developing a dependency model for SG. However, its main focus is on WAMS. The presented metrics were related to the measuring and communication of WAMS layers. There is little focus on other cyber assets, and the evaluation was only conducted on PMUs and optical ground wires.

3.3 | Identifying and ranking critical assets
The next phase is to monitor the CPS's SA with regard to asset criticality. Distinguishing key assets in CPSs is vital and will help the organisation develop the best remediation plan in advance to avoid catastrophic consequences. Implementing this step is a challenge due to the different characteristics of each CPS. For example, one of the assets inside an SG is a transformer, which steps the voltage up or down, and the failure to protect such an asset is not tolerated; yet there is currently no universal method to identify these assets. Therefore, this section discusses critical asset identification methodologies, lists key requirements and explores techniques and related work in this field. Initially, based on Ref. [143], two requirements must be met for successful asset identification in critical infrastructure. The first requirement is qualitative, which refers to meeting certain soft criteria to develop an efficient identification methodology. The National Infrastructure Protection Plan [144] lists these criteria as completeness, reproducibility, documentation and defensibility, which can be used to evaluate the critical asset identification methodology of CPSs. The second requirement is quantitative, which can be defined as solid measurement criteria for the critical asset identification process. These measures are the asset weight score, the organisation's mission criteria and scoring guide, and the asset identification process, which was covered in Section 3.1. Following Refs. [143,144], Figure 6 shows a complete map of the following requirements for critical asset identification.
• Completeness: This represents the requirement where an approach provides comprehensive component evaluation (threats, vulnerabilities and consequences) into the many different critical infrastructures.
• Reproducibility: The risk methodology should ensure that the proposed results are qualified and comparable with different sectors, making it easy to evaluate risks against other CPSs.
• Documentation: The methodology must document the approach, techniques and information conducted, any remediation plans applied or suggested and the users involved.
• Defensibility: The proposed risk methodology should be error-free, reducing uncertainty and efficiently integrating its components.
• Elements: Requirements, such as asset identification list, criteria, weighted score to indicate asset criticality, scoring guide and how it is applied, are used as input to the risk methodology.
• Components: These include the scope of the methodology (systematic or unsystematic), the applied approach used (network, function or logic based) and how the information gathered is evaluated.
Table 5 illustrates a brief description of common risk assessment and asset identification methodologies. The table

Matrix based [120] The work proposes a dependency graph using phasor angles to model SG fault detection. Additionally, phasor angles were driven as random variables in Gaussian Markov random field (GMRF) to determine fault detection and the localisation of transmission lines.

✓ ✓
The work has been conducted to address fault diagnosis in power grids using the conditional correlation matrix of the GMRF. Yet, it lacks identifying cyber nodes, cybersecurity challenges were not included and the evaluation was applied to limited hierarchical power systems nodes.

Technique: Description

Fault tree analysis (FTA) [122,123]
• It focusses on a system's safety and reliability.
• It can be used for observing the impact and likelihood of equipment failure.
• It allows experts to be involved in obtaining much background knowledge about the system.
• It can reflect the logical relationships and interdependencies between components.
• When used in a cyber threat context, it can show complete information about a malicious attack.
• Complete information/components about the system are needed.
• The core function of this analysis is to predict equipment failure.
• As it depends on the probability of failure, it can be integrated with PRA.

Event tree analysis (ETA) [124,125]
• It is a qualitative system analysis similar to FTA.
• The key difference is that it considers impacts using inductive reasoning.
• It assumes that each event being analysed has two results (success or failure).
• As it depends on the probability of failure, it can be integrated with PRA.

Bow-Tie analysis [126]
• It is an analysis that combines FTA and ETA.
• It uses ETA to reflect the consequences of an incident, while FTA is used to learn what may have caused the incident.
• It can create several unconsidered scenarios that help to understand a system failure comprehensively.

Attack tree [127,128]
• A threat modelling technique that describes how an attacker can attack a target through the network.
• Unlike event and fault trees, the main focus of the attack tree is on malicious activities, not failure activities.
• ETA and FTA can be linked to attack trees for a comprehensive malicious and failure analysis.

Monte Carlo simulations [129,130]
• A random analysis that establishes unconsidered points and scenarios that can determine a system's availability and operability.
• As this simulation creates random initial values, it requires a long running time to obtain valid results and deep data.
• This technique is not aimed at a cyber threat context, but it generates random situations that have not been considered before and can include cyber threats.

Failure modes, effects and criticality analysis [131]
• It is a hazard analysis based on skills, engineering best practices and standards.
• It can integrate standards and good practice policies and analyse the different ways a node can fail along with the impact.
• As it focusses on specific components, it can also be used to analyse components' criticality.
• This analysis focusses on the impact but not on the reasons for the impact; using FTA at the same time can help with this.

Hazard and operability (HAZOP) method [132]
• Its main usage is hazard analysis, and it is the most applied method in CPSs for this kind of analysis.
• It divides the system into components and provides a strict assessment of the impact on each one.
• It can be used for cyberattack analysis when assuming each component/process is affected by a cyber threat.

Probabilistic risk assessment (PRA) [133,134]
• It is a scenario-based approach that utilises three questions: 1. What can go wrong? 2. How likely is it? 3. What is the impact?
• Its main usage is accident analysis, and it is not designed for cyberattack analysis.

Markov models [135]
• A risk analysis based on Markov models.
• It is similar to Monte Carlo simulations and can be used with BN for comprehensive failure and malicious scenarios.

CARVER
• Using CARVER helps to determine the criticality and vulnerability of components in the system, resulting in a list that has critical asset information.

Game theory [137][138][139]
• Game theory in cybersecurity models and analyses the behaviour of attackers and defenders in different scenarios. It helps to recognise the strategies and motives of attackers and to develop effective defensive measures.
• Game theory is used to analyse the situation of two or more participants depending on how the context is listed.
• The outcome is a useful resource to minimise risk for an organisation and, in contrast, raise the danger for an adversary.

Bayesian network (BN) [140,141]
• A conditional probabilistic method that relies on a piece of evidence to validate whether a specific scenario is possibly true, fully true or false.
• Unlike PRA, this technique can consider the context of cyber threats.
• The main difficulty of BN is that it needs evidence to calculate its belief, but obtaining real-time evidence in a CPS is quite difficult.

Table 5 also maps the listed methodologies to the abovementioned requirements in Figure 6. To secure and identify critical assets in power systems, the North American Electric Reliability Corporation (NERC) proposed a set of Critical Infrastructure Protection (CIP) standards that apply specifically to cybersecurity. The main aim of these standards is to legalise, implement, monitor and control the security of power systems [145,146]. NERC CIP consists of nine rules, and one significant rule is CIP-002, critical cyber asset identification. This standard requires identifying critical assets by implementing risk assessment methodologies, and assets are declared critical if any compromise may threaten electric reliability [147]. Table 6 illustrates a survey of several works that researchers have conducted to prioritise and classify assets based on their criticality in the SG. The studies analyse and justify their work from different perspectives, such as business, impact and equipment health. In addition, they validate their proposed methods under different scenarios, such as natural failures, operator misbehaviour and, most importantly, cyberattacks.
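The FTA entry in the technique table notes that the analysis depends on the probability of failure. A minimal sketch of that calculation follows, assuming independent basic events combined through AND/OR gates. The tree structure, the event names and all probabilities are hypothetical values chosen for illustration; they are not taken from any of the cited studies.

```python
from math import prod


def and_gate(probs):
    """Output event occurs only if all inputs occur (independence assumed)."""
    return prod(probs)


def or_gate(probs):
    """Output event occurs if at least one input occurs (independence assumed)."""
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical basic-event probabilities (illustrative only).
p_breaker_fault = 0.01    # equipment failure
p_relay_fault = 0.02      # equipment failure
p_hmi_compromise = 0.05   # malicious event
p_bad_command = 0.10      # malicious event, given access to the HMI

# Malicious branch: the HMI is compromised AND a harmful command is issued.
p_malicious_trip = and_gate([p_hmi_compromise, p_bad_command])

# Top event: loss of a feeder through any of the three branches.
p_top = or_gate([p_breaker_fault, p_relay_fault, p_malicious_trip])

print(f"P(malicious trip) = {p_malicious_trip:.4f}")
print(f"P(top event)      = {p_top:.6f}")
```

Placing a malicious branch next to the equipment-failure branches mirrors the point made in the table that FTA can be linked with attack trees; a real study would replace the independence assumption and the made-up probabilities with evidence from the system under analysis.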

| Lessons learnt
Determining the best technique for asset discovery in CPSs is not straightforward. Asset discovery depends on the characteristics that the organisation defines. For instance, IT asset identification can use both passive and active techniques, but system availability and latency should be considered when dealing with OT assets. Recently, new techniques have surfaced that use ICS protocols to deal with OT, which can be considered promising. As mentioned in Section 3.3, some critical asset identification requirements are criteria and asset weight scores, which should be chosen based on their distinctive characteristics. Studies of consequences and impact on CPSs focus on specific system areas, for example, SCADA, PLCs and HMIs, but there is little focus on the lower levels (physical process), which should be explored in greater detail. Risk techniques have different aims and goals, yet they should be chosen to meet two main goals: (1) obtaining as much threat knowledge as possible and (2) including more scenarios that have not been discovered yet, which can enhance the SA process in these systems.
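The role of criteria and asset weight scores can be illustrated with a short sketch of a weighted criticality score. The criteria names, weights, rating scale (1 to 5) and assets below are all hypothetical; an organisation would define its own according to the characteristics of its system.

```python
def criticality_score(ratings, weights):
    """Weighted average of criterion ratings, normalised by the total weight."""
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


# Hypothetical criteria and weights (illustrative only).
weights = {
    "safety_impact": 0.4,
    "economic_impact": 0.3,
    "exposure": 0.2,
    "recovery_time": 0.1,
}

# Hypothetical assets rated from 1 (low) to 5 (high) on each criterion.
assets = {
    "generator_G1": {"safety_impact": 5, "economic_impact": 5, "exposure": 2, "recovery_time": 5},
    "scada_server": {"safety_impact": 4, "economic_impact": 4, "exposure": 5, "recovery_time": 3},
    "office_printer": {"safety_impact": 1, "economic_impact": 1, "exposure": 3, "recovery_time": 1},
}

scores = {name: criticality_score(r, weights) for name, r in assets.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.2f}")
```

A weighted sum is only one possible scoring guide; the choice of weights dominates the ranking, which is why the criteria should reflect the distinctive characteristics of the system being assessed.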

| MODELLING, SIMULATION AND EVALUATION
Information aggregated from CPSs, such as system components, asset criteria, interdependencies and critical asset identification, needs to be validated in order to evaluate the situation in different scenarios. In the case of the SG, it is impossible to validate proposed risk assessment approaches on real-life systems [169]. A feasible alternative is to apply them to SG simulations that model power from generation to distribution, as in a real SG. An appropriate simulator should consider three aspects: power, cyber and transmission. Table 7 lists the existing simulation tools that can be used to cover the power system, cybersecurity and communication network aspects of the SG. Covering these aspects in one simulator can help represent the SG systems and their communication, physical devices, protocols and control centre systems. SG simulations can help to explore complex attacks similar to real-world ones, for example, the Ukraine power grid cyberattack [4], which helps to recognise the impacts and assess the components that are most likely to be targeted and affected in such a critical system. Therefore, this section lists tools used to simulate power systems, communication networks and control centre systems, and related work that applies attack scenarios and failure analysis to SG simulations in order to validate asset criticality.

Table 5 (continued)

[148] It is used by the Department of Homeland Security for critical infrastructure security and resilience by assessing and analysing critical infrastructure threats, vulnerabilities and consequences.

The European programme for critical infrastructure protection (EPCIP) is an optional methodology that enables information-sharing among the European Union (EU) member states and other critical infrastructure protection (CIP) group participants.

DCIP [150] The US Department of Defence employs the CIP framework to identify, rank and protect critical infrastructure and its areas from terrorist attacks.
✓ ✓

Table 6

[151] The paper discusses the development of three criticality metrics by modelling an attacker's opportunity to compromise several hosts. The work also presents a system design to validate each metric in real-case scenarios.
✓ ✓ ✓ ✓
Limitations: Three novel metrics were presented, yet the focus was only on the attacker's opportunity perspective, with no focus on the lower levels.

[152] The work proposes critical asset classification in Industry 4.0; primarily, the classification is based on the business impact when a cyberattack compromises the system. Additionally, the paper introduces a correlation between critical assets and the business impact for improved decision-making in cybersecurity policies.
✓
Limitations: This work implemented several metrics to evaluate critical assets, but it mainly focussed on potential business impacts.

[153] This paper proposes a critical analysis approach to identify key power lines in power systems for maintenance improvement. Moreover, the work offers automation for the proposed analysis by utilising existing IT solutions in this field.
✓
Limitations: The context of ICS systems was not identified in this work, and the evaluation was conducted on limited nodes for implementation only.

[154] The work offers a long-term health index prediction for power assets using a sequence learning-based method. Furthermore, the method was assessed using actual utility data for validation and asset health prediction.
✓ ✓
Limitations: The main limitation of this work was that it focussed on the health index in energy management systems.

[155] The work illustrates different approaches that can be used to identify critical information assets and communication network components. In addition, the paper delves into cascading effects when communication network assets in critical infrastructure are targeted.
✓
Limitations: There was no application of this method nor a focus on a specific zone.

[156] This study proposes an ICS security assessment framework based on open-source intelligence. The work consists of three stages: data collection, assessment and ranking of critical components using qualitative and quantitative metrics.
Limitations: This work aimed to build an ICS ranking tool, but it was too general for CPSs, making it hard to apply at the process level since different CPSs have different factors.

[157] This work offers a multiple attribute decision-making (MADM-based) ranking algorithm that can be used for critical asset identification and ranking. Another contribution is the multiple vulnerability node rank, which uses vulnerability information for cybersecurity classification.
Limitations: There was little focus on the physical/process level.

[158] The work offers an analytics approach for modelling asset health and network reliability by predicting the remaining life cycle and the ageing of assets. The information used for this approach is obtained from different sources, such as SCADA, geographic information systems and outage management systems.
✓ ✓
Limitations: This work covered lower levels of the SG, yet the main focus of the classification was on predicting the ageing of assets.

[159] The paper utilises asset knowledge, such as performance, location and functionality, to predict the trend of the impact of cybersecurity incidents. In addition, the paper examines both component and system levels to determine the propagation of an impact when the system is compromised.
✓ ✓
Limitations: The evaluation of the proposed method was conducted on a chemical control system, and the impact is quantified based on location and business only.

[160] The work provides an implementation of a proposed approach for asset management in electrical systems. The case was applied to determine critical power transformers in three stages: a) modelling and estimating temperatures in transformers in the short term, b) estimating the health condition for the medium term and c) estimating the remaining life of the power transformers.
✓
Limitations: The focus of the proposed framework was mainly on asset maintenance performance and contingency analysis. There was no consideration of cyber threats or risk assessment.

[161] This work proposes a framework based on the OCTAVE Allegro method to rank critical information assets. The framework utilises several decision-support methods, such as simple additive weighting and the analytic hierarchy process, to prioritise risks targeting information assets.
✓ ✓
Limitations: This work aimed to classify critical information assets, and there was no focus on physical devices or lower levels of CPSs. There was no evaluation of the proposed method.

[162] The work implements the cyber attack impact assessment technique to evaluate the impact of different cyber threats. Additionally, the paper provides numerical results for the implemented approach obtained using a chemical process simulation.
Limitations: The main aim was limited to the closed-loop process control system. In addition, the evaluation was performed on the chemical process model.

[163] This paper provides a solution for identifying key cyber terrain assets. The proposed approach explores the dependency degrees among tasks and assets by building a connection between the operational network and the asset vulnerabilities.
✓
Limitations: This work's main aim was only on cyber assets, and there was no information about network security. Also, the work lacked the identification of physical layer assets or an analysis of their impact.

[164] The work provides an intuitive approach to identifying critical digital assets (CDAs) inside the nuclear reactor domain. The approach was conducted using three different implementations: 1) identifying CDAs with an attack graph tool, 2) identifying CDAs with a purpose-built programme and 3) identifying CDAs with a modified attack graph.
✓ ✓ ✓
Limitations: This work focussed on the automatic identification of critical assets. Still, one limitation was a need for manual evaluation, and physical layer assets were not covered. Another limitation of the proposed method was that it could not discover assets in unconnected or isolated networks.

[165] This paper focusses on measuring the sensitivity levels of enterprise assets using enterprise information security management. The measurement process is divided into two stages. Firstly, for data assets, the measurement is based on the sensitivity of the data. Secondly, for non-data assets, the measurement is based on their usage patterns and the attributes of users.
✓
Limitations: The main aim of this work was the sensitivity of the data an asset holds, and the data classification was applied to IT assets only. There was no implementation for CPSs or other asset criteria.

[166] The work investigates grid-forming inverters by integrating high levels of renewable energy and distributed energy resources in the power system. This integration can reduce the physical and electrical distance between generation and loads in power systems.
✓
Limitations: The asset criticality focussed on the physical layer and the interactions between inverter-based sources and the bulk-power system. However, it lacked other OT layers, and there was no information about cybersecurity threats.

[167] This paper proposes an enhanced cybersecurity risk management (CSRM) for asset criticality, threat prediction and evaluating existing controls. In addition, the paper utilises different approaches for the developed CSRM, including fuzzy set theory for asset classification, machine learning for risk prediction and the comprehensive assessment model for evaluating existing controls.
✓ ✓ ✓
Limitations: This work was conducted on a limited range of assets, and there was less focus on physical layer assets. Key performance indicators were defined but on a general basis (availability, confidentiality, integrity, accountability and conformance).

[168] The work aims to present a novel asset-focussed risk management approach for critical infrastructure, with a main focus on asset interdependence and cascading effects. The implementation is conducted on a running example from an SG system to validate the approach.
✓ ✓
Limitations: There was less focus on physical layer assets. Asset criteria were defined on a broad scale. The implementation was on limited assets, with no damage analysis.

| Electric power simulators
This subsection presents power simulation tools that model power generation, transmission and distribution, similar to the real-life power grid.
• PowerWorld: A commercial power system simulation tool that provides a comprehensive simulation of generation, transmission and distribution. Furthermore, PowerWorld gives the user different options, for example, choosing fuel types for generators and specifying the maximum and minimum power transmitted, which can be used to mimic real-life scenarios for the SG. PowerWorld can be integrated with other tools using SimAuto, making the case accessible through MATLAB, Python and Visual Basic [170][171][172].
• PSS: A commercial power system simulator (PSS) tool developed by Siemens that includes different solutions that can be integrated with the simulator. One solution is PSS SINCAL, which allows communications to be established to achieve transmission and distribution to the grid [173][174][175].
• MATPOWER: An open-source power simulation tool written in MATLAB. The tool allows the user to solve and analyse steady-state power system and optimisation problems such as Power Flow (PF), Continuation Power Flow, extensible Optimal Power Flow (OPF) and Unit Commitment (UC), along with stochastic, secure multi-interval OPF/UC. Additionally, this tool can be integrated with other tools using PYPOWER [176][177][178].
• DigSilent-Powerfactory: A commercial power simulator tool used for power generation, distribution and transmission. Similar to PSS and PowerWorld, DigSilent provides a comprehensive simulation for the SG and can be integrated with other tools, such as MATLAB; however, hardware needs to be deployed within the simulations [179][180][181].
• PSCAD/EMTDC: A commercial power system simulation tool that simulates power from generation to distribution, giving the user the ability to analyse a power system comprehensively. This tool can be integrated with other tools, such as MATLAB, for research/experiment purposes [182][183][184].
• Power System Simulation Toolbox: An open-source tool developed by [185] and written in Python that is used for Agent-Based Modelling of Electricity Systems.
• Homer: A commercial tool that simulates power generation only; it focusses on simulating energy generation resources, for example, wind and solar power. Unlike other tools, Homer cannot be used for other power system domains such as distribution and transmission [188].
• PandaPower: A Python-based tool used for power system analysis that provides PF, OPF, state estimation, topological graph searches and short-circuit calculations.
In addition, other power simulators, such as Modelica [190], ObjectStab [191], EuroStag SmartFlow [192], EMTP-RV [193] and Positive Sequence Load Flow (PSLF) [194], are available and are widely used by researchers to assess power reliability and security. MATLAB/Simulink can be used to create SG simulations, but one disadvantage is that it cannot produce measurements as realistic as those of dedicated power simulators.
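Several of the tools above solve the power flow (PF) problem. To indicate what such a calculation involves, the following is a minimal sketch of the linearised (DC) power flow in plain Python. It does not use the API of any of the listed simulators, and the three-bus network, line reactances and load values are hypothetical. The DC approximation assumes lossless lines, flat voltage magnitudes and small angle differences, so it is far simpler than the AC analyses these tools provide.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def dc_power_flow(n_bus, lines, injections, slack=0):
    """lines: (from_bus, to_bus, reactance_pu); injections: bus -> P in pu."""
    others = [b for b in range(n_bus) if b != slack]
    idx = {b: i for i, b in enumerate(others)}
    # Reduced susceptance matrix (slack bus row and column removed).
    b_red = [[0.0] * len(others) for _ in others]
    for f, t, x in lines:
        y = 1.0 / x
        for a, b in ((f, t), (t, f)):
            if a != slack:
                b_red[idx[a]][idx[a]] += y
                if b != slack:
                    b_red[idx[a]][idx[b]] -= y
    theta_red = solve_linear(b_red, [injections.get(b, 0.0) for b in others])
    theta = [0.0] * n_bus  # the slack bus angle stays at 0 rad
    for b in others:
        theta[b] = theta_red[idx[b]]
    flows = {(f, t): (theta[f] - theta[t]) / x for f, t, x in lines}
    return theta, flows


# Hypothetical 3-bus system: bus 0 is the slack (generator), buses 1 and 2 are loads.
lines = [(0, 1, 0.1), (0, 2, 0.1), (1, 2, 0.1)]
injections = {1: -0.6, 2: -0.4}  # negative injection = load, in per unit
theta, flows = dc_power_flow(3, lines, injections)

for (f, t), p in flows.items():
    print(f"line {f}-{t}: {p:+.4f} pu")
```

Removing a line from `lines` and re-running the calculation gives a simple way to see how an outage, whether caused by a failure or by a malicious command, redistributes flow onto the remaining lines.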

| Hardware-in-the-Loop
Solutions that involve hardware, such as OPAL-RT [195][196][197] and the Real-Time Digital Simulator [198], provide a realistic and real-time simulation environment for power systems. Hardware-in-the-Loop brings many advantages, such as creating a real-time virtual environment similar to real CPSs, which allows the user to test large-scale power systems in real time. The main disadvantage of this solution is that researchers consider it to be expensive.

| Co-simulation
It is not easy to create a comprehensive simulation for a specific system. Designing a comprehensive simulation can consume time and money, especially for researchers aiming to analyse a power system for a specific need. One way to avoid that is to combine different available simulations from different levels to establish a power system with telecommunication capabilities for a comprehensive analysis. Co-simulations bring several advantages, allowing researchers to propose novel ideas and solutions that help increase the reliability of energy systems [215]. Additionally, many researchers have proposed their own co-simulation frameworks for different domains. Table 8 provides a survey of available co-simulations, briefly describing each work, the power system and telecommunication simulators used in the study and whether it is cybersecurity-focussed.

| Lessons learnt
The best approach to validate asset criticality, possible threats and equipment failure is to implement them in an integrated simulation. There is little focus on end-to-end comprehensive cyber-physical simulations for analysing impacts and identifying assets. A stand-alone simulator, such as a telecommunication network simulator that represents only the cyber layer of a CPS, is inadequate to assess threats and build SA. In order to validate the consequences of possible threats and build SA for SG CPSs, a proper simulator that covers the cyber, physical and transmission layers should be chosen. Cyberattacks, failures and time delays can arise in cyber layer components, for example, SCADA, IEDs, PLCs and RTUs. Yet, damage is measured and cascading effects are identified in the lower (physical) layers. The transmission layer (connection medium) links the cyber and physical layers: it carries commands that control physical assets via ICS controllers (actuators) and feeds measurements to ICS components using sensors. Moreover, it is important to distinguish between cyber threats that can lead to failures and failures caused by natural events by gathering as much knowledge about these scenarios as possible using risk/failure techniques conducted in CPSs.
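The interaction of the three layers can be illustrated with a toy time-stepped co-simulation. This is a sketch under strong simplifying assumptions rather than a model of any framework surveyed here: the physical layer is a single feeder whose current steps above its limit at a fixed time, the cyber layer is a controller that issues an open command when it sees an overload, and the transmission layer is a fixed delay applied to both measurements and commands. All names and numbers are hypothetical.

```python
from collections import deque


def run_cosim(delay_steps, n_steps=20, limit=100.0):
    """Return the number of time steps the feeder spends overloaded."""
    # Transmission layer: FIFO queues that delay traffic by `delay_steps`.
    measurements = deque([None] * delay_steps)  # sensor -> controller
    commands = deque([None] * delay_steps)      # controller -> actuator
    breaker_closed = True
    overload_steps = 0

    for t in range(n_steps):
        # Physical layer: load current steps from 80 A to 150 A at t = 5.
        if breaker_closed:
            current = 150.0 if t >= 5 else 80.0
        else:
            current = 0.0
        if current > limit:
            overload_steps += 1

        # Uplink: the controller only sees a delayed measurement.
        measurements.append(current)
        seen = measurements.popleft()

        # Cyber layer: protection logic in the controller.
        command = "open" if seen is not None and seen > limit else None

        # Downlink: the actuator only receives a delayed command.
        commands.append(command)
        arrived = commands.popleft()
        if arrived == "open":
            breaker_closed = False

    return overload_steps


for delay in (0, 1, 2, 3):
    print(f"delay = {delay} steps -> overloaded for {run_cosim(delay)} steps")
```

Even in this toy setting, the damage (time spent overloaded) is observed in the physical layer while its cause (added delay) sits in the transmission layer, which is why a stand-alone simulator of a single layer cannot expose the effect.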

| FUTURE DIRECTIONS AND OPEN ISSUES
This section lists key open problems identified while reviewing related studies in this field. It is important to address them promptly, as this field is considered globally critical.

| Scalable data collection
Scalable data collection is one of the biggest issues in these systems. This can be because (1) availability is an important aspect that should not be affected in any way while the system is running, (2) pure OT and legacy system components are installed and serving their purpose without any cyber solutions deployed that could help to counter increasing cyber threats and (3) the heterogeneity between industrial protocols and IT protocols makes communication difficult. Moreover, scalable data collection can be divided into three main categories, each of which has many reasons why collecting data from these time-critical systems is difficult. The first is the lack of an overall asset visibility technique that can be used to identify resources without affecting process continuity. As discussed in Section 3, each available discovery technique has its own characteristics and needs to be applied to the system to collect as much information as needed to build appropriate asset management, which will help reduce and mitigate cyber threats. Second, some industrial protocols communicate in plain text, meaning that any responses received and stored are readable. Furthermore, applying encryption algorithms to these protocols is difficult and can consume time and memory resources, which are limited in industrial components. Therefore, there is a need for secure communication links, and the storage of the collected data should prevent unauthorised access by applying the appropriate authentication mechanisms.
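The plain-text issue can be made concrete with a short sketch. Modbus/TCP is used here purely as a well-known example of an industrial protocol without built-in encryption or authentication; the source does not name a specific protocol, and the transaction, unit and register values are hypothetical. The sketch builds a "read holding registers" request and then decodes it the way any party observing the traffic could.

```python
import struct


def build_read_holding_registers(transaction_id, unit_id, start_addr, quantity):
    """Build a Modbus/TCP request frame (MBAP header + PDU), function code 0x03."""
    pdu = struct.pack(">BHH", 0x03, start_addr, quantity)
    # MBAP header: transaction id, protocol id (0), length of what follows, unit id.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def parse_request(frame):
    """Decode the frame without any key or credential, as an eavesdropper could."""
    transaction_id, protocol_id, length, unit_id = struct.unpack(">HHHB", frame[:7])
    function_code, start_addr, quantity = struct.unpack(">BHH", frame[7:12])
    return {
        "transaction_id": transaction_id,
        "protocol_id": protocol_id,
        "length": length,
        "unit_id": unit_id,
        "function_code": function_code,
        "start_addr": start_addr,
        "quantity": quantity,
    }


# Hypothetical request: read 10 registers starting at address 0 from unit 1.
frame = build_read_holding_registers(transaction_id=1, unit_id=1, start_addr=0, quantity=10)
print(frame.hex())
print(parse_request(frame))
```

Every field of the request is recoverable from the raw bytes, which is the property that makes both passive asset discovery and eavesdropping possible, and it shows why collected responses need protected links and storage.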

| Adapting new techniques
New techniques (such as machine learning) for predicting specific risk types can help different industries rank threats and prioritise assets based on criticality, risk level and calculation. Moreover, threats in CPSs are becoming more advanced, with new techniques and resources that can expose many vulnerabilities in OT components. Additionally, risk and threat levels are continuously changing. Tracking them is important because predicting and ranking threats can help businesses make appropriate and informed decisions, since each business's aims and field can differ. Therefore, advanced techniques for detecting threat patterns and accurately measuring risk type and level are needed to improve CPS security, using several input types, such as skills, motive, location, techniques and asset resources. Using the collected information in machine learning classifiers provides a prediction of risk, which will help organisations apply security mitigation in advance and improve their incident response.
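As a minimal sketch of the idea of feeding collected information into a classifier, the following uses a k-nearest-neighbour vote written in plain Python. The feature names, the training samples and their labels are synthetic and invented for illustration only; they do not come from any dataset or study cited here, and a practical system would use validated data and a properly evaluated model.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Predict a risk label by majority vote among the k nearest samples."""
    nearest = sorted(train, key=lambda row: dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Synthetic samples: (attacker_skill, asset_exposure, open_vulnerabilities),
# each scaled to the range 0..1, paired with a hypothetical risk label.
train = [
    ((0.9, 0.8, 0.7), "high"),
    ((0.8, 0.9, 0.9), "high"),
    ((0.7, 0.7, 0.8), "high"),
    ((0.2, 0.3, 0.1), "low"),
    ((0.1, 0.2, 0.2), "low"),
    ((0.3, 0.1, 0.3), "low"),
]

prediction = knn_predict(train, (0.85, 0.75, 0.8))
print(prediction)
```

The inputs mirror some of the input types listed above (skills, exposure of asset resources); the value of such a model depends entirely on the quality of the collected information.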

| Heterogeneity between operational technology and information technology
Another open issue is the heterogeneity between OT and IT from different aspects. Communication between their components is needed to aggregate information; however, enabling such communication and translation can introduce vulnerabilities and breaches. As mentioned, industrial control protocols were designed to execute field procedures without considering security aspects. Recently, new industrial protocols have been deployed with security features, but they are mostly incompatible with legacy devices. In addition, physical devices and their operating systems can be vulnerable to cybersecurity threats since most physical devices are outdated or have limited computational capacity and insufficient memory, making it difficult to apply security measures. Furthermore, specific security solutions need to be implemented here. It is difficult to apply traditional security solutions, such as Intrusion Detection Systems and encryption methods, to these systems due to the specific requirements of their components and the new sophisticated attacks that have occurred in recent years. Therefore, security mechanisms must be combined with CPSs while considering the requirements of IT/OT components and enhancing the overall security technology in this field.

| Lack of focus on cascading effects
The cascading effects of cybersecurity risk on interconnected components are another issue that has been raised in CPSs. Cascading effects can be considered one of the most complicated issues in CPS safety. They can be caused by many factors, including natural disasters and physical, technical or human errors, which can initiate a sequence of serious events affecting a system, an entire city or something much bigger. Understanding and exploring interdependencies between systems and analysing the cascading effects on a targeted asset (in case a cyber threat is successful) can help prepare an organisation to make firm decisions to avoid massive disasters.
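A simple way to explore interdependencies is to propagate a failure through a dependency graph. The sketch below uses a breadth-first search under a deliberately pessimistic assumption: an asset fails as soon as any asset it depends on fails. The assets and their dependencies are hypothetical.

```python
from collections import deque


def cascade(dependents, initial_failure):
    """Return the assets that fail, in breadth-first order, after one failure.

    `dependents` maps an asset to the assets that depend on it.
    """
    failed = [initial_failure]
    seen = {initial_failure}
    queue = deque([initial_failure])
    while queue:
        asset = queue.popleft()
        for dependent in dependents.get(asset, []):
            if dependent not in seen:
                seen.add(dependent)
                failed.append(dependent)
                queue.append(dependent)
    return failed


# Hypothetical dependency graph: key -> assets that rely on the key.
dependents = {
    "scada_server": ["rtu_1", "rtu_2"],
    "rtu_1": ["breaker_A"],
    "rtu_2": ["breaker_B"],
    "breaker_A": ["feeder_1"],
    "breaker_B": ["feeder_1"],
}

all_assets = set(dependents) | {d for ds in dependents.values() for d in ds}
# Impact of an asset = number of other assets lost if it is compromised.
impact = {a: len(cascade(dependents, a)) - 1 for a in all_assets}

for asset in sorted(impact, key=lambda a: (-impact[a], a)):
    print(f"{asset}: {impact[asset]} downstream assets affected")
```

Ranking assets by the size of the cascade they trigger gives one input to criticality evaluation; a more realistic model would account for redundancy, for example a feeder that survives while at least one of its breakers remains operational.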

| Evaluating and identifying critical assets
Finally, evaluating and classifying assets is an issue that must be addressed in CPSs, whether by the damage that can be caused, the cost or the asset's capability to achieve a business goal. Assets are not only the IT components in the enterprise network. In a CPS, assets include hardware/software components run at the operational level (SCADA, PLC, RTU, IED, etc.), humans (operators, maintainers, engineers), data collected from the system, which can be significant for analysis, network communication and physical assets (in the case of the SG, generators, transformers, transmission lines, etc.). Furthermore, evaluating the criticality of physical assets when cyber threats arise is increasingly important, and an industry-specific framework should exist since each industry runs different physical devices. Physical devices in the SG are not the same as those in water treatment, oil and gas or nuclear systems. Criticality evaluation can consider the type of connection, operator management, lifespan and health, and potential physical damage. Having comprehensive information from asset visibility can help build specifically designed and appropriate asset management, which enhances the SA for these systems. Combining all this aggregated information will allow it to be used in advanced alerting capabilities, addressing the need for an alerting system that combines possible cyber threats with failure analysis for all types of equipment and devices.

| CONCLUSION
In this study, the stages of monitoring SA of the SG have been surveyed in terms of the following facets. First, we discussed the different approaches that can be used to identify IT and OT assets, listing their limitations and the tools applied in this field. Then, we provided a detailed analysis of the studies, frameworks, methodologies and risk techniques that can be used for critical asset identification and evaluation in power systems. Afterwards, we presented an outline of the methods used to evaluate the consequences on the cyber and physical system to emphasise the effects of failing to secure SG assets. Finally, the open issues and future directions for monitoring SA and critical asset identification were summarised at the end of this study.
In this review, we have noticed that security solutions, along with critical asset identification, were mainly focussed on IT assets, and currently, there is little focus on physical OT assets. Gathering complete information about OT assets is essential for monitoring power system SA. Nevertheless, any discovery techniques/tools implemented need to be chosen in a way that does not affect the system's functionality, as OT assets are considered time-critical and cannot afford disruption to their main tasks. Moreover, integrating sophisticated techniques such as machine learning can help predict specific risk types and keep up with the continuous change of existing cyberattacks.