A network perspective on assessing system architectures: Robustness to cascading failure

Despite a wealth of system architecture frameworks and methodologies available, approaches to evaluate the robustness and resiliency of architectures for complex systems or systems of systems are few in number. As a result, system architects may turn to graph‐theoretic methods to assess architecture robustness and vulnerability to cascading failure. Here, we explore the application of such methods to the analysis of two real‐world system architectures (a military communications system and a search and rescue system). Both architectures are found to be relatively robust to random vertex removal but more vulnerable to targeted vertex removal. Hardening strategies for limiting the extent of cascading failure are demonstrated to have varying degrees of effectiveness. However, in taking a network perspective on architecture robustness and susceptibility to cascade failure, we find several significant challenges that impede the straightforward use of graph‐theoretic methods. Most fundamentally, the conceptualization of failure dynamics across heterogeneous architectural entities requires considerable further investigation.

are located is less well assured, as exemplified by the direct and indirect impact of the 2017 Grenfell Tower fire in London. The resulting Public Inquiry "Phase 1 Report" details that the principal reason the fire spread was the aluminium composite cladding filled with plastic used on the building exterior, and that the London Fire Brigade suffered "...significant systemic and operational failings revealed by the evidence." 14 A further example that suggests further work is required to promote consideration of system robustness within the systems engineering community can be found from the recent (2017) crash of a Watchkeeper Unmanned Air Vehicle into the sea in Wales, UK. The subsequent Service Inquiry by the UK Defence Safety Authority concluded that the incident (an aerodynamic stall) was a result of pitot tube blockages leading to inaccurate air speed reporting, citing a "lack of system robustness testing" as an organizational influence that was a contributing factor. 15 While systems can be designed with resilience engineering principles to try to minimize the likelihood and severity of such incidents, eg, physical or functional redundancy, fail-safe principles, layered defense, and conducting extensive failure mode and effects analysis, it seems that for complex systems or SoS, further effort is still required to ensure system resilience. 12,16 Despite several architecture frameworks utilized by industry seeking to establish common practices for analyzing architectures within a particular domain, [17][18][19][20][21] there is only a relatively sparse literature concerning approaches to evaluating architecture robustness and resilience. 16,[22][23][24][25][26] Several previous studies have analyzed network representations of system architectures, sometimes described as social network analysis, to support their evaluation. 
[27][28][29][30][31][32] This paper builds on previous work, [33][34][35] in seeking to explore the robustness and susceptibility to cascading failure of candidate system architectures from a network perspective, in order to assess the extent to which taking a network perspective on SoS architectures can assist architecture evaluation by helping determine whether one candidate architecture is more robust or resilient than another. Such an approach may address notions of dynamic complexity, such as: "what is the effect of the removal of some of these entities, whether in the form of a single failure or a cascade of failures?" In this study, we make use of two real-world enterprise architectures originally created and validated by Thales, and chosen as representative of real-world SoS architectures featuring a diversity of entities and relationships. For further details about the use cases, the interested reader is directed to Ref. 33. The first use case is a Search and Rescue (SAR) NATO Architecture Framework (NAF)-based architecture, developed by Thales in order to inform systems architecture training and help the development of NAF v4. 36 The SAR architecture was produced in a common commercial enterprise architecture software package, created by following the NAF v4 eight architecting stages. The architecture includes the architecture products corresponding to NAF viewpoints described in NAF v4 (corresponding to the Subjects of Concern and Aspects of Concern in NAF v4 parlance) 36  interoperate. This architecture was created with the same commonly used commercial enterprise architecture software package as the SAR architecture. The architecture was created as part of concept development work while developing a bid for a particular client organization. 
The architecture includes the architecture products corresponding to MODAF viewpoints for the "as-is" architecture of the client's solutions, along with architecture products corresponding to the "to-be" architecture of the proposed solution. As is common practice, not all architecture viewpoints in MODAF had an architecture product created as the selection of viewpoints is tailored to support specific organizational objectives (in this case, supporting concept development and the refinement of bid response activity).
Here, we follow previous studies [33][34][35]  This paper makes three primary contributions to the emerging literature describing the application of network science tools to system architecting. First, we assess the robustness of two real-world architectures to vertex removal using two standard centrality metrics as measures of network viability. Second, we evaluate each architecture's susceptibility to cascading failure using a simple threshold model, before exploring the effects of two hardening strategies for limiting the extent of this cascading failure. Finally, we assess the strengths and weaknesses of network analysis techniques for the evaluation of system architectures in general, highlighting the challenges that currently limit their utility, and recommend guiding principles for architecture robustness evaluation in light of these challenges.
The next section briefly reviews relevant literature and defines the terms robustness and resilience for the purposes of this paper. A synthetic motivating example is then introduced, and revisited later, to provide context for the approach taken in this paper. The theory supporting the assessment of network robustness is presented. The methodology used to assess architecture response to perturbation and susceptibility to cascades is then detailed before the results of the assessments are presented. The discussion then turns to the challenges of adopting a network perspective to evaluate the robustness of a complex SoS architecture, highlighting the areas where care must be taken, before conclusions are drawn.

Literature review
Some authors argue that an SoS is not fit for purpose when it cannot transfer materials, energy, or information in a timely, correct, cost-effective way and encourage designers to consider these transfers as areas to focus on, whether in terms of opportunity enhancement or risk reduction. 22 However, for a complex SoS with considerable scope, how can an organization refine their focus to something tractable? Furthermore, what makes one SoS configuration more robust or resilient than another?
Designers are concerned with the ability of their designed systems to cope with perturbation and change. Several related concepts are often invoked in discussing this issue. System robustness, for example, can be characterized as the ability of a system to withstand perturbation, that is, its ability to remain unchanged in the face of some assault or change. Resilience, by contrast, is the ability of a system to recover from some degree of failure. However, this relatively clean distinction is difficult to maintain in practice, 23,26 where robustly resisting perturbation may always involve some microscale reorganization or refreshment and recovering from a failure may or may not return the system to precisely its original functional state. To some extent the character of a system's response to perturbation is observer-relative, depending on subjective factors such as the timescale of interest, the granularity of the analysis, etc. 37,38 Here, we will use the term resiliency to include the three most important aspects of a system's ability to cope with perturbation: robustness, how much damage a system can withstand before it fails to function; recovery, the ability of the system to repair or recuperate within some resource constraints; and adaptability, the ability of the system to effectively change over time so that its likelihood of success does not decrease. 37 Similarly, resilience has been defined as "the ability to prepare and plan for, absorb or mitigate, recover from, or more successfully adapt to actual or potential adverse events" (emphasis added). 9,12,39 The same aspects of resilience are presented by Madni and Jackson, and Goerger et al, suggesting notions of anticipation (prepare and plan for, avoid), resistance (absorb, withstand), respond and adapt to, and recover from. 12,40 There are, however, a wealth of qualities with which to describe resiliency, and the interested reader is directed to Ref. 16 for a more detailed review. 
While other aspects of resilience may also include: quality, agility, repairability, extensibility, flexibility, and versatility, here we focus on the three aforementioned aspects of robustness, recovery, and adaptability. 16,41 For this paper, resiliency can be described as: "a system's ability to adjust its activity to retain its basic functionality when errors, failures, and environmental changes occur." 42 While system resilience has been argued to be an emergent property that cannot be measured, 9 evaluating system robustness may provide a quantifiable means to support resilience engineering. The Engineering Systems community argues that the architecture of a system is a key contributor to functional behavior and desirable properties, such as resilience and robustness. 5,6 Approaches to evaluate the resiliency of a system, early in its lifecycle (ie, without probabilistic calculations of component reliability) are relatively sparse.
Design strategies for improved system robustness have been proposed, such as "Relax a constraint limit on an uncoupled control factor" or "Create two distinct operating modes for two different demand conditions" and may aid the design of predominately mechanical systems (eg, a paper feeder in a printer), but may not be straightforward to implement for complex engineered systems (eg, an air-traffic management system). 43 Similarly, "design principles" for resilient enterprise information systems have been proposed by Zhang and Lin, based on derived axioms from literature in ecology, which may usefully prompt system designers, but whose utility for organizations involved in the design of complex SoS may be limited (eg, see "A resilient system should be designed to have a certain degree of redundancy, preferably functional redundancy. The more redundancy the system has, the higher the degree of resilience."). 44 The "SoS Architecting with Ilities (SAI)" method encourages early phases of SoS design to explicitly include design options: so-called change options that change the design of the SoS in order to respond to a perturbation, or so-called resistance options that resist perturbation-imposed changes in the design of the SoS. 25 Deliberate design emphasis on ensuring robustness and resiliency seems appropriate. However, the qualitative generation, evaluation, analysis, and trade-off of a diverse and ever-growing set of "ilities" and "options" for architecture alternatives places a considerable resource burden on design organizations. Similarly, the following techniques are encouraged to ensure a design organization arrives at a resilient design: "Developing applicable and realizable resilience heuristics-to inform and guide resilient system design" and "Developing appropriate resilience metrics-to evaluate candidate resilience strategies." 
40  Some authors have used a network model of an SoS to examine failure propagation in order to evaluate the effects of disruptions, eg, the effect of local airport disruptions on national commercial air travel, or network models of SoS functions to examine critical functional dependencies between constituent systems. 16 Often these approaches require considerable complexity in the real SoS to be abstracted away in order to define a mathematical model of a network; eg, creating "scale-free" or "exponential" network topologies, upon which resiliency can be evaluated by considering the impact of vertex removal on message propagation through the network. Again, however, in real-world SoS architectures such topologies are not necessarily present and there is an assumption of architectural entity homogeneity (architectural entities are understood as differing only in so far as they occupy a different location in the network). 32 These network models can struggle to cope with the heterogeneity of a complex SoS, even when they employ more sophisticated "multilayer" network models, or adopt a "narrower" modeling choice in the case of examining functional dependencies. 16,32 The current paper seeks to utilize a network perspective (or social network analysis) on failure propagation and the identification of influential entities in terms of their contribution to robustness and failure cascades, but applied to complex SoS enterprise architectures modeled as networks. In a departure from other methods, we start with a network model of a complex real-world SoS created directly from the enterprise architecture. 33 Furthermore, we evaluate the robustness of an architecture in terms of its susceptibility to vertex removal measured in terms of the impact on average network centrality as a proxy for architecture viability. We also evaluate architecture robustness in terms of susceptibility to cascading failure using a load shedding model. 
This approach may be useful for system architects undertaking trade-off studies when evaluating competing architectures. Furthermore, the approach may help ensure subsequent designs are more robust and resilient by adopting protection strategies for the entities which, when removed, have the greatest effect on architecture viability or which are the greatest contributors to cascading failure.
While this approach offers some interesting insights, there are several significant conceptual hurdles that we wish to highlight so that system architects attempting to make use of these techniques can do so with "eyes open" to the challenges they face. completed an initial design to meet the customer requirements. The systems engineering manager notes that their system operates within a broader, complex, maritime SAR SoS with challenges of autonomy, diversity, operational, and managerial independence, among other challenges, and instigates the creation of an Enterprise Architecture, using a common Architecture Framework to guide their activity.

Motivating example
The customer and supplying organization may both be concerned with the robustness of the architecture. Is one candidate architecture more robust than another? Are some architectural entities more important in the sense that their removal impacts overall SoS effectiveness (whether the loss of a single entity is considered or the triggering of a cascade through the architecture)? Can hardening or protection strategies be put in place to improve architecture robustness? And, to aid an understanding of the overall SoS, where does responsibility for these important entities reside?
While organizations likely have processes and guidelines to support architecture evaluation, 3 and they may have approaches to promote design principles that encourage resilience, 25,40,43 they may not have enough information early in the system lifecycle to effectively utilize these (eg, a lack of data to support Failure Mode and Effects Analysis (FMEA)), or they may have an enterprise architecture that is more diverse than a product architecture (eg, approaches that are based on design structure matrices 24,48 ), and more diverse than approaches that treat an entire SoS as a network. 28,29,31 System architects within an organization can instead represent their enterprise architecture as a network, and ask if one architecture is more robust to architectural entity changes or removal than another, and is therefore more desirable than another? For one particular architecture, which architectural entities are important in the sense that their removal affects network viability, or triggers cascading failure, ergo affecting SoS architecture effectiveness? In particular for a complex SoS, where do these entities exist: are they part of a prime organization's remit of control or are they external and thus a source of technical or operational risk? 11,46 This research attempts to provide system architects with an approach to support such evaluations.

APPROACHES
This section introduces and defines graph-theoretic approaches to exploring network robustness. The selected approaches are later applied to evaluate the robustness of network representations of complex system architectures.

Network perturbation
Large complex networks, such as the World Wide Web, the Internet, metabolic networks, etc, can be considered to rely on their continued connectivity for their effective operation. In such networks, the removal of vertices (nodes) interferes with paths between pairs of vertices until the network becomes largely disconnected and unable to effectively function. 52,53 Some of these networks have strongly skewed degree distributions, with many low-degree nodes and very few high-degree nodes, or even have degree distributions that are approximated by power laws, eg, some technological or social media networks. As a consequence, they tend to exhibit high robustness to the random removal of vertices but have significant vulnerability to the targeted removal of the highest degree vertices. 54,55 Similar results have been shown for social and biological networks, eg, email networks 56 and metabolic networks. 57 If systems architects could utilize similar analyses they could provide an evaluation of a complex SoS architecture's robustness and consider potential recovery and adaptation strategies to ensure overall resiliency, considerations which may influence the selection of a candidate architecture or inform design decisions.
There are some challenges in applying such approaches to complex SoS architectures, however, which are introduced here but revisited in more detail in Section 5. The first is that in complex network research the removal of a vertex has a very natural meaning in context. For computer networks, the loss of a vertex models the failure of a system component. In epidemiology, the loss of a vertex models the death of an individual or them gaining immunity to the disease being modeled.
In graphical models of complex SoS architecture, however, finding a meaningful abstraction that corresponds to vertex removal is difficult, given the heterogeneity of system entities modeled: people, teams, services, physical systems, software, and communications carriers. to be removed, to correspond to a less diverse notion of "failure." A potential solution, and the one used in this manuscript, is that the removal of a vertex is taken to correspond to the failure of a system entity to cope with some change. As the graphical model represents a diverse range of system entities and their interactions, the notion of change employed here must also capture some diversity: in the context of a complex SoS architecture, a relevant change might correspond to an alteration to the operational status of a system (a system becomes unserviceable for example), a change in environmental conditions, a change in doctrine or concepts for systems or the overall SoS, etc. In this way, the removal of a vertex corresponds to a change, internal or external, that affects the constituent entities of the SoS to such an extent that the entity is no longer effective.
One approach to characterizing robustness is to consider the fraction of vertices in the largest component of the network after it has suffered a perturbation of some kind. 38
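As a minimal sketch of this measure, the following Python fragment computes the fraction of vertices that remain in the largest connected component after a set of vertices is removed. The toy graph, the function name, and the choice to treat edges as undirected and to normalize by the original vertex count are all assumptions made for illustration; they are not taken from the architectures analyzed in this paper.

```python
from collections import deque

def largest_component_fraction(adj, removed):
    """Fraction of the ORIGINAL vertices left in the largest connected
    component once the vertices in `removed` are deleted.
    `adj` maps each vertex to the set of its (undirected) neighbors."""
    alive = set(adj) - set(removed)
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        # Breadth-first search over surviving vertices only.
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w in alive and w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best / len(adj)

# Hypothetical toy network: hub 0 linked to vertices 1..5, plus one edge 1-2.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 2)]
adj = {v: set() for v in range(6)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

hub = max(adj, key=lambda v: len(adj[v]))           # highest-degree vertex
targeted = largest_component_fraction(adj, [hub])   # targeted removal
peripheral = largest_component_fraction(adj, [5])   # removal of a leaf
```

In this toy case, removing the hub fragments the network far more than removing a peripheral vertex, which is the qualitative contrast between targeted and random removal described above.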

Cascading failure
Given the interconnectedness of a complex SoS it is appropriate to assess the potential for "avalanches" of failure that spread across an architecture when perturbed. Models from epidemiology (eg, Susceptible Infectious Removed (SIR) or Susceptible Infectious Susceptible (SIS) models 52,61 ) are used to assess the spread of disease over a network (or the spread of computer viruses over computer networks 56 ).
Although such models have been widely studied, they become less tractable, and hence less suitable, for highly heterogeneous systems such as the SoS architectures being considered here.
Instead, an approach that explores cascading changes in terms of "load shedding" could be more suitable. In electrical power network models, the removal of a vertex could correspond to the failure of a component causing its electrical load to be passed on to its neighbors.
The same idea can be considered for a complex SoS architecture, where the removal of a vertex corresponds to some change that has rendered that entity ineffective. The neighboring entities (in the graphical model) are then required to cope with the demand originally placed on the entity, or the lack of support from that entity in carrying out their own function.  63 Consequently, since developing detailed models of dynamic change processes for an SoS architecture may not be feasible at an early or predesign stage, the focus for a system architect may perhaps best be placed on understanding the general susceptibility and impact of cascading changes for a class of architecture, in order to support architecture selection decisions or inform system designs with mitigation strategies in mind. in directed graphs as considered here for closeness centrality, and we will return to these challenges in Section 5. 33 There is no consensus on the most suitable network measure of importance or influence in a network; alternative approaches include eigenvector centrality and characteristic path length. 31 We evaluate the susceptibility to cascading failure of each architecture using a simple threshold model originally presented by Watts. 62 Each vertex, i, in the directed graph, G, is initially labeled "effective" and is given a threshold value, φ_i, specifying the fraction of that vertex's neighbors that must be in state "ineffective" before it is rendered ineffective itself. This threshold value is either a fixed uniform value for all nodes in G, or is drawn from a uniform distribution over a given analysis of the degree to which a network is vulnerable to cascades in general, but also the degree to which particular nodes in the network are more liable to generate significant cascades.
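The threshold rule described above can be sketched in a few lines of Python. This is an illustrative reading of the model rather than the implementation used in this study: the function names are hypothetical, the "neighbors" of a vertex are assumed to be its in-neighbors in the directed graph, and a vertex with no ineffective neighbors is assumed never to fail spontaneously, even if its threshold is zero.

```python
import random

def draw_thresholds(vertices, low, high, rng):
    """Draw one threshold per vertex from a uniform distribution on [low, high]."""
    return {v: rng.uniform(low, high) for v in vertices}

def run_cascade(in_neighbors, thresholds, seed):
    """Return the set of vertices rendered ineffective after `seed` fails.
    A vertex fails once the fraction of its in-neighbors that are
    ineffective reaches its threshold. Because failures are never undone,
    the final set does not depend on the order in which vertices are checked."""
    ineffective = {seed}
    changed = True
    while changed:
        changed = False
        for v, nbrs in in_neighbors.items():
            if v in ineffective or not nbrs:
                continue
            frac = sum(1 for u in nbrs if u in ineffective) / len(nbrs)
            # Assumption: at least one ineffective neighbor is required.
            if frac > 0 and frac >= thresholds[v]:
                ineffective.add(v)
                changed = True
    return ineffective

# Hypothetical toy graph: a -> b -> c -> d, and e -> d.
in_neighbors = {"a": [], "b": ["a"], "c": ["b"], "d": ["c", "e"], "e": []}

uniform = {v: 0.5 for v in in_neighbors}      # fixed uniform threshold
cascade = run_cascade(in_neighbors, uniform, "a")

protected = dict(uniform, d=0.6)              # d now tolerates half its inputs failing
smaller = run_cascade(in_neighbors, protected, "a")

# Thresholds drawn from a uniform distribution, as in the second variant.
sampled = draw_thresholds(in_neighbors, 0.1, 0.9, random.Random(0))
```

Seeding each vertex in turn and recording the size of the resulting set gives a per-vertex picture of which entities are most liable to trigger large cascades.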

Perturbations
Removing the most connected vertices from a network representing a complex SoS architecture has a significant impact on centrality mea- effectiveness. An alternative solution would be to use the average harmonic closeness centrality as a proxy instead of closeness centrality. 33 However, as we will see in Section 5, despite the choice of network measures used, there remain significant hurdles to overcome.
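To illustrate why harmonic closeness is attractive for directed or fragmented graphs, the sketch below computes an average harmonic closeness in which unreachable vertex pairs simply contribute zero. The function names, the use of outgoing shortest paths, and the normalization by n - 1 are assumptions made for this example, and the toy graph is hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from `source` along outgoing edges."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def average_harmonic_closeness(adj):
    """Mean over vertices of sum(1/d(v, u)) / (n - 1); unreachable u add 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for v in adj:
        dist = bfs_distances(adj, v)
        total += sum(1.0 / d for u, d in dist.items() if u != v) / (n - 1)
    return total / n

def remove_vertex(adj, victim):
    """Copy of `adj` without `victim` and without edges pointing to it."""
    return {u: [w for w in nbrs if w != victim]
            for u, nbrs in adj.items() if u != victim}

# Hypothetical directed toy graph: a -> b -> c, with d isolated.
adj = {"a": ["b"], "b": ["c"], "c": [], "d": []}

before = average_harmonic_closeness(adj)
after = average_harmonic_closeness(remove_vertex(adj, "b"))
```

Because the isolated vertex contributes zero rather than an undefined or infinite distance, the measure stays well defined as the graph fragments under vertex removal.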
Correlations between the significance of a vertex and the impact on the network of its removal, calculated using various measures of significance and impact, were examined to see whether it is possible to predict which vertices cause large changes in architecture topology. These correlations were examined for both the original measures calculated from the unperturbed network (Table 1) and for the "live" measures recalculated as the network was perturbed (Table 2).
The lack of significant correlations shown in Table 1 Table 2). The implication of this result is that as a system architecture is perturbed, system architects cannot look to individual architectural entities to try and predict the impact of further degradation. Thus, while the initial connectivity of an architectural entity can serve as an identifier of importance to overall architecture effectiveness, this is not the case as the architecture is perturbed.

Cascades
The cascading failure model seeks to determine the role of local

Hardening against cascades
To what extent can an architect protect their architecture from the cascading failures explored above? Here, a simple hardening strategy, inspired by other research, 38  In order to harden the most critical vertices, this hardening strategy relies on a posteriori knowledge of the architecture and its vulnerability to cascade. An alternative hardening strategy was also considered.
Noting the correlation uncovered earlier between degree and criticality, the 10 highest-degree vertices had their thresholds swapped with the highest thresholds in the network. Unlike the hardening procedure outlined above, this hardening strategy relies only on knowing, and protecting, the most connected entities without any further knowledge of how susceptible the vertices are to cascades.
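A minimal sketch of this degree-based hardening step is given below. The function name and toy values are hypothetical, and the way displaced thresholds are handed back to the donor vertices is an assumption, since only the swap itself is specified above. The sketch preserves the overall set of threshold values and changes only which vertices hold them.

```python
def harden_by_degree(degree, thresholds, k=10):
    """Give the k highest thresholds in the network to the k highest-degree
    vertices, handing the displaced thresholds to the former holders."""
    hardened = dict(thresholds)
    top_degree = sorted(degree, key=degree.get, reverse=True)[:k]
    donors = sorted(thresholds, key=thresholds.get, reverse=True)[:k]
    top_set, donor_set = set(top_degree), set(donors)

    # Thresholds that leave high-degree vertices not already among the donors.
    displaced = [thresholds[v] for v in top_degree if v not in donor_set]
    # Donors that are not high-degree vertices receive the displaced values.
    receivers = [d for d in donors if d not in top_set]

    for v, donor in zip(top_degree, donors):
        hardened[v] = thresholds[donor]
    for d, value in zip(receivers, displaced):
        hardened[d] = value
    return hardened

# Hypothetical toy data: four vertices, hardening the two best connected.
degree = {"a": 5, "b": 3, "c": 1, "d": 1}
thresholds = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 0.7}
hardened = harden_by_degree(degree, thresholds, k=2)
```

Only vertex degree is needed as input, which reflects the point made above that this strategy requires no prior knowledge of cascade susceptibility.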
The results of these simple hardening strategies are shown in Figures 15 and 16 and Table 4. Outcomes of the hardening strategies were compared with the unhardened outcomes using a Mann-Whitney two-sample rank-sum U test to determine if the distributions differed significantly, with the resulting U test statistic and P values provided in Table 4.
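For readers unfamiliar with the test, the following sketch computes a Mann-Whitney U statistic for two samples of cascade sizes, together with an approximate two-sided P value. It is an illustration only: the sample values are invented, the function name is hypothetical, and the P value uses a plain normal approximation with no tie or continuity correction, so it will differ from the output of a statistics package (particularly for small or heavily tied samples). Which of the two possible U statistics is reported in Table 4 is not assumed here.

```python
import math

def mann_whitney_u(x, y):
    """Return (U_x, p) where U_x counts pairs with x_i > y_j (ties count 0.5)
    and p is a two-sided P value from an uncorrected normal approximation."""
    n, m = len(x), len(y)
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    mean = n * m / 2.0
    sd = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u_x - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided tail probability
    return u_x, p

# Invented cascade sizes for a hardened and an unhardened architecture.
hardened_sizes = [1, 2, 3]
unhardened_sizes = [4, 5, 6]
u_stat, p_value = mann_whitney_u(hardened_sizes, unhardened_sizes)

# A tied example: U for the first sample is 2.0 out of a possible 9.
u_tied, _ = mann_whitney_u([1, 2, 3], [2, 3, 4])
```

The two U statistics for a pair of samples always sum to the product of the sample sizes, so either can be recovered from the other.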
The vulnerability of the MComms network to cascading failure is relatively unaffected by either of the hardening strategies when vertex thresholds are drawn from a uniform distribution between 0.1 and 0.9. This is perhaps because the architecture is already robust to cascades for this threshold profile, as shown in Figure 15A and in the first row of Table 4. Consistent with this thinking, both hardening strategies have a more significant impact for the same network when vertex thresholds are drawn from a uniform distribution between 0.0 and 1.0, which is associated with more significant cascades in the unhardened network (Figure 16).
The results suggest that a simple hardening strategy targeted at only the 10 most connected entities can reduce the number of cascades triggered to a similar extent as hardening with prior knowledge

DISCUSSION
We return to the worked example to discuss how the findings can However, at several points throughout this paper, we have raised the issue that taking a network perspective on complex SoS architectures to support architectural robustness analysis confronts significant conceptual and technical challenges and these are considered in more detail here.
Perhaps the most fundamental observation relevant to this discussion is one that can be made whenever a network perspective is taken on any real-world system: "the map is not the territory." An SoS is not a graph, and neither is an SoS architecture. Rather, an SoS is a real-world system, actual or envisaged, and an SoS architecture is a rich set of models, documents, resources, and ideas related to this real-world system, whereas a network or graph is a mathematical While the path length metric is only a proxy for the complicated set of processes and factors that influence the real dynamics of the school population, it has come to be regarded as a useful proxy. This is partly as a result of two factors: (a) the relatively straightforward relationship between path length in a graph and spreading processes in a social network, and (b) considerable effort by the social science community in coming to understand the relationship between social networks as mathematical objects and social networks as real-world systems. 52,53,64,65  Moreover, in order to make direct claims about SoS resilience, the characterization of system failure discussed above must be extended to deal with recovery and repair processes. A binary switch between "effective" and "ineffective" states, as employed in our analyses here, may not be sufficient to capture the dynamics of failure and recovery in a real SoS. In particular, the timescales of recovery in different parts of the system and the mitigating interventions in place may have a significant influence on recovery.
What types of perturbation should an architecture's resilience be evaluated against? Here, we have tended to consider the impact of a sequence of nodes becoming ineffective, but a more realistic characterization might allow for several, potentially correlated entities to become ineffective at the same time. Current studies tend to either consider independent random failures or targeted attacks. In real systems, however, scenarios in which multiple problems co-occur tend to arise as the result of common cause failures, perhaps because assets are co-located in the same geographical region or subject to the same latent vulnerability to an extreme environmental factor. 67 The correlated failure profiles that result are not addressed by modeling either random perturbation or attacks targeting vertices with particular network properties (eg, high degree). Potential solutions include providing a weighting to edges in the network, 33 or considering "interdependent" network models, 38 or adopting an approach cognisant of vertex connectivity (ie, approaches that consider the "fan-in" or "fan-out" between vertices). 31 However, the intelligence upon which to base more relevant or sophisticated models of perturbation may be difficult to collate and interpret, especially at an early design stage, or for SoS contexts that are innovative or first-of-kind in some respect.
Finally, in order to characterize the negative impact of perturbation or the positive impact of recovery, some notion of system viability or performance is required. 68,69 To what extent can an affected SoS be expected to continue to achieve its function? At what point is system effectiveness wholly compromised by structural changes to its graphical representation? Is it sufficient to assume that once the SoS architecture network is fragmented to a certain degree, the associated SoS architecture is ineffective? How would one determine such a threshold a priori? While the size of the largest surviving connected component may be used to characterize the viability of a post-attack graph, and this may be a better measure than simply counting the number of surviving nodes, this approach may not translate smoothly to the context of a complex SoS architecture. It may be the case that the loss of a small number of subcomponents on the "periphery" of an architecture is likely to be less damaging to overall system viability than a loss of the same number of entities from the "core" of the architecture. However, we must be careful not to necessarily equate the structural core of a graph with the functional core of an SoS.
To a first approximation, the issues described above tend to stem from the fact that SoS architectures involve a high degree of heterogeneity in their component entities and the relationships among these entities (see "Guiding Principle 1" 33 ). To the extent that an architecture is relatively homogeneous (eg, it comprises a set of similar components organized in some "flat" configuration), relatively standard network analysis approaches can be applied, although it is still the case that the interpretation of the results of such analyses may require careful thought. In the case of less homogeneous SoS architectures, in order for the desired insights into, say, system robustness to be more amenable to graph-theoretic analysis it may be appropriate to examine a more homogeneous subset of the architecture that can be more readily abstracted as a simple network, eg, the communications architecture within a complex SoS. However, carving up an SoS in this way is problematic as it precludes the kind of holistic analysis that should, ultimately, be the aim of a systems architect working in the context of a complex SoS project. 70

CONCLUSIONS
In this paper, we have taken a network perspective on SoS architecture analysis in order to assess robustness, by examining the effect of vertex removal on the overall architecture topology, and vulnerability to cascading failure. Both analyses may be useful to system architects looking to inform their evaluations of candidate architectures, by considering if one architecture is more robust to perturbation than another. Furthermore, system architects can identify which architectural entities are important, or influential, in the sense that their removal affects network viability either as a consequence of the direct impact of their absence, or the likelihood of their failure triggering a cascade. In particular, for a complex SoS, such an approach may support technical or operational risk identification by identifying particularly vulnerable architectural entities, and determining if they are within or outside of an organization's control.
Both of the architectures analyzed here were found to be relatively robust to random vertex removal but more vulnerable to targeted vertex removal, implying that system architects should pay particular attention to system architectural entities with high connectivity, noting that without the adoption of a network perspective, the full extent of an architectural entity's connectivity may not be known. While hardening strategies for limiting the extent of cascading failure were shown to have varying degrees of effectiveness, results suggested that system architects should look to the most connected architectural entities as areas of particular concern when seeking to mitigate the potential impact of cascading failure.
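The distinction between random and targeted vertex removal can be sketched as follows. This is an illustrative example only: the hub-and-spoke graph, the helper names, the fixed random seed, and the choice to rank vertices once by their initial degree (rather than recomputing after each removal) are all assumptions made for the sketch and do not reproduce the models or data used in this paper.

```python
import random


def largest_component(adj, removed):
    """Size of the largest connected component after deleting `removed` vertices."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        # Depth-first search over surviving vertices reachable from `start`.
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        best = max(best, size)
    return best


def removal_curve(adj, order):
    """Largest-component size after each successive removal in `order`."""
    removed, curve = set(), []
    for vertex in order:
        removed.add(vertex)
        curve.append(largest_component(adj, removed))
    return curve


# Hypothetical architecture: two linked hubs, each serving five leaf entities.
adj = {"H1": {"H2"}, "H2": {"H1"}}
for hub in ("H1", "H2"):
    for i in range(5):
        leaf = f"{hub}-leaf{i}"
        adj[leaf] = {hub}
        adj[hub].add(leaf)

# Targeted removal: the two vertices with the highest initial degree.
targeted = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:2]
# Random removal: two vertices drawn uniformly, seeded for repeatability.
random_order = random.Random(0).sample(sorted(adj), 2)

targeted_curve = removal_curve(adj, targeted)
random_curve = removal_curve(adj, random_order)
print("targeted:", targeted, targeted_curve)
print("random:  ", random_order, random_curve)
```

In this toy graph, removing the two most connected entities leaves only isolated vertices, whereas a random draw will usually remove leaves and leave the graph largely intact. A single random draw is not representative, so in practice such a comparison would be averaged over many random orderings.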
The results presented here for the two real-world use case architectures suggest that SoS robustness and vulnerability to cascades depend on both the model imposed on the architecture and the topology of the architecture, and that improvements in robustness to cascades may be achieved by hardening the most connected entities.
While the results of such analyses have the potential to inform architecture design and selection, we argue that a series of challenges need to be addressed if the approach is to be useful for system architects.
These challenges include the conceptualization of failure dynamics across heterogeneous system architecture entities, the conceptualization of resilience, and the development of approaches for assessing network representations of architectures in both network perturbation and cascading failure analyses.
The challenges described here are all compounded by the tendency for system architecting activity to correspond to early system lifecycle phases where information which would support such evaluations may be limited. Furthermore, the simple models used here neglect potentially more important contextual information that may be present in the full architecture, such as recovery strategies or criticality of system entities for overall functionality.
A network science perspective encourages us to ask important questions of our system architectures: how robust is our system to perturbations; how well protected are the most critical architectural entities; how extensive will the negative impact be when these protections fail; and how readily will our system be able to adapt and recover?
Given that network science has delivered powerful insights into a wide range of social and technological systems, it likely also has the potential to inform our understanding of SoS robustness and resilience in order to help us answer these questions. However, we argue that the analyses presented here imply that this will only be possible if the numerical analyses that network science enables are deployed and interpreted through the lens of an appropriately rich contextual understanding of the system architectures in question.