New approximations for network reliability

We introduce two new methods for approximating the all‐terminal reliability of undirected graphs. First, we introduce an edge removal process: remove edges at random, one at a time, until the graph becomes disconnected. We show that the expected number of edges thus removed is equal to $(m+1)A$, where $m$ is the number of edges in the graph, and $A$ is the average of the all‐terminal reliability polynomial. Based on this process, we propose a Monte‐Carlo algorithm to quickly estimate the graph reliability (whose exact computation is NP‐hard). Moreover, we show that the distribution of the edge removal process can be used to quickly approximate the reliability polynomial. We then propose increasingly accurate asymptotics for graph reliability based solely on degree distributions of the graph. These asymptotics are tested against several real‐world networks and are shown to be accurate for sufficiently dense graphs. While the approach starts to fail for “subway‐like” networks that contain many paths of vertices of degree two, different asymptotics are derived for such networks.


INTRODUCTION
For a network (which we assume here is represented by a finite undirected graph, possibly with multiple edges) there are many models of robustness to component failure. The simplest measures include the minimum degree (the minimum number of edges whose removal disconnects a vertex from the rest of the graph), the edge connectivity (the minimum number of edges whose removal disconnects the graph), and the vertex connectivity (the minimum number of vertices whose removal disconnects the graph or reduces it down to a single vertex). Another indirect measure is the algebraic connectivity of the graph [12], which measures how fast information propagates through the graph [13,16,23,24].
However, such static measures are rather coarse and do not take into account that the components of a graph may, at different junctures, fail to be operational. So more nuanced probabilistic models have been described, where the components (vertices and/or edges) are subject to random failures. In the most common of these models, it is the edges that fail independently with the same probability $q$, while the vertices are always operational, and we ask for the probability that the network is operational, where operational can mean, for example:
• all vertices can communicate (the all-terminal reliability $R_G(q)$, or simply $R(q)$), or
• two specific vertices $s$ and $t$ (called the terminals) can communicate (two-terminal reliability $R_{G,s,t}(q)$), or
• a specific subset $K$ of vertices can communicate with one another (K-terminal reliability, $R_{G,K}(q)$).
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2024 The Authors. Networks published by Wiley Periodicals LLC. Networks. 2024;84:51-63.
The computations of all of these polynomials (in the variable $q$) are intractable, that is, NP-hard (see [6]). Of course, there are other variants in the literature: some for node failures rather than edge failures, some for directed graphs, and some that allow for failure dependencies. A general reference is [6], while more recent surveys include [2,3].

FIRST APPROACH: THE EDGE DELETION PROCESS
In our study of all-terminal reliability, we have been led recently to propose a new variant of network robustness based on an algorithmic procedure. Consider the following edge deletion process: delete edges from the graph at random, one at a time, and stop when the graph becomes disconnected. Let $s_k$ denote the probability that the process stops after exactly $k$ steps. Can we determine the probability distribution of the $s_k$'s? Of specific interest is the average number of edge deletions needed to disconnect the graph, given by

$$ \mu = \sum_{k=1}^{m} k\, s_k, $$

where $m$ is the number of edges. How does this measure relate (if at all) to the existing reliability measures? We are interested in the answers to these questions, both for synthetic graphs as well as for networks that appear in practice.

We remark that our interest grew out of a common Monte Carlo simulation of all-terminal reliability (see, e.g., [22]). Namely, start with a (connected) graph $G$ of order $n$ and size $m$ (i.e., with $n$ vertices and $m$ edges), a non-negative integer $k \in \{0, 1, \ldots, m - n + 1\}$ and a large positive integer $N$; choose $N$ random spanning subgraphs of $G$, each with $m - k$ edges, and determine the proportion $r_k$ of such spanning subgraphs that are connected. This value $r_k$ is an approximation to $R_G(k/m)$. A more efficient algorithm is to choose the $k$ edges sequentially for deletion and stop when the graph becomes disconnected (as further edge deletion cannot subsequently make the graph connected!).
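The edge deletion process described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `is_connected`, `deletion_steps` and `estimate_mu` are hypothetical, vertices are assumed to be labelled 0..n-1, and connectivity is re-checked from scratch after every deletion (a union-find run in reverse order would be faster for large graphs).

```python
import random


def is_connected(n, edges):
    """True if the multigraph on vertices 0..n-1 with this edge list is connected."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


def deletion_steps(n, edges, rng):
    """One run of the process: number of deletions after which G is first disconnected."""
    remaining = list(edges)
    rng.shuffle(remaining)              # random deletion order
    for k in range(1, len(edges) + 1):
        remaining.pop()                 # delete one more edge
        if not is_connected(n, remaining):
            return k
    raise ValueError("need a connected graph with at least two vertices")


def estimate_mu(n, edges, runs=1000, seed=0):
    """Monte Carlo estimate of the mean number of deletions until disconnection."""
    rng = random.Random(seed)
    return sum(deletion_steps(n, edges, rng) for _ in range(runs)) / runs


cycle = [(i, (i + 1) % 6) for i in range(6)]
print(estimate_mu(6, cycle, runs=200))  # every run on a cycle stops at k = 2
```

Each run costs O(m^2) with this naive connectivity check, which is adequate for networks with a few hundred edges.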
Let us begin with a couple of examples.
Example 1. First of all, consider the cycle graph $C_n$ of order $n$. Clearly the removal of any single edge does not disconnect the graph, but the removal of any two edges does. Thus $s_2 = 1$ while $s_k = 0$ for all $k \neq 2$, and hence

$$ \mu = 2. $$

Example 2. As another example, consider the graph $H$ of order 3, with vertices $x_1$, $y$ and $x_2$, where each $x_i$ is joined to $y$ by $l$ edges. The process stops at step $k$ exactly when the $k$-th deleted edge is the last remaining edge of one of the two bundles, and it is not hard to see that

$$ s_k = \frac{2\binom{k-1}{l-1}}{\binom{2l}{l}} \quad \text{for } l \le k \le 2l-1, \qquad s_k = 0 \ \text{otherwise}, $$

and hence

$$ \mu = \frac{2l^2}{l+1}. \tag{2.4} $$

While the definitions of the $s_k$'s and $\mu$ do not, on the surface, seem to relate to the all-terminal reliability (the latter is a polynomial in the variable $q$), we shall see in the next section that there is indeed a deep connection between the two notions.

Connections to all-terminal reliability
The aim of this section is to quantify the connections between the edge deletion process, in terms of the probabilities $s_k$, and the all-terminal reliability. We will establish connections for three metrics related to the all-terminal reliability: the average reliability (AR) of the graph, the full all-terminal reliability polynomial of the graph, and the 99-percentile value of the all-terminal reliability (the choice of the 99-percentile is arbitrary, but is indicative of how reliable a network is expected to be in practice).
We begin by enumerating a number of useful forms of the all-terminal reliability polynomial. For a graph $G$ of order $n$ and size $m$ (here and elsewhere, we shall always assume that any graph under question is connected), its all-terminal reliability has the following forms (each is an expansion of the polynomial under different bases for the underlying vector space of polynomials; see [6]):

$$ R(q) = \sum_{i=0}^{m} N_i (1-q)^{i} q^{m-i} = \sum_{i=0}^{m} F_i (1-q)^{m-i} q^{i} = 1 - \sum_{i=0}^{m} C_i\, q^{i} (1-q)^{m-i}. $$

We refer to these as the N-, F-, and C-forms of the reliability polynomial, respectively. The interpretation of the coefficients is as follows:
• $N_i$ counts the number of spanning connected subgraphs with $i$ edges.
• $F_i$ counts the number of spanning connected subgraphs with $m - i$ edges.
• $C_i$ counts the number of cut sets with $i$ edges, that is, the number of subsets of $i$ edges whose removal leaves the graph disconnected.
Of these three forms, the first two have attracted the most attention in the literature. However, we will see that the connection between the $s_k$'s and all-terminal reliability is most easily drawn with the C-form.
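For very small graphs the C-form coefficients can be obtained by brute force, which may help make the definition of the cut-set counts concrete. The sketch below is illustrative only: the helper names are hypothetical, vertices are assumed to be labelled 0..n-1, and the enumeration over all edge subsets is exponential in the number of edges.

```python
from itertools import combinations


def connected_after_removal(n, edges, removed):
    """Union-find check: is the graph still connected once the edges in `removed` are deleted?"""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for idx, (u, v) in enumerate(edges):
        if idx in removed:
            continue
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1


def cut_set_counts(n, edges):
    """C[i] = number of i-subsets of edges whose removal disconnects the graph."""
    m = len(edges)
    C = [0] * (m + 1)
    for i in range(m + 1):
        for removed in combinations(range(m), i):
            if not connected_after_removal(n, edges, set(removed)):
                C[i] += 1
    return C


def reliability(q, C):
    """Evaluate the C-form: R(q) = 1 - sum_i C_i q^i (1-q)^(m-i), q = edge failure probability."""
    m = len(C) - 1
    return 1.0 - sum(C[i] * q**i * (1 - q) ** (m - i) for i in range(m + 1))


triangle = [(0, 1), (1, 2), (2, 0)]
C = cut_set_counts(3, triangle)
print(C, reliability(0.5, C))
```

The F-form coefficients follow from the same counts, since a set of $i$ deleted edges either disconnects the graph or leaves a connected spanning subgraph with $m - i$ edges.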
We now state our main result, which connects the edge deletion process to the average reliability.
Theorem 2.1. Let $G$ be a graph having $m$ edges. Remove edges from $G$ at random and one at a time, until the graph becomes disconnected. Let $\mu$ be the average number of edges thus removed. Then

$$ \mu = (m+1)\int_0^1 R(q)\, dq. \tag{2.6} $$

Theorem 2.1 states that $\mu$ is $m + 1$ times the average of the all-terminal reliability polynomial over the interval $[0, 1]$; the latter is known as the average reliability (AR) of the graph, and has been studied in [4]. The average reliability of a graph was proposed as a single numerical measure of the robustness of a graph, and allows one to search for the existence of an optimal graph with respect to reliability (even when a graph optimal for all $q \in [0, 1]$ need not exist [1]) among all graphs with a given fixed number of vertices $n$ and a fixed number of edges $m$. We remark that the average reliability (and hence $\mu$) is also known to be intractable [4].
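Proof. Let $c_k$ denote the probability that the graph is disconnected once $k$ edges have been deleted at random, so that $c_0 = 0$, $c_m = 1$, and

$$ s_k = c_k - c_{k-1}, \qquad c_k = \frac{C_k}{\binom{m}{k}}, $$

where, as in the C-form, $C_k$ is the number of edge subsets of size $k$ whose deletion disconnects the graph. To see the latter, note that if a subset $S$ of $k$ edges has the property that $G - S$ is disconnected, then this is true under any edge ordering of $S$, so that

$$ c_k = \frac{C_k\, k!\,(m-k)!}{m!} = \frac{C_k}{\binom{m}{k}}. $$

We now write the mean $\mu$ as

$$ \mu = \sum_{k=1}^{m} k\, s_k = \sum_{k=1}^{m} k\,(c_k - c_{k-1}) = m\, c_m - \sum_{k=0}^{m-1} c_k = (m+1) - \sum_{k=0}^{m} c_k. $$

On the other hand, from the C-form of the reliability polynomial,

$$ \int_0^1 R(q)\, dq = 1 - \sum_{k=0}^{m} C_k \int_0^1 q^{k}(1-q)^{m-k}\, dq = 1 - \frac{1}{m+1}\sum_{k=0}^{m} c_k, $$

so that, from the above formula for $\mu$,

$$ \mu = (m+1)\int_0^1 R(q)\, dq, $$

where we used the fact (see, e.g., [17]) that for $i \in \{0, 1, \ldots, m\}$, the integral of the Bernstein basis polynomial [9] satisfies

$$ \int_0^1 \binom{m}{i} q^{i}(1-q)^{m-i}\, dq = \frac{1}{m+1}. $$

To illustrate the theorem, we again consider the two examples from Section 1.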
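As a sanity check of Theorem 2.1, both sides of the identity can be evaluated in exact rational arithmetic whenever the cut-set counts $C_k$ are known. The sketch below is an illustration under stated assumptions: the closed-form counts for the cycle ($C_k = \binom{n}{k}$ for $k \ge 2$) and for the two-bundle graph $H$ ($C_k = 2\binom{l}{k-l}$ for $l \le k < 2l$, $C_{2l} = 1$) are elementary counting facts supplied here rather than taken from the text, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def mean_from_cut_counts(C):
    """mu = sum_k k*s_k with s_k = c_k - c_{k-1} and c_k = C_k / binom(m, k)."""
    m = len(C) - 1
    c = [Fraction(C[k], comb(m, k)) for k in range(m + 1)]
    return sum(k * (c[k] - c[k - 1]) for k in range(1, m + 1))


def mean_from_integral(C):
    """(m+1) * int_0^1 R(q) dq, using int_0^1 q^k (1-q)^(m-k) dq = 1 / ((m+1) binom(m, k))."""
    m = len(C) - 1
    integral = 1 - sum(Fraction(C[k], (m + 1) * comb(m, k)) for k in range(m + 1))
    return (m + 1) * integral


def cycle_cut_counts(n):
    """Cycle C_n: any two or more deleted edges disconnect it."""
    return [comb(n, k) if k >= 2 else 0 for k in range(n + 1)]


def bundle_cut_counts(l):
    """Graph H: a k-subset disconnects iff it contains one whole bundle of l edges."""
    m = 2 * l
    C = [0] * (m + 1)
    for k in range(l, m):
        C[k] = 2 * comb(l, k - l)
    C[m] = 1
    return C


for C in (cycle_cut_counts(7), bundle_cut_counts(3)):
    print(mean_from_cut_counts(C), mean_from_integral(C))
```

For the cycle both expressions evaluate to 2, and for $H$ with $l = 3$ both give $9/2 = 2l^2/(l+1)$.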

Example 4. For the graph $H$ introduced in Section 1, the reliability polynomial satisfies $R(q) = (1 - q^{l})^2$ while $m = 2l$. We have

$$ (m+1)\int_0^1 R(q)\, dq = (2l+1)\left(1 - \frac{2}{l+1} + \frac{1}{2l+1}\right) = \frac{2l^2}{l+1}, $$

consistent with the direct computation $\mu = 2l^2/(l+1)$ in (2.4).

Theorem 2.1 gives a practical way to approximate $\mu$, and hence the average reliability, for large graphs using Monte Carlo simulation without computing the reliability polynomial explicitly, which in general is intractable.
We will now show an even deeper connection between the edge deletion process and the all-terminal reliability polynomial. From the C-form of reliability, we can write

$$ 1 - R(q) = \sum_{k=0}^{m} C_k\, q^{k}(1-q)^{m-k} = \sum_{k=0}^{m} c_k \binom{m}{k} q^{k}(1-q)^{m-k}. \tag{2.7} $$

Note that $\binom{m}{k} q^{k}(1-q)^{m-k}$ is a binomial distribution. In the limit of large $m$, it converges to a normal distribution of mean $\kappa = qm$ and standard deviation $\sigma = \sqrt{mq(1-q)}$, so that

$$ \binom{m}{k} q^{k}(1-q)^{m-k} \sim \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(k-\kappa)^2}{2\sigma^2}\right). $$

Recall that for any continuous function $f$, Laplace's method gives the asymptotics

$$ \int f(k)\, \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(k-\kappa)^2}{2\sigma^2}\right) dk \sim f(\kappa), $$

assuming $\sigma \ll \kappa$ (see e.g., [14]). Approximating the sum in (2.7) by an integral then yields

$$ \sum_{k=0}^{m} c_k \binom{m}{k} q^{k}(1-q)^{m-k} \sim c_{\kappa}, $$

and it follows that $1 - R(q) \sim c_{\kappa}$, $q = \kappa/m$. Replacing $\kappa$ by $k$ so that $q = k/m$, we obtain

$$ 1 - R(k/m) \approx c_k. \tag{2.8} $$

This means that the complement of the reliability polynomial (i.e., $1 - R(q)$) is approximated by the normalized cumulative histogram of the stopping times, that is, by the $c_k$'s. To be more precise, assume we run the edge deletion process $M$ times and denote the number of times the process stops after exactly $k$ steps by $S_k$. Then we have

$$ s_k \approx \frac{S_k}{M}, \qquad c_k \approx \frac{1}{M}\sum_{j=1}^{k} S_j. $$

From now on, we will refer to the expression $1 - R(q)$ as the complementary reliability polynomial. Note that it follows from (2.8) that for large $m$, $s_k = c_k - c_{k-1}$ approximates minus the derivative of the reliability polynomial evaluated at $k/m$, scaled by $1/m$, that is,

$$ s_k \approx -\frac{1}{m} R'(k/m). \tag{2.9} $$

We now show that for sufficiently small values of $k$, $c_k$ can underestimate the values of $1 - R(k/m)$, while for sufficiently large values of $k$ overestimation can occur. Let $\lambda(G)$ denote the edge connectivity of a graph $G$. Then, for graphs with $\lambda > 1$ and any $1 \le k < \lambda$, it holds that $c_k = 0$, while $1 - R(k/m) > 0$ although, in general, the value is very small. On the other hand, for $k > m - n + 1$, the edge deletion process always disconnects the network; hence for those $k$ it holds that $c_k = 1$, while $1 - R(k/m) < 1$ although again, in general, the difference is very small.
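The two tail discrepancies noted at the end of the preceding paragraph can be seen explicitly on the two-bundle graph $H$ from Example 2, for which both $c_k$ and $1 - R(q) = 1 - (1 - q^{l})^2$ are available in closed form. This is a small illustrative sketch: the function names are hypothetical, and the cut-set count $C_k = 2\binom{l}{k-l}$ for $l \le k < 2l$ is an elementary counting fact assumed here. It is not meant to demonstrate the accuracy of (2.8), which is an asymptotic statement for large $m$.

```python
from math import comb


def c_exact(l):
    """c_k = C_k / binom(2l, k) for the two-bundle graph H, k = 0..2l."""
    m = 2 * l
    C = [0] * (m + 1)
    for k in range(l, m):
        C[k] = 2 * comb(l, k - l)   # k-subsets that contain one whole bundle
    C[m] = 1
    return [C[k] / comb(m, k) for k in range(m + 1)]


def one_minus_R(q, l):
    """Complementary reliability 1 - (1 - q^l)^2, written to stay accurate for tiny q^l."""
    x = q**l
    return x * (2.0 - x)


l = 4
m = 2 * l
c = c_exact(l)
for k in range(m + 1):
    # columns: k, c_k from the deletion process, 1 - R(k/m) from the polynomial
    print(k, round(c[k], 4), round(one_minus_R(k / m, l), 4))
```

Here the edge connectivity is $l$ and $m - n + 1 = 2l - 2$, so $c_k = 0$ for $k < l$ and $c_k = 1$ for $k \ge 2l - 1$, whereas $1 - R(k/m)$ lies strictly between 0 and 1 for every $0 < k < m$.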
Finally, we can also use (2.8) to approximate the 99%-percentile of $R(q)$. First determine the largest value of $k$ which satisfies $c_k \le 0.01$ and denote this value by $k_{99\%}$. Then $q_{99\%}$ can be approximated by

$$ q_{99\%} \approx \frac{k_{99\%}}{m}. \tag{2.10} $$
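The percentile estimate can be sketched as follows, starting from a list of simulated stopping times. The names `empirical_cdf` and `q_threshold` are hypothetical, and the sketch uses the integer rule stated above (largest $k$ with $c_k \le 0.01$) rather than any interpolation between neighbouring values of $k$.

```python
def empirical_cdf(stops, m):
    """c[k] = fraction of runs that stopped at or before step k, for k = 0..m."""
    M = len(stops)
    S = [0] * (m + 1)                 # S[k] = number of runs stopping exactly at step k
    for k in stops:
        S[k] += 1
    c, running = [], 0
    for k in range(m + 1):
        running += S[k]
        c.append(running / M)
    return c


def q_threshold(c, level=0.01):
    """Largest k with c[k] <= level, and the corresponding estimate q = k / m."""
    m = len(c) - 1
    k_star = max(k for k in range(m + 1) if c[k] <= level)
    return k_star, k_star / m


# Toy input: stopping times of four hypothetical runs on a graph with m = 5 edges.
c = empirical_cdf([3, 3, 4, 5], m=5)
print(c, q_threshold(c))
```

Since every run stops at some $k \ge 1$, $c_0 = 0$ and the maximum in `q_threshold` is always well defined.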

Validation on a synthetic and a real-world network
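First we validate the results obtained in this section on a so-called crown graph. This graph is constructed by taking a complete bipartite graph $K_{N,2}$ and joining the two nodes in the independent set of size 2 (see Figure 1). This graph has $N + 2$ nodes and $2N + 1$ links, and will be denoted as $Cr_{N+2}$.

For the crown graph $Cr_{N+2}$, according to [27], we have

$$ R(q) = (1 - q^2)^{N} - 2^{N} q^{N+1} (1-q)^{N}. \tag{2.11} $$

The explicit expression (2.11) allows us to obtain a closed-form expression for $\mu$ through (2.6):

$$ \mu = (2N+2)\left[ \frac{\sqrt{\pi}}{2}\, \frac{\Gamma(N+1)}{\Gamma(N+3/2)} - 2^{N}\, \frac{\Gamma(N+2)\,\Gamma(N+1)}{\Gamma(2N+3)} \right], \tag{2.12} $$

where $\Gamma$ is the Gamma function.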
We now use (2.11) and (2.12) to validate Theorem 2.1 by means of Monte Carlo simulation, as shown in Figure 1. Taking $N = 48$, formula (2.12) yields $\mu_{exact} = 12.4389$. On the other hand, using $M = 1000$ Monte Carlo simulations we obtained $\mu_{MC} = 12.39$, in excellent agreement with the exact result. Finally, we use (2.11) to determine $q_{99\%}$ numerically by solving $R(q) = 0.99$. With $N = 48$, (2.11) yields $q_{99\%} = 0.014469$, so that removing $k_{99\%} = 1.40$ edges on average results in a 99% probability of still being connected. According to Monte Carlo simulation, there is about a 1.3% probability of disconnection after removing two edges (whereas removing one edge never disconnects a crown graph). Using linear interpolation, this yields $k_{99\%,MC} \approx 1.78$. Note that the 99% threshold is in the very tail of the distribution, where the Monte Carlo approximation to $R(q)$ is degraded. Using a 90% threshold instead, the exact formula (2.11) yields $k_{90\%,exact} = 4.589$, compared to $k_{90\%,MC} = 5.0364$, a better agreement (9.7% relative error for $k_{90\%}$ instead of 27% for $k_{99\%}$).
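The exact crown-graph values quoted above can be reproduced numerically. The sketch below rests on an assumption: it takes the crown-graph reliability to be $R(q) = (1-q^2)^N - 2^N q^{N+1}(1-q)^N$, which follows from conditioning on the state of the edge joining the two hub vertices, and integrates it term by term with Beta integrals. The function names are hypothetical, and log-Gamma values are used to avoid overflow.

```python
from math import exp, lgamma, log, pi, sqrt


def crown_reliability(q, N):
    """Assumed closed form for the crown graph on N + 2 vertices (q = edge failure probability)."""
    return (1 - q * q) ** N - (2.0**N) * q ** (N + 1) * (1 - q) ** N


def crown_mu(N):
    """(m+1) * int_0^1 R(q) dq with m = 2N + 1, integrated term by term."""
    m = 2 * N + 1
    # int_0^1 (1 - q^2)^N dq = (sqrt(pi)/2) * Gamma(N+1) / Gamma(N+3/2)
    first = 0.5 * sqrt(pi) * exp(lgamma(N + 1) - lgamma(N + 1.5))
    # 2^N * int_0^1 q^(N+1) (1-q)^N dq = 2^N * Gamma(N+2) Gamma(N+1) / Gamma(2N+3)
    second = exp(N * log(2.0) + lgamma(N + 2) + lgamma(N + 1) - lgamma(2 * N + 3))
    return (m + 1) * (first - second)


def crown_q_threshold(N, level=0.99):
    """Solve R(q) = level by bisection; R decreases from 1 to 0 on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if crown_reliability(mid, N) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(crown_mu(48), crown_q_threshold(48))
```

For $N = 1$ the crown graph is a triangle, for which the function returns 2, in line with the cycle example.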
Due to the nature of MC simulation, the accuracy increases with $M$ but relatively slowly. To measure the accuracy, let $K_j$ be the number of edges removed before the disconnection during simulation $j$. The accuracy of the resulting estimate is quantified by its Root Mean Square Error (RMSE), denoted $E_M$.

Next we validate our results on a real-life network, namely the DFN communication network from the Internet Topology Zoo [15]. This network has $n = 58$ nodes and $m = 87$ edges. This network is small enough to determine its reliability polynomial exactly. We used the ReliabilityPolynomial command from Maple's GraphTheory package to do so. Figure 2 shows the comparison between the normalized histogram for the $c_k$'s (computed using 1000 Monte Carlo simulations), and the complementary reliability polynomial $1 - R(q)$. Visually we see a good fit. In addition, $\mu_{MC} = 6.673$ while $\mu_{exact} = 6.896$, so the relative error is only about 3%. Similarly we obtain $k_{99\%,MC} = 0.1252$ while $k_{99\%,exact} = 0.1190$, a relative error of about 5%.

ANALYTIC APPROXIMATIONS FOR RELIABILITY POLYNOMIALS
So far we have compared the results obtained from the edge deletion process for graphs for which an explicit expression for the reliability polynomial is available. However, in general such explicit expressions are intractable. In this section, we will introduce two closed-form approximations for $R(q)$, which allow us to compare the results of the edge deletion simulation with the performance metrics $\mu$ and the 99%-percentile.

First and second order approximations
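The probability of a single vertex being isolated is $q^{d}$, where $d$ denotes the degree of the vertex. When $q$ is small and for a sufficiently dense graph, the probability of having no isolated vertices can be approximated asymptotically by

$$ R(q) \sim R_1(q) = \prod_{j=1}^{n} \left(1 - q^{d_j}\right), \tag{3.13} $$

where $d_j$ is the degree of vertex $j$.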
This heuristic ignores any inter-dependence of the vertices, but works well when the graph is sufficiently "dense". Note that (3.13) only depends on the degree sequence of the graph and not on its finer structure. This is a property exhibited by the so-called "random configuration model", see for example, [11,18,21] and references therein. For sufficiently dense graphs, we use the heuristic that the probability of being disconnected approaches the probability of having an isolated vertex. While the rigorous justification of this heuristic remains an open problem, it is similar to the classical results for the Erdős-Rényi random graph model [8]. We will see below that $R_1$ provides a good approximation to $R$ for many realistic networks as well as for random regular graphs of degree 3 or more.

A more accurate formula for $R(q)$ also incorporates the probability of having no isolated 2-vertex subgraphs. Given an edge, the probability that it is disconnected from the rest of the graph is $q^{a-2}(1 - q)$, where $a$ is the sum of the degrees of the two vertices adjoining this edge. This leads to the following, more accurate asymptotics: $R \sim R_2$, where

$$ R_2(q) = \prod_{i=1}^{n} \left(1 - q^{d_i}\right) \prod_{j=1}^{m} \left(1 - q^{a_j - 2}(1-q)\right), \tag{3.14} $$

where $d_i$ is the degree of vertex $i$ and $a_j$ is the sum of degrees of the two vertices adjoining an edge $j$. Higher-order asymptotics can be written by considering isolated subgraphs of 3 or more vertices. However, we will show in the next section that for most practical examples we considered, $R_2$ (and even $R_1$) provides high accuracy.

The asymptotic formulae (3.13) and (3.14) can also be used to estimate $\mu$ in (2.6), the expected number of edges that need to be deleted in order to disconnect the network. We have the following, increasingly accurate approximations for $\mu$:

$$ \mu \sim \mu_1 = (m+1)\int_0^1 R_1(q)\, dq, \tag{3.15} $$

$$ \mu \sim \mu_2 = (m+1)\int_0^1 R_2(q)\, dq. \tag{3.16} $$

In addition, we will also use the asymptotics (3.13) and (3.14) in the right-hand side of (2.8), which approximates the CDF of the $s_k$'s.
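The degree-based estimates can be evaluated directly from an edge list. The following sketch assumes the product forms $R_1(q) = \prod_i (1 - q^{d_i})$ and $R_2(q) = R_1(q) \prod_j (1 - q^{a_j - 2}(1-q))$ for the two approximations, and uses a plain composite Simpson rule for the integrals. All function names are hypothetical, and vertices are assumed to be labelled 0..n-1.

```python
def degrees(n, edges):
    """Degree of each vertex 0..n-1 in the (multi)graph given by the edge list."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def R1(q, deg):
    """First-order estimate: no isolated vertex, vertices treated as independent."""
    prod = 1.0
    for d in deg:
        prod *= 1.0 - q**d
    return prod


def R2(q, deg, edges):
    """Second-order estimate: additionally, no isolated edge (two-vertex subgraph)."""
    prod = R1(q, deg)
    for u, v in edges:
        a = deg[u] + deg[v]
        prod *= 1.0 - q ** (a - 2) * (1.0 - q)
    return prod


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def mu_estimates(n, edges):
    """Return (mu_1, mu_2) = (m+1) * integral of R_1 and of R_2 over [0, 1]."""
    m = len(edges)
    deg = degrees(n, edges)
    mu1 = (m + 1) * simpson(lambda q: R1(q, deg), 0.0, 1.0)
    mu2 = (m + 1) * simpson(lambda q: R2(q, deg, edges), 0.0, 1.0)
    return mu1, mu2


K4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(mu_estimates(4, K4))
```

Because every extra factor in `R2` is at most 1, the second estimate is never larger than the first.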

Regular graphs
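Consider the special case of $d$-regular graphs, in particular for large $n$. For this case, the approximation (3.13) becomes

$$ R_1(q) = \left(1 - q^{d}\right)^{n} \sim \exp\left(-n q^{d}\right). \tag{3.17} $$

From (3.17) we estimate:

$$ \int_0^1 R_1(q)\, dq \sim \int_0^\infty \exp\left(-n q^{d}\right) dq = \Gamma\left(1 + \frac{1}{d}\right) n^{-1/d}. $$

Substituting $m = nd/2$ for $d$-regular graphs into (2.6) yields $\mu \sim \mu_1$ for $n \gg 1$, where

$$ \mu_1 = \frac{d}{2}\, \Gamma\left(1 + \frac{1}{d}\right) n^{1 - 1/d}. $$

For example, for a three-regular graph, this estimate yields $\mu_1 = 1.3394\, n^{2/3}$.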
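The leading-order estimate for regular graphs is easy to check numerically. The sketch below assumes the form $\mu_1 = (d/2)\,\Gamma(1 + 1/d)\, n^{1-1/d}$ and compares it with $(m+1)\int_0^1 (1 - q^d)^n\, dq$, which can be written exactly with a Beta function via the substitution $t = q^d$. The function names are hypothetical.

```python
from math import exp, gamma, lgamma


def mu1_asymptotic(n, d):
    """Leading-order large-n estimate for d-regular graphs."""
    return 0.5 * d * gamma(1.0 + 1.0 / d) * n ** (1.0 - 1.0 / d)


def mu1_exact_integral(n, d):
    """(m+1) * int_0^1 (1 - q^d)^n dq with m = n*d/2.

    The integral equals Gamma(1 + 1/d) * Gamma(n + 1) / Gamma(n + 1 + 1/d).
    """
    m = n * d / 2
    log_integral = lgamma(1.0 + 1.0 / d) + lgamma(n + 1.0) - lgamma(n + 1.0 + 1.0 / d)
    return (m + 1) * exp(log_integral)


print(mu1_asymptotic(1, 3))                       # coefficient of n^(2/3) for d = 3
print(mu1_asymptotic(100, 3), mu1_exact_integral(100, 3))
```

For $n = 100$ and $d = 3$ the two values differ by well under one percent, so the error of $\mu_1$ relative to simulation comes almost entirely from the isolated-vertex heuristic rather than from the large-$n$ expansion.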
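Next we compute the asymptotics to two orders. Applying Laplace's method to (3.14) with $a_j = 2d$ and $m = nd/2$, we estimate

$$ \int_0^1 R_2(q)\, dq \sim \Gamma\left(1 + \frac{1}{d}\right) n^{-1/d} - \frac{1}{2}\, \Gamma\left(2 - \frac{1}{d}\right) n^{-1 + 1/d}, $$

so to two orders in $n$, we obtain $\mu \sim \mu_2$, where

$$ \mu_2 = \frac{d}{2}\, \Gamma\left(1 + \frac{1}{d}\right) n^{1 - 1/d} - \frac{d}{4}\, \Gamma\left(2 - \frac{1}{d}\right) n^{1/d}. $$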

VALIDATION OF THE APPROXIMATIONS FOR THE RELIABILITY POLYNOMIALS
In this section, we compare the outcomes of the edge deletion process with the approximations (3.13) and (3.14) for the reliability polynomial. Table 2 shows numerical values for the average number $\mu$ of deleted links to disconnect the networks, as well as 99% threshold values.

Validation on real-world networks
In this subsection, we consider a number of real-world networks, taken from the Internet Topology Zoo [15] and the Network Repository [25]. We apply Theorem 2.1 and asymptotics (3.15) and (3.16) to several real-world networks; the results are presented in Figure 3 and Table 2. The column $\mu_{MC}$ is computed using the Monte Carlo method, averaged over 1000 simulations.
Columns $\mu_1$ and $\mu_2$ are asymptotic estimates as given by (3.15) and (3.16), respectively. In addition, we include the reliability measure $q_{99\%}$. Despite the diversity of networks presented in Figure 3, the asymptotics agree very well with Monte Carlo simulations for all networks shown except "Singapore", which represents the subway network in Singapore. The agreement breaks down for such a network because it consists of many "strings": paths where each vertex has degree at most two. We will discuss how to improve the asymptotics for such "subway" networks in Section 5.
It is interesting to note that the approximations for q 99% using either R 1 or R 2 give identical or nearly identical results (to all digits shown) in about half of the cases.

Validation on regular graphs
Table 3 shows the comparison between asymptotic results and Monte Carlo simulations for random $d$-regular graphs (constructed using the configuration model [18]) with $n = 100$ and $d$ as indicated. Here, $\mu$ and $q_{99\%,MC}$ are computed using the Monte Carlo method with 10 000 realizations and are believed to be accurate to within less than 1% (as validated by averaging over several random subsets of simulations of size 5000). The relative errors $err_1 = 1 - \mu/\mu_1$ and $err_2 = 1 - \mu/\mu_2$ are also shown. Note that for $d = 3$, $\mu_1$ captures about 82% of cases, whereas getting either an isolated vertex or an isolated two-vertex subgraph captures 92% of all cases (the remaining cases correspond to getting an isolated subgraph of order 3 or higher). As the graph density is increased, disconnection due to isolated vertices captures more and more cases; for example, it captures >98% of cases when $d = 10$ (Figure 4). The last two columns in Table 3 show the 99-percentile, computed using Monte Carlo simulation ($q_{99\%,MC}$) and the asymptotic approximation in formula (3.19) ($q_{99\%,as}$). We observe an increasing accuracy of the asymptotics with increased $d$. The agreement for $q_{99\%}$ is rather poor when $d = 3$ and $n = 100$ because in that case $m\, q_{99\%,as} \approx 2$, which is rather small (at the very tail of the distribution). The agreement is much better when $d = 10$.

SUBWAY-LIKE NETWORKS
As seen in Figure 3, the asymptotics (3.13) and (3.14) break down for the subway network of Singapore. It is characterized by many paths made of consecutive vertices of degree 2, interspersed with a few transfer stations that have higher degree. We will refer to these types of networks as "subway-type" networks [7]. For this type of network, the breakdown of connectivity is most likely to happen because one of the "paths" gets disconnected, rather than because of an isolated vertex. This is the reason why $R_1$ or $R_2$ fail to estimate reliability in this case.
To obtain a better estimate of $\mu$ for "subway-type" networks, we first process the graph by removing the "one-shell" from the graph as follows. Remove all vertices of degree one and their associated edges from the graph. Repeat, until there are no more vertices of degree one. If the reliability polynomial of the graph with the one-shell removed is $R(q)$, then the reliability polynomial of the original graph is $(1 - q)^{K} R(q)$, where $K$ is the number of edges/vertices in the one-shell that were removed.
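The one-shell removal described above is a simple iterative pruning. A minimal sketch follows; the function name `remove_one_shell` is hypothetical, and the graph is assumed to be given as a list of vertex pairs without self-loops.

```python
from collections import defaultdict


def remove_one_shell(edges):
    """Repeatedly strip degree-one vertices; return (core_edges, K).

    K is the number of edges removed along the way, so that the reliability of
    the original graph is (1 - q)^K times the reliability of the core.
    """
    alive = list(edges)
    removed = 0
    while True:
        deg = defaultdict(int)
        for u, v in alive:
            deg[u] += 1
            deg[v] += 1
        leaves = {v for v, d in deg.items() if d == 1}
        if not leaves:
            return alive, removed
        # drop every edge that touches a current degree-one vertex
        kept = [(u, v) for u, v in alive if u not in leaves and v not in leaves]
        removed += len(alive) - len(kept)
        alive = kept


# A triangle 0-1-2 with a two-edge tail 2-3-4 attached.
core, K = remove_one_shell([(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)])
print(core, K)
```

Each pass recomputes all degrees, which is quadratic in the worst case; a queue of current leaves would make it linear, at the cost of a slightly longer listing.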
In what follows, we shall assume without loss of generality that the one-shell has been removed to simplify the computation. We first compute the probability that in a single path consisting of $l$ edges, some vertex cannot communicate with either of the end terminals of the path. This probability corresponds to failure of at least two edges along this path. The probability of such a failure is thus given by $1 - p^{l} - l p^{l-1} q$, where $p = 1 - q$. This yields the following estimate for the reliability polynomial, $R \sim R_s$, for subway-type networks:

$$ R_s(q) = \prod_{j} \left( p^{l_j} + l_j\, p^{l_j - 1} q \right), \tag{5.23} $$

where the product is taken over all the paths (consisting of vertices of degree 2 inside the graph), with $l_j$ being the number of edges in such a path. The corresponding estimate for $\mu$ becomes

$$ \mu_s = (m+1)\int_0^1 R_s(q)\, dq. \tag{5.24} $$

The function $R_s(q)$ and the corresponding $\mu_s$ are shown in Figure 5. As can be seen, $\mu_s$ does a much better job for subway-type graphs than either $\mu_1$ or $\mu_2$.
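Given the list of path lengths $l_j$, the subway-type estimate is a short computation. The sketch below assumes the product form $R_s(q) = \prod_j (p^{l_j} + l_j p^{l_j-1} q)$ with $p = 1 - q$ and integrates it with a composite Simpson rule. The function names are hypothetical, and extracting the paths from an actual network is left out.

```python
def path_ok(q, l):
    """Probability that at most one of the l edges of a path has failed."""
    p = 1.0 - q
    return p**l + l * p ** (l - 1) * q


def R_s(q, path_lengths):
    """Subway-type estimate: every path has at most one failed edge."""
    prod = 1.0
    for l in path_lengths:
        prod *= path_ok(q, l)
    return prod


def mu_s(m, path_lengths, steps=2000):
    """(m+1) * int_0^1 R_s(q) dq by the composite Simpson rule (steps must be even)."""
    h = 1.0 / steps
    total = R_s(0.0, path_lengths) + R_s(1.0, path_lengths)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * R_s(i * h, path_lengths)
    return (m + 1) * total * h / 3


# Sanity check: a cycle with 10 edges, treated as a single closed path of length 10.
print(mu_s(10, [10]))
```

Treating a cycle as one closed path is only a sanity check: in that case the factor $p^{n} + n p^{n-1} q$ is the exact cycle reliability, and the estimate reproduces the value $\mu = 2$ from Example 1.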
Finally, the two asymptotics (isolated vertex, path failure) can be combined for an even better approximation, but we do not pursue this further here.

DISCUSSION
Graph connectivity is an important measure in network theory. In this work, we have presented a simple Monte Carlo algorithm which consists of removing edges at random until the graph becomes disconnected. As the number of simulations increases, this method recovers exactly the average of the reliability polynomial. It can also be used to estimate the full reliability polynomial, whose exact computation is an NP-hard problem. We presented a heuristic for simple asymptotic estimates for the reliability polynomial of sufficiently dense graphs, based on the probability of getting an isolated vertex or a two-vertex graph. For sparse "subway-type" networks, we presented a different estimate based on the number of cycles in the graph. All of these estimates have been shown to work for many real-world networks as well as random regular graphs.
We end this paper with some open questions and conjectures.
Open question 1. Choose a random graph consisting of $n_1$ vertices of degree 2, and $n_2$ vertices of degree 3. Suppose that $n_1 \gg n_2$, in which case the graph is subway-like. What are the asymptotics of average reliability (AR) in this case?
Open question 2. What is the average reliability of Erdős-Rényi graphs? The degree distribution is Poisson in this case.
More generally, what is the "best" degree distribution if we wish to optimize the average reliability?
Open question 3. Among all graphs with a fixed number of vertices $n$ and edges $m$, which graph maximizes the average reliability? Consider the case of $n = 12$ and $m = 18$ (which includes all cubic graphs on 12 vertices). We used Brendan McKay's program Nauty [19] to generate a total of about $2 \times 10^{7}$ such graphs. Further restricting to only graphs whose vertices have a minimum degree of 2 yields about 2 million graphs. By contrast, there are only 87 cubic graphs on 12 vertices. Figure 6 shows the plot of AR versus the algebraic connectivity (AC) for this collection of graphs.
The unique maximizer of AR = 0.350925 has AC = 1.467911, which is the second-highest AC. The unique maximizer of AC = 1.438447 has AR = 0.350792, which is the second-highest AR. Among high-AR graphs, the first 28 are cubic, having AR from 0.3509 to 0.3429. The highest non-cubic graph has AR of 0.34121, and has girth 5. By contrast, there are many non-cubic graphs with high AC; among the 30 graphs having the highest AC, only 2 are cubic. In terms of girth, there are a total of 7 girth-5 graphs, two of which are cubic.
Based on these observations, we propose the following conjecture.
Conjecture. Among all the graphs with a fixed number of nodes $n$ and a fixed integer average degree $d$, the graph that maximizes the average reliability is a $d$-regular graph.

In conclusion, let us mention some other related problems. In [10,20], the authors modelled human frailty and death events using complex scale-free networks. In their model, nodes are damaged at random (simulating an ageing process) until too much damage is accumulated in the main nodes. Asymptotics could be derived for their model.
Monte Carlo techniques are used extensively for reliability and failure analyses. For example, reliability of Mobile Ad-Hoc Networks (where the network changes in time) has been studied using Monte Carlo techniques [5]. It would be interesting to extend the edge removal process to these classes of models.
Network reliability also has a connection with percolation theory. The percolation process can be viewed as a version of the edge removal process on a grid but with a different termination condition, namely the emergence of a giant component, rather than the network becoming disconnected [26]. Depending on the situation, this may be a better measure of network reliability than a simple disconnection threshold.
Finally, we remark that our approach for analyzing network reliability can also be applied to the computation of two-terminal reliability.

FIGURE 2 Top: The DFN network (consisting of 58 nodes and 87 edges) and its degree histogram. Complementary reliability polynomial (dashed line is $1 - R(q)$ computed exactly) compared with the CDF for the $s_k$'s obtained through Monte Carlo simulations (yellow histogram).

FIGURE 3 Comparison of Monte Carlo simulations (using $M = 1000$) and asymptotics based upon $R_1(q)$ and $R_2(q)$ for some real-world networks. Shown are the CDFs for the $s_k$'s obtained using Monte Carlo simulations, and complementary reliability polynomials, based upon the asymptotics (3.13) and (3.14). Table 2 shows numerical values for the average number $\mu$ of deleted links to disconnect the networks, as well as 99% threshold values.

FIGURE 6 Algebraic connectivity (AC) versus average reliability (AR) for all connected graphs on 12 vertices and 18 edges with minimum degree 2. The total number of such graphs is 2 189 608, of which 7 have girth 5, 74 021 have girth 4, while the remaining 2 115 580 graphs have girth 3.

Table 1 lists $E_M$ as a function of $M$. As expected, square root scaling $E_M = O(M^{-1/2})$, typical of Monte Carlo simulations, is observed.

TABLE 1
RMSE as a function of the number of simulations M.

TABLE 2
Comparison of Monte Carlo and asymptotic estimates for selected real-world networks (see also Figure 3).

FIGURE 4 Comparison of Monte Carlo simulations and asymptotic complementary reliability polynomials for random $d$-regular graphs, with $d$ as indicated and $n = 100$.

TABLE 3
Asymptotics versus Monte Carlo results for $d$-regular graphs.

FIGURE 5 Left: Subway-type network "Singapore", with one-shell removed. Right: Comparison of Monte Carlo simulation and approximations of the complementary reliability polynomials, based upon (3.13), (3.14) and (5.23). Also shown are the averages.