Random walk hitting times and effective resistance in sparsely connected Erd\H{o}s-R\'enyi random graphs

We prove expectation and concentration results for the following random variables on an Erd\H{o}s-R\'enyi random graph $\mathcal{G}\left(n,p\right)$ in the sparsely connected regime $\log n + \log\log \log n \leq np<n^{1/10}$: effective resistances, random walk hitting and commute times, the Kirchhoff index, cover cost, random target times, the mean hitting time and Kemeny's constant. For the effective resistance between two vertices our concentration result extends further to $np\geq c\log n, \; c>0$. To achieve these results, we show that a strong connectedness property holds with high probability for $\mathcal{G}(n,p)$ in this regime.


Overview & Results
We calculate the effective resistance R(i, j) between two vertices i, j of G(n, p), the distribution over n-vertex simple labelled graphs generated by including each edge independently with probability p. Exploiting the strong connection between electrical networks and random walks (an outline of this connection is given in Sections 2.1 & 2.4), we then deduce random walk hitting and commute times, denoted h(i, j) and κ(i, j) respectively; these are the expected time taken for a random walk from i ∈ V to first visit j ∈ V, and then also to return to i in the case of κ(i, j). In addition we obtain results for a range of other graph indices on G(n, p). One of these indices is the Kirchhoff index, K(G), which is the sum of all effective resistances in the graph [5,16]. The other indices studied here are random target times H_i(G), the mean hitting time T(G), Kemeny's constant H(G), and cover costs cc_i(G), cc(G). These are sums of hitting times weighted by combinations of stationary or uniform distributions over vertices. The indices H(G), H_i(G) arise in the study of random walks and Markov chain mixing [1,20], cc_i(G) can be used to bound the cover time of a random walk [15,16], and the expected running time of Wilson's algorithm on a connected graph G is O(T(G)) [27]. For definitions of these quantities see Section 2. There are a number of results in the literature concerning quantities related to random walks on Erdős-Rényi graphs; some of the work most relevant to the results presented here can be found in [5,18,22,26]. Our results extend or complement some or all of the results in each of these papers, as outlined in Section 1.2. Many of the results in the literature rely on exploiting connections between various random walk related quantities and spectral statistics of the graph.
In this paper we do not employ spectral methods; the results we achieve hold for G(n, p) close to the connectivity threshold, where it is hard to obtain good estimates of the relevant spectral statistics of G(n, p).
Throughout we take G ∼_d G(n, p) to mean G is distributed according to the law of G(n, p). Let C := C_n be the event that G ∼_d G(n, p) is connected. Let a(n), b(n) : ℕ → ℝ; for ease of presentation we use the notation $a(n) \overset{O}{=} b(n)$ to denote

$a(n) = \left(1 \pm O\!\left(\frac{\log n}{np\log(np)}\right)\right) b(n).$
Theorem 1.1 concerns moments of the above graph indices on G(n, p) conditioned to be connected. This conditioning is to ensure the expectation is bounded.
Theorem 1.1. Let G ∼_d G(n, p) with log n + log log log n ≤ np ≤ n^{1/10}. Then for any i, j ∈ V(G) where i ≠ j,

For some of the indices, such as R(i, j) and K(G), tighter lower bounds than those stated above can be obtained from the proof of Theorem 1.1, which is located in Section 4. Concentration for many of these quantities is a consequence of the bounds on their moments.

Theorem 1.2. Let G ∼_d G(n, p) with log n + log log log n ≤ np ≤ n^{1/10}, and let f(n) : ℕ → ℝ⁺. Then for X ∈ {h(i, j), κ(i, j), K(G), H_i(G), H(G), T(G), cc_i(G), cc(G)} and i, j ∈ V with i ≠ j,

In particular, by choosing f(n) = log log(np) above we see that these random variables concentrate in a sub-mean interval with high probability. Theorems 1.1 and 1.2 are valid only for np ≤ n^{1/10}; however, concentration for all of the aforementioned random variables has been established for np above this range. The original contribution of this paper is determining expectation and concentration close to the connectivity threshold np = log n; see the literature review in Section 1.2 for more details.
As will be seen in Section 2.1, the graph indices in Theorems 1.1 and 1.2 are determined by effective resistances. Our approach is to control the effective resistances and in turn use these to control the other quantities. We must now clarify some notation.
For a graph G let d(i, j) be the graph distance between i, j ∈ V and define

$\Gamma_k(i) := \{j \in V : d(i,j) = k\}, \qquad \gamma_k(i) := |\Gamma_k(i)|, \qquad B_k(i) := \{j \in V : d(i,j) \leq k\}, \quad (1)$

which are the k-th neighbourhood of i, the size of the k-th neighbourhood and the ball of radius k centred at i respectively. Throughout we write f(n) = ω(g(n)) if for any K ∈ ℝ there exists some N_0 ∈ ℕ such that f(n) ≥ K|g(n)| for all n ≥ N_0. The next theorem shows that with high probability the main contribution to the effective resistance R(i, j) between vertices i, j ∈ V comes from the flow through edges connecting i and j to their immediate neighbours.

Theorem 1.3. Let G ∼_d G(n, p) with c log n ≤ np ≤ n^{1/10}, c > 0. Then for i, j ∈ V, i ≠ j,

(i) $\mathbb{P}\left(\left|R(i,j) - \left(\frac{1}{\gamma_1(i)} + \frac{1}{\gamma_1(j)}\right)\right| > \max\left\{\frac{1}{\gamma_1(i)^2} + \frac{1}{\gamma_1(j)^2},\ \frac{9(\gamma_1(i)+\gamma_1(j))\log n}{\gamma_1(i)\gamma_1(j)\,np\log(np)}\right\}\right) \leq 2np^2 + o\!\left(e^{-np/4}\right).$
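As a quick numerical illustration of Theorem 1.3 (i), one can sample a graph well above the connectivity threshold and compare R(i, j) with 1/γ_1(i) + 1/γ_1(j). The sketch below is ours (the choices of n, p and seed are illustrative, not values from the paper); it uses the standard Laplacian pseudoinverse formula R(i, j) = L†_{ii} + L†_{jj} − 2L†_{ij}.

```python
import numpy as np

def effective_resistance(adj, i, j):
    """R(i, j) via the Moore-Penrose pseudoinverse of the graph Laplacian."""
    L = np.diag(adj.sum(axis=1)) - adj
    Ld = np.linalg.pinv(L)
    return Ld[i, i] + Ld[j, j] - 2 * Ld[i, j]

# Sample an Erdos-Renyi graph well above the connectivity threshold
# (n * p = 20 >> log n ~ 6, so the graph is connected w.h.p.).
rng = np.random.default_rng(0)
n, p = 400, 0.05
upper = rng.random((n, n)) < p
adj = np.triu(upper, k=1)
adj = (adj | adj.T).astype(float)

i, j = 0, 1
R = effective_resistance(adj, i, j)
approx = 1 / adj[i].sum() + 1 / adj[j].sum()
print(R, approx)  # close, with relative error of order 1/(n*p) here
```

With degrees concentrated around np = 20, the two values agree up to a few percent, in line with the error term of Theorem 1.3.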
(ii) If np = c log n for c > 0 then for any k > 0,

$\mathbb{P}\left(\left|R(i,j) - \frac{2}{c\log n}\right| > \frac{10}{c^2\log(n)\log\log(n)}\right) \leq \frac{5}{(\log n)^k}.$
(iii) If np = ω(log n) then From the definition of the effective resistance between two vertices i, j ∈ V (G), see (17) below, one observes that the contribution to R(i, j) from each edge in the graph is quadratic in the amount of flow passing through that edge. The main work in this paper is to show that there are many edge disjoint paths from each first neighbour of i to the first neighbours of j. If this is the case then flow divides up between the edges outside of the first neighbourhoods in such a way that the contribution to the effective resistance from these edges is negligible.
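The flow-splitting heuristic above can be seen concretely in a toy network of k edge-disjoint paths of length l between two vertices: each path carries flow 1/k, each edge contributes (1/k)² to the energy, and so R = l/k. A minimal sketch (the construction and parameter choices are ours, for illustration only):

```python
import numpy as np

# k disjoint paths of length l between s and t give R(s, t) = l / k:
# the unit flow splits as 1/k per path and each edge dissipates (1/k)^2.
def parallel_paths_resistance(k, l):
    n = 2 + k * (l - 1)  # s, t, and l - 1 interior vertices per path
    A = np.zeros((n, n))
    s, t = 0, 1
    v = 2
    for _ in range(k):
        prev = s
        for _ in range(l - 1):
            A[prev, v] = A[v, prev] = 1
            prev, v = v, v + 1
        A[prev, t] = A[t, prev] = 1
    L = np.diag(A.sum(axis=1)) - A
    Ld = np.linalg.pinv(L)
    return Ld[s, s] + Ld[t, t] - 2 * Ld[s, t]

print(parallel_paths_resistance(10, 4))  # 4 / 10 = 0.4
```

This is the mechanism behind Theorem 1.3: once many edge-disjoint paths exist between the first neighbourhoods, the interior contributes only O(l/k) to the resistance.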
To make this idea precise we formulate the strong k-path property, Definition 3.2, and in Lemma 3.3 provide an upper bound on effective resistance for any graph which satisfies the strong k-path property. This bound may potentially be applied to other classes of graphs. In this paper we focus on Erdős-Rényi graphs and in Lemma 3.7 we show that for some k the strong k-path property holds with high probability in the sparsely connected regime.
Bollobás & Thomason [4, Theorem 7.4] showed that the threshold for having minimum degree k(n) coincides with the threshold for having at least k(n) vertex-disjoint paths between any two vertices. Let paths_2(i, j, l) be the maximum number of paths of length at most l between vertices i and j of G that are vertex disjoint on V \ (B_1(i) ∪ B_1(j)). The strong k-path property can be used to prove a related "local first neighbourhood relaxation" of this statement for two vertices.

Theorem 1.4. Let G ∼_d G(n, p) with c log n ≤ np ≤ n^{1/10}, c > 0, and l := log n/log(np) + 9. Then for i, j ∈ V where i ≠ j,

(i) $\mathbb{P}\left(\mathrm{paths}_2(i,j,l) \neq \min\{\gamma_2(i), \gamma_2(j)\}\right) \leq 5n^3p^4 + o\!\left(e^{-7\min\{np,\log n\}/2}\right),$

(ii) $\mathbb{P}\left(\left|\mathrm{paths}_2(i,j,l) - (np)^2\right| > 3(np)^{3/2}\sqrt{\log np}\right) = o\!\left(1/np\right).$
It is of note that unlike Bollobás & Thomason's result, Theorem 1.4 (i) is a statement about the paths between two given vertices rather than a global statement. In fact P(paths_2(i, j, l) = min{γ_2(i), γ_2(j)} for all {i, j} ⊂ V) = 0, as there are many pairs of vertices at distance one from each other. If one wishes to prove a similar relaxed connectivity condition on the whole graph a more sophisticated statement is needed; this is work in progress by the author.

Literature & Background
As noted above, many results in the literature on random walk indices arise from connections with spectral theory. To discuss these results we must first clarify some definitions. Let A be the adjacency matrix of a graph G and let D be the diagonal matrix with D_{i,i} = γ_1(i) and D_{i,j} = 0 for i ≠ j. The combinatorial Laplacian L is defined as L := D − A.
Let L†(G) denote the Moore-Penrose pseudoinverse of L(G). This is a generalisation of the matrix inverse; see [24] for more details.
Boumal & Cheng [5] exploit an expression for the Kirchhoff index K(G) in terms of the trace of L†(G) to obtain expectation and concentration for K(G) on G(n, p) with np = ω((log n)^6). We now outline a related expression for K(G) and explain how it can also be used with spectral statistics to control K(G). Let 0 = λ_1 ≤ λ_2 ≤ ··· ≤ λ_n be the eigenvalues of L(G), where G is a finite connected graph. Then by the matrix tree theorem [16]:

$K(G) = n \sum_{i=2}^{n} \frac{1}{\lambda_i}. \quad (2)$

A theorem of Coja-Oghlan [8, Theorem 1.3] states that if G ∼_d G(n, p) with np ≥ C_0 log n for sufficiently large C_0 then the non-zero eigenvalues of L(G) concentrate around the mean. Combining these estimates with (2) yields concentration for K(G), and with extra work the leading order term of E[K(G) | C] can be determined when np ≥ C_0 log n. It is of note however that Boumal & Cheng obtain second order terms for E[K(G) | C], which is not possible with the latter method. Theorems 1.1 and 1.2 give expectation and concentration for K(G) also when np ≥ log n + log log log n.
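Identity (2) can be checked numerically on a small graph. The sketch below (our own example, the complete graph K_4) computes K(G) both as the sum of pairwise effective resistances via L† and via the non-zero Laplacian spectrum; for K_n both equal n − 1.

```python
import numpy as np

# Check K(G) = sum of pairwise effective resistances = n * sum(1/lambda_i)
# over the non-zero Laplacian eigenvalues, on the complete graph K4.
A = np.ones((4, 4)) - np.eye(4)
L = np.diag(A.sum(axis=1)) - A
Ld = np.linalg.pinv(L)

# Kirchhoff index as a sum of effective resistances over pairs i < j.
K_res = sum(Ld[i, i] + Ld[j, j] - 2 * Ld[i, j]
            for i in range(4) for j in range(i + 1, 4))

# Kirchhoff index via the spectrum of L (eigenvalues of K4: 0, 4, 4, 4).
eig = np.linalg.eigvalsh(L)
K_spec = 4 * sum(1 / l for l in eig if l > 1e-9)

print(K_res, K_spec)  # both equal n - 1 = 3 for K_n with n = 4
```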
Löwe & Torres [22] obtain concentration results for H(G), H_i(G) and κ(i, j) on G(n, p): Kemeny's constant, random target times and commute times respectively. Again, the results come from expressions for these quantities in terms of the eigenvectors and eigenvalues of the transition matrix of the simple random walk; these expressions can be found in [21]. Löwe & Torres then apply results from Erdős et al. [11,12] to bound from above the reciprocal of the spectral gap. They require np = ω((log n)^{C_0}) for some sufficiently large C_0 > 0, as this is needed to apply the results in [11,12]. Theorems 1.1 and 1.2 extend these results to the range np ≥ log n + log log log n.
Von Luxburg, Radl & Hein [26, Theorem 5] bound the deviations of h(i, j)/2|E| and κ(i, j)/2|E| from 1/γ_1(j) and 1/γ_1(i) + 1/γ_1(j) respectively, for non-bipartite graphs, in terms of the reciprocal of the spectral gap and the minimum degree of G. They then apply these bounds to various geometric random graphs. The issue with applying these bounds to Erdős-Rényi graphs is that one must bound from above the reciprocal of the spectral gap, so a lower bound on the spectral gap is required. This appears to be a very hard problem, and to the author's knowledge the state of the art in eigenvalue separation for G(n, p) is given by the papers [11,12]. So, as with the Löwe & Torres result, if we wish to apply these bounds to get concentration for h(i, j) and κ(i, j) in G(n, p), we must assume np = ω((log n)^{C_0}) for some sufficiently large C_0. Theorem 1.2, however, provides concentration results for h(i, j) and κ(i, j) when log n + log log log n ≤ np ≤ n^{1/10}.
In [18] Jonasson studies the cover time, the expected time to visit all vertices from the worst start vertex, for G(n, p). He bounds the cover time by showing that effective resistances and hitting times on G(n, p) concentrate in the regime ω(log n) = np ≤ n^{1/3}. Jonasson does not use spectral methods and instead achieves an upper bound on the effective resistance by finding a suitable flow. This is the approach we also take; however, we use a refined analysis and extend Jonasson's results for hitting times to the case np ≥ log n + log log log n and for effective resistance to the case np ≥ c log n, c > 0.
It is worth noting that the cover time has since been determined for all connected G(n, p) by Cooper & Frieze [9] using the first visit time lemma and mixing time estimates. One cannot deduce much about the individual hitting times h(i, j) from this result. The question we address in this paper is: "what does a typical hitting time look like?"

Random walks on graphs and related indices
Throughout we will be working on a finite simple connected graph G = (V, E) with |V| = n and |E| =: m. Let X := (X_t)_{t≥0} be the simple random walk on G.
The hitting time h(i, j) is the expected time for X to hit vertex j when started from vertex i.
Let π(u) = γ_1(u)/2m be the mass of u ∈ V with respect to the stationary distribution of the simple random walk X on G. We then define two indices for j ∈ V: the index H_j(G), known as the random target time to j, and H(G), known as Kemeny's constant, see [1,20]. Kemeny's constant is independent of the vertex i, see [21, Eq. 3.3]. Let T(G) be the mean hitting time of G, see [1,20,27]. Let R(i, j) be the effective resistance between two vertices i, j ∈ V with unit resistances on the edges; this is formally defined in Section 2.4. The following sum of resistances is known as the Kirchhoff index, see [5,16],

$K(G) := \sum_{\{i,j\} \subset V} R(i,j). \quad (7)$

The cover cost cc_i(G) of a finite connected graph G from a vertex i was studied in [15,16]. We also introduce the uniform cover cost cc(G). The hitting times h(i, j) can be far from symmetric; see the example of the lollipop graph [21]. The commute time κ(i, j) is the expected number of steps for a random walk from i to reach j and then return back to i. The commute time κ(i, j) is symmetric and related to hitting times and effective resistances by the commute time formula [25]

$\kappa(i,j) = h(i,j) + h(j,i) = 2mR(i,j). \quad (9)$

Using (9) we can relate the uniform cover cost to the Kirchhoff index. The following relation for hitting times is known as Tetali's formula [21]:

$h(i,j) = mR(i,j) + \frac{1}{2}\sum_{v \in V} \gamma_1(v)\left(R(v,j) - R(v,i)\right). \quad (11)$

Relations (9), (10) and (11) will be useful to us as they allow us to control commute times, cover costs and hitting times by effective resistances.
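The commute time formula (9) and Tetali's formula (11) can be verified directly on a small example by solving the linear system for hitting times. The graph (the 3-vertex path) and the variable names below are our own illustrative choices:

```python
import numpy as np

# Verify kappa(i, j) = 2 m R(i, j) and Tetali's formula on the path 0-1-2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
n = 3
deg = A.sum(axis=1)
m = A.sum() / 2
L = np.diag(deg) - A
Ld = np.linalg.pinv(L)
R = lambda i, j: Ld[i, i] + Ld[j, j] - 2 * Ld[i, j]

def hitting_time(i, j):
    """Solve h(x, j) = 1 + mean of h(neighbour, j) over neighbours, h(j, j) = 0."""
    others = [x for x in range(n) if x != j]
    P = (A / deg[:, None])[np.ix_(others, others)]
    h = np.linalg.solve(np.eye(n - 1) - P, np.ones(n - 1))
    return h[others.index(i)]

kappa = hitting_time(0, 1) + hitting_time(1, 0)
tetali = m * R(0, 1) + 0.5 * sum(deg[v] * (R(v, 1) - R(v, 0)) for v in range(n))
print(kappa, 2 * m * R(0, 1))      # commute time formula (9)
print(hitting_time(0, 1), tetali)  # Tetali's formula (11)
```

On this path h(0, 1) = 1 while h(1, 0) = 3, illustrating the asymmetry of hitting times alongside the symmetry of κ.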

Erdős-Rényi graphs
The Erdős-Rényi random graph model G(n, p) is a probability distribution over simple n-vertex graphs. Any given n-vertex graph G = (V, E) is sampled with probability

$\mathbb{P}(G) = p^{|E|}(1-p)^{\binom{n}{2} - |E|}.$

This P is the product measure over the edges of the complete graph K_n where each edge occurs independently as a Bernoulli random variable with probability 0 < p := p(n) < 1. Throughout, E will denote expectation with respect to P. Another feature of Erdős-Rényi graphs worth mentioning is that for each u ∈ V the degree of u is binomially distributed, γ_1(u) ∼_d Bin(n − 1, p), although the degrees are not independent. This model has received near constant attention in the literature since the original G(n, m) model was studied by Erdős & Rényi [13]. For more information consult one of the many books on random graphs [4,14,17]. In this paper we study the graph indices mentioned above when the graph is drawn from G(n, p), so each graph index becomes a random variable. For any of these random variables to be well defined and finite we need G to be connected. Take C := C_n to be the event that G is connected; we will drop the subscript n where it is implicit. Let P_C(·) := P(· | C) and let E_C := E[· | C] be the expectation with respect to P_C. The following theorem gives a bound on the probability of being disconnected above the np = log n connectivity threshold.
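As a sanity check of the product measure, one can enumerate all graphs on a few vertices and confirm that the weights p^{|E|}(1−p)^{C(n,2)−|E|} sum to 1; the same enumeration gives P(C) exactly for n = 3. This is a toy illustration of ours, not part of the paper's argument:

```python
from itertools import combinations, product
from math import isclose

# Enumerate all 2^3 graphs on n = 3 vertices and sum their G(n, p) weights.
n, p = 3, 0.3
pairs = list(combinations(range(n), 2))
total = 0.0
prob_connected = 0.0
for edges in product([0, 1], repeat=len(pairs)):
    k = sum(edges)
    w = p**k * (1 - p)**(len(pairs) - k)
    total += w
    if k >= 2:  # on 3 vertices, any 2 or 3 edges give a connected graph
        prob_connected += w
print(total)           # 1.0
print(prob_connected)  # equals 3 p^2 (1 - p) + p^3
```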

Probabilistic notions and tools
For a random variable X let X ∼_d Y denote X being distributed according to the law of Y.
for every x, and we use the notation for any α ≥ 1. Let Bin(n, p) denote the binomial distribution over n trials each with success probability p. We will make frequent use of the following binomial tail bound: if X ∼_d Bin(n, p), then for any a > 0,

$\mathbb{P}(X \geq np + a) \leq \exp\!\left(-\frac{a^2}{2(np + a/3)}\right).$

We also have the following closed form for moments of binomial random variables. Let X ∼_d Bin(n, p), 0 < p := p(n) < 1 and d ≥ 0 fixed; then by Theorem 2.3 we have a closed form for the d-th moment. The following is a special case of the coupling inequality.
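The tail bound above can be compared against the exact binomial tail. The following sketch (our parameter choices, purely illustrative) confirms the Bernstein-type bound numerically:

```python
from math import comb, exp

# Check P(X >= np + a) <= exp(-a^2 / (2(np + a/3))) for X ~ Bin(n, p).
def binom_upper_tail(n, p, t):
    """Exact P(X >= t) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(t, n + 1))

n, p = 200, 0.1  # so n * p = 20
for a in (5, 10, 20):
    exact = binom_upper_tail(n, p, int(n * p + a))
    bound = exp(-a**2 / (2 * (n * p + a / 3)))
    print(a, exact <= bound)  # the bound holds for each a
```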
This next Proposition is useful in combination with the lemma following it. Proof.
The lemma below gives an upper bound on the expectation of reciprocal powers of X ∼_d Bin(n, p) when p := p(n) is allowed to tend to 0. This lemma may be of independent interest, since other results in the literature appear to require p bounded away from 0.
Proof. Let f(x) := f_{a,b}(x) = (a + x)^{−b} for constants a, b > 0. The lower bound follows from Jensen's inequality since f(x) is convex for a, b > 0.
With this r we can achieve the following a priori upper bound for any b ≥ 1. By Taylor's theorem there is some ξ_n between X_n and µ_n such that, using Hölder's inequality (4) and the fact that f(x) is decreasing when x > 0, the last inequality follows by (14); this can be calculated using the binomial moment generating function or by Theorem 2.3. Hence the claim follows by (15), (16) and the properties of f_{a,b}(x).

Let Y be a random variable and f : ℝ → ℝ such that E[f(Y)] exists. Then if P(C) ≥ 1/2,
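For intuition on reciprocal moments of binomials with small p, the classical exact identity E[1/(X+1)] = (1 − (1−p)^{n+1})/((n+1)p) can be checked numerically. This identity is standard and is not the lemma above; the sketch and parameters are ours:

```python
from math import comb

# E[1/(X+1)] for X ~ Bin(n, p) has the closed form (1 - (1-p)^{n+1}) / ((n+1) p),
# which remains meaningful as p -> 0, where naive bounds on E[1/X] degrade.
def expect_reciprocal(n, p):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) / (k + 1)
               for k in range(n + 1))

n, p = 50, 0.04  # a small-np regime
lhs = expect_reciprocal(n, p)
rhs = (1 - (1 - p)**(n + 1)) / ((n + 1) * p)
print(lhs, rhs)  # the two agree
```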

Electrical network basics
There is a rich connection between random walks on graphs and electrical networks. Here we give a brief introduction in order to cover the essential notation and definitions used in the paper; consult either of the books [10,23] for an introduction to the subject. An electrical network, N := (G, C), is a graph G together with an assignment of conductances C : E(G) → ℝ⁺ to the edges of G. Our graph G is undirected and we define $\vec{E}(G) := \{\vec{xy} : xy \in E(G)\}$; this is the set of all possible oriented edges for which there is an edge in G. For i, j ∈ V(G), a flow from i to j is a function $\theta : \vec{E}(G) \to \mathbb{R}$ satisfying $\theta(\vec{xy}) = -\theta(\vec{yx})$ for every xy ∈ E(G), as well as Kirchhoff's node law at every vertex apart from i and j, i.e.

$\sum_{y : xy \in E(G)} \theta(\vec{xy}) = 0 \quad \text{for all } x \in V(G) \setminus \{i, j\}.$
A flow from i to j is called a unit flow if, in addition to the above, it has strength 1, i.e.

$\sum_{y : iy \in E(G)} \theta(\vec{iy}) = 1.$
For the network N = (G, C) we can then define the effective resistance R_C(i, j) between two vertices i, j ∈ V(G). First, for a flow θ on N let

$\mathcal{E}(\theta) := \sum_{xy \in E(G)} \frac{\theta(\vec{xy})^2}{C(xy)},$

and let R_C(i, j) be the energy E(θ) of the unit current flow from i to j; this is the energy dissipated by the current of strength 1 from i to j in N = (G, C). This current exists and is unique since we are working on a finite graph. We will work with unit conductances, so C(e) = 1 for all e ∈ E(G). When this is the case we write R(i, j) instead of R_C(i, j); this corresponds to the effective resistance in Equations (7), (9) and (11). One very useful tool is Rayleigh's monotonicity law [23, § 2.4]: if C, C′ : E(G) → ℝ⁺ are conductances on the edge set E(G) of a connected graph G and C(e) ≤ C′(e) for all e ∈ E(G), then for all pairs {i, j} ⊂ V(G) we have R_{C′}(i, j) ≤ R_C(i, j).
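The energy characterisation can be made concrete: with unit conductances, the unit current flow from i to j dissipates energy exactly R(i, j), while any other unit flow dissipates more (Thomson's principle). A sketch on a triangle (our own choice of example):

```python
import numpy as np

# On the triangle, R(0, 1) = 2/3: the unit current splits 2/3 along the
# direct edge and 1/3 through the third vertex, with energy (2/3)^2 + 2*(1/3)^2.
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
Ld = np.linalg.pinv(L)
i, j = 0, 1
R = Ld[i, i] + Ld[j, j] - 2 * Ld[i, j]

# Potentials of the unit current; the flow on edge (x, y) is phi[x] - phi[y].
phi = Ld[:, i] - Ld[:, j]
energy = 0.0
for x in range(3):
    for y in range(x + 1, 3):
        if A[x, y]:
            energy += (phi[x] - phi[y])**2

# A competing unit flow routed entirely along the direct edge 0-1.
naive_energy = 1.0**2
print(R, energy, naive_energy)  # R = energy = 2/3 < 1 = naive_energy
```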

Bounds on effective resistance
The aim of this section is to obtain lower and upper bounds on R(u, v) for u, v ∈ V (G) for a graph G where the main contribution to R(u, v) is from the first neighbourhoods of u and v. These bounds will later be applied to Erdős-Rényi random graphs.

Bounds in terms of degrees
Recall that γ 1 (v) denotes the size of the first neighbourhood of vertex v ∈ V (G). Jonasson gives the following lower bound on effective resistance.
Observe that although the above bound holds for any two distinct vertices, it is only really meaningful if they are in the same connected component, since otherwise the effective resistance between the two vertices is defined to be infinite.
We now aim to obtain an upper bound whose dominant term looks roughly like the one in Lemma 3.1. To achieve this we analyse the following modified breadth-first search (MBFS) algorithm. The MBFS algorithm outputs sets I_i and S_i which are indexed by the graph distance from {u, v}. The algorithm is similar to one used in [2, Ch. 11.5] to explore the giant component of an Erdős-Rényi graph. However, the MBFS algorithm differs from other variations of breadth-first search used in the literature in that it starts from two distinct vertices. More importantly, it also differs by removing clashes, where a clash is a vertex with more than one parent in the previous generation, as exposed by a breadth-first search from two root vertices.
Modified breadth-first search algorithm, MBFS(G, I_0): The inputs to the algorithm are a graph G and I_0 = {u, v} ⊆ V(G). At any time a vertex in V(G) will be in one of three states: live, dead or neutral. To run the MBFS algorithm on our graph G we begin with the two root vertices u, v. Declare u, v to be live and all other vertices in the graph to be neutral. We then generate the sets S_i and I_{i+1} from I_i by the following procedure:

Step 1: Given a set of live vertices I_i, declare the set of all currently neutral vertices to be S_i. Check all pairs {w, w′} where w ∈ I_i and w′ ∈ S_i, and if ww′ ∈ E(G) then add w′ to I_{i+1} and declare it live. The order in which we consider these pairs is unimportant. Finally, declare all vertices in I_i to be dead.
Step 2: For each w′ ∈ I_{i+1} count the number of w ∈ I_i such that ww′ ∈ E(G); again the order is unimportant. If this number is greater than 1, remove w′ from I_{i+1} and declare it dead.
Step 3: If there are still neutral vertices left, return to Step 1. Otherwise end.
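A minimal implementation of the MBFS procedure may help fix ideas. The paper defines MBFS only in prose; the data format, function name and example graph below are our own:

```python
from collections import defaultdict

# Sketch of MBFS: vertices are live, dead or neutral; Step 2 kills any
# newly exposed vertex with more than one live parent (a "clash").
def mbfs(adj, roots):
    """Return the generations I_0, I_1, ... with clashes removed."""
    live = set(roots)
    neutral = set(adj) - live
    generations = [sorted(live)]
    while neutral:
        S = set(neutral)  # neutral vertices at the start of this round
        # Step 1: expose edges between live and neutral vertices.
        parents = defaultdict(set)
        for w in live:
            for wp in adj[w]:
                if wp in S:
                    parents[wp].add(w)
        # Step 2: keep only vertices with exactly one parent (no clash).
        new_live = {wp for wp, ps in parents.items() if len(ps) == 1}
        neutral -= set(parents)  # clashing vertices become dead
        neutral -= new_live
        live = new_live
        generations.append(sorted(live))
        if not parents:  # remaining neutral vertices are unreachable
            break
    return generations

# Example: vertex 4 clashes (parents 2 and 3) and is removed in Step 2.
adj = {0: [2], 1: [3], 2: [0, 4, 5], 3: [1, 4], 4: [2, 3], 5: [2]}
print(mbfs(adj, [0, 1]))  # [[0, 1], [2, 3], [5]]
```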
Observe that the role of Step 2 is to remove clashes. If we were to skip it, the procedure would describe a breadth-first search starting from two root vertices. If in addition to skipping Step 2 we also started with I_0 = {u} as opposed to I_0 = {u, v}, then this would just be a standard breadth-first search from u. We define the edge sets E_j, j ≥ 0, produced by running MBFS(G, I_0). Let x ∈ I_k where I_k is produced by running MBFS(G, I_0) for some given I_0. Recall the definition (1) of Γ(x) and define, for i ≥ 0, the sets Γ*_i(x); the set Γ*_i(x) is the i-th neighbourhood of x ∈ I_k with clashes removed. Define, for some constant d, the pruned neighbourhood Φ_1(x) of x ∈ I_1.

Figure 1: This diagram illustrates the strong k-path property A^{n,k}_{u,v}, see Definition 3.2. In the example the vertex z is not in Ψ_2(u) since it is connected to fewer than d vertices in I_3, and the vertex w is not in I_2 as it has more than one parent in I_1.
Then define the pruned neighbourhoods Ψ_1(w) of w ∈ I_0, the pruned second neighbourhood Φ_2(w) of w ∈ I_0 and, for MBFS(G, {u, v}), the sets Ψ_i, the pruned versions of I_i for i = 1, 2. We prune the first neighbourhoods of vertices x ∈ I_1 to obtain Φ_1(x) so that later on, when we consider the trees induced by the union up to i of the Γ*-neighbourhoods of y ∈ Φ_1(x), we can get good control over the growth rate of the trees. We prune the first neighbourhoods of vertices w ∈ I_0 as above so that we can send flow from our source vertex w to its pruned neighbourhood Ψ_1(w) without having to worry about it getting stuck in any "dead ends".
Recall (18), the definition of the filtration F_k(G, I_0). Observe that if x ∈ I_k then Γ*_1(x) is F_{k+1}-measurable. It is worth noting, however, that if y ∈ I_1 then Φ_1(y) is F_3-measurable and not F_2-measurable, since Φ_1(y) is determined by vertices at distances 2 and 3 from I_0. A consequence of this is that for w ∈ I_0, Ψ_1(w) and Ψ_2(w) are both F_3-measurable, as they are both determined by the Φ_1-neighbourhoods of points in Γ*_1(w). We use the sets Ψ and Γ* returned from running the MBFS algorithm on a graph G in the following definitions.
Definition 3.2 (Strong k-path property). We say that a graph G on [n] := {1, . . . , n} has the strong k-path property for an integer k ≥ 0 and a pair of vertices u, v if for every pair (x, y) ∈ Ψ_2(u) × Ψ_2(v) the neighbourhoods Γ*_k(x) and Γ*_k(y) are non-empty and joined by at least one edge. Let A^{n,k}_{u,v} be the set of graphs on [n] satisfying the strong k-path property for u, v ∈ [n].

For y ∈ I_k we define the sets S_k(y), which are the neutral vertices at time k that will not cause any clashes when the Γ*-neighbourhood Γ*(y) of y is explored. The sets B^{u,v}_w for w ∈ {u, v} are also defined using the output of MBFS(G, {u, v}). The next lemma provides an upper bound on the effective resistance for graphs satisfying the strong k-path property.
Proof. We will follow the convention that 1/0 = ∞. If G ∉ B_{i,j} then the bound holds trivially, as at least one of the first two terms on the right is infinite.
We will now define a graph H which must exist as a subgraph of G whenever G ∈ A^{n,k}_{i,j} ∩ B_{i,j}. The subgraph H will be defined as a union of many subgraphs of G which are themselves described by the sets produced from running MBFS(G, {i, j}).
By the strong k-path property there is at least one edge joining a leaf of the tree T_k(x) to a leaf of the tree T_k(y) for each pair (x, y) ∈ Ψ_2(i) × Ψ_2(j). If there is more than one such edge we select one and disregard the others. Let this set of selected edges be E*. Let F be the graph with E(F) := E* and V(F) := {z : zw ∈ E*}. Thus F is a set of edges, complete with end vertices, which bridge some leaf of the tree T_k(x) to some leaf of T_k(y) for each pair (x, y) ∈ Ψ_2(i) × Ψ_2(j).
With the above definitions, the subgraph H is then the union of the pieces defined above; consult Figure 1 for more details. We will now describe a unit flow θ from i to j through the network N = (H, C) where C(e) = 1 for all e ∈ E(H). This flow will be used to bound the effective resistance R(i, j) in G from above.
Observe that one unit of flow leaves i and enters j, and we can compute the contribution to E(θ) from the flow through these edges. By the definition of Ψ_1(i), Ψ_1(j) the sets Φ_1(i_a) and Φ_1(j_b) are non-empty, so this is well defined. We see that Kirchhoff's node law is satisfied at each vertex i_a ∈ Ψ_1(i), and likewise for each j_b ∈ Ψ_1(j); the contribution to E(θ) from these edges then follows.

Figure 2: The descendants of t ∈ I_{d−2} in the tree T_k(i_{a,f}) rooted at i_{a,f}, where the notation is consistent with Step (iv) from the proof of Lemma 3.3. Here the descendants of w are shown in green and those that also have z as an ancestor are shown in red. The edges of E* and their endpoints are shown in blue.
We then assign the following flow to xy. The reason for this is that if we sum the flows leaving T_k(i_{a,f}) through the vertex set Γ*_k(i_{a,f}), this equals the amount of flow entering T_k(i_{a,f}) at the vertex i_{a,f}, and likewise for the other trees. In the next step we show that Kirchhoff's node law is satisfied at each vertex in V(F) by virtue of the assignment of flow through the trees. The inequality above follows since when G ∈ B_{i,j} we have ψ_1(i), ψ_1(j) ≥ 1.

(iv) For each wz ∈ E(T_k(i_{a,f})) we set θ(\vec{wz}) proportional to the amount of flow leaving z's descendants in the set Γ*_k(i_{a,f}), see Figure 2. If z ∈ I_d then let t be the parent of w when T_k(i_{a,f}) is rooted at i_{a,f}, and let t = i_a if w = i_{a,f}. It is very complicated to work out the contribution to E(θ) from the edges of every T_k(x) for x ∈ Ψ_2, so we give the following upper bound.
First we identify the vertices in Γ*_k(x) as a single vertex. This does not change the effective resistance, since two vertices in a tree at the same distance from the root have the same potential in the electrical current from the root to the leaves. Now we choose one non-backtracking path P_k(x) in T_k(x) from x to some vertex in Γ*_k(x), send the whole flow through this path, and bound the energy dissipated by the flow in this path.
Now we collect the contributions to E(θ) from the edges in E(H) in Steps (i)-(v) above to obtain the desired bound on R(i, j).

Neighbourhood growth bounds and the strong k-path property for G(n, p)

In the previous section we obtained Lemma 3.3, an upper bound on the effective resistance in any graph with the strong k-path property. This bound is in terms of an expression involving the pruned neighbourhoods Φ_1 and Ψ_1, defined at (20) and (21) respectively. In this section we show that the strong k-path property holds with high probability for G(n, p) in an appropriate range of p, which we call sparsely connected. To do this we must gain control over the distributions of γ*, ϕ and ψ.
A key feature of the MBFS algorithm is that clashing vertices are removed rather than being assigned a unique parent. Though this means we reduce the sizes of the neighbourhoods, removing clashing vertices in this way ensures that for MBFS on G(n, p) the conditional neighbourhood distributions below remain binomial. Let G ∼_d G(n, p) and run MBFS(G, I_0).
(iv) Let u ∈ V; then condition on γ_1(u). A vertex of S_0 is connected to neither vertex of I_0 independently with probability (1 − p)^2 for each of the n − 2 vertices in S_0; thus the first claim holds. A vertex in S_0 is in I_1 if it is connected to exactly one vertex in I_0. This happens independently with probability 2p(1 − p) for each of the n − 2 vertices in S_0; thus the second claim holds.

Item (ii): recall the definitions of Γ*_1(x) and S_k(x) for x ∈ I_k, given by (19) and (23) respectively, and observe the relation between them. Since we completely remove vertices when they clash, and the edges of G are independent, the order in which MBFS explores the neighbourhoods of each y ∈ I_k is unimportant. Assume that we have explored the neighbourhood of every y ∈ I_k with y ≠ x. We then know which vertices in the neutral set S_k will not clash if included in Γ_1(x), and these are the vertices in S_k(x). Since edges occur independently with probability p, conditioning on |S_k(x)| yields the claimed distribution. Similarly, a vertex v is counted by γ*_{i+1}(x) if there is an edge yv ∈ E for some y counted by γ*_i(x) and there is no edge of the form y′v ∈ E where y′ ∈ I_{k+i} and y′ ≠ y. These events are independent as each edge occurs independently. Thus, conditioning on |S_{k+i}|, |I_{k+i}| and γ*_i(x), we obtain the claimed distribution, and conditioning on γ_1(u) is handled in the same way, as these events are all independent.

Let x ∈ I_k. Choosing i = 0 in Lemma 3.4 (iii) gives γ*_1(x) ∼_d Bin(|S_k|, p(1 − p)^{|I_k|−1}) conditional on |S_k| and |I_k|. This appears to differ from the distribution Bin(|S_k(x)|, p) given by Lemma 3.4 (ii); however this is not the case, as can be seen by conditioning on |S_k| and |I_k|. The following branching estimates will be used to show G(n, p) has the strong k-path property w.h.p. The estimates are very similar to the bounds on neighbourhood growth obtained in [6]; however, we need far greater control of the exceptional probabilities.
Lemma 3.5 (Γ-neighbourhood bounds). Let G ∼_d G(n, p) where np = ω(log log n). Then the stated bounds hold for u ∈ V and any i ≤ log n/log(np), k > 3.

Proof. Item (i): we wish to show the claim by induction on i ≥ 0. Let H_i := {γ_i(u) ≤ a_i(np)^i} and observe that for the base case γ_0(u) = 1 = a_0. Notice that, conditional on γ_i(u), γ_{i+1}(u) is stochastically dominated by Bin(γ_i(u) · n, p). Thus by (25) above, an application of Lemma 2.2 (ii) and the inductive hypothesis (the bound on P(H^c_i)) yields the inductive step. Since a_i, np ≥ 1, the exponent of the first term is smaller than that of the second. Let λ = k√(np) for any k ≥ 0; then, since np = ω(log log n) and i ≤ log n/log(np), the error terms can be bounded. We will show that a_i ≤ 2k^2 for all i: since a_0 = 1 ≤ 2k^2, assume a_i ≤ 2k^2; then the claim follows by (25).

Item (ii): since {γ_j(u) ≤ 2k^2(np)^j for all 0 ≤ j ≤ i} ⊆ {|B_i(u)| ≤ (2k^2 + 1)(np)^i}, we have the claim by Item (i) above for np = ω(log log n), i ≤ log n/log(np) and u ∈ V.

Lemma 3.6 (Γ*-neighbourhood lower bounds). Let G ∼_d G(n, p) and let i ∈ ℤ satisfy (26). Let Ψ_2 be defined with respect to MBFS(G, {u, v}) for some given u, v ∈ V.
(i) If np ≥ c log n for any fixed c > 0, (ii) if np = ω(log n) then for any fixed K > 0, and (iii) if np ≥ log n + log log log n then for any 5 ≤ i ≤ ⌊log(n)/log(np)⌋ − 5, the respective stated bounds hold.

Proof. We first set up the general framework for a neighbourhood growth bound and then apply this bound under different conditions to prove Items (i), (ii) and (iii). Run MBFS(G, {u, v}) and let y ∈ I_h, n_i := |S_{i+h}|, p_i := p · (1 − p)^{|I_{i+h}|−1} and r_i the product of n_j p_j over i_0 ≤ j ≤ i. We wish to show that there exists some i_0 ∈ ℤ such that the growth bound holds for all i ≥ i_0, where a_i satisfies a_{i+1} = a_i − λ√(a_i)/√(r_i) for some initial value a_{i_0} we will find later. Observe

Now by Lemma 2.2 (i) and the inductive hypothesis
The above always holds; however, it may be vacuous, as a_i may be negative if i is too large. This can also happen for an incorrect choice of the starting time i_0 and initial value a_{i_0}. We address this in the application, making sure to condition on events where everything is well defined. In this spirit let l := ⌊log(n)/log(np)⌋ − h − 1 and define the event D. Conditioning on the event D and the filtration F_{i+h} for any i ≤ l ensures Bin(n_i, γ*_i(y)p_i) is a valid probability distribution and n_i p_i = (1 − o(1))np. The claim then follows by Lemma 3.5 with k = 6.
Recall the definition (24) of B^{u,v}_w and B^{u,v} for w ∈ {u, v} ⊂ V. Let G ∼_d G(n, p) and define the following events. We are now in a position to show that the strong k-path property holds in sparsely connected Erdős-Rényi graphs with high probability.
On the event T_1, when MBFS(G, {u, v}) has run for k + 2 iterations there is still a lot of the graph yet to explore, and the algorithm will run for at least one more iteration. The k in the definition of T will be the one occurring in A^{n,k}_{u,v}. Set the value of k to be k := k(n, p) = log 4n (15)² /2 log(np) + 1 if np = c log n where c > 0 log 400n Notice k ≤ log n/(2 log(np)) + 2; it remains to show P(G ∉ A^{n,k}_{u,v}) = o(e^{−7 min{np, log n}/2}) for k given by (30). Provided np ≤ n^{1/10}, this choice of k satisfies (26) in Lemma 3.6. Let Since ψ_2(u) ≤ γ_2(u) for any u ∈ V, an application of Lemma 3.5 with k = 6 yields We have the following by the tower property and the bound (31) for P_C(R^c) By Lemmas 3.5, 3.6 (i) and 3.6 (ii): The bound P(γ*_k(w) < 2n^{1/2} for some w ∈ Ψ_2) ≤ e^{−4 min{np, log n}} comes from an amalgamation of Lemmas 3.6 (i) and 3.6 (ii), where we have chosen K = 4 in Lemma 3.6 (ii). This is so that we can cover the different values of np with one bound.
By (31), (32) and the bound on P({G ∉ A^{n,k}_{u,v}} ∩ R ∩ T) directly above: For P((B^{u,v})^c), use Lemma 2.4 to bound the difference between the ψ- and γ*-distributions: Then, since P(ψ_1(u) = γ*_1(u)) is known by Lemma 3.9, we have Applying Lemma 3.4 (ii) to the first term and Lemma 3.5 (i) with k = 4 to the second: When conditioning on the event A^{n,k}_{u,v} to apply the effective resistance bound from Lemma 3.3, we normally condition instead on A^{n,k}_{u,v} ∩ B^{u,v}. This is because G ∈ A^{n,k}_{u,v} is fairly meaningless if G ∉ B^{u,v}. However, we have kept the bounds on P(B^{u,v}) and P(A^{n,k}_{u,v}) separate in the lemma above, as it is sometimes necessary to condition on something stronger than the event B^{u,v}. The bound on R(u, v) for G ∈ A^{n,k}_{u,v}, Lemma 3.3, is sensitive to the Ψ-neighbourhoods being empty, and so we will also need the following crude but resilient bound on effective resistance in connected Erdős-Rényi graphs when calculating errors.
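The "crude but resilient" bound in question is presumably the standard comparison with graph distance: by Rayleigh monotonicity, deleting every edge outside a fixed shortest u–v path can only increase R(u, v), and what remains is d(u, v) unit resistors in series. Hence, for any connected graph on n vertices,

```latex
\[
  R(u,v) \;\le\; d(u,v) \;\le\; n-1 .
\]
```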
The next lemma, in combination with Lemma 2.4, will allow us to gain control over the Ψ_1- and Φ_1-neighbourhood distributions in G(n, p) by relating them to the Γ*-neighbourhood distributions, which are known by Lemma 3.4.
The law of total expectation and the harmonic series sum yield
Item (ii): recall A^n_{u,v} := {there exists some k ≤ log n/(2 log np) + 2 such that G ∈ A^{n,k}_{u,v}}, thus Item (iii): Let H be the event {ϕ_1(x) = γ*_1(x) for all x ∈ I_1} ∈ F_3 and define Recall that ψ_1(u) ⊂ I_1 for u ∈ I_0, and switch between the ϕ- and γ*_1-distributions on the event H:

Now by the tower property and the definition of H we have
Applying the union bound, since I_1 ∈ F_1, yields Let a := 4/min{c, 1}, where c > 0 is any fixed positive real number such that np ≥ c log n. Separate the expectations into the parts |I_1| ≤ 4a²np and |I_1| > 4a²np: Since γ*_1(x) ∼ Bin(|S_1(x)|, p) by Lemma 3.4, S_1(x) ∈ F_2, and by Lemma 3.9 (i), we have Applying Lemma 2.2 to the first term and Lemma 3.5 (i) with k = a to the last yields Separating the expectation into the parts |S_1(x)| ≤ n − 66(np)² and |S_1(x)| > n − 66(np)²: Rearranging the first term and applying Lemma 3.5 (ii) with k = 4 to the middle term: Recall that sup_{x ∈ Ψ_1(u)} 1_{B^{u,v}_u}/ϕ_1(x) < 1/d; see (20) & (21). By Bernoulli's inequality (3): Note that the bound (43) on P_u holds for any np ≥ c log n with c > 0 fixed. The restriction to np ≥ log n comes from (44), where we need P(C) bounded below by a constant. Item (iv): conditioning on the event A^{n,k}_{u,v} and applying Lemma 3.3 yields .
as K_p ≥ np/9 for large n. Bounds on P_u from (43) and on P(A^{n,k}_{u,v}) If G ∈ C then there is a path of length at most n − 1 between any i, j ∈ V. Since effective resistance is bounded above by graph distance, for all i, j ∈ V we have the bound Proof of Theorem 1.1 (i) Proof of E_C[R(i, j)]. We will partition Ω into the disjoint sets C_1 := A^{n,k}_{i,j} ∩ B^{i,j} and (C_1)^c. First we apply the bound on resistance from Lemma 3.3 to bound E_C[R(i, j)1_{C_1}]: By Lemma 3.10 (i), the first term in the sum is 1/(np) + O(log n/((np)² log(np))). To bound the second term, start by pulling sup 1/ϕ(a) out of the sum over a ∈ Ψ_1(x): Using Hölder's inequality (4) on the product of random variables in the expectation gives Upper bounds for each of the expectation terms can be found in Lemma 3.10, yielding .
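The version of Hölder's inequality (4) needed for a product of three non-negative random variables is presumably the generalized form with exponents (3, 3, 3), whose reciprocals sum to one:

```latex
\[
  \mathbb{E}[XYZ] \;\le\;
  \bigl(\mathbb{E}[X^{3}]\bigr)^{1/3}
  \bigl(\mathbb{E}[Y^{3}]\bigr)^{1/3}
  \bigl(\mathbb{E}[Z^{3}]\bigr)^{1/3},
\]
```

so each factor can be estimated separately by Lemma 3.10.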
Combining the estimates on E above with the bound on E_C[1_{B^{i,j}} Σ_x 1/ψ_1(x)] from Lemma 3.10 (i): .
When np ≥ c log n and c > 3, we have the following for E_C[R(i, j)1_{(C_1)^c}], by first applying the effective resistance bound (46) and then the bounds on P(C_1^c) from Lemma 3.7: If log n + log log log n ≤ np ≤ 3 log n then we further partition using S_{i,j} from (45) to obtain

The upper bound follows as
a = 3√(log n) if np = ω(log n) and a = 3√(log log n) if np = O(log n). Then applying Lemma 3.1 and 1 ≥ 1_D yields when i ≠ j. Since P_C(D^c) ≤ P(D^c)/P(C), and bounding P(D^c) by Lemma 2.2, we have Proof of E_C[h(i, j)]. We have the following expression for hitting times from (11): when i ≠ j, by symmetry. We will calculate E_C[γ_1(u)R(i, j)] and apply Let M be the event {γ_1(u) ≤ 5np for all u ∈ V}. Then for each {i, j} ⊂ V partition Ω into We will now upper bound E_C[γ_1(u) · R(i, j) · 1_{C_1}] using Hölder's inequality (4). This is almost identical to the calculation for E_C[R(i, j) · 1_{C_1}]; see (47). However, we also use (13) to give bounds of the form E[γ_1(u)^α] = (np)^α + O((np)^{α−1}), where α ∈ Z^+. We have .
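For reference, the hitting-time expression (11) is Tetali's formula; writing d(u) = γ_1(u) for the degree of u, it reads

```latex
\[
  h(i,j) \;=\; \frac{1}{2}\sum_{u \in V} d(u)\,
  \bigl(R(i,j) + R(u,j) - R(u,i)\bigr),
\]
```

which is why terms of the form E_C[γ_1(u)R(i, j)], with u possibly appearing in the resistance arguments, are the quantities to estimate.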
When np ≥ c log n and c > 3, for the expectation on C_2 := C_1^c ∩ M we apply the effective resistance bound (46) and γ_1(u)1_M ≤ 5np, then bound P(C_1^c) by Lemma 3.7, yielding If log n + log log log n ≤ np ≤ 3 log n then we further partition using S_{i,j} from (45) to obtain Since P_C(M^c) ≤ n · exp(−3 · 4²np/8)/P(C) = o(1/n^5) by Lemma 2.2, we have Combining the expectations over C_1, C_2 and C_3 yields the following for any u, i, j ∈ V, i ≠ j Let D be the event where a = 3√(log log n) if np = O(log n). Then by Lemma 3.1 and 1 ≥ 1_D: when i ≠ j. Since P_C(D^c) ≤ P(D^c)/P(C), and bounding P(D^c) by Lemma 2.2, we have Summing (48) and (49) over u ∈ V yields the required bounds for Recall that for functions a(n), b(n) we use Proof of E_C[κ(i, j)]. This follows from the result for E_C[h(i, j)], as by (9) we have Proof of Theorem 1.1 (ii) We will use linearity of expectation to express the expectations of these indices in terms of quantities we have already calculated. The bounds for E_C[R(i, j)] in Theorem 1.1 (i) hold for all {i, j} ⊆ V. Hence by (7) we have The bounds for E_C[h(i, j)] in Theorem 1.1 (i) hold for all i, j ∈ V, i ≠ j. So by (8) we have The bounds for E_C[κ(i, j)] in Theorem 1.1 (i) hold for all {i, j} ⊆ V. Thus by (10) we have Proof of Theorem 1.1 (iii) Proof of E_C[K(G)²]. Observe that by (7) we have For each pair {i, j}, {w, z} ⊂ V partition Ω into the following disjoint sets The effective resistance bound from Lemma 3.3 yields By removing sup 1/ϕ(a) from the sums over a ∈ Ψ_1(x), Ψ_1(y), and by symmetry, we have that E := E_C[R(i, j)R(w, z)1_{C_1}] is bounded from above by Then applying Hölder's inequality (4) and substituting like terms yields . Now applying the estimates in Lemma 3.10 to the expectations above we obtain .
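The commute-time identity κ(i, j) = 2mR(i, j) behind (9) can be checked exactly on a toy graph. The sketch below (all function names are ours, not the paper's) solves the first-step equations for hitting times and computes the effective resistance from the grounded Laplacian, in exact rational arithmetic; the test graph is a triangle with a pendant edge.

```python
from fractions import Fraction

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]  # triangle 0-1-2 with pendant edge 2-3
n, m = 4, len(edges)
adj = {v: [] for v in range(n)}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)
deg = {v: len(adj[v]) for v in range(n)}

def solve(A, b):
    """Gaussian elimination over the rationals."""
    k = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(k):
        r0 = next(r for r in range(c, k) if M[r][c] != 0)
        M[c], M[r0] = M[r0], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(k):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[r][k] for r in range(k)]

def hitting(i, j):
    """h(i, j): solve h(j,j) = 0, h(v,j) = 1 + sum_{w ~ v} h(w,j)/deg(v)."""
    idx = [v for v in range(n) if v != j]
    pos = {v: r for r, v in enumerate(idx)}
    A = [[Fraction(0)] * len(idx) for _ in idx]
    for v in idx:
        A[pos[v]][pos[v]] += 1
        for w in adj[v]:
            if w != j:
                A[pos[v]][pos[w]] -= Fraction(1, deg[v])
    return solve(A, [Fraction(1)] * len(idx))[pos[i]]

def resistance(i, j):
    """R(i, j): ground j, push unit current in at i; R is the potential at i."""
    idx = [v for v in range(n) if v != j]
    pos = {v: r for r, v in enumerate(idx)}
    A = [[Fraction(0)] * len(idx) for _ in idx]  # Laplacian with row/col j removed
    b = [Fraction(0)] * len(idx)
    for v in idx:
        A[pos[v]][pos[v]] += deg[v]
        for w in adj[v]:
            if w != j:
                A[pos[v]][pos[w]] -= 1
    b[pos[i]] = Fraction(1)
    return solve(A, b)[pos[i]]

R = resistance(0, 3)
kappa = hitting(0, 3) + hitting(3, 0)
assert R == Fraction(5, 3)   # series/parallel: 1*2/(1+2) + 1
assert kappa == 2 * m * R    # commute-time identity
```

Here R(0, 3) = 2/3 + 1 = 5/3 by the series/parallel laws, so the identity forces h(0, 3) + h(3, 0) = 2 · 4 · 5/3 = 40/3, which the linear-system computation confirms.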
When np ≥ c log n and c > 3, we have the following for the expectation on C_2, by first applying the effective resistance bound (46) and then the bounds on P(C_1^c) from Lemma 3.7: If log n + log log log n ≤ np ≤ 3 log n then we further partition using S_{i,j} from (45) to obtain E_C[R(i, j)R(w, z)1_{C_2}(1_{S_{i,j} ∩ S_{w,z}} + 1_{(S_{i,j} ∩ S_{w,z})^c})] ≤ (3 log(n)/log(np))² P(C_2)/P(C) for i ≠ j, w ≠ z. Now, since P_C(D^c) ≤ P(D^c)/P(C), bounding P(D^c) by Lemma 2.2 The result follows from the above bounds and (50). Proof of E_C[h(i, j)²]. If we use Tetali's formula (11) and expand E_C[h(i, j)h(i, a)], we obtain the following for any i, j, a ∈ V: To see the above, observe that R(a, b)R(c, d) = 0 if and only if a = b or c = d. Thus only the first term, g(i, j, i, a), will always be non-zero. All the other terms contain one or more inputs from {u, v} and so will be zero at different times. Of the eight other terms there are two positive and two negative terms containing one of {u, v}, then two positive and two negative terms containing both u and v as inputs. Thus, by symmetry, when the sums are expanded everything apart from the first term g(i, j, i, a) cancels.
By removing sup 1/ϕ(a) from the sums and reducing by symmetry we have Then applying Hölder's inequality (4) and collecting similar terms we obtain . Now applying the estimates in Lemma 3.10 to the expectations above yields .
Since P_C(M^c) ≤ exp(−3 · 6²np/18)/P(C) = o(1/n^6) by Lemma 2.2, we have Combining expectations over C_1, C_2 and C_3 Let D be the event where a = 3√(log log n) if np = O(log n) and a = 3√(log n) if np = ω(log n). By Lemma 3.1: for i ≠ j, w ≠ z. The bound on P_C(D^c) is by Lemma 2.2. Combining (51)–(53) yields for any i, j, w, z ∈ V, i ≠ j, w ≠ z. Thus we have the result for E_C[h(i, j)²].
Proof of E_C[cc_i(G)²]. This follows from (54) above, as by the definition (8) of cc_i(G), Proof of Theorem 1.1 (iv) Recall the definitions (5), (6) for i ∈ V: where m := |E| ∼ Bin((n choose 2), p). Let h := (n choose 2) − 1 and m* ∼ Bin(h, p). Then we have the following for any given k ∈ Z, k ≥ 1, using Proposition 2.5 and the fact that C ⊂ {m ≥ 1}: Observe that by (12), P(C^c) ≤ O(log n/(np log(np))) whenever np ≥ log n + log log log n. Using Lemma 2.6 to bound the expectation term we have . By Bernoulli's inequality (3), for any given a, k ∈ Z, a, k ≥ 1, we have E_C[1/m^k]^{1/a} = (2^{k/a}/(n^{2k/a}p^{k/a}))(1 + O(log n/(np log(np))))^{1/a} ≤ 2^{k/a}/(n^{2k/a}p^{k/a}) + O(log n/(n^{2k/a+1}p^{k/a+1} log(np)))
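Bounds of the Lemma 2.6 type presumably rest on the classical identity E[1/(1 + X)] = (1 − (1 − p)^{h+1})/((h + 1)p) ≤ 1/((h + 1)p) for X ∼ Bin(h, p). This is easy to verify exactly for small parameters (function names below are ours):

```python
from fractions import Fraction
from math import comb

def expect_inv_one_plus_bin(h, p):
    # E[1/(1+X)] for X ~ Bin(h, p), summed term by term
    return sum(Fraction(comb(h, k)) * p**k * (1 - p)**(h - k) / (k + 1)
               for k in range(h + 1))

def closed_form(h, p):
    # classical closed form: (1 - (1-p)^(h+1)) / ((h+1) p)
    return (1 - (1 - p)**(h + 1)) / ((h + 1) * p)

p = Fraction(1, 3)
for h in range(1, 12):
    assert expect_inv_one_plus_bin(h, p) == closed_form(h, p)
    assert closed_form(h, p) <= Fraction(1, h + 1) / p  # bound used for E_C[1/m]
```

With m ∼ Bin((n choose 2), p) this gives E[1/m] of order 2/(n²p) on the event {m ≥ 1}, matching the shape of the estimate above.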

Now by the Bernoulli inequality
.
Using Hölder's inequality to break up the product of random variables in the expectation: Then applying (13), (55) and the upper bound on E_C[h(i, j)²] from Theorem 1.1 (iii) yields The same upper bounds for E_C[H_i(G)] and E_C[H(G)] follow similarly. By (11) we have for G connected. As G is connected, the effective resistance bound, Lemma 3.1, yields Rearranging and reducing the sums using the bound γ_1(i)/(γ_1(i) + 1) ≤ 1, we have Manipulating the sums and bounding terms in a similar manner yields Again, by a similar procedure, we have the following for the random target time H_i(G) Let D be the event {m ≥ n²p/2 − a√(n²p/2)} ∩ {γ_1(j) ≤ np + a√(np)}, where a = 3√(log log n) if np = O(log n) and a = 3√(log n) if np = ω(log n). Now by Lemma 2.2 we obtain By Hölder's inequality (4), 1 ≥ 1_D and the bound on P_C(D^c) in the line above, we have The last equality comes from applying estimates to the expectation terms, which are given by Lemma 2.6, (13), (55) and (52) respectively. Similarly we have and also, .
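The choices a = 3√(log log n) and a = 3√(log n) in the event D are presumably tuned to the lower-tail Chernoff bound (the likely content of Lemma 2.2): for X ∼ Bin(N, q) with mean μ = Nq,

```latex
\[
  \mathbb{P}\bigl(X \le \mu - t\bigr)
  \;\le\; \exp\!\left(-\frac{t^{2}}{2\mu}\right),
  \qquad 0 \le t \le \mu,
\]
```

so taking t = a√μ gives e^{−a²/2}, i.e. an error of order (log n)^{−9/2} when np = O(log n) and n^{−9/2} when np = ω(log n).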
Proof of Theorem 1.1 (v) Proof of E_C[H(G)²], E_C[H_i(G)²], E_C[T(G)²]. We will first bound E_C[h(i, j)³] from above. By Tetali's formula (11) we obtain the following for any i, j, a ∈ V Similarly to (51), when the product is expanded, everything apart from the only term with effective resistances not dependent on the indices of summation cancels. There are three positive and three negative terms containing one of {x, y, z}, then six positive and six negative terms containing two of {x, y, z}, and finally four positive and four negative terms containing all three indices {x, y, z}. When the sum over x, y, z is taken, all the terms containing at least one of x, y, z cancel. For each (x, y, z) ∈ V³ let M_{x,y,z} be the event {γ_1(x), γ_1(y), γ_1(z) ≤ 8np} and partition Ω into C_1 := A^{n,k}_{i,j} ∩ B^{i,j}, C_2 := C_1^c ∩ M_{x,y,z}, C_3 := C_1^c ∩ M^c_{x,y,z}.