Slightly subcritical hypercube percolation

We study bond percolation on the hypercube $\{0,1\}^m$ in the slightly subcritical regime where $p = p_c (1-\varepsilon_m)$ with $\varepsilon_m = o(1)$ but $\varepsilon_m \gg 2^{-m/3}$, focusing on the clusters of largest volume and diameter. We establish that with high probability the largest component has cardinality $\Theta\left(\varepsilon_m^{-2} \log(\varepsilon_m^3 2^m)\right)$, that the maximal diameter of all clusters is $(1+o(1)) \varepsilon_m^{-1} \log(\varepsilon_m^3 2^m)$, and that the maximal mixing time of all clusters is $\Theta\left(\varepsilon_m^{-3} \log^2(\varepsilon_m^3 2^m)\right)$. These results hold in different levels of generality, and in particular, some of the estimates hold for various classes of graphs such as high-dimensional tori, expanders of high degree and girth, products of complete graphs, and infinite lattices in high dimensions.

Their work and subsequent investigations (e.g., [3,6,20,26] for the ERRG, and [2,7-10,17] for hypercube percolation) confirm that this speculation holds to a very high degree. In fact, with various levels of success, this paradigm holds not just for the hypercube but for many classes of "high-dimensional" graphs. In other words, the behavior of the ERRG is universal.
Due to the ERRG's complete symmetry, one can employ combinatorial arguments and branching process comparisons to study it with great precision. Near the critical probability, these methods tend to fail in the presence of geometry, even very simple geometries such as the hypercube's. Finding arguments that work in greater generality is the main challenge and motivation for studying hypercube percolation.
To understand the context of our results, it helps to first discuss the behavior of the ERRG. We put $p = c/n$ for some constant $c$, and write $\mathcal{C}_j$ for the $j$th largest connected component of $G(n,p)$. It holds that when $c < 1$ we have $|\mathcal{C}_1| = \Theta(\log n)$ whp,¹ while when $c > 1$ we have that $|\mathcal{C}_1| = \Theta(n)$ whp [12]. Bollobás [6] was the first to study the delicate features of this transition that become apparent when, instead of keeping $c$ constant, we allow $c$ to depend on $n$ and let $c \to 1$ as $n \to \infty$. That and subsequent papers [3,6,20,26] led to the following intricate picture.
Let $\varepsilon_n = o(1)$ be a nonnegative sequence. We can distinguish the following three regimes of the phase transition:
• The slightly subcritical regime: if $\varepsilon_n \gg n^{-1/3}$² and $p = (1-\varepsilon_n)/n$, we have for all fixed $j \ge 1$ that
$$\frac{|\mathcal{C}_j|}{2\varepsilon_n^{-2}\log(\varepsilon_n^3 n)} \xrightarrow{\;P\;} 1.$$
¹For a sequence of random variables $\{X_n\}$ and a function $f(n)$ we write $X_n = \Theta(f(n))$ with high probability (or whp) if there exist constants $C \ge c > 0$ such that $\lim_{n\to\infty}\mathbb{P}\big(cf(n) \le X_n \le Cf(n)\big) = 1$.
²For two positive sequences $a_n$ and $b_n$ we write $a_n \gg b_n$ when $a_n/b_n \to \infty$.
• The critical window: if $\varepsilon_n = a n^{-1/3}$ for some fixed $a \in \mathbb{R}$ and $p = (1 \pm \varepsilon_n)/n$, we have for all fixed $j \ge 1$ that
$$\Big(\frac{|\mathcal{C}_1|}{n^{2/3}}, \ldots, \frac{|\mathcal{C}_j|}{n^{2/3}}\Big) \xrightarrow{\;d\;} (\zeta_i)_{i=1}^j$$
for some sequence of random variables $(\zeta_i)_{i=1}^j$ (that depends on the constant $a$) supported on $[0,\infty)$.
• The slightly supercritical regime: if $\varepsilon_n \gg n^{-1/3}$ and $p = (1+\varepsilon_n)/n$, we have that $|\mathcal{C}_1| = (2+o(1))\,\varepsilon_n n$ whp, while for all fixed $j \ge 2$ the components $\mathcal{C}_j$ obey the same asymptotics as in the slightly subcritical regime.
It has been shown so far that many of the features of the ERRG phase transition also hold for hypercube percolation. To state these results we first need to discuss the percolation threshold probability. We write $V := 2^m$ for the number of vertices of $Q_m$ and $\mathcal{C}(x)$ for the vertex set of the connected component of the vertex $x$, that is,
$$\mathcal{C}(x) := \{y : x \leftrightarrow y\},$$
where $\{x \leftrightarrow y\}$ denotes the event that the vertices $x$ and $y$ are connected by a path of open edges in the percolation configuration (with the convention that $x \leftrightarrow x$ for all $x$).³ Since the hypercube is a transitive graph, the distribution of $\mathcal{C}(x)$ as an unlabeled finite rooted graph is independent of our choice of $x$, so we will often consider $\mathcal{C}(x)$ for some $x$ but simply write $\mathcal{C}$.
We define the susceptibility $\chi(p) := \mathbb{E}_p[|\mathcal{C}|]$ and the critical parameter $p_c = p_c(Q_m) \in [0,1]$ as the unique solution to
$$\chi(p_c) = \lambda V^{1/3} \qquad (1.1)$$
for some fixed $\lambda \in (0,\infty)$. There is some freedom in the choice of $\lambda$; see [8] for a detailed explanation. We can make sense of this definition via comparison with the ERRG, where $\chi((1\pm\varepsilon)/n) = \Theta(n^{1/3})$ if and only if $\varepsilon = O(n^{-1/3})$ [6,26]. Although the state of affairs for hypercube percolation is not nearly as complete as that of the ERRG, a rather thorough investigation is performed in [2,7-10,17-19]. It is established there that there exists a critical window of width $O(V^{-1/3})$ around $p_c$ in which the largest components are of order $V^{2/3}$. The value of $p_c$ has been estimated using the lace expansion [19] to be
$$p_c = \frac{1}{m} + \frac{1}{m^2} + \frac{7}{2m^3} + O(m^{-4}) \qquad (1.2)$$
(see also [18] for an elementary proof).
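Nothing in the sequel depends on simulation, but the definition (1.1) is easy to explore numerically. The following minimal Python sketch (ours; the function names and Monte Carlo parameters are our own choices, not from the paper) samples bond percolation on $Q_m$ with a union-find pass and estimates $\chi(p) = \mathbb{E}_p[|\mathcal{C}(x)|]$ for a uniform vertex $x$, which equals the sum of squared cluster sizes divided by $V$; scanning $p$ near $1/m$ locates the finite-size window where $\chi(p) \approx \lambda V^{1/3}$.

```python
import random

def hypercube_edges(m):
    """Edges of Q_m: pairs of vertices (as ints) differing in one bit."""
    return [(v, v ^ (1 << i)) for v in range(1 << m)
            for i in range(m) if v < v ^ (1 << i)]

def cluster_sizes(m, p, rng):
    """One sample of bond percolation on Q_m; cluster sizes via union-find."""
    parent = list(range(1 << m))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in hypercube_edges(m):
        if rng.random() < p:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
    sizes = {}
    for v in range(1 << m):
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return sizes.values()

def susceptibility(m, p, trials=50, seed=0):
    """Monte Carlo estimate of chi(p) = E|C(x)| for a uniform vertex x,
    i.e. the mean over samples of sum(s^2 over clusters) / V."""
    rng = random.Random(seed)
    V = 1 << m
    return sum(sum(s * s for s in cluster_sizes(m, p, rng)) / V
               for _ in range(trials)) / trials

if __name__ == "__main__":
    m, lam = 10, 1.0                    # V = 1024; lam plays the role of lambda in (1.1)
    target = lam * (1 << m) ** (1 / 3)
    for c in (0.8, 0.9, 1.0, 1.1, 1.2):
        p = c / m
        print(f"p = {p:.4f}   chi(p) ~ {susceptibility(m, p):8.2f}   lam*V^(1/3) = {target:.2f}")
```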

1.1
The maximal volume of clusters in the slightly subcritical regime
The first result in this regime is due to Bollobás, Kohayakawa, and Łuczak [7], who showed that for percolation on $Q_m$ with $p = p_c(1-\varepsilon_m)$ the largest cluster satisfies $|\mathcal{C}_1| = (2+o(1))\varepsilon_m^{-2}\log(\varepsilon_m^3 V)$ whp, provided that $\varepsilon_m$ does not tend to zero too quickly. Note that this constraint on $\varepsilon_m$ is very far from $\varepsilon_m \gg V^{-1/3}$, so this bound does not hold all the way up to the critical window. The second bound, due to Borgs and coworkers [8,9], is that for any fixed $\lambda > 0$,
$$|\mathcal{C}_1| \le C\,\varepsilon_m^{-2}\log(\varepsilon_m^3 V) \quad \text{whp}, \qquad (1.3)$$
together with a lower bound on $|\mathcal{C}_1|$ that is, however, not of the same order. So their bound works all the way up to the critical window, but the lower bound is not of the same order as the upper bound. The first result of the current paper is to prove the correct order of magnitude for the largest component throughout the entire slightly subcritical regime.

Theorem 1.1. For percolation on $Q_m$ with $p = p_c(1-\varepsilon_m)$, where $\varepsilon_m = o(1)$ and $\varepsilon_m^3 V \to \infty$, we have $|\mathcal{C}_1| = \Theta\big(\varepsilon_m^{-2}\log(\varepsilon_m^3 V)\big)$ with high probability.
The upper bound is an immediate consequence of (1.3), and in Section 3 we provide the corresponding lower bound. The proof of Theorem 1.1 relies on a lower bound on the tail of $|\mathcal{C}|$, stated as Theorem 1.2. Remark. We believe that $|\mathcal{C}_1| = (2+o(1))\,\varepsilon_m^{-2}\log(\varepsilon_m^3 V)$ whp, as expected from the comparison with the ERRG (and from [10, Conjecture 3.2]). This would follow if the constant $c_a$ in the latter theorem could be taken to be $1/2 - o(1)$, but we are unable to prove this. See Section 1.3 for further discussion.
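To see where the constant $2$ and the argument $\varepsilon_m^3 V$ come from, here is the standard back-of-the-envelope computation (ours, not from the paper, using the Aizenman-Newman-type tail $\mathbb{P}_p(|\mathcal{C}| \ge k) \asymp k^{-1/2} e^{-\varepsilon_m^2 k/2}$ as an ansatz): a cluster of volume $k$ exists roughly while the expected number of vertices lying in such clusters exceeds $k$ itself, that is,
$$V\,k^{-1/2}\,e^{-\varepsilon_m^2 k/2} \;\ge\; k \quad\Longleftrightarrow\quad k \;\le\; \frac{2}{\varepsilon_m^2}\Big(\log(\varepsilon_m^3 V) - O\big(\log\log(\varepsilon_m^3 V)\big)\Big),$$
where the $\varepsilon_m^3$ inside the logarithm is produced by the polynomial prefactor $k^{-3/2}$ evaluated at $k \asymp \varepsilon_m^{-2}\log(\varepsilon_m^3 V)$.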

1.2
The maximal diameter, one-arm probability, and mixing time

The diameter of a finite connected graph is the largest graph distance between any two vertices. We define the maximal diameter $\Delta_{\max}$ as the largest diameter among all the connected components. In [27] Łuczak shows that the slightly subcritical ERRG, that is, $G(n,p)$ with $p = (1-\varepsilon_n)/n$ where $\varepsilon_n = o(1)$ but $\varepsilon_n \gg n^{-1/3}$, satisfies
$$\Delta_{\max} = (1 \pm o(1))\,\varepsilon_n^{-1}\log(\varepsilon_n^3 n)$$
with high probability.
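The shape of this answer can be guessed by a first-moment computation (ours, not from [27]): assume the sharp subcritical one-arm asymptotics $\mathbb{P}_p(\partial B(r) \ne \emptyset) \asymp r^{-1}(1-\varepsilon_n)^r$ (cf. Lemma 2.2(a) below), and recall that a cluster of diameter $r \gg \varepsilon_n^{-1}$ typically has volume $\asymp \varepsilon_n^{-2}$, so arms of length $r$ exist roughly while the expected number of vertices carrying one exceeds $\varepsilon_n^{-2}$:
$$\frac{n}{r}\,(1-\varepsilon_n)^r \;\asymp\; \varepsilon_n^{-2} \quad\Longleftrightarrow\quad r = (1+o(1))\,\varepsilon_n^{-1}\log(\varepsilon_n^3 n).$$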
Our second result in this paper is the analogous statement for the hypercube: Theorem 1.3 states that, for percolation on $Q_m$ with $p = p_c(1-\varepsilon_m)$, we have $\Delta_{\max} = (1+o(1))\,\varepsilon_m^{-1}\log(\varepsilon_m^3 V)$ with high probability.
Remark. Note that in the subcritical phase of the ERRG the largest cluster $\mathcal{C}_1$ is not the cluster with the largest diameter whp. Indeed, one can readily show that $\mathcal{C}_1$ is a tree whp, and thus, if we condition on $|\mathcal{C}_1|$, any tree with $|\mathcal{C}_1|$ vertices has the same probability of being $\mathcal{C}_1$. Since a uniformly chosen tree on $k$ vertices has diameter $\Theta(\sqrt{k})$ whp [30], we conclude that $\mathrm{diam}(\mathcal{C}_1) = \Theta\big(\varepsilon_n^{-1}\sqrt{\log(\varepsilon_n^3 n)}\big)$ whp, a factor $\sqrt{\log(\varepsilon_n^3 n)}$ away from the maximal diameter. We expect this to hold in subcritical hypercube percolation but we were unable to prove it; see Section 1.3.
The main ingredients in the proof of Theorem 1.3 are sharp bounds on the slightly subcritical boundary volume and one-arm probability (Theorem 1.4). We next turn to analyzing the mixing time of the simple random walk on the clusters. Recall that the total variation distance between two probability measures $\mu$ and $\nu$ on a finite set $\Omega$ is defined as
$$\|\mu - \nu\|_{\mathrm{TV}} := \max_{S \subseteq \Omega} |\mu(S) - \nu(S)|.$$
Given a graph $G = (\mathcal{V}, \mathcal{E})$, let $\pi$ be the stationary distribution of the simple random walk on it, that is, $\pi(x) = \deg(x)/(2|\mathcal{E}|)$, and let $S_t$ be the lazy simple random walk on $G$ (i.e., a discrete-time simple random walk that at each time step stays put with probability $\frac12$ and otherwise jumps to a uniformly chosen neighbor). The mixing time of the lazy random walk on $G$ is defined by
$$T_{\mathrm{mix}}(G) := \min\Big\{t \ge 0 : \max_{x \in \mathcal{V}}\,\big\|\mathbb{P}_x(S_t \in \cdot\,) - \pi\big\|_{\mathrm{TV}} \le \tfrac14\Big\}$$
(the choice of $\frac14$ is standard and inessential; see [25]). The mixing time thus describes the time at which the random walk's distribution first comes "close" to the stationary distribution in total variation. As usual, let $\varepsilon_n$ be a nonnegative sequence such that $\varepsilon_n = o(1)$ and $\varepsilon_n \gg n^{-1/3}$, and write $\mathcal{C}^\star$ for the component of $G(n, (1-\varepsilon_n)/n)$ with the largest mixing time. Ding, Lubetzky, and Peres [11, Theorem 2] proved that
$$T_{\mathrm{mix}}(\mathcal{C}^\star) = \Theta\big(\varepsilon_n^{-3}\log^2(\varepsilon_n^3 n)\big)$$
with high probability.
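The definitions above are straightforward to evaluate exactly on small graphs. The Python sketch below (ours; it assumes only numpy, and the function names are our own) builds the lazy walk's transition matrix, iterates it, and returns the first $t$ at which the worst-case total variation distance to $\pi$ drops below $1/4$. Run on $Q_6$ it gives a small sanity check of $T_{\mathrm{mix}}$; it is of course no substitute for the cluster-geometry arguments used below.

```python
import numpy as np

def lazy_walk_matrix(adj):
    """Transition matrix of the lazy simple random walk:
    stay put w.p. 1/2, else jump to a uniformly chosen neighbor."""
    deg = adj.sum(axis=1)
    return 0.5 * np.eye(len(adj)) + 0.5 * adj / deg[:, None]

def t_mix_lazy(adj, threshold=0.25):
    """First t with max_x ||P^t(x, .) - pi||_TV <= threshold,
    where pi(x) = deg(x) / (2|E|) is stationary for the walk."""
    deg = adj.sum(axis=1)
    pi = deg / deg.sum()
    P = lazy_walk_matrix(adj)
    Pt = np.eye(len(adj))
    for t in range(1, 10**6):
        Pt = Pt @ P
        if 0.5 * np.abs(Pt - pi).sum(axis=1).max() <= threshold:
            return t
    raise RuntimeError("did not mix within the step budget")

# Sanity check on the 6-dimensional hypercube Q_6 (V = 64):
m = 6
V = 1 << m
adj = np.zeros((V, V))
for v in range(V):
    for i in range(m):
        adj[v, v ^ (1 << i)] = 1.0
print("T_mix of the lazy walk on Q_6:", t_mix_lazy(adj))
```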
Our final result for the hypercube is almost the analogous statement: Theorem 1.5 states that the maximal mixing time over all clusters is $\Theta\big(\varepsilon_m^{-3}\log^2(\varepsilon_m^3 V)\big)$ with high probability.

1.3
About the proofs
Our proofs use the many tools and techniques developed in [8,17,23,28] to study the volume, diameter, and mixing time of large clusters in critical percolation. The main new ingredients we develop in this paper, which can be seen as further developments of the aforementioned works, are bounds on the moments of cluster sizes conditioned on having large diameter (see Section 3), sharp estimates for the one-arm event (Theorem 1.4 and Theorems 4.3 and 4.4), and an estimate showing that the probability of a long-arm event cannot increase significantly when a small number of edges is removed from the graph (Theorem 4.5).
Our methods have an inherent limitation that prevents us from obtaining much sharper results on the volume of the largest cluster. In particular, we are unable to prove that the largest cluster is of size $(2+o(1))\,\varepsilon^{-2}\log(\varepsilon^3 V)$ whp. The limitation stems from the fact that in the subcritical phase there are many clusters of volume comparable to the largest one, and these exhibit many different geometries. The triangle condition (1.6) below gives us a firm understanding of the event that a cluster has large diameter, but less so of the event that its volume is large. Thus, the lower bound on $|\mathcal{C}_j|$ obtained in Theorem 1.1 is proved by showing that clusters of large diameter (that is, diameter of order $\varepsilon^{-1}\log(\varepsilon^3 V)$) exist and that such clusters typically have large volume, that is, volume of order $\varepsilon^{-2}\log(\varepsilon^3 V)$. Unfortunately, the leading constant for the volume of such "long" clusters is strictly smaller than 2. In fact, the largest cluster is expected to have much smaller diameter, of order $\varepsilon^{-1}\sqrt{\log(\varepsilon^3 V)}$, as in the ERRG case.

1.4
General theorems
Theorems 1.1-1.5 are stated for the hypercube $Q_m$, but the assertions there hold in various levels of generality. Theorems 1.1 and 1.2 hold under the assumption of the triangle condition (see [4,8] and (1.6)) and therefore hold, for instance, for the hypercube, for finite tori $\mathbb{Z}_n^d$ with $d$ large but fixed, and for expander families of high degree and high girth. Theorem 1.2 even holds for infinite graphs that satisfy the triangle condition of [4]. The bounds on the diameter, one-arm probability, and mixing time of Theorems 1.3, 1.4, and 1.5 hold under the stronger assumptions of [17, Theorem 1.3]. We now describe these general conditions and state our most general theorems. Given a graph $G$ and $p \in [0,1]$ we write $G_p$ for the random graph obtained from $G$ by performing bond percolation on $G$ with parameter $p$, and denote by $\mathbb{P}_p$ this probability measure. We call the edges of $G_p$ open and the edges not in $G_p$ closed. For each vertex $x \in G$ we write $\mathcal{C}(x)$ for the connected component of $x$ in $G_p$. Recall that we write $\chi(p) = \mathbb{E}_p[|\mathcal{C}(x)|]$, and that this quantity does not depend on our choice of $x$ when $G$ is transitive. For two vertices $x, y$ of $G$ we write $x \leftrightarrow y$ for the event that there exists an open path in $G_p$ connecting $x$ to $y$.
In our general setting we are given a sequence of transitive graphs $(G_m)$ with vertex degree $m$ and the numbers $p_c(G_m)$ as defined in (1.1). We write $V_m$ for the number of vertices in $G_m$. We are also given a sequence of nonnegative numbers $\varepsilon_m$ satisfying $\varepsilon_m = o(1)$ and $\varepsilon_m^3 V_m \to \infty$. For ease of notation, we will often write $G$, $p_c$, $\varepsilon$, and $V$ instead of $G_m$, $p_c(G_m)$, $\varepsilon_m$, and $V_m$, respectively.
The triangle condition, first defined in [4] and refined to the finite-graph setting in [8], is a condition on the sequence $(G_m)$ implying several results for the percolation phase transition. This is an extensively studied topic; see, for example, [1,4,8,9,15,16,22-24,31]. We state here a useful variant of the triangle condition: the strong triangle condition holds if there exists $C > 0$ such that for any two vertices $x, y$ and any $p \le p_c$ we have
$$\nabla_p(x,y) := \sum_{u,v}\mathbb{P}_p(x \leftrightarrow u)\,\mathbb{P}_p(u \leftrightarrow v)\,\mathbb{P}_p(v \leftrightarrow y) \;\le\; \delta_{x,y} + C\Big(\frac{1}{m} + \frac{\chi(p)^3}{V}\Big). \qquad (1.6)$$
Remark. A version of Theorem 1.6(b) also holds for percolation on infinite lattices when the dimension is sufficiently large. In particular, our proof can be modified to show the analogous result when the infinite-lattice version of the triangle condition given in [4], namely
$$\nabla_{p_c} := \sum_{x,y}\mathbb{P}_{p_c}(0 \leftrightarrow x)\,\mathbb{P}_{p_c}(x \leftrightarrow y)\,\mathbb{P}_{p_c}(y \leftrightarrow 0) < \infty,$$
holds. This has been confirmed, among others, for nearest-neighbor percolation on $\mathbb{Z}^d$ when $d \ge 11$ [15], for certain "finite-range spread-out" percolation models on $\mathbb{Z}^d$ when $d > 6$ [16], and for percolation on certain nonamenable Cayley graphs [31,32]. In this setting one can follow our proof, with straightforward modifications, to conclude that there exist $c', c'_a > 0$ such that for percolation at $p = p_c(1-\varepsilon)$ and all $A \ge 1$ the analogous cluster-tail bounds hold. We now present the general version of Theorems 1.3, 1.4, and 1.5. Given a graph $G$, the $t$-step nonbacktracking random walk on $G$ starting from a vertex $x$ is the uniform measure on all paths $(X_1, \ldots, X_t)$ in $G$ such that $X_1 = x$ and $X_i \ne X_{i-2}$ for all $3 \le i \le t$ (so the walk never backtracks). For two vertices $x, y$ of $G$ we write $p_t(x,y)$ for the probability that a $t$-step nonbacktracking random walk starting at $x$ ends at $y$. Given a connected graph $G$ and $\gamma \in (0,1)$ we define the uniform nonbacktracking mixing time as
$$t_{\mathrm{mix}} = t_{\mathrm{mix}}(\gamma) := \min\Big\{t \ge 1 : \frac{p_s(x,y) + p_{s+1}(x,y)}{2} \le \frac{1+\gamma}{V} \text{ for all vertices } x, y \text{ and all } s \ge t\Big\}. \qquad (1.9)$$
The averaging between $p_t$ and $p_{t+1}$ is incorporated to admit bipartite graphs, such as the hypercube, to the general setting. Note that although $t_{\mathrm{mix}}$ is superficially similar to $T_{\mathrm{mix}}$, they are different quantities. For later reference, we remark that Fitzner and van der Hofstad [14, Theorem 3.5] establish the nonbacktracking walk estimate we require on the hypercube; we refer to their bound as (1.10). Our general theorem is stated under three assumptions on the sequence $(G_m)$: a relation between $\varepsilon_m$, $V$, and $t_{\mathrm{mix}}$ (1.11), an asymptotic identity for $p_c^t\, m(m-1)^{t-1}$ up to time $t_{\mathrm{mix}}$ (1.12), and a heat-kernel-sum estimate with rate $\alpha_m = o(1)$ (1.13). Under these assumptions, Theorem 1.7 asserts the general versions of our results; in particular, part (c) states that, writing $\mathcal{C}^\star$ for the component with the largest mixing time, there exists $C > 0$ such that $T_{\mathrm{mix}}(\mathcal{C}^\star) = \Theta\big(\varepsilon^{-3}\log^2(\varepsilon^3 V)\big)$ with high probability. For percolation on the hypercube $Q_m$, assumptions (1.11) and (1.12) follow immediately from the estimates (1.2) and (1.10). In [17, Section 7.2] it is shown that (1.13) holds for the hypercube. Hence, Theorem 1.7 implies Theorems 1.3, 1.4, and 1.5.
Furthermore, assumptions (1.11), (1.12), and (1.13) were verified in [17, Theorem 1.4] for expanders of high degree and high girth, for hypercubes, and for products of complete graphs, and hence the conclusions of Theorem 1.7 hold for these classes of graphs as well. Lastly, we remark that these assumptions in fact imply the strong triangle condition [17, Theorem 1.3(a)], but are not equivalent to it. Indeed, the tori $\mathbb{Z}_n^d$ with $n \to \infty$ and $d$ large but fixed satisfy (1.6) but do not satisfy (1.12).

1.5
The structure of this paper

In Section 2 we start with some preliminaries: we recall bounds for subcritical and critical percolation from the literature, and we prove some easy consequences of these bounds. We also prove the (easy) lower bounds on the one-arm probability of Theorems 1.4 and 1.7(b). In Section 3 we establish bounds on the moments of $|\mathcal{C}|$ conditioned on having a large diameter, and use them to prove Theorems 1.1, 1.2, and 1.6. In Section 4 we prove the upper bounds of Theorems 1.4 and 1.7(b), as well as Theorem 4.5 concerning the effect that removing edges from the graph has on the one-arm probability. In Section 5 we then use these results to prove the bounds on the maximal diameter from Theorems 1.3 and 1.7(a). Finally, in Section 6 we prove the bounds on the mixing time from Theorem 1.7(c) and Theorem 1.5.

PRELIMINARIES
In this section we recall some of the definitions, tools, and previous results used in the proofs, and use them to draw some simple conclusions. The first estimates involve the distribution of $|\mathcal{C}(x)|$. Aizenman and Newman [1, Proposition 5.1] proved that if $G$ is a finite or infinite transitive graph,⁴ then for any $k \ge \chi(p)^2$,
$$\mathbb{P}_p\big(|\mathcal{C}(x)| \ge k\big) \le \frac{C}{\sqrt{k}}\,e^{-k/(2\chi(p)^2)}. \qquad (2.1)$$
We will also use the standard susceptibility bound
$$\chi\big(p_c(1-\varepsilon)\big) \le \varepsilon^{-1}. \qquad (2.2)$$
The following estimates concern the "intrinsic" metric of the percolation cluster; we require a few definitions first. Given vertices $x$ and $y$ and a nonnegative integer $r$, we define the event $\{x \xleftrightarrow{\,\le r\,} y\}$ to hold if there is an open path of length at most $r$ connecting $x$ and $y$, and $\{x \xleftrightarrow{\,=r\,} y\}$ to hold if the shortest path in $G_p$ connecting $x$ and $y$ has length precisely $r$. The intrinsic metric ball of radius $r$ around a vertex $x$ in the graph $G$ and its boundary are defined by
$$B_x(r) := \{y : x \xleftrightarrow{\,\le r\,} y\}, \qquad \partial B_x(r) := \{y : x \xleftrightarrow{\,=r\,} y\},$$
and we note that both are random sets with respect to $\mathbb{P}_p$. When $G$ is transitive we often abbreviate $B(r) = B_0(r)$ and $\partial B(r) = \partial B_0(r)$ for some fixed origin $0$. It is proved in [23] that if $G$ satisfies the strong triangle condition (1.6), then there exist finite constants $C_1$ and $C_2$, that may depend on the $\lambda$ of (1.1), such that
$$\mathbb{E}_{p_c}\big[|B(r)|\big] \le C_1 r \qquad (2.3)$$
and
$$\mathbb{P}_{p_c}\big(\partial B(r) \ne \emptyset\big) \le \frac{C_2}{r}. \qquad (2.4)$$
The quantity $\mathbb{E}_p[|B(r)|]$ is monotone increasing in $p$, and so (2.3) holds for any $p \le p_c$.
Furthermore, even though monotonicity in $p$ is not known to hold for the quantity $\mathbb{P}_p( x \xleftrightarrow{\,=r\,} y )$, the triangle condition (1.6) from which (2.4) follows is monotone in $p$, and therefore (2.4) holds for any $p \le p_c$ as well. It is, however, an open problem to show that the triangle condition implies that $\mathbb{E}_p[|\partial B(r)|] = O(1)$; in [17, Theorem 4.1] this is proved under the stronger conditions (1.12) and (1.13). In fact, a stronger statement is proved under these assumptions: there exists a constant $C_b > 0$ such that for any $G' \subseteq G$ and any $r$ we have
$$\mathbb{E}_{p_c}\big[|\partial B^{G'}_x(r)|\big] \le C_b, \qquad (2.5)$$
where $B^{G'}_x(r)$ denotes the intrinsic ball in the percolation configuration restricted to $G'$. We remark here that since this estimate relies on conditions (1.12) and (1.13), it will not be used to prove Theorem 1.6. While [23] gives a corresponding lower bound for (2.3) when $G$ is any infinite transitive graph, we obviously cannot expect such a lower bound to be valid for all $r$ when $G$ is a finite graph. In [17, Lemmas 4.2 and 4.3] it is proved for any transitive graph that there exist constants $c, \theta > 0$ (that may depend on the $\lambda$ in (1.1)) such that
$$\mathbb{E}_{p_c}\big[|B(r)|\big] \ge cr \quad \text{for all } r \le \theta V^{1/3}. \qquad (2.6)$$
From here it is easy to obtain similar lower bounds for (2.4) and (2.5), and this is the content of Lemma 2.1. Lastly, [28] gives general estimates that bound the probability that a cluster has small volume but large diameter. We recall these estimates now. Assume that $G = (\mathcal{V}, \mathcal{E})$ is a graph with $V$ vertices, that $p \in [0,1]$ is such that (2.3) and (2.4) hold at $p$, and that $r$ and $k$ are integers satisfying $r^2 \ge C_2 k$ and $r \ge kV^{-1/3}$, where $C_2$ is the constant from (2.4). Then, by [28, Lemma 6.2], for any $x \in \mathcal{V}$,
$$\mathbb{P}_p\big(|\mathcal{C}(x)| \le k,\ \partial B_x(r) \ne \emptyset\big) \le \frac{C}{r}\,e^{-c r^2/k}. \qquad (2.7)$$
Furthermore, [28, Lemma 6.3] provides a complementary estimate when $k$ and $r$ satisfy the reverse relation. The following lemma provides a corresponding lower bound to (2.4).

Lemma 2.1. Let $(G_m)$ be a sequence of finite transitive graphs satisfying the triangle condition (1.6).
Then there exist constants $c_3, \theta > 0$ such that
$$\mathbb{P}_{p_c}\big(\partial B(r) \ne \emptyset\big) \ge \frac{c_3}{r} \quad \text{for all } r \le \theta V^{1/3}.$$
Proof. We follow the proof of [23, Theorem 1.3(i)], where the equivalent statement is proved for critical percolation on $\mathbb{Z}^d$ with $d$ large. For any $a \ge 1$,
$$\mathbb{P}\big(Z \ge a^{-1}\,\mathbb{E}[Z]\big) \ge (1 - a^{-1})^2\,\frac{(\mathbb{E}[Z])^2}{\mathbb{E}[Z^2]},$$
valid for any nonnegative random variable $Z$ with finite second moment. Let $\theta > 0$ be the constant from (2.6), so that (2.3) and (2.6) yield that
$$cr \le \mathbb{E}_{p_c}\big[|B(r)|\big] \le C_1 r \quad \text{for all } r \le \theta V^{1/3}.$$
Next, by a standard application of the BK-inequality [5] (see, for example, [23, p. 652] and also footnotes 5 and 6) we have that
$$\mathbb{E}_{p_c}\big[|B(r)|^2\big] \le C r^3.$$
Putting these together (applied to $Z = |B(r)|$, and noting that on the event $\{|B(r)| \ge a^{-1}\mathbb{E}[|B(r)|]\}$ with $a^{-1}\mathbb{E}[|B(r)|] > |B(r/2)|$-type volume the arm event follows) gives a lower bound of order $1/r$ for the one-arm probability. We maximize the right-hand side by putting $a = 12C_1$ and choose $\theta = a^{-1}$ and $c_3$ as the constant we get on the right-hand side above, concluding the proof. ▪ We may now use our previous estimates on critical percolation to deduce a simple lower bound on the probability of the one-arm event in the subcritical phase.
Remark Note that the upper bound in part (b) is weaker than the upper bound in Theorem 1.7(b), but that the assumptions here are also weaker.
Proof. (a) It is an easy consequence (see [17, Lemma 3.4] for a proof) of the standard simultaneous coupling between percolation with parameters $p_1$ and $p_2$ satisfying $0 \le p_1 \le p_2 \le 1$ that
$$\mathbb{P}_{p_1}\big(\partial B(r) \ne \emptyset\big) \ge \Big(\frac{p_1}{p_2}\Big)^{r}\,\mathbb{P}_{p_2}\big(\partial B(r) \ne \emptyset\big)$$
for any integer $r$. Thus the proof of part (a) is concluded by taking $p_1 = p_c(1-\varepsilon)$ and $p_2 = p_c$ and applying (2.6) and Lemma 2.1, respectively. (b) Put $r = A\varepsilon^{-1}$ and $k = A\varepsilon^{-2}$ and bound
$$\mathbb{P}_p\big(\partial B(r) \ne \emptyset\big) \le \mathbb{P}_p\big(|\mathcal{C}| \ge k\big) + \mathbb{P}_p\big(|\mathcal{C}| < k,\ \partial B(r) \ne \emptyset\big).$$
Note that as long as $A > 1$ and $V$ is large enough, by (2.2) we have that $k \ge \chi(p)^2$, and since $\varepsilon^3 V \to \infty$ we also have that $r \gg kV^{-1/3}$. Hence we may apply (2.1) and (2.7) in the above inequality to obtain
$$\mathbb{P}_p\big(\partial B(r) \ne \emptyset\big) \le C\varepsilon\, e^{-c_4 A}$$
for some $c_4 > 0$ sufficiently small. ▪
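For later use it is worth spelling out the arithmetic that produces the form of the lower bound invoked repeatedly below (our computation, combining the coupling inequality with Lemma 2.1 and the elementary bound $1 - \varepsilon \ge e^{-2\varepsilon}$ for $\varepsilon \in (0, 1/2]$):
$$\mathbb{P}_{p_c(1-\varepsilon)}\big(\partial B(r) \ne \emptyset\big) \;\ge\; (1-\varepsilon)^r\,\frac{c_3}{r} \;\ge\; \frac{c_3}{r}\,e^{-2\varepsilon r} \qquad (r \le \theta V^{1/3}),$$
so that for $r = A\varepsilon^{-1}$ the lower bound reads $c_3 A^{-1}\varepsilon\, e^{-2A}$, which is the form $cA^{-1}\varepsilon\,e^{-cA}$ used in Section 3.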

Proof of the lower bounds in Theorems 1.4 and 1.7(b)
These follow immediately from Lemma 2.2(a) and the fact that when $\varepsilon^3 V \to \infty$, radii $r$ of order $\varepsilon^{-1}\log(\varepsilon^3 V)$ satisfy $r \le \theta V^{1/3}$.

CLUSTER SIZES: PROOFS OF THEOREMS 1.1 AND 1.2
We start with bounds on the first and second moment of the typical cluster size, conditioned on the event that the diameter of the cluster is large.
Proof. We put $r = A\varepsilon^{-1}$ and $k = \delta A\varepsilon^{-2}$ for some small $\delta > 0$ that will be chosen later. We bound the conditional probability $\mathbb{P}_p(|\mathcal{C}| \le k \mid \partial B(r) \ne \emptyset)$ by the ratio $\mathbb{P}_p(|\mathcal{C}| \le k, \partial B(r) \ne \emptyset)/\mathbb{P}_p(\partial B(r) \ne \emptyset)$. Since $\varepsilon^3 V \to \infty$, the conditions of (2.7) hold, so we obtain
$$\mathbb{P}_p\big(|\mathcal{C}| \le k,\ \partial B(r) \ne \emptyset\big) \le \frac{C}{r}\,e^{-cA/\delta}$$
for some constant $C > 0$. Since $r \le \theta V^{1/3}$, by Lemma 2.2(a) we get $\mathbb{P}_p(\partial B(r) \ne \emptyset) \ge cA^{-1}\varepsilon\,e^{-cA}$, so when $\delta > 0$ is chosen to be small enough (but fixed) we get that the conditional probability is at most $\frac12$, giving the lemma. ▪ Proof. For a simple path $\gamma$ of length $r$ starting at $v$ we write $\mathcal{F}_r(v,\gamma)$ for the event that $\gamma$ is the first open path of length $r$ starting at $v$, where by "first" we mean according to some fixed predetermined ordering of paths (such as the lexicographical order). In other words, $\mathcal{F}_r(v,\gamma)$ occurs when $\gamma$ is open and every path of length $r$ from $v$ that precedes $\gamma$ in the ordering fails to be open. Since the events $\{\mathcal{F}_r(v,\gamma)\}_\gamma$ are mutually disjoint we can write $\{\partial B_v(r) \ne \emptyset\} = \biguplus_\gamma \mathcal{F}_r(v,\gamma)$. If $\mathcal{F}_r(v,\gamma)$ occurs and $x, y \in \mathcal{C}(v)$, then one of the following events must occur (see Figure 2): (i) there exist integers $m \ne n$ with $1 \le m, n \le r$ such that the events $\mathcal{F}_r(v,\gamma)$, $\gamma(m) \leftrightarrow x$ and $\gamma(n) \leftrightarrow y$ occur disjointly,⁵ or (ii) there exist $1 \le m \le r$ and a vertex $z$ such that the events $\mathcal{F}_r(v,\gamma)$, $\gamma(m) \leftrightarrow z$, $z \leftrightarrow x$ and $z \leftrightarrow y$ occur disjointly.
To see this implication, consider an open path from $x$ to $v$ and let $\gamma_x$ be the part of this path from $x$ until the first time it hits $\gamma$, so that $\gamma_x$ and $\gamma$ are edgewise disjoint. (If $x$ is a vertex on $\gamma$ then $\gamma_x = \emptyset$.) Now consider another open path, from $y$ to $v$, and let $\gamma_y$ be the part of this path from $y$ until the first time it hits $\gamma \cup \gamma_x$. If $\gamma_y$ ends at $\gamma$ rather than $\gamma_x$, then this is an instance of case (i) above, where we write $m$ and $n$ for the positions on $\gamma$ of the meeting points of $\gamma_x$ and $\gamma_y$ with $\gamma$, respectively. If $\gamma_y$ ends at $\gamma_x$ instead of $\gamma$, then this is an instance of case (ii) above, where we write $z$ for that meeting point and $m$ for the position on $\gamma$ of the meeting point of $\gamma_x$ with $\gamma$.
In case (i) the disjoint witnesses for the occurrence of the events are the edges of $\gamma$ together with all the closed edges (these open and closed edges determine $\mathcal{F}_r(v,\gamma)$, since one can check that $\gamma$ is open and any other path of length $r$ that is prior to $\gamma$ in the fixed ordering has a closed edge in it), the edges of $\gamma_x$ (for $\gamma(m) \leftrightarrow x$), and the edges of $\gamma_y$ (for $\gamma(n) \leftrightarrow y$). Similarly, in case (ii) the witnesses are the edges of $\gamma$ together with all closed edges, the edges of $\gamma_x$ from $\gamma(m)$ to $z$, the edges of $\gamma_x$ from $z$ to $x$, and the edges of $\gamma_y$.
The BKR-inequality⁶ [29] now yields
$$\mathbb{E}_p\big[|\mathcal{C}(v)|^2\,\mathbf{1}_{\{\partial B_v(r) \ne \emptyset\}}\big] \le \sum_\gamma\sum_{x,y}\Big(\sum_{m \ne n \le r}\mathbb{P}_p(\mathcal{F}_r(v,\gamma))\,\mathbb{P}_p(\gamma(m) \leftrightarrow x)\,\mathbb{P}_p(\gamma(n) \leftrightarrow y) + \sum_{m \le r}\sum_z \mathbb{P}_p(\mathcal{F}_r(v,\gamma))\,\mathbb{P}_p(\gamma(m) \leftrightarrow z)\,\mathbb{P}_p(z \leftrightarrow x)\,\mathbb{P}_p(z \leftrightarrow y)\Big).$$
For the first term on the right-hand side we first sum over $x$ and $y$ and get $\chi(p)^2$, then over $m, n$ and get an $r(r+1)$ factor, and lastly the sum over $\gamma$ gives another factor $\mathbb{P}_p(\partial B_v(r) \ne \emptyset)$. For the second term we first sum over $x$ and $y$, then over $z, m$, and get $\chi(p)^3(r+1)\,\mathbb{P}_p(\partial B_v(r) \ne \emptyset)$, concluding the proof of the lemma. ▪ Proof of Theorem 1.6. We start with the proof of part (b) of the theorem, which is a straightforward application of the two previous lemmas and a second moment bound. Let $c, \theta > 0$ be the constants from Lemmas 2.2 and 3.1. Put $A \in [2/c, \theta V^{1/3}]$, $r = A\varepsilon^{-1}$, and $k = (c/2)A\varepsilon^{-2}$. Recall that, for any $v$, the event $\{|\mathcal{C}(v)| \ge k,\ \partial B_v(r) \ne \emptyset\}$ is contained in $\{|\mathcal{C}(v)| \ge k\}$. Hence, by Lemma 3.1 we may bound
$$\mathbb{P}_p\big(|\mathcal{C}(v)| \ge k\big) \ge \mathbb{P}_p\big(|\mathcal{C}(v)| \ge k,\ \partial B_v(r) \ne \emptyset\big) \ge \tfrac12\,\mathbb{P}_p\big(\partial B_v(r) \ne \emptyset\big).$$
Now we apply the bounds from Lemmas 2.2, 3.1, and 3.2, and (2.2), and get the asserted two-sided tail estimate, concluding the proof of part (b).
To prove part (a), let $\delta \in (0,1)$ be arbitrary. By part (b) of this theorem we may choose some $c > 0$ so that when $k = c\varepsilon^{-2}\log(\varepsilon^3 V)$ the tail probability $\mathbb{P}_p(|\mathcal{C}| \ge k)$ is large enough for the second-moment argument below.⁶ ⁶The van den Berg-Kesten-Reimer inequality (or BKR-inequality) states that disjointly occurring events are negatively correlated, that is, $\mathbb{P}_p(A \circ B) \le \mathbb{P}_p(A)\,\mathbb{P}_p(B)$. If $A$ and $B$ are increasing events (i.e., if $\mathbb{P}_p(A) \le \mathbb{P}_q(A)$ for all $0 \le p \le q \le 1$), then we call this bound the BK-inequality [5]. The BK-inequality is usually easier to apply, because it is easy to verify whether increasing events occur disjointly. Applying the BKR-inequality to nonincreasing events (such as $\{\partial B(r) \ne \emptyset\}$) often requires more care; see [17, Section 3] for a discussion.
Write $Z_{\ge k}$ for the random variable counting the number of vertices in clusters of size at least $k$, that is, $Z_{\ge k} := \#\{x : |\mathcal{C}(x)| \ge k\}$. By the pigeonhole principle we have that for $s \ge t$, if $Z_{\ge t} \ge s$ and $|\mathcal{C}_1| \le s/j$, then there are at least $j$ clusters of size at least $t$, and hence $|\mathcal{C}_j| \ge t$. We now let $j$ be an integer satisfying $j \in [1, (\varepsilon^3 V)^{\delta}]$, put $t = k$ and $s = (3/c)\,jt$. It follows from (1.3) that $\mathbb{P}_p(|\mathcal{C}_1| > s/j) = o(1)$. By the Paley-Zygmund inequality, $\mathbb{P}_p(Z_{\ge t} \ge s) = 1 - o(1)$, concluding the proof. ▪

IMPROVED BOUNDS ON THE ONE-ARM PROBABILITY
In this section we prove upper bounds (that give the sharp exponents) for the probability of the one-arm event (improving upon Lemma 2.2(b)) and on the expected size of the boundary, thus completing the proof of Theorem 1.7(b). In the next section we will use these to prove the upper bound in Theorem 1.4 and to prove Theorem 1.7(a).

The off-method and bounds on the probability of a long connection
For the proofs we will require two useful estimates from [17]. The first is a sharp upper bound on the probability that two vertices are connected by an open path that is longer than $t_{\mathrm{mix}}$; see [17, Section 3.4] and in particular Lemma 3.15 of that paper for the proofs. We do not quote the precise statements from [17], but rather state only the consequences that we require in this paper.
One of these bounds, and several more below, makes use of the so-called off-method. Given a graph $G = (\mathcal{V}, \mathcal{E})$ and a subset of the edge set $A \subseteq \mathcal{E}$, we say that an event "$F$ occurs off $A$" if $F$ occurs without using any edges in $A$.⁷ More precisely, given a configuration $\omega \in \{0,1\}^{\mathcal{E}}$, let $\omega_A$ be the configuration such that $\omega_A(e) = \omega(e)$ if $e \notin A$, and $\omega_A(e) = 0$ if $e \in A$. Then $\omega \in \{F \text{ off } A\}$ iff $\omega_A \in F$. We frequently write $\mathbb{P}^A_p$ for the measure $\mathbb{P}^A_p(E) = \mathbb{P}_p(E \text{ off } A)$, and similarly we write $\mathbb{E}^A_p$. Note that $\mathbb{P}^A_p$ is a product measure on $\{0,1\}^{\mathcal{E}\setminus A}$. We use the off-method to factorize probabilities; for example, it can be used to enforce independence, since the event $\{F \text{ off } A\}$ is measurable with respect to the status of the edges outside $A$. ⁷In the literature, the off-method is usually applied with reference to a vertex set, implicitly using the set of all edges that contain a vertex of $A$ in the graph. Here we use an edge set because the traditional definition is a bit unwieldy in our setting. All results from the literature that we use are valid with our more general definition.
We allow $A = A(\omega)$, that is, the set $A$ may depend on the configuration. In particular, we will often take $A$ to be a metric ball, that is, we consider events of the form $\{E \text{ off } B_x(r)\}$. In this case we take $A = A(\omega)$ to be the set of all open edges on a path of open edges of length at most $r$ started at $x$, and of all closed edges that share an endpoint with one or two of those open edges. Observe that we can indeed determine what $B_x(r)$ is for any given $\omega$ by inspecting only the status of the edges in $A(\omega)$. In this setting, $\mathbb{P}^{A(\omega)}_p$ is of course no longer a product measure. We deal with this difficulty whenever it occurs below by using an appropriate conditioning scheme.
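As a concrete instance (ours) of the factorization afforded by the off-method: suppose $A$ is a deterministic edge set, $F_1$ is any event, and $F_2$ is measurable with respect to the status of the edges in $A$ alone. Then, since $\{F_1 \text{ off } A\}$ is measurable with respect to the edges outside $A$,
$$\mathbb{P}_p\big(\{F_1 \text{ off } A\} \cap F_2\big) \;=\; \mathbb{P}^A_p(F_1)\,\mathbb{P}_p(F_2).$$
When $A = A(\omega)$ is random (e.g., the edges determining $B_x(r)$), this identity is recovered by first conditioning on the realization of $A$, which is the conditioning scheme alluded to above.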
Recall the definition (1.9) of the nonbacktracking walk mixing time $t_{\mathrm{mix}}$.

Lemma 4.1 ([17]). Suppose that (1.12) and (1.13) hold and let $p = p_c(1-\varepsilon)$. Then for any two vertices $x, y$,
$$\mathbb{P}_p\big(x \xleftrightarrow{\,\ge t_{\mathrm{mix}}\,} y\big) \le \frac{C\chi(p)}{V}, \qquad (4.1)$$
and for any $t \ge t_{\mathrm{mix}}$ and any $A \subseteq \mathcal{E}$,
$$\mathbb{P}^A_p\big(x \xleftrightarrow{\,=t\,} y\big) \le \frac{C(1-\varepsilon)^t}{V}. \qquad (4.2)$$
The heuristic behind the above lemma is that when a graph satisfies (1.12) and (1.13), a long percolation path has similar properties to a random walk path.
The second estimate we need from [17] is a nonbacktracking random walk estimate bounding a particular sum of the heat kernel of graphs, like the hypercube, that satisfy (1.13). Its proof is not difficult and can be found in the last paragraph of the proof of Theorem 4.5 of [17]. In the form we use it, Lemma 4.2 states that for every vertex $x$,
$$\sum_y \sum_{s,\,t \le t_{\mathrm{mix}}} s\, p_s(x,y)\, p_t(x,y) = O\Big(\frac{\alpha_m}{\log V}\Big),$$
where $\alpha_m = o(1)$ is the sequence from (1.13).

The expected volume of the boundary of a subcritical ball
We prove the volume bound in Theorem 1.7(b) in a slightly stronger version, allowing the bound to be "off" any arbitrary set. Theorem 4.3 asserts that there exists $C > 0$ such that for every edge set $A \subseteq \mathcal{E}$ and every integer $r \ge 0$,
$$\mathbb{E}^A_p\big[|\partial B_0(r)|\big] \le C(1-\varepsilon)^r. \qquad (4.3)$$
Proof. We prove the claim by induction on $r$. The induction hypothesis is that (4.3) holds for any integer $k < r$. The induction is initialized by choosing $C$ sufficiently large. We start by setting up a coupling that allows us to use the BKR-inequality. Let $G' = (\mathcal{V}, (\mathcal{E}_1, \mathcal{E}_2))$ be the multigraph with a pair of edges $e_1 \in \mathcal{E}_1$ and $e_2 \in \mathcal{E}_2$ between $v, w \in \mathcal{V}$ iff $\{v,w\} \in \mathcal{E}$ (i.e., we take $G$ and replace each edge by a pair of parallel edges). Put $p_1 = p_c(1-\varepsilon)$ and $p_2 = p_c$. Independently of everything else, we declare each edge in $\mathcal{E}_1$ open with probability $p_1$ and each edge in $\mathcal{E}_2$ open with probability $q$, where $q \in [0,1]$ is determined by $(1-q)(1-p_1) = 1 - p_2$. We write $\mathbb{P}^A$ for the associated product measure off $A$ (i.e., all edges in $\mathcal{E}_1$ and $\mathcal{E}_2$ corresponding to some edge in $A$ are closed). We say that an edge $e \in \mathcal{E}$ is $p_1$-open iff $e_1$ is open, and that $e$ is $p_2$-open iff at least one of $e_1$ or $e_2$ is open. For $i = 1, 2$ we write $G_{p_i}$ for the graph spanned by the $p_i$-open edges. Note that the marginal law of $G_{p_i}$ is $\mathbb{P}_{p_i}$.
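Solving the defining relation for $q$ explicitly (a small computation of ours, recorded for convenience): an edge fails to be $p_2$-open iff both of its parallel copies are closed, so
$$(1-q)(1-p_1) = 1-p_2 \quad\Longleftrightarrow\quad q = \frac{p_2 - p_1}{1 - p_1} = \frac{\varepsilon\,p_c}{1 - p_c(1-\varepsilon)} = \big(1 + O(p_c)\big)\,\varepsilon\,p_c.$$
In particular each edge of $\mathcal{E}_2$ is open with probability roughly $\varepsilon/m$; this is the source of the factor $\varepsilon$ gained every time a path is forced to use an $\mathcal{E}_2$ edge in the arguments below.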
For an integer $r \ge 0$ we consider the boundary $\partial B_0(r)$ in each of $G_{p_1}$ and $G_{p_2}$, and given a simple path $\gamma$ in $G$ from $0$ to $v$ of length $r$ we define $\mathcal{F}_{r,p_i}(v,\gamma)$ to be the event that $\gamma$ is the first shortest $p_i$-open path from $0$ to $v$. We also define $\bar{\mathcal{F}}_{r,p_1}(v,\gamma)$ to be the event that the edges of $\gamma$ are all $p_1$-open. We will show using the induction hypothesis that
$$\sum_v\sum_\gamma \mathbb{P}^A\big(\mathcal{F}_{r,p_1}(v,\gamma) \setminus \mathcal{F}_{r,p_2}(v,\gamma)\big) = o\big((1-\varepsilon)^r\big). \qquad (4.4)$$
This establishes the proof, since then
$$\mathbb{E}^A_{p_1}\big[|\partial B_0(r)|\big] \le (1-\varepsilon)^r\,\mathbb{E}^A_{p_2}\big[|\partial B_0(r)|\big] + o\big((1-\varepsilon)^r\big) \le \big(C_b + o(1)\big)(1-\varepsilon)^r,$$
where $C_b$ is the constant from (2.5). It remains to prove (4.4). Fix a set $A \subseteq \mathcal{E}$. To start, we assume that the event
$$\mathcal{F}_{r,p_1}(v,\gamma) \setminus \mathcal{F}_{r,p_2}(v,\gamma) \qquad (4.5)$$
occurs off $A$, with $\gamma$ the first shortest $p_1$-open path connecting $0$ to $v$. Since $\mathcal{F}_{r,p_2}(v,\gamma) \cap \bar{\mathcal{F}}_{r,p_1}(v,\gamma)$ does not occur, we deduce that either (i) the shortest $p_2$-open path connecting $0$ to $v$ has length less than $r$, or (ii) both are of length at least $r$ but the first shortest $p_2$-open path uses an edge that belongs to $\mathcal{E}_2$. (In Figure 3, the black path is $\gamma$ and the red path is $\beta$, which passes through at least one edge of $\mathcal{E}_2$.)
Both cases imply that there are vertices $x$ and $y$ on $\gamma$ such that the length of $\gamma$ between them is some $t \le r$ and there exists a $p_2$-open path $\beta$ between them with $\beta \cap \gamma = \{x, y\}$ and $|\beta| \le t$, and $\beta$ contains at least one edge of $\mathcal{E}_2$. See Figure 3. Hence, the event (4.5) implies that there exist nonnegative integers $k, t$ satisfying $k + t \le r$ and vertices $x, y$ such that the following two events occur disjointly: $\mathcal{E}_1(v,x,y,k,t)$, that some path $\gamma$ is the first shortest $p_1$-open path of length $r$ from $0$ to $v$ with $\gamma(k) = x$ and $\gamma(k+t) = y$; and $\mathcal{E}_2(x,y,t)$, that there is a $p_2$-open path $\beta$ from $x$ to $y$ with $|\beta| \le t$ containing at least one edge of $\mathcal{E}_2$. Indeed, the witness edges for $\mathcal{E}_1$ are the $p_1$-open edges of $\gamma$ together with all the closed edges of $\mathcal{E}_1$, and the witness edges for $\mathcal{E}_2$ are the open edges of $\beta$. Denote the event of the disjoint occurrence of $\mathcal{E}_1$ and $\mathcal{E}_2$ by $\mathcal{H}(v,x,y,k,t)$. We will prove (4.4) by summing the probability of $\mathcal{H}(v,x,y,k,t)$ over $v, x, y, k$, and $t$. We split the sum according to whether $t < t_{\mathrm{mix}}$ or $t \ge t_{\mathrm{mix}}$, starting with the latter.
Applying the BKR-inequality and using the inclusion $\mathcal{E}_2(x,y,t) \subseteq \{x \leftrightarrow y \text{ in } G_{p_2}\}$ we get
$$\mathbb{P}^A\big(\mathcal{H}(v,x,y,k,t)\big) \le \mathbb{P}^A\big(\mathcal{E}_1(v,x,y,k,t)\big)\,\mathbb{P}_{p_2}\big(x \leftrightarrow y\big).$$
(We dropped the condition "off $A$" for the second factor because the event is increasing.) We proceed by bounding $\mathbb{P}^A(\mathcal{E}_1(v,x,y,k,t))$. We condition on the open and closed edges that determine $B_0(k+t)$, as described in Section 4.1, and use the induction hypothesis to get
$$\mathbb{P}^A\big(\mathcal{E}_1(v,x,y,k,t)\big) \le \mathbb{P}^A\big(0 \xleftrightarrow{\,=k\,} x,\ x \xleftrightarrow{\,=t\,} y\big)\; C(1-\varepsilon)^{r-k-t}.$$
We condition similarly on the closed and open edges that determine $B_0(k)$, and since $t \ge t_{\mathrm{mix}}$ and we assume that (1.12) and (1.13) hold, we may use (4.2) and the induction hypothesis to bound
$$\mathbb{P}^A\big(0 \xleftrightarrow{\,=k\,} x,\ x \xleftrightarrow{\,=t\,} y\big) \le \mathbb{P}^A\big(0 \xleftrightarrow{\,=k\,} x\big)\,\frac{C(1-\varepsilon)^t}{V}.$$
We now sum the last term over $y$ and get a factor $O(t)$ by (2.3). We then sum the term before it over $x$ and get a factor $C(1-\varepsilon)^k$ by the induction hypothesis. Finally, we sum over $k, t \le r$ and get a factor $O(r^3)$. Since $r = o(V^{1/3})$, we get that this sum is $o((1-\varepsilon)^r)$ for any fixed $A \subseteq \mathcal{E}$, as required. We now bound the sum in the case that $t \in [2, t_{\mathrm{mix}}]$. Again we start by applying the BKR-inequality to the probability of $\mathcal{H}(v,x,y,k,t)$. This time we bound the probability of $\mathcal{E}_2$ by enumerating over paths. Indeed, $\mathcal{E}_2(x,y,t)$ implies that there exists a path $\beta$ with $|\beta| \le t$ such that $\beta$ is a $p_2$-open path between $x$ and $y$ and one of its edges belongs to $\mathcal{E}_2$. For each such simple path of length $s \le t$ the probability that this occurs is precisely $p_2^s\big(1 - (p_1/p_2)^s\big)$, and the number of such $\beta$'s is at most $m(m-1)^{s-1}p_s(x,y)$. Hence
$$\mathbb{P}\big(\mathcal{E}_2(x,y,t)\big) \le \sum_{s=2}^{t} m(m-1)^{s-1}\,p_s(x,y)\,p_2^s\big(1 - (1-\varepsilon)^s\big) \le C\sum_{s=2}^{t} s\,\varepsilon\,p_s(x,y),$$
where in the last inequality we used that $1 - (1-\varepsilon)^s \le s\varepsilon$ and that $p_c^s\,m(m-1)^{s-1} = 1 + o(1)$ by (1.12). For the probability of $\mathcal{E}_1$, we first sum over $v$ as before to get a factor $C(1-\varepsilon)^{r-k-t}$. Afterwards, we condition on the closed and open edges that determine $B_0(k+t)$ and bound the conditional probability of $x \xleftrightarrow{\,=t\,} y$ by $C(1-\varepsilon)^t\,p_t(x,y)$ as before, by enumerating paths and using (1.12). We gained the factor $(1-\varepsilon)^t$ relative to the estimate of $\mathbb{P}_{p_c}(\mathcal{E}_2(x,y,t))$ because the event $x \xleftrightarrow{\,=t\,} y$ occurs in $G_{p_1}$, where the percolation probability is $p_1 = p_c(1-\varepsilon)$. By Lemma 4.2 we may sum $s\,p_s(x,y)\,p_t(x,y)$ over $s, t, y$ and get a factor $O(\alpha_m/\log V)$. We then sum over $x$ using the induction hypothesis to get a factor $C(1-\varepsilon)^k$. Finally we sum over $k$ and get a factor $r$. This yields
$$\sum \mathbb{P}^A\big(\mathcal{H}(v,x,y,k,t)\big) \le C\,\varepsilon\,r\,(1-\varepsilon)^r\,\frac{\alpha_m}{\log V}.$$
Now, since $r = O(\varepsilon^{-1}\log(\varepsilon^3 V))$ and $\alpha_m = o(1)$, we get that this is also $o((1-\varepsilon)^r)$ for any fixed $A$, as required. ▪

The subcritical one-arm probability
The next theorem gives the sharp estimate on the subcritical one-arm probability in Theorem 1.7(b) (again, in the slightly stronger form allowing it to be "off" any arbitrary set). The proof is of a similar nature to the proof of the previous theorem but is not quite analogous, because here the case $t \ge t_{\mathrm{mix}}$ gives rise to a technical difficulty when $\varepsilon$ is very close to $V^{-1/3}$. Note also that Theorem 1.7(b) is not entirely sharp, as it does not meet the lower bound of Lemma 2.2(a). However, we only use this theorem with $r$ of order $\varepsilon^{-1}\log(\varepsilon^3 V)$, so the ratio between the lower and the upper bound is at most $\log(\varepsilon^3 V)$, and this logarithmic difference should, in practice, not matter much. Our bounds can be improved to give the sharpest upper bound of order $r^{-1}(1-\varepsilon)^r$, but this seems to require longer technical work and is unnecessary for our purposes, so we omit it. (The current proof actually gives an upper bound smaller by a factor $\log m$, but we also do not spell out the details for this.) Proof. We again prove the claim by induction. Our induction hypothesis is that (4.6) holds for any $k$ satisfying $\varepsilon^{-1} \le k < r$. The induction is initialized by observing that for $r = \varepsilon^{-1}$ the claim follows from (2.4).
As in the proof of the previous theorem, we start by constructing the multigraph $G'$ that is a copy of $G$ with each edge replaced by a pair of edges subject to different percolation probabilities, $p_1$ on $\mathcal{E}_1$ and $q$ on $\mathcal{E}_2$, where $q$ is the solution to $(1-q)(1-p_1) = 1 - p_2$. Also as in the previous proof, we put $p_1 = p_c(1-\varepsilon)$ and $p_2 = p_c$. We use the terms "$p_1$-open" and "$p_2$-open" as before, and write $G_{p_i}$ for the subgraph of $G'$ of $p_i$-open edges.
Define $\mathcal{F}_{r,p_i} := \{\partial B_0(r) \ne \emptyset \text{ in } G_{p_i}\}$ for $i = 1, 2$, and given a simple path $\gamma$ in $G$ of length $r$ we write $\mathcal{F}_{r,p_i}(\gamma) := \{\gamma$ is the lexicographically first $p_i$-open shortest path of length $r$ starting at $0\}$, so that $\mathcal{F}_{r,p_i} = \biguplus_\gamma \mathcal{F}_{r,p_i}(\gamma)$. We also write $\bar{\mathcal{F}}_{r,p_1}(\gamma)$ for the event that the edges of the path $\gamma$ are $p_1$-open. Note that $\biguplus_\gamma\big(\mathcal{F}_{r,p_2}(\gamma) \cap \bar{\mathcal{F}}_{r,p_1}(\gamma)\big) \subseteq \mathcal{F}_{r,p_1}$.
We will use the induction hypothesis to show that
$$\sum_\gamma \mathbb{P}^A\big(\mathcal{F}_{r,p_1}(\gamma) \setminus \mathcal{F}_{r,p_2}(\gamma)\big) = o\big(\varepsilon(1-\varepsilon)^r\big). \qquad (4.7)$$
Given (4.7) the proof can be quickly completed, since we have
$$\mathbb{P}^A\big(\mathcal{F}_{r,p_1}\big) \le \sum_\gamma \mathbb{P}^A\big(\mathcal{F}_{r,p_2}(\gamma) \cap \bar{\mathcal{F}}_{r,p_1}(\gamma)\big) + o\big(\varepsilon(1-\varepsilon)^r\big) = (1-\varepsilon)^r\,\mathbb{P}^A\big(\mathcal{F}_{r,p_2}\big) + o\big(\varepsilon(1-\varepsilon)^r\big),$$
which concludes the proof using (2.4) since $r \ge \varepsilon^{-1}$.
We now turn to proving (4.7). Assume that the event $\mathcal{F}_{r,p_1}(\gamma) \setminus \mathcal{F}_{r,p_2}(\gamma)$ occurs. Fix a set $A \subseteq \mathcal{E}$. Let $\mathcal{V}(\gamma)$ denote the set of the $r+1$ vertices on the path $\gamma$. Both cases (i) and (ii) imply that there exist vertices $u, v \in \mathcal{V}(\gamma)$ that are connected by a $p_2$-open path $\beta$ that is disjoint from $\gamma$, and additionally, that this path has at least one edge that is $p_1$-closed but $p_2$-open. We write $\mathcal{G}^{\ge}(\gamma)$ for the event that there exists such a $\beta$ with $|\beta| \ge t_{\mathrm{mix}}$, and $\mathcal{G}^{\le}(\gamma)$ for the event that all such $\beta$'s have length less than $t_{\mathrm{mix}}$. Our goal is to bound from above the probability of $\biguplus_\gamma\big(\mathcal{G}^{\ge}(\gamma) \cup \mathcal{G}^{\le}(\gamma)\big)$ by the right-hand side of (4.7). We start with $\mathcal{G}^{\ge}(\gamma)$, which is simpler to analyze. Here we drop the requirement that one of the edges of $\beta$ is $p_1$-closed. The event $\mathcal{G}^{\ge}(\gamma)$ implies that there exist $x, y \in \mathcal{V}(\gamma)$ such that the events (i) $\gamma$ is the first $p_1$-open shortest path of length $r$ starting from $0$, and (ii) there exists a $p_2$-open path from $x$ to $y$ of length at least $t_{\mathrm{mix}}$ that is disjoint from $\gamma$, occur disjointly. Indeed, the witness set for the first event is the set of edges of $\gamma$ and all the $p_1$-closed edges (the closed edges determine that $\gamma$ is the first shortest $p_1$-open path), and the witness set for the second event is the set of (open) edges of $\beta$. These witness sets are disjoint by construction. By the BKR-inequality and the union bound we get
$$\sum_\gamma \mathbb{P}^A\big(\mathcal{G}^{\ge}(\gamma)\big) \le \sum_\gamma \mathbb{P}^A\big(\mathcal{F}_{r,p_1}(\gamma)\big)\sum_{x,y \in \mathcal{V}(\gamma)}\mathbb{P}_{p_2}\big(x \xleftrightarrow{\,\ge t_{\mathrm{mix}}\,} y\big).$$
By (4.1) and (2.2), and since $V^{-1}r^2\varepsilon^{-1} = o(1)$ by the assumptions on $r$ and $\varepsilon$, we get that this contribution is $o(1)\,\mathbb{P}^A(\mathcal{F}_{r,p_1})$, an error term that can be absorbed into (4.7).
To bound the probability of ⊎  ≤ ( ) we observe that this union implies that there exist nonnegative integers k, t, with k + t ≤ r, ≤ t mix and vertices x, y such that the following two events occur disjointly: Indeed, as before, the witness set for the first event is and all the p 1 -closed edges, while the witness set for the second event are the edges of . These witness sets are again disjoint. The BKR-inequality gives x, y, )) . (4.9) To bound the probability of the second event, we enumerate all the possible 's. The number of such 's is at most m(m−1) −1 p (x, y) and the probability that ∩ 2 ≠ ∅ is precisely p 2 (1−(p 1 ∕p 2 ) ). Since (p 2 (m − 1)) = 1 + o(1) when ≤ t mix by (1.12), and since (1 − (p 1 ∕p 2 ) ) ≤ we bound To bound the first term in the sum on the right-hand side of (4.9) we condition on the open and closed edges that determine B 0 (k + t) using the same approach as in the proof of Theorem 4.3. Afterwards we condition on the open and closed edges of B 0 (k) and proceed similarly. We get We now separate into four cases, corresponding to whether t ≥ t mix or not, and whether r − k − t ≥ −1 or not.
The first case we consider is when $t \ge t_{\mathrm{mix}}$ and $r - k - t \ge \varepsilon^{-1}$. In this case we may use the induction hypothesis on the last term of (4.11), and (4.2) together with Theorem 4.3 to bound the second term on the right-hand side of (4.11), yielding
$$\mathbb{P}^A\big(\mathcal{E}_1(x,y,k,t)\big) \le \mathbb{P}\big(0 \xleftrightarrow{\,=k\,} x\big)\cdot\frac{C(1-\varepsilon)^t}{V}\cdot C\varepsilon(1-\varepsilon)^{r-k-t}.$$
This together with (4.10) gives that the sum in (4.9) when $t \ge t_{\mathrm{mix}}$ and $r-k-t \ge \varepsilon^{-1}$ is at most a constant times
$$\frac{\varepsilon^2}{V}\sum_{x,y}\sum_{k,t}\sum_{\ell \le t_{\mathrm{mix}}}\mathbb{P}\big(0 \xleftrightarrow{\,=k\,} x\big)\,(1-\varepsilon)^{r-k}\,\ell\,p_\ell(x,y).$$
Since $\sum_y p_\ell(x,y) = 1$, we may sum the term $\mathbb{P}(0 \xleftrightarrow{\,=k\,} x)$ over $x$ and bound it by $C(1-\varepsilon)^k$ using Theorem 4.3. This yields the bound
$$\frac{C\,\varepsilon^2\,(1-\varepsilon)^r\,r^2\,t_{\mathrm{mix}}^2}{V}. \qquad (4.12)$$
By (1.11) and since $\varepsilon_m \gg V^{-1/3}$ we get that (4.12) is $o(\varepsilon(1-\varepsilon)^r)$. The second case is when $t \ge t_{\mathrm{mix}}$ and $r - k - t \le \varepsilon^{-1}$. In this case we bound the last term of (4.11) using (2.4) (instead of the induction hypothesis) and then proceed as in the previous case to obtain
$$\mathbb{P}^A\big(\mathcal{E}_1(x,y,k,t)\big) \le \mathbb{P}\big(0 \xleftrightarrow{\,=k\,} x\big)\cdot\frac{C(1-\varepsilon)^t}{V}\cdot\frac{C}{r-k-t}.$$
So, for the sum in (4.9) when $t \ge t_{\mathrm{mix}}$ and $r-k-t \le \varepsilon^{-1}$, as before we use $\sum_y p_\ell(x,y) = 1$ and apply Theorem 4.3 to the sum over $x$; summing $(r-k-t)^{-1}$ over the constraint $r-k-t \le \varepsilon^{-1}$ contributes a factor $\log(\varepsilon^{-1})$, yielding a bound of $C\varepsilon(1-\varepsilon)^r\,V^{-1}\,r\,\log(\varepsilon^{-1})\,t_{\mathrm{mix}}^2$ (4.13). By (1.11) and since $r \le V^{1/3}$, we get that $V^{-1}\,r\,\log(\varepsilon^{-1})\,t_{\mathrm{mix}}^2 = o(1)$, as required. The third case is when $t \le t_{\mathrm{mix}}$ and $r - k - t \ge \varepsilon^{-1}$. We proceed from (4.11). Since $t \le t_{\mathrm{mix}}$ we may enumerate the paths connecting $x$ to $y$ in the same manner by which we reached (4.10) to get that
$$\mathbb{P}\big(x \xleftrightarrow{\,=t\,} y \text{ off } B_0(k)\big) \le C(1-\varepsilon)^t\,p_t(x,y)$$
(we dropped the requirement that the paths avoid the ball, for an upper bound). We now proceed as in the first case. By Lemma 4.2, summing $\ell\,p_t(x,y)\,p_\ell(x,y)$ over $\ell, t, y$ gives a factor $O(\alpha_m/\log V)$; summing $\mathbb{P}(0 \xleftrightarrow{\,=k\,} x)$ over $x$ gives $C(1-\varepsilon)^k$ by Theorem 4.3, and summing over $k$ gives a factor $r$. We thus obtain the bound
$$C\,\varepsilon^2\,(1-\varepsilon)^r\,r\,\frac{\alpha_m}{\log V}. \qquad (4.14)$$
Since $r = O(\varepsilon^{-1}\log(\varepsilon^3 V))$ and $\alpha_m = o(1)$ we get that this last factor is $o(\varepsilon(1-\varepsilon)^r)$, as required.
The fourth and final case is when $t \le t_{\mathrm{mix}}$ and $r - k - t \le \varepsilon^{-1}$. We proceed from (4.11) and use (2.4), and then proceed exactly as in the first case. For the sum in (4.9) over $t \le t_{\mathrm{mix}}$ and $r-k-t \le \varepsilon^{-1}$, we start by summing over $x$ to get a factor $C(1-\varepsilon)^k$, and then bound the product $(1-\varepsilon)^{k+t}$ by $C(1-\varepsilon)^r$, since $r-k-t \le \varepsilon^{-1}$. We then sum $(r-k-t)^{-1}$ over $k$ and get a factor $\log(\varepsilon^{-1})$. We finish by summing over $\ell, t, y$ using Lemma 4.2 to get the bound
$$C\,\varepsilon\,(1-\varepsilon)^r\,\log(\varepsilon^{-1})\,\frac{\alpha_m}{\log V}, \qquad (4.15)$$
and the proof is completed since $\log(\varepsilon^{-1}) \le \log V$ and $\alpha_m = o(1)$. ▪

Proof of the upper bounds in Theorem 1.7(b)
These follow directly from Theorems 4.3 and 4.4.

The one-arm probability off a set
As we have seen several times before, since $\{\partial B(r) \ne \emptyset\}$ is not monotone, it is not a priori clear that the probability of this event cannot increase if we restrict ourselves to a subgraph. We believe that the unrestricted setting maximizes the one-arm probability, but we are unable to prove this. The following estimate (which we shall use several times later on) shows that as long as we do not remove too many edges, the probability does not change much. In what follows, for a subset of edges $A$ we write $\mathcal{V}(A)$ for the set of vertices that are touched by $A$. Theorem 4.5 asserts that, in the setting of Theorem 1.7 with $p = p_c(1-\varepsilon)$, for any edge set $A$ with $|\mathcal{V}(A)| = O(\varepsilon^{-2}\log(\varepsilon^3 V))$ and any $r = O(\varepsilon^{-1}\log(\varepsilon^3 V))$,
$$\sum_x \mathbb{P}_p\big(\partial B_x(r) \ne \emptyset \text{ off } A\big) \le (1+o(1))\,V\,\mathbb{P}_p\big(\partial B(r) \ne \emptyset\big).$$
Proof. Suppose that $\mathcal{F}_r(x,\gamma;A)$ occurs, that is, $\gamma$ is the first $p$-open shortest path of length $r$ starting at $x$ in the configuration off $A$, but that the unrestricted intrinsic ball admits a shortcut through $\mathcal{V}(A)$. Then there exist a vertex $a \in \mathcal{V}(A)$ and vertices $u, v \in \mathcal{V}(\gamma)$ such that the events $\mathcal{F}_r(x,\gamma;A)$, $\{a \leftrightarrow u\}$, and $\{a \leftrightarrow v\}$ occur disjointly. Indeed, the set of witness edges for the first event are the edges of $\gamma$ together with all the closed edges (these determine that $\gamma$ is the first $p$-open shortest path off $A$). The sets of witness edges for the second and third events are the edges of two disjoint paths, connecting $a$ to $u$ and $a$ to $v$, respectively (such paths must exist because there exists a shortcut to $\gamma$ passing through $\mathcal{V}(A)$).
We again split the event according to whether these shortcuts are longer than $t_{\mathrm{mix}}$ or not. Denote by $\mathcal{G}^{\ge}(x,\gamma,a;A)$ the event that $\mathcal{F}_r(x,\gamma;A)$ occurs and that both disjoint connections from $a$ to $u$ and from $a$ to $v$ have length at least $t_{\mathrm{mix}}$, and define $\mathcal{G}^{\le}(x,\gamma,a;A)$ analogously, except that now one of the connections has length at most $t_{\mathrm{mix}}$. We bound the probability of $\mathcal{G}^{\ge}(x,\gamma,a;A)$ using the BKR-inequality, (2.2), and (4.1):
$$\sum_x\sum_\gamma\sum_{a \in \mathcal{V}(A)}\mathbb{P}_p\big(\mathcal{G}^{\ge}(x,\gamma,a;A)\big) \le |\mathcal{V}(A)|\,r^2\,\Big(\frac{C}{\varepsilon V}\Big)^2\sum_x \mathbb{P}_p\big(\partial B_x(r) \ne \emptyset \text{ off } A\big).$$
Hence, by our assumptions on $|\mathcal{V}(A)|$ and $r$, we get that this contribution is
$$o\big(Vr^{-1}(1-\varepsilon)^r\big) = o\big(V\,\mathbb{P}_p(\partial B(r) \ne \emptyset)\big),$$
where the last bound is due to Lemma 2.2(a) and Theorem 4.4.
To bound the probability of ⊎ ∪ a∈(A)  ≤ (x, , a; A) we consider further two cases: either the disjoint paths from a to u and a to v are both of length at most t mix , or one of these paths has length at most t mix and the other is longer than t mix . For fixed a ∈ (A), the union over of the first case implies that there exist vertices u, v ∈ ( ) and integers k, t ≤ r such that occurs. As usual, the witness edges for the first disjointly occurring event are the edges of and all the closed edges, and the other two sets of witness edges are simply the disjoint open paths between a and u and between a and v. The analysis now proceeds similarly to the proof of previous theorems in this section, splitting the sum into four parts, according to whether t ≥ t mix or not, and whether r − k − t ≥ −1 or not.
We start with the case $t \ge t_{\mathrm{mix}}$ and $r - k - t \ge \varepsilon^{-1}$. We use the BKR-inequality and, as before, we condition on $B_x(k+t)$ and use Theorem 4.4 to get a factor $C\varepsilon(1-\varepsilon)^{r-k-t}$ for the probability that the arm continues beyond level $k+t$. We proceed by conditioning on $B_x(k)$ and use Theorem 4.3 and (4.1) to get a factor $CV^{-1}(1-\varepsilon)^t$. We then sum the probability of the third disjoint event over $v$ to get a factor $t_{\mathrm{mix}}$ by (2.3). Then we sum the probability of the first event in the first disjoint event over $x$ to get a factor $C(1-\varepsilon)^k$ by Theorem 4.3. Lastly, we sum the probability of the second disjoint event over $u$ to get a factor $t_{\mathrm{mix}}$, again by (2.3). All this gives the bound
$$\sum_{a \in \mathcal{V}(A)}\sum_{k,t}\ C\,\varepsilon\,(1-\varepsilon)^r\,\frac{t_{\mathrm{mix}}^2}{V} \le C\,|\mathcal{V}(A)|\,\varepsilon\,(1-\varepsilon)^r\,\frac{r^2\,t_{\mathrm{mix}}^2}{V}.$$
By (1.11), our assumption on $r$, and since $\varepsilon \gg V^{-1/3}$, this quantity is $o(Vr^{-1}(1-\varepsilon)^r)$. Lemma 2.2(a) now gives the claimed bound in this case.
The case where $t \ge t_{\mathrm{mix}}$ and $r - k - t < \varepsilon^{-1}$ is very similar, except that we now use (2.4) to get a factor $(r-k-t)^{-1}$. This gives a bound of
$$\sum_{a \in \mathcal{V}(A)}\sum_{k,t}\ \frac{C\,(1-\varepsilon)^r\,t_{\mathrm{mix}}^2}{V\,(r-k-t)},$$
which is $o(Vr^{-1}(1-\varepsilon)^r)$ by (1.11) and our usual assumptions on $r$ and $\varepsilon$.
For the case $t \le t_{\mathrm{mix}}$ and $r - k - t \ge \varepsilon^{-1}$ we again apply the same method of conditioning on $B_x(k+t)$ and $B_x(k)$ to get a factor $C\varepsilon(1-\varepsilon)^{r-k}$. At this point, instead of summing over $u$, we sum over $x$, using Theorem 4.3 to bound the resulting sum (the restriction $a \ne u$, $u \ne v$, $v \ne a$ holds since by construction $a$, $u$, and $v$ must be distinct vertices). Summing over $k$ gives a factor $r$. By enumerating over paths, in the same way we derived (4.10), when $s \le t_{\mathrm{mix}}$ we may bound $\mathbb{P}_p(a \xleftrightarrow{\,=s\,} u) \le (1+o(1))\,p_s(a,u)$. This yields a bound of
$$C\,\varepsilon\,(1-\varepsilon)^r\,r\sum_{a \in \mathcal{V}(A)}\ \sum_{\substack{t_1,t_2,t_3 \le t_{\mathrm{mix}}\\ t_1+t_2+t_3 \ge 3}}\ \sum_{u,v}p_{t_1}(a,u)\,p_{t_2}(u,v)\,p_{t_3}(v,a),$$
where the sum runs over $t_1 + t_2 + t_3 \ge 3$ since $a$, $u$, and $v$ are distinct vertices. By (1.13) and the rest of our assumptions we get that this sum is again $o(Vr^{-1}(1-\varepsilon)^r)$.
The fourth and final case is when $t \le t_{\mathrm{mix}}$ but $r - k - t \le \varepsilon^{-1}$. We take the same first steps as in the previous case, but when we sum over $x$ we now get an additional factor $(r-k-t)^{-1}$, where we used (2.4); summing it over $k$ yields a factor $\log r$. The resulting bound is $o(Vr^{-1}(1-\varepsilon)^r)$ by (1.13) and the rest of our assumptions, and since $\log r \le \log V$. ▪

THE COMPONENT OF MAXIMAL DIAMETER
In this section we prove that $\Delta_{\max} = (1+o(1))\,\varepsilon^{-1}\log(\varepsilon^3 V)$ with high probability. To start, we need a refinement of (2.7).

Lemma 5.1
Assume the setting of Theorem 1.6. Then there exist $C < \infty$ and $c > 0$ such that for any $\varepsilon > 0$ and any $\delta \in (0, (32C_2)^{-1}]$, where $C_2$ is the constant from (2.4), and any $r \ge 4\varepsilon^{-1}$, we have
$$\mathbb{P}_p\big(\partial B(r) \ne \emptyset,\ |B(\varepsilon^{-1})| \le \delta\varepsilon^{-2}\big) \le C\,\varepsilon\,e^{-c/\delta}\,(1-\varepsilon)^{r-\varepsilon^{-1}}.$$
Proof. The proof is very similar to [28, Lemma 6.2]; however, minor changes are required, so we briefly repeat it here for completeness. Put $h = 4\delta\varepsilon^{-1}$. We say that a level $j \in [\varepsilon^{-1}/2, \varepsilon^{-1}]$ is thin if $|\partial B(j)| \le h$. Define $j_1$ to be the first thin level larger than $\varepsilon^{-1}/2$ and recursively define, for $i \ge 2$, $j_i$ to be the first thin level larger than $j_{i-1} + 2C_2 h$, where $C_2$ is the constant from (2.4). We say that level $j$ is good if there exists a vertex $w \in \partial B(j)$ such that $\partial B_w(2C_2 h) \ne \emptyset$ off $B(j)$. By (2.4) and the union bound we have that a thin level is good with conditional probability at most $h \cdot C_2/(2C_2 h) = \frac12$. We iterate this and get that for any $n$ we have
$$\mathbb{P}_p\big(\text{levels } j_1, \ldots, j_n \text{ are all good}\big) \le 2^{-n}. \qquad (5.1)$$
Now, if the events $\partial B(r) \ne \emptyset$ and $|B(\varepsilon^{-1})| \le \delta\varepsilon^{-2}$ occur, then the following occurs: (i) $\partial B(\varepsilon^{-1}/2) \ne \emptyset$, and (ii) levels $j_1, j_2, \ldots, j_n$ are good with $n$ satisfying $n \ge \varepsilon^{-1}/(8C_2 h)$, and (iii) there exists $w \in \partial B(j_n)$ such that $\partial B_w(r - j_n) \ne \emptyset$ off $B(j_n)$.
By (2.4) the probability of (i) is at most $C\varepsilon$. By Theorem 4.4, the fact that level $j_n$ is thin, $j_n \le \varepsilon^{-1}$, and the union bound, we get that the conditional probability of (iii) is at most $h\,C\varepsilon\,(1-\varepsilon)^{r-\varepsilon^{-1}}$. Combining these with (5.1) and plugging in the value of $h$ gives that
$$\mathbb{P}_p\big(\partial B(r) \ne \emptyset,\ |B(\varepsilon^{-1})| \le \delta\varepsilon^{-2}\big) \le C\varepsilon \cdot 2^{-1/(32C_2\delta)} \cdot 4\delta\,\varepsilon^{-1}\cdot C\varepsilon\,(1-\varepsilon)^{r-\varepsilon^{-1}} \le C\,\varepsilon\,e^{-c/\delta}\,(1-\varepsilon)^{r-\varepsilon^{-1}},$$
where $c = (32C_2)^{-1}$ and we used again our assumption $\delta \le (32C_2)^{-1}$. ▪

Proof of the upper bound in Theorem 1.7(a)
We begin by proving the upper bound on $\Delta_{\max}$, that is, we will prove that under the conditions of the theorem, for any $\delta > 0$ we have
$$\mathbb{P}_p\big(\Delta_{\max} \ge (1+\delta)\,\varepsilon^{-1}\log(\varepsilon^3 V)\big) = o(1). \qquad (5.2)$$
The initial idea is that if there is a vertex $x$ such that $\partial B_x(r) \ne \emptyset$, then $|B_x(\varepsilon^{-1})|$ is typically of order $\varepsilon^{-2}$, and so there are in fact $\varepsilon^{-2}$ vertices $u$ with $\partial B_u(r - \varepsilon^{-1}) \ne \emptyset$, allowing us to use Markov's inequality. However, with some small probability the $\varepsilon^{-1}$-ball will have $o(\varepsilon^{-2})$ vertices, invalidating the argument. We fix this with a multiscale argument using Lemma 5.1.
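To see the numbers behind this plan (our arithmetic, using Theorem 4.4 in the form $\mathbb{P}_p(\partial B(r) \ne \emptyset) \le C\varepsilon(1-\varepsilon)^r$): if the event $\{\Delta_{\max} \ge r\}$ forced at least $N$ vertices $u$ with $\partial B_u(r - \varepsilon^{-1}) \ne \emptyset$, then Markov's inequality would give
$$\mathbb{P}_p\big(\Delta_{\max} \ge r\big) \;\le\; \frac{V\,C\varepsilon\,(1-\varepsilon)^{r-\varepsilon^{-1}}}{N} \;\le\; \frac{C e\,V\varepsilon\,(\varepsilon^3 V)^{-1-\delta}}{N} \qquad \text{for } r = (1+\delta)\,\varepsilon^{-1}\log(\varepsilon^3 V),$$
which is of order $(\varepsilon^3 V)^{-\delta}$ when $N \asymp \varepsilon^{-2}$; the logarithmic corrections to $N$ are exactly what the multiscale argument below controls.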
The case where $|B_x(\varepsilon^{-1})|$ is smaller than $\varepsilon^{-2}/\log(\varepsilon^{-1})$ is handled rather easily. Indeed, by Lemma 5.1 and the union bound we get that for any $\zeta > 0$,
$$\mathbb{P}_p\Big(\exists x:\ \partial B_x(r) \ne \emptyset,\ |B_x(\varepsilon^{-1})| \le \zeta\,\frac{\varepsilon^{-2}}{\log(\varepsilon^{-1})}\Big) \le C\,V\varepsilon\,e^{-c\log(\varepsilon^{-1})/\zeta}\,(1-\varepsilon)^{r-\varepsilon^{-1}}.$$
We have that $(1-\varepsilon)^r \le (\varepsilon^3 V)^{-1-\delta}$. We put $\zeta = c/2$ so that $e^{-c\log(\varepsilon^{-1})/\zeta} = \varepsilon^2$, and hence the above probability is at most $C(\varepsilon^3 V)^{-\delta} = o(1)$. If on the other hand $|B_x(\varepsilon^{-1})| \ge (c/2)\varepsilon^{-2}/\log(\varepsilon^{-1})$, then we may conclude whenever $\log(\varepsilon^{-1})$ is bounded by a small power of $\varepsilon^3 V$, since by the triangle inequality the event implies that there are at least $(c/2)\varepsilon^{-2}/\log(\varepsilon^{-1})$ vertices $u$ such that $\partial B_u(r - \varepsilon^{-1}) \ne \emptyset$, and by Theorem 4.4 and Markov's inequality the probability of this is at most $C(\varepsilon^3 V)^{-\delta}\log(\varepsilon^{-1})$. If on the other hand $\log(\varepsilon^{-1})$ is large, we iterate the argument. Let $N$ be the least integer such that $\big(\log^{(N)}(\varepsilon^{-1})\big)^2 \le (\varepsilon^3 V)^{\delta/2}$, where $\log^{(n)}$ is the composition of $\log$ with itself $n-1$ times. Define an increasing sequence of radii $(r_k)_{k=1}^N$ with $r_1 = \varepsilon^{-1}$ and $r_k \le 2\varepsilon^{-1}$ for all $k \le N$. If there exists a vertex $x$ such that $\partial B_x(r) \ne \emptyset$ and $|B_x(\varepsilon^{-1})| \ge \varepsilon^{-2}/\log(\varepsilon^{-1})$ both occur, then one of the following events must occur: (i) $|B_x(r_N)| \ge \varepsilon^{-2}/(\log^{(N)}(\varepsilon^{-1}))^2$ and $\partial B_x(r) \ne \emptyset$; (ii) for some $k \in \{2, \ldots, N\}$ we have $|B_x(r_{k-1})| \ge \varepsilon^{-2}/(\log^{(k-1)}(\varepsilon^{-1}))^2$ and $|B_x(r_k)| < \varepsilon^{-2}/(\log^{(k)}(\varepsilon^{-1}))^2$, and $\partial B_x(r) \ne \emptyset$; (iii) $|B_x(r_1)| < \varepsilon^{-2}/(\log(\varepsilon^{-1}))^2$ and $\partial B_x(r) \ne \emptyset$. By the triangle inequality and since $r_k \le 2\varepsilon^{-1}$, if (i) occurs, then there are at least $\varepsilon^{-2}/(\log^{(N)}(\varepsilon^{-1}))^2$ vertices $u$ such that $\partial B_u(r - 2\varepsilon^{-1}) \ne \emptyset$. As before, Theorem 4.4 together with Markov's inequality gives that the probability of this is at most $C(\varepsilon^3 V)^{-\delta}\big(\log^{(N)}(\varepsilon^{-1})\big)^2 = o(1)$, by the definition of $N$. If (ii) occurs for some $k \in \{2, \ldots, N\}$, then each of the at least $\varepsilon^{-2}/(\log^{(k-1)}(\varepsilon^{-1}))^2$ vertices $u \in B_x(r_{k-1})$ satisfies $|B_u(r_k - r_{k-1})| \le \varepsilon^{-2}/(\log^{(k)}(\varepsilon^{-1}))^2$ and $\partial B_u(r - 2\varepsilon^{-1}) \ne \emptyset$.
By Lemma 5.1, for each vertex $u$ the probability of this is at most
$$C\,\varepsilon\,(1-\varepsilon)^{r - 4\varepsilon^{-1}}\,e^{-c(\log^{(k)}(\varepsilon^{-1}))^2},$$
where the last factor follows from our usual assignment of variables and the fact that $\log^{(k)}(\varepsilon^{-1})/\big(\log^{(k+1)}(\varepsilon^{-1})\big)^2 \to \infty$. Since $|B_x(r_{k-1})| \ge \varepsilon^{-2}/(\log^{(k-1)}(\varepsilon^{-1}))^2$, we get by Markov's inequality that the probability that (ii) occurs for a fixed $k \in \{2, \ldots, N\}$ is at most
$$C\,(\varepsilon^3 V)^{-\delta}\,\big(\log^{(k-1)}(\varepsilon^{-1})\big)^2\,e^{-c(\log^{(k)}(\varepsilon^{-1}))^2},$$
and summing over $k$ gives that the probability of (ii) tends to 0 as well.
The bound on (iii) is performed in the same way, using Lemma 5.1. This concludes the proof of (5.2).

Proof of the lower bound in Theorem 1.7(a)
Let us now prove the lower bound on $\Delta_{\max}$, that is, that for any $\delta > 0$ we have
$$\mathbb{P}_p\big(\Delta_{\max} \le (1-\delta)\,\varepsilon^{-1}\log(\varepsilon^3 V)\big) = o(1).$$
Let $\delta > 0$ be arbitrary and put $r = (1-\delta)\,\varepsilon^{-1}\log(\varepsilon^3 V)$. Let $D_r$ denote the random variable
$$D_r := \#\{v : \partial B_v(r) \ne \emptyset\},$$
so that it suffices to prove that $D_r > 0$ with probability tending to 1. We prove this using a second moment argument. By (2.1) and (2.2) it follows that clusters contributing to $D_r$ have volume $O(\varepsilon^{-2}\log(\varepsilon^3 V))$ with high probability. By Lemma 2.2(a) again we get that for some fixed $c > 0$,
$$\mathbb{E}[D_r] \ge \frac{cV}{r}\,(1-\varepsilon)^r. \qquad (5.6)$$
Any pair of vertices $u$ and $v$ counted in $D_r$ can either belong to the same component or not. Thus, the second moment of $D_r$ can be bounded by
$$\mathbb{E}[D_r^2] \le \sum_{u,v}\mathbb{P}_p\big(\partial B_u(r) \ne \emptyset,\ \partial B_v(r) \ne \emptyset,\ u \leftrightarrow v\big) + \sum_{u,v}\mathbb{P}_p\big(\partial B_u(r) \ne \emptyset,\ \partial B_v(r) \ne \emptyset,\ u \nleftrightarrow v\big).$$
Denote the first sum by (I) and the second by (II). Bounding the first term is easy:
$$(\mathrm{I}) \le V\,\mathbb{E}_p\big[|\mathcal{C}|\,\mathbf{1}_{\{\partial B(r) \ne \emptyset\}}\big] \le C\,V\,\varepsilon^{-1}\log(\varepsilon^3 V)\,(1-\varepsilon)^r = o\big(\mathbb{E}[D_r]^2\big),$$
where the one-before-last inequality is due to Theorem 4.4, and the last inequality comes from plugging in the value of $r$.
To estimate (II) we condition on (u) such that B u (r) ≠ ∅, and then require that B v (r) ≠ ∅ occurs off (u). We write this as Such subgraphs A satisfy the condition of Theorem 4.5 that |(A)| = O( −2 log( 3 V)), so we bound Applying this bound and summing (5.7) over A and u gives Comparing with (5.6) we get that (II)= (1 + o(1))E[D r ] 2 , which, together with our previous estimate, implies that The proof is now completed using the inequality P( , valid for any nonnegative random variable Z.

Proof of the upper bound in Theorem 1.7(c)
The upper bound follows from the lemma below, which is proved in [28].
We know from Theorems 1.6(a) and 1.7(a) that for all clusters $\mathcal{C}$ at $p = p_c(1-\varepsilon)$ with $\varepsilon = o(1)$ and $\varepsilon^3 V \to \infty$, for all $\delta > 0$ we have that $|\mathcal{C}| \le (2+\delta)\,\varepsilon^{-2}\log(\varepsilon^3 V)$ and $\mathrm{diam}(\mathcal{C}) \le (1+\delta)\,\varepsilon^{-1}\log(\varepsilon^3 V)$ with high probability. This does not, however, directly imply a good estimate of the maximal number of edges in a cluster, which is what we need. The following lemma gives such an estimate, and the proof of the upper bound in Theorem 1.5 then follows.
Proof. Write $M = 3\varepsilon^{-2}\log(\varepsilon^3 V)$. We bound the probability that some cluster contains at least $3M$ edges by
$$\mathbb{P}_p\big(|\mathcal{C}_1| \ge M\big) + \mathbb{P}_p\big(\exists x:\ |\mathcal{C}(x)| \le M \text{ and } \mathcal{C}(x) \text{ contains at least } 3M \text{ edges}\big).$$
By Theorem 1.6(a) and our choice of $M$, the first term on the right-hand side of the above is $o(1)$, and it remains to show the second term is also $o(1)$.
To that aim, given a vertex $x$, we write $\mathcal{E}_x$ for the number of edges of the connected component containing $x$. Conditioned on the vertex set of $\mathcal{C}(x)$ and on a spanning tree of $\mathcal{C}(x)$ that consists only of open edges (such a spanning tree could be, for instance, the BFS tree of $\mathcal{C}(x)$), we have that $\mathcal{E}_x - |\mathcal{C}(x)|$ is stochastically dominated by a binomial random variable with parameters $m|\mathcal{C}(x)|$ and $p$. Thus, if $|\mathcal{C}(x)| \le M$, the probability of the event that $\mathcal{E}_x \ge 3M$ is bounded above by the probability that a $\mathrm{Bin}(mM, p)$ random variable exceeds $2M$. We use the standard Chernoff bound [21, Theorem 2.1]: if $X \sim \mathrm{Bin}(n,q)$, then
$$\mathbb{P}\big(X \ge nq + t\big) \le \exp\Big(-\frac{t^2}{2(nq + t/3)}\Big)$$
for any $t > 0$. Since $p = m^{-1}(1+o(1))$ we obtain that
$$\mathbb{P}\big(\mathrm{Bin}(mM, p) \ge 2M\big) \le e^{-cM}$$
for some universal $c > 0$. It is straightforward to see that by our assumption on $\varepsilon$ the latter quantity is $o(V^{-1})$, and so the probability that there exists such a vertex $x$ is $o(1)$, concluding our proof. ▪
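For concreteness, here is the arithmetic behind the last step (ours, instantiating the Chernoff bound with $n = mM$ and $q = p$, so that $nq = (1+o(1))M$, and with $t = (1+o(1))M$):
$$\mathbb{P}\big(\mathrm{Bin}(mM, p) \ge 2M\big) \;\le\; \exp\Big(-\frac{(1+o(1))\,M^2}{2\big((1+o(1))M + M/3\big)}\Big) \;=\; e^{-(\frac{3}{8}+o(1))M},$$
so any fixed $c < 3/8$ works in the bound $e^{-cM}$ above.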

Proof of the lower bound in Theorem 1.7(c)
For the proof of the lower bound we use a lemma from [28], for which we require some definitions: (i) For an integer $r'$ and a vertex $v$ we call an edge $e$ a lane for $(v, r')$ if $e$ is an edge between $\partial B_v(j-1)$ and $\partial B_v(j)$ for some $0 < j < r'$, and there exists an open path with first edge $e$ from $\partial B_v(j-1)$ to $\partial B_v(r')$ that does not pass through $B_v(j-1)$. (ii) For integers $r', j$ and $\ell$ with $0 < j < r'$ we say that level $j$ has $\ell$ lanes for $(v, r')$ if there are at least $\ell$ edges between $\partial B_v(j-1)$ and $\partial B_v(j)$ that are a lane for $(v, r')$. (iii) We say that $v$ is $\ell$-lane rich for $(k, r')$ if more than half the levels $j \in [k/2, k]$ have at least $\ell$ lanes for $(v, r')$. Lemma 6.3 (Lemma 5.4 from [28]). Let $G = (\mathcal{V}, \mathcal{E})$ be a graph and $v \in \mathcal{V}$. Suppose that $q, h, k, r'$ and $\ell$ are positive integers satisfying the conditions of that lemma; then the mixing time of the lazy random walk on the cluster of $v$ admits the corresponding lower bound. Thus, our goal is to choose the parameters of the above lemma appropriately and to show that a vertex $v$ satisfying the assumptions of the lemma exists. We fix a positive sequence $\beta_m = o(1)$ decaying more slowly than the sequence $\alpha_m$ given in the statement of Theorem 1.7. We also fix some $\delta > 0$ and set our parameters accordingly.
See Figure 4 for a sketch of these three events. (b) Let $L$ denote the number of lanes between levels $k/2$ and $k$. If $v$ is $\ell$-lane rich for $(k, r')$, then $L \ge \ell k/4$, so by Markov's inequality,
$$\mathbb{P}_p\big(v \text{ is } \ell\text{-lane rich for } (k,r'),\ \partial B_v(r) \ne \emptyset\big) \le \frac{4\,\mathbb{E}_p\big[L\,\mathbf{1}_{\{\partial B_v(r) \ne \emptyset\}}\big]}{\ell k}.$$
The claim follows if we prove that $\mathbb{E}_p[L\,\mathbf{1}_{\{\partial B_v(r) \ne \emptyset\}}] \le Ck\,\mathbb{P}_p(\partial B_v(r) \ne \emptyset)$, since $1/\ell = o(1)$. Recall from (3.1) and (3.2) that $\mathcal{F}_r(v,\gamma)$ is the event that $\gamma$ is the first $p$-open path of length $r$ emanating from $v$, and that $\biguplus_\gamma \mathcal{F}_r(v,\gamma) = \{\partial B_v(r) \ne \emptyset\}$. We condition on $\gamma$. Conditioned on $\mathcal{F}_r(v,\gamma)$, any edge that is a lane for $(v, r')$ can either belong to $\gamma$, or be on a path extending from $\gamma$ to $\partial B_v(r')$ without intersecting $\gamma$ again, or be on a path starting from a vertex of $\gamma$ and ending in a different vertex of $\gamma$. More precisely, if $\mathcal{F}_r(v,\gamma)$ happens and $e = \{e^-, e^+\}$ is a lane such that $e^- \in \partial B_v(j-1)$ and $e^+ \in \partial B_v(j)$ for some $j \in [k/2, k]$, then either $e$ lies on $\gamma$, or there is an open path $\beta$ witnessing the lane that meets $\gamma$ only at its endpoints; we denote by $\mathcal{G}^{\le}(\gamma, e)$ and $\mathcal{G}^{\ge}(\gamma, e)$ the latter event on which $|\beta| \le 2t_{\mathrm{mix}}$ or $|\beta| \ge 2t_{\mathrm{mix}}+1$, respectively. We bound these two events separately, starting with $\mathcal{G}^{\ge}(\gamma,e)$, for which we bound its probability conditioned on $\mathcal{F}_r(v,\gamma)$: if $\mathcal{G}^{\ge}(\gamma,e)$ occurs, then either the part of $\beta$ leading to $e$ is longer than $t_{\mathrm{mix}}$, or the part of $\beta$ starting from $e$ is longer than $t_{\mathrm{mix}}$. Thus by the BKR-inequality (where, as usual, the witnesses to $\mathcal{F}_r(v,\gamma)$ are the open edges of $\gamma$ together with the closed edges, and the other two are the corresponding open paths) we get a bound of $C r \varepsilon^{-2} V^{-1}$ per level, where the second bound follows from (4.1) and (2.2). Since $r\varepsilon^{-2}V^{-1} = o(1)$, the contribution from $\mathcal{G}^{\ge}(\gamma,e)$ is $o(k)\,\mathbb{P}_p(\partial B_v(r) \ne \emptyset)$. The contribution of $\mathcal{G}^{\le}(\gamma,e)$ to $\mathbb{E}_p[L\,\mathbf{1}_{\{\partial B_v(r) \ne \emptyset\}}]$ is bounded differently. We write $L$ as a sum of indicators over the edge $e$, and for each edge separately we take the union of the events $\mathcal{G}^{\le}(\gamma,e)$. The event $\biguplus_\gamma \mathcal{G}^{\le}(\gamma,e)$ implies that there exist integers $s, t, \ell'$ with $s + t \le r$, $s \le k$ and $\ell' \le 2t_{\mathrm{mix}}+1$, and vertices $x, y$, such that the events $\mathcal{E}_1(x,y,s,t)$ and $\mathcal{E}_2(x,y,e,\ell')$ (defined as before, with $\mathcal{E}_2$ now requiring that $\beta$ passes through $e$) occur disjointly. Applying the BKR-inequality yields
$$\sum_e \mathbb{P}_p\Big(\biguplus_\gamma \mathcal{G}^{\le}(\gamma,e)\Big) \le \sum_{x,y,e}\ \sum_{s \le k,\ s+t \le r,\ \ell' \le 2t_{\mathrm{mix}}+1}\mathbb{P}_p\big(\mathcal{E}_1(x,y,s,t)\big)\,\mathbb{P}_p\big(\mathcal{E}_2(x,y,e,\ell')\big).$$
To bound $\sum_e \mathbb{P}_p(\mathcal{E}_2(x,y,e,\ell))$ we first apply the union bound. To this end, write $\Gamma_\ell(x,y)$ for the set of all simple paths of length $\ell$ from $x$ to $y$. We bound
$$\sum_e \mathbb{P}_p\big(\mathcal{E}_2(x,y,e,\ell)\big) \le \ell\,|\Gamma_\ell(x,y)|\,p^\ell,$$
where the factor $\ell$ is due to the fact that any fixed $\beta \in \Gamma_\ell(x,y)$ contains $\ell$ edges, so the sum over $e$ contains exactly $\ell$ nonzero terms whose value is $p^\ell$. We bound $|\Gamma_\ell(x,y)|$ by $m(m-1)^{\ell-1}p_\ell(x,y)$ as usual. By (1.12) we have that $(p(m-1))^\ell = O(1)$ for $\ell \le 2t_{\mathrm{mix}}$, so we obtain $\sum_e \mathbb{P}_p(\mathcal{E}_2(x,y,e,\ell)) \le C\,\ell\,p_\ell(x,y)$.
Compare this with the bound in (4.10) and note that the current bound is a factor $\varepsilon^{-1}$ bigger. The rest of the analysis is now performed exactly as the analysis of the four cases of $\biguplus_\gamma \mathcal{G}^{\le}(\gamma)$ in the last part of the proof of Theorem 4.4 (starting with (4.11)). Deriving the four bounds analogous to (4.12)-(4.15), we get
$$\sum_e \mathbb{P}_p\Big(\biguplus_\gamma \mathcal{G}^{\le}(\gamma,e)\Big) \le o(k)\,\mathbb{P}_p\big(\partial B_v(r) \ne \emptyset\big).$$
We make two remarks about this derivation: (1) the proof requires that $k \ge \varepsilon^{-1}$, which our choice of parameters guarantees, and (2) it follows immediately from [17, proof of Theorem 4.5] that Lemma 4.2 remains valid upon replacing $t_{\mathrm{mix}}$ by $2t_{\mathrm{mix}}$.
The lower and upper bounds from Lemma 2.2(a) and Theorem 4.4 differ by a factor $\log(\varepsilon^3 V)$ for our choice of $r$, so the desired bound follows if each of the four factors on the right-hand side is $o(k/\log(\varepsilon^3 V))$. The first error term satisfies this bound by our choice of $r$ in (6.3) and by (1.11), and similarly for the second term. The third term is bounded likewise simply because $\alpha_m = o(1)$. The fourth satisfies the required bound since we assumed $\varepsilon_m^2 \gg \alpha_m$. Combining the contributions due to (i), (ii), and (iii), we obtain
$$\mathbb{E}_p\big[L\,\mathbf{1}_{\{\partial B_v(r) \ne \emptyset\}}\big] \le Ck\,\mathbb{P}_p\big(\partial B_v(r) \ne \emptyset\big),$$
as desired. This completes the proof of (b).
(c) Let $M = c'\varepsilon^{-2}\log(\varepsilon^3 V)$, where $c' > 0$ is a small constant that will be chosen soon. If the number of edges touching $B(r')$ is at least $3M$ while $\partial B(r) \ne \emptyset$, then one of three events must occur, analogously to the proof of Lemma 6.2. We now show each term is $o(\mathbb{P}_p(\partial B(r) \ne \emptyset))$. We first choose $c' > 0$ small enough so that by (2.7) and Lemma 2.2(a) we get that the first term is $o(\mathbb{P}_p(\partial B(r) \ne \emptyset))$. The second term is bounded by $o(V^{-1})$ as in (6.1), which is much smaller than $\mathbb{P}_p(\partial B(r) \ne \emptyset)$ by our choice of $r$ and Lemma 2.2(a). For the third term we use a similar proof strategy as in Lemma 3.2. Using Markov's inequality we bound
$$\mathbb{P}_p\big(|B(r')| \ge M,\ \partial B(r) \ne \emptyset\big) \le \frac{\mathbb{E}_p\big[|B(r')|\,\mathbf{1}_{\{\partial B(r) \ne \emptyset\}}\big]}{M}.$$
and  r (0, ) occur, then there must exist an integer j ∈ [0, r ′ ] such that  r (0, )•{ (j) ↔ x} occurs. Applying the BKR inequality and summing over x (using (2.2)) and then gives  By our choices of r ′ and M, and since m = o(1), this bound is also o(P p ( B(r) ≠ ∅)), completing the proof of (c). ▪