Broadcasting induced colourings of random recursive trees and preferential attachment trees

In this work we consider random two-colourings of random linear preferential attachment trees, which includes random recursive trees, random plane-oriented recursive trees, random binary search trees, and a class of random $d$-ary trees. The random colouring is defined by assigning the root of the tree the colour red or blue with equal probability, and all other vertices are assigned the colour of their parent with probability $p$ and the other colour otherwise. These colourings have been previously studied in other contexts, including Ising models and broadcasting, and can be considered as generalizations of bond percolation. With the help of P\'olya urns, we prove limiting distributions, after proper rescalings, for the number of vertices of each colour, the number of monochromatic subtrees of each colour, as well as the number of leaves and fringe subtrees with two-colourings. Using methods from analytic combinatorics, we also provide precise descriptions of the limiting distribution after proper rescaling of the size of the root cluster; the largest monochromatic subtree containing the root. The description of the limiting distributions extends previous work on bond percolation in random preferential attachment trees.

The colouring $\sigma_{T,p}$ is induced from a broadcast process, in which the root of $T$ is assigned a bit 0 or 1 uniformly at random, and this bit is propagated along the tree in the following way: any vertex in the tree takes the same bit as its parent with probability $p$ and the other bit with probability $1 - p$. By assigning a vertex the colour red if its bit value is 0 and blue if its bit value is 1, we recover the colouring $\sigma_{T,p}$. This broadcast process was described by Evans, Kenyon, Peres, and Schulman [12], where they outline a correspondence of this process to the Ising model (see [12, Section 2.2]). The reconstruction problem is then to reconstruct the bit value of the root $\rho$ from the bit values of some subset of vertices in $T$ after broadcasting. This problem has long been studied; see for example the survey [26] of early works. Applications of the reconstruction problem in trees include its connection to stochastic block models [27,14], a random graph model with applications in machine learning. Of particular interest to this work, Addario-Berry, Devroye, Lugosi, and Velona studied the reconstruction problem in random recursive trees and preferential attachment trees [1].
For a real number $\alpha$, a random (linear) preferential attachment tree $T_{\alpha,n}$ is grown recursively in the following manner. The tree $T_{\alpha,1}$ consists of a single vertex $\rho$, the root of all trees that follow. The tree $T_{\alpha,n}$ is grown from $T_{\alpha,n-1}$ by choosing a vertex $v$ at random and adding a child to $v$, where $v$ is chosen with probability
$$\mathbb{P}(v = u) = \frac{\alpha \deg^+(u) + 1}{(1+\alpha)(n-1) - \alpha}, \tag{1}$$
where $\deg^+(u)$ (called the outdegree of $u$) is the number of children of $u$. To avoid degenerate cases, we only allow $\alpha \in \{\ldots, -\frac{1}{4}, -\frac{1}{3}, -\frac{1}{2}\} \cup [0, \infty)$. If $\alpha = -1$, then only leaves can be chosen as the parent of a new vertex, resulting in $T_{-1,n}$ simply being a path of length $n$. If $\alpha$ is a different negative number outside of $\{\ldots, -\frac{1}{4}, -\frac{1}{3}, -\frac{1}{2}\}$, then there may be vertices $v$ in $T_{\alpha,n}$ for which (1) is negative. This problem is avoided when $\alpha = -\frac{1}{d}$, since (1) is positive when $\deg^+(u) < d$ and is zero when $\deg^+(u) = d$, resulting in a tree $T_{-1/d,n}$ whose vertices all have outdegree less than or equal to $d$.
The random tree $T_{\alpha,n}$ has several names in the literature. When $\alpha = 0$, the vertex $v$ is chosen uniformly at random amongst all the vertices in the tree. This random tree is called a random recursive tree, and has been studied extensively since at least 1967 [32]. When $\alpha = 1$, the tree is called a random plane-oriented recursive tree, which was introduced by Szymański [31]. The more general linear preferential attachment tree $T_{\alpha,n}$ coincides with a special case of the preferential attachment model studied by Barabási and Albert [2], but has also been studied in several other contexts (see for example [30,7,17]). When $\alpha = -\frac{1}{d}$ for a positive integer $d$, the tree $T_{-1/d,n}$ is a model of random $d$-ary trees, and corresponds to a random binary search tree when $d = 2$. The random trees $T_{\alpha,n}$ also fall into the class of increasing trees (see [6,10]), so named since if we label the vertices $1, \ldots, n$ by the time they appear in the tree, then the labels increase along all paths from the root.
For ease of notation, we may sometimes fix $\alpha$ and $p$, and let $T_n$ denote the tree $T_{\alpha,n}$ and let $\sigma_n$ denote the random broadcasting induced colouring $\sigma_{T_n,p}$. For fixed $\alpha$ and $p$ we can consider a random sequence $((T_n, \sigma_n))_{n=1}^{\infty}$ of preferential attachment trees with broadcasting induced colourings, where $T_n$ is grown from $T_{n-1}$ in the manner outlined above, and where $\sigma_n$ restricted to the $n - 1$ vertices of $T_{n-1}$ is equal to $\sigma_{n-1}$ (and the colour of the newest vertex $v$ in $T_n$ is randomly chosen such that with probability $p$ the colour of $v$ is the same as that of its parent).
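The coupled growth of the tree and its colouring can be sketched in a few lines of Python. This is only an illustrative simulation, not part of the analysis: the function name `grow_coloured_tree`, the encoding of red and blue as 0 and 1, and the labelling of vertices by arrival order (root 0) are assumptions made here, and the attachment weights are taken to be $\alpha \deg^+(u) + 1$, normalised by their sum.

```python
import random

def grow_coloured_tree(n, alpha, p, rng=None):
    """Grow a linear preferential attachment tree on n vertices together
    with a broadcasting induced colouring.

    Returns (parent, colour): parent[v] is the parent of vertex v (None for
    the root, vertex 0), and colour[v] is 0 (red) or 1 (blue).
    """
    rng = rng or random.Random()
    parent = [None]
    outdeg = [0]
    colour = [rng.randrange(2)]  # the root is red or blue with equal probability
    for _ in range(1, n):
        # Assumed attachment weight of u: alpha * deg^+(u) + 1.
        # The clamp only guards against tiny negative floating-point residue
        # when alpha = -1/d and a vertex already has d children.
        weights = [max(0.0, alpha * d + 1) for d in outdeg]
        v = rng.choices(range(len(parent)), weights=weights)[0]
        # The new vertex copies its parent's colour with probability p.
        same = rng.random() < p
        parent.append(v)
        outdeg.append(0)
        outdeg[v] += 1
        colour.append(colour[v] if same else 1 - colour[v])
    return parent, colour
```

For example, `grow_coloured_tree(1000, 0.0, 0.8)` samples a coloured random recursive tree, while `alpha = -0.5` gives a coloured binary search tree shape in which no vertex receives more than two children.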
As an example of the growth process we describe, consider the trees with broadcasting induced colourings in Figure 1. The tree $T_8$ is grown from $T_7$ by choosing $v$ according to the probability (1) and adding a child $u$ (notice that $\deg^+(v) = 0$, so $v$ is chosen with probability $1/(6\alpha + 7)$). The probability that $u$ takes a different colour from $v$ is $1 - p$, and so
$$\mathbb{P}\big((T_8, \sigma_8) \mid (T_7, \sigma_7)\big) = \frac{1-p}{6\alpha + 7}.$$

Figure 1: A tree $T_8$ with broadcasting induced colouring $\sigma_8$ grown from $(T_7, \sigma_7)$.
Our contribution in this work is to study asymptotic properties of the trees in the sequence $((T_n, \sigma_n))_{n=1}^{\infty}$. These results are gathered in Section 2, and come in two categories. In Section 2.1 we list global properties of the trees $((T_n, \sigma_n))_{n=1}^{\infty}$. These include limit laws (after appropriate rescaling) for the number of vertices of each colour, the number of clusters of each colour (maximal monochromatic subtrees of $T_n$), the number of leaves of each colour, and the number of trees $T_1, \ldots, T_m$ with respective 2-colourings $\varsigma_1, \ldots, \varsigma_m$ appearing in the fringe. These results are proved using results on Pólya urns from [18]. The limiting distributions experience different phases. When $p < (3 - \alpha)/4$, we observe normal limit laws after rescaling by $\sqrt{n}$. Normal limit laws are also observed when $p = (3 - \alpha)/4$ but with a rescaling factor of $\sqrt{n \ln n}$, while convergence to a non-normal distribution is observed when $p > (3 - \alpha)/4$.
In Section 2.2 we study the size of the cluster $C_n$ containing the root $\rho$. If we consider $T_n$ with a random broadcasting induced colouring $\sigma_n$ and remove edges between two vertices if they do not have the same colour, we are left with a forest of trees corresponding to clusters after performing Bernoulli bond percolation with parameter $p$ on $T_n$. Bernoulli bond percolation with parameter $p$ is a process by which each edge in a graph is kept with probability $p$ and removed with probability $1 - p$, independently of every other edge. The size of $C_n$ in the context of percolation, with a connection to memory-reinforced random walks, has previously been studied (see [3,4,21,8]). In particular, Businger [8] has shown that for random recursive trees, $|C_n|/n^p$ converges in distribution to a Mittag-Leffler distribution (see also [4, Theorem 3.1] and [25]). Baur [3] studied $|C_n|$ in linear preferential attachment trees with $\alpha \geq 0$. He showed that $|C_n|/n^{(p+\alpha)/(1+\alpha)}$ converges in distribution to some random variable $C$, and provided the first two moments of this random variable. In this paper, we reprove these earlier results, and give a more precise description of this random variable $C$ by providing a recursion to calculate the integer moments of $C$. These results are proved using methods of analytic combinatorics. When $\alpha = 1$ we give a closed form for the integer moments of $C$. We further extend these results by studying the size of $C_n$ when $\alpha = -\frac{1}{d}$, where we observe different phases. When $p > \frac{1}{d}$, we observe a similar limiting distribution as when $\alpha > 0$, and find closed forms for the integer moments of this limiting distribution when $d = 2$. When $p \leq \frac{1}{d}$, the size of the root cluster $C_n$ is bounded almost surely as $n \to \infty$, and we describe the limiting distribution of $|C_n|$ as the size of a Galton-Watson tree with binomial $\mathrm{Bin}(d, p)$ offspring distribution.

Main results
In this section, we gather our main results. They are separated into two categories: global properties, and the size of the root cluster.

Global properties
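Throughout this section we define a random variable
$$B := \begin{cases} 1 & \text{if the root is red}, \\ -1 & \text{if the root is blue}. \end{cases} \tag{2}$$
Then $B \sim (-1)^{\mathrm{Be}(1/2)}$, where $\mathrm{Be}$ is a Bernoulli random variable.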
We start with the number of vertices of each colour in a random preferential attachment tree T n = T α,n with a broadcasting induced colouring σ n = σ Tn,p .
Theorem 2.1. Let $R_n$ and $B_n$ denote the number of red and blue vertices respectively in a preferential attachment tree $T_n$ with broadcasting induced colouring $\sigma_n$.
Remark 2.2. When $p = 1/2$, every new vertex added to the tree is either red or blue with probability $1/2$, independent of everything that happened before. Therefore, $R_n$ is simply a sum of independent Bernoulli $\mathrm{Be}(1/2)$ random variables, as is $B_n = n - R_n$. We see that in this case, the matrix $\Sigma_I$ in the convergence (ii) simplifies to
$$\Sigma_I = \frac{1}{4}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix},$$
and we see that the random variables on the right hand side of the convergences in (iii) and (iv) degenerate to $(0,0)$.
We now turn to the number of clusters (maximal monochromatic subtrees). If we want to know the total number of clusters, we can first notice that whenever a child takes a different colour from its parent, a new cluster is formed. This is also the only way of forming a new cluster (in addition to the initial cluster containing the root). The probability that a newly added vertex does not take the colour of its parent is $1 - p$, from which we can conclude that the total number of clusters at time $n$ is simply $1 + \mathrm{Bin}(n - 1, 1 - p)$, where $\mathrm{Bin}$ denotes a binomial random variable.
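The counting argument above is easy to check by simulation. The following Python sketch counts clusters as one plus the number of edges whose endpoints differ in colour; the function names and the list encoding of the tree (vertex 0 as root, `parent[v] < v`) are assumptions made for illustration, and uniform attachment ($\alpha = 0$) is used only because the attachment rule does not affect the total count.

```python
import random

def count_clusters(parent, colour):
    """Total number of maximal monochromatic subtrees: the root cluster
    plus one new cluster for every child whose colour differs from its parent."""
    return 1 + sum(1 for v in range(1, len(parent))
                   if colour[v] != colour[parent[v]])

def sample_cluster_count(n, p, rng):
    """Sample the total number of clusters in a coloured random recursive tree."""
    parent, colour = [None], [rng.randrange(2)]
    for v in range(1, n):
        u = rng.randrange(v)  # uniform attachment (alpha = 0)
        parent.append(u)
        colour.append(colour[u] if rng.random() < p else 1 - colour[u])
    return count_clusters(parent, colour)
```

Averaging `sample_cluster_count(n, p, rng)` over many runs should give a value close to $1 + (n-1)(1-p)$, the mean of $1 + \mathrm{Bin}(n-1, 1-p)$.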
Theorem 2.3. Let $R^c_n$ and $B^c_n$ denote the number of red and blue clusters respectively in a preferential attachment tree $T_n$ with broadcasting induced colouring $\sigma_n$.
(i) The following strong law of large numbers holds
(ii) If $p < (3 - \alpha)/4$, then the following multivariate normal limit law holds where
(iii) If $p = (3 - \alpha)/4$, then the following multivariate normal limit law holds where
(iv) If $p > (3 - \alpha)/4$, then the following convergence in distribution holds, where $Z$ is the same random variable as in (3).

We finish our summary of global properties with fringe subtrees. In a rooted tree $T$, a fringe subtree consists of a vertex and all its descendants. The simplest example of a fringe subtree in $T$ is a vertex with no descendants (a leaf of $T$). Normal limit laws for the number of leaves in preferential attachment trees are already well known (see [28,23,19,17]).
In this simplest case, we offer covariance matrices for the normal limit laws for the number of leaves of each colour in $T_n$, though the distributions are already markedly more complicated than what we have described above.
Theorem 2.4.Let R l n and B l n denote the number of red and blue leaves respectively in a preferential attachment tree T n with broadcasting induced colouring σ n .
Remark 2.5.Once more, we witness a special case p = 1/2.Though the explanation is not as simple as in Remark 2.2, the colour of a newly added leaf is independent of the colour of its parent.
We see again that the random variables on the right hand side of the convergences in (iii) and (iv) degenerate to (0,0).When p = 1/2 = (3 − α)/4, both matrices in (ii) are equal.
Limiting joint distributions for the number of fringe subtrees (without colours) have already been studied [17].Let T 1 , . . ., T m be a sequence of finite trees of sizes k 1 , . . ., k m with colourings ς 1 , . . ., ς m .Let σ n | T denote the colouring σ n restricted to the subtree T .We say that two coloured rooted trees are isomorphic if there is an isomorphism between them that preserves roots and colours.
Theorem 2.6.Let X i n be the number of fringe subtrees T in T n isomorphic to T i with colouring ς i , and let The following strong law of large numbers holds (ii) If p < (3 − α)/4 or if p = 1/2, then the following multivariate normal limit law holds for some covariance matrix Σ f I .
(iv) If $p > (3 - \alpha)/4$ and $p \neq \frac{1}{2}$, then the following convergence in distribution holds for some vector $v$, where $Z$ is the same random variable as in (3).
Remark 2.7.The matrices Σ f I , Σ f II , as well as the vector v can be calculated explicitly from the sequence T 1 , . . ., T m of trees and the colourings ς 1 , . . ., ς m (see Theorem 3.1 below).
Remark 2.8.The statements of Theorem 2.1 (iv), Theorem 2.3 (iv), Theorem 2.4 (iv), and Theorem 2.6 (iv), all contain the same random variable BZ in the limit.More precisely, as is made evident by the proofs in Section 3, the sequences of random vectors in all of these statements converge jointly.This is proved by studying a Pólya urn process (see Proposition 3.2) which is a linear transformation of each of the Pólya urn processes used in the proofs of Theorems 2.1, 2.3, 2.4, and 2.6, and applying the Cramér-Wold Theorem [15, ch. 5, Theorem 10.5].

Root cluster
We let $C_n$ denote the root cluster in the random preferential attachment tree $T_n$ with broadcasting induced colouring $\sigma_n$, that is, the maximal monochromatic subtree containing the root. As noted in the introduction, $C_n$ has the same distribution as the root cluster in $T_n$ after applying Bernoulli bond percolation with parameter $p$.
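As a concrete illustration of this definition, the size $|C_n|$ can be read off from a coloured tree in a single pass. The sketch below assumes a hypothetical encoding in which vertices are labelled $0, 1, \ldots, n-1$ in order of arrival (so `parent[v] < v`, with vertex 0 the root) and colours are stored in a list; neither the encoding nor the function name comes from the text.

```python
def root_cluster_size(parent, colour):
    """Size of the maximal monochromatic subtree containing the root.

    Assumes parent[0] is None and parent[v] < v for every v >= 1, so that
    each parent is processed before its children.
    """
    in_cluster = [True]  # the root always belongs to its own cluster
    for v in range(1, len(parent)):
        u = parent[v]
        # v joins the root cluster iff its parent is in it and the edge (u, v)
        # is monochromatic, i.e. the edge survives the percolation.
        in_cluster.append(in_cluster[u] and colour[v] == colour[u])
    return sum(in_cluster)
```

For the tree with `parent = [None, 0, 0, 1, 2]` and `colour = [0, 0, 1, 0, 1]`, vertices 0, 1 and 3 form the root cluster, while vertices 2 and 4 are cut off by the bichromatic edge between 0 and 2.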
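Theorem 2.9. Let $\alpha = 0$. Then
$$\frac{|C_n|}{n^p} \xrightarrow{d} C,$$
where $C$ has a Mittag-Leffler distribution with parameter $p$; that is, $C$ is characterized by its integer moments
$$\mathbb{E}[C^k] = \frac{k!}{\Gamma(kp + 1)}.$$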
Baur [3] proved convergence in distribution for |C n |/n (p+α)/(1+α) when α > 0 (though we believe the method may also apply for applicable α > −1/p).We extend the results by finding a recursion for the integer moments of the limiting distribution.This recursion uses the partial Bell polynomials (see [9,Chapter 3.3]) where C has integer moments , where C k satisfies the recursion C 1 = α/(p + α) and By using the recursion in Theorem 2.10, we calculate the first two moments of C to be , which agrees with the calculations given in [3].
Baur provides a description of the sizes of the remaining clusters [3,Corollary 4.3].If the i'th added vertex is the root of a cluster, the size of this cluster, scaled by n (p+α)/(1+α) , converges in distribution to where C is the random variable given in Theorem 2.10 and β i is a Beta(1/(1 + α), i) distributed random variable.
In the special case α = 1, we are able to find a closed form for the recursion given in Theorem 2.10, and with it, a more precise description of the limiting distribution of |C n | after proper rescaling.
Theorem 2.11.Let the underlying tree be a random plane-oriented recursive tree, so α = 1.Then the integer moments of the limiting distribution C in Theorem 2.10 can be written as ) .
We now turn to the case when $\alpha = -1/d$ for an integer $d \geq 2$, that is, when the underlying tree $T_n$ is a random increasing $d$-ary tree. If we consider $T_n$ as a subtree of an infinite $d$-ary tree $T^d$, then the colouring $\sigma_n$ can be recovered from bond percolation on $T^d$: start by assigning the root either red or blue, and assign to a vertex $v$ the colour of its parent $u$ if the edge joining $u$ and $v$ is still present after performing bond percolation, and the other colour otherwise. In this way, the root cluster $C_n$ of $T_n$ is a subtree of the cluster $K^d$ of $T^d$ containing the root after performing bond percolation. Using this fact, we can prove the following result on the sizes $|C_n|$ and $|K^d|$ (which may be infinite).
Remark 2.14.When p ≤ 1/d, the probabilities in (5) sum to 1, and so the root cluster is almost surely finite.
When α = −p, the root cluster is almost surely finite, though its expected size grows to infinity.
In fact, we can describe the asymptotic behaviour of all the moments of |C n |.
where E k satisfies the recursion E 1 = 1/(d − 1) and In the case α > −p, a similar limiting distribution C to that found in Theorem 2.10 exists.
where C has integer moments , where D k satisfies the recursion D 1 = 1/(pd − 1) and By using the recursion in Theorem 2.16, we calculate the first two moments of C to be When the underlying tree is a binary search tree (when d = 2), we can once again find a complete description of the limiting distribution.
Theorem 2.17.Let the underlying tree be a random binary search tree, so d = 2, and let p > −α = 1/2.Then the integer moments of the limiting distribution C in Theorem 2.16 can be written as .
Remark 2.18.From Theorem 2.12, there is a positive probability that the size of the root cluster is finite, even in the case p > 1/d.For any p and d, let p ∞ be the smallest positive solution to It is known that p ∞ is the probability that the cluster K d containing the root after performing Bernoulli bond percolation on an infinite d-ary tree is infinite (see for example [22,Exercise 5.41]).
From Theorem 2.12, |C n | converges almost surely to |K d |.The limiting random variable C in Theorem 2.16 can be broken down in the following way: where C ∞ has moments , which are the moments of a generalized Mittag-Leffler distribution.

Proofs of global properties
We start by summarizing some results on generalized Pólya urns that will be used throughout this section. A generalized Pólya urn process $(X_n)_{n=0}^{\infty}$ is defined as follows. There are $q$ types (or colours) $1, 2, \ldots, q$ of balls, and for each vector $X_n = (X_{n,1}, X_{n,2}, \ldots, X_{n,q})$, the entry $X_{n,i} \geq 0$ is the number of balls of type $i$ in the urn at time $n$. For each type $i$, an activity $a_i \geq 0$ is assigned, as well as a random vector $\xi_i = (\xi_{i,1}, \xi_{i,2}, \ldots, \xi_{i,q})$ such that $\xi_{i,j} \geq 0$ for $i \neq j$ and $\xi_{i,i} \geq -1$. The urn process begins with a given vector $X_0$. At time $n \geq 1$, a ball is drawn at random from the urn with probability proportional to its activity, so that the probability that a ball of colour $i$ is chosen is
$$\frac{a_i X_{n-1,i}}{\sum_{j=1}^{q} a_j X_{n-1,j}}.$$
If the drawn ball is of type $i$, then we set $X_n = X_{n-1} + \Delta X_n$, where $\Delta X_n \sim \xi_i$ and is independent of everything that has happened so far. The intensity matrix of the Pólya urn is the $q \times q$ matrix $A := (a_j \mathbb{E}[\xi_{j,i}])_{i,j=1}^{q}$.
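A single step of such an urn can be sketched as follows. This is a minimal illustration under stated assumptions: the draw probability is taken to be proportional to $a_i X_{n-1,i}$, and the names `urn_step`, `classic_replacement`, and `run_urn` are hypothetical helpers introduced only for this example. The replacement rule used in the demonstration (add one ball of the drawn type) is the classical Pólya urn, not one of the urns studied below.

```python
import random

def urn_step(counts, activities, draw_xi, rng):
    """Perform one draw of a generalized Polya urn and return the new counts.

    counts[i]       -- number of balls of type i before the draw
    activities[i]   -- activity a_i >= 0 of type i
    draw_xi(i, rng) -- returns one sample of the replacement vector xi_i
    """
    # Type i is drawn with probability a_i * X_i / sum_j a_j * X_j.
    weights = [a * x for a, x in zip(activities, counts)]
    i = rng.choices(range(len(counts)), weights=weights)[0]
    delta = draw_xi(i, rng)
    return [x + d for x, d in zip(counts, delta)]

def classic_replacement(i, rng, q=2):
    # Hypothetical example rule: add one ball of the drawn type.
    return [1 if j == i else 0 for j in range(q)]

def run_urn(steps, seed=0):
    """Run the example urn from one ball of each type, both with activity 1."""
    rng = random.Random(seed)
    counts = [1, 1]
    for _ in range(steps):
        counts = urn_step(counts, [1, 1], classic_replacement, rng)
    return counts
```

Types with activity 0 are never drawn but can still be added by `draw_xi`, which is how counting balls (for example, balls recording cluster roots) can be attached to an urn without influencing its dynamics.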
Note that while several authors place $\mathbb{E}[\xi_i]$ for row $i$ of $A$, we follow the notation of [18] by placing $\mathbb{E}[\xi_j]$ for column $j$ of $A$. As noted in [18], since all off-diagonal entries of $A$ are non-negative, $A$ has a largest real eigenvalue $\lambda_1$ such that $\lambda_1 > \operatorname{Re}\lambda$ for all other eigenvalues $\lambda$ of $A$. A type $i$ is called dominating if for all other types $j$, it is possible to find a ball of type $j$ in an urn beginning with a single ball of type $i$. By ordering the types such that every dominating type $i$ is smaller than every nondominating type $j$, the matrix $A$ will be a block diagonal matrix. We say that an eigenvalue $\lambda$ of $A$ belongs to the dominating class if it is also an eigenvalue of the submatrix of $A$ restricted to the dominating types.
The following six assumptions appear in [18] (the assumption (A1) is a generalization from [18, Remark 4.2], note the indices of the variables in (A1)): (A1) For each i = 1, . . ., q, either (a) there is a real number d i > 0 such that X 0,i and ξ 1,i , ξ 2,i , . . ., ξ q,i are multiples of d i and (A3) The largest real eigenvalue λ 1 of A is positive.
(A5) There exists a dominating type i with X 0,i > 0.
We add the following simplifying assumption:
(A7) For each $n \geq 1$ there exists a ball of dominating type in the urn.
We further add the following assumption, which will make the covariance matrix calculations simpler:
(A8) There exists $c > 0$ such that $\sum_{i=1}^{q} a_i \mathbb{E}[\xi_{j,i}] = c$ for every $j = 1, \ldots, q$ where $a_j > 0$.
All vectors v for the remainder of this discussion are assumed to be column vectors.Let a = (a 1 , . . ., a q ) T be the vector of activities.Let v 1 and u 1 be the right and left eigenvectors associated If A is diagonalizable, then there are q linearly independent right eigenvectors of A and q linearly independent left eigenvectors of A. Let v i and u T i be dual bases for the eigenspaces of A, that is, right and left eigenvectors of A associated with λ i such that u T i v j = δ i,j for all i, j = 1, . . ., q, where T and define the matrices whenever none of the denominators is equal to zero (which holds in the cases relevant to us).Let If λ 2 is real and λ 2 > Reλ 3 , then define the matrix We are now ready to gather results from [18].
Proof.The convergence in (i) follows from [18, Theorem 3.21] (essential non-existence is always guaranteed if (A7) holds), the convergence in (ii) follows from [18,Theorem 3.22], and the convergence in (iii) follows from [18, Theorem 3.23], while the covariance matrix calculations in (ii) and (iii) follow from [18, Lemma 5.3(i), Lemma 5.4], where we note that the proof of [18,Lemma 5.4] follows exactly the same with the slightly more general assumption (A8).The convergence in (iv) follows from [18,Theorem 3.24] by letting Z = u T 2 W (the random vector W is an element of the eigenspace of λ 2 ).
We are now ready to prove our results on global properties for $T_n := T_{\alpha,n}$ with broadcasting induced colouring $\sigma_n := \sigma_{T_n,p}$. Let $\alpha \deg^+(v) + 1$ be the weight of the vertex $v$ in $T_n$. We consider an urn with two colours of balls: red $r$ and blue $b$, both with activity 1. In this urn, the total activity of red and blue balls at time $n$ will correspond to the sum of the total weights of red and blue vertices in the tree $T_n$ with colouring $\sigma_n$, respectively. When a ball is picked, with probability $p$ it is replaced with an additional $1 + \alpha$ balls of the same colour: 1 corresponding to the addition of a new vertex, while the extra $\alpha$ corresponds to the increase in weight of the selected vertex. With probability $1 - p$, the chosen ball is replaced along with $\alpha$ balls of the same colour (corresponding to the increase in weight), while an additional 1 ball of the other colour is added (corresponding to the new vertex added). Let $R^w_n$ and $B^w_n$ be the total activity of red and blue balls respectively at time $n$, which is also the total weight of the vertices of each colour in $T_n$. We therefore have the following intensity matrix for our urn:
$$A = \begin{pmatrix} \alpha + p & 1 - p \\ 1 - p & \alpha + p \end{pmatrix}.$$
This particular Pólya urn process was previously studied in the context of preferential attachment trees by Baur and Bertoin to study elephant random walks [5]. The eigenvalues of $A$ are $\lambda_1 = 1 + \alpha$ and $\lambda_2 = 2p + \alpha - 1$. Therefore, Theorem 3.1 applies. We can say something more about the limiting distribution in this case when $2\lambda_2 > \lambda_1$ (so when $p > (3 - \alpha)/4$). Recall that $B \sim (-1)^{\mathrm{Be}(1/2)}$, where $\mathrm{Be}$ denotes a Bernoulli random variable.
Proposition 3.2.Let R w n and B w n be the total weight of red and blue balls respectively, and suppose that p > where Z is a real random variable with , and .
Proof.First, suppose we always start with a red root (so start the urn with a red ball).Then the convergence in (8) with B = 1 follows from 3.1 (iv).For the calculation of the expected value and the variance (again assuming we start with a red ball), we appeal to [18, Theorem 3.10, Theorem 1 corresponds in this case to starting with a single ball of colour r, and , and so Rearranging for σ 2  1 and adding Then by applying [18, Eq. 3.21], we get , and .
Next we multiply by B since the urn starts with a single red ball with probability 1/2, and a single blue ball with probability 1/2.
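The phase boundary $p = (3 - \alpha)/4$ for the two-colour weight urn used in Proposition 3.2 can be checked numerically. The sketch below assumes that the replacement rule of that urn yields the intensity matrix with diagonal entries $\alpha + p$ and off-diagonal entries $1 - p$ (this matrix is inferred from the description of the urn, and the function names are hypothetical); it computes the two eigenvalues and compares $2\lambda_2$ with $\lambda_1$.

```python
import math

def weight_urn_spectrum(alpha, p):
    """Eigenvalues (lam1 >= lam2) of the assumed intensity matrix
    [[alpha + p, 1 - p], [1 - p, alpha + p]]."""
    a, b = alpha + p, 1 - p
    trace, det = 2 * a, a * a - b * b
    disc = math.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2

def phase(alpha, p):
    """Rescaling regime suggested by comparing 2*lam2 with lam1."""
    lam1, lam2 = weight_urn_spectrum(alpha, p)
    if math.isclose(2 * lam2, lam1):
        return "sqrt(n ln n)"   # p = (3 - alpha)/4
    if 2 * lam2 < lam1:
        return "sqrt(n)"        # p < (3 - alpha)/4, normal limit
    return "non-normal"         # p > (3 - alpha)/4
```

For a random recursive tree ($\alpha = 0$) the boundary sits at $p = 3/4$, and for a plane-oriented recursive tree ($\alpha = 1$) at $p = 1/2$.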
While a Pólya urn can be used to study the number of vertices of each colour, a simpler proof follows from limit laws for the number of clusters (maximal monochromatic subtrees) of each colour.
We therefore start with studying the clusters in T n .
Proof of Theorem 2.3. If a red vertex is chosen at step $n$, this corresponds to choosing a ball of colour $r$. Then with probability $p$ it is replaced with an additional $1 + \alpha$ balls of colour $r$, just as above. With probability $1 - p$ however, the ball is replaced along with $\alpha$ balls of colour $r$ and 1 ball of colour $b$ (just as above), with an additional ball of colour $b^c$ added, representing the new blue cluster that is formed.
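Therefore, we have the first column in the intensity matrix $A^c$ for this urn. The symmetric argument for balls of colour $b$ holds, contributing to the second column of $A^c$. Balls of colour $r^c$ and $b^c$ have activity 0, and so the intensity matrix for this urn (with types ordered $r, b, r^c, b^c$) is
$$A^c = \begin{pmatrix} \alpha + p & 1 - p & 0 & 0 \\ 1 - p & \alpha + p & 0 & 0 \\ 0 & 1 - p & 0 & 0 \\ 1 - p & 0 & 0 & 0 \end{pmatrix}.$$
The eigenvalues of $A^c$ are $\lambda_1 = 1 + \alpha$, $\lambda_2 = 2p + \alpha - 1$, and $\lambda_3 = \lambda_4 = 0$. We see that the assumptions (A1)-(A8) hold. The matrix $A^c$ is diagonalizable when $\alpha \neq 1 - 2p$, and in this case a dual basis for the eigenspaces of $A^c$ can be written down explicitly (one of its vectors is $(0, 0, 0, 1)$). We can therefore apply Theorem 3.1. The covariance matrices for the limiting distribution for this urn when $1 + \alpha = \lambda_1 \geq 2\lambda_2$ were computed using Mathematica, and the calculation for $\Sigma^{\dagger}_1$ from (6) yields the same result as the calculation for $\Sigma_I$. When $1 + \alpha = \lambda_1 < 2\lambda_2 = 4p - 2 + 2\alpha$, we conclude from Theorem 3.1 (iv) that, after rescaling by $n^{(2p+\alpha-1)/(1+\alpha)}$, the vector $(R^w_n, B^w_n, R^c_n, B^c_n)$ converges to $Z v_2$, for some random variable $Z$. If we restrict to $R^w_n$ and $B^w_n$, we see that $Z$ is the same random variable $BZ$ as in (8). Restricted to $R^c_n$ and $B^c_n$, the results of Theorem 2.3 follow.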
A similar urn process to the one above (with balls of activity 0 representing the vertices) can be used to find limit laws for the number of vertices of each colour. But we can instead use the following observation: if a vertex of one colour contributes to the weight of another vertex of a different colour, then it must be the root of a cluster. Therefore, from the previous proof, we can now derive convergence for the number of vertices of each colour.

Proof of Theorem 2.1. If we consider again the urn in the previous proof, we can recover the number of vertices $R_n$ and $B_n$ of each colour in our tree. Each red vertex contributes $1 + \alpha$ to the value $R^w_n$, except those that are roots of red clusters; these contribute 1 to $R^w_n$. The root of a blue cluster contributes $\alpha$ to the weight of its parent, and so $\alpha$ to $R^w_n$. The only root of a blue cluster that does not contribute to $R^w_n$ is the root of $T_n$ if this root is blue. Using $B$ defined in (2), we see that
$$R^w_n = (1 + \alpha) R_n - \alpha R^c_n + \alpha B^c_n - \frac{\alpha(1 - B)}{2}.$$
Performing the symmetric analysis for $B^w_n$ and rearranging gives
$$R_n = \frac{R^w_n + \alpha R^c_n - \alpha B^c_n}{1 + \alpha} + \frac{\alpha(1 - B)}{2(1 + \alpha)}, \qquad B_n = \frac{B^w_n + \alpha B^c_n - \alpha R^c_n}{1 + \alpha} + \frac{\alpha(1 + B)}{2(1 + \alpha)}.$$
When scaled by $\sqrt{n}$, $\sqrt{n \ln n}$, or $n^{(2p+\alpha-1)/(1+\alpha)}$, the last term of each of the equations above vanishes.
By the Cramér-Wold Theorem, since R c n , B c n , R w n , B w n converge jointly in distribution so do linear combinations of these random variables.The limiting distributions are also normal when λ 1 ≥ 2λ 2 , and the covariance matrices can be calculated from the covariance matrices in Appendix A.1.
As discussed in Remark 2.2, we can treat the special case when $p = 1/2$ directly, since the number of red vertices is simply given by $R_n = \sum_{i=1}^{n} X_i$, where $X_i \sim \mathrm{Be}(1/2)$ are independent Bernoulli random variables. Then we can apply the central limit theorem to get
$$\frac{R_n - n/2}{\sqrt{n}} \xrightarrow{d} N\!\left(0, \tfrac{1}{4}\right).$$
A multivariate normal limit law for the number of red and blue vertices follows since $B_n = n - R_n$.

We turn now to the number of leaves of each colour.
Proof of Theorem 2.4.Consider an urn with four colours of balls: r l , b l , r u , b u , each with activity 1.
Let R l n , B l n , R u n , B u n be the total numbers of the balls of colour r l , b l , r u , b u respectively at time n.The balls of colour r l and b l represent red and blue leaves respectively.The other balls represent the remaining weights of the red and blue vertices respectively.
If a red leaf is chosen at step n, this corresponds to choosing a ball of colour r l .Then with probability 1 − p it is removed and replaced with one ball of colour b l for the new blue leaf that is added, and 1 + α balls of colour r u , representing the weight of the now non-leaf vertex that was chosen.With probability p, the ball is placed back in the urn for the new red leaf that was added, along with 1 + α balls of colour r u , representing the weight of the now non-leaf vertex that was chosen.Therefore, we have the first column of the intensity matrix A l for this urn.If a red vertex that is not a leaf is chosen, then an additional α balls of colour r u are added (for the increase in weight of that vertex), along with either one ball of colour r l with probability p, or one ball of colour b l with probability 1 − p.
Therefore, we have the third column of A l .The symmetric arguments hold when balls of colour b l or b u are chosen.
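Therefore, the intensity matrix for this urn (with types ordered $r_l, b_l, r_u, b_u$) is
$$A^l = \begin{pmatrix} p - 1 & 1 - p & p & 1 - p \\ 1 - p & p - 1 & 1 - p & p \\ 1 + \alpha & 0 & \alpha & 0 \\ 0 & 1 + \alpha & 0 & \alpha \end{pmatrix}. \tag{10}$$
We see immediately that assumptions (A1)-(A8) hold. The eigenvalues of $A^l$ are $\lambda_1 = 1 + \alpha$, $\lambda_2 = 2p + \alpha - 1$, and $\lambda_3 = \lambda_4 = -1$. The matrix $A^l$ is diagonalizable when $\alpha \neq -2p$, and in this case a dual basis for the eigenspaces of $A^l$ can be written down explicitly. We can therefore apply Theorem 3.1. The covariance matrix for the limiting distribution for this urn when $1 + \alpha = \lambda_1 \geq 2\lambda_2$ was computed using Mathematica, and the calculation for $\Sigma^{\dagger}_1$ from (6) yields the same result as the calculation for $\Sigma_I$.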
When 1 + α = λ 1 < 2λ 2 = 4p − 2 + 2α, the limiting distribution depends on the colour of the root vertex, so just as in (8), we multiply by the random variable B defined in (2).Notice also that R l n + R u n = R w n and B l n + B u n = B w n , and so from the Cramér-Wold Theorem and the uniqueness of limits in distribution, the random variable Z achieved from Theorem 3.1(iv) is identical to the random variable Z in (8).
When p = 1/2, the colour of a newly added vertex does not depend on the colour of its parent.
In this case, consider an urn with three colours of balls: r l , b l , v u , each with activity 1.The balls of colour r l and b l represent red and blue leaves respectively, while v u represents the remaining weights of all non-leaf vertices.Performing a similar analysis as above, we get the following intensity matrix for this urn: The eigenvalues of A are λ 1 = 1 + α, λ 2 = λ 3 = −1, and the matrix is diagonalizable for all valid values of α.A dual basis for the eigenspaces of A is given by Once more, assumptions (A1) -(A8) hold.By looking at the eigenvalues of A, we see immediately that Theorem 3.1 (ii) applies.The covariance matrix for this case is included in Appendix A.2.
Restricted to R l n and B l n , the results of Theorem 2.4 follow.
The proof of Theorem 2.6 follows much the same way as the proof of [17, Theorem 3.9].Consider a partial ordering on the set of all pairs (T, ς), where T is a rooted tree and ς is a two-colouring of the vertices, such that (T 1 , ς 1 ) (T 2 , ς 2 ) if T 1 is a subtree of T 2 (preserving the root) and Let S = {(T 1 , ς 1 ), . . ., (T q , ς q )} such that if (T, ς) ∈ S and (T ′ , ς ′ ) (T, ς), then (T ′ , ς ′ ) ∈ S. Assume that the pairs (T 1 , ς 1 ), . . ., (T q , ς q ) are indexed so that if (T i , ς i ) (T j , ς j ) then i < j, and assume that (T 1 , ς 1 ) corresponds to a single red vertex, and (T 2 , ς 2 ) corresponds to a single blue vertex.We define an urn such that for the tree T n with colouring σ n , if a vertex v is the root of a fringe subtree T isomorphic to T i with σ n | T = ς i for which (T i , ς i ) ∈ S and if v does not belong to another fringe subtree T ′ isomorphic to T j with σ n | T ′ = ς j such that (T i , ς i ) (T j , ς j ) ∈ S, then v is represented in the urn by the ball of type i.If v is not the root of a fringe subtree isomorphic to a tree with colouring in S, then v is represented by α deg + (v) + 1 balls of special type * r if v is red, and * b if v is blue.Let Y i n be the number of balls of type i at time n, and let Y * r n and Y * b n be the number of balls of special type * r and * b respectively at time n, and let Y n = (Y 1 n , . . ., Y q n , Y * r n , Y * b n ).For example, consider S = {(T 1 , ς 1 ), . . ., (T 6 , ς 6 )}, where (T i , ς i ) are identified on the right side of Figure 3.A tree T 23 with colouring σ 23 is given in Figure 3. 
Then the urn we consider will contain two balls of type 1, two balls of type 2, one ball of type 3, one ball of type 4, one ball of type 5, and two balls of type 6. There are a further $7\alpha + 4$ balls of type $*_r$ for the remaining red vertices, and $6\alpha + 2$ balls of type $*_b$ for the remaining blue vertices. Note that only two red leaves contribute balls of type 1, since the remaining red leaves are subtrees of fringe subtrees isomorphic to $(T_4, \varsigma_4)$ or $(T_6, \varsigma_6)$. The activity of each ball of type $i$ is given by the sum of the weights of the vertices in the tree $T_i$, which is $a_i := |T_i|(\alpha + 1) - \alpha$. The activity of each ball of special type is 1. When a ball of type $i$ is picked, this corresponds to adding a child $u$ to a vertex $v$ that lies in a fringe subtree isomorphic to $T_i$. Let $(T_j, \varsigma_j)$ denote the fringe subtree with $u$ attached and coloured. If $(T_j, \varsigma_j) \in S$, then the ball of type $i$ is removed and replaced with a ball of type $j$. If $(T_j, \varsigma_j) \notin S$, then the ball of type $i$ is removed, the root $\rho_j$ of $T_j$ is now represented by $\alpha \deg^+(\rho_j) + 1$ balls of special type (with the appropriate colours) that are newly added, and the children of
Proof of Theorem 2.6. We start with convergence of the random vector $Y_n$. We would like to know the eigenvalues of the matrix $A_{q+2}$. We proceed by induction on $k$. Let $4 \le k \le q + 2$ and consider type 1 or 2 is added with equal probability. For $3 \le k \le q + 1$, let $A'_k$ be the intensity matrix for the urn with balls of type $1, \dots, k - 1$ along with balls of type $*$. Similar arguments as above hold, but in this case, the base case $A'_3$ is the matrix $A_l$ from (10). Thus, the eigenvalues of $A'_{q+1}$ are The conditions are once again met for convergence in distribution of the urn process, and by applying Theorem 3.1 and for the appropriate right eigenvector $v'_1$ of $A'_{q+1}$, we get The random variables $X^1_n, \dots, X^q_n$ are linear combinations of $Y^1_n, \dots, Y^q_n$, and so the convergences of Theorem 2.6 hold by (11)-(15) above and the Cramér-Wold Theorem, though we need to replace $\mu$ from (4) with some vector $\mu'$ for now. We can show that $\mu' = \mu$ by looking at $\mathbb{E}[(X^1_n, \dots, X^q_n)]/n$. We have just argued that $(X^1_n, \dots, X^q_n)/n$ converges almost surely to $\mu'$, and since no number of fringe trees exceeds the number of vertices, $(X^1_n, \dots, X^q_n)/n$ is uniformly bounded. Therefore, $(X^1_n, \dots, X^q_n)/n$ converges in mean to $\mu'$, and so $\mathbb{E}[(X^1_n, \dots, X^q_n)]/n$ converges to $\mu'$. From [17, Remark 3.10] we know that the expected number of fringe subtrees where $k_i$ is the number of vertices in $T_i$. Since the root of $T_n$ is red or blue with equal probability, by symmetry the root of a fringe subtree $T$ isomorphic to $T_i$ is red or blue with equal probability. Then by definition of $\sigma_n$, the colouring $\varsigma$ of $T$ follows the same distribution as $\sigma_T$. From this, we conclude that Therefore, $\mathbb{E}[(X^1_n, \dots, X^q_n)]/n$ converges to $\mu$, and so $\mu' = \mu$.
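The activity assigned to a ball of type $i$ in the urn above can be checked numerically. The following is a minimal Python sketch, assuming a tree is stored as a parent list; the helper names `out_degrees`, `tree_activity` and `closed_form_activity` are hypothetical and chosen only for this illustration. It sums the vertex weights $\alpha \deg^+(v) + 1$ over a tree and compares the result with $|T_i|(\alpha + 1) - \alpha$.

```python
from fractions import Fraction

def out_degrees(parents):
    """parents[v] is the parent of vertex v; the root has parent None."""
    deg = [0] * len(parents)
    for par in parents:
        if par is not None:
            deg[par] += 1
    return deg

def tree_activity(parents, alpha):
    """Sum of the vertex weights alpha * deg^+(v) + 1 over the whole tree."""
    return sum(alpha * d + 1 for d in out_degrees(parents))

def closed_form_activity(size, alpha):
    """The closed form a_i = |T_i| (alpha + 1) - alpha from the text."""
    return size * (alpha + 1) - alpha

# Two small illustrative trees: a path on 3 vertices and a star on 4 vertices.
path = [None, 0, 1]
star = [None, 0, 0, 0]
for alpha in (0, 1, Fraction(1, 2), Fraction(-1, 3)):
    for tree in (path, star):
        assert tree_activity(tree, alpha) == closed_form_activity(len(tree), alpha)
```

The identity holds because the out-degrees of a rooted tree on $m$ vertices sum to $m - 1$, so the total weight is $\alpha(m - 1) + m$.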

Proofs of properties of the root cluster
As mentioned in the introduction, convergence in distribution for the size of the root cluster has previously been proven [4,3,8,25] using random walks and branching processes. Here we use results from analytic combinatorics to get recursions for the moments of the limiting distributions.
We start with a useful description of the trees studied. Since we are only interested in the size of the root cluster and not the colour of this cluster, we can assume without loss of generality that the root is red. In the following, we define $\phi$ : For a particular tree $T$ on $n$ vertices, define the weight of $T$ to be Then the probability of producing the tree $T$ is given by , see for example [10, Section 1.1.3]. The probability of producing $T$ with a broadcasting induced colouring $\sigma_T$ being the 2-colouring $c$ is then given by , the factor of 2 appearing since we condition on the root being coloured red. If we define the weight $\omega(T, c) = 2\mathbb{P}(\sigma_T = c)\,w(T)$, then where $(T', c')$ ranges over all rooted trees on $n$ vertices and over all 2-colourings of the vertices such that the root is red. Symmetrically, define $\omega'(T, c)$ to be the weight of $T$ and $c$ where $c$ is conditioned such that the colour of the root is blue.
Let $r_{n,k}$ be the sum of the weights $\omega(T, c)$ over all trees with $n$ vertices whose root vertex is red and whose root cluster has size $k$, and let $b_n$ be the sum of the weights $\omega'(T, c)$ over all trees on $n$ vertices with a blue root. Equivalently, $b_n$ is the sum of the weights $w(T)$ over all trees $T$ on $n$ vertices. Then notice that and the probability that $T_n$ with colouring $\sigma_n$ has a root cluster of size $k'$ is given by $r_{n,k'} / \sum_k r_{n,k}$.
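The quantity studied in this section can also be explored by direct simulation. The sketch below is an illustration under stated assumptions, not the method used in the proofs: it grows a linear preferential attachment tree in which a new vertex picks parent $v$ with probability proportional to $\alpha \deg^+(v) + 1$, assumes $\alpha \ge 0$, and tracks only whether each vertex lies in the root cluster (a vertex is in the root cluster exactly when its parent is and it copies its parent's colour). The function name `simulate_root_cluster` is hypothetical.

```python
import random

def simulate_root_cluster(n, alpha, p, rng):
    """Return the root cluster size of one simulated coloured tree on n vertices.

    Assumes alpha >= 0. A new vertex chooses parent v with probability
    proportional to alpha * deg^+(v) + 1 and keeps the parent's colour
    with probability p.
    """
    out_deg = [0]              # out-degree of each vertex; vertex 0 is the root
    in_root_cluster = [True]   # the root always belongs to its own cluster
    for _ in range(1, n):
        weights = [alpha * d + 1 for d in out_deg]
        parent = rng.choices(range(len(out_deg)), weights=weights)[0]
        out_deg[parent] += 1
        out_deg.append(0)
        same_colour = rng.random() < p
        in_root_cluster.append(in_root_cluster[parent] and same_colour)
    return sum(in_root_cluster)

# Example: average root cluster size over a few runs of a small tree.
rng = random.Random(12345)
sizes = [simulate_root_cluster(200, 1.0, 0.7, rng) for _ in range(50)]
average_size = sum(sizes) / len(sizes)
```

Rebuilding the weight list at every step costs quadratic time in $n$, which is acceptable for small experiments but would need a cumulative structure for large trees.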
We develop a recursion formula for $r_{n,k}$. Take any tree $T$ on $n$ vertices, with colouring $c$ conditioned on the root $\rho$ being red, whose root cluster is of size $k$, and with $\delta$ subtrees rooted at the children of the root $\rho$. Suppose we order the subtrees such that the first $s$ subtrees $T_1, \dots, T_s$ have red roots, and the remaining subtrees $T_{s+1}, \dots, T_\delta$ have blue roots. Then the weight of $T$ and $c$ can be written as where the $c_i$'s and $c_j$'s are the colouring $c$ restricted to the subtrees $T_i$ and $T_j$ respectively. If the trees $T_1, \dots, T_\delta$ are of size $n_1, \dots, n_\delta$, then Now sum over all such trees $T$ on $n$ vertices with root clusters of size $k$. The degree $\delta$ of the root can range from $0$ to $n - 1$. The number $s$ of children with the colour red ranges from $0$ to $\delta$. There are $\binom{\delta}{s}$ ways of choosing these $s$ children. There are $\binom{n-1}{n_1, \dots, n_\delta}$ ways of distributing the remaining $n - 1$ vertices to the $\delta$ subtrees $T_1, \dots, T_\delta$. Finally, to unorder the subtrees we divide by $\delta!$ to get the recursion where $n_1, \dots, n_\delta$ range over all non-negative integers with $n_1 + \dots + n_\delta = n - 1$, and $k_1, \dots, k_s$ range over Finally we can let $\delta$ range to infinity since $r_{0,k} = b_0 = 0$.
We start with the case $\alpha = 0$. It is already known that the size of the root cluster converges to a Mittag-Leffler distribution after proper rescaling. For the sake of completeness and as a simpler example of our more general methods, we reprove the result here.
Let $R(x, u)$ be the bivariate generating function for $r_{n,k}$, so $R(x, u) = \sum_{n \ge 1} \sum_{k \ge 1} r_{n,k} \frac{x^n}{n!} u^k$, and let $B(x)$ be the exponential generating function for $b_n$. The first thing to notice is that $b_n$ and $\sum_{k=1}^{n} r_{n,k}$ are simply the number of recursive trees of size $n$, which is $(n - 1)!$. Therefore, $B(x) = R(x, 1) = -\ln(1 - x)$. Using (18), we establish a partial differential equation for $R(x, u)$. The resulting differential equation is then solved to get the following closed form for $R(x, u)$.
Proposition 4.1. For $\alpha = 0$, the bivariate generating function $R(x, u)$ is given by
Proof. From (18), where $\phi(\delta) = 1$ for all $\delta$ (recall (16)), we get the partial differential equation Replacing $B(x)$ with $-\ln(1 - x)$ and with the initial condition $R(0, u) = 0$, this linear differential equation has the solution
To prove Theorem 2.9, we use the method of moments to establish a limiting distribution for the size of the root cluster after appropriate scaling.
Proof of Theorem 2.9. From Proposition 4.1, we calculate We then extract the coefficients, Let $C_n$ be the root cluster at time $n$. The factorial moments of $|C_n|$ are extracted from the bivariate generating function (see for example [13, Proposition III.2]) to get It can be seen (say by induction) that once expanded and scaled by $n^{pk}$, all but the $\mathbb{E}[|C_n|^k]$ term on the left hand side of the above equation vanish, and thus , which are the moments of the Mittag-Leffler distribution with parameter $p$. The Mittag-Leffler distribution is uniquely determined by its moments (since its moment generating function converges for all values of $s$ [24]). Therefore, where $M_p$ has the Mittag-Leffler distribution with parameter $p$.
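For $\alpha = 0$ the $n^p$ scaling can be checked without generating functions. The following Python sketch rests on an elementary observation that is not taken from the proof above: in a random recursive tree the new vertex picks a uniform parent among the $m$ existing vertices, so it joins the root cluster with probability $p|C_m|/m$, which makes $|C_m|$ a Markov chain. The code computes the exact law of $|C_n|$ and compares its mean with $\Gamma(n + p)/(\Gamma(n)\Gamma(1 + p))$, which behaves like $n^p/\Gamma(1 + p)$ for large $n$. The function names are hypothetical.

```python
from math import exp, isclose, lgamma

def root_cluster_distribution(n, p):
    """Exact law of |C_n| in a random recursive tree (alpha = 0).

    dist[k] is P(|C_n| = k). In the step from m to m + 1 vertices the new
    vertex joins the root cluster with probability p * k / m.
    """
    dist = [0.0, 1.0]  # |C_1| = 1 with probability one
    for m in range(1, n):
        new = [0.0] * (m + 2)
        for k in range(1, m + 1):
            grow = p * k / m
            new[k] += dist[k] * (1.0 - grow)
            new[k + 1] += dist[k] * grow
        dist = new
    return dist

def exact_mean(n, p):
    """Gamma(n + p) / (Gamma(n) Gamma(1 + p)), from E|C_{m+1}| = (1 + p/m) E|C_m|."""
    return exp(lgamma(n + p) - lgamma(n) - lgamma(1 + p))

n, p = 200, 0.6
dist = root_cluster_distribution(n, p)
mean = sum(k * q for k, q in enumerate(dist))
assert isclose(sum(dist), 1.0, rel_tol=1e-9)
assert isclose(mean, exact_mean(n, p), rel_tol=1e-9)
```

Under the common normalization in which the Mittag-Leffler distribution with parameter $p$ has first moment $1/\Gamma(1 + p)$ (an assumption about the convention used here), this agrees with the first moment of the limit of $|C_n|/n^p$.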
We move on now to the case $\alpha > 0$. We again let The functions $B(x)$ and $R(x, 1)$ are simply the generating function of preferential attachment trees, which is already known (see for example [10, p. 252]) to be Unlike the case when $\alpha = 0$, we were unable to derive a closed form for $R(x, u)$. But by applying the method of moments, we only need the $k$'th partial derivatives of $R(x, u)$ with respect to $u$. Define Throughout the remainder of this section, we will make use of the partial Bell polynomials, which are defined to be for some $\varepsilon > 0$, where $C_k$ satisfies the recursion $C_1 = \alpha/(p + \alpha)$ and
Proof. Using the recursion in (18), where $\phi(d) = \Gamma(d + 1/\alpha)/\Gamma(1/\alpha)$ (recall (16)), we get the following partial differential equation: We proceed by strong induction. Using the above differential equation, we see that Solving this differential equation with the initial condition $R_1(0) = 0$ yields , which is analytic on the desired cut plane.
For the inductive step, using the product rule at higher orders of partial differentiation produces Then and by using Faà di Bruno's formula for higher order derivatives (see [9, p. 139, Theorem C]), we see that where Since analyticity is preserved under arithmetic operations as well as integration, the analyticity of $R_k(x)$ on the desired cut plane follows by the analyticity in the induction hypothesis. By using the forms of $R_j(x)$ in the induction hypothesis and (19), then for some $\varepsilon > 0$, where From (20), the induction hypothesis, and the assumption $\alpha > 0$, we can also conclude that $+\varepsilon$ for some $\varepsilon > 0$. By solving the differential equation we get that where , concluding the proof of the lemma.
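The partial Bell polynomials that drive the recursion in the lemma above are easy to evaluate numerically. The sketch below assumes the standard definition (as in Comtet's book cited as [9]) and uses the well-known recurrence $B_{n,k} = \sum_{i=1}^{n-k+1} \binom{n-1}{i-1} x_i B_{n-i,k-1}$ with $B_{0,0} = 1$; the function name `partial_bell` is hypothetical.

```python
from functools import lru_cache
from math import comb

def partial_bell(n, k, x):
    """Evaluate the partial Bell polynomial B_{n,k}(x_1, ..., x_{n-k+1}).

    x is a sequence with x[0] = x_1, x[1] = x_2, ... and must contain at
    least n - k + 1 entries.
    """
    xs = tuple(x)

    @lru_cache(maxsize=None)
    def bell(m, j):
        if m == 0 and j == 0:
            return 1
        if m == 0 or j == 0:
            return 0
        return sum(comb(m - 1, i - 1) * xs[i - 1] * bell(m - i, j - 1)
                   for i in range(1, m - j + 2))

    return bell(n, k)

# With x_i = 1 the values are Stirling numbers of the second kind,
# and with x_i = i! they are Lah numbers.
stirling_4_2 = partial_bell(4, 2, [1, 1, 1])
lah_4_2 = partial_bell(4, 2, [1, 2, 6])
```

In Faà di Bruno's formula the $n$-th derivative of $f(g(x))$ is $\sum_k f^{(k)}(g(x))\, B_{n,k}(g'(x), g''(x), \dots)$, so taking $f = \exp$ and $g(x) = e^x - 1$ at $x = 0$ gives the Bell numbers as $\sum_k B_{n,k}(1, 1, \dots)$.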
When proving Theorem 2.10, we need to show that our limiting distribution is uniquely determined by its moments. This is accomplished by verifying that the moment generating function exists for some positive radius. To prove this fact, we will instead show that the exponential generating function for the coefficients $C_k$ from the previous lemma exists for some positive radius around has a unique analytic solution for some neighbourhood around $x = 0$. Furthermore, this solution can be written as
Proof. By using the Taylor expansion we see that which is analytic on $|x| < 1/p$. Therefore, we can rewrite the differential equation in (21) as where $f(x) = 1 + O(x)$, so in particular, $f(0) = 1$. Furthermore, $f(x)$ maintains the same radius of convergence as $\frac{1}{(1 - px)^{1/\alpha}}$. We solve the above separable differential equation for some constant $K$, where $F(y) = \int_0^y \frac{1 - f(c)}{c f(c)}\,dc$. The analyticity of $F(x)$ in some neighbourhood of $x = 0$ is guaranteed by preservation of analyticity through integration and the analyticity of $\frac{1 - f(c)}{c f(c)}$, which is itself analytic due to the analyticity of $f(x)$ and the fact that $f(0) = 1$. Thus, using the implicit function theorem, there exists a unique analytic function $c(x)$ in the neighbourhood of $x = 0$ such that $c(0) = 0$.
To prove the last part of the lemma, it suffices to show that the power series satisfies the differential equation (21). Recall the recursion for $C_k$ given in Lemma 4.2, which states that Recall that Then by using known results about composition of functions and Bell polynomials (see e.g. [9, p. 137, Theorem A]), which can be rearranged to give (21).
We now have all the tools necessary to prove Theorem 2.10.
Proof of Theorem 2.10. Using a transfer theorem (see [13, Corollary VI.1]) and Lemma 4.2, and Let $C_n$ be the root cluster at time $n$.
It can be seen (say by induction) that once expanded and scaled by $n^{k(p+\alpha)/(1+\alpha)}$, all but the $\mathbb{E}[|C_n|^k]$ term on the left hand side of the above equation vanish, and thus For all $k$ large enough, $M_k < C_k$, and so m( x k , which is guaranteed to be nonzero by Lemma 4.3. Let $C$ be the distribution uniquely determined by its moments $M_k$. Then by using the method of moments, we have shown that
In general, we were unable to derive a closed form for $C_k$. We were, however, able to derive a closed form when $\alpha = 1$.
Proof of Theorem 2.11. We use Lemma 4.3, and replace $\alpha$ with 1 to get which is rewritten as Applying the Lagrange inversion formula (see for example [13, Theorem A.2]) to this functional equation yields So finally Theorem 2.11 now follows from the above derivation and Theorem 2.10.
We turn our attention to the case $\alpha = -1/d$ for some integer $d \ge 2$, and $\alpha > -p$. In this case the functions $B(x)$ and $R(x, 1)$ are equal to the generating function for increasing $d$-ary trees, which is known (see for example [10, Lemma 6.5]) to be Once more, we were unable to derive a closed form for $R(x, u)$ in this case. Recall the notation for some $\varepsilon > 0$, where $D_k$ satisfies the recursion $D_1 = 1/(pd - 1)$ and
Since the proof of Lemma 4.4 follows much the same way as the proof of Lemma 4.2, the argument is relegated to Appendix B. Much like the case above for $\alpha > 0$, we will prove the existence of the moment generating function of our limiting distribution in a neighbourhood of 0, and this is done by studying the exponential generating function of $D_k$.
Lemma 4.5. The differential equation has a unique analytic solution for some neighbourhood around $x = 0$. Furthermore, this solution can be written as
Proof. By using the Binomial Theorem, we rewrite the differential equation as where which is simply a polynomial (and so an entire function), and $g(0) = 1$. The remainder of the existence part of the proof now follows much the same as that of Lemma 4.3.
To prove the last part of the lemma, recall the recursion of $D_k$ given in Lemma 4.4. Then By using known results about composition of functions and Bell polynomials, which can be rearranged to give (22).
The proof of Theorem 2.16 now follows in much the same way as the proof of Theorem 2.10.
Let $C_n$ be the root cluster at time $n$. The factorial moments of $|C_n|$ are extracted from the bivariate generating function (see for example [13, Proposition III.2]), and once scaled by $n^{k(pd-1)/(d-1)}$, we get For all $k$ large enough, $M_k < D_k$, and so m( x k , which is guaranteed to be nonzero by Lemma 4.5. Let $C$ be the distribution uniquely determined by its moments $M_k$. Then by using the method of moments, we have shown that
We were unable to find a closed form for $D_k$ in general. However, a closed form can be found in the case of binary search trees, when $d = 2$.
Proof of Theorem 2.17. We use Lemma 4.5 and replace $d$ with 2 to get which is rewritten as The Lagrange inversion formula yields Theorem 2.17 now follows from the above derivation and Theorem 2.16.
We now look at the cases when the root cluster is finite. Our strategy in these cases is to look at bond percolation on the complete infinite $d$-ary tree $T_d$. The root cluster $K_d$ after performing bond percolation on $T_d$ is distributed as a Galton-Watson tree with binomial $\mathrm{Bin}(d, p)$ offspring distribution. The size (total progeny) of such (finite) trees is known to follow $\mathbb{P}(|K_d| = k) = \frac{1}{k}\,\mathbb{P}(X_1 + \dots + X_k = k - 1)$, where $X_1, \dots, X_k$ are independent binomial random variables $X_i \sim \mathrm{Bin}(d, p)$ (this result was proved by Otter [29]; a more general result was proved by Dwass [11]. See also [16, Exercises 2.2-2.4]). Thus $\mathbb{P}(|K_d| = k) = \frac{1}{k}\binom{dk}{k-1} p^{k-1}(1 - p)^{dk-k+1}$. If we now let $T_n$ be the rooted subtree of $T_d$ corresponding to a random increasing $d$-ary tree at time $n$, then $C_n \sim T_n \cap K_d$ is distributed as the root cluster of $T_n$ with a random broadcasting induced colouring $\sigma_n$, where the intersection is the subtree of both $T_n$ and $K_d$. For example in Figure 5, we see a tree $T_9$, with thick edges in the figure, grown on a complete infinite 3-ary tree $T_3$. Bond percolation has been performed on $T_3$ (dashed edges represent edges that were removed), and the root cluster $K_3$ is shown surrounded by dotted lines. The root cluster $C_9$ of $T_9$ is the intersection of $K_3$ and $T_9$.
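The total progeny law of the percolation cluster $K_d$ can be evaluated directly. The Python sketch below uses the Otter-Dwass formula together with the fact that a sum of $k$ independent $\mathrm{Bin}(d, p)$ variables is $\mathrm{Bin}(dk, p)$; it works on the log scale to avoid overflow for large $k$ and assumes $0 < p < 1$. The names `log_binom` and `total_progeny_pmf` are hypothetical.

```python
from math import exp, lgamma, log

def log_binom(n, m):
    """Natural logarithm of the binomial coefficient C(n, m)."""
    return lgamma(n + 1) - lgamma(m + 1) - lgamma(n - m + 1)

def total_progeny_pmf(k, d, p):
    """P(|K_d| = k) for a Galton-Watson tree with Bin(d, p) offspring.

    Otter-Dwass: P(|K_d| = k) = (1/k) P(X_1 + ... + X_k = k - 1), where the
    sum of k independent Bin(d, p) variables is Bin(d * k, p). Needs 0 < p < 1.
    """
    log_prob = (log_binom(d * k, k - 1) + (k - 1) * log(p)
                + (d * k - k + 1) * log(1 - p) - log(k))
    return exp(log_prob)

# Subcritical example (p < 1/d): the law sums to one and has mean 1 / (1 - d p).
d, p = 2, 0.3
pmf = [total_progeny_pmf(k, d, p) for k in range(1, 401)]
total_mass = sum(pmf)
mean_size = sum(k * q for k, q in enumerate(pmf, start=1))
```

For $p < 1/d$ the mean total progeny of the Galton-Watson tree is $1/(1 - dp)$, which the truncated sum reproduces; at $p = 1/d$ the same sum grows without bound as the truncation point increases, in line with the absence of a finite expectation in the critical case.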
Proof of Theorem 2.12. First note that with the set-up above, $|C_n| \le |K_d|$. We look at the fill-up or saturation level $H_n$ of $T_n$, the greatest value $m$ for which $T_n$ has $d^m$ vertices at distance $m$ from the root. From [10, Theorem 6.46], we know that for some constants $a, c > 0$ (not depending on $n$).
The last line tends to zero thanks to (24) and the fact that $K_d$ is finite. The desired convergence in probability is then achieved, completing the proof.
When $p < 1/d$, the distribution described by (23) has finite moments. Since $\{|C_n|\}_{n=1}^{\infty}$ consists of increasing positive random variables bounded by $|K_d|$, their moments are uniformly bounded by the moments of $|K_d|$. Thus, along with the almost sure convergence of Theorem 2.12, convergence in all moments holds as well (see [15, ch. 5, Theorem 5.2]). However, when $p = 1/d$, the distribution described by (23) does not even have finite expectation.
We can, however, derive asymptotic results for the moments of |C n |.
We start by once more approximating the functions R k (x).
where $E_k$ satisfies the recursion $E_1 = 1/(d - 1)$ and
Since the proof of Lemma 4.6 also follows much the same way as the proof of Lemma 4.2, the argument is relegated to Appendix B.
Proof of Theorem 2.15. From the approximations of the functions $R_k(x)$ in the previous proofs, we conclude by a transfer theorem [13, Corollary VI.1] (or also [20, Théorème A]) that
Therefore, we see that

A Covariance matrices
A.1 Number of clusters of each colour
For $p < (3 - \alpha)/4$:
We proceed by strong induction. Using the above differential equation, we see that Solving this differential equation with the initial condition $R_1(0) = 0$ yields , which is analytic on the desired cut plane.
For the inductive step, using the product rule at higher orders of partial differentiation produces for $0 \le m \le d$. By using Faà di Bruno's formula for the higher order chain rule, we see that where Since analyticity is preserved under arithmetic operations as well as integration, the analyticity of $R_k(x)$ on the desired cut plane follows by the induction hypothesis. By using the forms of $R_j(x)$ in the induction hypothesis and (28), then which is analytic on the desired cut plane.
For the inductive step, the derivation (27) holds here as well. We get that By following the same steps as the proof of Lemma 4.4, we see that As before, analyticity is preserved. By using the induction hypothesis and the simplification Setting concludes the proof of the lemma.

Theorem 2.12. Let $|C_n|$ be the size of the root cluster of $T_n$ with broadcasting induced colouring $\sigma_n$. Then $|C_n| \xrightarrow{\text{a.s.}} |K_d|$.
Using well known results on the size of $|K_d|$, the following corollary is immediate:
Corollary 2.13. Let $\alpha = -1/d$, where $d \ge 2$ is a positive integer. Then for every positive integer $k$,

Consider an urn with four colours of balls: $r$, $b$ with activity 1, and $r_c$, $b_c$ with activity 0. Let $R^w_n$, $B^w_n$ be the total number of balls (and so the total activity of the balls) of colour $r$, $b$ respectively, and let $R^c_n$, $B^c_n$ be the number of balls of colour $r_c$, $b_c$ respectively. As in the urn above, the balls $r$ and $b$ represent the weights of the red and blue vertices in $T_n$ with colouring $\sigma_{T_n}$. The balls of colours $r_c$ and $b_c$ represent clusters of colour red and blue respectively. We start the urn with a ball of colour $r$ and a ball of colour $r_c$ if the root is red, and a ball of colour $b$ and a ball of colour $b_c$ if the root is blue. Therefore the number of red and blue clusters at time $n$ is exactly $R^c_n$ and $B^c_n$ respectively. For example in Figure 2, there are 7 red clusters and 5 blue clusters, so $R^c_n = 7$ and $B^c_n = 5$. Each vertex $v$ contributes $\alpha \deg^+(v) + 1$ to the total weight of its colour. Summing over all red vertices yields $R^w_n = 13 + 11\alpha$ and summing over all blue vertices yields $B^w_n = 10 + 11\alpha$.
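The four urn counts described above can be read off a coloured tree directly. The following Python sketch assumes a parent-list representation with colours given as the strings `'r'` and `'b'`; the function name `cluster_counts_and_weights` is hypothetical. A vertex starts a new cluster exactly when it is the root or its colour differs from its parent's, and each vertex adds $\alpha \deg^+(v) + 1$ to the weight of its colour.

```python
def cluster_counts_and_weights(parents, colours, alpha):
    """Return (red_clusters, blue_clusters, red_weight, blue_weight).

    parents[v] is the parent of v (None for the root); colours[v] is 'r' or 'b'.
    """
    out_deg = [0] * len(parents)
    for par in parents:
        if par is not None:
            out_deg[par] += 1
    clusters = {"r": 0, "b": 0}
    weights = {"r": 0, "b": 0}
    for v, par in enumerate(parents):
        # A new cluster begins at the root and at every colour change.
        if par is None or colours[par] != colours[v]:
            clusters[colours[v]] += 1
        weights[colours[v]] += alpha * out_deg[v] + 1
    return clusters["r"], clusters["b"], weights["r"], weights["b"]

# Small illustrative tree: root 0 with children 1 and 2, and vertex 3 below 1.
parents = [None, 0, 0, 1]
colours = ["r", "r", "b", "b"]
red_c, blue_c, red_w, blue_w = cluster_counts_and_weights(parents, colours, 2)
```

The two weights always add up to $n(\alpha + 1) - \alpha$ for a tree on $n$ vertices, which matches the Figure 2 example: $(13 + 11\alpha) + (10 + 11\alpha) = 23(\alpha + 1) - \alpha$.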

Figure 2: A tree $T_{23}$ with broadcasting induced colouring $\sigma_{23}$ with the clusters identified.
distribution to a normal distribution for all $a, b \in \mathbb{R}$, so the Cramér-Wold theorem applies. Finally, a quick calculation shows that $\mathrm{Cov}(R_n, B_n) = -\mathrm{Var}(R_n)$, implying the convergence in Theorem 2.1(ii) when $p = 1/2$.

Figure 3: A tree $T_{23}$ with broadcasting induced colouring $\sigma_{23}$ with fringe subtrees identified.
$\rho_j$ are roots to $\deg^+(\rho_j)$ newly considered fringe subtrees. If these subtrees (along with their colouring) appear in $S$, then balls representing them are added. Otherwise, balls of special type are added for the root, and the subtrees of that vertex are considered, continuing this process until all vertices are represented by balls in the urn. If a new vertex $u$ added to $T_n$ is the child of a vertex $v$ that is represented by balls of special type in the urn, then $\alpha$ balls of special type with the appropriate colour are added to the urn, representing the increase in the weight of $v$, while either a ball of type 1 or 2 is added as well, representing the new leaf $u$ added to $T_n$. For $4 \le k \le q + 2$, let $S_k = \{(T_1, \varsigma_1), \dots, (T_{k-2}, \varsigma_{k-2})\}$, and let $A_k$ be the intensity matrix for the urn with balls of type $1, \dots, k - 2$ along with $*_r$ and $*_b$. Let $a_i := |T_i|(\alpha + 1) - \alpha$.

Figure 4: A recursion for the weight of $T$ is established by examining the subtrees rooted at children of the root of $T$.

Lemma 4.4. Let $d \ge 2$ be a positive integer and let $p > 1/d$. Then $R_k(x)$ is analytic on the cut plane

Figure 5: A random increasing 3-ary tree is grown on a complete infinite 3-ary tree with bond percolation performed. The root cluster $C_9$ has 4 vertices at this stage.
The factorial moments of |C n | are extracted from the bivariate generating function (see for example [13, Proposition III.2]) to get