The number of descendants in a random directed acyclic graph

We consider a well-known model of random directed acyclic graphs of order $n$, obtained by recursively adding vertices, where each new vertex has a fixed outdegree $d\ge2$ and the endpoints of the $d$ edges from it are chosen uniformly at random among previously existing vertices. Our main results concern the number $X$ of vertices that are descendants of $n$. We show that $X/\sqrt n$ converges in distribution; the limit distribution is, up to a constant factor, given by the $d$th root of a variable with the Gamma distribution $\Gamma(d/(d-1))$. When $d=2$, the limit distribution can also be described as a chi distribution $\chi(4)$. We also show convergence of moments, and thus find the asymptotics of the mean and higher moments.


Introduction
A dag is a directed acyclic (multi)graph, and a $d$-dag is a dag where one or several vertices are roots with outdegree 0, and all other vertices have outdegree $d$. (Here, $d$ is a positive integer; we assume below $d\ge2$.) We consider, as many before us, the random $d$-dag $D_n$ on $n$ vertices constructed recursively by starting with a single root 1, and then adding vertices $2,3,\dots,n$ one by one, giving each new vertex, $k$ say, $d$ outgoing edges with endpoints uniformly and independently chosen at random among the already existing vertices $\{1,\dots,k-1\}$. (We thus allow multiple edges, so $D_n$ is a directed multigraph.) Two minor variations that will be discussed in Section 10 are that we may start with any number $m\ge1$ of roots, and that we may select the $d$ parents of a new node without replacement, thus not allowing multiple edges. (In the latter case, we have to start with $\ge d$ roots.) Note that for $d=1$, the model becomes the well-known random recursive tree; the properties in this case are quite different from the case $d\ge2$, and we assume throughout the paper $d\ge2$. In fact, to concentrate on the essential features, in the bulk of the paper we consider the most important case $d=2$; the minor differences in the case $d>2$ are briefly treated in Section 8.
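For concreteness, the uniform-attachment construction and the descendant count can be sketched in a few lines of Python (our own simulation sketch, not part of the paper; the function and variable names are ours). For $d=2$ and one root, the sample mean of $X^{(n)}$ at $n=100$ should be close to the asymptotic value from Theorem 1.3, roughly $2.09\sqrt n\approx 20.9$.

```python
import random

def descendants_count(n, d=2, rng=random):
    # Random d-dag D_n: vertex k >= 2 gets d outgoing edges whose endpoints
    # are drawn uniformly, with replacement, from {1, ..., k-1}.
    parents = {k: [rng.randint(1, k - 1) for _ in range(d)] for k in range(2, n + 1)}
    # Descendants of n: vertices reachable from n by a directed path.
    seen, stack = {n}, [n]
    while stack:
        v = stack.pop()
        for w in parents.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen)  # X^{(n)}, counting n itself

rng = random.Random(1)
mean = sum(descendants_count(100, rng=rng) for _ in range(2000)) / 2000
# mean is typically close to 2.09 * sqrt(100), up to finite-size effects
```

For finite $n$ the mean deviates somewhat from the asymptotic constant; compare the exact value 20.79 that Knuth computed for a closely related variant at $n=100$.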
The random $d$-dag has been studied as a model for a random circuit where each gate has $d$ inputs chosen at random [10; 19; 1; 18; 6; 16]. (In this case it seems more natural to reverse all edges, and regard a $d$-dag as a graph with indegrees 0 or $d$. In the present paper, we direct the edges towards the root(s) as above.) The model has also been studied in connection with constraint satisfaction [13]. Among results shown earlier for random $d$-dags, we mention results on vertex degrees and leaves [9; 18; 15; 16; 14], and on lengths of paths and depth [10; 19; 1; 8; 6].
In the present paper, we study the following problem, as far as we know first considered by Knuth [13]: how many descendants does vertex $n$ have? In other words, how many vertices can be reached by a directed path from vertex $n$? In the random circuit interpretation, this is the number of gates (and inputs) that are used in the calculation of an output.
We state our main results in the next subsection, and prove them in Sections 2-8. Along the way, we prove some results on the structure of the subgraph of descendants which may be of independent interest. Some further results are given in Section 9. As said above, we discuss two variations of the model in Section 10.
Remark 1.1. We emphasise that we in this paper exclusively consider random dags constructed by uniform attachment. Another popular model that has been studied by many authors (often as an undirected graph) is preferential attachment, see e.g. [3] and [5]. A different model of non-uniform attachment is studied in [6]. △

Problem 1.2. Find results for preferential attachment random dags corresponding to the results above! Do the same for the model in [6]!

1.1. Main result. We introduce some notation; for further (mainly standard) notation, see Section 1.2. We let $d\ge2$ be fixed and consider asymptotics as $n\to\infty$. Let $D_n$ be the random $d$-dag defined above, let $\hat D_n$ be the subdigraph of $D_n$ consisting of all vertices and edges that can be reached by a directed path from vertex $n$ (including vertex $n$ itself), and let $X^{(n)}:=|\hat D_n|$, the number of descendants of $n$. We thus want to find the asymptotic behaviour of the random variable $X^{(n)}$ and its expectation $\mathbb E X^{(n)}$ as $n\to\infty$. Note that $\hat D_n$ also is a $d$-dag, and has 1 root; thus the number of edges in $\hat D_n$ is $d(X^{(n)}-1)$, and hence our results also yield the asymptotics of the number of edges.
Our main result in the case $d=2$ is the following theorem, proved in two parts in Sections 6 and 7.
Let $\chi_4$ denote a random variable with the $\chi(4)$ distribution. Recall that this means that $\chi_4$ has the distribution of $|\eta|$ where $\eta$ is a standard normal random vector in $\mathbb R^4$, and that thus (or by (1.7) and a change of variables) $\chi_4$ has density function $\tfrac12 x^3 e^{-x^2/2}$, $x>0$. (1.1)

Theorem 1.3. Let $d=2$. Then, as $n\to\infty$,
$$ \frac{X^{(n)}}{\sqrt n} \overset{d}{\longrightarrow} \frac{\pi}{2\sqrt2}\,\chi_4, \qquad (1.2) $$
with convergence of all moments. Hence, for every fixed $r>0$,
$$ \mathbb E\bigl(X^{(n)}\bigr)^r \sim \Bigl(\frac{\pi}{2\sqrt2}\Bigr)^r\,\mathbb E\,\chi_4^r\; n^{r/2}, \qquad (1.3) $$
and in particular
$$ \mathbb E X^{(n)} \sim \frac{3\pi^{3/2}}{8}\,\sqrt n \approx 2.0879\,\sqrt n. \qquad (1.4) $$
We note that the convergence in (1.2) and (1.5) does not hold a.s.; see Remark 9.6. We will see in Section 10 that the same results hold for the variations with $m\ge1$ roots (as long as $m$ is fixed or does not grow too fast) and without multiple edges (i.e., drawing without replacement).
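As a quick numerical sanity check of (1.1) (ours, not part of the paper), one can compare the empirical mean of $|\eta|$ for a standard normal vector $\eta\in\mathbb R^4$ with the value $\int_0^\infty x\cdot\tfrac12 x^3e^{-x^2/2}\,\mathrm dx = 3\sqrt{2\pi}/4\approx1.880$ computed from the density.

```python
import random, math

rng = random.Random(0)

def chi4_sample(rng):
    # |eta| for a standard normal vector eta in R^4
    return math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(4)))

samples = [chi4_sample(rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
# density (1/2) x^3 exp(-x^2/2) gives E chi_4 = 3*sqrt(2*pi)/4 ≈ 1.8800
```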
Example 1.5. Knuth [13] considers the version with $d=2$, $m\ge2$ roots, and drawing without replacement (i.e., no multiple edges); for this version he provides recursion formulas that yield the exact value of $\mathbb E X^{(n)}$ (there denoted $C_{m,n}$). For example, for $m=2$ and $n=100$, his formulas yield $\mathbb E X^{(n)}\approx20.79$, while the asymptotic value (1.4) is $\approx20.88$, with an error of less than 0.5%. △

1.2. Notation. The random $d$-dag $D_n$, its subdigraph $\hat D_n$, and the number $X^{(n)}$ of descendants of $n$ are defined above. The outdegree $d$ is fixed and not shown in the notation. As said above we usually assume $d=2$; in particular this is the case in the proofs in Sections 2-7, while we consider general $d\ge2$ in Section 8.
We say that the vertices and edges of $\hat D_n$ are red. Thus $X^{(n)}:=|\hat D_n|$ is the number of red vertices in $D_n$. (For any digraph $D$, we let $|D|$ denote its number of vertices.) Essentially all random variables below depend on $n$. We may denote the dependency on $n$ by a superscript $(n)$ for clarity (in particular in limit statements), but we often omit this. We sometimes in the proofs tacitly assume that $n$ is large enough. Unspecified limits are as $n\to\infty$.
We will in the proofs consider three different phases of the dag $D_n$, see Sections 3-5. We will then use fixed integers $n_1=n_1^{(n)}$ and $n_2=n_2^{(n)}$; these can be chosen rather arbitrarily with $n_1/n\to0$ slowly and $n_2/\sqrt n\to\infty$ slowly, see the beginnings of Sections 3 and 4.
We use $\overset{p}{\longrightarrow}$, $\overset{d}{\longrightarrow}$, and $\overset{L^1}{\longrightarrow}$ for convergence in probability, distribution and $L^1$, respectively, and $\overset{d}{=}$ for equality in distribution. As usual, a.s. (almost surely) means with probability 1, while w.h.p. (with high probability) means with probability tending to 1 as $n\to\infty$.
Remark 1.6. For simplicity, and to avoid unnecessary distraction, we often state results with convergence in probability, also when the proof yields the stronger convergence in $L^1$. (For example, this applies to all three results in Section 4.) Actually, in many (all?) cases, convergence in probability can be improved to convergence in $L^p$ for any $p<\infty$, as a consequence of the estimates in Section 7. △

Remark 1.7. The construction of the random dag $D_n$ naturally constructs $D_n$ for all $n\ge1$ together. In other words, it yields a coupling of $D_n$ for all $n\ge1$. However, in the proofs below we will not use this coupling; instead we regard $D_n$ as constructed separately for each $n$, which allows us to use a different coupling in the proof. △

Basic analysis
For simplicity, we assume $d=2$ from now on until the proof of Theorem 1.3 is completed at the end of Section 7. The modifications for general $d$ are discussed in Section 8.

2.1. A stochastic recursion. We consider in the sequel only the red subgraph $\hat D_n$ of $D_n$, which we recall consists of the descendants of $n$ and all edges between them.
In the definition in Section 1 of the dag $D_n$, we start with vertex 1 and add vertices in increasing order. In our analysis, we will instead start at vertex $n$ and go backwards to 1. The red dag $\hat D_n$ then may be generated by the following procedure.
(1) Start by declaring vertex $n$ to be red, and all others black. Let $k:=n$.
(2) If vertex $k$ is red, then create two new edges from that vertex, with endpoints that are randomly drawn from $1,\dots,k-1$, and declare these endpoints red. If $k$ is black, delete $k$ (and do nothing else).
(3) If $k=2$ then STOP; otherwise let $k:=k-1$ and REPEAT from (2).
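The backward procedure (1)-(3) can be sketched directly in Python (our illustration; the names are ours). It generates only the red subgraph, yet the number of red vertices has the same distribution as $X^{(n)}$ under the forward construction.

```python
import random

def red_count(n, rng=random):
    # Steps (1)-(3) for d = 2: generate the red subgraph of D_n backwards.
    red = [False] * (n + 1)
    red[n] = True                      # (1): vertex n is red
    for k in range(n, 1, -1):          # (2)-(3): k = n, n-1, ..., 2
        if red[k]:
            for _ in range(2):         # two edges from k, endpoints in {1,...,k-1}
                red[rng.randint(1, k - 1)] = True
    return sum(red)                    # number of red vertices = X^{(n)}

rng = random.Random(2)
mean = sum(red_count(100, rng=rng) for _ in range(2000)) / 2000
```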
Let $Y_k$ be the number of edges in $\hat D_n$ that start in $\{k+1,\dots,n\}$ and end in $\{1,\dots,k\}$. In other words, $Y_k$ is the number of edges that cross the gap between $k+1$ and $k$. Furthermore, let $Z_k$ be the number of these edges that end in $k$. We here consider integers $k$ with $0\le k\le n-1$, and have the boundary conditions $Y_{n-1}=2$ and $Y_0=0$; also $Z_1=Y_1$ and $Z_0=0$.
Let also, for $1\le k\le n-1$, $J_k:=\mathbf 1\{Z_k\ge1\}$ (2.1), the indicator that at least one edge ends at $k$, which equals the indicator that $k$ is red, and thus can be reached from $n$.
We will study the random dag $\hat D_n$ by travelling from vertex $n$ backwards to the root; we thus consider the sequence $Y_{n-1},\dots,Y_1,Y_0$ in reverse order. In the procedure above, there are $Z_k$ edges that end at $k$, and $2J_k$ edges that start there; hence, for $1\le k\le n-1$,
$$ Y_{k-1} = Y_k - Z_k + 2J_k. \qquad (2.2) $$
In our analysis, we modify the procedure above by not revealing the endpoints of the edges until needed. This means that when coming to a vertex $k\in\{1,\dots,n-1\}$, we have a list of $Y_k$ edges where we know only the start but not the end (except that the end should be in $\{1,\dots,k\}$). We then randomly select a subset by throwing a coin with success probability $1/k$ for each of the $Y_k$ edges; these edges end at $k$ and are removed from the list, and thus $Z_k$ is the number of them. This determines also $J_k$ by (2.1), and if $J_k=1$, we add two new edges starting at $k$ to our list. It is evident that this gives the same distribution of random edges as the original algorithm above. (It is here important that the two edges from a given vertex are chosen with replacement, so that we can treat the $Y_k$ edges passing over the gap between $k+1$ and $k$ as independent. Note that the endpoints of these edges are uniformly distributed on $\{1,\dots,k\}$.)

It follows from the modified procedure that $Y_{n-1},\dots,Y_1$ is a Markov chain. More precisely, let $\mathcal F_k$ be the $\sigma$-field generated by our coin tosses at vertices $n-1,\dots,k+1$, and note that these coin tosses determine $Y_k$ (and also $Y_{n-1},\dots,Y_{k+1}$). Then, for $1\le k\le n-1$, conditioned on $\mathcal F_k$, $Z_k$ has a binomial distribution
$$ Z_k \mid \mathcal F_k \in \mathrm{Bin}(Y_k, 1/k). \qquad (2.3) $$
Thus (2.2) and (2.3) give a stochastic recursion of Markov type for $Y_k$. Note that $\mathcal F_k\subset\mathcal F_{k-1}$, so $\mathcal F_1,\dots,\mathcal F_{n-1}$ form a decreasing sequence of $\sigma$-fields, i.e., a reverse filtration. We therefore may change sign of the indices and consider, for example, $Y_{-j}$ and $\mathcal F_{-j}$ for $j\in\{-(n-1),\dots,-1\}$ so that we have a filtration of the standard type.
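The modified procedure amounts to the following one-dimensional Markov chain (a sketch of ours, using the recursion $Y_{k-1}=Y_k-Z_k+2J_k$ with $Z_k$ binomially thinned at rate $1/k$); it reproduces the distribution of $X^{(n)}$ without ever building the graph.

```python
import random

def simulate_X(n, rng=random):
    # Reverse Markov chain for d = 2: Y_{n-1} = 2; given F_k,
    # Z_k ~ Bin(Y_k, 1/k), J_k = 1{Z_k >= 1}, Y_{k-1} = Y_k - Z_k + 2*J_k.
    Y, X = 2, 1                       # X counts red vertices, starting with n itself
    for k in range(n - 1, 0, -1):
        Z = sum(rng.random() < 1.0 / k for _ in range(Y))
        J = 1 if Z >= 1 else 0
        X += J
        Y = Y - Z + 2 * J
    return X

rng = random.Random(3)
mean = sum(simulate_X(400, rng=rng) for _ in range(1000)) / 1000
# mean should be near 2.09 * sqrt(400) ≈ 42, up to finite-size effects
```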
The recursion (2.2)-(2.3) yields, for $2\le k\le n-1$, the estimate (2.4); we obtain also, by Markov's inequality, (2.5).

2.2. A reverse supermartingale and some estimates. We define, for $0\le k\le n-1$,
$$ W_k := (k+1)\,Y_k, $$
and find from (2.5) that $\mathbb E(W_{k-1}\mid\mathcal F_k)\le W_k$. This shows that $W_{-j}$, $-(n-1)\le j\le 0$, is a supermartingale for the filtration $(\mathcal F_{-j})$; in other words, $W_0,\dots,W_{n-1}$ is a reverse supermartingale. We have the initial value $W_{n-1}=nY_{n-1}=2n$. We thus have the Doob decomposition (2.9), $W_k=M_k-A_k$, where $(M_k)$ is a reverse martingale and $(A_k)$ is positive and reverse increasing: (2.7) yields
$$ 0 = A_{n-1}\le\dots\le A_1\le A_0. \qquad (2.12) $$
In particular, $W_k\le M_k$ and (2.13) holds. We note also from (2.4) the exact formula (2.14).
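The supermartingale property can be verified directly from the recursion and the binomial thinning; the following short computation (our reconstruction of the omitted step, with $W_k:=(k+1)Y_k$) uses only $\mathbb E(Z_k\mid\mathcal F_k)=Y_k/k$ and $\mathbb E(J_k\mid\mathcal F_k)=1-(1-1/k)^{Y_k}\le Y_k/k$:

```latex
\begin{align*}
\mathbb E(Y_{k-1}\mid\mathcal F_k)
  &= Y_k - \mathbb E(Z_k\mid\mathcal F_k) + 2\,\mathbb E(J_k\mid\mathcal F_k)
   \le Y_k - \frac{Y_k}{k} + \frac{2Y_k}{k} = \frac{k+1}{k}\,Y_k, \\
\mathbb E(W_{k-1}\mid\mathcal F_k)
  &= k\,\mathbb E(Y_{k-1}\mid\mathcal F_k) \le (k+1)\,Y_k = W_k .
\end{align*}
```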

Phase I: a Yule process
In this section we consider the first part of the evolution of the red dag $\hat D_n$, and consider the variables $Y_{n-1},\dots,Y_{n_1}$, where (for definiteness) we let $n_1:=\lfloor n/\log n\rfloor$. (The arguments work with $n_1$ as any (deterministic) sequence of integers such that $n_1/n\to0$ slowly; in particular, any such sequence with $n_1\ge n/\log n$ will also do. We leave it to the reader to see precisely how small $n_1$ can be.) We will show that the variables $Y_{n-1},\dots,Y_{n_1}$ can be approximated (as $n\to\infty$) by a time-changed Yule process.
Recall that the Yule process is a continuous-time branching process, where each particle lives a lifetime that has an exponential $\mathrm{Exp}(1)$ distribution, and then the particle splits into two new particles. (All lifetimes are independent.) Let $\mathcal Y_t$ be the number of particles at time $t$. The standard version, which we denote by $\mathcal Y^1_t$, starts with one particle at time 0, but we start with $\mathcal Y_0=2$; thus the process $\mathcal Y_t$ can be seen as the sum of two independent copies of the standard Yule process $\mathcal Y^1_t$. It is well known, see e.g. [2, Section III.5], that for the standard Yule process, the number of particles at time $t$ has the geometric distribution $\mathrm{Ge}(e^{-t})$ with mean $e^t$ (3.1). Moreover, $\mathcal Y^1_t/e^t\overset{\mathrm{a.s.}}{\longrightarrow}\xi$ as $t\to\infty$, where (e.g. as a consequence of (3.1)) $\xi\in\mathrm{Exp}(1)$. Hence, $\mathcal Y_t$ has a shifted negative binomial distribution $\mathrm{NegBin}(2,e^{-t})+2$. In particular, for all $t\ge0$ we have (3.3), and, as $t\to\infty$,
$$ e^{-t}\mathcal Y_t \overset{\mathrm{a.s.}}{\longrightarrow} \xi := \xi_1+\xi_2 \in \Gamma(2), $$
with $\xi_1,\xi_2\in\mathrm{Exp}(1)$ independent, so that their sum has a Gamma distribution.

We may also regard the Yule process $\mathcal Y$ as an infinite tree (the Yule tree), with one vertex $\gamma_0:=0$ (the root), and one vertex $\gamma_i$ at each time a particle splits (a.s. these times are distinct, and we may number them in increasing order); each particle is then represented by an edge from its time of birth to its time of death. Note that $\mathcal Y_t$, the number of living particles, equals the number of edges alive at time $t$, and that the number of particles that have died before (or at) $t$ is $\mathcal Y_t-2$.
We now change time by the mapping $t\mapsto e^{-t}$; thus the vertices in the Yule tree are mapped to the points $e^{-\gamma_i}\in(0,1]$. The root is now at 1, and edges go from a larger label to a smaller. If a particle is born at one of these times $x=e^{-\gamma_i}$, and its lifetime in the original Yule process is $\tau\in\mathrm{Exp}(1)$, then it lives there from $\gamma_i$ to $\gamma_i+\tau$, and after the time change it is represented by an edge from $x=e^{-\gamma_i}$ to $e^{-(\gamma_i+\tau)}=xe^{-\tau}=xU$, where $U:=e^{-\tau}\in\mathrm U(0,1)$ has a uniform distribution. Going backwards in time, we thus begin with two particles (edges) starting at 1. Each edge starting at a point $x$ has endpoints $xU^1_x$ and $xU^2_x$, where $U^1_x,U^2_x\in\mathrm U(0,1)$, and all these uniform random variables are independent. As before, we start two new edges at each endpoint. We let $\hat{\mathcal Y}$ denote this (infinite) random tree with vertices in $(0,1]$, and let $\hat{\mathcal Y}_x$ be the number of particles (edges) alive at time $x$.
We may now compare the time-changed Yule tree to the red dag $\hat D_n$ constructed above, scaled to $[0,1]$. An edge from a vertex $k$ ends at a vertex uniformly distributed on $\{1,\dots,k-1\}$, which we may construct as $\lfloor(k-1)U\rfloor+1$, where $U\in\mathrm U(0,1)$. We thus start with one point at $n$, and add again two edges from it and from the endpoint of every edge (except at 1), where now an edge started at $j+1$ goes to $\lfloor jU\rfloor+1$ with $U\in\mathrm U(0,1)$. However, if two or more edges have the same endpoint, we still only start two new edges there.
A point in $\hat D_n$ that is $m$ generations away from the root thus has a label of the form in (3.5), for some $U_{\nu_1},\dots,U_{\nu_m}\in[0,1]$ (from the construction of the edges). Let $\hat D'_n$ denote the random red dag $\hat D_n$ with all labels divided by $n$; thus the vertices are now points in $(0,1]$. We then see that $\hat D'_n$ coincides with the time-changed Yule tree up to small errors. More precisely, we couple the two by first constructing the Yule tree $\mathcal Y$, and its time-changed version $\hat{\mathcal Y}$, and then making a perturbation of $\hat{\mathcal Y}$ by replacing each label $U_{\nu_1}\cdots U_{\nu_m}$ by $X/n$ with $X$ as in (3.5). This gives a dag that coincides (in distribution) with $\hat D'_n$ until the first time that two edges in $\hat D'_n$ have the same endpoint.

Theorem 3.1. We may w.h.p. couple the random dag $\hat D'_n$ and the time-changed Yule tree $\hat{\mathcal Y}$, such that considering only vertices with labels in $[n_1/n,1]$, and edges with the starting point in this set, there is a bijection between these sets of vertices in the two models which displaces each label by at most $\log^2 n/n$, and a corresponding bijection between the edges (preserving the incidence relations).
Proof. We have $\hat{\mathcal Y}_x=\mathcal Y_{-\log x}$ for every $x\in(0,1]$, and thus (3.7) follows by (3.3). The number of vertices with labels in $[x,1]$ is $\hat{\mathcal Y}_x-1$, and taking $x=n_1/n\approx1/\log n$, we thus have $O_p(\log n)$ vertices; in particular, w.h.p. less than $\log^2 n$ vertices. Consequently, w.h.p., the number of generations from the root to any point in $[n_1/n,1]$ is at most $\log^2 n$, and then the bound (3.6) shows that all vertex displacements are at most $(\log n)^2/n$.
Furthermore, it follows from (3.7) that the expected number of vertices in $\hat{\mathcal Y}$ that are within $(\log n)^2/n$ from $n_1/n$ is $o(1)$, and thus w.h.p. no vertex is pushed across the boundary $n_1/n$ by the displacements in the coupling. Finally, it follows from Lemma 2.2 that the probability that two edges in the dag $\hat D_n$ have the same endpoint $k$ for some $k\ge n_1$ is $o(1)$. Consequently, w.h.p. the coupling above between $\hat{\mathcal Y}$ and $\hat D_n$ yields a bijection for vertices in $[n_1/n,n]$ and their edges.
We define a random variable that will play an important role later: let $\Xi=\Xi^{(n)}$ be defined by (3.10). Hence, recalling (2.6), we obtain (3.16)-(3.17). The convergence in probability in (3.16)-(3.17) depends on the coupling used above, but it follows that convergence in distribution holds also without it, which completes the proof.

Phase II: a boring flat part
Let n 2 " n pnq 2 be any sequence of integers with ?n !n 2 ď n 1 .We will show that in the range n 1 ě k ě n 2 , the variable W k essentially does not change, so it is equal to a random constant.We begin with two lemmas valid for larger ranges.
We have, using (3.10) and where the convergence follows by (4.4) and Lemma 4.1.
Theorem 4.3. As $n\to\infty$,

Proof. We have, for any $k$, and thus the result follows from Lemmas 4.1 and 4.2.

Phase III: deterministic decay from a random level
We extend the processes $W_k$, $M_k$ and $A_k$ to real arguments $t\in[0,n-1]$ by linear interpolation. Since the extended version $A_t$ is piecewise linear, it is differentiable everywhere except at integer points, where we (arbitrarily) take the left derivative. The result (5.1) follows by Lemma 2.1.
We rescale and define $\hat A^{(n)}_t$ accordingly. Recall also that $C[a,b]$ is the (Banach) space of continuous functions on $[a,b]$. Theorem 5.3 below asserts the convergence (5.7) for every fixed $b>0$.
Remark 5.4. We may note that (5.7) means convergence, in probability, in the space $C[0,\infty)$ with its standard topology (uniform convergence on compact sets). Equivalently, we may consider the step functions $n^{-1}W^{(n)}_{\lfloor t\sqrt n\rfloor}$ and convergence in $D[0,\infty)$. △

Proof. We divide the proof into several steps.
Step 1: A subsequence. By Lemma 5.2 and Prohorov's theorem [4, Theorem 6.1], for every compact interval $[\delta,b]\subset(0,\infty)$ we can find a subsequence $(n_\nu)$ such that, along the subsequence, (5.8) holds for some continuous random function $A_{\delta,b}(t)$ on $[\delta,b]$. Furthermore, it suffices to consider a countable set of such intervals, for example $\mathcal I:=\{[m^{-1},m],\,m\ge2\}$, and by considering convergence in the product space $\prod_{[\delta,b]\in\mathcal I}C[\delta,b]$ we can find a subsequence such that (5.8) holds jointly for all compact intervals $[\delta,b]\in\mathcal I$; by adding a factor $\mathbb R^2$, we may also assume that this holds jointly with (3.11) and (4.3). We consider until the last step of the proof only this subsequence.
Step 2: A coupling. By the Skorohod coupling theorem [12, Theorem 4.30], we may couple $D_n$ for different $n$ such that the convergence in (5.8) holds a.s. for every $[\delta,b]\in\mathcal I$, and also (3.11) uniformly on each compact interval in $(0,\infty)$.
Step 5: Uncoupling. The a.s. convergence in (5.33) depends on the chosen coupling of $D_n$ for different $n$, but this yields (5.33) with convergence in probability in general, i.e., (5.7).
Step 6: Conclusion. We have so far proved (5.7) only for a subsequence, but the same proof shows that every subsequence has a subsubsequence such that (5.7) holds, which as is well known implies that (5.7) holds for the full sequence, see e.g. [11, Section 5.7].

The number of descendants
Recall that the random variable $X=X^{(n)}$ is the number of descendants of $n$, i.e. red vertices, and thus, counting the root $n$ separately,
$$ X = 1 + \sum_{k=1}^{n-1} J_k. \qquad (6.1) $$
We make a Doob decomposition similar to (2.9); in this case it takes the form (6.2), where $(L_k)_0^{n-1}$ is a reverse martingale with $L_{n-1}=0$: $\mathbb E(L_{k-1}\mid\mathcal F_k)=L_k$, and, using (2.1) and (2.3), $(B_k)$ is positive and increasing backwards. By (6.4) and Lemma 2.2, for every $k\le n-1$, (6.6) holds. This is too coarse for small $k$; however, since $0\le J_i\le1$ for every $i$, we also have $B_0-B_\ell\le\ell$ for every $\ell\le n-1$. Hence, (6.6) implies (6.7), and thus the (reverse) martingale property of $(L_k)$ yields (6.8). In particular, $L_0/\sqrt n\overset{p}{\longrightarrow}0$, which will show that $L_0$ is negligible in (6.2).

Lemma 6.1. As $n\to\infty$,
$$ \frac{X^{(n)}}{\sqrt n} - \frac{\pi}{2}\sqrt{\Xi^{(n)}} \overset{p}{\longrightarrow} 0. \qquad (6.9) $$
Thus,
$$ \frac{X^{(n)}}{\sqrt n} \overset{d}{\longrightarrow} \frac{\pi}{2}\sqrt\xi \qquad (6.10) $$
with $\xi\in\Gamma(2)$.
Proof. For convenience, we use the Skorohod coupling theorem as in the proof of Theorem 5.3; we may thus assume that all a.s. convergence results in the proof of Theorem 5.3 hold. (We may for simplicity consider the same subsequence as in the proof of Theorem 5.3, and then draw the conclusion for the full sequence as there; alternatively, we may argue that now when Theorem 5.3 is proved, we may consider the full sequence when we apply the Skorohod coupling theorem.) In particular, (5.9)-(5.10) hold a.s. Recall from (6.12) that $B_0=\sqrt n\,\hat B^{(n)}_0$. The results (6.9)-(6.10) now follow from (6.18) by (6.2), (6.8), and (5.10).
Proof of Theorem 1.3, first part. The limit in distribution (1.2) follows immediately from (6.10), using the well-known facts that $\chi_4^2\in\chi^2(4)$ and thus $\chi_4\overset{d}{=}\sqrt{2\xi}$ with $\xi\in\Gamma(2)$.
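The identity used here, $\chi_4\overset{d}{=}\sqrt{2\xi}$ with $\xi\in\Gamma(2)$, can be checked numerically (our own sketch): the mean of $\sqrt{2\xi}$ should match $\mathbb E\chi_4=3\sqrt{2\pi}/4\approx1.880$.

```python
import random, math

rng = random.Random(11)
# xi in Gamma(2): sum of two independent Exp(1) variables
samples = [math.sqrt(2 * (rng.expovariate(1.0) + rng.expovariate(1.0)))
           for _ in range(20000)]
mean = sum(samples) / len(samples)
# E chi_4 = sqrt(2) * Gamma(5/2) / Gamma(2) = 3 * sqrt(2*pi) / 4 ≈ 1.8800
```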

Higher moments
In this section we prove some inequalities for higher moments. We do not care about exact constants, and we use the convention that $c_p$ stands for constants that may (and will) depend on the parameter $p$, but not on $n$; the value of $c_p$ may change from one occurrence to another.
We consider first the reverse martingale $M_k$. We define the maximal function $M^*:=\max_{0\le k\le n-1}|M_k|$, the martingale differences $\Delta M_k$ for $n-1\ge k\ge1$ (recalling (2.10), (2.6) and (2.2)), and the conditional square function $s(M)$. We use one of Burkholder's martingale inequalities [7, Theorem 21.1], [11, Corollary 10.9.1] on the martingale $M_k-M_{n-1}=M_k-2n$, which yields (7.4). (This is valid for any $p>0$, although we only use $p\ge2$.)

Lemma 7.1. For every $p>0$,
$$ \mathbb E\,(M^*)^p \le c_p\,n^p. \qquad (7.5) $$

Proof. By Lyapunov's inequality, it suffices to prove (7.5) for $p=2^j$, $j\ge1$ integer. We use induction on $j$. The base case $p=2$ is proved in Lemma 2.1. In the rest of the proof, we thus assume $p\ge4$ and that (7.5) holds for the exponent $p/2$ (or smaller). We use (7.4), and it remains to estimate the two last terms on its right-hand side. First, by (7.2) and (2.17), we bound the differences. Hence, (7.3) yields
$$ s(M) \le \sqrt{10\,n\,M^*}, \qquad (7.7) $$
and the induction hypothesis yields (7.8). Similarly, since $J_k$ has a conditional Bernoulli distribution, and using (2.24), we obtain (7.12), and thus, using again (2.20), (7.14). Hence, (7.2), (7.12) and (7.14) yield (7.15). The induction step follows by (7.4), (7.8) and (7.15), which completes the proof.
We proceed to our main objective, the number $X$ of vertices in $\hat D_n$.
Proof of Theorem 1.3, conclusion. Lemma 7.2 shows that $\mathbb E\,|X^{(n)}/\sqrt n|^p=O(1)$ for every fixed $p>0$. By a standard argument, see e.g. [11, Theorems 5.4.2 and 5.5.9], this implies uniform integrability of the sequence $|X^{(n)}/\sqrt n|^p$ for every $p>0$, and thus convergence of all moments in (6.10). (Recall that convergence in distribution was proved in Section 6.) Finally, (1.3)-(1.4) now follow from the formula
$$ \mathbb E\,\chi_4^r = 2^{r/2}\,\Gamma(r/2+2), \qquad r>0, $$
which is a simple consequence of (1.1), or of (6.19) and (1.8). This completes the proof.

Higher degree d
We have so far considered the random 2-dag, with outdegree $d=2$. The arguments and results above extend to any constant $d\ge2$ with minor modifications which we sketch here, omitting straightforward details. We let $d\ge2$ be fixed, and let $c$ and $c_p$ denote constants that may depend on $d$ (and $p$); these may change value from one occurrence to the next. Note that the case $d=2$ treated above is included as a special case below.
We define $Y_k$, $Z_k$, $J_k$, and $\mathcal F_k$ as in Section 2; thus $Y_{n-1}=d$, and (2.1) and (2.3) still hold, but (2.2) is replaced by
$$ Y_{k-1} = Y_k - Z_k + d\,J_k. $$
Then the analogues of (2.4)-(2.5) hold with $d$ in place of 2. We now define, letting $m^{(\ell)}:=m(m+1)\cdots(m+\ell-1)$ denote the rising factorial,
$$ W_k := (k+1)^{(d-1)}\,Y_k. $$
Then (8.3) yields that again $W_k$ is a reverse supermartingale, with a Doob decomposition (2.9) where now $A_k$ is defined correspondingly. We still have (2.16), up to the numerical constants (which depend on $d$), and Lemmas 2.2-2.3 take the corresponding form. The moment estimates in Section 7 extend too. We find $s(M)\le c\sqrt{n^{d-1}M^*}$ and obtain by induction, for every $p>0$,
$$ \mathbb E\,(M^*)^p \le c_p\,n^{(d-1)p}, $$
and thus the same bounds hold after the same time change as before. We may choose $n_1:=\lfloor n/\log n\rfloor$ as in Section 3, and then Theorem 3.1 holds, except that $\log^2 n/n$ is replaced by $\log^d n/n$. In Section 4, we now choose $n_2=n^{(d-1)/d}$, and the analogues of Lemmas 4.1-4.2 hold uniformly on compact intervals in $(0,\infty)$. This leads to a differential equation (instead of (5.17)) with an explicit solution, where again we find $C=\xi$ a.s. Finally, we extend the convergence to $[0,\infty)$ as above, and reach the conclusion (generalizing Theorem 5.3) that the corresponding convergence holds for every fixed $b>0$.
In Section 6, we replace the formulas there by their analogues for general $d$. This leads to the corresponding limit, where the integral is evaluated by a substitution yielding a Beta integral [17, 5.12.3, together with 5.12.1 and 5.5.3].

Further results
We give here some further results on the structure of the random dag $\hat D_n$. Again, we consider for simplicity only the case $d=2$, and leave the straightforward extensions to larger $d$ to the reader.
Let $X^{(n)}_{a,b}:=\bigl|\hat D_n\cap(a\sqrt n,b\sqrt n]\bigr|$ be the number of descendants of $n$ (red vertices) in the interval $(a\sqrt n,b\sqrt n]$. (Thus, $X^{(n)}=X^{(n)}_{0,\infty}$.) Then, Lemma 6.1 can be extended:

Lemma 9.1. If $0\le a\le b\le\infty$ are fixed, then as $n\to\infty$, (9.2) holds, and thus also (9.3) and, unconditionally, (9.4), where
$$ p(t) := \mathbb E\,\frac{\xi}{\xi+t^2} = \int_0^\infty \frac{x}{x+t^2}\,x e^{-x}\,\mathrm dx. \qquad (9.5) $$

Proof. If $0<a\le b<\infty$, let $k_a:=\lfloor a\sqrt n\rfloor$ and $k_b:=\lfloor b\sqrt n\rfloor$. Then (6.3)-(6.4) apply, provided $n$ is so large that $b\sqrt n\le n_1$. Convergence in probability in (9.2) then follows from (6.12) and (6.15) together with (6.8) (and, for example, Doob's inequality), and as always (5.10). If $a=0$ or $b=\infty$, the result follows similarly, using also (6.16)-(6.17) as in the proof of Lemma 6.1. Thus, (9.2) holds in probability. This implies convergence also in $L^1$, since uniform integrability holds by the domination in (9.7), and these variables are uniformly integrable by Lemmas 7.2 and 7.1. Next, (9.3) follows from (9.2) by taking the conditional expectation, and (9.4) follows by taking the unconditional expectation, using (3.11) and Fubini's theorem, and again the uniform integrability of (9.7). The final equality in (9.4) follows by taking the conditional expectation with respect to $\Xi^{(n)}$, assuming that $n$ is so large that $b\sqrt n\le n_1$.

Furthermore, if $n_1\ge k>\ell\ge1$, then when the evolution comes to $k$, we have $Y_k$ red edges, and each of them ends at $k$ with probability $1/k$. We have the same probability for each of these edges to end at $\ell$ instead, and since the endpoints are independent, we see that conditioned on $\mathcal F_k$, $Z_\ell$ is stochastically larger than $Z_k$. (Larger, since there may also be red edges ending at $\ell$ that start at $k$ or later.) Hence, $\mathbb E(J_\ell\mid\Xi^{(n)})\ge\mathbb E(J_k\mid\Xi^{(n)})$. In other words, $\mathbb E(J_k\mid\Xi^{(n)})$ is decreasing in $k\in[1,n_1]$. The same obviously holds for $\Xi^{(n)}/(\Xi^{(n)}+k^2/n)$. Consequently, the required comparison holds with $k_b:=\lfloor b\sqrt n\rfloor$.

Consequently, w.h.p. $\hat D_n$ and $\hat D_{n+1}$ are independent until $n_1$; more formally, we may couple the pair $(\hat D_n,\hat D_{n+1})$ with a pair $(\hat D'_n,\hat D'_{n+1})$ of independent copies of them such that the two pairs w.h.p.
coincide until $n_1^{(n)}$. In particular, this and the definition (3.10) show that the pair $(\Xi^{(n)},\Xi^{(n+1)})$ can be coupled with a pair of independent copies of them (defined in the same way from $\hat D'_n$ and $\hat D'_{n+1}$) such that the two pairs coincide w.h.p. Consequently, Lemma 3.2 implies that $(\Xi^{(n)},\Xi^{(n+1)})\overset{d}{\longrightarrow}(\xi,\xi')$, where $\xi,\xi'\in\Gamma(2)$ are independent. The result then follows by (6.9).
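The function $p(t)$ defined in (9.5), $p(t)=\mathbb E[\xi/(\xi+t^2)]$ with $\xi\in\Gamma(2)$, can be evaluated both by Monte Carlo and by direct numerical integration against the $\Gamma(2)$ density $xe^{-x}$ (our own check; for instance $p(1)=e\,E_1(1)\approx0.5963$, using the exponential-integral expression of Remark 9.2).

```python
import random, math

rng = random.Random(5)
t = 1.0

# Monte Carlo: p(t) = E[xi / (xi + t^2)], xi ~ Gamma(2) = Exp(1) + Exp(1)
N = 200000
mc = 0.0
for _ in range(N):
    x = rng.expovariate(1.0) + rng.expovariate(1.0)
    mc += x / (x + t * t)
mc /= N

# Quadrature: integral of (x / (x + t^2)) * x * exp(-x) dx over (0, 40)
h = 0.001
quad = sum((x / (x + t * t)) * x * math.exp(-x) * h
           for x in (i * h for i in range(1, 40000)))
```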
This result may seem surprising, since we have seen that most vertices $k$ in $\hat D_n$ and $\hat D_{n+1}$ have $k$ of the order $\sqrt n$, and that in this range, the density of vertices is high, which means that $\hat D_n$ and $\hat D_{n+1}$ necessarily have a large number of common vertices. Since $\hat D_n$ and $\hat D_{n+1}$ have the same descendants of any common vertex, it follows that the graphs $\hat D_n$ and $\hat D_{n+1}$ are strongly dependent. Nevertheless, the proof above shows that $\hat D_n$ and $\hat D_{n+1}$ are essentially independent in the first phase, which determines $\Xi^{(n)}$ and $\Xi^{(n+1)}$. Almost all vertices that contribute to $X^{(n)}$ and $X^{(n+1)}$ are in the later dense phase, where there is strong dependence, but this does not prevent the asymptotic independence of $X^{(n)}$ and $X^{(n+1)}$, because in this phase there are so many vertices and edges that the evolution is governed by a law of large numbers and is essentially deterministic; hence the strong dependence here does not matter.

Remark 9.5. We considered above $X^{(n)}$ and $X^{(n+1)}$ only to be concrete. The result extends to $X^{(n'_\nu)}$ and $X^{(n''_\nu)}$ for any two sequences $n'_\nu$ and $n''_\nu$ that tend to infinity, with $n'_\nu<n''_\nu$. (This follows by the same proof, where we treat the cases $(n''_\nu)_1\le n'_\nu$ and $(n''_\nu)_1>n'_\nu$ separately.)

Some variations
We consider here the two variations mentioned in the introduction, and show that the same results hold for them too.

10.1. Several roots. We may start with any given number $m\ge1$ of roots, and then add $n-m$ vertices with outdegree $d$ recursively as above. (We assume $1\le m\le n$.) Denote the resulting random $d$-dag by $D_{n,m}$, and let $\hat D_{n,m}$ be the subgraph consisting of all vertices and edges that can be reached from $n$.
Note that $D_{n,m}$ can be obtained from $D_n$ by simply removing all edges between the roots, i.e., all edges within $[1,m]$. Consequently, $D_{n,m}$ and $D_n$ have the same descendants in the interval $(m,n]$, and (10.1) follows.

Proof. An immediate consequence of (10.1).
We may also obtain results for larger $m$. For simplicity we consider only the case $d=2$. Define, for $\mu>0$ and $x>0$, the function in (10.2); it then follows from Lemma 9.1 that (10.3) holds. Let $R_{n,m}:=|\hat D_n\cap[1,m]|$ be the number of roots that are descendants of $n$. When the procedure in Section 2 reaches $m$, there are $Y_m$ edges left. Each of these selects an endpoint in $\{1,\dots,m\}$ at random, uniformly and independently, and $R_{n,m}$ is the number of vertices in $\{1,\dots,m\}$ that are selected at least once. (This is a classical occupancy problem, often described as throwing $Y_m$ balls into $m$ cells.) Conditioned on $\mathcal F_m$, each vertex $k\le m$ thus has the same probability $\mathbb E J_k=1-(1-\frac1m)^{Y_m}$ of becoming red. The covariances can easily be calculated, but we note instead that if we also condition on $J_k=0$, this increases the probability that $J_\ell=1$ for every $\ell\ne k$; thus $\mathrm{Cov}(J_k,J_\ell\mid\mathcal F_m)\le0$, and the conditional variance of $R_{n,m}$ is at most the sum of the individual variances.

The analysis in Section 2 is based on the independence of the endpoints of different edges; this is no longer true since edges from the same vertex now are dependent. However, a minor variation of the arguments allows us to reach the same conclusions. For simplicity, we consider again the case $d=2$, and leave the straightforward generalization to higher $d$ to the reader. We use the same notation as above, with the additions below.
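The occupancy computation above can be illustrated with a tiny simulation (ours): throwing $Y_m$ balls into $m$ cells, the expected number of occupied cells is $m(1-(1-1/m)^{Y_m})$, matching the formula for $\mathbb E J_k$ summed over the $m$ roots.

```python
import random

def occupied(m, balls, rng):
    # throw `balls` balls uniformly into m cells; count occupied cells
    return len({rng.randint(1, m) for _ in range(balls)})

rng = random.Random(9)
m, balls, trials = 10, 7, 20000
mean = sum(occupied(m, balls, rng) for _ in range(trials)) / trials
exact = m * (1 - (1 - 1 / m) ** balls)   # = 10 * (1 - 0.9**7) ≈ 5.217
```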
Say that the two edges starting together from a red vertex are twins. We thus now do not allow two twins to have the same endpoint.
Consider the $Y_k$ red edges that cross the gap between $k+1$ and $k$. Some of these come in pairs of twins, while others are single (because their twin has already found an endpoint). Let $Y_{k,1}$ be the number of single edges, and $Y_{k,2}$ the number of pairs of twins among these edges. Thus
$$ Y_k = Y_{k,1} + 2Y_{k,2}. \qquad (10.12) $$
Similarly, let $Z_{k,1}$ be the number of single edges that end at $k$, and let $Z_{k,2}$ be the number of edges that end in $k$ and still have a living twin (that will later find an endpoint $\ell<k$). The rest of Section 2 holds with minor changes: the numerical constants in inequalities may change (perhaps including cases where we had constant 1), we estimate (conditional) variances of $Z_{k,1}$ and $Z_{k,2}$ separately in (2.16), the exact formula in (2.28) is modified as above, and the equality in (2.25) is modified; we omit the details.
In Section 3, we note that for the version studied in the previous sections (drawing with replacement), the probability that the two twins starting at $k$ have the same endpoint is $1/(k-1)$. Hence, the expected number of such collisions among twins starting at some $k \ge n_1$ (with $J_n := 1$) is small, and thus w.h.p. there are no such collisions; this means that we may couple the versions drawing with and without replacement such that w.h.p. they coincide on the interval $[n_1, n]$. Consequently, Theorem 3.1, giving a coupling with the Yule process, holds also for drawing without replacement. The results in Sections 3–7 now hold as before, with some numerical constants changed and a few minor changes. The most important is the modification of (5.14); there is a similar modification in (6.13), but again the conclusion (6.14) holds by (10.26). In Section 7, we argue as in (7.10) for $Z_{k,1}$ and $Z_{k,2}$ separately. Hence, Theorem 1.3 holds also for drawing without replacement. (And so does Theorem 1.4, by similar arguments.)
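The collision probability $1/(k-1)$ is immediate, since the second twin independently picks the same endpoint as the first with probability $1/(k-1)$; a quick simulation sketch confirms this (the parameter values and function name are ours):

```python
import random

def twin_collision_frequency(k, trials=200_000, seed=1):
    """Frequency with which the two edges from vertex k, drawn with
    replacement from {1, ..., k-1}, share their endpoint."""
    rng = random.Random(seed)
    coll = sum(rng.randrange(1, k) == rng.randrange(1, k)
               for _ in range(trials))
    return coll / trials

k = 11  # illustrative; the exact collision probability is 1/(k-1) = 0.1
print(twin_collision_frequency(k))  # close to 0.1
```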

9.1. Density of descendants. The proof of Theorem 1.3 shows that most vertices in $\widehat{D}_n$ are in the range $O(\sqrt{n})$. More precisely, let $0 \le a \le b \le \infty$, and let $X^{(n)}_{a,b} := |\widehat{D}_n \cap (a\sqrt{n}, b\sqrt{n}]|$, the number of descendants of $n$ with labels in $(a\sqrt{n}, b\sqrt{n}]$.
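To illustrate the definition of $X^{(n)}_{a,b}$, the following sketch builds the random 2-dag (one root, drawing with replacement), finds the descendants of $n$ by exploring downwards as in Section 2, and counts those with labels in $(a\sqrt{n}, b\sqrt{n}]$; the names and parameter values are ours:

```python
import random

def descendants_in_range(n, a, b, seed=1):
    """Count descendants of n (including n itself) in the random 2-dag
    D_n whose labels lie in (a*sqrt(n), b*sqrt(n)]."""
    rng = random.Random(seed)
    # each vertex k >= 2 gets two parents, chosen uniformly with replacement
    parents = {k: (rng.randrange(1, k), rng.randrange(1, k))
               for k in range(2, n + 1)}
    red = {n}                  # the "red" vertices: descendants of n
    for k in range(n, 1, -1):  # downward exploration, as in the text
        if k in red:
            red.update(parents[k])
    lo, hi = a * n ** 0.5, b * n ** 0.5
    return sum(1 for v in red if lo < v <= hi)

n = 10_000
print(descendants_in_range(n, 0.0, float("inf")))  # all of the red dag; typically of order sqrt(n)
```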

10.1. Several roots.
Theorem 10.1. If the process starts with $m = o(n^{(d-1)/d})$ roots, and we thus define $X^{(n)} := |\widehat{D}_{n,m}|$, then the results in Theorems 1.3 and 1.4 still hold.

10.2. Drawing without replacement. Consider now the case when the endpoints of the $d$ edges from a vertex $k$ are selected by drawing without replacement; in other words, the endpoints form a uniformly random subset of $\{1,\dots,k-1\}$ with $d$ elements. (We start with $m \ge d$ roots.) Thus there are no multiple edges, and $D_n$ is a simple graph.
and (4.3) hold a.s. Since convergence in $C[\delta,b]$ means uniform convergence, this means that a.s. $\widehat{A}^{(n)}_t \to A_{\delta,b}(t)$ uniformly on $[\delta,b]$ for each $[\delta,b] \in I$. It is evident that a.s. the different limits $A_{\delta,b}(t)$ have to agree whenever the intervals overlap, and thus there exists a continuous random function $A(t)$, defined on $(0,\infty)$, such that a.s. $\widehat{A}^{(n)}_t \to A(t)$ uniformly on each compact interval, and thus also $\xi - \widehat{A}^{(n)}_t \to \xi - A(t)$.

It follows from (5.9) and (5.15) that a.s., if $0 < t_1 < t_2 < \infty$, a corresponding relation holds; this and the definition of $B(t)$ in (5.13) yield a differential equation for $A(t)$, which we solve as follows. First, (5.13) implies that a.s., uniformly on each compact interval in $(0,\infty)$, a differential equation holds, with the solution, for some $c \in \mathbb{R}$,
$\log\bigl(e^{B(t)} - 1\bigr) = c - 2\log t$ (5.22),
and thus, with $C := e^c > 0$,
$B(t) = \log\bigl(1 + C/t^2\bigr)$, $t > 0$ (5.23).
Note that the constants $c$ and $C$ may be random. We have shown that (5.23) holds a.s. for some random $C$, and (5.13) then yields $A(t) = \xi - t^2 B(t) = \xi - t^2 \log\bigl(1 + C/t^2\bigr)$.

Then tightness holds as in Lemma 5.2, and we can argue as in the proof of Theorem 5.3, using a suitable subsequence and a suitable coupling. Then (5.9) holds a.s. uniformly on compact intervals, and (5.13) is modified correspondingly; the result follows since $\xi \in \Gamma(2)$ has density function $xe^{-x}$ by (1.7).

Remark 9.2. The function $p(t)$ can be expressed using the exponential integral $E_1(x)$; see [17, 6.2.1–2 and 6.7.1].

Lemma 9.1 says that, asymptotically, the density of descendants of $n$ around any $k < n$ is $\Xi/(\Xi + k^2/n)$ conditioned on $\Xi$, and $p(k/\sqrt{n})$ unconditionally. Another aspect of this is the following theorem, where we consider a single vertex $k$. Recall that $J_k = \mathbf{1}\{k \in \widehat{D}_n\}$ is a Bernoulli variable; hence $\mathbb{P}(J_k = 1) = \mathbb{E} J_k$, and this holds also conditionally.
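As a check of the step from (5.22) to (5.23) (elementary, but worth recording):

```latex
% Exponentiating (5.22):
e^{B(t)} - 1 = e^{\,c - 2\log t} = e^{c}\, t^{-2} = \frac{C}{t^{2}},
\qquad C := e^{c} > 0,
% and hence
B(t) = \log\Bigl(1 + \frac{C}{t^{2}}\Bigr), \qquad t > 0,
% which is (5.23); inserting this into (5.13) gives
A(t) = \xi - t^{2} B(t) = \xi - t^{2}\log\Bigl(1 + \frac{C}{t^{2}}\Bigr).
```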
Different $n$ yield asymptotically independent results. As noted in Remark 1.7, the construction naturally constructs the dags $D_n$ for all $n$ together. Using this coupling, we may consider the joint distribution of, for example, $X^{(n)}$ and $X^{(n+1)}$. Somewhat surprisingly, $X^{(n)}$ and $X^{(n+1)}$ are asymptotically independent.

Proof. Consider the evolutions of the red dags $\widehat{D}_n$ and $\widehat{D}_{n+1}$ together, starting at $n$ and $n+1$ and going down, as always; these evolutions are independent until they first have a common vertex. The probability that $k$ is the first common vertex is thus at most the probability that two independent versions of $\widehat{D}$ both contain $k$ (treating the first part separately). Furthermore, the theorem extends to any finite number of such sequences. $\triangle$

What is the asymptotic behaviour of the number of common vertices $\Upsilon^{(n)} := |\widehat{D}^{(n)} \cap \widehat{D}^{(n+1)}|$, i.e., of the vertices that are descendants of both $n$ and $n+1$? We conjecture that $\mathbb{E}\bigl(\Upsilon^{(n)}/\sqrt{n}\bigr) \to \upsilon$ for some constant $\upsilon > 0$. Show this! What is $\upsilon$? What is the asymptotic distribution of $\Upsilon^{(n)}/\sqrt{n}$ (assuming that it exists)?

Theorem 10.2. Let $d = 2$. Suppose that $m = m_n \to \infty$ such that $m/\sqrt{n} \to \mu \in (0,\infty)$.
We obtain (10.4) by summing (10.6) and (10.11), and this implies (10.3) by (3.11). Finally, moment convergence follows, since every power is uniformly integrable by $|\widehat{D}_{n,m}| \le X^{(n)}$ and Lemma 7.2. It is possible to obtain results also for the case $m/\sqrt{n} \to \infty$ by our methods, but we leave this case to the reader.
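The quantities $X^{(n)}$, $X^{(n+1)}$ and $\Upsilon^{(n)}$ in the open problem above can at least be simulated under the natural coupling, by growing one dag and exploring downwards from both $n$ and $n+1$. This sketch (our names, illustrative $n$; it does not, of course, settle the conjecture):

```python
import random

def joint_descendants(n, seed=1):
    """Build D_{n+1} (d = 2, one root, drawing with replacement) and
    return (|descendants of n|, |descendants of n+1|, |common vertices|)
    under the natural coupling, each set including its start vertex."""
    rng = random.Random(seed)
    parents = {k: (rng.randrange(1, k), rng.randrange(1, k))
               for k in range(2, n + 2)}

    def red_set(start):
        # downward exploration: descendants of `start`, including itself
        red = {start}
        for k in range(start, 1, -1):
            if k in red:
                red.update(parents[k])
        return red

    dn, dn1 = red_set(n), red_set(n + 1)
    return len(dn), len(dn1), len(dn & dn1)

x_n, x_n1, upsilon = joint_descendants(5_000)
print(x_n, x_n1, upsilon)  # both red dags contain the root 1, so upsilon >= 1
```

Averaging $\Upsilon^{(n)}/\sqrt{n}$ over many seeds gives a numerical guess for the conjectured constant $\upsilon$.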