Quasi-shuffle algebras and renormalisation of rough differential equations

The objective of this work is to compare several approaches to the process of renormalisation in the context of rough differential equations using the substitution bialgebra on rooted trees known from backward error analysis of $B$-series. For this purpose, we present a so-called arborification of the Hoffman--Ihara theory of quasi-shuffle algebra automorphisms. The latter are induced by formal power series, which can be seen to be special cases of the cointeraction of two Hopf algebra structures on rooted forests. In particular, the arborification of Hoffman's exponential map, which defines a Hopf algebra isomorphism between the shuffle and quasi-shuffle Hopf algebra, leads to a canonical renormalisation that coincides with Marcus' canonical extension for semimartingale driving signals. This is contrasted with the canonical geometric rough path of Hairer and Kelly by means of a recursive formula defined in terms of the coaction of the substitution bialgebra.


Introduction
We consider controlled differential equations of the form
$$dY_t = \sum_{i=0}^{d} f_i(Y_t)\,dX^i_t, \qquad (1)$$
where the $f_i$ are vector fields on $\mathbb{R}^e$ and $X : [0,T] \to \mathbb{R}^{d+1}$ is the driving signal.
For sufficiently regular $f_i$ and $X$, e.g., Lipschitz continuous vector fields and $X$ of bounded variation, a unique solution exists, for which we have the formal expansion
$$Y_t = \sum_{\tau \in T_A} \frac{1}{\sigma(\tau)}\,F_f[\tau](Y_s)\,X^{\tau}_{st}, \qquad (2)$$
where the sum is indexed by the set of non-planar rooted trees $\tau \in T_A$ with vertex decorations in the set $A = \{0, 1, \dots, d\}$. The symmetry factor $\sigma(\tau)$ is a combinatorial coefficient defined in §2.2. The $F_f[\tau]$ are the so-called elementary differentials corresponding to the vector fields $f_i$ and the rooted tree $\tau \in T_A$. The $X^{\tau}_{st}$ are multiple integrals of the driving signal over domains indexed by $\tau \in T_A$. The domain of integration specified by a (decorated) non-planar rooted tree involves a partial order. For instance, the integral $X^{\tau}_{st}$ corresponding to the tree $\tau \in T_A$ with root decorated by $i_1$ and two leaves decorated by $i_2$ and $i_3$ is given by
$$X^{\tau}_{st} = \int_s^t \Big(\int_s^u dX^{i_2}_v\Big)\Big(\int_s^u dX^{i_3}_w\Big)\,dX^{i_1}_u.$$
The reader is referred to Gubinelli's work [23,24] for more details. Under the assumption that iterated integrals obey classical integration by parts, the domain may be split up according to all the linear extensions of the partial order specified by the rooted tree, such that $X^{\tau}_{st}$ can be resolved into linear combinations of strictly iterated integrals. For the above example this yields
$$X^{\tau}_{st} = \int_s^t\!\!\int_s^u\!\!\int_s^v dX^{i_2}_w\,dX^{i_3}_v\,dX^{i_1}_u + \int_s^t\!\!\int_s^u\!\!\int_s^v dX^{i_3}_w\,dX^{i_2}_v\,dX^{i_1}_u.$$
Collecting the coefficients of the resulting iterated integrals in the expansion (2) yields Chen's classical word series as the solution to the controlled differential equation (1), a sum now indexed by words in the tensor algebra $T(\mathbb{R}^{d+1})$ over $\mathbb{R}^{d+1}$ with basis vectors $\{e_0, \dots, e_d\}$.
The key to Lyons' theory of rough differential equations [31,32] is that for rough signals, say $\alpha$-Hölder continuous for some $0 < \alpha < \tfrac12$, a complete set $\mathbf{X}_{st}$ of iterated integrals $X^{w}_{st}$ may be constructed canonically from a finite subset of iterated integrals. The number of iterated integrals required to complete the construction depends on the regularity of the driving path. The differential equation (1) is then interpreted as being driven by the rough path $\mathbf{X}_{st}$, and existence and uniqueness are established by means of fixed point arguments in a space of suitably controlled rough paths.
The assumption that the iterated integrals obey a conventional integration by parts relation is at times restrictive, and led Gubinelli [23,24] to introduce the notion of a branched rough path as a collection of iterated integrals X τ st indexed by rooted trees. Differential equations driven by branched rough paths are interpreted in a similar manner. We remark here that from an algebraic point of view, a geometric (branched) rough path is nothing other than a two-parameter family of characters over the shuffle (Butcher-Connes-Kreimer) Hopf algebra (of rooted trees), obeying certain analytical conditions. The reader is referred to Hairer's and Kelly's article [26] for detailed accounts on Lyons' and Gubinelli's rough paths.
The importance of the process of renormalisation in Hairer's theory of regularity structures and its relation to rough path theory [4,21,27] has led to recent interest in the analogous role of renormalisation of rough differential equations [5]. For example, suppose that $B_t$ is a Brownian path that we wish to lift to a (branched) rough path $\mathbf{B}_{st}$. This lift may be accomplished by stochastic integration, but in doing so we introduce a lifting parameter corresponding to the point where the integrand is evaluated in each subinterval of the Riemann sums. Taking the left endpoint results in an Itô integral, whilst taking the midpoint gives a Stratonovich integral. Different rough path lifts are related by a renormalisation procedure [5], e.g., for a one-dimensional Brownian motion the usual Itô-Stratonovich conversion is given by
$$\int_s^t B_{su} \circ dB_u = \int_s^t B_{su}\,dB_u + \tfrac12 (t-s).$$
The importance of deciding upon and relating such lifts in practical stochastic modelling has resulted in several alternative descriptions of this apparent ambiguity in defining the stochastic integral, namely
(1) The canonical extensions of McShane and Marcus [35,36,37]
(2) Hoffman's exponential and logarithm [14,16,17]
(3) Translations of rough paths by Bruned et al. [5]
In this paper we show that the substitution bialgebra on non-planar rooted trees defined in [6], which here is extended to decorated trees, provides a common method to understand these approaches. Hoffman's exponential and logarithm follow from a general theory showing that every formal power series induces a quasi-shuffle algebra automorphism, and that the composition of two such automorphisms is equivalent to the automorphism induced by the composition of the power series [28,29]. Among our main results is the description of an "arborified" Hoffman exponential using the substitution bialgebra. In a nutshell, it corresponds to a simple replacement of the usual factorials in the exponential map by rooted tree factorials. We then show that Marcus' canonical extension [35,36] is an instance of the adjoint of the arborified Hoffman exponential.
Bruned et al. [5] show that the effect of translations on a rough differential equation can be measured by a change in the driving vector field f , and give a procedure to express multiple integrals with respect to the translated path in terms of multiple integrals of the original path. The mechanism relies on an extension to the decorated case of the cointeraction of the substitution bialgebra on the Butcher-Connes-Kreimer Hopf algebra, originally shown in [6]. Among the possible lifts of Brownian paths, the Stratonovich lift is canonical in the sense that it gives rise to iterated integrals obeying the usual rules of calculus -the resulting rough path is geometric in Lyons' terminology. Whenever a branched rough path is constructed by integration of paths obeying a quasi-shuffle law, a canonical renormalisation is obtained by the translation associated to the inverse tree factorial character. We show that in the case of semimartingale integrators, this coincides with the canonical extension of Marcus.
We conclude with a brief discussion of an alternative approach, where a canonical geometric rough path is constructed above a given branched rough path. This approach was initiated by Hairer and Kelly in [26], and a related strategy has recently been pursued in [2]. The idea is to expand the vector space on which the signal is considered to live by defining a Hopf algebra morphism from the Butcher-Connes-Kreimer Hopf algebra to the tensor algebra with shuffle product defined over the set of rooted trees. Superficially this is unrelated to the substitution algebra approach, and we confine ourselves to the observation that the Hairer-Kelly map can be understood using the right comodule structure introduced in [6].
The paper is organised into six sections. The next section gives a brief overview of the necessary algebraic structures. In §3 we review how the effects of certain modifications of vector fields (equivalently, translations of rough paths) can be understood algebraically. Hoffman's exponential is given in §4, and extended to the arborified case. A formula for its adjoint is also given. We follow this with a discussion of the Marcus canonical extension in §5, showing that this can be understood in terms of the adjoint of the arborified Hoffman exponential. We conclude in §6 with a brief discussion of the different approach by Hairer and Kelly.
Acknowledgements: The research on this paper was partially supported by the Norwegian Research Council (project 231632). The first named author thanks Martin Hairer for financial support via a Leverhulme Trust leadership award. The second named author would like to thank Dominique Manchon for helpful discussions.

Rooted trees, words and Hopf algebras
In what follows k denotes the ground field of characteristic zero over which all algebraic structures are considered.
2.1. Pre-Lie algebra. A left pre-Lie algebra $(P, \rhd)$ is a $k$-vector space $P$ together with a bilinear product $\rhd : P \otimes P \to P$ which satisfies the left pre-Lie identity [8,34]
$$(x \rhd y) \rhd z - x \rhd (y \rhd z) = (y \rhd x) \rhd z - y \rhd (x \rhd z) \qquad (3)$$
for any elements $x, y, z \in P$. An analogous definition exists for right pre-Lie algebras. Note that (3) states that the associator is symmetric in its first two arguments; as a consequence, the bracket $[x, y] := x \rhd y - y \rhd x$ satisfies the Jacobi identity. Let $(P_1, \rhd_1)$ and $(P_2, \rhd_2)$ be two pre-Lie algebras. A pre-Lie morphism is a $k$-linear map $\psi : P_1 \to P_2$ such that $\psi(x \rhd_1 y) = \psi(x) \rhd_2 \psi(y)$.

A natural example of a pre-Lie algebra is given in terms of a differentiable manifold $M$ with a flat and torsion-free connection. The corresponding covariant derivation $\nabla$ on the space $\chi(M)$ of vector fields on $M$ provides it with a left pre-Lie algebra structure, which is defined by $f \rhd g = \nabla_f g$, by virtue of the two equalities $\nabla_f g - \nabla_g f = [f, g]$ and $\nabla_{[f,g]} = [\nabla_f, \nabla_g]$, which express vanishing of torsion and curvature, respectively. Let $M = \mathbb{R}^n$ with its standard flat connection. For vector fields $f = \sum_i f^i \partial_i$ and $g = \sum_j g^j \partial_j$ we have
$$f \rhd g = \sum_{j=1}^{n} \Big(\sum_{i=1}^{n} f^i\,\partial_i g^j\Big)\partial_j. \qquad (4)$$
Consider the initial value problem $\dot{y}_t = f(y_t)$, $y_{t=0} = y_0$, where $f$ is a vector field on $\mathbb{R}^n$. The solution can be written in terms of the flow, by using the pre-Lie product (4) of vector fields [9]
$$y_t = \exp(t\,L_{f\rhd})\,\mathrm{id}\,(y_0), \qquad (5)$$
where $L_{f\rhd}$ denotes left multiplication $g \mapsto f \rhd g$. This can be seen from iterating the equivalent integral equation
$$y_t = y_0 + \int_0^t f(y_s)\,ds. \qquad (6)$$
Indeed, observe that for $h : \mathbb{R}^n \to \mathbb{R}^n$, we have that $\frac{d}{dt} h(y_t) = (f(y_t) \cdot \nabla) h(y_t)$. From this we see that $h(y_t) = h(y_0) + \int_0^t (f(y_s) \cdot \nabla) h(y_s)\,ds$. For $h = \mathrm{id}$ it follows that $(f(y_s) \cdot \nabla) y_s = f(y_s)$ and $(f(y_s) \cdot \nabla)(f(y_s) \cdot \nabla) y_s = (f(y_s) \cdot \nabla) f(y_s) = (f \rhd f)(y_s)$. Applying this to (6) and iterating yields
$$y_t = y_0 + t f(y_0) + \frac{t^2}{2} (f \rhd f)(y_0) + \frac{t^3}{3!} \big(f \rhd (f \rhd f)\big)(y_0) + \cdots. \qquad (7)$$

2.1.1. Free pre-Lie algebra. Recall that a rooted tree $\tau$ consists of vertices and nonintersecting oriented edges. All but one of the vertices have exactly one outgoing edge and an arbitrary number of incoming ones. The root is the one vertex with no outgoing edge and it is drawn at the bottom of the tree. The leaves are the only vertices without incoming edges.
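The expansion of the flow in iterated pre-Lie products of $f$ described in §2.1 can be checked on a toy case. The sketch below is only an illustration under stated assumptions: it works in dimension one, where the pre-Lie product reduces to $(f \rhd g)(y) = f(y)\,g'(y)$, stores polynomials as dictionaries from powers to coefficients, and uses the hypothetical helper names `pre_lie` and `flow_coefficients`. For $f(y) = y^2$ the exact solution is $y_0/(1 - t y_0)$, so the coefficient of $t^n$ should be $y_0^{n+1}$.

```python
from fractions import Fraction

def derivative(p):
    """Derivative of a polynomial stored as {power: coefficient}."""
    return {k - 1: k * c for k, c in p.items() if k > 0}

def multiply(p, q):
    """Product of two polynomials stored as {power: coefficient}."""
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return {k: c for k, c in out.items() if c != 0}

def pre_lie(f, g):
    """One-dimensional pre-Lie product (f |> g)(y) = f(y) * g'(y)."""
    return multiply(f, derivative(g))

def flow_coefficients(f, order):
    """Coefficients of t^n, n = 0..order, in sum_n t^n/n! (f |>)^n id."""
    term = {1: Fraction(1)}  # the identity map y -> y
    coeffs = [term]
    for n in range(1, order + 1):
        # term_n = (1/n) * f |> term_{n-1}, which builds up the 1/n! factor
        term = {k: c / n for k, c in pre_lie(f, term).items()}
        coeffs.append(term)
    return coeffs

f = {2: Fraction(1)}               # vector field f(y) = y^2
coeffs = flow_coefficients(f, 5)   # expected: coefficient of t^n is y^(n+1)
```

Exact rational arithmetic is used so that the comparison with the geometric series is not affected by rounding.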
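The set of non-planar rooted trees is denoted by $T$. Listed by increasing number of vertices, it contains one tree with one vertex, one tree with two vertices, two trees with three vertices, four trees with four vertices, and so on.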
The sets of vertices and edges of $\tau \in T$ are denoted by $V(\tau)$ and $E(\tau)$, respectively. Chapoton and Livernet [10] showed that the basis of the free pre-Lie algebra $P(\bullet)$ in one generator can be expressed in terms of undecorated, non-planar rooted trees. The pre-Lie product in $P(\bullet)$ is given in terms of grafting, that is, $\tau_1 \rhd \tau_2$ is given by summing over all trees resulting from grafting the tree $\tau_1$ successively to each vertex of $\tau_2$:
$$\tau_1 \rhd \tau_2 = \sum_{v \in V(\tau_2)} \tau_1 \to_v \tau_2 = \sum_{\tau \in T} M_{\tau_1 \tau_2; \tau}\,\tau, \qquad (8)$$
where $\to_v$ denotes the grafting of the root of $\tau_1$ via a new edge to vertex $v$ of $\tau_2$.
The coefficient $M_{\tau_1 \tau_2; \tau}$ is the number of vertices of $\tau_2$ such that grafting $\tau_1$ on $\tau_2$ at such a vertex gives the tree $\tau$. For example, grafting the single-vertex tree onto the two-vertex ladder yields the three-vertex ladder and the three-vertex tree with two leaves, each with coefficient one. The grafting product is indeed pre-Lie: given trees $\tau_1, \tau_2, \tau_3 \in T$ the expression $\tau_1 \rhd (\tau_2 \rhd \tau_3) - (\tau_1 \rhd \tau_2) \rhd \tau_3$ is the sum of all the trees obtained by grafting $\tau_1$ and $\tau_2$ at two distinct vertices of $\tau_3$. It is thus symmetric in $\tau_1$ and $\tau_2$, and hence the linear span of rooted trees is a left pre-Lie algebra. Let $\Omega$ be a set. The free pre-Lie algebra $P(\Omega)$ in $|\Omega|$ generators amounts to non-planar rooted trees decorated by elements from $\Omega$. We denote the set of $\Omega$-decorated non-planar rooted trees by $T_\Omega$. The free pre-Lie algebra $P(\Omega)$ satisfies the universal property: for any pre-Lie algebra $(Q, \rhd)$ and any mapping $v : \Omega \to Q$ there exists a unique pre-Lie algebra morphism $\tilde{v} : P(\Omega) \to Q$ such that $v = \tilde{v} \circ \iota$, where $\iota : \Omega \to P(\Omega)$ is the canonical inclusion. The elementary differential $F_f$ in (2) is such a pre-Lie algebra morphism, extending the map $v(\bullet_i) := f_i$ for $i \in \Omega = \{0, \dots, d\}$.
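The grafting product and the symmetry of its associator can be illustrated computationally. The following is a minimal sketch for undecorated trees, under the assumption that a non-planar tree is encoded as the sorted tuple of its child subtrees (so the single-vertex tree is the empty tuple); the names `graft`, `pre_lie` and `associator` are hypothetical helpers, not notation from the text.

```python
from collections import Counter

LEAF = ()  # single-vertex tree; a tree is the sorted tuple of its child subtrees

def graft(t1, t2):
    """Counter of the trees obtained by grafting the root of t1 onto each vertex of t2."""
    result = Counter()
    # graft at the root of t2
    result[tuple(sorted(t2 + (t1,)))] += 1
    # graft at a vertex inside the i-th child subtree of t2
    for i, child in enumerate(t2):
        rest = t2[:i] + t2[i + 1:]
        for new_child, mult in graft(t1, child).items():
            result[tuple(sorted(rest + (new_child,)))] += mult
    return result

def pre_lie(a, b):
    """Bilinear extension of grafting to linear combinations (Counters) of trees."""
    out = Counter()
    for t1, c1 in a.items():
        for t2, c2 in b.items():
            for t, m in graft(t1, t2).items():
                out[t] += c1 * c2 * m
    return out

def associator(x, y, z):
    """x |> (y |> z) - (x |> y) |> z as a dict of non-zero coefficients."""
    left = pre_lie(x, pre_lie(y, z))
    right = pre_lie(pre_lie(x, y), z)
    keys = set(left) | set(right)
    return {t: left[t] - right[t] for t in keys if left[t] != right[t]}

ladder2 = (LEAF,)         # two-vertex ladder
cherry = (LEAF, LEAF)     # root with two leaves
ladder3 = (ladder2,)      # three-vertex ladder
```

With this encoding, `graft(LEAF, ladder2)` returns the cherry and the three-vertex ladder, each with coefficient one, and the left pre-Lie identity amounts to `associator(x, y, z) == associator(y, x, z)`.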

BCK Hopf algebra and GL product.
An $\Omega$-decorated rooted forest is a finite collection $F = (\tau_1, \dots, \tau_n)$ of rooted trees from $T_\Omega$, which we denote by the commutative product $\tau_1 \cdots \tau_n$. The set of $\Omega$-decorated forests is denoted $F_\Omega$ and contains the empty forest $1 = \{\emptyset\} \in F_\Omega$. The operator $B^i_+$, $i \in \Omega$, associates to the forest $F$ the tree $B^i_+(F) \in T_\Omega$ obtained by grafting each tree in $F$ on a common new root decorated by $i$. The unique rooted tree with a single vertex decorated by $i$ is $\bullet_i = B^i_+(1)$. Recall the definition of several important numbers associated to a rooted forest $F = \tau_1 \cdots \tau_n$, respectively the tree $\tau = B^i_+(F)$. The number of vertices $|V(\tau)|$ is given by the sum $|\tau| := 1 + \sum_{j=1}^{n} |\tau_j|$. The tree factorial is recursively defined with respect to the number of vertices, i.e., by $\bullet_i! = 1$ and $\tau! = |\tau| \prod_{j=1}^{n} \tau_j!$. It is multiplicatively extended to forests, $F! = \tau_1! \cdots \tau_n!$. The internal symmetry factor is $\sigma(F) = \prod_{j=1}^{n} |\mathrm{Aut}(\tau_j)|$. The Connes-Moscovici coefficient $\mathrm{cm}(\tau)$ of a tree $\tau$ is defined by
$$\mathrm{cm}(\tau) := \frac{|\tau|!}{\tau!\,\sigma(\tau)}.$$
For the last two coefficients we used that the symmetry factor and tree factorial of the tree B + (B + (B + (1)B + (1))) are 2 respectively 12, whereas for the tree B + (B + (B + (1))B + (1)) they are one respectively 8. An important result is the following lemma: where the sum on the right runs over all trees in T with exactly n vertices.
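The combinatorial numbers of this subsection are easy to tabulate. The sketch below is an illustration for undecorated trees, assuming the encoding of a tree as the sorted tuple of its child subtrees and taking the Connes-Moscovici coefficient to be $|\tau|!/(\tau!\,\sigma(\tau))$; the helper names are hypothetical. It reproduces the values quoted above (symmetry factor 2 and tree factorial 12 for $B_+(B_+(B_+(1)B_+(1)))$, and 1 and 8 for $B_+(B_+(B_+(1))B_+(1))$).

```python
from collections import Counter
from math import factorial

LEAF = ()  # single-vertex tree; a tree is the sorted tuple of its child subtrees

def size(t):
    """Number of vertices."""
    return 1 + sum(size(c) for c in t)

def tree_factorial(t):
    """t! = |t| times the product of the tree factorials of the branches."""
    out = size(t)
    for c in t:
        out *= tree_factorial(c)
    return out

def symmetry(t):
    """Order of the automorphism group of a non-planar rooted tree."""
    out = 1
    for child, mult in Counter(t).items():
        out *= factorial(mult) * symmetry(child) ** mult
    return out

def cm(t):
    """Connes-Moscovici coefficient, assumed to be |t|! / (t! * sigma(t))."""
    return factorial(size(t)) // (tree_factorial(t) * symmetry(t))

def add_leaf(t):
    """Set of trees obtained by attaching one new leaf to some vertex of t."""
    out = {tuple(sorted(t + (LEAF,)))}
    for i, child in enumerate(t):
        for new_child in add_leaf(child):
            out.add(tuple(sorted(t[:i] + t[i + 1:] + (new_child,))))
    return out

def trees_with(n):
    """All non-planar rooted trees with n vertices, grown one leaf at a time."""
    level = {LEAF}
    for _ in range(n - 1):
        level = {t for s in level for t in add_leaf(s)}
    return level

tree_a = ((LEAF, LEAF),)             # B+(B+(B+(1)B+(1)))
tree_b = tuple(sorted(((LEAF,), LEAF)))  # B+(B+(B+(1))B+(1))
```

As a sanity check, the coefficients `cm(t)` summed over all trees with $n$ vertices give $(n-1)!$, the number of ways of growing a tree by adding one leaf at a time.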
The $\Omega$-decorated Butcher-Connes-Kreimer Hopf algebra $H^{\Omega}_{BCK}$ of rooted forests is the free unital commutative $k$-algebra on the linear span of $T_\Omega$ [13,33]. It is graded by the number of vertices. The non-cocommutative coproduct is defined on trees in terms of admissible cuts. Observe that the space spanned by ladder trees $B^{i_1}_+ \circ \cdots \circ B^{i_l}_+(1)$ forms a cocommutative Hopf subalgebra in $H^{\Omega}_{BCK}$
with the simple coproduct We denote by (d F ) F ∈FΩ the normalised dual basis in the graded dual H Ω * BCK of the forest basis of H Ω BCK , i.e., d F , G = σ(F ) when the forests F and G coincide, and zero else. For any tree τ ∈ T Ω the corresponding d τ is an infinitesimal character over H Ω BCK , in other words, it is a primitive element of H Ω * BCK . The convolution product of H Ω * BCK gives rise to the Lie bracket The link with the pre-Lie product (8) on rooted trees follows from [13] By the Cartier-Milnor-Moore theorem H Ω * BCK is isomorphic as a Hopf algebra to the enveloping algebra U(g Ω ) [7], where g Ω := Prim(H Ω * BCK ) is the Lie algebra spanned by the d τ 's for rooted trees τ ∈ T Ω . Guin and Oudom [39] extended the pre-Lie product on P(Ω) to the symmetric module S(g Ω ). Then they defined another associative product on S(g Ω ) and showed that (S(g Ω ), ∆, * ) is isomorphic as a Hopf algebra to U(g Ω ). The associative product is defined for F, G ∈ S(g Ω ) by where, using Sweedler's notation, ∆ is the usual unshuffle coproduct. This product (11) can be written in terms of the B + operation, i.e., , by summing all the trees obtained by grafting the trees in the forest F on the tree B i + (G) at various places, and then removing the root of each component of the sum, which is the definition of the B i − -operation. Note that the decoration of B ± does not matter here. For trees τ 1 , τ 2 ∈ T Ω we find As a result the product * on S(g Ω ) coincides with the Grossman-Larson product (12) d where the product on the right-hand side is the convolution product in H Ω * BCK . 2.3. Shuffle and quasi-shuffle Hopf algebras. Algebraically, products of iterated Stratonovich integrals can be described using the usual shuffle product [41], whilst for the other stochastic integrals we obtain a quasi-shuffle relation. We refer the reader to Gaines' 1994 paper [22] for more details.
Following Hoffman [28] a quasi-shuffle algebra is defined on a locally finite alphabet $A$. By $A^*$ we denote the monoid of words $w = i_1 \cdots i_l$ generated by the letters from $A$ with concatenation as associative product. Moreover, we assume that $A$ itself is a commutative semigroup with binary product $[-\,-] : A \times A \to A$. Commutativity and associativity of the latter allow us to write $[i_1 \cdots i_n]$ for the iterated product of the letters $i_1, \dots, i_n$. The free non-commutative $k$-algebra of words $w = i_1 \cdots i_l$ over the alphabet $A$ is denoted $k\langle A \rangle$. The empty word is $1 \in k\langle A \rangle$. The length $|u|$ of a word $u \in k\langle A \rangle$ is defined by its number of letters. The commutative and associative quasi-shuffle product on words is defined recursively by $1 \star w = w \star 1 = w$ and
$$ui \star vj = (u \star vj)\,i + (ui \star v)\,j + (u \star v)\,[ij]$$
for words $u, v$ and letters $i, j \in A$. We denote the quasi-shuffle algebra by $H_\star := (k\langle A \rangle, \star)$. Hoffman [28] showed that $H_\star = (k\langle A \rangle, \triangle, \star)$ is a Hopf algebra with respect to the deconcatenation coproduct $\triangle$. If $[ij] = 0$ for any letters $i, j \in A$, then the quasi-shuffle product reduces to the ordinary shuffle product and $H_\star$ turns into the classical shuffle Hopf algebra $H_{\sqcup\!\sqcup} := (k\langle A \rangle, \triangle, \sqcup\!\sqcup)$. Further below we shall see another, more surprising link between $H_\star$ and $H_{\sqcup\!\sqcup}$ due to Hoffman [28]. For $i \in A$, define the operator $R_i : H_\star \to H_\star$ by $R_i(w) = wi$. With respect to the deconcatenation coproduct it satisfies
$$\triangle(R_i(w)) = R_i(w) \otimes 1 + (\mathrm{id} \otimes R_i)\triangle(w).$$

2.3.1. Arborifications. As before, let $A$ be the alphabet which is also a commutative semigroup with binary product $[-\,-]$ introduced above. Generally, the process of (contracting) arborification is given by a surjective Hopf algebra morphism from the $A$-decorated Butcher-Connes-Kreimer Hopf algebra $H^{A}_{BCK}$ onto the shuffle Hopf algebra $H_{\sqcup\!\sqcup}$ (respectively the quasi-shuffle Hopf algebra $H_\star$) defined over the alphabet $A$. See [18,19] for details. In the shuffle case, the arborification morphism $a : H^{A}_{BCK} \to H_{\sqcup\!\sqcup}$ is defined recursively on trees $B^i_+(F)$, $F = \tau_1 \cdots \tau_n$, by
$$a(B^i_+(F)) = \big(a(\tau_1) \sqcup\!\sqcup \cdots \sqcup\!\sqcup a(\tau_n)\big)\,i. \qquad (13)$$
In the quasi-shuffle case, the contracting arborification morphism $a_c : H^{A}_{BCK} \to H_\star$ is defined by
$$a_c(B^i_+(F)) = \big(a_c(\tau_1) \star \cdots \star a_c(\tau_n)\big)\,i. \qquad (14)$$
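For instance, both arborifications map decorated ladder trees to single words: for the ladder tree $\ell_{i_1 i_2 i_3}$ with leaf decorated by $i_1$ and root decorated by $i_3$ one finds $a(\ell_{i_1 i_2 i_3}) = a_c(\ell_{i_1 i_2 i_3}) = i_1 i_2 i_3$. Non-ladder trees are mapped to linear combinations of words: for the tree $\tau$ with root decorated by $i_1$ and two leaves decorated by $i_2$ and $i_3$, the definitions give $a(\tau) = i_2 i_3 i_1 + i_3 i_2 i_1$ and $a_c(\tau) = i_2 i_3 i_1 + i_3 i_2 i_1 + [i_2 i_3]\,i_1$. Observe that the difference between the two maps consists of the extra terms coming from contractions.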
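The recursive definitions of the quasi-shuffle product and of the two arborification maps translate directly into code. The sketch below is illustrative only: it assumes that letters are positive integers with the semigroup product $[ij] = i + j$, that words are tuples, that a decorated tree is a pair `(root_decoration, tuple_of_subtrees)`, and that passing `bracket=None` switches off contractions, giving the plain shuffle product. All function names are hypothetical.

```python
from collections import Counter
from operator import add

def quasi_shuffle(u, v, bracket=None):
    """u * v for words u, v (tuples); bracket=None gives the ordinary shuffle."""
    if not u or not v:
        return Counter({u + v: 1})
    out = Counter()
    for w, c in quasi_shuffle(u[:-1], v, bracket).items():
        out[w + (u[-1],)] += c
    for w, c in quasi_shuffle(u, v[:-1], bracket).items():
        out[w + (v[-1],)] += c
    if bracket is not None:
        # contraction term (u * v)[ij]
        for w, c in quasi_shuffle(u[:-1], v[:-1], bracket).items():
            out[w + (bracket(u[-1], v[-1]),)] += c
    return out

def product(a, b, bracket=None):
    """Bilinear extension of the (quasi-)shuffle product to Counters of words."""
    out = Counter()
    for u, cu in a.items():
        for v, cv in b.items():
            for w, c in quasi_shuffle(u, v, bracket).items():
                out[w] += cu * cv * c
    return out

def arborify(tree, bracket=None):
    """a(tree) if bracket is None, otherwise the contracting arborification a_c(tree)."""
    root, children = tree
    acc = Counter({(): 1})  # empty word, unit of both products
    for child in children:
        acc = product(acc, arborify(child, bracket), bracket)
    # append the root decoration on the right of every word
    return Counter({w + (root,): c for w, c in acc.items()})

# root decorated by 1, leaves decorated by 2 and 3
cherry = (1, ((2, ()), (3, ())))
# ladder with leaf 1, middle vertex 2, root 3
ladder = (3, ((2, ((1, ()),)),))
```

Here `arborify(cherry)` gives the two words `(2, 3, 1)` and `(3, 2, 1)`, while `arborify(cherry, add)` contains the additional contracted word `(5, 1)`.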
2.4. Substitution bialgebra. Let A be the locally finite commutative semigroup from above. Let F A + be the set of A-decorated forests excluding the empty forest, and consider the commutative polynomial algebra H A + graded by the number of edges. A subforest of a rooted tree τ is a collection (τ 1 , . . . , τ n ) of pairwise disjoint rooted subtrees of τ . The decorations of each tree τ j is induced by the decoration of τ . In particular, two subtrees of a subforest cannot have any common vertex. A full subforest F = τ 1 · · · τ k of τ is such that V (F ) = V (τ ). The coproduct on H A + is defined in terms of extractions and contractions (15) δ Here the sum is over all A-decorated full subforests F = τ 1 · · · τ k and τ /F denotes the A-decorated rooted tree in which all connected components τ j of F have been contracted and replaced by vertices [τj] . Hence, for τ ∈ T A the coproduct δ + (τ ) is linear on the right. The decoration [τ j ] := [j 1 · · · j |τj | ] ∈ A, where j 1 , . . . , j |τj| are the decorations of the vertices of the subtree τ j . Note that if A few examples show that the number of edges is preserved The algebra H A + together with this coproduct is a connected graded bialgebra. Next we consider the space of ladder trees, which we will denote by ℓ i1···i l . The vertices are decorated successively starting from the leaf, decorated by i 1 , down to the root, which is decorated by i l . Proposition 1. The space spanned by ladder trees forms a Hopf subalgebra in H A + with the coproduct The sum runs over all partitions of the decoration sequence i 1 · · · i l ∈ A * into blocks I j ∈ A * such that the concatenation I 1 · · · I n = i 1 · · · i l .
Proof. The partition into blocks of the decoration sequence i 1 · · · i l corresponds to contractions on the ladder tree ℓ i1···i l . We will later see that these partitions can be understood in terms of compositions of integers. Recall that for single letters we have [i j ] = i j .
The so-called substitution bialgebra H + corresponding to undecorated rooted trees appeared in [6], where it was introduced in relation to backward error analysis on B-series. It is based on the seminal works [11,12] and aims at providing an algebraic description of the effect of substituting a vector field corresponding to an infinitesimal character in the dual H * + into another B-series. It was shown in [6] that there exists a left H + -bicomodule structure on the Butcher-Connes-Kreimer Hopf algebra H BCK , that is, Φ: H BCK → H + ⊗ H BCK , such that for τ = 1 Φ(τ ) = δ + (τ ) ∈ H + ⊗ H BCK , and Φ(1) = ⊗ 1. Let ϕ be a character of H + , let α be any linear map from H + into k, and let b, c be linear maps form H BCK into k. Let ε be the co-unit of H BCK and d on H BCK the infinitesimal character corresponding to the one-vertex rooted tree , and Z is the analogous infinitesimal character on H + . Then: Here * is the convolution product on the dual of H BCK and ⊛ denotes the convolution product on H * + , defined in terms of the coproduct (15), as well as the left action of H + on H BCK . These results are generalised to the A-decorated case H A + . The most important consequence of the interplay of the substitution bialgebra and the Butcher-Connes-Kreimer Hopf algebra is the following: If v is a character of H A + , then Ψ v is a Hopf algebra automorphism on H A BCK . Observe that the character v on H A + defined to map i to one, for all i ∈ A, and v( i j ) := δ ij 1 2 , and any other tree to zero, gives for instance v(

Duality and substitution
It is convenient to have a purely algebraic description of the flow, for this purpose we define the formal flow map ϕ ∈ H Ω * BCK⊗ H Ω Note the abuse of notation by writing τ for d τ . Here H Ω * BCK⊗ H Ω BCK is the completed tensor product, i.e. the space of infinite series of tensor products of forests with Grossman-Larson product on the left and Butcher-Connes-Kreimer on the right, with the inverse limit topology comprising open sets generated by sequences agreeing up to a given order, see [15,41]. The Taylor expansion (2) of a controlled differential equation with driving signal X and vector fields f i takes the form Recall that for the empty tree 1 ∈ T Ω , we have X 1 st = 1 and F f [1] = id. Given a pre-Lie algebra morphism g, we can attempt to find its adjoint g * , defined such that The original context of this problem was substitution of B-series. Indeed, suppose X st = (t − s) so that the rough differential equation is a standard ordinary differential equation. Consider increments of a fixed size, i.e., X t,t+h = h; the iterated integrals then obey Letting v(τ ) := 1 τ ! , the flow (F f ⊗ X st )(ϕ) thus yields a formal series in powers of the step size h, which is the Taylor series of the exact solution Y t , i.e., the paradigm of a so-called B-series. See [25] for details. Replacing the term v with a functional a gives another B-series B(a, f ). Suppose then that the vector field f is replaced by an expansion in terms of pre-Lie products of f , for instancẽ Collecting the powers in h appearing in (Ff ⊗ X st )(ϕ) results in a new series, which may be related algebraically to the B-series B( 1 τ ! , f ). Indeed,f = 1 h B(f, a) for some infinitesimal character a on H BCK , and we have the following Lemma 3.
[6] Let a be an infinitesimal character and b a character, both defined over H BCK . The substitution of B-series is computed using the convolution product in H * + : The above is generalised by first noting that it may be seen as the computation of an adjoint. Indeed, as F f is a pre-Lie algebra morphism, it follows that for the unique pre-Lie algebra morphism g described by its action on the single node rooted tree The previous lemma then amounts to the computation of the adjoint of g. Bruned et al. [5] considered the problem of finding an adjoint in a more general rough path setting, establishing the following result.

Lemma 4. Let a be an infinitesimal character on H A
BCK and define g to be the pre-Lie algebra morphism extended multiplicatively to forests. Then its adjoint g * = Ψ a = a ⊛ id = (a ⊗ id) • Φ such that

Arborified Hoffman isomorphism
Hoffman [28] showed the existence of a unique and explicit Hopf algebra isomorphism log H from H ⋆ to the shuffle Hopf algebra H ∃ . Stratonovich iterated integrals of a Brownian path may be obtained from Itô iterated integrals by an application of Hoffman's isomorphism [16,17].
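Recall the notion of composition of integers and the related concept of contracting words [28]. Let $A$ be a locally finite alphabet which is also a commutative semigroup with binary product $[-\,-]$ introduced earlier. A sequence $I := (i_1, \dots, i_m)$ of positive integers such that $i_1 + \cdots + i_m = n$ denotes a composition of the integer $n$. The set of those integer compositions is written $C(n)$. The contraction of a word $w = a_{k_1} \cdots a_{k_n} \in A^*$ is denoted by $I[w]$ and defined in terms of the composition $I = (i_1, \dots, i_m) \in C(n)$ by
$$I[w] := [a_{k_1} \cdots a_{k_{i_1}}]\,[a_{k_{i_1+1}} \cdots a_{k_{i_1+i_2}}] \cdots [a_{k_{i_1+\cdots+i_{m-1}+1}} \cdots a_{k_n}].$$
Here $[i] := i$. Hoffman's exponential and logarithm [28,29] are defined on a word $w$ of length $n$ by
$$\exp_H(w) := \sum_{I = (i_1, \dots, i_m) \in C(n)} \frac{1}{i_1! \cdots i_m!}\,I[w], \qquad (16)$$
$$\log_H(w) := \sum_{I = (i_1, \dots, i_m) \in C(n)} \frac{(-1)^{n-m}}{i_1 \cdots i_m}\,I[w]. \qquad (17)$$
For example, for $a_{k_1} \in A$ we have $\exp_H a_{k_1} = a_{k_1}$ and $\log_H a_{k_1} = a_{k_1}$. For words of length two and three we obtain
$$\exp_H(a_{k_1} a_{k_2}) = a_{k_1} a_{k_2} + \tfrac12 [a_{k_1} a_{k_2}], \qquad \log_H(a_{k_1} a_{k_2}) = a_{k_1} a_{k_2} - \tfrac12 [a_{k_1} a_{k_2}],$$
$$\exp_H(a_{k_1} a_{k_2} a_{k_3}) = a_{k_1} a_{k_2} a_{k_3} + \tfrac12 [a_{k_1} a_{k_2}] a_{k_3} + \tfrac12 a_{k_1} [a_{k_2} a_{k_3}] + \tfrac16 [a_{k_1} a_{k_2} a_{k_3}],$$
$$\log_H(a_{k_1} a_{k_2} a_{k_3}) = a_{k_1} a_{k_2} a_{k_3} - \tfrac12 [a_{k_1} a_{k_2}] a_{k_3} - \tfrac12 a_{k_1} [a_{k_2} a_{k_3}] + \tfrac13 [a_{k_1} a_{k_2} a_{k_3}].$$
The Hoffman exponential and logarithm are part of a class of quasi-shuffle algebra automorphisms induced by formal power series [29]. See also [38].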
where |I| 2 is the number of i j = 2 ∈ I.
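Hoffman's exponential can be computed by summing over integer compositions, and its compatibility with the shuffle and quasi-shuffle products can be tested on small words. The sketch below assumes the standard form of the exponential, in which a composition $(i_1, \dots, i_m)$ contributes the contracted word with coefficient $1/(i_1! \cdots i_m!)$, and takes letters to be positive integers with $[ij] = i + j$; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial
from operator import add

def compositions(n):
    """All tuples of positive integers summing to n."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

def contract(word, comp, bracket):
    """I[w]: contract consecutive blocks of the word with sizes given by comp."""
    out, pos = [], 0
    for block_size in comp:
        block = word[pos:pos + block_size]
        letter = block[0]
        for x in block[1:]:
            letter = bracket(letter, x)
        out.append(letter)
        pos += block_size
    return tuple(out)

def hoffman_exp(word, bracket):
    """Sum over compositions I of 1/(i_1! ... i_m!) I[w], as {word: coefficient}."""
    out = {}
    for comp in compositions(len(word)):
        coeff = Fraction(1)
        for block_size in comp:
            coeff /= factorial(block_size)
        w = contract(word, comp, bracket)
        out[w] = out.get(w, 0) + coeff
    return out

def quasi_shuffle(u, v, bracket=None):
    """Quasi-shuffle of two words; bracket=None gives the ordinary shuffle."""
    if not u or not v:
        return {u + v: 1}
    out = {}
    def add_terms(words, letter):
        for w, c in words.items():
            out[w + (letter,)] = out.get(w + (letter,), 0) + c
    add_terms(quasi_shuffle(u[:-1], v, bracket), u[-1])
    add_terms(quasi_shuffle(u, v[:-1], bracket), v[-1])
    if bracket is not None:
        add_terms(quasi_shuffle(u[:-1], v[:-1], bracket), bracket(u[-1], v[-1]))
    return out

def exp_of_shuffle(u, v, bracket):
    """exp_H applied to the shuffle product of u and v."""
    out = {}
    for w, c in quasi_shuffle(u, v).items():
        for x, d in hoffman_exp(w, bracket).items():
            out[x] = out.get(x, 0) + c * d
    return out

def product_of_exps(u, v, bracket):
    """Quasi-shuffle product of exp_H(u) and exp_H(v)."""
    out = {}
    for x, c in hoffman_exp(u, bracket).items():
        for y, d in hoffman_exp(v, bracket).items():
            for w, m in quasi_shuffle(x, y, bracket).items():
                out[w] = out.get(w, 0) + c * d * m
    return out
```

For any two words `u` and `v`, the dictionaries `exp_of_shuffle(u, v, add)` and `product_of_exps(u, v, add)` should agree, which is the algebra morphism property of the exponential checked on a single pair.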

Lemma 5.
[29] Let f, g be formal power series. The composition of their associated quasi-shuffle algebra automorphisms is then computed using the composition f • g of formal power series: If f is invertible we may assume that Hoffman's exponential (16) and logarithm (17) follow from the regular exponential and logarithm.

Theorem 1. [28]
Hoffman's exponential, ψ exp(t)−1 , is a Hopf algebra isomorphism Recall that Lemma 2 states that the map Ψ v = (v ⊗ id) • Φ associated to a character v of H A + is a Hopf algebra automorphism of H A BCK . Moreover, suppose that v(τ ) is independent of the decorations of τ , and let v(ℓ n ) = f n , where ℓ n is a ladder tree with n nodes and the f n are coefficients of a formal power series f . Proposition 1 then shows that when restricted to ladder trees, we have Ψ v = ψ f . The following result can then be considered an extension of Hoffman's theorem from quasi-shuffle automorphisms induced by formal power series to Butcher-Connes-Kreimer automorphisms induced by the substitution bialgebra: Recall that H A + is a graded but not connected bialgebra. Let v be a multiplicative map from H A + to k that maps i to one, i.e., v( i ) = 1 for all i ∈ A. It turns out that in this case we can invert those characters by composition with the pseudo-antipode map α := i≥0 (id − ηε) ⊛i , where η and ε are the unit respectively counit of H A + .

One can then show that
Recall that a commutative and associative product on a decoration set $A$ gives rise not only to a quasi-shuffle algebra, but also to a contracting arborification, that is, a surjective algebra morphism $a_c : H^{A}_{BCK} \to H_\star$. The non-contracting arborification $a : H^{A}_{BCK} \to H_{\sqcup\!\sqcup}$ results from taking the product on $A$ to be trivial, so that the quasi-shuffle product becomes a shuffle product.
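Our central result is the description of the arborified Hoffman isomorphism, i.e., a canonical Butcher-Connes-Kreimer automorphism $\Psi_v$ making the following diagram commute:
$$a_c \circ \Psi_v = \exp_H \circ\, a, \qquad (18)$$
where $\Psi_v : H^{A}_{BCK} \to H^{A}_{BCK}$, $a : H^{A}_{BCK} \to H_{\sqcup\!\sqcup}$, $a_c : H^{A}_{BCK} \to H_\star$ and $\exp_H : H_{\sqcup\!\sqcup} \to H_\star$.
Theorem 2. Let $\tau \in T_A$ be an $A$-decorated rooted tree. Let $v(\tau) := \frac{1}{\tau!}$ be the inverse tree factorial, which is multiplicative and satisfies $v(\bullet_i) = 1$ for all $i \in A$. Then $\Psi_v$ is the arborified Hoffman exponential, i.e., it is the unique Butcher-Connes-Kreimer Hopf algebra automorphism satisfying the following: (1) $\Psi_v$ makes the diagram (18) commute.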
(2) The adjoint of Ψ v is a pre-Lie morphism. In fact, the adjoint of the arborified Hoffman exponential is the pre-Lie morphism defined on single vertex trees by Before we prove this theorem, we show how the arborified Hoffman exponential generalises the classical Hoffman exponential. Indeed, consider the particular case of the Hopf subalgebra of ladder tree ℓ i1···i l in H A BCK . Recall Proposition 1 and the fact that ℓ i1···i l ! = l!. Then we find that in accordance with the remarks preceding Lemma 6 and the fact that I 1 · · · I n = i 1 · · · i l is equivalent to the application of the composition I = (i 1 , . . . , i n ) to the word w = i 1 · · · i l , where i m is the number of letters, that is, the length of the interval I m , for m = 1, . . . , n.
Let us look at the tree i2 i3 i1 as another example.
In the last equality we used associativity and commutativity of the semigroup A.
Proof. To prove Theorem 2, we first note that Lemma 2 guarantees that Ψ v is a Butcher-Connes-Kreimer automorphism. As the Hoffman exponential is an algebra morphism, to prove commutativity of the diagram (18) it suffices to show that Recall the definitions of (contracting) arborification in (14) respectively (13). Note that contractions on the right-hand side of (20) solely stem from Hoffman's exponential exp H , whereas on the left-hand side they come from both the contracting arborification as well as Ψ v . Following Hoffman, we proceed by counting. In particular, both the left-hand side and the right-hand side are sums over the same words, with coefficients. It remains to match these coefficients. To see this, assume that τ ∈ T A has n vertices decorated by i 1 , . . . , i n ∈ A, and note that from the definition of the map a in (13) we find that where the sum is over all ways of ordering the decorations of the nodes of the tree into a word in such a way that it respects the partial order of the tree. The definition of a implies that rightmost letter of i 1 · · · i n is the decoration of the root of τ . Similarly, we have a c (τ ) = I,M(τ ) where the sum is over all words ordered as above, and over all compositions of these words such that only letters which are incomparable with respect to the partial order of τ may be contracted. Note that we count only once each contracted block [i · · · j], and not all the different ways this may arise from different orderings. Now the right-hand side of the identity may be written where c K are the Hoffman exponential coefficients, i.e., the usual factorials 1 |w|! for each composed word [w] with |w| letters. We now consider the left-hand side. Note that following arborification, each contraction from Ψ v acting on τ will correspond to a composition J of the letters marking the contracted vertices. It follows that for some coefficient c I,J . Suppose that a subword w = i . . . j is composed in the above sum. 
The partial ordering on the vertices gives rise to a forest τ 1 · · · τ m ; this composition could only arise by having J contract each of the trees τ i , and then letting I compose the remaining (incomparable) vertices. It is clear then that the above sum may be rewritten where the sum is over all compositions L, and the coefficient c L for each composed word w with partial order τ 1 · · · τ m is 1 τ1!···τm! . To relate the two, it remains to compare the sums over M and N , i.e., to account for the fact that the righthand side of the identity to be proven allows multiple ways of obtaining the same composed word, that is, a letter [w]. For this purpose, we note that given such a [w], the number of times it appears in the sum from different orderings is equal to the number of linear extensions of the underlying partial order, i.e., number of ways of extending to a total order. Letting the forest of the order be τ 1 · · · τ m , and the total number of letters of w be n, this is exactly (see [30]) where the first equality comes from the recursive definition of the tree factorial. As multiplying c K (the Hoffman coefficients) by the right hand side above gives c L , we are done. The form of the adjoint follows from Lemma 4, as The proof is then concluded by observing that any morphism of the free pre-Lie algebra P(A) is determined by the values it takes on the set of decorations A.
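The counting step in the proof, namely that a forest $\tau_1 \cdots \tau_m$ with $n$ vertices has $n!/(\tau_1! \cdots \tau_m!)$ linear extensions, can be confirmed by brute force on small examples. The sketch below assumes undecorated trees encoded as tuples of child subtrees, treats all vertices as distinguishable, and orders every vertex before its parent, matching the convention that the root decoration is the rightmost letter; the function names are hypothetical.

```python
from itertools import permutations
from math import factorial

LEAF = ()  # single-vertex tree; a tree is the tuple of its child subtrees

def size(t):
    return 1 + sum(size(c) for c in t)

def tree_factorial(t):
    out = size(t)
    for c in t:
        out *= tree_factorial(c)
    return out

def parent_map(forest):
    """Label the vertices 0..n-1 and return the list of parents (None for roots)."""
    parents = []
    def visit(tree, parent):
        v = len(parents)
        parents.append(parent)
        for child in tree:
            visit(child, v)
    for tree in forest:
        visit(tree, None)
    return parents

def count_linear_extensions(forest):
    """Number of total orders of the vertices in which each vertex precedes its parent."""
    parents = parent_map(forest)
    count = 0
    for order in permutations(range(len(parents))):
        position = {v: i for i, v in enumerate(order)}
        if all(p is None or position[v] < position[p]
               for v, p in enumerate(parents)):
            count += 1
    return count

def predicted_count(forest):
    """n! divided by the product of the tree factorials of the forest."""
    n = sum(size(t) for t in forest)
    denom = 1
    for t in forest:
        denom *= tree_factorial(t)
    return factorial(n) // denom

cherry = (LEAF, LEAF)
ladder2 = (LEAF,)
mixed = ((LEAF,), LEAF)  # root with a two-vertex ladder and a leaf attached
```

The brute-force enumeration is exponential in the number of vertices, so it is only meant for forests with a handful of vertices.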

Marcus canonical extension
A wide variety of physical phenomena are well approximated by stochastic differential equations (SDEs) of the form where $\{Z^1_t, \ldots, Z^N_t\}$ are $N$ real-valued semimartingales [40] and $a, b_1, \ldots, b_N$ are vector fields in $\mathbb{R}^d$. To simplify notation, we suppose in the following that $N = 1$, i.e., we consider the simple case of a single real-valued semimartingale. Often processes such as white noise, with its associated Wiener process differential $dW_t$, are approximations or limits of a more regular noise $dW^\epsilon_t$. Suppose that the equation describing the physical system is then an ordinary differential equation (ODE) of the form (22) $dX^\epsilon_t = a(X^\epsilon_t)\,dt + b(X^\epsilon_t)\,dW^\epsilon_t$. In general, the solution $X^\epsilon_t$ does not converge to $X_t$ as the regular noise $dW^\epsilon_t$ approaches $dW_t$. In the case of SDEs driven by Wiener processes, this problem is usually resolved with the help of Stratonovich integration: the solution of the regular ODE (22) converges to the solution of the associated Stratonovich SDE Indeed, the regularised equation does converge to an Itô SDE, but with modified vector fields: McShane [37] called this equation the canonical extension. Observe that the term $\sum_{i=1}^{d} b^i(X_s) \frac{\partial b}{\partial x^i}(X_s)$ can be interpreted as the pre-Lie product $(b \triangleright b)(X_s)$ of the vector field $b$ with itself defined in (4). Indeed, Marcus' main result [35] is that for SDEs driven by general semimartingales, the regularised solutions $X^\epsilon_t$ tend to the solution of the following canonical extension: Marcus showed explicitly that in the particular case where $Z_t$ is a Poisson process $N_t$, as $[N]^{(n)}_t = N_t$, the canonical extension reduces to This has a rather intriguing analytic interpretation that has dominated the study of canonical extensions since their introduction. Indeed, the term $e^{L_{tb\,\triangleright}}$ is the flow corresponding to the differential equation $\dot{y}(t) = b(y(t))$. See equation (5) for details. 
The second integral on the right is then a jump process that jumps along the integral curves of the vector field $b$ at the jump times of $N_t$. For processes with varying jump heights the distance moved along the integral curve is proportional to the jump size. A further layer of interpretation is that a fictitious time $t'$ is introduced that stretches at the jump times to accommodate the jumps; in this stretched time the process $X_t$ traverses the integral curve of $b$. See reference [1] for details. This interpretation has been used by Friz et al. in [20] to construct a theory of rough paths for Lévy processes. The analysis is considerably more complicated for general semimartingales; here the canonical extension cannot be readily collapsed to give integral curves, and this interpretation may not exist.
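To make the jump interpretation concrete, the following sketch checks numerically, in one dimension, that the Lie series $\sum_{n \geq 1} \frac{1}{n!} (b\,\partial_x)^{n-1} b$ reproduces the displacement along the integral curve of $b$ over unit time. It is a minimal illustration under stated assumptions: the choice $b(x) = x^2$, the coefficient-list encoding of polynomials, and all function names are hypothetical, and the one-dimensional pre-Lie product is taken to be $(b \triangleright b)(x) = b(x)\,b'(x)$.

```python
from math import factorial

def poly_derivative(p):
    """Derivative of a polynomial given as a coefficient list [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def poly_multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            out[i + j] += a * c
    return out

def poly_eval(p, x):
    """Evaluate the polynomial p at the point x."""
    return sum(c * x**k for k, c in enumerate(p))

def lie_series_jump(b, x, order):
    """Truncated sum over n = 1..order of (1/n!) (b d/dx)^(n-1) b, evaluated at x."""
    term = list(b)                      # (b d/dx)^0 b
    total = 0.0
    for n in range(1, order + 1):
        total += poly_eval(term, x) / factorial(n)
        term = poly_multiply(b, poly_derivative(term))   # apply b d/dx once more
    return total

b = [0, 0, 1]                  # assumed vector field b(x) = x^2
x0 = 0.1
jump = lie_series_jump(b, x0, order=12)
# The flow of y' = y^2 started at x0 is y(t) = x0 / (1 - t * x0).
exact = x0 / (1 - x0) - x0
print(jump, exact)
```

For this choice of $b$ the $n$-th term of the series is $x^{n+1}$, so the sum is a geometric series and the agreement can also be verified by hand.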
We will interpret (24) in terms of substitution. Indeed, expand the decoration set $\{0, 1\}$ corresponding to $N = 1$ to the algebra $A = \mathbb{Z}_{\geq 0}$, where $0$ corresponds to the drift term, and $n > 0$ to the $n$-fold quadratic variation $[Z]^{(n)}$. The vector fields $f_i$ are defined to be zero for $i > 1$. The Marcus extension can be interpreted as a map $\Psi_v^*$ : where $v$ is the character multiplying by the inverse tree factorial, for which

Theorem 3 (Hoffman and Marcus). The Marcus extension map is the adjoint of the arborified Hoffman exponential for the decoration set $A$.
Proof. This is an immediate consequence of (24), Lemma 1 and the form of the adjoint arborified Hoffman exponential given in Theorem 2.
Remark 1. The Marcus extension gives us a canonical rough path renormalisation in the following sense. Suppose the branched rough path lift is generated by integrals $X^\tau$ obeying a quasi-shuffle law. The arborified Hoffman exponential tells us that the integrals $X^{\Psi_v(\tau)}$ are geometric in the sense that their arborification gives a geometric rough path, i.e., the usual rules of calculus are obeyed. We then search for a modification $\tilde f$ of the driving vector field $f$ such that This is equivalent to asking for the adjoint of the mapping $\Psi_v$ given in (19), which as observed above is exactly the Marcus extension.
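For orientation, the word-level Hoffman exponential that is being arborified can be computed directly: it sends a word of length $n$ to the sum over all compositions $(i_1, \ldots, i_l)$ of $n$ of the correspondingly contracted word, weighted by $\frac{1}{i_1! \cdots i_l!}$. The sketch below is illustrative only. Representing a contracted letter as a tuple of the original letters, as well as the function names, are assumptions of this example; no particular semigroup structure on the decoration set is used.

```python
from fractions import Fraction
from math import factorial

def compositions(n):
    """Yield all compositions (ordered tuples of positive integers) of n."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

def hoffman_exp(word):
    """Return the Hoffman exponential of a word as {contracted word: coefficient}.

    A contracted word is a tuple of blocks; each block is the tuple of
    consecutive original letters merged into a single new letter.
    """
    result = {}
    for comp in compositions(len(word)):
        blocks, start, coeff = [], 0, Fraction(1)
        for size in comp:
            blocks.append(tuple(word[start:start + size]))
            start += size
            coeff /= factorial(size)    # one inverse factorial per block
        result[tuple(blocks)] = coeff
    return result

for contracted, coeff in hoffman_exp(("a", "b", "c")).items():
    print(coeff, contracted)
```

A word of length $n$ yields $2^{n-1}$ terms, one for each composition of $n$.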

The Hairer-Kelly map
An alternative approach to the construction of a canonical rough path is given by Hairer and Kelly. In [26] they associate to any branched rough path $X^\tau$ and driving vector fields $f$ a geometric rough path $\tilde X^w$ and a new set of vector fields $\tilde f$ such that the solutions of the associated rough differential equations coincide. As for the Marcus extension, the set of decorations is expanded, but note that the Marcus extension and the Hairer-Kelly map are in some sense inequivalent. Indeed, the expanded decoration sets do not coincide.
where $\Delta$ is the coproduct in $H_{BCK}$ and $\gamma := \mathrm{id} - \epsilon$ is the augmentation projector.

For instance
Remark 2. It is interesting to note that the arborification and contracting arborification maps may be constructed using a recursion of the above form, where γ is replaced with the map that acts as the identity on all trees comprising a single node, and evaluates to zero on all other trees. Setting the range to be the shuffle Hopf algebra results in the arborification morphism, whilst if the range is quasi-shuffle we obtain the contracting arborification.
Central to the Hairer-Kelly approach is the construction of a geometric rough path $\tilde X$ for which $\tilde X(\psi(\tau)) = X^\tau$ for all rooted trees $\tau \in T$. Moreover, the construction is symmetric in the sense that $\tilde X(\tau_1 \otimes \tau_2) = \tilde X(\tau_2 \otimes \tau_1)$. There is a natural map $\pi : T(T) \to F$ from tensor products of trees to forests that maps $\tau_1 \otimes \cdots \otimes \tau_n \mapsto \tau_1 \cdots \tau_n$. Much of the algebraic manipulation underlying the Hairer-Kelly results can therefore be expressed in terms of the symmetrised map $\tilde\psi = \pi \circ \psi : F \to F$. It turns out that this can be characterised using the right comodule structure on forests $F$ associated to the substitution bialgebra.
For example, we have ψ̃( ) = (cm · σ)(1) + (cm · σ)( ) + 2 (cm · σ)( ) = + 2 + 2

Proof. To prove this, we note that the effect of the recursion in Definition 1 is to sum over all (not necessarily admissible) cuts of the tree $\tau$ into (ordered) subforests $\tau_1 \otimes \cdots \otimes \tau_n$, where the $\tau_i$ are ordered such that their roots respect the partial order of the vertices in $\tau$. Upon symmetrisation, this becomes the sum over all (unordered) subforests $\tau_1 \cdots \tau_n$ with a combinatorial factor coming from the number of linear extensions of the partial order of the roots of the $\tau_i$ to a total order. This cut coincides with the identification of possible subforests on which to conduct a substitution; the tree that results on the right-hand side of $\delta^+$ from contracting $\tau$ according to the subforest $\tau_1 \cdots \tau_n$ is precisely the tree τ associated to the partial order of the roots of the $\tau_i$. The number of linear extensions of τ is precisely $\mathrm{cm}(\tau)\sigma(\tau) = \frac{|\tau|!}{\tau!}$, see [30], hence the result follows.
Proof. The above recursion is of a form similar to that defining the antipode in a connected graded Hopf algebra. Indeed it is a sort of 'twisted' antipode. One can prove that this is the inverse of $\tilde\psi$, as we have $\tilde\psi(\tau) = \tau + \tau^{(1)} \cdot (\mathrm{cm} \cdot \sigma)(\tau^{(2)})$, and the term on the right is of strictly lower degree than the tree $\tau$. Any map of the form $\tilde\phi(\tau) = \tau + \tilde\phi_-(\tau)$, where $\tilde\phi_-(\tau)$ is of lower grade than $\tau$, has an inverse of the form $\tilde\phi^{-1}(\tau) = \tau - \tilde\phi^{-1}(\tilde\phi_-(\tau))$. Applying this to the above gives the result.
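The inversion principle used in this proof is generic: any linear map of the form identity plus a strictly degree-lowering part can be inverted by the stated recursion, which terminates because the degree drops at every step. The following sketch illustrates this on a toy graded basis. The basis labels, the degrees and the coefficients of the lowering part are hypothetical and do not correspond to actual trees or to the map $\tilde\psi$.

```python
from fractions import Fraction

def make_inverse(phi_minus, degree):
    """Build the inverse of phi(x) = x + phi_minus(x) on basis elements.

    phi_minus maps a basis element to {basis element: coefficient}, where all
    elements in the image have strictly lower degree than the argument.
    """
    cache = {}

    def inverse(x):
        if x not in cache:
            result = {x: Fraction(1)}
            for y, coeff in phi_minus.get(x, {}).items():
                if degree[y] >= degree[x]:
                    raise ValueError("phi_minus must strictly lower the degree")
                # recursion: inverse(x) = x - inverse(phi_minus(x))
                for z, d in inverse(y).items():
                    result[z] = result.get(z, Fraction(0)) - coeff * d
            cache[x] = result
        return cache[x]

    return inverse

def apply_linear(basis_map, vector):
    """Extend a map defined on basis elements linearly to a vector."""
    out = {}
    for x, c in vector.items():
        for z, d in basis_map(x).items():
            out[z] = out.get(z, Fraction(0)) + c * d
    return {z: c for z, c in out.items() if c != 0}

# Hypothetical graded basis and lowering part.
degree = {"a": 1, "b": 2, "c": 3}
phi_minus = {"b": {"a": 2}, "c": {"b": 3, "a": 1}}

def phi(x):
    """The map phi(x) = x + phi_minus(x) on a basis element."""
    image = {x: Fraction(1)}
    for y, coeff in phi_minus.get(x, {}).items():
        image[y] = image.get(y, Fraction(0)) + coeff
    return image

phi_inverse = make_inverse(phi_minus, degree)
round_trip = {x: apply_linear(phi_inverse, phi(x)) for x in degree}
print(round_trip)
```

Each basis element is returned unchanged by the round trip, which is the defining property of the inverse.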
The Hairer-Kelly approach is based around the following equivalence, which is an immediate consequence of the definition of the adjoint. Indeed, the intention is to define a new rough path $\tilde X_{st}$ obeying $\tilde X^{\tau}_{st} = X^{\tilde\psi^{-1}(\tau)}_{st}$. The key result is the following description of the flow map in terms of $\tilde X$, which is the basis for considering $\tilde X$ as a geometric rough path.

Theorem 4. [26]
The flow map $\phi_{st}$ for the rough differential equation driven by the vector fields $f_i$ and rough path $X$ may be written as a word series using the rough path $\tilde X$ as follows The rough path $\tilde X$ is then interpreted as a geometric rough path as indicated in the above formula, by expanding the set of decorations to include trees. The usual 'sewing' procedure allows us to consider only trees up to a given order depending on the regularity of the rough path $X$. As a geometric rough path, $\tilde X$ is symmetric, as follows from the description $\tilde X^{\tau_1 \cdots \tau_n} = X^{\phi^{-1}(\tau_1 \cdots \tau_n)}$ and the symmetry of $\tilde\psi^{-1}$. The grading of $\tilde X$ must then be based on word length, and sewing must be performed 'horizontally'. Hairer and Kelly show that this is possible using the Lyons-Victoir extension theorem.
Remark 3. Note that the words do not correspond to iterated integrals of $X$, but rather to polynomials, as can be seen from the definition via $\psi$. This is the reason for the symmetry of $\tilde X$.
Remark 4. We conclude by noting that the Hairer-Kelly extension is not the arborification of the Marcus extension. Indeed, this is related to the observation of Hairer and Kelly that their map contains more terms than required in the case of Wiener processes. The Marcus canonical extension answers the question about how to construct simplifications in more general cases where a quasi-shuffle relation is still present.