A solution theory for quasilinear singular SPDEs

We give a construction of local renormalised solutions to general quasilinear stochastic PDEs within the theory of regularity structures, thus greatly generalising the recent results of [BDH16,FG16,OW16]. Loosely speaking, our construction covers quasilinear variants of all classes of equations to which the general construction of [Hai14,BHZ16,CH16] applies, including in particular one-dimensional systems with KPZ-type nonlinearities driven by space-time white noise. In a less singular and more specific case, we furthermore show that the counterterms introduced by the renormalisation procedure are given by local functionals of the solution. The main feature of our construction is that it allows one to exploit a number of existing results developed for the semilinear case, so that the number of additional arguments it requires is relatively small.


Introduction
Amidst the recent heightened interest in singular stochastic partial differential equations (SPDEs), three different methods [BDH16,FG16,OW16] have been developed to extend the theory to quasilinear equations. The first two of these worked with paracontrolled calculus, while [OW16] introduced a new variation of previous techniques to treat singular SPDEs which is closer to the theories of rough paths and regularity structures, but flexible enough to cover quasilinear variants. For a comparison between them in terms of scope we refer the reader to the introduction of [FG16], but it should be noted that in a sense all of them deal with the 'first interesting' case, when the noise is just barely too rough for the product a(u)∆u to make sense. In particular, quasilinear variants of the KPZ equation, or for example the parabolic Anderson model in a generalised form in 3 dimensions, are all outside of the scope of these works. One exception is the forthcoming work [OSSW17], which extends [OW16] to the next regime of regularity and includes noises slightly more regular than space-time white noise in 1 dimension (similar to the setting of our example (1.1) below).
In the present article, we tackle this problem within the framework of regularity structures. The generality in which we succeed in building local solution theories is, in some sense, optimal: loosely speaking, we show that if an equation can be solved with regularity structures and its solution has positive regularity, then its quasilinear variants can also be solved (locally). We deal with both the analytic and the probabilistic side of the theory, in the sense that we show that the general machinery developed in [CH16] can be exploited in order to produce random models that precisely fit our needs. Another major advantage of our approach is that its formulation allows us to leverage many existing results from the semilinear situation without having to reinvent the wheel. This is why, despite its much greater generality, this article is significantly shorter than the works mentioned above.
The only disadvantage of our approach, compared to [BDH16,FG16,OW16], is that it is not at all obvious a priori why the counterterms generated by the renormalisation procedure should be local in the solution. The reason for this is that our method relies on the introduction of additional "non-physical" components to our equation, which are given by some non-local non-linear functionals of the solution, and we cannot rule out in general that the counterterms depend on these non-physical terms. We do however address the question of the precise form of the counterterms generated by the renormalisation procedure in a relatively simple case, where we verify that, provided that the renormalisation constants are chosen in a specific way (a choice that still allows one to show convergence of the underlying renormalised model, although it differs in general from the BPHZ renormalisation introduced in [BHZ16]), all non-local contributions cancel out exactly. The reason for the lack of a general statement is that the algebraic machinery developed in [CH16,BCCH17], which allows one to show that counterterms are always local in the semilinear case, does not appear to be applicable in a simple way. However, we do expect that this is something that could be addressed in future work.
Theorem 1.1. There exist deterministic smooth functions $C^\varepsilon_\cdot$ such that for all $u_0 \in \mathcal{C}^{\nu}$ there exists a random time $\tau > 0$ such that $u_\varepsilon$ converges in probability in $\mathcal{C}([0,\tau] \times \mathbf{T}) \cap \mathcal{C}^{1/2}_{\mathrm{loc}}((0,\tau] \times \mathbf{T})$ to a limit $u$. Furthermore, with a suitable choice of $C^\varepsilon_\cdot$, one can ensure that the limit $u$ is independent of $\rho$.
Remark 1.2. We would like to stress again that we only need the condition $\nu > 0$ in order to guarantee that the counterterms created by the renormalisation of the underlying model are local functions of $u$. The rest of the argument works down to $\nu > -\tfrac12$ (including in particular the case of space-time white noise), at which point the conditions of [CH16, Thm 2.14] are violated and one expects a qualitative change of the scaling behaviour of the solution. Similarly, we consider a scalar equation driven by a single noise purely for the sake of notational convenience. The exact same proof also applies for example to systems of the type with $a$ taking values in some compact set of strictly positive definite symmetric matrices and implicit summation over $j, k$.
The remainder of this article is structured as follows. In Section 2, we first give an equivalent formulation of a general quasilinear SPDE, which is the main observation on which this article is based. The main purpose of this reformulation is to write the equation in integral form in a way that resembles the mild formulation for semilinear problems. In particular, this can be done in such a way that the product $a(u)\,\partial_x^2 u$ never appears and is replaced instead by seemingly more complicated terms that however exhibit better scaling / regularity properties. In Section 3, we then show how to build a suitable regularity structure that allows us to formulate the fixed point problem derived in Section 2. This is very similar to what is done in [Hai14,BHZ16], with the unusual twist that each symbol represents an infinite-dimensional subspace of the resulting regularity structure, rather than a one-dimensional one. The formulation of the fixed point problem is then done in Section 4. Finally, we treat a concrete example in Section 5, where we also verify "by hand" that in this case the renormalisation procedure does indeed only produce local counterterms.
Acknowledgements MH gratefully acknowledges support by the Leverhulme Trust and by an ERC consolidator grant, project 615897.

An equivalent formulation
The main observation on which this article builds is that, at least for smooth drivers / solutions, a quasilinear equation is equivalent to another equation whose principal (smoothing) part does make sense even in the limit when the driving noise is taken to be rough. The 'right-hand side' of this new equation may however exhibit ill-defined products (sometimes even if the original right-hand side did not), but that situation is already closer to the ones that the theory of [Hai14] was developed for.
To describe this alternative formulation, we restrict our attention to the case of perturbations of the heat equation on the one-dimensional torus $\mathbf{T}$, but it is straightforward to generalise this to other situations. In this case, one wants to solve initial value problems of the type
$$\partial_t u = a(u)\,\partial_x^2 u + F(u,\xi)\;, \qquad u(0,\cdot) = u_0\;, \tag{2.1}$$
where $a$ is a smooth function taking values in $K$ for some compact $K \subset (0,\infty)$, $F$ is a subcritical (in the sense of [Hai14,BHZ16]) local nonlinearity, and $\xi$ is a noise term which, for (2.1) to make sense, is assumed to be smooth for the moment.
Remark 2.1. Assuming that we are interested in noises $\xi \in \mathcal{C}^{\alpha-2}$ for $\alpha \in (0,1)$, so that potential solutions are expected to be of class $\mathcal{C}^{\alpha}$, $F$ being subcritical is equivalent to assuming that it is of the form
$$F(u,\xi) = F_0(u,\partial_x u) + F_1(u)\,\xi\;,$$
where $F_0 \colon \mathbf{R}^2 \to \mathbf{R}$ and $F_1 \colon \mathbf{R} \to \mathbf{R}$ are smooth functions and the dependence of $F_0$ on its second argument is polynomial of degree strictly less than $(2-\alpha)/(1-\alpha)$.
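The degree bound in Remark 2.1 can be recovered by a heuristic power count. The following is only a sketch, assuming that the nonlinearity contains a monomial of the form $g(u)(\partial_x u)^n$ with $g$ smooth (the names $g$ and $n$ are introduced here purely for illustration), that degrees of products add, and that subcriticality requires every nonlinear term to have degree strictly above that of the noise:

```latex
% Heuristic power counting: u has regularity \alpha \in (0,1),
% the noise \xi has regularity \alpha - 2.
\begin{align*}
  \deg(\partial_x u) &= \alpha - 1 < 0 , \\
  \deg\bigl(g(u)\,(\partial_x u)^n\bigr) &= n(\alpha - 1) , \\
  n(\alpha - 1) > \alpha - 2
  \quad&\Longleftrightarrow\quad
  n < \frac{2-\alpha}{1-\alpha} .
\end{align*}
% Example: \alpha = 1/2 gives n < 3, so quadratic (KPZ-type) terms are
% allowed while cubic terms in \partial_x u are not.
```

As $\alpha \to 1$ the bound tends to infinity, while as $\alpha \to 0$ it tends to $2$, so in that regime only terms at most linear in $\partial_x u$ remain admissible.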
It will be convenient to write the equation in a more 'global' way: setting $f = \mathbf{1}_{t>0} F(u,\xi) + \delta \otimes u_0$, where $\delta$ is the Dirac mass at time $0$ and both $F$ and $u_0$ are extended periodically to all of $\mathbf{R}$, (2.1) is equivalent to
$$(\partial_t - a(u)\,\partial_x^2)\,u = f\;. \tag{2.2}$$
For $c > 0$, denote by $P(c,\cdot)$ the Green's function of $\partial_t - c\,\partial_x^2$ on $\mathbf{T}$. Note that $P$ is smooth as a function of $c$ away from the origin and one has the identity
$$\frac{\partial^\ell}{\partial c^\ell} P(c,\cdot) = \ell!\;\partial_x^{2\ell}\, \underbrace{P(c,\cdot) * \cdots * P(c,\cdot)}_{\ell+1 \text{ times}}\;, \tag{2.3}$$
where the convolutions are in space-time. Introduce operators $I^{(k)}_\ell$ acting on smooth functions $b$ and $f$ by
$$\bigl(I^{(k)}_\ell(b,f)\bigr)(z) = \int (\partial_c^\ell \partial_x^k P)\bigl(b(z), z - z'\bigr)\, f(z')\,dz'\;.$$
We will also use the shorthands $I = I^{(0)}_0$, $I' = I^{(1)}_0$, $I_1 = I^{(0)}_1$, etc. Note that these operators are linear in their second argument, but not in the first. Note also that, by a simple integration by parts, one has the identities
Even though $I(b,f)$ is of course not the same as the solution map to $(\partial_t - b\,\partial_x^2)u = f$ if $b$ is non-constant, it turns out that solving (2.2) is equivalent to solving an equation of the type
$$u = I(a(u),\hat f)\;, \tag{2.5a}$$
where $\hat f = \mathbf{1}_{t>0}\hat F(u,\xi) + \delta \otimes u_0$ for some modified (non-local) nonlinearity $\hat F$. Since $I$ does make perfect sense for arbitrary $b \in \mathcal{C}^{0+}$ and $f \in \mathcal{C}^{-2+}$ (which are the expected regularities of the coefficient and the right-hand side, respectively, even in the limit), this moves all the ill-defined terms into the definition of $\hat F$.
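The first-order case of the identity for the derivatives of $P$ with respect to $c$ can be checked by differentiating the defining equation of the Green's function. The following is a sketch for $\ell = 1$ only, assuming that $P(c,\cdot)$ and $\partial_c P(c,\cdot)$ both vanish for negative times so that the formal inversion is justified:

```latex
% P(c,\cdot) is the Green's function of \partial_t - c\,\partial_x^2, i.e.
\begin{align*}
  (\partial_t - c\,\partial_x^2)\,P(c,\cdot) &= \delta_0 .
\intertext{Differentiating both sides with respect to $c$ gives}
  (\partial_t - c\,\partial_x^2)\,\partial_c P(c,\cdot) &= \partial_x^2 P(c,\cdot) ,
\intertext{and inverting $\partial_t - c\,\partial_x^2$ by space-time convolution
with $P(c,\cdot)$ yields}
  \partial_c P(c,\cdot) &= P(c,\cdot) * \partial_x^2 P(c,\cdot)
    = \partial_x^2 \bigl( P(c,\cdot) * P(c,\cdot) \bigr) .
\end{align*}
```

In particular, a derivative in the parameter $c$ costs two spatial derivatives but gains one convolution with $P$, so that $\partial_c P$ has the same scaling behaviour as $P$ itself.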
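Verifying the equivalence is elementary as long as all functions involved are smooth: suppose that $u$ satisfies (2.5a) and apply $\partial_t - a(u)\partial_x^2$ to both sides of this equation. Denoting the expression $(\partial_t - a(u)\partial_x^2)u$ by $f$ and writing $I'_1 = I^{(1)}_1$, one then has
$$f = \hat f + a'(u)\, f\, I_1(a(u),\hat f) - (aa'')(u)\,|\partial_x u|^2\, I_1(a(u),\hat f) - (a(a')^2)(u)\,|\partial_x u|^2\, I_2(a(u),\hat f) - 2(aa')(u)\,\partial_x u\, I'_1(a(u),\hat f)\;.$$
One can rearrange the above as a fixed point equation for $\hat f$ by writing it as
$$\hat f = \bigl(1 - a'(u)\, I_1(a(u),\hat f)\bigr) f + (aa'')(u)\,|\partial_x u|^2\, I_1(a(u),\hat f) + (a(a')^2)(u)\,|\partial_x u|^2\, I_2(a(u),\hat f) + 2(aa')(u)\,\partial_x u\, I'_1(a(u),\hat f)\;. \tag{2.7}$$
Since $f$ is required to be of the form $\mathbf{1}_{t>0}F(u,\xi) + \delta \otimes u_0$, we can look for solutions to this fixed point problem that are also of the form $\hat f(t,x) = \mathbf{1}_{t>0}\hat F(t,x) + \delta(t)u_0(x)$. To see this, define the operators $\hat I^{(k)}_\ell$ by
$$\bigl(\hat I^{(k)}_\ell(b,u_0)\bigr)(z) = \int_{\mathbf{T}} (\partial_c^\ell \partial_x^k P)\bigl(b(z), (t, x - x')\bigr)\, u_0(x')\,dx'\;,$$
where we use the convention $z = (t,x)$. (This is really how all the terms involving $\delta$ should be interpreted in (2.5a)-(2.7).) The function $I_1(a(u),\hat F)$ is continuous and vanishes at time $0$, as does $\hat I_1(a(u),u_0)$ for any $u_0$ of strictly positive regularity, as one can see from (2.3) for example. Thus (2.7) can be written as a fixed point problem for $\hat F$:
$$\begin{aligned}
\hat F = {}& \bigl(1 - a'(u)\,(I_1(a(u),\hat F) + \hat I_1(a(u),u_0))\bigr)\,\mathbf{1}_{t>0} F(u,\xi) \\
&+ (aa'')(u)\,|\partial_x u|^2\,\bigl(I_1(a(u),\hat F) + \hat I_1(a(u),u_0)\bigr) \\
&+ (a(a')^2)(u)\,|\partial_x u|^2\,\bigl(I_2(a(u),\hat F) + \hat I_2(a(u),u_0)\bigr) \\
&+ 2(aa')(u)\,\partial_x u\,\bigl(I'_1(a(u),\hat F) + \hat I'_1(a(u),u_0)\bigr)\;.
\end{aligned} \tag{2.5b}$$
If $u_0$ is sufficiently smooth, say $\mathcal{C}^3$, one can write the system (2.5) as a fixed point problem
$$(u,\hat F) = A_{u_0,\xi}(u,\hat F)\;, \tag{2.9}$$
where $A_{u_0,\xi}$ is a contraction on a ball of a suitable product of spaces of space-time Hölder regular distributions that vanish for negative times. Indeed, for the first coordinate of $A_{u_0,\xi}$ this is immediate from classical Schauder estimates. For the second coordinate it suffices to notice that, thanks to Remark 2.1, the right-hand side of (2.5b) is locally Lipschitz continuous between suitable spaces of this type, and the norm of the relevant injection is proportional to a positive power of $t$. Using a version of this argument with temporal weights, it is straightforward to show that (2.9) admits a unique local solution $(u,\hat F)$. Furthermore, the preceding calculations show that, as long as the function $g$ given by
$$g = a'(u)\,\bigl(I_1(a(u),\hat F) + \hat I_1(a(u),u_0)\bigr)$$
is strictly smaller than $1$, one does indeed have
$$(\partial_t - a(u)\,\partial_x^2)\,u = \mathbf{1}_{t>0}F(u,\xi) + \delta \otimes u_0\;. \tag{2.10}$$
Since, by the same reasoning as above, $g$ is continuous and $g \to 0$ as $t \to 0$, the claim follows.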
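The rearrangement used in this verification rests on the chain rule applied to the frozen-coefficient kernel. As a sketch, assume that $I^{(k)}_\ell(b,f)(z)$ denotes integration of $f$ against $(\partial_c^\ell \partial_x^k P)(b(z), z - \cdot)$ (an assumption about the definition, chosen to be consistent with the shorthands $I$, $I'$, $I_1$ of the text), and abbreviate $I^{(k)}_\ell = I^{(k)}_\ell(a(u),\hat f)$ with $I'' = I^{(2)}_0$ and $I_1' = I^{(1)}_1$. The derivatives of $u = I(a(u),\hat f)$ then read:

```latex
% Sketch only; all operators are evaluated at (a(u), \hat f).
\begin{align*}
  \partial_t u &= \int (\partial_t P)\bigl(a(u(z)), z - z'\bigr)\,\hat f(z')\,dz'
                  + a'(u)\,\partial_t u\; I_1 , \\
  \partial_x u &= I' + a'(u)\,\partial_x u\; I_1 , \\
  \partial_x^2 u &= I'' + 2\,a'(u)\,\partial_x u\; I_1'
                  + \bigl(a''(u)\,(\partial_x u)^2 + a'(u)\,\partial_x^2 u\bigr) I_1
                  + a'(u)^2\,(\partial_x u)^2\, I_2 .
\end{align*}
% Use (\partial_t - c\,\partial_x^2) P(c,\cdot) = \delta_0 at the frozen
% value c = a(u(z)) to collapse the leading terms to \hat f:
\begin{align*}
  f = (\partial_t - a(u)\,\partial_x^2)\,u
    &= \hat f + a'(u)\, f\, I_1
       - (a a'')(u)\,(\partial_x u)^2\, I_1
       - \bigl(a (a')^2\bigr)(u)\,(\partial_x u)^2\, I_2
       - 2\,(a a')(u)\,\partial_x u\; I_1' .
\end{align*}
```

Solving this relation for $\hat f$ produces a fixed point equation in which $f$ appears with the prefactor $1 - a'(u)\,I_1$, which is why the smallness of $a'(u)\,I_1$ near the initial time matters for recovering the original equation.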
Moreover, if |u 0 | C 0+ ≤ C for some constant C, then for any fixed t 1 > 0, the A reasonable solution theory of (2.9) -which of course will require a renormalisation of the right-hand side of (2.9) -is expected to imply that for some t 1 > 0, this norm is uniformly bounded over a given family of smooth approximations of the 'true' noise ξ, and hence t 0 is uniformly bounded away from 0. It is not difficult to convince oneself that this argument is quite robust. For example, if in (2.1) the operator ∂ 2 x is replaced by ∂ 2k x for some k ∈ N, or if in higher dimensions, a(u) is matrix-valued and acts on the Hessian of u in a non-degenerate way, similar arguments show the analogous equivalence, of course with I built from a suitably modified family of parametrised kernels. It therefore suffices to solve equations of the type (2.5), which one can do using the framework of regularity structures as we will demonstrate in the remainder of this article.

Regularity structures with continuous parameter-dependence
It should be clear at this point that we would like to encode in our regularity structure the integration against all kernels P (c, ·), as well as some of their derivatives with respect to x and c. Since there is a continuum of them and since one wants to have some control over the dependence on c, this requires a modification of the construction in [Hai14].
The starting point of our construction however is very similar to that given in [Hai14,BHZ16] and we quickly recall it here, mainly to fix notation. We fix a dimension $d \ge 1$ and a scaling $s \in \mathbf{N}^d$. We assume that we are given a finite index set $L = L_+ \sqcup L_-$ as well as a map $\alpha \colon L \to \mathbf{R} \setminus \{0\}$ which is positive on $L_+$ and negative on $L_-$. We build from this a set of symbols $F$ by decreeing it to be the minimal set satisfying the following properties.
• There are symbols $\Xi_i$ and $X^k$ belonging to $F$ for all $i \in L_-$ and any $d$-dimensional multiindex $k$. We also write $\mathbf{1} = X^0$ and $X_i = X^{e_i}$ with $e_i$ the $i$th canonical basis vector.
• For any $\tau, \tau', \tau'' \in F$, one also has $\tau\tau' \in F$, and $\tau(\tau'\tau'')$ and $(\tau\tau')\tau''$ are identified. We also identify $X^k X^\ell$ with $X^{k+\ell}$, $X^k\tau$ with $\tau X^k$, and $\mathbf{1}\tau$ with $\tau$.
• For any $j \in L_+$, any $d$-dimensional multiindex $k$ and any $\tau \in F$, one also has a symbol $I^{(k)}_j \tau \in F$.
Remark 3.1. It is important here that, unlike in [Hai14,BHZ16], we do not identify $\tau\bar\tau$ with $\bar\tau\tau$! The freedom to leave these as separate symbols will be very convenient later on.
We naturally associate degrees | · | to these symbols by postulating that We then consider the map G : F → P(F) (the set of all subsets of F) defined as the minimal map (P(F) being ordered by inclusion) satisfying the following properties.
• One has τ ∈ G(τ ) for every τ ∈ F, and one has The motivation for this definition is that these properties of the set G(τ ) guarantee that every element of the structure group associated to our regularity structure as in [BHZ16] maps any given symbol τ into the linear span of G(τ ). This allows us to give the following definition of a subcritical set W, which one should think of any subset of F that generates an actual regularity structure (one in which the set of possible degrees is locally finite and bounded from below).
Definition 3.2. A subset W ⊂ F is said to be subcritical if it satisfies the following properties.
It is said to be normal if, whenever $\tau\bar\tau \in W$, one has $\{\tau,\bar\tau\} \subset W$ and, whenever $I^{(j)}_\ell \tau \in W$, one has $\tau \in W$.
As shown in [Hai14,BHZ16], every locally subcritical stochastic PDE (or system thereof) naturally determines a normal subcritical set W. From now on, we consider W to be fixed and we only ever consider elements τ ∈ W.

A regularity structure
In [Hai14,BHZ16], one then constructs a regularity structure by taking the vector space W generated by W as the structure space (graded by the notion of degree given in (3.1)) and endowing it with a suitable structure group. In our situation, to encode parameter dependence, we instead assign to each element of W a typically infinite-dimensional subspace of the structure space. In order to encode this, we first define the 'number of parameters' [τ ] in a symbol τ recursively by setting Remark 3.3. One could in principle encode some parametrisation of the noises as well, by setting [Ξ i ] = 1, or we could even allow the number of parameters to depend on the element of L we consider. Since this generality is not used in the sequel, we refrain from doing so here.
We also assume that we are given a real Banach space B and we write B k for the k-fold tensor product of B with itself, completed under the projective cross norm. In particular, we have a canonical dense embedding of B k ⊗ B ℓ into B k+ℓ . We also use the convention B 0 = R. Given a normal subcritical set of symbols W, we then construct a regularity structure from it in such a way that each symbol τ ∈ W determines an infinite-dimensional subspace T τ of the structure space T , isometric to B [τ ] . To wit, we set and equip the spaces T α with their natural norms. Here, we wrote τ for the one-dimensional real vector space with basis τ . (By the definition of subcriticality, there are only finitely many symbols τ with |τ | = α, so that the T α are naturally endowed with a Banach space structure. This is not the case for T itself though, but we view it as a topological vector space in the usual way.) For τ with [τ ] = 0, we also identify T τ with τ . The space T comes equipped with a number of natural operations. For every i ∈ {1, . . . , d}, we have an abstract differentiation D i acting on F by setting D i X j = δ ij 1, , and then extending it to all other symbols by enforcing that Leibniz's rule holds. For any τ ∈ W such that D i τ ∈ W and any b ∈ B [τ ] , we then set Similarly, for the abstract product, whenever τ,τ , ττ with b ⊗b interpreted as an element of B [ττ ] . (Here it is convenient that ττ andτ τ aren't identified since it avoids being forced to deal with symmetric tensor products.) Finally, we have a large number of abstract integration operators: for any j ∈ L + and any b ∈ B, a map I (k),b j is defined as the linear extension of defined on those T τ for which I (k) j τ ∈ W. So far we have not addressed the structure group at all, but its inductive construction as in [Hai14,Thm 5.14] is virtually identical in our setting. 
More precisely, as in [Hai14, Defs 4.6 & 5.25] the group G consists of those continuous linear operators Γ : T → T satisfying the following properties.
• One has Γ1 = 1, ΓΞ i = Ξ i and there are constants c i such that ΓX i = X i − c i 1.
• For any τ ∈ W such that D i τ ∈ W and any a ∈ T τ , one has ΓD i a = D i Γa.
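• For any $\tau \in W$ such that $I^{(k)}_\ell \tau \in W$, any $a \in T_\tau$ and any $b \in B$, one has
As in [Hai14], one can show that this is indeed a group. The definitions of the structure group and of the sets $G(\tau)$ also guarantee that, for any $\tau \in W$, any element $\Gamma$ of the structure group maps $T_\tau$ to $\bigoplus_{\sigma \in G(\tau)} T_\sigma$, which is indeed a subspace of $T$ by the assumption on $W$.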
From now on, we write T for the regularity structure with structure space T and structure group G given as above with the specific choice for some compact parameter space K, which is assumed for simplicity to be a subset of a R d 1 , as well as some sufficiently large N > 0 to be determined later.

Admissible models
We assume henceforth that $L_+$ and $L_-$ are singletons, and therefore omit the lower indices in $\Xi$, $I$. This is purely for the sake of notational convenience; this section immediately extends to the general case. We also omit $k$ in $I^{(k)}$ and $I^{(k),b}$ if $k = 0$ and we set $\alpha = |\Xi|$ and $\beta = |I\tau| - |\tau|$.
where δ 0 is the Dirac mass at the space-time origin and f (c) are compactly (and away from the origin) supported smooth functions, depending smoothly on c.
By the assumption on K, $K^\zeta$ is also $\beta$-smoothing in the sense of [Hai14, Ass. 5.1] and, when considering its decomposition $K^\zeta = \sum_{n \ge 0} K^\zeta_n$, one has a bound of the type $|D^k K^\zeta_n| \lesssim 2^{n(|s| - \beta + |k|_s)}\,|\zeta|_{\mathcal{C}^{-N_0}}$ for any fixed $N_0 > 0$.
As our notation suggests, we want the maps I ζ to correspond to integrations against the kernels K ζ , which is encoded in the definition of admissibility in the present setting.
Definition 3.6. In the above setting, a model (Π, Γ) is admissible for T if, for all α ∈ A, τ ∈ W, such that |τ | = α and Iτ ∈ W, for all σ ∈ T τ , ζ ∈ C −N , x ∈ R d , and ϕ ∈ C ∞ 0 the following identity holds One can define the maps J ζ as in [Hai14], with K therein replaced by K ζ . With this notation the second term on the right-hand side above can be also written as (Π x J ζ (x)σ)(ϕ).
In the following we borrow the notations ψ λ x , Π − Π ′ γ; B , Γ − Γ ′ γ; B from [Hai14], and denote by B the set B −⌊α⌋ of test functions considered there. (This is in order to prevent confusion with the scale of spaces B k .) In fact, the lower indices in the norms of the models will usually be omitted for brevity, since the dependence on them does not play any role in our discussion.

Constructing models
In principle, if one has a sufficiently robust way of building a model (or a family of models with some continuity properties) for the (usual) regularity structure determined by W, one can also build an admissible model for the parametrised regularity structure T. Such a 'robust way of building models' is developed in great generality in [CH16]. To keep the presentation self-contained as far as the assumptions needed to state some of its results are concerned, we restrict our attention to the Gaussian case and refer the reader to [CH16] for more general noises that fit in the theory.
Assumption 3.7. Suppose we are given a centred, Gaussian, translation invariant, S ′ (R d )valued random variable ξ, such that there exists a distribution C whose singular support is contained in {0}, which satisfies for all test functions f, g ∈ S(R d ). Writing z → C (z) for the smooth function that determines C away from 0, it is furthermore assumed that any test function g satisfying D k g(0) = 0 for all multiindex k with |k| s < −|s| − 2α, one has Finally, there exists a κ > 0, such that for all multiindex k The final assumption on W is what is referred to as super-regularity in [CH16], which in the present setting reads as follows. Define, similarly to  Take, as in the introduction, a compactly supported nonnegative symmetric smooth function ρ integrating to 1, and set ρ ε (t, x) = ε −|s| ρ(z 1 ε −s 1 , . . . , z d ε −s d ). Under the above assumptions, we wish to construct a family of admissible models (Π ε ,Γ ε ) ε∈[0,1] that is continuous in a suitable sense in the ε → 0 limit, and which satisfyΠ ε z Ξ = ρ ε * ξ (here and below we use the natural convention of ρ 0 * denoting the identity).
Let Z ε = (Π ε , Γ ε ) for ε ∈ [0, 1] be the family of BPHZ models for TB as constructed in [BHZ16,CH16], which satisfy in particular Π ε z Ξ = ρ ε * ξ. One can then define the random distributionsΠ ε x a := Π ε x ι(a), (3.5) for a ∈ S. Note that formally the right-hand side also depends onB (the regularity structure TB in which ι(a) takes values depends on it, as well as the model Z ε ), but our construction is such that different choices ofB yield the same right hand side in (3.5). By [CH16], the random fieldsΠ x satisfy the bounds with some θ > 0, where here and below the second supremum is taken over x in some compact set, λ ∈ (0, 1], and ψ ∈ B. The random fieldΠ can then be turned into an admissible model for T in the following sense. Theorem 3.9. There exist admissible modelsẐ ε = (Π ε ,Γ ε ) with ε ∈ [0, 1] for T such that for all a ∈ S ∩ T ,Π ε x a =Π ε x a almost surely, and that one has the bounds for any c,c ∈ K [τ ] . Choosing p large enough, by Kolmogorov's continuity theorem one has a continuous modification (Π x a c,ℓ (τ )) c∈K [τ ] such that the admissibility condition is satisfied almost surely, and that one has the bound E sup x,λ,ψ sup c∈K [τ ] λ −p|τ | |(Π ε x a c,ℓ (τ ))(ψ λ x )| p 1.

Note that a generic element of S ∩T is of the form
and extending these maps to all of T by linearity and continuity, we get mapsΠ ε x that are admissible and that satisfy The corresponding bounds on the differencesΠ ε x −Π 0 x is obtained similarly, so it remains to treat the mapsΓ ε .
We proceed inductively. The definition of, and the appropriate bounds on,Γ ε xy τ if τ = Ξ or X k , are trivial. GivenΓ ε xy (ζ ⊗ τ ) andΓ ε xy (ζ ⊗τ ) with the right bounds, we set which one can bound by where we used our assumption on the spaces B k to obtain the second line. GivenΓ ε xy ζ ⊗ τ , we also setΓ ε xy (ζ ⊗ D i τ ) = D iΓ ε xy (ζ ⊗ τ ), for which the correct bounds follow automatically.
The only step to finish the induction is thus to define and boundΓ ε xy ζ ⊗ Iτ , provided Γ ε xy ζ ′ ⊗ τ are known. This is done as in [Hai14, Thm 5.14]: for ζ 1 ∈ B, a = ζ ′ ⊗ τ , we set (3.7) The first term on the right-hand side is harmless. Bounding the second one is immediate: thanks to the assumed bound onΓ ε xy ζ ′ ⊗ τ . Using again the assumptions on the spaces B k , this is precisely the required bound. To bound the third term on the right-hand side of (3.7), it suffices to recall [Hai14,Lem 5.21], with the kernel K therein replaced by K ζ 1 . Having the required bounds on elements of the formΓ ε xy (ζ 1 ⊗ ζ ′ ⊗ Iτ ), one can extendΓ ε xy to all a ∈ T Iτ once again via linearity and continuity. It is straightforward to check that the above defined mapsΓ ε xy do indeed belong to G, and hence the proof is finished.
The construction given above is essentially the same, but now each symbol τ determines a smooth function C ε τ : The construction of the renormalised model is then the same as in [BHZ16].

Lifting the operator I
We continue within the setting of the previous section. Now that we have abstract integration operators $I^\zeta$ on $T$ that can in principle be used as in [Hai14, Sec. 4] to build the operation of convolution with any of the $K^\zeta$, we are also able to construct the abstract counterpart of the operators $I^{(\ell)}_k$, acting on suitable spaces of modelled distributions. From now on we assume $d > 1$ and the first coordinate will be viewed as time. We work with the spaces $\mathcal{D}^{\gamma,\eta}_P$ defined as in [Hai14, Sec 6], with $P = \{(0,x) : x \in \mathbf{T}^{d-1}\}$. It will be clear that, apart from notational inconvenience, there is no fundamental obstacle to obtaining analogous results for more complicated weighted spaces, like for example those considered in [GH17] that are suitable for solving initial-boundary value problems.
Given the setup of the previous section and an admissible model $(\Pi,\Gamma)$, one can define the maps $\mathcal{K}^\zeta_m$ by replacing $I$ and $K$ in [Hai14, Eq 5.15] by $I^{(m),\zeta}$ and $D^m K^\zeta$, respectively, provided $|m|_s < \beta$. As before, we denote $\mathcal{K}^{c;\ell}_m := \mathcal{K}^{\partial^\ell \delta_c}_m$, and for $m = 0$ the lower index is omitted.
We now define the lift of I by a sort of 'higher order freezing of coefficients' where, around a given fixed point z 0 , we don't simply describe I(b, f ) by K b(z 0 );0 f , but also use higher order information about b. Set, withb = b, 1 andb = b −b, where N ′ is a sufficiently large integer. (How large exactly will be specified in the statement of Theorem 4.4 below. Since the exact value of N ′ does not make much of a difference for our purpose, we do not explicitly keep track of it in our notations.) In the following we treat only I := I (0) 0 . The Schauder estimate for I (m) k can then be formally obtained by changing the family of kernels (K (c) ) c∈K to (∂ k c ∂ m x K (c) ) c∈K , as well as β to β − |m| s , and apply the Schauder estimate for the map I built from this family.
Note that the definition (4.1) is very reminiscent of how one composes modelled distributions with smooth functions F , see [Hai14,Sec 4.2]. To justify this analogy, one needs a substitute for the Taylor expansion of F , which is precisely the content of Corollary 4.3 below. Thanks to this (of course not coincidental, see Remark 4.6 below) similarity, the Schauder estimates for I will follow immediately from the one for 'constant coefficients' (Theorem 4.2 below), and a straightforward adaptation of the proof of [Hai14, Prop 6.13].
We assume henceforth that the kernels K (c) are non-anticipative, namely that they vanish for negative times. One then has the following, see [Hai14, Thm 7.1].
Remark 4.5. Note that if α 1 + α + β < 0, then the equality (4.4) fails to hold in general even for canonical models built from a smooth noise.
It then follows from the multiplicativity of the action of the structure group that The term A 1 can be bounded precisely as in [Hai14,Prop 6.13], with the only minor difference that the smooth function F (ℓ) (·, x) that b is substituted into, takes values in T instead of R. One then gets where in the above sum m 2 runs over homogeneities of IW +T , in particular its smallest value is (α + β) ∧ 0 ≥γ − γ 1 . Therefore, where in the last step we used η 1 ≥η ∨ 0 and α + β >η. On the other hand, For a fixed model, bounding |||I(b, R + f ); I(b, R +f )|||γ ,η;t is immediate from the above thanks to the linearity of I in the second argument. To bound |||I(b, R + f ); I(b, R + f )|||γ ,η;t , one can write, as in the proof of [Hai14, Thm 4.16], where the sum over i runs over 1, . . . , d 1 , and e i is the i-th unit vector in R d 1 . Now one can repeat the preceding calculation, 'gaining' a factor |||b ′ ||| γ 1 ,η 1 ; t at each step. Finally, to bound |||I(b, R + f ); I(b, R +f )|||γ ,η;t for two different models, one can employ the trick in [HP15,Prop 3.11].
Remark 4.6. The same argument actually shows that if c → F (c, ·) is a smooth function from K to Dγ ,η P and b =b1 +b is as in the statement, then the function G given by belongs to Dγ ,η P . This statement then has both the first part of Theorem 4.4 and [Hai14, Thm 4.16] as corollaries.
To formulate the abstract counterpart of (2.5b), it remains to lift the operatorsÎ (ℓ) k . Using the notation and identifying this function with its lift via its Taylor expansion, we define, similarly to I, Further to the preceding we fix a non-integer exponent 1 > η 0 >η. This time the 'constant coefficient' result we rely on is the following variant of [Hai14, Lem 7.5].
Lemma 4.8. Assume β = s 1 . Let V , N ′ , b andb be as in Theorem 4.4, and let u 0 ,ũ 0 ∈ C η 0 . ThenÎ(b, u 0 ) ∈ Dγ ,η P and withκ = (η 0 −η)/s 1 one has the bounds, for any t ∈ (0, 1], Moreover, the following identity holds Proof. The proof goes precisely as that of Theorem 4.4, with the only slight difference that the 'constant coefficient' estimate seemingly does not help in obtaining a positive power of t. Note however that whenever η 0 > 0, for any c ∈ K and nonzero multiindex ℓ, K c;ℓ u 0 , 1 vanishes at the initial time. In particular, whenever η 0 < 1, all components of K c;ℓ u 0 lower than η 0 vanish at the initial time, and hence (see [Hai14,Lem 6.5]) one gets the estimate It remains to notice that in the calculation for |||Î(b, u 0 ) ;Î(b, u 0 )|||γ ,η;t analogous to (4.5) we only ever encounter instances of K with nonzero derivatives with respect to the parameter c, hence the claimed factor tκ in the lemma is indeed obtained.

A concrete example
At this point, we have a completely automatic solution theory: given a quasilinear equation like (2.1), its solution is defined as RU , where U is obtained from the system of abstract equations If F was a subcritical nonlinearity to begin with and α > −2, then the above system is again subcritical, so one can use the construction of Section 3 to build the corresponding regularity structure. Provided it satisfies Assumption 3.8, one can use [CH16] in the form of Theorem 3.9 to obtain the corresponding BPHZ model. The local well-posedness of (5.1) is then a standard consequence of the results of Section 4 above just as in [Hai14,Sec 6].
However, as mentioned in the introduction, at this point it is not automatic to see what counterterms appear (or whether they are even local in the solution) in the equation solved by R ε U ε , where U ε is obtained from solving (5.1) with a renormalised smooth model. Below we carry out the computation of these terms in the setting of the example (1.1). An interesting outcome of these calculations is that if we consider the BPHZ renormalisation of our model, then it may happen in general that non-local counterterms appear. However, as we will see, it is possible to choose the renormalisation procedure in such a way that these non-local terms cancel out, thus leading to the stated result.
Recall first from Remark 3.10 that the BPHZ renormalisation procedure is parametrised by functions C ε τ given by ( Using the graphical notation from [HP15,Hai16] (circles represent Ξ, plain lines represent I and bold red lines represent I ′ ), the only two such symbols are given by and . The corresponding renormalisation functions are given by where C ε = C * ρ ε * ρ ε . In this particular case, this allows us as in [BHZ16] to define linear maps M ε : T → T such that on T ≤0 the BPHZ renormalised model (Π ε ,Γ ε ) satisfies the identityΠ where Π ε is the canonical lift of ξ ε . (The fact that (5.3) holds is no longer the case when κ ≤ 0!) One has for example (5.4) At this point we note that if K (c) were exactly equal to the heat kernel instead of a compactly supported truncation, then one would have the exact identity C ε (c) = cC ε (c, c) . (5.5) It turns out that this identity is crucial in order to obtain the cancellations necessary to obtain local counterterms. We therefore define a model (Π ε ,Γ ε ) just like the BPHZ model, but with C ε defined by (5.5) instead of (5.2). Since the difference between these two different definitions of C ε converges to a finite smooth function as ε → 0, the convergence of the BPHZ model as ε → 0 also implies the convergence of the model (Π ε ,Γ ε ). Note also that (modulo changing the order of some factors: recall that = in our setting, but this distinction is essentially irrelevant since we always consider models such that for example Π x (ζ ⊗ η ⊗ ν ⊗ ) = Π x (ν ⊗ ζ ⊗ η ⊗ ), so that we can identify such elements for all practical purposes), one has W − = { , X 2 , , , , , , , , , } .
Inspecting (5.4), as well as the analogous expressions for all other symbols of negative degree, we conclude that (5.6) holds for all $\tau \in T$, where $\langle \mathbf{1}, \sigma \rangle$ denotes the coefficient of $\mathbf{1}$ in $\sigma$. Furthermore, the second term in this expression is non-vanishing only if τ contains a summand in T and / or in T . This is because of (5.3), combined with the fact that $M^\varepsilon$ only generates terms of strictly positive degree for the remaining symbols in $W$.
We now have everything in place to derive the form of the renormalised equation. Given the model $(\Pi^\varepsilon, \Gamma^\varepsilon)$ for some fixed $\varepsilon > 0$, one obtains a local solution of the system (5.1) in

$$\mathcal{D}^{3/2+2\kappa,\,2\kappa}(W_0) \oplus \mathcal{D}^{\kappa,\,-2+3\kappa} \oplus \mathcal{D}^{1/2+2\kappa,\,-1+2\kappa}(W_1) \oplus \mathcal{D}^{1+2\kappa,\,2\kappa}(W_0) \oplus \mathcal{D}^{3/2+2\kappa,\,2\kappa}(W_0),$$

where $W_0$ is the sector generated by the Taylor polynomials and elements of the form $I_\zeta \tau$, and $W_1 = DW_0$. As a consequence of (5.6), we conclude, as for example in [Hai14,Sec. 9.3], that for $\varepsilon > 0$ the pair $(\mathcal{R}^\varepsilon U^\varepsilon, \mathcal{R}^\varepsilon \hat F^\varepsilon)$ solves an equation just like (2.5), but with an additional term $\langle \mathbf{1}, (M^\varepsilon - \mathrm{id}) \hat F^\varepsilon \rangle$ appearing on the right-hand side of (2.5b). Hence $\mathcal{R}^\varepsilon U^\varepsilon$ solves an equation just like (2.1), but with the additional term (5.7) appearing on the right-hand side.

It now remains to show that if $(U, \hat F)$ solve (5.1), then (5.7) coincides with a local functional of $u = \mathcal{R}U = \langle \mathbf{1}, U \rangle$. Write furthermore $v_i = \mathcal{R}V_i = \langle \mathbf{1}, V_i \rangle$, as well as $q = 1 - v_3 a'(u)$, where the $V_i$ are as in (5.1). Note that $q$ is the denominator in (5.7), and that this is not a local functional of $u$, so that we should aim for a factor $q$ to appear in the numerator as well. To ease notation, we henceforth omit the lower indices in $\delta_{a(u)}$ and $\delta'_{a(u)}$. Since all symbols appearing in the expansion of the solution are of the form $\zeta \otimes \tau$, where $\zeta$ is a tensor product of either $\delta_{a(u)}$ or $\delta'_{a(u)}$, this should not cause any confusion.
To calculate the numerator in (5.7), it follows from the above discussion that we only need to know the components of $\hat F$ in T and in T . For this, note first that one has where all terms included in $(\ldots)$ are of strictly higher degree than that of $\Xi$. Combining (5.1) with the definitions of $I$ and $I_1$, we then see that where $\tilde U$ takes values only in spaces $T_\tau$ with τ = of the form $\tau = \prod_i I_{\zeta_i}(\sigma_i)$. Furthermore, by (5.8) and the definition of $V_3$, the distribution u is given by $u = q F_1(u)\delta + a'(u) v_3 u \;\Rightarrow\; u = F_1(u)\delta$, so that in particular $a(U) = a(u)\mathbf{1} + (a'F_1)(u)\,\delta \otimes\ + (\ldots)$.
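The implication displayed above is a one-line computation once the definition $q = 1 - v_3 a'(u)$ is inserted. As a sketch, write $w$ for the quantity being solved for; this label is introduced here only for readability and is not notation from the text. The last step assumes $q \neq 0$, which is implicit in $q$ being the denominator in (5.7).

```latex
% Sketch only. Assumed label (not from the text): w denotes the unknown
% on the left-hand side of the implication. Requires amsmath.
\begin{align*}
w &= q\,F_1(u)\,\delta + a'(u)\,v_3\,w \\
\iff\quad \bigl(1 - a'(u)\,v_3\bigr)\,w &= q\,F_1(u)\,\delta \\
\iff\quad q\,w &= q\,F_1(u)\,\delta
  && \text{(since } q = 1 - v_3\,a'(u)\text{)} \\
\implies\quad w &= F_1(u)\,\delta
  && \text{(assuming } q \neq 0\text{)}.
\end{align*}
```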
At this point we note that by (5.5) and the fact that $C^\varepsilon$ is symmetric in its two arguments, one has the identity

$$(\partial C^\varepsilon)(c) = C^\varepsilon(c, c) + 2c\,(\partial_1 C^\varepsilon)(c, c).$$
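This identity can be checked by differentiating (5.5) directly. As a sketch, write $f$ for the one-argument function on the left-hand side of (5.5) and $g$ for the symmetric two-argument function on its right-hand side; both letters are labels introduced here for illustration and do not appear in the text.

```latex
% Sketch only. Assumed labels (not from the text):
%   f(c)        : one-argument function on the left of (5.5)
%   g(c_1, c_2) : two-argument function on the right of (5.5),
%                 with g(c_1, c_2) = g(c_2, c_1).
% Requires amsmath.
\begin{align*}
f(c) &= c\,g(c,c), \\
f'(c) &= g(c,c) + c\,\bigl[(\partial_1 g)(c,c) + (\partial_2 g)(c,c)\bigr]
  && \text{(product and chain rule)} \\
&= g(c,c) + 2c\,(\partial_1 g)(c,c)
  && \text{(symmetry gives } (\partial_2 g)(c,c) = (\partial_1 g)(c,c)\text{)}.
\end{align*}
```

The symmetry step uses that $g(c_1, c_2) = g(c_2, c_1)$ implies $(\partial_2 g)(c_1, c_2) = (\partial_1 g)(c_2, c_1)$, which on the diagonal $c_1 = c_2 = c$ identifies the two partial derivatives.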
Remark 5.1. The expression (5.2) also gives some information about the behaviour of $C^\varepsilon$ in the case where $C$ is self-similar on small scales, i.e. $C(\lambda^2 t, \lambda x) = \lambda^{-3+\nu} C(t, x)$ for all $\lambda \in (0, 1]$ and $|t|^{1/2} + |x| \le r$, for some $r > 0$. Indeed, one can then write where $\approx$ means that the difference of the two sides converges as $\varepsilon \to 0$ to a smooth function of $c$. Hence, modulo changing again the renormalisation constants by a finite quantity, one can use in this case a counterterm of the form $\varepsilon^{\nu-1} A(u)$ for some explicit function $A$ of the solution $u$.