Well-posedness of stochastic heat equation with distributional drift and skew stochastic heat equation

We study the stochastic reaction--diffusion equation $$ \partial_tu_t(x)=\frac12 \partial^2_{xx}u_t(x)+b(u_t(x))+\dot{W}_{t}(x), \quad t>0,\, x\in D, $$ where $b$ is a generalized function in the Besov space $\mathcal{B}^\beta_{q,\infty}({\mathbb R})$, $D\subset{\mathbb R}$ and $\dot W$ is a space-time white noise on ${\mathbb R}_+\times D$. We introduce a notion of a solution to this equation and obtain existence and uniqueness of a strong solution whenever $\beta-1/q\ge-1$, $\beta>-1$ and $q\in[1,\infty]$. This class includes equations where $b$ is a measure, in particular $b=\delta_0$, which corresponds to the skewed stochastic heat equation. For $\beta-1/q>-3/2$, we obtain existence of a weak solution. Our results extend the work of Bass and Chen (2001) to the framework of stochastic partial differential equations and generalize the results of Gy\"ongy and Pardoux (1993) to distributional drifts. To establish these results, we exploit the regularization effect of the white noise through a new strategy based on the stochastic sewing lemma introduced in L\^e~(2020).


Introduction
While regularization by noise for ordinary differential equations (ODEs) is quite well understood by now, much less is known about regularization by noise for partial differential equations (PDEs). The goal of this article is to analyze regularization by noise for parabolic PDEs and to build new robust techniques for studying this phenomenon. We consider the stochastic heat equation with a drift (stochastic reaction-diffusion equation)
$$ \partial_tu_t(x)=\frac12 \partial^2_{xx}u_t(x)+b(u_t(x))+\dot{W}_{t}(x), \quad t\in(0,T_0],\ x\in D, \qquad u(0,x)=u_0(x), \tag{1.1} $$
where $b$ is a generalized function in the Besov space $\mathcal{B}^\beta_{q,\infty}({\mathbb R},{\mathbb R})$, $\beta\in{\mathbb R}$, $q\in[1,\infty]$, the domain $D$ is either $[0,1]$ or ${\mathbb R}$, $T_0>0$, $\dot W$ is space-time white noise on $[0,T_0]\times D$, and $u_0\colon D\to{\mathbb R}$ is a bounded measurable function. Note that for $\beta<0$ this equation is not well-posed in the standard sense: in this case $b$ is not a function but only a distribution, and thus the composition $b(u_t(x))$ is a priori not well-defined. We introduce a natural notion of a solution to this equation in the spirit of [BC01, Definition 2.1]. We show that equation (1.1) has a unique strong solution if $\beta-1/q\ge-1$, $\beta>-1$ and $q\in[1,\infty]$, see Theorem 2.6. This includes equations where $b$ is a measure, in particular the skewed stochastic heat equation, which corresponds to the case $b=\kappa\delta_0$, $\kappa\in{\mathbb R}$. The latter equation is important for stochastic interface models and appeared in [BZ14], where its well-posedness was left open. We resolve this problem in our paper, see Theorem 2.8 and Corollary 2.9.
Our results extend [BC01] to the framework of stochastic partial differential equations (SPDEs) and generalize [GP93a,GP93b] to singular drifts. We exploit the regularization effect of the white noise and develop a new proof strategy based on stochastic sewing [Lê20]. Furthermore, we give several extensions of the stochastic sewing lemma which allow singularities, critical exponents and the use of random controls. In particular, we extend deterministic sewing with controls, see, e.g., [Lyo98,Gub04,FdLP06,FZ18], to the stochastic setting. We would like to stress that, in contrast to the vast majority of regularization-by-noise papers for ODEs [Ver80,HS81,Dav07b,CG16,ZZ17], our method uses neither the Girsanov transform nor the Zvonkin transformation. These two popular techniques are not useful in our setting.
It has been known since the 1970s that ill-posed deterministic systems can become well-posed if a random noise is injected into the system. Consider the following simple example. The ODE $dX_t=b(X_t)\,dt$ with a bounded measurable vector field $b\colon{\mathbb R}^d\to{\mathbb R}^d$ is ill-posed. It might have infinitely many or no solutions in some specific cases. Yet, if this deterministic system is perturbed by a Brownian noise $B$, then the corresponding stochastic differential equation (SDE) Pardoux [GP93a,GP93b]. The authors used comparison theorems to establish existence and uniqueness of (analytically) weak solutions to (1.1) for the case where the drift $b$ is the sum of a bounded function and an $L_q$-integrable function with $q>2$. Path-by-path uniqueness of solutions to (1.1) for bounded $b$ was obtained recently in [BM19].
Bounebache and Zambotti [BZ14] considered stochastic partial differential equations with measure-valued drift. In particular, motivated by problems arising in the study of random interface models, see, e.g., [Fun05], they studied the skew stochastic heat equation, i.e., (1.1) with $b$ being a Dirac delta function. Using Dirichlet form techniques, they obtained existence of a weak solution. However, existence and uniqueness of a strong solution remained open. Resolving this problem was one of the motivations for our work.
From the above discussion, we note that there is a gap between the ODE and PDE settings. For ODEs, numerous results treating distributional drifts are available [HS81,BC01,CG16,ZZ17,PvZ20]. On the other hand, almost no such results were known for PDEs; see, however, the paper [GP18] and the discussion there. Our goal in this article is to construct a robust general method for proving strong existence and uniqueness for (1.1) in the case where the drift $b$ is a Schwartz distribution. In particular, we treat the skew stochastic heat equation.
Inspired by the finite-dimensional setting [BC01,CG16], we define a natural notion of a solution to (1.1) in Definition 2.3 and show that (1.1) has a unique strong solution when $b$ belongs to the Besov space $\mathcal{B}^\beta_{q,\infty}({\mathbb R},{\mathbb R})$ with $\beta-1/q\ge-1$, $\beta>-1$ and $q\in[1,\infty]$, see Theorem 2.6, and when $b$ is a finite Radon measure, see Theorem 2.8. We also prove strong convergence of smooth approximations to (1.1) in Theorem 2.10. We establish strong existence and uniqueness for the skew stochastic heat equation in Corollary 2.9 and show that this equation appears naturally as a certain scaling limit of "standard" SPDEs where the drifts are continuous integrable functions, see Theorem 2.12.
To obtain these results we develop a new strategy based on certain regularization estimates for SPDEs, see Lemma 5.2. These estimates can be viewed as infinite-dimensional analogues of the corresponding Davie's bounds for SDEs [Dav07b, Proposition 2.1], see also [ZZ17, Lemma 5.8]. Note though that Davie's method involves exact moment computations and is not easily extended to the SPDE setting. Therefore, to obtain these regularization estimates, we extend and employ the stochastic sewing technique introduced originally in [Lê20]. We believe that these new stochastic sewing lemmas (Theorems 4.1, 4.5 and 4.7) form a very useful toolkit which might be of independent interest. The use of the regularization estimates is explained briefly in Section 2.2.
We conclude the introduction by commenting on the optimality of our results. It is known [LN09] that for each fixed space point, the free stochastic heat equation (that is, equation (1.1) with $b\equiv0$) behaves "qualitatively" like a fractional Brownian motion (fBM) with the Hurst parameter $1/4$, denoted further by $B^{1/4}$. Therefore, one can expect that strong existence and uniqueness for equation (1.1) would hold under the same conditions on $b$ as in the equation [CG16, Theorem 1.13]. This indeed turned out to be the case, see Theorem 2.6, even though the method of [CG16] could not be transferred to the PDE setting. Note that this class of functions does not include the Dirac delta function, which lies in $\mathcal{B}^{-1+1/q}_q$, $q\in[1,\infty]$. Therefore we had to come up with an additional argument to cover the case $\beta-1/q=-1$ as well (see Proposition 3.6).
The rest of the paper is organized as follows. We present our main results and a brief overview of the proof strategy in Section 2. Since the proofs are quite technical, for the convenience of the reader we split them into several steps. In Section 3 we prove the main results. The proofs are based on key propositions, which are also stated in Section 3 and proved in Section 5. Extensions of the stochastic sewing lemma are stated and proved in Section 4. The proofs of crucial regularity results are given in Section 6. Appendices A to C contain auxiliary technical results which we will use freely throughout the paper.
Convention on constants. Throughout the paper $C$ denotes a positive constant whose value may change from line to line. All other constants will be denoted by $C_1, C_2, \ldots$. They are all positive and their precise values are not important. The dependence of constants on parameters, if needed, will be indicated, e.g., $C_\beta$ or $C(\beta)$.

Well-posedness of the stochastic heat equation with a distributional drift
We begin by introducing the necessary notation. Let $\mathbf{B}(D)$ be the space of all real bounded measurable functions on $D$. Let $\mathcal{C}^\infty_b=\mathcal{C}^\infty_b(D,{\mathbb R})$ be the space of infinitely differentiable real functions on $D$ which are bounded and have bounded derivatives. We denote by $\mathcal{C}^\infty_c=\mathcal{C}^\infty_c(D,{\mathbb R})$ the set of functions in $\mathcal{C}^\infty_b(D,{\mathbb R})$ with compact support. For $\beta\in(0,1]$, let $\mathcal{C}^\beta$ be the space of bounded Hölder continuous functions with exponent $\beta$. For each $\beta\in{\mathbb R}$ and $q\in[1,\infty]$, let $\mathcal{B}^\beta_q$ denote the (nonhomogeneous) Besov space $\mathcal{B}^\beta_{q,\infty}({\mathbb R})$ of regularity $\beta$ and integrability $q$, see Definition A.1. We recall that for $\beta\in(0,1)$, the space $\mathcal{B}^\beta_\infty$ coincides with the space $\mathcal{C}^\beta$ (see [BCD11, page 99]). For $\beta\in(-1,0)$ the space $\mathcal{B}^\beta_\infty$ includes all derivatives (in the distributional sense) of functions in $\mathcal{C}^{\beta+1}$.
Let $g_t$, $p^{per}_t$, $p^{Neu}_t$ be the free-space heat kernel, the heat kernel on $[0,1]$ with periodic boundary conditions, and the heat kernel on $[0,1]$ with Neumann boundary conditions, respectively. That is,
$$ g_t(x):=\frac{1}{\sqrt{2\pi t}}\,e^{-\frac{x^2}{2t}},\quad t>0,\ x\in{\mathbb R}; \tag{2.1} $$
$$ p^{per}_t(x,y):=\sum_{n\in{\mathbb Z}} g_t(x-y+n),\quad t>0,\ x,y\in[0,1]; \tag{2.2} $$
$$ p^{Neu}_t(x,y):=\sum_{n\in{\mathbb Z}}\bigl(g_t(x-y+2n)+g_t(x+y+2n)\bigr),\quad t>0,\ x,y\in[0,1]. \tag{2.3} $$
Our main results are valid in three different setups: when equation (1.1) is considered on the domain $D={\mathbb R}$; when (1.1) is considered on the domain $D=[0,1]$ with periodic boundary conditions; and when (1.1) is considered on $D=[0,1]$ with Neumann boundary conditions. To simplify the notation and the statements of the results, we will write $p$ for $g$, $p^{per}$ or $p^{Neu}$, and $D$ will denote the corresponding domain. The specific choice of the domain and of the boundary condition (out of the above three options) does not affect the results and arguments in most places of the paper. In the few places where the choice of the domain is important, we will highlight it.
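The image-sum formulas for the periodic and Neumann kernels can be sanity-checked numerically. The following is only a minimal sketch, not part of the paper's argument: it assumes that $g_t$ is the standard Gaussian kernel of $\frac12\partial^2_{xx}$, truncates the sums over $n$ at a hypothetical cutoff `n_max`, and uses a midpoint rule; the function names `g`, `p_per`, `p_neu` and `mass` are ours. It checks that each kernel integrates to one over $y\in[0,1]$, which is the property that makes $P_t$ a contraction on bounded functions.

```python
import math

def g(t, x):
    """Free-space heat kernel of (1/2) d^2/dx^2 (assumed Gaussian form)."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def p_per(t, x, y, n_max=20):
    """Periodic kernel on [0, 1]: truncated sum of g over integer shifts."""
    return sum(g(t, x - y + n) for n in range(-n_max, n_max + 1))

def p_neu(t, x, y, n_max=20):
    """Neumann kernel on [0, 1]: truncated sum over even shifts and reflections."""
    return sum(g(t, x - y + 2 * n) + g(t, x + y + 2 * n)
               for n in range(-n_max, n_max + 1))

def mass(kernel, t, x, m=2000):
    """Midpoint-rule approximation of the integral of kernel(t, x, y) over y in [0, 1]."""
    h = 1.0 / m
    return h * sum(kernel(t, x, (j + 0.5) * h) for j in range(m))

if __name__ == "__main__":
    # Both values should be close to 1 (conservation of mass).
    print(mass(p_per, 0.1, 0.3), mass(p_neu, 0.1, 0.3))
```

The truncation is harmless for small $t$ because the discarded Gaussian terms are negligible; for large $t$ a larger `n_max` would be needed.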
For bounded measurable functions $\varphi\colon D\to{\mathbb R}$ and $t>0$ we put
$$ P_t\varphi(x):=\int_D p_t(x,y)\varphi(y)\,dy,\quad x\in D. $$
It will be convenient to denote the heat semigroup on ${\mathbb R}$ by
$$ G_t\psi(x):=\int_{\mathbb R} g_t(x-y)\psi(y)\,dy,\quad x\in{\mathbb R}, $$
for all bounded measurable functions $\psi\colon{\mathbb R}\to{\mathbb R}$. Let $T_0>0$ and let $(\Omega,\mathcal{F},(\mathcal{F}_t)_{t\in[0,T_0]},\mathsf{P})$ be a filtered probability space. For each $m\in[1,\infty]$, the norm of a random variable $\xi$ in $L_m(\Omega)$ is denoted by $\|\xi\|_{L_m}$. Here, as usual, we use the convention $\|\xi\|_{L_\infty}:=\operatorname{ess\,sup}_{\omega\in\Omega}|\xi(\omega)|$ when $m=\infty$. We recall that a random process W : where the integration in (2.4) is a stochastic integration understood in the sense of Walsh ([Wal86, Chapter 2]). It is known that $V$ is a Gaussian random field adapted to $(\mathcal{F}_t)$ and has a continuous version on $[0,T_0]\times D$, which we will use throughout the paper. It follows from (2.4) that $V$ has the local nondeterminism property for every $s\le t$ and $x\in D$. (2.5) Let us note that we do not analyze in our article equation (1.1) equipped with Dirichlet boundary conditions. In this case, the right-hand side of (2.5) goes to 0 as $x$ tends to the boundary of the domain.
The uniformity in $x$ in (2.5) plays a key role in our arguments. While it is possible to adapt our proofs to treat Dirichlet boundary conditions as well (and we are convinced that our results hold in that setting), we have deliberately decided not to focus on this case in order to emphasize how our approach works and to avoid additional technical difficulties. Now let us give a notion of a solution to (1.1). It is inspired by the definition in the finite-dimensional setting in [BC01, Definition 2.1].
Definition 2.2. Let $f$ be a distribution in $\mathcal{B}^\beta_q$ with $\beta\in{\mathbb R}$ and $q\in[1,\infty]$. We say that a sequence of functions It is clear that for any $f\in\mathcal{B}^\beta_q$, there is a sequence of functions $(f_n)_{n\in{\mathbb Z}_+}\subset\mathcal{C}^\infty_b$ which converges to $f$ in $\mathcal{B}^{\beta-}_q$ as $n\to\infty$. For example, one can take $f_n:=G_{1/n}f$, see Lemma A.3.
(2) for any sequence of functions (3) a.s. the function $u$ is continuous on $(0,T_0]\times D$. We note that Definition 2.3 defines a solution to equation (1.1) in three different settings, see Assumption 2.1. When $b\in\mathcal{C}^\beta$ with $\beta>0$, we can choose a sequence $(b_n)$ which converges to $b$ uniformly. Then it is immediate that Definition 2.3 is equivalent to the usual notion of a mild solution of (1.1), that is, $\mathsf{P}$-almost surely We say that a solution $u\equiv\{u_t(x): t\in(0,T_0],\, x\in D\}$ is a strong solution to (1.1) if it is adapted to the filtration $(\mathcal{F}^W_t)$. A weak solution of (1.1) is a couple $(u,W)$ on a complete filtered probability space $(\Omega,\mathcal{G},(\mathcal{G}_t)_{t\ge0},\mathsf{P})$ such that $u$ is adapted to $(\mathcal{G}_t)$, $W$ is a $(\mathcal{G}_t)$-white noise, and $u$ is a solution to (1.1). We say that strong uniqueness holds for (1.1) if whenever $u$ and $\tilde u$ are two strong solutions of (1.1) defined on the same probability space with the same initial condition $u_0$, then Consider the following class of solutions.
Definition 2.4. Let $\kappa\in[0,1]$. We say that a solution $u$ to SPDE (1.1) belongs to the class $\mathcal{V}(\kappa)$ if for any $m\ge2$, $\sup_{(t,x)\in(0,T_0]\times D}\|u_t(x)\|_{L_m}<\infty$ and
Remark 2.5. Recalling that $u_t=P_tu_0+K_t+V_t$, $t>0$, we see that the numerator in Definition 2.4 is just $\|K_t(x)-P_{t-s}K_s(x)\|_{L_m}$. Thus, the class $\mathcal{V}(\kappa)$ contains solutions of (1.1) such that the moments of their drifts satisfy certain regularity conditions.
We are now ready to present our main result. Fix T 0 > 0 and recall Assumption 2.1.
Remark 2.7. It follows from the proof of Theorem 2.6 (see Proposition 3.6) that in the case $\beta-1/q\ge-1$, $\beta>-1$, condition (2) of Definition 2.3 can be relaxed. Namely, if a measurable adapted process $u\colon(0,T_0]\times D\times\Omega\to{\mathbb R}$ satisfies conditions (1) and (3) of Definition 2.3, belongs to $\mathcal{V}(3/4)$, and satisfies the following weaker condition (2′) then it satisfies the stronger condition (2) of Definition 2.3 for any sequence of smooth approximations $b_n\to b$ in $\mathcal{B}^{\beta-}_q$. Note that the additional assumption in Theorem 2.6(ii) that the solution lies in $\mathcal{V}(3/4)$ is a natural extension to the SPDE setting of a very similar condition arising in the analysis of SDEs with distributional drift. It appears in [ Since for any $q\in[1,\infty]$, $L_q({\mathbb R})$ is continuously embedded in $\mathcal{B}^0_q({\mathbb R})$ ([BCD11, Proposition 2.39]), Theorem 2.6 complements the corresponding results in [GP93a,GP93b]. Namely, Theorem 2.6(ii) allows $L_1({\mathbb R})$-integrable drifts, while the aforementioned papers require the drift to be $L_q({\mathbb R})$-integrable for some $q>2$. Note that the drift $b$ in [GP93a,GP93b] can also depend on $(t,x)$. It is clear that our method can be adapted to this setting; however, for clarity and to highlight the main ideas, we only consider equations of the type (1.1) herein.
Since signed measures belong to $\mathcal{B}^0_1$ ([BCD11, Prop. 2.39]), Theorem 2.6(ii) is also applicable to this class. Other specific cases of drift $b$ for which (1.1) has a unique strong solution include $b(u)=|u|^{-\sigma}$, σ ∈ (−1, 0) and $b(u)=\zeta^{-1}(u)$, where $\zeta^{-1}$ is the Cauchy principal value of $1/u$, defined in (2.9) below. This is due to the fact that $|\cdot|^{-\sigma}$ belongs to $\mathcal{B}^0_{1/\sigma}$ while $\zeta^{-1}$ belongs to $\mathcal{B}^{-1/2}_2$ (see Lemma A.4). In the case when $b$ is a finite non-negative measure on ${\mathbb R}$, we have the following improved result.
Theorem 2.8. Let b be a finite non-negative Radon measure. Then for any bounded initial condition u 0 equation (1.1) has a unique strong solution.
Corollary 2.9. The skew stochastic heat equation, that is, equation (1.1) with $b=\kappa\delta_0$, $\kappa\in{\mathbb R}$, has a unique strong solution for every bounded measurable initial condition $u_0$.
Note that in Theorem 2.8 and Corollary 2.9, the assumption u ∈ V(3/4) is not required.
Our next result is a stability theorem. Let (b n ) n∈Z + be a smooth approximation of b. Theorem 2.10 shows that a solution to SPDE (1.1) with smooth drift b n converges as n → ∞ and that the limit does not depend on the particular choice of the approximating sequence. Such solutions are sometimes called "constructable solutions" [GP93b, Definition 2.2]. We show that in our setting "constructable solutions" coincide with the standard solutions defined above.
Let $u_n$ be a strong solution to Eq($u^n_0$; $b_n$). Then there exists a measurable function (2) $u$ is a strong solution to (1.1) with the initial condition $u_0$; (3) $u$ satisfies for every $m\ge1$. In particular, $u$ belongs to $\mathcal{V}(3/4)$.
Finally, we state two interesting applications of Theorem 2.10. The first one is the comparison principle for the solutions of SPDE (1.1), which extends the standard comparison principle. As usual, for two Schwartz It is known (see, e.g., [Tre67, Exercise 22.5 Our second application of Theorem 2.10 shows that the skew stochastic heat equation appears naturally as a scaling limit of certain SPDEs.
Let us introduce the space $\mathcal{C}_{uc}((0,T_0]\times D)$ of real continuous functions on $(0,T_0]\times D$ equipped with the topology of uniform convergence over compact sets. It is well known that this topology is induced by the metric and $\mathcal{C}_{uc}((0,T_0]\times D)$ is separable. We define the Schwartz distribution $\zeta^{-1}$ on ${\mathbb R}$ by for each Schwartz function $\varphi$. Similarly, for $\alpha\in(-1,0)$ we put In the following result we consider the stochastic heat equation with $D={\mathbb R}$.
(ii) Assume that ρ ∈ (1, 3/2) and for some constants c − , c + ∈ R. Then the random field converges weakly in the space C uc ((0, 1] ×R) as λ → ∞ to the solution of the stochastic heat equation with the initial condition u 0 ≡ 0.
Remark 2.13. It is easy to see that if $f$ is absolutely integrable on ${\mathbb R}$, then c = 0, $c_0=\int_{\mathbb R}f(x)\,dx$, and one deduces convergence to the skew stochastic heat equation from part (i) of the above result. In this case, Theorem 2.12(i) is an analogue of [Ros75] (see also [LG84, Corollary 3.3]) for stochastic partial differential equations.
Remark 2.14. It is known that any homogeneous distribution on ${\mathbb R}$ of order $\alpha\in(-1,0)$ is a linear combination of $\zeta^\alpha_+$ and $\zeta^\alpha_-$, and any homogeneous distribution on ${\mathbb R}$ of order $-1$ is a linear combination of $\delta_0$ and $\zeta^{-1}$, see [GS16, Chapter I]. Therefore, Theorem 2.12 shows that for any homogeneous distribution $h$ of order $\alpha\in[-1,0)$, one can easily find a continuous function $f$ such that the corresponding scaling limit of (2.11) is the solution of the stochastic heat equation with drift $h$.

Overview of the proofs of the main results
Before we proceed to the proofs of our main results, we would like to demonstrate our strategy on a simple example, provide an overview of our arguments, and highlight the main challenges arising in the proofs. We hope that this will help the reader to better understand our method and grasp the main ideas without having to dive into too many technical details. Thus, in this section we first consider the uniqueness problem for equation (1.1), where the initial condition $u_0\equiv0$, the drift $b$ is a function (not a distribution) and $b\in\mathcal{C}^\beta=\mathcal{B}^\beta_\infty$ with $\beta\in(0,1)$. Furthermore, we consider this equation on the time horizon $[0,\ell]$ rather than $[0,T_0]$, where $\ell\in(0,1)$ is small enough and to be chosen later.
As mentioned before, in this setting (1.1) is equivalent to Eq(0; $b$). Assume the contrary and suppose that this equation has two solutions $u$ and $v$. We define We note that $z(0)=0$ and our goal is to prove that $z(t)=0$ for all $t\in[0,\ell]$. We clearly have for any $t\in[0,\ell]$, $x\in D$ (2.16) A naive (and wrong) approach would be to then use directly the fact that $b\in\mathcal{C}^\beta$ and to put the $\|\cdot\|_{L_2}$ norm inside the integral. Then one would get and hence sup (2.17) Since $\beta\in(0,1)$, it is obvious that neither of the above inequalities allows us to conclude that $\sup_{y\in D}z_t(y)=0$. Instead, our aim is to show that the following trade-off holds: one can have (2.17) with the factor $\|z_t(x)\|_{L_2}$ raised to the power 1, and the price to pay is that the factor $\ell$ will be raised to a certain power smaller than 1. However, this will not obstruct the final conclusion.
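To see concretely why the naive approach fails, the estimate can be sketched as follows. This is our reconstruction under stated assumptions, not a quotation of the displays (2.16) and (2.17): we assume the mild form of Eq(0; $b$), so that $z:=u-v$ satisfies the first identity below, and we use only the Hölder bound $|b(a)-b(c)|\le\|b\|_{\mathcal{C}^\beta}|a-c|^\beta$, Minkowski's integral inequality, Jensen's inequality $\||\xi|^\beta\|_{L_2}\le\|\xi\|_{L_2}^\beta$ for $\beta\in(0,1)$, and $\int_D p_{t-r}(x,y)\,dy=1$.

$$
\begin{aligned}
z_t(x)&=\int_0^t\!\!\int_D p_{t-r}(x,y)\bigl(b(u_r(y))-b(v_r(y))\bigr)\,dy\,dr,\\
\|z_t(x)\|_{L_2}&\le\|b\|_{\mathcal{C}^\beta}\int_0^t\!\!\int_D p_{t-r}(x,y)\,\bigl\||z_r(y)|^\beta\bigr\|_{L_2}\,dy\,dr
\le\|b\|_{\mathcal{C}^\beta}\,\ell\,\Bigl(\sup_{r\in[0,\ell],\,y\in D}\|z_r(y)\|_{L_2}\Bigr)^{\beta}.
\end{aligned}
$$

Writing $M$ for the supremum on the right, this reads $M\le C\ell M^\beta$ with $C=\|b\|_{\mathcal{C}^\beta}$. Since $\beta<1$, every $M\in[0,(C\ell)^{1/(1-\beta)}]$ satisfies this inequality, so it cannot force $M=0$, however small $\ell$ is.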
To show this we work directly with the integral on the right-hand side of (2.16) and exploit the regularizing properties of the white noise. Recall that in the SDE setting it is known that where $B^H$ denotes the fractional Brownian motion (fBM) with Hurst index $H\in(0,1)$ and where we used the notation
Remark 2.15. The exponent $\frac34+\frac\beta4$ in (2.19) can be written as $H(\beta-1)+1$ for $H=\frac14$, which is the same as in (2.18). This is due to the local nondeterminism property of $V$ in (2.5). Note that fBM with $H=1/4$ satisfies a very similar local nondeterminism property. This provides another connection between our results and the results in [CG16] concerning regularization by noise for fBM.
Now let us apply (2.19) with $\tau=3/4+\beta/4$, divide both sides of the inequality by $|t-s|^{3/4+\beta/4}$ and take the supremum over all $0\le s\le t\le\ell$. We get (2.20) Since the constant $C$ does not depend on $\ell$, we can choose $\ell$ small enough so that $C\|b\|_{\mathcal{C}^\beta}\,\ell^{\frac34+\frac\beta4}\le1/2$. Substituting this back into (2.20), we get Applying this bound to (2.19), setting there $s=0$, and taking the supremum over all $0\le t\le\ell$, we finally obtain Provided that $\ell$ is small enough, this yields $\|z\|_{\mathcal{C}^{0,0}L_2([0,\ell])}=0$, and thus Eq(0; $b$) has a unique strong solution.
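The exponent identity stated in Remark 2.15 is elementary arithmetic; for the reader's convenience, with $H=\frac14$:

$$
H(\beta-1)+1=\frac{\beta-1}{4}+1=\frac34+\frac\beta4 .
$$

In particular, for $\beta\in(0,1)$ this exponent lies in $(3/4,1)$, which matches the trade-off described earlier: the factor $\ell$ appears with a power smaller than 1, but still positive, so that $C\|b\|_{\mathcal{C}^\beta}\ell^{3/4+\beta/4}$ can be made at most $1/2$ by taking $\ell$ small.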
Weak existence of solutions to (1.1) would follow from similar bounds (see Lemma 5.2) and Prokhorov's theorem. Finally, strong existence follows from weak existence and strong uniqueness by the Yamada-Watanabe principle (by the method of [GK96]).
While the uniqueness proof outlined above is quite short and "almost" rigorous, two major obstacles appear when one tries to extend this proof to cover distributional drifts $b\in\mathcal{C}^\beta$, $\beta<0$, and especially the case $b=\delta_0$.
First, for $\beta<0$ the right-hand side of (2.19) contains the additional factor . (2.22) When $b$ was a bounded function and $\beta\ge0$, it was obvious that and thus this extra factor was finite. Now when $b$ is a distribution, the finiteness of this extra factor is not clear at all (note also the appearance of ess sup there). The second obstacle is even more serious. It turns out that bound (2.19) is valid only for $\beta>-1$ and thus is not applicable to the case where $b$ is the Dirac delta function. This is similar to the fact that the corresponding bound for fBM with the Hurst parameter $1/4$, (2.18), is also known to be valid only for $\beta>-1$, see [CG16, Theorem 1.1]. While it is true that the Dirac delta function actually has better regularity and belongs to $\mathcal{B}^{-1+1/q}_q$ for any $q\in[1,\infty]$, this does not help much. Indeed, one can show that bounds (2.18) and (2.19) hold for $b\in\mathcal{B}^\beta_q$ with $\beta-1/q>-1$; however, this still does not cover the delta function.
Let us explain now how we overcome these obstacles. A crucial role in our approach is played by Proposition 3.6. It shows that if $u,v\in\mathcal{V}(3/4)$ are two weak solutions to (1.1) adapted to the same filtration, and if for one of them expression (2.22) is finite, then these solutions coincide. To obtain this proposition we combine the critical stochastic sewing lemma (Theorem 4.5, an extension of the stochastic sewing lemma from [Lê20] and [FHL21, Lemma 2.9]) with a very delicate analysis of the solution to (1.1). Bound (2.19) (which is not valid for the case $b=\delta_0$) is replaced by (5.33), see Lemma 5.7. Note that we have to use a certain rough-path inspired expansion of the solution and bound its norm as well, see (5.34). Since the new bound (5.33) now contains some logarithmic terms, the final part of the uniqueness proof is less straightforward compared with (2.21), see Section 5.2. Our argument there is a stochastic analogue of Davie's argument in [Dav07a, Theorem 3.6]. Now we are ready to outline our strategy for establishing strong existence and uniqueness for equation (1.1).
Step 1. We show that for any solution $u^{\eta;f}$ to Eq($\eta$; $f$), where $\eta$ is a bounded initial condition and $f$ is a smooth function, the additional factor $[u^{\eta;f}-V]$ from (2.22) is finite and is bounded by a constant which depends only on the norm $\|f\|_{\mathcal{C}^\beta}$, see Proposition 3.2. This is done using regularization bounds from Lemma 5.2.
Step 2. At this step we fix two sequences of smooth functions $(b'_n)_{n\in{\mathbb Z}_+}$, $(b''_n)_{n\in{\mathbb Z}_+}$ converging to $b$ in $\mathcal{B}^{\beta-}_q$ and denote by $u'_n$, $u''_n$ the solutions of Eq($u_0$; $b'_n$), Eq($u_0$; $b''_n$), respectively. Then, using again bounds from Lemma 5.2, we are able to show that the sequence $(u'_n,u''_n)$ is tight. By Prokhorov's theorem, this implies that it has a subsequence which converges weakly. We denote its limit by $(u',u'')$. This is done in Proposition 3.3.
Step 3. Now we show that both $u'$ and $u''$ solve (1.1), belong to $\mathcal{V}(3/4)$ and the factor is finite. This is the content of Proposition 3.4 and Corollary 3.5.
Step 4. Now we have two solutions $u',u''\in\mathcal{V}(3/4)$ for which the extra factor from (2.22) is finite. Hence, by Proposition 3.6 discussed above, $u'=u''$. This implies, thanks to a Yamada-Watanabe type result from [GK96, Lemma 1.1], that $u'$ is actually a strong solution to (1.1), see the proof of Theorem 2.10.
Step 5. Now if $v\in\mathcal{V}(3/4)$ is any other solution (for which the factor from (2.22) is not necessarily finite), it still coincides with the strong solution $u'$ constructed at the previous step. This is again due to Proposition 3.6, see the proof of Theorem 2.6(ii).
Step 6. Finally, we show that the extra condition $u\in\mathcal{V}(3/4)$ is automatically satisfied for SPDEs with measure-valued drift. This is done in Proposition 3.8 using the stochastic sewing lemma with random controls (Theorem 4.7). This proves Theorem 2.8.
Thus, we see that regularization estimates (Lemmas 5.2 and 5.7) play a very important role in our proofs. They are obtained using a flexible toolkit of stochastic sewing, which extends the original stochastic sewing from [Lê20]. For the convenience of the reader, all sewing results are stated separately in Section 4.

Proofs of the main results
In this section we prove the main results stated in Section 2.1. The technical parts, including the regularization estimates, are stated as propositions. The proofs of these propositions are postponed to the following sections. First, we set up some necessary notation.
For $0\le S<T$ we denote by $\Delta_{S,T}$ the simplex $\{(s,t): S\le s\le t\le T\}$. Let $(\Omega,\mathcal{F},(\mathcal{F}_t)_{t\ge0},\mathsf{P})$ be a complete filtered probability space on which the white noise $W$ is defined. We assume that the filtration $\mathbb{F}=(\mathcal{F}_t)_{t\in[0,T]}$ satisfies the usual conditions and that $W$ is an $(\mathcal{F}_t)$-white noise. We will write $\mathsf{E}^s$ for the conditional expectation given $\mathcal{F}_s$. For a random process $Z\colon[0,T_0]\times D\times\Omega\to{\mathbb R}$ we will denote by $(\mathcal{F}^Z_t)$ its natural filtration. If $\mathcal{G}\subset\mathcal{F}$ is a sub-σ-algebra, then we introduce the conditional quantity which is a $\mathcal{G}$-measurable non-negative random variable. It is evident that for $1\le m\le n\le\infty$ one has It follows from (3.2) that for $1\le m\le n\le\infty$ Let $\mathbf{B}L_m$, where $m\ge1$, be the space of all measurable functions $D\times\Omega\to{\mathbb R}$ such that
Before we begin the proofs of our main results, we claim that in Theorems 2.6 and 2.10 it suffices to consider only the case $q\in[2,\infty]$, $\beta<0$. Indeed, recall the following embedding between Besov spaces ([BCD11, Proposition 2.71]) This means that the results of Theorem 2.6(ii) for $b$ in $\mathcal{B}^\beta_q$ with $q\in[1,2)$ are consequences of those with larger integrability exponent $q$. Exactly the same argument is valid for Theorem 2.6(i) and Theorem 2.10. Hence, we assume without loss of generality hereafter that $q\in[2,\infty]$. Similarly, thanks to the embedding $\mathcal{B}^\beta_q\hookrightarrow\mathcal{B}^{-\beta'}_q$ for all $\beta,\beta'>0$, we see that the statements of Theorems 2.6 and 2.10 for $b\in\mathcal{B}^\beta_q$ with $\beta\ge0$ follow from the results of these theorems for some $\beta<0$. Hence, we can also assume without loss of generality that $\beta<0$. To summarize, we have the following
Assumption 3.1. From now on and till the end of this section we fix . We assume that $\beta-1/q>-3/2$.
We begin with the proof of the existence of solutions to (1.1). It consists of several steps.
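The reduction to $q\in[2,\infty]$ described above can be checked by simple bookkeeping. The sketch below assumes that the embedding in question is the standard one-dimensional Besov embedding (our reading of [BCD11, Proposition 2.71]); the symbols $q'$ and $\beta'$ are introduced here only for illustration.

$$
\mathcal{B}^{\beta}_{q}\hookrightarrow\mathcal{B}^{\beta-\frac1q+\frac1{q'}}_{q'},\quad 1\le q\le q'\le\infty;
\qquad
q'=2,\ \ \beta':=\beta-\frac1q+\frac12\ \ \Longrightarrow\ \ \beta'-\frac1{q'}=\beta-\frac1q .
$$

Thus the quantity $\beta-1/q$, on which the conditions $\beta-1/q\ge-1$ and $\beta-1/q>-3/2$ depend, is preserved when passing from $(\beta,q)$ with $q\in[1,2)$ to $(\beta',2)$. Moreover, if $\beta-1/q\ge-1$, then $\beta'=(\beta-1/q)+\frac12\ge-\frac12>-1$, so the condition $\beta'>-1$ also holds.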
Let $u^{\eta;f}$ be the solution to Eq($\eta$; $f$). Then there exists a constant $C=C(\beta,q,m,T_0)>0$ independent of $\eta$, $f$ such that To formulate the next two statements we consider the space is a Polish space and metrizable by the following metric, similar to (2.8),
Proposition 3.4 (Stability). Let $(b_n)_{n\in{\mathbb Z}_+}$ be a sequence of bounded continuous functions converging to $b$ in $\mathcal{B}^{\beta-}_q$. Let $(u^n_0)_{n\in{\mathbb Z}_+}$ be a sequence of functions from $\mathbf{B}(D)$ converging to $u_0$ uniformly on $D$. Let $V_n$ be a random element having the same law as $V$. Assume that $u_n$ is a strong solution of Eq(u n Suppose that there exist measurable functions u, V : Then the function is a solution to (1.1) with the initial condition $u_0$ and for any $m\in[2,\infty)$ there exists The proofs of Propositions 3.2-3.4 are presented in Section 5.1.
Combining the above propositions we obtain the following corollary, which immediately implies Theorem 2.6(i). This corollary will also be important for showing the existence of strong solutions to (1.1).
Corollary 3.5. In the setting of Proposition 3.3 the following holds. There exists a filtered probability space ( Ω, F, (1) both v ′ and v ′′ are adapted to the filtration ( F t ) and are weak solutions to (1.1) with the initial condition u 0 ; (2) there exists a subsequence (n k ) such that (3.9) (3) for V defined as in (2.4) with W in place of W the following holds: (3.10) Proof. By Proposition 3.3, there exists a subsequence (n k ) such that ( u ′ n k , u ′′ n k , V ) k∈Z + converges weakly in the space [C uc ([0, T 0 ] × D)] 3 . By passing to this subsequence, to simplify the notation, we may assume without loss of generality that ( u ′ n , u ′′ n , V ) converges weakly. Since this space is Polish, we can apply the Skorohod representation theorem [Bil99, Theorem 6.7] and deduce that there exists a sequence of random elements ( v ′ n , v ′′ n , V n ) defined on a common probability space ( Ω, F, P ) and a random element We see now that all the conditions of Proposition 3.4 are satisfied. Applying this result, we see that v ′ and v ′′ are solutions to (1.1) in the sense of Definition 2.3 with V in place of V . Define It follows immediately from the definition of the white noise that for any (s, t) ) dx is independent of F s . Thus, by Lemma B.2, there exists an ( F t )-white noise W such that (2.4) holds for W in place of W and V in place of V . Hence v ′ and v ′′ are weak solutions to (1.1) and they are adapted to the same filtration ( F t ).
Finally, it remains to note that (3.10) follows now from (3.8).
Proof of Theorem 2.6(i). Let $(b_n)$ be a sequence of smooth functions converging to $b$ in $\mathcal{B}^{\beta-}_q$. Applying Corollary 3.5 with $b'_n=b''_n=b_n$ and $u'_{0,n}=u''_{0,n}=u_0$ we obtain existence of a weak solution $v'$. By (3.10), where the first inequality follows from (3.2). Hence $v'\in\mathcal{V}(1+\frac\beta4-\frac1{4q})$.
Now we move on to the proofs of strong existence and uniqueness of solutions to (1.1).
Proposition 3.6 (Uniqueness). Suppose additionally that Suppose that $u$, $v$ are adapted to the filtration $(\mathcal{F}_t)_{t\in[0,T_0]}$ and belong to the class $\mathcal{V}(3/4)$. Assume further that for some $m\ge2$ (3.12) The proof of Proposition 3.6 is given in Section 5.2. The proof of strong existence uses the following statement from [GK96]. For the convenience of the reader we provide it here.
Proposition 3.7 ([GK96, Lemma 1.1]). Let (Z n ) be a sequence of random elements in a Polish space (E, ρ) equipped with the Borel σ-algebra. Assume that for every pair of subsequences (Z l k ) and (Z m k ) there exists a further sub-subsequence (Z l kr , Z m kr ) which converges weakly in the space E × E to a random element w = (w 1 , w 2 ) such that w 1 = w 2 a.s.
Then there exists an E-valued random element Z such that (Z n ) converges in probability to Z.
Proof of Theorem 2.10. We will use Proposition 3.7. Fix a sequence (b n ) of bounded continuous functions converging to b in B β− q and a sequence (u 0,n ) of functions from B(D) converging to u 0 . Let u n be the strong solution to Eq(u 0,n ; b n ). Define be two arbitrary subsequences of (b n , u n ). We apply Corollary 3.5. It follows that there exists a filtered probability space ( Ω, F, . We note that (3.10) together with (3.2) implies that for any m ≥ 2 where we used the fact that 1 + β/4 − 1/(4q) ≥ 3/4. Thus the pair (v ′ , V ) satisfies (3.12) and v ′ belongs to the class V(3/4). Similarly, v ′′ ∈ V(3/4). Thus, all the assumptions of Proposition 3.6 are satisfied and we can conclude that v ′ = v ′′ a.s. By definition, this implies that v ′ = v ′′ a.s.
Thus, all the conditions of Proposition 3.7 are met. Hence there exists a C uc ([0, T 0 ]×D)valued random element u such that u n converges to u in probability as n → ∞. Set now Applying Proposition 3.4, we see that u is a solution to (1.1) with the initial condition u 0 .
Thus, u is a strong solution to (1.1). From the convergence in probability of u n to u, we get that for any Further, by the assumptions of the theorem where we used the fact that |P t f (x)| ≤ sup y |f (y)| for any bounded function f . Combining (3.13) and (3.14), we obtain (2.6). Finally, part (3) of the theorem follows from (3.8) and the fact that 1 + β/4 − 1/(4q) ≥ 3/4.
Proof of Theorem 2.6(ii). By Theorem 2.10, there exists a strong solution u to (1.1) satisfying (3.12). If v is another strong solution to (1.1) in the class V(3/4), then, by Proposition 3.6 u = v. This shows strong uniqueness of solutions to (1.1).
Proposition 3.8. Suppose that b is a non-negative finite measure. Then every solution of (1.1) belongs to the class V(3/4).
The proof of Proposition 3.8 is given in Section 5.3.
Proof of Theorem 2.8. Let u 0 be a bounded measurable function. Since measures belong to B 0 1 , Theorem 2.6 yields existence and uniqueness of a strong solution u to (1.1) in V(3/4) starting from u 0 . On the other hand, by Proposition 3.8, every solution to (1.1) belongs to V(3/4) and thus has to coincide with u, thus completing the proof.
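The role of the exponent 1 + β/4 − 1/(4q) in these proofs can be checked by direct arithmetic. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the case q = ∞ is deliberately left out. It confirms that for measures (β = 0, q = 1) the exponent is exactly 3/4, which is the borderline case β − 1/q = −1.

```python
from fractions import Fraction


def regularity_exponent(beta, q):
    """Return 1 + beta/4 - 1/(4q), the exponent of the class V(.) in the text.

    Hypothetical helper; finite q only (q = infinity is not handled).
    """
    beta, q = Fraction(beta), Fraction(q)
    return 1 + beta / 4 - 1 / (4 * q)


def in_strong_regime(beta, q):
    """Check beta - 1/q >= -1, beta > -1 and q >= 1 (strong well-posedness range)."""
    beta, q = Fraction(beta), Fraction(q)
    return beta - 1 / q >= -1 and beta > -1 and q >= 1


# Finite measures lie in B^0_1, i.e. beta = 0 and q = 1.
measure_exponent = regularity_exponent(0, 1)
# Another borderline pair: beta = -1/2, q = 2, so that beta - 1/q = -1.
borderline_exponent = regularity_exponent(Fraction(-1, 2), 2)
```

Since 1 + β/4 − 1/(4q) = 1 + (β − 1/q)/4, the condition β − 1/q ≥ −1 is equivalent to the exponent being at least 3/4, which is how the class V(3/4) enters the uniqueness argument.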
Proof of Corollary 2.11. The proof uses an idea similar to [GP93b, Proof of Theorem 2.4].
for any x ∈ R, thanks to the definition of the partial order. Then, using again that b ′ n and b ′′ n are smooth and bounded, the standard comparison principle (see, e.g., [GP93b,Theorem 2 as n → ∞. Therefore, by passing to the limit as n → ∞ in (3.15), we get for any fixed t > 0, x ∈ D by Theorem 2.10.
Since u ′ and u ′′ are continuous, this implies that a.s.
. By a change of variables, we have Note that the random fields V λ and V have the same probability law. Hence, ū λ is a weak Applying Theorem 2.10, we see that if ρ = 1, then the process ū λ converges weakly in the space C uc ((0, 1] × R) as λ → ∞ to the solution of (2.13). Similarly, if ρ ∈ (1, 3/2), the same theorem implies that the process ū λ converges weakly in the space C uc ((0, 1] × R) as λ → ∞ to the solution of (2.15).

Stochastic sewing lemmas
We present three extensions of the stochastic sewing lemma introduced earlier in [Lê20]. More precisely, we incorporate singularities, critical exponents and random controls in the stochastic sewing lemma. In addition, we also provide estimates in some conditional moment norms, inspired by the stochastic sewing in [FHL21]. As we will see in later sections, singularities allow for improvements of regularities and broaden the scope of applications of the stochastic sewing techniques (see for instance Lemma 6.1 and Corollary 6.2 in Section 6). The result with random controls (Theorem 4.7 below) is used in Proposition 3.8 to obtain a priori estimates for solutions to Eq. (1.1) when the drift b is a measure. The stochastic sewing result for critical exponent is used to prove Proposition 3.6, that is strong uniqueness for Eq. (1.1) when β − 1/q = −1. Finally, the estimates in conditional moment norms are also used in Proposition 3.2 which is later used in Proposition 3.6 to prevent a loss of integrability. We believe that these results complement [Lê20, Theorem 2.1], [FHL21, Theorem 2.7] and form a toolkit which is also of independent interest and can be useful for other purposes.

Statements of stochastic sewing lemmas
Till the end of this section we fix a time horizon T ∈ (0, ∞) and a filtered probability The mesh size of a partition Π of an interval will be denoted by For each α, β ∈ [0, 1), (s, t) ∈ ∆ S,T define the function (4.1) Recall the notation · Lm|Ft introduced in (3.1).
Remark 4.3. In view of (3.2), condition (4.3) follows from the following simpler condition. There exist constants (4.7) Sometimes, it might be useful to apply the following modification of stochastic sewing lemma. a.s. (4.8) Then there exists a constant C = C(ε 1 , ε 2 , m) independent of S, T such that for every (s, t) ∈ ∆ S,T we have The next result, which is used later in the proof of Proposition 3.6, is inspired by the stochastic Davie-Grönwall lemma from [FHL21].
Then there exists a map B : ∆ S,T → L m and a constant C > 0, such that B is a functional of (s, t) → δA s,(s+t)/2,t and for every (s, t) ∈ ∆ S,T |A t − A s − A s,t | ≤ CΓ 1 |t − s|^{α 1} λ(s, t)^{β 1} + B s,t a.s. (4.14) and ‖B s,t ‖ Lm ≤ CΓ 2 |t − s|

Proofs of stochastic sewing lemmas
The proofs of the results from Section 4.1 make use of the following common notation. For each integer k ≥ 0 and each (s, t) ∈ ∆ S,T with s > 0, let π^k_{[s,t]} = {s = t^k_0 < t^k_1 < · · · < t^k_{2^k} = t} be the dyadic partition of [s, t]. For Let (s, t) ∈ ∆ S,T be fixed. For every k ≥ 0, we have . (4.17)
Proof of Theorem 4.1. We estimate I k s,t by the triangle inequality and condition (4.2), Using (B.6), it is easy to see that which implies To estimate J k s,t , we observe that it is a sum of martingale differences and use the conditional Burkholder-Davis-Gundy (BDG) inequality (see, e.g., [CF16, Proposition 27]) to obtain where κ m is the constant from the conditional BDG inequality. Then, we use the Minkowski inequality, condition (4.3) and similar reasoning as above, to see that Hence, we have shown that This implies that for some constant C = C(ε 1 , ε 2 , m). By sending k → ∞, using (4.4) and Fatou's lemma, we obtain (4.5). Since E S J k s,t = 0, relation (4.16) also yields E S (A k+1 s,t − A k s,t ) = E S I k s,t . In view of the estimate (4.18), we obtain This yields (4.20) Sending k → ∞ and reasoning as previously, we obtain (4.6).
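The dyadic refinement underlying these proofs can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the standard sewing convention δA_{s,u,t} = A_{s,t} − A_{s,u} − A_{u,t}, uses a deterministic toy germ A_{s,t} = sin(s)(t − s) (hypothetical, not from the paper), and checks that passing from the k-th to the (k+1)-th dyadic partition changes the Riemann-type sum exactly by the sum of δA over midpoints.

```python
import math


def dyadic_partition(s, t, k):
    """Points s = t_0 < t_1 < ... < t_{2^k} = t of the k-th dyadic partition of [s, t]."""
    n = 2 ** k
    return [s + (t - s) * i / n for i in range(n + 1)]


def partition_sum(A, points):
    """Sum of A(t_i, t_{i+1}) over consecutive points of a partition."""
    return sum(A(a, b) for a, b in zip(points, points[1:]))


def delta(A, s, u, t):
    """Assumed convention: delta A_{s,u,t} = A_{s,t} - A_{s,u} - A_{u,t}."""
    return A(s, t) - A(s, u) - A(u, t)


def A(s, t):
    """Toy deterministic germ (hypothetical): left-point approximation of the integral of sin."""
    return math.sin(s) * (t - s)


s, t, k = 0.0, 1.0, 3
coarse = dyadic_partition(s, t, k)
fine = dyadic_partition(s, t, k + 1)

# Difference of consecutive dyadic sums versus the sum of delta A at midpoints.
lhs = partition_sum(A, coarse) - partition_sum(A, fine)
rhs = sum(delta(A, a, (a + b) / 2, b) for a, b in zip(coarse, coarse[1:]))
```

In the stochastic setting the same increment is split into a conditional-expectation part and a martingale-difference part, which are then estimated separately.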
Proof of Theorem 4.4. The proof goes along exactly the same lines as the proof of Theorem 4.1 till (4.19). Rewriting this inequality, we derive By passing to the limit as k → ∞ and using (4.8) and Fatou's lemma, we obtain (4.9). Inequality (4.10) follows similarly from (4.20), (4.8) and Fatou's lemma.
Proof of Theorem 4.5. The term J k s,t is estimated as in the proof of Theorem 4.1, which gives J k s,t Lm On the other hand, we estimate I k s,t differently, using the triangle inequality and condition (4.11) in the following way I k s,t Lm Hence, in view of (4.16), we have shown that for any However, recalling (4.18), we still have which together with (4.21) provides an alternative bound for ‖A k+1 s,t − A k s,t ‖ Lm . Namely, (4.23) Combining (4.22) and (4.23), we get that there exists a constant C = C(ε 1 , ε 2 , ε 4 , m) such that for any fixed integers 0 ≤ k ≤ N where in the first sum we have applied (4.22) and in the second sum we used (4.23). Similarly to the proof of Theorem 4.1, we now pass to the limit as N → ∞ (note that k remains fixed) with the help of Fatou's lemma and (4.4). We deduce Now, let us fine-tune the parameter k. If Γ 3 ≥ Γ 1 T^{ε 1}, we choose k = 1 and the previous inequality implies (4.12). If Γ 3 < Γ 1 T^{ε 1}, we can choose k ≥ 1 so that 2^{−kε 1} Γ 1 T^{ε 1} ≤ Γ 3 ≤ 2^{(1−k)ε 1} Γ 1 T^{ε 1}, which optimizes the right-hand side above and contributes the logarithmic factor. This gives (4.12).
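The choice of k at the end of this proof can be made explicit. The sketch below is an illustration only: the function names are hypothetical, and the sandwich condition is taken with non-strict inequalities. It picks k as the smallest integer with 2^{−kε_1} Γ_1 T^{ε_1} ≤ Γ_3, so that k is of order log(Γ_1 T^{ε_1}/Γ_3)/ε_1, which is where the logarithmic factor originates.

```python
import math


def choose_k(gamma1, gamma3, T, eps1):
    """Pick the dyadic level k used to balance the two bounds (hypothetical helper).

    Returns 1 if gamma3 >= gamma1 * T**eps1; otherwise the smallest k >= 1 with
    2**(-k*eps1) * gamma1 * T**eps1 <= gamma3.
    """
    scale = gamma1 * T ** eps1
    if gamma3 >= scale:
        return 1
    return max(1, math.ceil(math.log2(scale / gamma3) / eps1))


def sandwich_holds(gamma1, gamma3, T, eps1, k, tol=1e-12):
    """Check 2^{-k eps1} G1 T^{eps1} <= G3 <= 2^{(1-k) eps1} G1 T^{eps1} up to rounding."""
    scale = gamma1 * T ** eps1
    lower = 2 ** (-k * eps1) * scale
    upper = 2 ** ((1 - k) * eps1) * scale
    return lower <= gamma3 * (1 + tol) and gamma3 <= upper * (1 + tol)


k_small_gamma3 = choose_k(1.0, 0.01, 1.0, 0.5)
k_large_gamma3 = choose_k(1.0, 2.0, 1.0, 0.5)
```

For Γ_1 = T = 1, Γ_3 = 0.01 and ε_1 = 1/2 the ratio is 100, so k is the ceiling of 2 log_2(100), which is 14.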
Proof of Theorem 4.7. We use a slightly different version of (4.16) (4.24) By (4.13), where the last inequality follows from the Hölder inequality and superadditivity of the random control λ. Note thatJ k s,t is the sum of martingale differences, hence, can be estimated analogously to J k s,t as in the proof of Theorem 4.1. Applying the BDG and Minkowski inequalities, we have Using condition (4.3) (with n = m and α 2 = β 2 = 0), we have Thus, we get from (4.24) and (4.25) Define B s,t := ∞ k=0 |J k s,t |. Using (4.26) and triangle inequality, we see that B satisfies (4.15).

Proofs of key propositions
In this section, we present the proofs of the propositions from Section 3. The regularization estimates which are necessary for the proofs are summarized in Lemma 5.1 and Lemma 5.2 below. The proofs of these lemmas are presented in Section 6.1. We recall (3.5) which defines the space BL m .
Lemma 5.1. Let f ∈ B γ p be a bounded function. Let m ∈ [2, ∞), p ∈ [m, ∞] and γ ∈ (−2, 0). There exists a constant C = C(γ, m, p) such that for any 0 s t T and any B(R) ⊗ F s -measurable function κ ∈ BL m one has t s D Then the following statements hold.
(i) There exists a constant C = C(γ, p, τ, m) > 0 independent of T such that for any There exists a constant C = C(γ, p, τ, δ, m) > 0 independent of T,T such that for any S ∈ [0, T ]
Remark 5.3. Estimate (5.3) is an analogue of the following estimate by Davie in [Dav07b] for Brownian motion which holds for every s ≤ t, x 1 , x 2 ∈ R d and bounded measurable f : R d → R d . Note that the map x → (f (x + x 1 ) − f (x + x 2 ))/|x 1 − x 2 | has finite B −1 ∞ -norm, and therefore (5.6) is indeed an estimate for distributions. A closely related estimate is of the type where p ≥ d + 1, θ = θ(p) = 1 − d/(2p), X is a martingale of the form X t = ∫_0^t σ r dB r , σ is adapted, Λ^{−1} ≤ |σ r | ≤ Λ for all r, and Λ is a deterministic positive constant. Estimate (5.7) follows from a general result of Krylov in [Kry87] and an argument similar to [GM01, Corollary 3.2]. When f is a non-negative function, by expanding the moment and conditioning successively, one can obtain from the above estimate that for every integer m ≥ 2, s ≤ t, which is comparable to the Davie-type estimate (5.6). However, the fact that f is a non-negative function is crucial; in particular, one cannot obtain (5.6) from such an argument. As observed in [Lê20] for the case of fractional Brownian motion, one can indeed obtain estimates of the types (5.6) and (5.8) from estimates of the type (5.7) by means of the stochastic sewing lemma for f being a distribution, provided that θ > 1/2. This passage is also visible in the proof of Lemma 5.2 in Section 6.
Since s k − s k−1 ≤ ℓ 0 , we can apply (5.10) to obtain that for any y ∈ D Hence we can continue (5.11) in the following way Since N < 1 + T 0 /ℓ 0 , this implies (3.7).
To bound the second term in (5.14), I 2 , we use Lemma 5.2(i) with the parameters described in (5.15) as well as n = m, x = x 2 , S = s. We get where we used again bound (5.16). Finally, let us bound I 3 . We note that In view of (5.16), we can apply Lemma 5.2(iii) to obtain Now combining this with (5.17), (5.18) and substituting into (5.14), we arrive at where we used the fact that 1 + β/4 − 1/(4q) > 1/2 > δ ′ /2 and denoted R := ‖h‖ B β q (1 + ‖f‖ B β q ). Recall that δ ∈ (0, δ ′ ) and m is arbitrarily large. Then, by the Kolmogorov continuity theorem (which is an easy extension of [Kun90, Theorem 1.4.1]), there exists a random variable H(ω) such that for any ω ∈ Ω, x 1 , x 2 ∈ D, s, t ∈ [0, T 0 ] we have and EH(ω) ≤ C(T 0 )R. This completes the proof of the theorem.
Proof of Proposition 3.4. Step 1. We show that u is a solution to (1.1). We define for  By triangle inequality, we decompose for any k, n ∈ Z + , x ∈ D, t ∈ [0, Let us estimate successively all the terms in the right-hand side of (5.22). Since for any fixed n the functionb n is a smooth bounded function, we have for any |x| R, x ∈ D, M > R We use triangle inequality and the estimate |P t ϕ(x)| sup y∈D |ϕ(y)| valid for any bounded measurable function ϕ to obtain that By assumption, the above implies that u k converges to u in C uc ([0, T 0 ] × D) in probability. Hence, in the previous estimates for I 1 , we send k → ∞ then M → ∞ to see that To bound I 2 we fix β ′ < β such that β ′ − 1/q > −3/2. This is possible thanks to Assumption 3.1. We apply Lemma 5.4 with h =b n − b k , f = b k , η = u k 0 , x 1 = x 2 = x, s = 0, β ′ in place of β. We get that there exists a random variable H n,k such that where again the constant C does not depend on n, k. Thus for any ε > 0 one has To treat I 3 , we first derive from (5.12) and the definition of u k that for any k ∈ Z + , u k = K b k ;u k + V k . Hence, together with (5.20), we have This implies that Step 2. It remains to show (3.8). It follows from Proposition 3.2, that there exists a constant C such that for every (s, t) Note that we used here that sup n∈Z + b n B β q < ∞ thanks to the definition of convergence in B β− q . It follows from the mild formulation of u n (recall that u n solves Eq(u n Hence, we have Putting s = 0, the previous estimate implies that On the other hand, we see from (5.24) that lim n→∞ K b n ;u n = K in C uc ([0, T 0 ] × D) in probability. Hence, by passing to the limit as n → ∞ in (5.26) and applying Fatou's lemma, we see that sup (t,x)∈[0,T 0 ]×D K t (x) Lm < ∞. By Lemma B.4, we see that P t−s K s (x) is well-defined as an L m -integrable random variable. 
Furthermore, in view of (5.26) and the convergence of K b n ;u n to K, we obtain from Lemma B.5 that for each fixed 0 ≤ s ≤ t ≤ T 0 , Therefore, we can pass to the limit as n → ∞ in (5.25) and apply Fatou's lemma to obtain that

Proof of Proposition 3.6
In this subsection we will use the following additional notation. Let (S, T ) ∈ ∆ 0,T 0 . For a measurable function Z : ∆ S,T × D × Ω → R, τ ∈ [0, 1], m 1 we put Till the end of the subsection fix the parameters β, q satisfying the conditions of Proposition 3.6 and b ∈ B β q . We fix also (u t ) t∈[0,T 0 ] , (v t ) t∈[0,T 0 ] ∈ V(3/4), which are as in the statement of Proposition 3.6.
We define Our goal is to prove that z(t) = 0 for all t ∈ [0, T 0 ]. Fix m ∈ [2, ∞) such that m q. We see that (3.12) and the fact that u, v ∈ V(3/4) implies that This in turn yields that for any t ∈ [0, T 0 ] one has where the space BL m is introduced in (3.5).
Recall that the process v satisfies condition (2 ′ ) of Remark 2.7. We fix a sequence of smooth functions (b n ) which appeared there. For n ∈ Z + introduce the process Define H n,ϕ s,t (x) in a similar way with ϕ s in place of ψ s in the right-hand side of (5.29). Note that the expressions P t−r ψ s and P t−r ϕ s are well-defined thanks to (5.28) and Lemma B.4.
Our first step in obtaining uniqueness is to pass to the limit as n → ∞ in (5.29).
Furthermore, there exists a constant C > 0 such that for any (s, t) ∈ ∆ 0,T 0 we have The lemma is proved using the stochastic sewing lemma. We postpone the proof till Section 6.2.
Recall the notation (5.12). Denote K u := ψ − P u 0 and K v := ϕ − P u 0 . It follows from Definition 2.3 and condition (2 ′ ) of Remark 2.7, that K b n ;u (t, x) → K u (t, x) and Lemma 5.6. For every fixed (s, t) ∈ ∆ 0,T 0 , x ∈ D we have It follows from Lemma 5.5, that we can now define The next result is crucial for proving that z ≡ 0 and thus obtaining strong uniqueness.
Lemma 5.7. There exists δ = δ(β, q) ∈ (0, 1/2) such that for any τ ∈ (1/2, 1] there The proof is presented in Section 6.2, in which we use the stochastic sewing lemma with critical exponent, Theorem 4.5. Now we are ready to prove the main result of this subsection: uniqueness of solutions of equation (1.1).
Step 2. We show that the map t → ‖z t ‖ BLm is continuous on [0, T 0 ]. By the triangle inequality, we have for every (s, t) ∈ ∆ 0,T 0 , From (5.35), it is clear that lim t↓s ‖z t − P t−s z s ‖ BLm = 0 and lim s↑t ‖z t − P t−s z s ‖ BLm = 0. It remains to consider the last term in the above estimate. Since u = P u 0 + K u + V and v = P u 0 + K v + V by definition, we see that z = K u − K v . Hence, it suffices to show that P t−s K u s − K u s and P t−s K v s − K v s converge to 0 in BL m as t ↓ s and s ↑ t. We have for each x ∈ D, n ∈ Z + , Fix arbitrary ε ∈ (0, 1). Let us apply Lemma 5.2(iii) with f = b n , γ = β, p = q, S = 0, T = s,T = t − s, δ = ε, τ = 3/4. We see that condition (5.2) is satisfied and thus we obtain where we also used the fact that ‖b n ‖ B β q ≤ ‖b‖ B β q . Applying Lemma 5.6 and Fatou's lemma, we can pass to the limit as n → ∞ in the above inequality to obtain that This implies that lim ‖P t−s K u s − K u s ‖ BLm = 0 as t ↓ s and s ↑ t. The convergence of P t−s K v s − K v s to 0 is obtained in exactly the same way.
Step 3. We show by contradiction that z ≡ 0. Suppose that ‖z t ‖ BLm is not identically 0 on [0, T 0 ]. Choose k 0 ≥ 1 such that 2^{−k 0} < sup t∈[0,T 0 ] ‖z t ‖ BLm . Then for each integer k ≥ k 0 , define t k := inf{t ∈ [0, T 0 ] : ‖z t ‖ BLm ≥ 2^{−k}}. It is evident that each t k is well defined. In addition, ‖z t ‖ BLm < 2^{−k} for t < t k while ‖z t k ‖ BLm = 2^{−k} by the continuity shown in the previous step. Consequently, the sequence {t k } k≥k 0 is strictly decreasing. For k sufficiently large so that t k − t k+1 ≤ ℓ, estimate (5.35) with (s, t) = (t k+1 , t k ) yields On the other hand, by (B.8), ‖P t k −t k+1 z t k+1 ‖ BLm ≤ ‖z t k+1 ‖ BLm = 2^{−k−1} and hence by the triangle inequality, It follows that which implies that t k − t k+1 ≥ C̃(1 + k)^{−1} for some constant C̃. This implies that Σ k≥k 0 (t k − t k+1 ) = ∞, which is a contradiction because {t k } is a decreasing sequence in [0, T 0 ]. We conclude that z ≡ 0, and hence, u = v.
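The contradiction in Step 3 rests on the divergence of the harmonic-type series: gaps bounded below by a constant multiple of 1/(1 + k) cannot all fit inside the bounded interval [0, T_0]. A minimal numerical sketch follows; the constant c and the function names are hypothetical placeholders, not quantities from the proof.

```python
def partial_sum(k0, K, c=1.0):
    """Sum of c / (1 + k) for k = k0, ..., K (lower bound for the total of the gaps)."""
    return sum(c / (1 + k) for k in range(k0, K + 1))


def first_index_exceeding(T0, k0, c=1.0):
    """Smallest K such that the sum of c/(1+k), k = k0..K, exceeds T0.

    Such a K always exists because the series diverges.
    """
    total, k = 0.0, k0
    while total <= T0:
        total += c / (1 + k)
        k += 1
    return k - 1


# With c = 1 and k0 = 1 the lower bounds 1/2 + 1/3 + 1/4 already exceed T0 = 1.
K_star = first_index_exceeding(1.0, 1)
```

For any T_0 and any positive constant the loop terminates, although the index it returns grows exponentially in T_0/c; this is consistent with the logarithmic growth of harmonic partial sums.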

Proof of Proposition 3.8
Let u be a solution to (1.1) and m be arbitrary in [2, ∞). Since b is a non-negative measure, we can choose a sequence of smooth bounded non-negative functions (b n ) which converges By Definition 2.3, we see that K n converges to K in C uc ([0, T 0 ]×D) in probability. By passing through a subsequence, we can assume without loss of generality that this convergence is almost sure. Hence, we can find Ω * ⊂ Ω such that P(Ω * ) = 1 and that K n (ω) converges to ω) is well-defined as a non-negative measurable function. We note that at this stage, we do not rule out the possibility that P T −t K t (x, ω) may take infinite value. We divide the proof into several steps.
Step 1. Fix arbitrary ω ∈ Ω * . We show that for every 0 ≤ s ≤ t ≤ T ≤ T 0 and x ∈ D. (5.37) For simplicity, we omit the dependence on ω. By definition, we have which implies that P T −t K n t (x) ≥ P T −s K n s (x) for every 0 ≤ s ≤ t ≤ T ≤ T 0 and x ∈ D. In particular, setting T = t, one gets P t−s K n s (x) ≤ K n t (x). Applying this inequality and Fatou's lemma, we have This shows that P t−s K s (x) ≤ K t (x). Applying P T −t to both sides, we obtain (5.37).
Step 2. Define ψ = u − V = K + P u 0 and We claim that Indeed, for each integer j 1, define ψ j t (x) := (K t (x) ∧ j) + P t u 0 (x) which belongs to BL m . We note that measures belong to B 0 1 , which is embedded in B 1/m−1 m (see (3.6)). Applying Lemma 5.1, we have for a universal constant C > 0. By the Lebesgue monotone convergence theorem, we have lim j→∞ P r−s ψ j s (y) = P r−s ψ s (y) for every r, s, y. Then by the Lebesgue dominated convergence theorem, we see that the left-hand side above converges to A T,n s,t (x) Lm as Step 3. We show by mean of Theorem 4.7 that for every n ∈ Z + , (s, t) ∈ ∆ 0,T 0 there exist a non-negative measurable map (x, ω) → L n s,t (x, ω) and a deterministic finite constant C such that Hence, (s, t) → λ T s,t (x) is a random control per Definition 4.6.
Define A T,n s,t (x) as in the previous step and Then for u := (s + t)/2 we have Applying consequently the Fubini theorem, (C.7), Lemma A.3(iv) and Lemma C.3, we deduce that where we used the notation ρ t (x) := Var(V t (x)). Since ψ = P u 0 + K, we see that P r−u ψ u − P r−s ψ s = P r−u K u − P r−s K s . Using the elementary inequality and (5.37), we get As we have shown previously, (s, t) → λ T s,t (x) is a random control for every fixed T, x. Hence, the above estimate verifies condition (4.13). The estimate (5.38) verifies condition (4.3) (with α 2 = β 2 = 0 and n = m). It remains to verify condition (4.4). Let Π := {0 = t 0 , t 1 , ..., t k = T } be an arbitrary partition of [0, T ]. Denote by |Π| its mesh size. Then we have For each i, using the fact that b n is Lipschitz, we have where we have used (5.37). Hence, combining with the above estimate and applying (5.37) once again yield

It follows that
which converges to 0 a.s. as |Π| → 0. Thus, condition (4.4) holds. Applying Theorem 4.7, we have where B T,n s,t (x) Lm C|t − s| 3/4 uniformly in T, n, x. To obtain (5.39), it suffices to put T = t, L n s,t (x) = B t,n s,t (x) + |A t,n s,t (x)|. Since B T,n is a functional of δA T,n (x), measurability of (x, ω) → L n s,t follows. The uniform moment estimate for L n s,t (x) follows from those of B T,n s,t (x) and A T,n s,t (x) in (5.38).
Step 4. We show that lim n→∞ P r K n t (x) = P r K t (x) in probability for every t ∈ [0, T 0 ], r ∈ [0, T 0 − t] and x ∈ D.
The situation here is similar to Lemma B.5 except for a uniform moment bound for K n t (x). We replace this condition by estimate (5.39) and the uniform moment bound for L n 0,t (x). Indeed, let x ∈ D be fixed and let M > 0. We write Since K n t converges to K t uniformly on {y ∈ D : |y| ≤ M}, we see that ∫_{|y|≤M} p r (x, y)K n t (y) dy converges to ∫_{|y|≤M} p r (x, y)K t (y) dy a.s. To treat the second term, we first set s = 0 in (5.39) to obtain that |K n t (y)| ≤ C t^{1/2} K t (y) + L n 0,t (y) a.s. ∀y ∈ D.
It follows that ∫_{|y|>M} p r (x, y)K n t (y) dy ≤ C t^{1/2} ∫_{|y|>M} p r (x, y)K t (y) dy + ∫_{|y|>M} p r (x, y)L n 0,t (y) dy.
By the Lebesgue monotone convergence theorem and (5.37), we see that a.s. Lastly, which converges to 0 as M → ∞. These facts imply the claim.
Step 5. We define L s,t (x) = lim inf n L n s,t (x). Then, by Fatou's lemma and Step 3, ‖L s,t (x)‖ Lm ≤ C|t − s|^{3/4} uniformly in x. In (5.39), we send n → ∞ and apply the convergence from Step 4 to obtain that This implies that K t (x) − P t−s K s (x) ≤ 2L s,t (x) for t − s ≤ ℓ, where ℓ is such that Cℓ^{1/2} ≤ 1/2. Taking the L m -norm yields for every t − s ≤ ℓ. Using the identity K t − P t−s K s = ψ t − P t−s ψ s once again and recalling that ψ = u − V , the above estimate shows that u − V belongs to C^{3/4,0} L m ([S, T ]). This completes the proof of Proposition 3.8.

Proofs of regularity lemmas
In this section, we present the proofs of Lemmas 5.1, 5.2, 5.5, 5.6 and 5.7. Throughout the section, we fix the filtration (F t ) t≥0 , which appears in the aforementioned lemmas.

Proof of Lemmas 5.1 and 5.2
We begin with the following auxiliary result, which will also be crucial for the proof of Lemma 5.7. The proof relies on the stochastic sewing lemma (Theorem 4.1) and the estimates in Lemma C.3 and Lemma C.4. 1) for any fixed (z, r, x) ∈ R×[S, T ]×D the random variable h(z, r, x) is F S -measurable; 2) there exists a constant Γ h > 0 such that Then there exists a constant C = C(γ, m, n, p) which does not depend on S, T , Γ h , Γ X , h, K such that for any t ∈ [S, T ] ∫_S^t ∫_D X r (y)h(V r (y), r, y) dy dr
Proof. The proof is based on the stochastic sewing lemma, Theorem 4.1. We put for S ≤ s < t ≤ T , A s,t := E s ∫_s^t ∫_D X r (y)h(V r (y), r, y) dy dr.
Finally, let us check condition (4.4). Let Π := {S = t 0 , t 1 , ..., t k = t} be an arbitrary partition of [S, t]. Denote by |Π| its mesh size. Note that for any is a sum of martingale differences. Then using orthogonality, Therefore, Σ_{i=0}^{k−1} A t i ,t i+1 converges to A t in probability as |Π| → 0 and thus condition (4.4) holds.
It is easy to see that (6.2) is satisfied with Γ X = 1 and ν = 0. Let us verify that h satisfies all the assumptions of Lemma 6.1 with s in place of S. The first assumption clearly holds. To check the second assumption, we note that (A.5) implies BLm , where in the first inequality we used the fact that |ξ| λ Lm ξ λ Lm for any random variable ξ and λ ∈ (0, 1]; the second inequality follows from (B.8); the third inequality follows from the definition of the norm · BLm , see (3.5). Thus, inequality (6.1) is satisfied BLm . Thus all the conditions of Lemma 6.1 are met. Bound (6.9) follows now from (6.3) and (3.2).
(iii). We fix x ∈ D, choose n = m, X r (y) := p T −r (x, y) and where z ∈ R, r ∈ [u, t], y ∈ D, ω ∈ Ω. Let us verify h satisfies the assumptions of Lemma 6.1 with u in place of S. Again, it is easy to see that the first and the third conditions of Lemma 6.1 are satisfied with Γ X = 1 and ν = 0. To check the second condition, we note that (A.6) yields h(·, r, y) B γ−λ 1 −λ 2 p f B γ p |P r−u κ 1 (y) − P r−u κ 2 (y)| λ 1 |P r−u κ 1 (y) − P r−u κ 3 (y)| λ 2 . Using the fact that the functions κ 1 and κ 2 are F s -measurable, we derive from the above bound Here in the penultimate inequality we used the fact that E s [|ξ| λ ] (E s |ξ|) λ for any random variable ξ and λ ∈ (0, 1], s 0; in the last inequality we applied bound (B.8).
Bound (6.11) follows now from Lemma 6.1 and part (ii) of the Corollary.
To establish Lemma 5.2, we need the following result.
Note that the expression P r−s ψ s (y) is well-defined thanks to Lemma B.4 and (6.13). Set now It is easy to see that the integral in the left-hand side of inequality (6.12) is A T − A S . Let us verify that all the conditions of Theorem 4.1 are satisfied. Clearly, for any S s < u < t T we get For fixed S s < u < r T , y ∈ D introduce a function h r,y : R × Ω → R h r,y : (z, ω) → f (z + P r−s ψ s (y)) − f (z + P r−u ψ u (y)).
Note that for fixed non-random parameters the random variable h r,y (z) is F u -measurable. Hence, applying (C.9) with u, r in place of s, t, respectively, we deduce where the last inequality follows from (A.5) with λ = 1. Applying the integral Minkowski inequality, we derive E s |P r−s ψ s (y) − P r−u ψ u (y)| Ln dydr.
Finally, let us check condition (4.4). Let Π := {S = t 0 , t 1 , ..., t k = t} be an arbitrary partition of [S, t]. Denote by |Π| its mesh size. Note that contrary to the proof of Lemma 6.1, we cannot use here the orthogonality because the sum is not a sum of martingale differences. Indeed, in contrast to the proof of Lemma 6.1, By exactly the same argument, we see that the sequence (H k,ϕ s,t (x)) k∈Z + converges in probability to a limit which we denote by H ϕ s,t (x). Now, applying Fatou's lemma we derive for each x ∈ D, (s, t) ∈ ∆ 0,T 0 , λ ∈ [0, 1] where the second inequality follows from Corollary 6.2(ii) with f = b k , γ = β, p = q, κ 1 = ϕ s , κ 2 = ψ s . Taking into account that by assumption sup k∈Z + ‖b k ‖ B β q < ∞ and β − 1/q ≥ −1, we obtain Now by taking λ = 1 in this bound we get (5.30), and by taking λ = 0 we get (5.31).
Proof of Lemma 5.6. Since u belongs to V(3/4) and sup n ‖b n ‖ B β q < ∞, we obtain from Lemma 5.2(i) that Therefore for any fixed x ∈ D the sequence (K bn;u s (x)) n∈Z + is uniformly integrable. Recalling that K bn;u s (x) converges to K u s (x) in probability, we get Thus, by the dominated convergence theorem (here we once again made use of (6.2)). Thus, P t−s K bn;u s (x) → P t−s K u s (x) in probability as n → ∞. The convergence of P t−s K bn;v s (x) to P t−s K v s (x) in probability is obtained by exactly the same argument.
Proof of Lemma 5.7. The proof is based on the stochastic sewing lemma with critical exponent, Theorem 4.5. Let us verify that all the conditions of this theorem are satisfied.
A key tool for the verification will be Corollary 6.2. Fix 0 ≤ S ≤ T , x ∈ D, τ > 1/2. Recall that we are given a sequence of smooth functions We note that P r−s ψ s (y) and P r−s ϕ s (y) are well-defined thanks to Lemma B.4 and (5.28). Put From now on we fix k ∈ Z + , and will drop the superindex k in A k and A k . Let us verify that all the conditions of Theorem 4.5 are satisfied. We get for any S ≤ s < u < t ≤ T δA s,u,t = ∫_u^t ∫_D p T −r (x, y) [b k (V r (y) + P r−s ψ s (y)) − b k (V r (y) + P r−s ϕ s (y)) − b k (V r (y) + P r−u ψ u (y)) + b k (V r (y) + P r−u ϕ u (y))] dy dr.
One can see that the functions κ 1 and κ 2 are B(R) × F s measurable, and the functions κ 3 and κ 4 are B(R) × F u measurable, exactly as required by the conditions of Corollary 6.2(iii). Further, we see that κ i ∈ BL m , i = 1, .., 4, thanks to (5.28) and (B.8).

A. Useful results on Besov spaces
We give a brief summary of nonhomogeneous Besov spaces which is sufficient for our purposes. For a more detailed account of the topic, we refer to [BCD11, Chapter 2]. Let ς, ̟ be the radial functions which are given by [BCD11, Proposition 2.10]. We note that ς is supported on a ball while ̟ is supported on an annulus. Let h −1 and h respectively be the inverse Fourier transforms of ς and ̟. The nonhomogeneous dyadic blocks ∆ j are defined by where h j (y) = 2^j h(2^j y), j ≥ 0.
For a distribution f in B γ p , we note that ∆ j f is a smooth function for each j −1. In addition, the Fourier transform of ∆ −1 f is supported on a ball B while for j 1 the Fourier transform of ∆ j f is supported on the annulus 2 j N ⊂ 2 j B for some annulus N .
To obtain various properties of Besov spaces, we will make use of the following Bernstein's inequalities. Let f be a function in L p (R). For every integer k 0, every λ > 0 and t > 0 we have where F f denotes the Fourier transform of f . We refer to [BCD11, Lemmas 2.1 and 2.4] for proofs of (A.1) and (A.2). For a proof of (A.3), we refer to [MW17, Lemma 4].
Similarly, using the mean value theorem and (A.1), we have and Interpolating between these inequalities, we obtain which is the Hilbert transform of h. It is known that the Hilbert transform is bounded on L p (R) for every p ∈ (1, ∞). Hence, we have ‖∆ 0 ζ −1 ‖ Lp(R) ≤ C‖h‖ Lp(R) , which is finite.

B. Other auxiliary results
Proposition B.1. Let Λ be a set and let (X n,λ ) n∈Z + ,λ∈Λ be a collection of random elements taking values in a metric space E. Let (Y n ) n∈Z + be a collection of random elements taking values in a metric space E. Suppose that for each fixed n the random element Y n is independent of (X n,λ ) λ∈Λ . Furthermore, assume that for each fixed λ ∈ Λ one has X n,λ → X λ and Y n → Y in probability as n → ∞. Then Y is independent of (X λ ) λ∈Λ .
Proof. Consider a collection of λ 1 , λ 2 , . . . , λ n for some n ≥ 1. Then we can construct a common subsequence such that X n k ,λ j → X λ j and Y n k → Y almost surely as k → ∞ for all 1 ≤ j ≤ n. Hence, for any bounded continuous functions h 1 , h 2 , . . . , h n , g, the dominated convergence theorem gives E(∏_{j=1}^n h j (X n k ,λ j ) g(Y n k )) → E(∏_{j=1}^n h j (X λ j ) g(Y )) as k → ∞ and also E(∏_{j=1}^n h j (X n k ,λ j )) E g(Y n k ) → E(∏_{j=1}^n h j (X λ j )) E(g(Y )) as k → ∞.
We have assumed that for each k the random element Y n k is independent of (X n k ,λ j ) 1≤j≤n . Therefore, it is immediate from the above that E(∏_{j=1}^n h j (X λ j ) g(Y )) = E(∏_{j=1}^n h j (X λ j )) E(g(Y )). Since n ≥ 1, λ 1 , λ 2 , . . . , λ n and h 1 , h 2 , . . . , h n , g were arbitrary, the result follows.
Lemma B.2 (Gaussian process representation). Let ( Ω, F, P) be a filtered probability space. Let V : [0, T 0 ] × D × Ω → R be a measurable function with the same law as V . Then on the same space there exists a white noise W such that identity (2.4) holds with V in place of V and W in place of W . Furthermore, (ii) suppose additionally that there exists a filtration ( F t ) t∈[0,T 0 ] such that F V t ⊂ F t and for any (s, t) ∈ ∆ 0,T 0 , ϕ ∈ C ∞ c the random variable D ( V t (x) − P t−s V s (x))ϕ(x) dx is independent of F s . Then W is ( F t )-white noise.
Proof. The result is probably well-known. However, we give a proof for the sake of completeness. In what follows we will use the following notation: f, g := D f (y)g(y) dy, for measurable functions f, g : D → R for which the above integral is well-defined. It is well-known that (2.4) is equivalent (see, e.g., [Shi94, Theorem 2.1]) to representing V as a solution to the additive stochastic heat equation in a distributional form: V t , ϕ = 1 2 t 0 V s , ∂ 2 yy ϕ ds + W t (ϕ), for any t ≥ 0, ϕ ∈ C ∞ c . (B.1) Since V has the same law as V , we immediately get that the functional has the same distributional properties as W . That is, for any ϕ ∈ C ∞ c , the process ( W t (ϕ)) t∈[0,T 0 ] is an (F V t )-Brownian motion with E W 1 (ϕ) 2 = ϕ 2 L 2 (D,dx) and clearly this also holds for any ϕ ∈ L 2 (D, dx) since C ∞ c is dense in L 2 (D, dx). Also W t (ϕ) and W t (ψ) are independent whenever ϕ, ψ ∈ C ∞ c with D ϕ(x)ψ(x) dx = 0, and again since C ∞ c is dense in L 2 (D, dx), this holds for any ϕ, ψ ∈ L 2 (D, dx) with D ϕ(x)ψ(x) dx = 0. This immediately implies that W is an (F V t )-white noise. Thus, V is a solution to and thus for any t ∈ [0, T 0 ], F V t ⊂ F W t . On the other hand, from (B.2) we also immediately get that F W t ⊂ F V t for any t ∈ [0, T 0 ], and (i) follows. Let us prove part (ii) of the proposition. We need just to show that for any (s, t) ∈ ∆ 0,T 0 , ϕ ∈ L 2 (D, dx) W t (ϕ) − W s (ϕ) is independent of F s .
By our assumptions, we get that the above stochastic integral is independent of $\widetilde{\mathcal F}_s$. Clearly, since $\widetilde{\mathcal F}_s\subset\widetilde{\mathcal F}_r$ for any $s\le r$, we get that for any $(s_1,s_2)\in\Delta_{s,t}$, where convergence is in $L_2(\Omega)$. By (B.5) and properties of the white noise we get that $Y_n$ is independent of $\widetilde{\mathcal F}_s$ for all $n$, and hence the limit in $L_2(\Omega)$ of this sequence is also independent of $\widetilde{\mathcal F}_s$. Thus $\widetilde W_t(\varphi)-\widetilde W_s(\varphi)$ is independent of $\widetilde{\mathcal F}_s$ for any $\varphi\in C_c^\infty$. The same property extends to any $\varphi\in L_2(D,dx)$ by approximating such $\varphi$ in $L_2(D,dx)$ by a sequence of functions $\varphi_n\in C_c^\infty$ and again passing to the limit of the corresponding sequence of random variables $\widetilde W_t(\varphi_n)-\widetilde W_s(\varphi_n)$.

Since $u=(s+t)/2$, $t-S\le 2(u-S)$, we obtain (B.6) from the above inequalities.

In addition, if $f\in BL_m$, there exists a set $\Omega'\subset\Omega$ of full measure such that for any $\omega\in\Omega'$ $$P_tf(x,\omega)<\infty\quad\text{for Lebesgue almost every } (t,x)\in[0,T_0]\times D. \tag{B.9}$$

Proof. It suffices to show the result assuming that $f$ is non-negative. In such case, it is evident that $P_tf\colon\Omega\times\mathbb R\to\mathbb R$ is measurable. Applying the conditional integral Minkowski inequality, we obtain that We then apply the Minkowski inequality to get

Lemma C.2. For every $\alpha\in[0,1]$, there exists a constant $C=C(\alpha,T_0)>0$ such that for any $s,t\in(0,T_0]$, $s\le t$, $x,x_1,x_2\in D$ we have
$$\int_D|p_t(x_1,y)-p_t(x_2,y)|\,dy\le C|x_1-x_2|^{\alpha}t^{-\alpha/2}, \tag{C.2}$$
$$\int_D p_t(x,y)|y-x|^{\alpha}\,dy\le Ct^{\alpha/2}, \tag{C.3}$$
$$\int_D|p_t(x,y)-p_s(x,y)|\,dy\le Cs^{-\alpha/2}(t-s)^{\alpha/2}. \tag{C.4}$$

Proof. From the elementary estimate $$|g_t(x_1-y)-g_t(x_2-y)|\le C|x_1-x_2|^{\alpha}t^{-\alpha/2}\big(g_{2t}(x_1-y)+g_{2t}(x_2-y)\big)$$ and (2.2) and (2.3), we obtain that $$|p_t(x_1,y)-p_t(x_2,y)|\le C|x_1-x_2|^{\alpha}t^{-\alpha/2}\big(p_{2t}(x_1,y)+p_{2t}(x_2,y)\big)$$ for $p\in\{g,p^{per},p^{Neu}\}$. Integrating over $y\in D$ and noting that $\int_D p_{2t}(x,y)\,dy=1$ for each $x\in D$, we obtain (C.2) for $p\in\{g,p^{per},p^{Neu}\}$.
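As a numerical sanity check of (C.2) in the simplest case $p=g$, $D=\mathbb R$, the following minimal sketch evaluates the left-hand side by a Riemann sum and divides it by $|x_1-x_2|^{\alpha}t^{-\alpha/2}$. It assumes that $g_t$ in (2.1) is the centred Gaussian density with variance $t$; the function names, the grid and the sampled parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np


def gauss(t, z):
    """Centred Gaussian density with variance t (assumed form of g_t in (2.1))."""
    return np.exp(-z**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)


def l1_increment(t, x1, x2, half_width=40.0, n=400_001):
    """Riemann-sum approximation of int_R |g_t(x1 - y) - g_t(x2 - y)| dy."""
    y = np.linspace(-half_width, half_width, n)
    dy = y[1] - y[0]
    return float(np.sum(np.abs(gauss(t, x1 - y) - gauss(t, x2 - y))) * dy)


def ratio(t, x1, x2, alpha):
    """Left-hand side of (C.2) divided by |x1 - x2|^alpha * t^(-alpha/2)."""
    return l1_increment(t, x1, x2) / (abs(x1 - x2) ** alpha * t ** (-alpha / 2.0))


# Largest ratio over a small, hypothetical grid of times, increments and exponents.
worst = max(
    ratio(t, 0.0, h, alpha)
    for t in (0.01, 0.1, 1.0)
    for h in (0.001, 0.01, 0.1, 1.0)
    for alpha in (0.0, 0.25, 0.5, 1.0)
)
print(f"largest observed ratio: {worst:.3f}")
```

A ratio that stays bounded over the grid is consistent with (C.2) but of course does not prove it. For the Gaussian kernel the left-hand side equals $2\,\mathrm{erf}\big(|x_1-x_2|/(2\sqrt{2t})\big)$, so it depends on $x_1,x_2,t$ only through $|x_1-x_2|/\sqrt t$, which is the scaling behind the factor $|x_1-x_2|^{\alpha}t^{-\alpha/2}$.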
The estimate (C.3) for $p_t(x,y)=g_t(x-y)$ follows easily by a change of variable. For the other cases, we note that $g_t(z+k)\le c\,g_t(z)e^{-2\frac{|k|}{t}}$ for every $|z|\le 2$ and every $k\in\mathbb R$. Since $\sum_{n\in\mathbb Z}e^{-2\frac{|n|}{t}}\le C(T_0)$, we see from (2.2) and (2.3) that $$\int_D p_t(x,y)|y-x|^{\alpha}\,dy\le C(T_0)\int_{\mathbb R}g_t(x-y)|x-y|^{\alpha}\,dy.$$

The upper bound on $\rho_t$ is established in exactly the same way.
Lemma C.4. Let $0\le s\le u\le t\le T_0$. Let $f\colon\mathbb R\times\Omega\to\mathbb R$ be a bounded $\mathcal B(\mathbb R)\otimes\mathcal F_s$-measurable function. Then $$\mathsf E^u f(V_t(x)) = G_{\rho_{t-u}(x)}f\big(P_{t-u}V_u(x)\big). \tag{C.7}$$ In addition, there exists a universal constant $C=C(T_0)>0$ such that for every $x\in D$, $\gamma<0$, $n\in[1,\infty]$ and $p\in[n,\infty]$

Proof. For $s\le t$ introduce the process $$Z_{s,t}(x) := \int_s^t\int_D p_{t-r}(x,y)\,W(dr,dy),\quad x\in D. \tag{C.9}$$ By definition of $Z$, we have for any $0\le s\le u\le t$, $x\in D$ $$V_t(x)=P_{t-u}V_u(x)+Z_{u,t}(x). \tag{C.10}$$ It is immediate to see that $Z_{u,t}(x)$ is independent of $\mathcal F_u$ (and hence of $\mathcal F_s$) and is Gaussian with zero mean and variance $$\operatorname{Var}(Z_{u,t}(x))=\int_u^t p_{2(t-r)}(x,x)\,dr=\rho_{t-u}(x). \tag{C.11}$$ Using (C.10), this yields $$\mathsf E^u f(V_t(x))(\omega)=\big[G_{\rho_{t-u}(x)}f(\cdot,\omega)\big]\big(P_{t-u}V_u(x)\big),$$ which is (C.7).

Next, we show (C.8). To proceed, we further decompose $$P_{t-u}V_u(x)=P_{t-s}V_s(x)+P_{t-u}Z_{s,u}(x).$$ The random variable $$P_{t-u}Z_{s,u}(x)=\int_s^u\int_D p_{t-r}(x,y)\,W(dr,dy)$$ is independent of $\mathcal F_s$ and has a Gaussian law with mean zero and variance where the first inequality follows from (C.1). Hence, using (C.7) and the fact that $f$ is $\mathcal F_s$-measurable, we have where $g$ is the standard Gaussian density (2.1). Now we put $q=\frac pn\ge 1$, $\frac1q+\frac1{q'}=1$ and apply the Hölder inequality and (C.12) to estimate the above integral from above by $$\Big(\int_{\mathbb R}\big|G_{\rho_{t-u}(x)}f\big(P_{t-s}V_s(x)+z\big)\big|^{p}\,dz\Big)^{n/p}\,\big\|g_{\rho_{s,u,t}(x)}(\cdot)\big\|_{L_{q'}}. \tag{C.14}$$ The first factor in the above expression equals $\|G_{\rho_{t-u}(x)}f(\cdot,\omega)\|^{n}_{L_p(\mathbb R)}$ and thus is bounded above by $C\|f(\cdot,\omega)\|^{n}_{\mathcal B^{\gamma}_{p}}\,\rho_{t-u}(x)^{\frac{\gamma n}{2}}$ by Lemma A.3(i). The second factor in (C.14) is bounded by $C\rho_{s,u,t}(x)^{-\frac{n}{2p}}$ for some constant $C>0$, where we again used Lemma A.3(i) for the Dirac delta. Taking into account (C.6) and the lower bound for $\rho_{s,u,t}(x)$ in (C.12), we continue (C.13) as follows: Taking expectation, we get (C.8).
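The exponent $-\frac{n}{2p}$ in the bound for the second factor of (C.14) can also be checked by a direct computation, which is a different route from the appeal to Lemma A.3(i) used above. The following sketch assumes that $g_\rho$ in (2.1) is the centred Gaussian density with variance $\rho$, that is, $g_\rho(z)=(2\pi\rho)^{-1/2}e^{-z^2/(2\rho)}$. For $q'\in[1,\infty)$,
$$\|g_\rho\|_{L_{q'}(\mathbb R)}^{q'}=(2\pi\rho)^{-q'/2}\int_{\mathbb R}e^{-\frac{q'z^2}{2\rho}}\,dz=(2\pi\rho)^{-q'/2}\Big(\frac{2\pi\rho}{q'}\Big)^{1/2},$$
and therefore, using $1-\frac1{q'}=\frac1q$,
$$\|g_\rho\|_{L_{q'}(\mathbb R)}=(q')^{-\frac{1}{2q'}}(2\pi\rho)^{-\frac12\left(1-\frac1{q'}\right)}=(q')^{-\frac{1}{2q'}}(2\pi\rho)^{-\frac{1}{2q}}.$$
With $q=p/n$ the power of $\rho$ is $-\frac{n}{2p}$, in agreement with the bound stated above. In the endpoint case $q=1$, $q'=\infty$, one has $\|g_\rho\|_{L_\infty(\mathbb R)}=g_\rho(0)=(2\pi\rho)^{-1/2}$, which is the same power law.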