Some Convexity Criteria for Differentiable Functions on the 2-Wasserstein Space

We show that a differentiable function on the 2-Wasserstein space is geodesically convex if and only if it is also convex along a larger class of curves which we call 'acceleration-free'. In particular, the set of acceleration-free curves includes all generalised geodesics. We also show that geodesic convexity can be characterised through first and second-order inequalities involving the Wasserstein gradient and the Wasserstein Hessian. Consequently, such inequalities also characterise convexity along acceleration-free curves.


Introduction
1.1 Acceleration-Free Curves and Main Result. In the theory developed in the book of Ambrosio, Gigli and Savaré [2], the notion of geodesic semi-convexity is a key ingredient in establishing the existence and uniqueness of gradient flows in the 2-Wasserstein space (P 2 (R d ), W 2 ). However, without the stronger notion of semi-convexity along generalised geodesics, we cannot use the theory developed in [2] to establish several other important properties of gradient flows such as stability and optimal error estimates. Despite the strength of convexity along generalised geodesics over geodesic convexity in this regard, it is shown in [2, Chapter 9] that, for the three energy functionals introduced by McCann in [14], these two notions coincide. This manuscript shows that this occurrence is not limited to these energy functionals. On the contrary, in Section Three, we prove the following theorem.
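Theorem 1.1. Let F : P 2 (R d ) → R and let λ ∈ R. If F is differentiable on P 2 (R d ) then the following statements are equivalent.
(i) F is λ-geodesically convex.
(ii) F is λ-convex along generalised geodesics.
(iii) F is λ-convex along acceleration-free curves.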
Theorem 1.1 introduces the notion of acceleration-free curves (cf. Definition 3.1) which are, heuristically, curves of probability measures which describe the evolution of a density of particles for which the path of each particle is acceleration-free. For continuously differentiable functions on the 2-Wasserstein space, the notion of convexity along acceleration-free curves coincides with the first-order notions of L-convexity and displacement monotonicity, introduced by Carmona and Delarue (cf. [4, Definition 5.70]) and Ahuja (cf. [1, Equation 4.3]) respectively. The latter notion of displacement monotonicity has seen particular success in the study of mean field games (see Ahuja [1]; Gangbo et al. [10]; Mészáros and Mou [15]; Gangbo and Mészáros [9]) and suggests that the further study of acceleration-free convexity may be applicable in this area.
Since the set of acceleration-free curves includes every generalised geodesic, the notion of convexity along acceleration-free curves is stronger still than the notion of convexity along generalised geodesics. Consequently, in proving Theorem 1.1, our main focus is to show that every geodesically convex, differentiable function is also convex along acceleration-free curves. As we demonstrate in Example 3.9, the notions of geodesic convexity and acceleration-free convexity do not generally coincide, even if we assume our functions to be continuous. Consequently, the assumption of differentiability forms a part of our analysis.
1.2 Strategy of Proof - Theorem 1.1. In order to clearly understand acceleration-free curves and their associated notion of convexity, our analysis focuses on acceleration-free curves between discrete measures. Firstly, this is because such curves may be viewed, at the finite-dimensional level, as a collection of straight-line paths between a finite collection of points in R d . Secondly, we choose discrete measures because the set of discrete measures on R d is dense in (P 2 (R d ), W 2 ).
Consider [0, 1] ∋ t → µ t , an acceleration-free curve between discrete measures. Since the measure µ s is also discrete for all s ∈ (0, 1), we may envision the curve [0, 1] ∋ t → µ t as a collection of straight-line paths between a finite collection of points in R d . Moreover, given s ∈ (0, 1), there is no 'crossing of mass' between the underlying particles on a small interval on either side of s (even though there may be 'crossing of mass' when t = s itself). As we show in Lemma 3.4, this behaviour means that, for every s ∈ (0, 1), there exist ε, δ > 0 such that the curves [s, s + ε] ∋ t → µ t and [s − δ, s] ∋ t → µ t define geodesics. Consequently, if a function F : P 2 (R d ) → R is assumed to be geodesically convex, then the map t → F (µ t ) must be convex on the intervals [s, s + ε] and [s − δ, s].
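The local geodesic behaviour described above can be checked numerically in a toy case. The following Python sketch is illustrative only and is not part of the original argument. It assumes uniform weights on finitely many atoms of the real line, so that an optimal plan may be sought among permutations, and the helper names `w2_uniform` and `curve` are hypothetical. Two particles swap positions, so mass crosses at t = 1/2: the curve behaves like a geodesic on [0, 1/2], yet it returns to its starting measure at t = 1 and is therefore not a geodesic on [0, 1].

```python
from itertools import permutations
from math import sqrt


def w2_uniform(xs, ys):
    """2-Wasserstein distance between uniform measures on the atoms xs and ys.

    Assumption: len(xs) == len(ys). For uniform weights an optimal plan can be
    chosen as a permutation, so a brute-force search over permutations suffices.
    """
    n = len(xs)
    best = min(
        sum((xs[i] - ys[p[i]]) ** 2 for i in range(n))
        for p in permutations(range(n))
    )
    return sqrt(best / n)


def curve(starts, ends, t):
    """Atoms at time t when particle k moves in a straight line from starts[k] to ends[k]."""
    return [(1 - t) * x + t * y for x, y in zip(starts, ends)]


starts, ends = [0.0, 1.0], [1.0, 0.0]  # the two particles swap places

# Before the crossing time 1/2 the straight-line pairing is optimal.
local = w2_uniform(curve(starts, ends, 0.1), curve(starts, ends, 0.3))
# Across the crossing time the two measures coincide.
across = w2_uniform(curve(starts, ends, 0.4), curve(starts, ends, 0.6))
print(local, across)
```

Up to rounding, `local` equals |0.3 − 0.1| while `across` vanishes, which is the one-dimensional picture behind the restriction to small intervals around each time s.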
In general, convexity on the intervals [s, s + ε] and [s − δ, s] does not mean that the map t → F (µ t ) is convex on [0, 1]. This is because joining the two intervals [s − δ, s] and [s, s + ε] may create a non-convex 'cusp' at t = s. To rule out such cusps, we show in Lemma 3.8 that none can exist when [0, 1] ∋ t → F (µ t ) is differentiable. We also show, in Lemma 3.7, that [0, 1] ∋ t → F (µ t ) is differentiable whenever F is differentiable. Subsequently, in the proof of Theorem 1.1, we assume that F is differentiable and λ-geodesically convex. As a consequence of Lemmas 3.4, 3.7 and 3.8, it follows that F is λ-convex along any acceleration-free curve between discrete measures. Moreover, since F is continuous, the convexity criterion of Lemma 3.5 implies that F must also be λ-convex along any acceleration-free curve.
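The role of differentiability in excluding cusps can be illustrated for functions of one real variable. The sketch below is a hedged numerical illustration rather than a proof: it tests only the midpoint inequality on a finite grid, and the function names `is_convex_on_grid`, `cusp` and `glued` are hypothetical. The function `cusp` is convex on each of [−1, 0] and [0, 1] but has a concave kink at 0, whereas `glued` is convex on each half and differentiable at 0.

```python
def is_convex_on_grid(f, a, b, n=200, tol=1e-12):
    """Test f((x + y) / 2) <= (f(x) + f(y)) / 2 for all grid points x, y in [a, b].

    This is a necessary condition for convexity checked on finitely many
    points only; it is a sanity check, not a proof.
    """
    pts = [a + (b - a) * i / n for i in range(n + 1)]
    return all(
        f((x + y) / 2) <= (f(x) + f(y)) / 2 + tol
        for x in pts
        for y in pts
    )


def cusp(t):
    """Linear, hence convex, on [-1, 0] and on [0, 1]; not differentiable at 0."""
    return -abs(t)


def glued(t):
    """t^2 on [-1, 0] and 2 t^2 on [0, 1]; the one-sided derivatives agree at 0."""
    return t * t if t <= 0 else 2 * t * t


print(is_convex_on_grid(cusp, -1.0, 0.0), is_convex_on_grid(cusp, 0.0, 1.0))
print(is_convex_on_grid(cusp, -1.0, 1.0))   # the kink at 0 breaks convexity
print(is_convex_on_grid(glued, -1.0, 1.0))  # no kink, convexity survives gluing
```

The contrast between the last two checks mirrors the difference between the continuous and the differentiable setting discussed above.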
1.3 Higher Order Convexity Criteria. Supplementary to our first result, we present Theorems 1.2 and 1.3. These theorems characterise geodesic semi-convexity for differentiable and twice differentiable functions on P 2 (R d ) respectively. Moreover, as a consequence of Theorem 1.1, these results also characterise semi-convexity along generalised geodesics and acceleration-free curves. Furthermore, whilst we only address geodesic semi-convexity in this manuscript, we expect that Theorems 1.2 and 1.3 may be suitably extended to characterise the notion of geodesic ω-convexity introduced by Craig in [7].
then F is λ-geodesically convex if and only if, for all µ ∈ P 2 (R d ) and all ζ ∈ T µ P 2 (R d ), the following inequality holds.
Theorems 1.2 and 1.3 are proven in Section Four and may be seen as analogues of the respective inequalities that characterise semi-convexity for differentiable and twice differentiable functions on finite-dimensional space. Although the 2-Wasserstein space is not a smooth manifold, the formal Riemannian calculus, proposed by Otto in [16] and further developed by Otto and Villani in [17], provides a well-established notion of differentiability for functions defined on P 2 (R d ). For the notions of gradient and Hessian utilised in this manuscript, we refer to the theory of differential 1-forms and 2-forms subsequently developed by Gangbo et al. in [8]. In particular, the work of Gangbo and Chow (see [6]) elucidates that the differential 2-form proposed in [8] defines a Hessian on the 2-Wasserstein space and that this Hessian is consistent with the Levi-Civita connection proposed by Gigli in [11] and Lott in [13].
Whilst the current literature lacks a complete presentation of the first and second-order convexity criteria presented above, there are well-established first-order convexity inequalities such as the aforementioned notions of L-convexity and displacement monotonicity. Furthermore, Lanzetti et al. show that, when a function [12, Proposition 2.8]). Comparatively, in Theorem 1.2, we show that the converse also holds.
In contrast to the first-order convexity criteria, second-order convexity inequalities, such as (2), are far less well-established. In particular, it seems there is currently no description of geodesic semi-convexity purely in terms of the Hessian presented in [6, Definition 3.1] (see also Definition 2.6). In addition, as we explain in the following subsection, the nature of parallel transport on 2-Wasserstein space means that establishing a second-order characterisation of geodesic convexity requires particular care.
1.4 Strategy of Proof - Theorem 1.3. As seen in Definition 2.6, the Wasserstein Hessian is defined via an extension; however, it is also possible to calculate the Hessian directly via the covariant derivative proposed in [11] and [13]. Whilst it is expected that one could employ this latter method to derive a second-order geodesic convexity criterion, there are some difficulties in this approach. In particular, as established in [11, Example 5.20], parallel transport does not exist everywhere along some Wasserstein geodesics. In recognition of this difficulty, we instead choose to establish a second-order geodesic convexity criterion via an alternative argument which we describe as follows.
Firstly, let [0, 1] ∋ t → µ t ∈ P 2 (R d ) be a geodesic between measures µ, ν ∈ P 2 (R d ) and assume that there exists ϕ ∈ C ∞ c (R d ) such that µ t = (id + t∇ϕ) # µ for all t ∈ [0, 1].In Lemma 4.1, we show (under some additional convexity assumption on ϕ) that the second derivative of the map t → F (µ t ) may be written explicitly in terms of the Wasserstein Hessian.Consequently, convexity along any such curve t → µ t may be characterised by Inequality (2).
In order to extend this characterisation to a larger class of geodesics we subsequently introduce the set P rc 2 (R d ). This is the subset of P 2 (R d ) containing measures which are absolutely continuous with respect to the d-dimensional Lebesgue measure and have compact support. In particular, in Lemma 4.2, we show that, if µ, ν ∈ P rc 2 (R d ), then the optimal map between any µ and ν can be approximated by a sequence of smooth optimal maps (T n ) n∈N for which (
1.5 A Note. After the initial submission of this manuscript to the arXiv, the author was made aware of the work of Cavagnari, Savaré and Sodini [5] which was developed in parallel with this manuscript. The results of [5] present several connections with Section Three of this manuscript. In particular, the authors introduce the notion of total convexity and this corresponds to our notion of convexity along acceleration-free curves.
In addition to this, it is shown in [5, Theorem 9.1, Remark 9.2] that, if d ≥ 2, then the assumption of differentiability in Theorem 1.1 may be relaxed to the assumption of continuity. To prove this result, the authors of [5] also find the space of discrete measures crucial to their argument and, moreover, the findings of Lemma 3.4 may similarly be found in [5, Theorem 6.2].

Notation and Preliminaries
2.1 Notation. The notation introduced here is largely consistent with the notation used in [2]. We refer the reader to [2] for a more detailed description of their properties.
The set P(R d ) denotes the space of Borel probability measures over R d and the set P 2 (R d ) denotes the set of µ ∈ P(R d ) with finite second moment. For µ, ν ∈ P 2 (R d ), the set of transport plans Γ(µ, ν) denotes the set of γ ∈ P 2 (R d × R d ) with first marginal µ and second marginal ν. Subsequently, we define the 2-Wasserstein distance
\[ W_2(\mu, \nu) := \left( \min_{\gamma \in \Gamma(\mu, \nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} |x - y|^2 \, d\gamma(x, y) \right)^{1/2}. \]
The pair (P 2 (R d ), W 2 ) defines a metric space which we refer to as 2-Wasserstein space and, throughout this manuscript, we will assume that P 2 (R d ) is endowed with the W 2 distance.
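The set of optimal transport plans Γ o (µ, ν) is defined as follows.
\[ \Gamma_o(\mu, \nu) := \left\{ \gamma \in \Gamma(\mu, \nu) : \int_{\mathbb{R}^d \times \mathbb{R}^d} |x - y|^2 \, d\gamma(x, y) = W_2^2(\mu, \nu) \right\}. \]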
For i, j ∈ {1, 2, 3} and k ∈ {1, 2}, we define the projection operators π i,j (x 1 , x 2 , x 3 ) := (x i , x j ) and π k (x 1 , x 2 ) := x k . Given a set A ⊂ R d , its convex hull is denoted Hull(A). Finally, given a measure µ ∈ P(R d ), its support is denoted supp(µ) and, given a µ-measurable map T : R d → R n , the pushforward of µ through T is the measure T # µ defined by
\[ T_{\#}\mu(B) := \mu(T^{-1}(B)) \quad \text{for every Borel set } B \subset \mathbb{R}^n. \]
2.2 Preliminaries. In this subsection, we recall a number of established definitions and results for use throughout the remainder of the manuscript. In particular, since Theorem 1.1 concerns both geodesic convexity and convexity along generalised geodesics, we first recall the definitions of such concepts from [14, Definition 1.1] and [2, Definition 9.2.4] respectively. Furthermore, we recall from [2, Lemma 7.2.1] a useful lemma concerning the properties of Wasserstein geodesics.
) for at least one geodesic between every pair of measures in P 2 (R d ). However, if F is λ-geodesically convex and continuous, then F necessarily satisfies Equation (3) for every geodesic between every pair of measures in P 2 (R d ). This is a consequence of Lemma 2.3.
In order to establish Theorems 1.2 and 1.3, we first require notions of differentiability and twice differentiability on the 2-Wasserstein space. Subsequently, we recall these concepts from [8, Definition 4.9] and [6, Definition 3.1] respectively.
Definition 2.5.
is non-empty, then we say that F is differentiable at µ and define the Wasserstein gradient ∇ w F [µ] to be its element of minimal norm.We also define the differential dF If HessF [µ] exists and there exists In addition, we denote this extension by HessF [µ] and say that F is twice differentiable at µ.

Throughout this manuscript, we differentiate functions of the form f : [a, b] → R. When we say that f is differentiable on [a, b], we mean that f is differentiable on (a, b) and its respective right and left-sided derivatives exist at the points a and b. When evaluated at a generic point on the interval [a, b], we denote derivatives of f : [a, b] → R by f ′ or $\frac{df}{dx}$; however, when evaluated specifically at the endpoints, we utilise the right and left-sided derivatives which we denote by $\frac{df}{dx^+}$ and $\frac{df}{dx^-}$ respectively. Using this definition of differentiability, we derive the following convexity criteria which we will use to prove Theorem 1.2. [3, Theorem 12.18]). Furthermore, it is shown in [18, Appendix C, Theorem 1] that the left and right-sided derivatives of a convex function satisfy the following inequality Identifying the one-sided derivatives at b and a with f ′ (b) and f ′ (a) respectively, the above inequality implies that (f

Convexity Along Acceleration-Free Curves
In this section, our goal is to prove Theorem 1.1.To achieve this, we first introduce acceleration-free curves and their associated notion of convexity.Subsequently, we examine some properties of these curves and develop a number of preparatory lemmas concerning the differentiation of functions along such curves.Finally, we prove Theorem 1.1 and construct an example to show that, in general, the notions of geodesic convexity and acceleration-free convexity do not coincide, even if we assume that our functions are continuous.

Acceleration-Free Curves
Definition 3.1. We say that a curve
Remark 3.2. An acceleration-free curve may be induced by any transport plan; however, a Wasserstein geodesic is an acceleration-free curve induced by a transport plan which is optimal for the transport between the initial and final measures.
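Remark 3.2 can be made concrete for discrete plans. The sketch below is a hedged illustration which assumes that an acceleration-free curve induced by a plan γ takes the usual interpolation form µ t = ((1 − t)π 1 + tπ 2 ) # γ, in analogy with the expression used for geodesics later in this section. The dictionary representation of measures and the helper names `pushforward` and `interpolate` are hypothetical choices made for this example, and exact rational arithmetic is used so that atoms which meet are merged exactly.

```python
from collections import defaultdict
from fractions import Fraction as Fr


def pushforward(measure, T):
    """Pushforward T_# mu of a discrete measure stored as {atom: mass}."""
    out = defaultdict(Fr)  # missing atoms start with mass 0
    for atom, mass in measure.items():
        out[T(atom)] += mass
    return dict(out)


def interpolate(plan, t):
    """((1 - t) pi^1 + t pi^2)_# gamma for a discrete plan {(x, y): mass} on R x R."""
    return pushforward(plan, lambda xy: (1 - t) * xy[0] + t * xy[1])


# Hypothetical plan: mass 1/2 travels from 0 to 2 and mass 1/2 travels from 2 to 0.
# Both marginals equal (delta_0 + delta_2) / 2, so this plan is not optimal.
plan = {(Fr(0), Fr(2)): Fr(1, 2), (Fr(2), Fr(0)): Fr(1, 2)}

first_marginal = pushforward(plan, lambda xy: xy[0])
quarter = interpolate(plan, Fr(1, 4))
half = interpolate(plan, Fr(1, 2))  # the two particles meet at the point 1
print(first_marginal, quarter, half)
```

Every intermediate measure is again discrete, and the only time at which atoms merge is the crossing time t = 1/2.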
In the following Lemma, we show that an acceleration-free curve between discrete measures also defines a Wasserstein geodesic when we restrict ourselves to small enough subintervals of [0, 1].This property is key to our further analysis of acceleration-free convexity as it allows us to study acceleration-free curves using the properties of Wasserstein geodesics.
Fix s ∈ [0, 1), fix p ∈ {1, . . ., l} and let P s p ⊂ {1, . . ., l} denote the set of integers q such that z p s + w p = z q s + w q . In particular, we remark that p ∈ P s p for all p ∈ {1, . . ., l} and, consequently, the set P s p is always non-empty. Since the set Q is finite, we may choose ε p > 0 such that (s, s + ε p ] ∩ Q = ∅. Consequently, there exists δ p > 0 such that |z p s + w p − (z q t + w q )| > δ p for all q ∉ P s p and all t ∈ [s, s + ε p ]. Additionally, for any q ∈ P s p , it follows that lim t→s |z p s + w p − (z q t + w q )| = 0.
Consequently, we may re-choose an even smaller ε p > 0 such that δ p > |z p s + w p − (z q t + w q )| for all t ∈ [s, s + ε p ] and all q ∈ P s p . In particular, by re-choosing a smaller ε p it still holds that |z p s + w p − (z q t + w q )| > δ p for all q ∉ P s p and all t ∈ [s, s + ε p ]. Moreover, Equation (5) holds for all t ∈ [s, s + ε p ] and all q ∈ P s p .
We subsequently define ε := inf_{1≤p≤l} ε p .
There exists an optimal plan σ ∈ Γ o (µ s , µ t ) of the form
Proof of Claim. To show that σ is an optimal plan, it is sufficient to show that the support of σ is cyclically monotone. This result is a consequence of [2, Theorem 6.1.4]. Moreover, the support of σ is cyclically monotone if the following inequality is satisfied for any permutation ρ of the integers {1, . . ., l}.
We first assume that ρ is a permutation map such that ρ(k) ∈ P s k for all k ∈ {1, . . ., l}. By the definition of P s k it follows that z k s + w k = z ρ(k) s + w ρ(k) for all k ∈ {1, . . ., l}. Consequently, the following equality holds.
On the other hand, assume that ρ is a permutation map and there exists a set R ⊂ {1, . . ., l} such that ρ(k) ∉ P s k for all k ∈ R and such that ρ(k) ∈ P s k for all k ∈ {1, . . ., l} \ R. As a consequence of Equation (5), the following system of inequalities holds.
Since the two cases we have considered include all possible permutations of the integers {1, . . ., l}, we conclude that the support of σ must be cyclically monotone and, hence, the plan σ must be optimal between µ s and µ t .
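The permutation criterion used in the proof of the claim can be tested directly in small examples. The following sketch is illustrative and rests on an assumption: it takes the displayed inequality to be the usual comparison of the cost of pairing the k-th atom with itself against the cost of pairing it with the ρ(k)-th atom, summed over k, for uniform masses. The names `sq_dist`, `pairing_is_optimal` and `at_time` are hypothetical. Two particles in the plane follow straight lines that meet at time 1/2: the straight-line pairing passes the test for a short time step but fails for the whole interval.

```python
from itertools import permutations


def sq_dist(x, y):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def pairing_is_optimal(xs, ys, tol=1e-12):
    """True if pairing xs[k] with ys[k] costs no more than pairing xs[k] with
    ys[rho[k]], for every permutation rho (uniform masses assumed)."""
    n = len(xs)
    base = sum(sq_dist(xs[k], ys[k]) for k in range(n))
    return all(
        base <= sum(sq_dist(xs[k], ys[rho[k]]) for k in range(n)) + tol
        for rho in permutations(range(n))
    )


starts = [(0.0, 0.0), (1.0, 0.0)]
ends = [(1.0, 1.0), (0.0, 1.0)]  # the two straight paths meet at time 1/2


def at_time(t):
    """Particle positions at time t along the straight-line paths."""
    return [
        tuple((1 - t) * a + t * b for a, b in zip(x, y))
        for x, y in zip(starts, ends)
    ]


print(pairing_is_optimal(starts, at_time(0.25)))  # short step, before the crossing
print(pairing_is_optimal(starts, ends))           # whole interval, past the crossing
```

In this example the straight-line pairing stops being optimal exactly when the step passes the crossing time, which is why the argument only yields geodesics on sufficiently small intervals.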
|x − y| 2 dσ(x, y) As a consequence of Equation (6), the following equality also holds for all t ∈ [s, s Moreover, it follows that εW 2 (µ s , µ t ) = |t − s|W 2 (µ s , µ s+ε ) for all t ∈ [s, s + ε], and so, up to a time re-scaling of a factor ε, the curve [s, s + ε] ∋ t → µ t defines a unit time geodesic between µ s and µ s+ε . By a similar argument, for every s ∈ (0, 1], there exists δ > 0 such that, up to a time re-scaling, [s − δ, s] ∋ t → µ t defines a unit time geodesic between µ s−δ and µ s .
To further motivate the study of acceleration-free curves between discrete measures, the following lemma shows that it is enough, in the context of proving Theorem 1.1, to show that a function is convex along acceleration-free curves when restricted to the set P d 2 (R d ).
Lemma 3.5. Let F : P 2 (R d ) → R. If F is continuous then F is λ-convex along acceleration-free curves if and only if its restriction to P d 2 (R d ) is λ-convex along acceleration-free curves.
Proof. If [0, 1] ∋ t → µ t defines an acceleration-free curve between two discrete measures then µ t is a discrete measure for all t ∈ [0, 1]. Consequently, if F is λ-convex along acceleration-free curves then its restriction to P d 2 (R d ) is λ-convex along acceleration-free curves and it is left to prove the converse implication. Let µ 1 , µ 2 ∈ P 2 (R d ) and let γ ∈ Γ(µ 1 , µ 2 ). Since t defines an acceleration-free curve between two discrete measures which we denote µ 1,n and µ 2,n . Consequently, if we assume that F : P 2 (R d ) → R is continuous and its restriction to P d 2 (R d ) is λ-convex along acceleration-free curves, then the following system of inequalities holds for all t ∈ [0, 1].

Differentiation Along Acceleration-Free Curves and Proof of Theorem 1.1.
Before attempting to calculate the derivative of a function along an acceleration-free curve, it is first useful to establish the following lemma which characterises the derivative of a differentiable function along geodesics.Since acceleration-free curves between discrete measures behave somewhat like Wasserstein geodesics, we may also use Lemma 3.6 in order to characterise the derivative of a differentiable function along these curves.
In particular, since t → µ t is a geodesic, the following system of equalities holds.
Proof. Given s ∈ [0, 1), it follows from Lemma 3.4 that there exists ε > 0 such that [s, s + ε] ∋ t → µ t defines a geodesic. Moreover, by Definition 2.1, there exists an optimal plan γ s,ε ∈ Γ o (µ s , µ s+ε ) such that µ s+tε = ((1 − t)π 1 + tπ 2 ) # γ s,ε for all t ∈ [0, 1]. Consequently, it follows from Lemma 3.6 that we may express γ s,ε in terms of γ. In particular, the following equality holds. From Equation (7), it follows that and, consequently,
For s ∈ (0, 1], it also follows from Lemma 3.4 that there exists δ > 0 such that [s − δ, s] ∋ t → µ t defines a geodesic. Consequently (and using similar reasoning to the calculation of the right-sided derivative), there exists an optimal plan γ s,δ As established in Equation (7), it follows that Since s was an arbitrary point in the interval (0, 1), the left and right-sided derivatives of t → F (µ t ) agree on (0, 1). Moreover, we conclude that [0, 1] ∋ t → F (µ t ) is differentiable on [0, 1].
Proof. Since f is differentiable, it follows that f is convex on [a, c] if and only if (f ′ (x) − f ′ (y))(x − y) ≥ 0 for all a < x, y < c (cf. [3, Theorem 12.18]). Without loss of generality, we fix a < x ≤ y < c. If x, y ≥ b or x, y ≤ b then, since f is convex on [a, b] and [b, c], it follows as a consequence of [3, Theorem 12.18] that (f We conclude that f must be convex on [a, c].
Proof of Theorem 1.1. The set of acceleration-free curves contains every generalised geodesic and the set of generalised geodesics contains every geodesic. Consequently, if F is λ-convex along acceleration-free curves, it must also be λ-convex along generalised geodesics and, similarly, if F is λ-convex along generalised geodesics then F must also be λ-geodesically convex. It is left to show that λ-geodesic convexity implies λ-convexity along acceleration-free curves. We first consider the case in which λ = 0.
Let [0, 1] ∋ t → µ t be an acceleration-free curve between µ 1 , µ 2 ∈ P d 2 (R d ). Given s ∈ (0, 1), Lemma 3.4 implies that there exist δ, ε > 0 such that the maps [s − δ, s] ∋ t → µ t and [s, s + ε] ∋ t → µ t define geodesics. Moreover, since we assume F to be geodesically convex, the restriction of [0, 1] ∋ t → F (µ t ) to the intervals [s − δ, s] and [s, s + ε] must be convex. By Lemma 3.7, the map [0, 1] ∋ t → F (µ t ) is differentiable on [0, 1] and, consequently, we conclude from Lemma 3.8 that the restriction of the map [0, 1] ∋ t → F (µ t ) to the interval [s − δ, s + ε] is also convex. Since the map t → F (µ t ) is continuous on [0, 1] and convex on a neighbourhood of s for every s ∈ (0, 1), we conclude that the map [0, 1] ∋ t → F (µ t ) is convex. Now, since t → µ t was an arbitrary acceleration-free curve between measures in P d 2 (R d ), we conclude that the restriction of F to P d 2 (R d ) must be convex along acceleration-free curves. Moreover, since F is differentiable it must also be continuous, and so, as a consequence of Lemma 3.5, F must also be convex along all acceleration-free curves.
We now assume that F : P 2 (R d ) → R is differentiable and λ-geodesically convex for λ ≠ 0. By Remark 3.3, the map is differentiable and geodesically convex. As we have shown, this implies that the map is convex along acceleration-free curves and so we conclude by Remark 3.3 that F : P 2 (R d ) → R must be λ-convex along acceleration-free curves.
In the following example, we construct a function on P 2 (R) which is both geodesically convex and continuous but not convex along acceleration-free curves. This demonstrates that, in general, one cannot hope to relax the assumption of Theorem 1.1 from differentiability to continuity.
Example 3.9. Let ε > 0. We define We also recall the definition of the sets ∆ + and ∆ − from [19, Proposition 7.25].

First and Second Order Convexity Criteria
In this section, we prove Theorems 1.2 and 1.3. Whilst the proof of Theorem 1.2 is relatively self-contained, the proof of Theorem 1.3 requires us, as outlined in Subsection 1.4, to first establish a number of helpful lemmas.
Furthermore, using Lemma 3.6 to characterise the one-sided derivatives of t → F (µ t ), we conclude that Equation (8) and Equation (1) are equivalent and, since the choice of geodesic was arbitrary, we conclude that (1) must hold for all µ 1 , µ 2 and all γ ∈ Γ o (µ 1 , µ 2 ).
Subsequently, the following inequality must hold for any 0 ≤ s < r ≤ 1.
Second Order Convexity Criteria. To begin this subsection, we first calculate the second derivative of the map t → F (µ t ) when F is twice differentiable and [0, 1] ∋ t → µ t defines a suitably 'smooth' geodesic.
From Equation (10), it follows that ∇ϕ . It also follows from Lemma 2.3 that there is a unique optimal plan between µ t and µ s for all s ∈ [0, 1] and t ∈ [0, 1). In the following system of equalities, we show that this optimal plan is induced by an optimal map of the
Now, since U is a neighbourhood of Hull(supp(µ 1 )) and µ t = (g t ) # µ 1 , it follows that supp(µ t ) is a subset of g t (U ). Moreover, since the optimal map id + (s − t)(∇ϕ ∘ g −1 t ) is only uniquely defined µ t -almost everywhere and since ∇f t | g t (U ) = ∇ϕ ∘ g −1 t , we may identify the optimal map with id + (s − t)∇f t . In particular, this means that
Utilising Definition 2.5, we calculate the first derivative of F ∘ µ t as follows.
Since F is twice differentiable, it follows from Definition 2.6 that the map t We calculate the second derivative as follows.
and via the application of the chain rule, the following system of equalities holds µ t -almost everywhere.
Via the application of the chain rule, we also derive Equation (12).
) and let T denote the optimal map between µ and ν.There exists a sequence of C ∞ c (R d ) functions (ϕ n ) n∈N and U n , a convex neighbourhood of Hull(supp(µ)), such that each ϕ n is (−1)-convex on U n and (∇ϕ n + id) → T in L 2 (µ; R d ) as n → ∞.
Proof. In [2, Theorem 6.2.10], it is shown that, for µ ∈ P r 2 (R d ), there exists a convex function φ such that ∇φ = T in L 2 (µ; R d ). It is also shown that, if ν is compactly supported, then φ is locally Lipschitz. Since ν is compactly supported, it also follows that T is bounded µ-almost everywhere. Let χ ε be a positive mollifier and let (κ n ) n∈N be a sequence of smooth cutoff functions such that κ n = 1 and ∇κ n = 0 on B n (0). Since µ has compact support, there exists N ∈ N such that Hull(supp(µ)) ⊂ B n (0) for all n ≥ N . We fix such an N , define ϕ n := κ n+N (χ 1/n * φ − (1/2)|id| 2 ) and let U n = B n+N (0). Since mollification preserves convexity and κ n+N (x) = 1 for x ∈ U n , each ϕ n is (−1)-convex on U n . Now, using the convergence properties of a mollifier, the gradient ∇ϕ n (x) converges to T (x) − x for every x ∈ Hull(supp(µ)). Moreover, since µ is absolutely continuous with respect to L d , it follows that ∇ϕ n (x) converges to T (x) − x for µ-almost every x ∈ R d . Using this convergence and the boundedness of T , it follows that there exists C ∈ R such that |∇ϕ n | ≤ C µ-almost everywhere. Moreover, by applying the Dominated Convergence Theorem to the sequence ∇ϕ n , it follows that (∇ϕ n + id) converges to T in L 2 (µ; R d ) as n → ∞.
Whilst the following results, Lemmas 4.3 and 4.4, are not used directly in the proof of Theorem 1.3, they are necessary results in the proof of Lemma 4.5 which, in turn, is utilised in our proof of the main result. Moreover, whilst it is expected that the following two lemmas are well-known to experts, we include their proof for completeness.
Lemma 4.3. Let (µ 1,n ) n∈N and (µ 2,n ) n∈N be sequences in P 2 (R d ) and let (γ n ) n∈N be a sequence such that γ n ∈ Γ o (µ 1,n , µ 2,n ) for all n ∈ N. If there exist µ 1 , µ 2 ∈ P 2 (R d ) such that lim n→∞ W 2 (µ 1 , µ 1,n ) = 0 and lim n→∞ W 2 (µ 2 , µ 2,n ) = 0 then there exists γ ∈ Γ o (µ 1 , µ 2 ) such that, up to a subsequence, (γ n ) n∈N narrowly converges to γ.
on the whole space P 2 (R d ). Lemma 4.5 states that a continuous function is λ-geodesically convex if and only if its restriction to P rc 2 (R d ) is λ-geodesically convex and is inspired by the convexity criterion derived in [2, Proposition 9.1.3].