Markov chains under nonlinear expectation

In this paper, we consider nonlinear continuous-time Markov chains with a finite state space. We define so-called $Q$-operators as an extension of $Q$-matrices to a nonlinear setup, where the nonlinearity is due to parameter uncertainty. The main result gives a full characterization of convex $Q$-operators in terms of a positive maximum principle, a dual representation by means of $Q$-matrices, continuous-time Markov chains under convex expectations, and fully nonlinear ODEs. This extends a well-known characterization of $Q$-matrices.


INTRODUCTION AND MAIN RESULT
In mathematical finance, model uncertainty or ambiguity is an almost omnipresent phenomenon, which, for example, appears due to incomplete information about certain aspects of an underlying asset or insufficient data to obtain reliable statistical estimates for the parameters of a stochastic process. The latter typically leads to so-called parameter uncertainty in the generator of a stochastic process. Prominent examples for this type of uncertainty include a Black-Scholes model with uncertain volatility, the so-called uncertain volatility model, cf. Avellaneda, Levy, and Parás (1995), Avellaneda and Parás (1996), and Vorbrink (2014), and a Brownian motion under drift uncertainty, leading to the g-framework, see, for example, Coquet, Hu, Mémin, and Peng (2002), or under volatility uncertainty, leading to the G-framework of Peng (2007) and Peng (2008). Lately, these approaches have been generalized to Lévy processes with uncertainty in the Lévy triplet, cf. Denk, Kupper, and Nendel (2020), Hu and Peng (2009), and Neufeld and Nutz (2017), and uncertainty in the generator of Feller processes, cf. Nendel and Röckner (2019). While these works give sufficient conditions in order to guarantee the existence of stochastic processes under model uncertainty and to establish a connection to nonlinear partial differential equations, there is no necessary condition that determines the maximal degree of ambiguity that can be captured by an uncertain process.
In the present paper, we address this issue in a simplified setup, where we consider a finite state space. We provide sufficient and necessary conditions in terms of the generators of time-homogeneous continuous-time Markov chains that guarantee the existence of a continuous-time Markov chain under a convex expectation. We further establish a one-to-one relation between the transition operators of convex Markov chains and a class of nonlinear ordinary differential equations. In particular, we extend a classical relation between Markov chains, rate matrices, and ordinary differential equations to the case of model uncertainty. The ordinary differential equation related to a convex Markov chain is a spatially discretized version of a Hamilton-Jacobi-Bellman equation, and the nonlinear transition operators are related, via a dual representation, to a control problem where, roughly speaking, "nature" tries to control the system into the worst possible scenario (see Remark 4.18). The explicit description of the transition operators gives rise to a numerical scheme, different from Runge-Kutta methods, for the computation of price bounds for European contingent claims under model uncertainty. We illustrate this method and other numerical methods in several examples, where we consider an underlying Markov chain whose generator is a finite difference discretization of the generator of a Brownian motion with uncertain drift, cf. Coquet et al. (2002), and uncertain volatility, cf. Peng (2007) and Peng (2008). The main tools we use in our analysis are convex duality, a semigroup-theoretic approach to control problems due to Nisio (1976/77), see also Denk et al. (2020) and Nendel and Röckner (2019), and a convex version of Kolmogorov's extension theorem due to Denk, Kupper, and Nendel (2018), which allows us to extend the expectation to functionals that depend on the whole path.
Restricting the time parameter in the present work to the set of natural numbers leads to a discrete-time Markov chain in the sense of Denk et al. (2018, Example 5.3).
Our setup is inspired by Peng (2005), where Markov chains under nonlinear expectations are considered in an axiomatic way. There, however, the existence of stochastic processes under nonlinear expectations is only established in terms of finite-dimensional nonlinear marginal distributions, so that fully path-dependent functionals could not be considered. Markov chains under model uncertainty have been considered, among others, by Avellaneda and Buff (1999), De Cooman, Hermans, and Quaeghebeur (2009), Hartfiel (1998), and Škulj (2009). Avellaneda and Buff (1999) study a finite difference discretization of the uncertain volatility model leading to a Markov chain setting. Hartfiel (1998) considers so-called Markov set-chains in discrete time, using matrix intervals in order to describe model uncertainty in the transition matrices. Later, Škulj (2009) studied imprecise Markov chains in discrete time using sets of transition matrices; a Kolmogorov-type extension theorem, cf. Denk et al. (2018), allows the construction of discrete-time Markov chains on the canonical path space. In continuous time, in particular, computational aspects of sublinear imprecise Markov chains have been studied, amongst others, by Krak, De Bock, and Siebes (2017) and Škulj (2015).
Another concept that is closely related to Markov chains under nonlinear expectations, as discussed in the present paper, are BSDEs on Markov chains by Cohen and Elliott (2008) and Cohen and Elliott (2010a), see also Cohen and Szpruch (2012), Cohen and Hu (2013), and Cohen and Elliott (2010b) for the discrete-time case. Here, a reference Markov chain X = (X_t)_{t≥0} with generator (Q(t))_{t≥0} is fixed, and one considers BSDEs driven by X. This can be viewed as a discretization of the classical BSDE setup, where the state space is ℝ, the driving process is a Brownian motion, and the generator is ½Δ. Cohen and Szpruch (2012) show that Markovian solutions to BSDEs on Markov chains are related via their driver to a system u′(t) = f(t, u(t)) + Q(t)u(t) for all t ≥ 0, u(0) = u₀, of nonlinear ordinary differential equations with a nonlinear driver f that is assumed to be globally Lipschitz in the variable u. In the present paper, the role of the right-hand side is played by a convex operator 𝒬. The biggest difference between our approach and the theory of BSDEs on Markov chains lies in the fact that we do not consider a fixed reference Markov chain that drives the model. On the other hand, our approach is restricted to what would, in that theory, be Markovian solutions to BSDEs on Markov chains. From a technical standpoint, further differences are that the theory of BSDEs allows for more generality in terms of the nonlinearity of the driver, while we do not require global Lipschitz continuity of the generator, allowing for a possibly unbounded convex conjugate. Additionally, we only focus on the time-homogeneous case. However, regarding the existence of Markov chains under convex expectations and their connection to nonlinear ordinary differential equations (ODEs), this restriction could easily be overcome with a slight modification of the construction of the transition operators.
Dentcheva and Ruszczyński (2018) consider Markov risk measures for a countable state space, see also Fan and Ruszczyński (2018a), Fan and Ruszczyński (2018b), and Ruszczyński (2010) for the discrete-time case. Here, the focus lies on time-consistent risk measurement related to a fixed reference continuous-time Markov chain X = (X_t)_{t≥0}. Using so-called semiderivatives in the direction of the generator Q, the authors derive, in the case of a coherent risk measure, a sublinear ordinary differential equation related to the risk measure, where the dual representation of the nonlinear generator depends on the generator Q of the baseline model X. Clearly, in the theory of Markov risk measures, the focus lies more on law-invariant risk measures, such as the average value at risk, and it is therefore not directly comparable with our approach, where we explicitly avoid fixing a baseline model but rather try to capture very general forms of uncertainty in the generator. However, on a technical level, our approach also allows us to consider risk evaluations related to convex generators that do not depend on a fixed reference generator.
In view of the aforementioned existing literature on imprecise versions of Markov chains, the contribution of this paper can be summarized as follows (see Remark 2.6 for further details):
- We propose a framework describing Markov chains under model uncertainty in terms of the rate matrix. Our approach complements the existing literature on BSDEs on Markov chains and Markov risk measures, covering a different range of examples and applications in a consistent way. The key difference between our framework and the aforementioned existing approaches lies in the fact that we do not consider a fixed reference Markov chain describing the dynamics of an underlying asset. Moreover, our approach relies on analytic rather than stochastic methods, using distributional rather than pathwise properties, which leads to restrictions in certain directions but advantages in others.
- We show that, as in the linear case, Markov chains under convex expectations with certain regularity at time 0 are linked via a one-to-one relation to certain convex functions (their generators) and to solutions of convex differential equations, which can be solved, for example, by the explicit Euler method or any other Runge-Kutta method. In particular, we prove the global existence of solutions to a class of convex differential equations with unbounded convex conjugate, that is, without a global Lipschitz condition on the generator.
- We show that the transition semigroup of a convex Markov chain can be explicitly constructed using any (!) dual representation of the generator. In particular, for numerical computations, a "minimal" dual representation in terms of certain "corner points" can be used to solve the nonlinear Kolmogorov equation. Based on the explicit construction of the semigroup, we propose a novel algorithm for the numerical computation of solutions to a class of nonlinear ODEs. Moreover, we show that every convex transition semigroup is the least upper bound (in the sense of semigroups) of a family of linear transition semigroups, and vice versa.
- The convex expectations we consider are defined on the whole path space without fixing any reference measure. We show that the nonlinear expectation, although possibly undominated, always admits a dual representation in terms of countably additive probability measures. Moreover, we derive an explicit dual representation in terms of an optimal control problem, where nature tries to control the system into the worst possible scenario, giving a control-theoretic interpretation to Markov chains under convex expectations.

Structure of the paper
In Section 2, we fix the notation, introduce our setup and basic definitions, and state the main result (Theorem 2.5). In Section 3, we prove the first part of Theorem 2.5. The main tool we use in this part is convex duality in ℝᵈ. Moreover, we discuss how, in the sublinear case, computational efficiency can be improved by reducing compact and suitably convex sets of generator matrices to their "corner points"; the effectiveness of this reduction is demonstrated in Section 5. In Section 4, we prove the remaining implications of Theorem 2.5. Here, we use a combination of so-called Nisio semigroups, as introduced in Nisio (1976/77), the theory of ordinary differential equations, and a Kolmogorov-type extension theorem for convex expectations derived in Denk et al. (2018). We conclude that section by showing that the semigroup envelope admits a dual representation as the cost functional of an optimal control problem. In Section 5, we use and compare two different numerical methods, based on the results from Sections 3 and 4, in order to compute price bounds for European contingent claims, where the underlying is a discrete version of a Brownian motion with drift uncertainty (g-framework) and volatility uncertainty (G-framework).

NOTATION, BASIC DEFINITIONS, AND MAIN RESULT
Given a measurable space (Ω, ℱ), we denote the space of all bounded measurable functions Ω → ℝ by ℒ∞(Ω, ℱ). A nonlinear expectation is then a functional ℰ: ℒ∞(Ω, ℱ) → ℝ, which satisfies ℰ(X) ≤ ℰ(Y) whenever X ≤ Y and ℰ(α) = α for all constants α ∈ ℝ. If ℰ is additionally convex, that is, ℰ(λX + (1 − λ)Y) ≤ λℰ(X) + (1 − λ)ℰ(Y) for all X, Y ∈ ℒ∞(Ω, ℱ) and λ ∈ [0, 1], we say that ℰ is a convex expectation. It is well known (see, e.g., Denk et al., 2018, or Föllmer & Schied, 2011) that every convex expectation ℰ admits a dual representation in terms of finitely additive probability measures. If ℰ, however, even admits a dual representation in terms of (countably additive) probability measures, we say that (Ω, ℱ, ℰ) is a convex expectation space. More precisely, we say that (Ω, ℱ, ℰ) is a convex expectation space if there exists a set 𝒫 of probability measures on (Ω, ℱ) and a family (α_ℙ)_{ℙ∈𝒫} ⊂ [0, ∞) with inf_{ℙ∈𝒫} α_ℙ = 0 such that ℰ(X) = sup_{ℙ∈𝒫} (𝔼_ℙ(X) − α_ℙ) for all X ∈ ℒ∞(Ω, ℱ). Here, 𝔼_ℙ denotes the expectation w.r.t. a probability measure ℙ on (Ω, ℱ). If α_ℙ = 0 for all ℙ ∈ 𝒫, we say that (Ω, ℱ, ℰ) is a sublinear expectation space. Here, the set 𝒫 represents the set of all models that are relevant under the expectation ℰ. In the case of a sublinear expectation space, the functional ℰ is the best case among all plausible models 𝒫. In the case of a convex expectation space, the functional ℰ is a weighted best case among all plausible models 𝒫 with an additional penalization term α_ℙ for every ℙ ∈ 𝒫. Intuitively, α_ℙ can be seen as a measure for how much importance we give to the prior ℙ ∈ 𝒫 under the expectation ℰ. For example, a low penalization, that is, α_ℙ close or equal to 0, gives more importance to the model ℙ ∈ 𝒫 than a high penalization. Throughout, we consider a finite nonempty state space S with cardinality d := |S| ∈ ℕ. We endow S with the discrete topology and the σ-algebra 2^S, and w.l.o.g. assume that S = {1, …, d}.
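As a toy illustration of the dual representation above, the following sketch (with illustrative numbers; the states, models, and penalizations are not from the paper) evaluates a sublinear and a convex expectation on a three-point state space:

```python
import numpy as np

# A sublinear/convex expectation on S = {1, 2, 3}, represented dually by a
# set of probability vectors P and penalizations alpha with inf alpha = 0.
X = np.array([1.0, 4.0, 2.0])                 # a bounded measurable function on S
P = [np.array([0.2, 0.5, 0.3]),               # plausible models
     np.array([0.6, 0.2, 0.2])]
alpha = [0.0, 0.1]                            # penalizations (all zero -> sublinear)

E_sub = max(p @ X for p in P)                 # sublinear expectation: best case
E_conv = max(p @ X - a for p, a in zip(P, alpha))   # penalized best case

# Constants are preserved, since inf alpha = 0 and each p is a probability vector.
c = np.full(3, 7.0)
E_c = max(p @ c - a for p, a in zip(P, alpha))
```

Here the penalization 0.1 downweights the second model relative to the first, in line with the interpretation of α_ℙ described above.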
The space of all bounded measurable functions S → ℝ can therefore be identified with ℝᵈ. That is, we denote bounded measurable functions as vectors of the form u = (u₁, …, u_d) ∈ ℝᵈ, where uᵢ represents the value of u in the state i ∈ {1, …, d}. On ℝᵈ, we consider the maximum norm ‖u‖∞ := max_{i∈{1,…,d}} |uᵢ| for a vector u ∈ ℝᵈ. Moreover, for α ∈ ℝ, the vector α ∈ ℝᵈ denotes the constant vector u ∈ ℝᵈ with uᵢ = α for all i ∈ {1, …, d}. For an arbitrary matrix q = (q_{ij})_{1≤i,j≤d} ∈ ℝ^{d×d}, we denote by ‖q‖ the operator norm of q: ℝᵈ → ℝᵈ w.r.t. the norm ‖·‖∞, that is, ‖q‖ := sup_{‖u‖∞ ≤ 1} ‖qu‖∞.
Inequalities of vectors are always understood componentwise, that is, for u, v ∈ ℝᵈ, u ≤ v if and only if uᵢ ≤ vᵢ for all i ∈ {1, …, d}. In the same way, all concepts in ℝᵈ that include inequalities are to be understood componentwise. For example, a vector field F: ℝᵈ → ℝᵈ is called convex if Fᵢ(λu + (1 − λ)v) ≤ λFᵢ(u) + (1 − λ)Fᵢ(v) for all i ∈ {1, …, d}, u, v ∈ ℝᵈ, and λ ∈ [0, 1]. A vector field is called sublinear if it is convex and positive homogeneous (of degree 1). Moreover, for a set M ⊂ ℝᵈ of vectors, we write v = sup M if vᵢ = sup_{w∈M} wᵢ for all i ∈ {1, …, d}, and v = max M if v = sup M and, for all i ∈ {1, …, d}, there exists some w ∈ M with vᵢ = wᵢ.
In the following, we briefly recall the basic definitions and concepts from the theory of (time-homogeneous) Markov chains. A (time-homogeneous) Markov chain is a quadruple (Ω, ℱ, (X_t)_{t≥0}, (ℙᵢ)_{i∈{1,…,d}}); that is, ℙᵢ denotes the probability distribution under which the Markov chain starts in the state i. Moreover, we use the notation (u(t))ᵢ := 𝔼ᵢ(u(X_t)) for u ∈ ℝᵈ, t ≥ 0, and i ∈ {1, …, d}. In particular, 𝔼(u(X_{s+t}) | X_s = i) = (u(t))ᵢ for all s, t ≥ 0 and i ∈ {1, …, d}.
A matrix q = (q_{ij})_{1≤i,j≤d} ∈ ℝ^{d×d} is called a Q-matrix or rate matrix if it satisfies the following conditions: q_{ij} ≥ 0 for all i ≠ j, and ∑_{j=1}^d q_{ij} = 0 for all i ∈ {1, …, d} (in particular, q_{ii} ≤ 0 for all i). It is well known that every continuous-time Markov chain with certain regularity properties at time t = 0 can be related to a Q-matrix and vice versa. More precisely, a matrix q ∈ ℝ^{d×d} is a Q-matrix if and only if the matrix exponential e^{tq} is a transition (stochastic) matrix for every t ≥ 0, in which case (e^{tq})_{t≥0} is the family of transition matrices of a continuous-time Markov chain. In this case, for each vector u₀ ∈ ℝᵈ, the function u: [0, ∞) → ℝᵈ, t ↦ (u₀(t)) = e^{tq}u₀ is the unique classical solution u ∈ C¹([0, ∞); ℝᵈ) to the initial value problem u′(t) = qu(t) for all t ≥ 0, u(0) = u₀, where e^{tq} is the matrix exponential of tq. We refer to Norris (1998) for a detailed illustration of this relation.
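The classical relation can be checked numerically. The following sketch (illustrative 3×3 Q-matrix; `scipy` is assumed to be available) verifies that the matrix exponential of a Q-matrix is a stochastic matrix and solves the Kolmogorov equation:

```python
import numpy as np
from scipy.linalg import expm

# An illustrative 3x3 Q-matrix: nonnegative off-diagonal entries, rows sum to zero.
q = np.array([[-2.0, 1.0, 1.0],
              [0.5, -0.5, 0.0],
              [1.0, 2.0, -3.0]])

t = 0.7
P_t = expm(t * q)                             # transition matrix over the horizon t

assert np.all(P_t >= -1e-12)                  # nonnegative entries
assert np.allclose(P_t.sum(axis=1), 1.0)      # rows sum to one

# u(t) = expm(t*q) @ u0 solves u'(t) = q u(t), u(0) = u0 (finite-difference check).
u0 = np.array([1.0, 0.0, 2.0])
eps = 1e-6
du = (expm((t + eps) * q) @ u0 - P_t @ u0) / eps
assert np.allclose(du, q @ (P_t @ u0), atol=1e-3)
```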
Notice that the first properties in the previous definition are a one-to-one translation of (M1)-(M3) to a convex setup. The Markov property given in (iv) of the previous definition is the nonlinear analog of the classical Markov property (M4) without using conditional expectations. Due to the nonlinearity of the expectation, the definition and, in particular, the existence of a conditional (nonlinear) expectation are quite involved, which is why we avoid introducing this concept. In order to get the idea behind the formulation in (iv), choose Y₀ = u(X_{s+t})𝟙_B(X_s) for a measurable function u: {1, …, d} → ℝ and arbitrary B ⊂ {1, …, d}. Then, if ℰ is linear, Equation (1) reads as 𝔼(u(X_{s+t})𝟙_B(X_s)) = 𝔼((u(t))(X_s)𝟙_B(X_s)), which is equivalent to (M4). On the other hand, for every linear Markov chain, Property (M4) implies Property (iv). Hence, in the linear case, Definition 2.2 is consistent with the classical definition of a Markov chain.
Definition 2.3. In line with Denk et al. (2018, Definition 5.1), we say that a family S = (S(t))_{t≥0} of (possibly nonlinear) maps S(t): ℝᵈ → ℝᵈ is a semigroup if S(0)u₀ = u₀ and S(s + t)u₀ = S(s)S(t)u₀ for all s, t ≥ 0 and u₀ ∈ ℝᵈ. Here and throughout, we make use of the notation S(s)S(t) := S(s) • S(t). If, additionally, S(h)u₀ → u₀ uniformly on compact sets as h ↘ 0, we say that the semigroup S is uniformly continuous. We call S Markovian if S(t) is a kernel for all t ≥ 0. We say that S is linear, sublinear, or convex if S(t) is linear, sublinear, or convex for all t ≥ 0, respectively. Definition 2.4. Let ℒ ⊂ ℝ^{d×d} be a set of Q-matrices and b = (b_λ)_{λ∈ℒ} a family of vectors with sup_{λ∈ℒ} b_λ = b_{λ₀} = 0 for some λ₀ ∈ ℒ, that is, b_λ ≤ 0 for all λ ∈ ℒ and there exists some λ₀ ∈ ℒ with b_{λ₀} = 0. We denote by S the (upper) semigroup envelope of (ℒ, b). That is, the semigroup envelope S is the smallest semigroup that dominates all semigroups (S_λ)_{λ∈ℒ}.
The following main theorem gives a full characterization of convex Q-operators.
Theorem 2.5. Let 𝒬: ℝᵈ → ℝᵈ be a mapping. Then, the following statements are equivalent: (i) 𝒬 is a convex Q-operator. (ii) 𝒬 is convex and satisfies the positive maximum principle. (iii) There exists a set ℒ ⊂ ℝ^{d×d} of Q-matrices and a family b = (b_λ)_{λ∈ℒ} ⊂ ℝᵈ of vectors with b_λ ≤ 0 for all λ ∈ ℒ and b_{λ₀} = 0 for some λ₀ ∈ ℒ, such that 𝒬u₀ = sup_{λ∈ℒ} (λu₀ + b_λ) for all u₀ ∈ ℝᵈ, where the supremum is to be understood componentwise. (iv) There exists a uniformly continuous convex Markovian semigroup S with lim_{h↘0} (S(h)u₀ − u₀)/h = 𝒬u₀ for all u₀ ∈ ℝᵈ. In this case, for each initial value u₀ ∈ ℝᵈ, the function u(t) := S(t)u₀, t ≥ 0, is the unique classical solution to the nonlinear ODE (3), that is, u′(t) = 𝒬u(t) for all t ≥ 0 with u(0) = u₀. Moreover, the Markovian semigroup S from (iv) is the (upper) semigroup envelope of (ℒ, b), and u(t) = S(t)u₀ for all t ≥ 0.
Remark 2.6. Consider the situation of Theorem 2.5.
(a) The dual representation in (iii) gives a model uncertainty interpretation to Q-operators. The set ℒ can be seen as the set of all plausible rate matrices when considering the Q-operator 𝒬. For every λ ∈ ℒ, the vector b_λ ≤ 0 can be interpreted as a penalization, which measures how much importance we give to each rate matrix λ. The requirement that there exists some λ₀ ∈ ℒ with b_{λ₀} = 0 can be interpreted in the following way: there exists at least one rate matrix λ₀ within the set of all plausible rate matrices ℒ to which we assign the maximal importance, that is, the minimal penalization. (b) The semigroup envelope S of (ℒ, b) can be constructed more explicitly; in particular, an explicit (in terms of (ℒ, b)) dual representation can be derived. For details, we refer to Section 4 (Definition 4.2 and Remark 4.18). Moreover, we would like to highlight that the semigroup envelope S can be constructed w.r.t. any dual representation (ℒ, b) as in (iii) and results in the unique classical solution to (3) independently of the choice of the dual representation (ℒ, b) of 𝒬. This gives, in some cases, the opportunity to efficiently compute the semigroup envelope numerically via its primal/dual representation (see Remark 3.3 and Example 5.2). (c) The same equivalence as in Theorem 2.5 holds if convexity is replaced by sublinearity throughout and b_λ = 0 for all λ ∈ ℒ in the dual representation (iii). In this case, the set ℒ in (iii) can be chosen to be compact, as we will see in the proof of Theorem 2.5. (d) Theorem 2.5 extends and includes the well-known relation between (linear) Markov chains, Q-matrices, and ordinary differential equations. (e) A remarkable consequence of Theorem 2.5 is that every convex Markovian semigroup, which is differentiable at time t = 0, is the semigroup envelope with respect to the Fenchel-Legendre transform (or any other dual representation as in (iii)) of its generator, which is a convex Q-operator.
(f) Although 𝒬 may have an unbounded convex conjugate, the convex initial value problem (4) has a unique global solution. (g) Solutions to (4) remain bounded. Therefore, a Picard iteration or Runge-Kutta methods, such as the explicit Euler method, can be used for numerical computations, and the convergence rate (depending on the size of the initial value u₀) can be derived from the a priori estimate in Banach's fixed point theorem. (h) As in the linear case, by solving the differential equation (4), one can (numerically) compute expressions of the form (u(t))ᵢ = ℰᵢ(u₀(X_t)).
We illustrate this computation procedure in Example 5.1.
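The computation mentioned in (g) and (h) can be sketched as follows; the two rate matrices and the zero penalizations below are illustrative, not taken from the paper:

```python
import numpy as np

# Explicit Euler for the convex ODE u'(t) = Q(u(t)), where Q is given by a dual
# representation Q(u) = componentwise max of {q u + b_q} over two Q-matrices.
q1 = np.array([[-1.0, 1.0], [2.0, -2.0]])
q2 = np.array([[-3.0, 3.0], [1.0, -1.0]])
b1 = np.zeros(2)
b2 = np.zeros(2)                              # zero penalizations -> sublinear case

def Q(u):
    return np.maximum(q1 @ u + b1, q2 @ u + b2)   # componentwise supremum

def euler(u0, T, n):
    u, dt = np.array(u0, dtype=float), T / n
    for _ in range(n):
        u = u + dt * Q(u)                     # one explicit Euler step
    return u

u_T = euler([1.0, 0.0], T=1.0, n=10_000)      # approximates S(1) u0
```

Since Q vanishes on constant vectors, constants are fixed points of the scheme, matching the kernel property of the transition operators.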

PROOF OF THE FIRST PART OF THEOREM 2.5
We say that a set ℳ ⊂ ℝ^{d×d} of matrices is row-convex if, for all λ₁, λ₂ ∈ ℳ and any diagonal matrix Λ ∈ ℝ^{d×d} with Λᵢᵢ ∈ [0, 1] for all i ∈ {1, …, d}, we have Λλ₁ + (I_d − Λ)λ₂ ∈ ℳ, where I_d ∈ ℝ^{d×d} is the d-dimensional identity matrix. Notice that, for all i ∈ {1, …, d}, the ith row of the matrix Λλ₁ + (I_d − Λ)λ₂ is the convex combination of the ith rows of λ₁ and λ₂ with weight Λᵢᵢ. Notice that a set ℳ ⊂ ℝ^{d×d} is row-convex if and only if it is convex and, for arbitrary λ₁, λ₂ ∈ ℳ, the matrix that results from replacing the ith row of λ₁ by the ith row of λ₂ is again an element of ℳ. For example, the set of all Q-matrices is row-convex.
Remark 3.1. Let 𝒬 be a convex Q-operator. For every matrix λ ∈ ℝ^{d×d}, let 𝒬*(λ) := sup_{u∈ℝᵈ} (λu − 𝒬u), where the supremum is taken componentwise, and let b_λ := −𝒬*(λ) for all λ ∈ ℒ*, where ℒ* denotes the set of all λ ∈ ℝ^{d×d} for which 𝒬*(λ) is finite in every component. Then, the following facts are well-known results from convex duality theory in ℝᵈ.
(b) Let c ≥ 0 and ℒ*_c := {λ ∈ ℝ^{d×d} | 𝒬*(λ) ≤ c}. Then, ℒ*_c ⊂ ℝ^{d×d} is compact and row-convex. Therefore, 𝒬_c u₀ := max_{λ∈ℒ*_c} (λu₀ − 𝒬*(λ)) (5) defines a convex operator, which is Lipschitz continuous. Notice that the maximum in (5) is to be understood componentwise. However, for fixed u₀ ∈ ℝᵈ, the maximum can be attained, simultaneously in every component, by a single element of ℒ*_c; that is, for all u₀ ∈ ℝᵈ, there exists some λ₀ ∈ ℒ*_c with 𝒬_c u₀ = λ₀u₀ − 𝒬*(λ₀). This is due to the fact that ℒ*_c is row-convex and that, for λ ∈ ℒ*_c, the ith component of the vector λu₀ + b_λ only depends on the ith row of λ. (c) Let r ≥ 0. Then, there exists some c ≥ 0 such that 𝒬u₀ = 𝒬_c u₀ for all u₀ ∈ ℝᵈ with ‖u₀‖∞ ≤ r. In particular, 𝒬 is locally Lipschitz continuous and 𝒬u₀ = max_{λ∈ℒ*} (λu₀ + b_λ), where, for fixed u₀ ∈ ℝᵈ, the maximum can be attained, simultaneously in every component, by a single element of ℒ*. In particular, there exists some λ₀ ∈ ℒ* with b_{λ₀} = sup_{λ∈ℒ*} b_λ = 𝒬(0) = 0.
( ) ⇒ ( ): This follows directly from the positive maximum principle, considering the vectors αeᵢ and −αeᵢ for all α > 0 and i ∈ {1, …, d}. ( ) ⇒ ( ): Let 𝒬 be a convex Q-operator. Moreover, let ℒ* and b = (b_λ)_{λ∈ℒ*} be as in Remark 3.1. Then, by Remark 3.1(c), it only remains to show that every λ ∈ ℒ* is a Q-matrix. To this end, fix an arbitrary λ ∈ ℒ*. Then, for all α ∈ ℝ, α(λ𝟙) = λ(α𝟙) − 𝒬(α𝟙) ≤ 𝒬*(λ), so that α(λ𝟙) is bounded above in α ∈ ℝ. Since α ↦ α(λ𝟙) is linear, it follows that λ𝟙 = 0. Now, let i, j ∈ {1, …, d} with i ≠ j. Then, by the definition of a Q-operator, we obtain that λ_{ij} ≥ 0. Therefore, λ is a Q-matrix. It remains to show the remaining implications, which is done in the entire next section. □ Before we start with the proof of the remaining implications, we would like to point out how, in the sublinear case, the set ℒ* of Q-matrices from Remark 3.1 can be reduced to certain "corner points." This can be done using the concept of row-convexity, introduced at the beginning of this section, together with Minkowski's theorem on extreme points of convex sets in ℝᵈ. Let ℳ ⊂ ℝ^{d×d} be a nonempty set of matrices. Then, we define the row-convex hull of ℳ as the smallest row-convex set containing ℳ. For a convex set C ⊂ ℝᵈ, we denote the set of all extreme points of C by ext(C). Recall that an extreme point of a convex set C ⊂ ℝᵈ is an element x ∈ C such that x = μy + (1 − μ)z, for μ ∈ (0, 1) and y, z ∈ C, implies that x = y = z. For a matrix λ ∈ ℝ^{d×d} and i ∈ {1, …, d}, we denote by λᵢ := (λ_{i1}, …, λ_{id}) ∈ ℝᵈ the ith row of λ. Let ℳ ⊂ ℝ^{d×d} be a nonempty compact row-convex set of matrices. Then, we say that a set 𝒩 ⊂ ℳ is ℳ-row-extreme if, for all i ∈ {1, …, d}, the set of all ith rows of 𝒩 is the set of all extreme points of the set of ith rows of ℳ. We say that a set 𝒩 ⊂ ℳ is minimal ℳ-row-extreme if 𝒩 is ℳ-row-extreme and 𝒩′ ⊂ 𝒩 implies 𝒩′ = 𝒩 for any ℳ-row-extreme set 𝒩′ ⊂ ℳ.
For every nonempty compact row-convex set ℳ ⊂ ℝ^{d×d}, there then exists a minimal ℳ-row-extreme set 𝒩 ⊂ ℳ with max_{λ∈ℳ} λu₀ = max_{λ∈𝒩} λu₀ for all u₀ ∈ ℝᵈ, where the maxima are to be understood componentwise.
Proof. By Minkowski's theorem, the set of all ℳ-row-extreme sets is nonempty, and one readily verifies that, endowed with the partial order ⪯ given by 𝒩₁ ⪯ 𝒩₂ if and only if 𝒩₁ ⊃ 𝒩₂, every chain in this set has an upper bound. Hence, by Zorn's lemma, there exists a maximal element 𝒩 within the set of all ℳ-row-extreme sets, which, by definition, is a minimal ℳ-row-extreme set. Now, let 𝒩 be an arbitrary ℳ-row-extreme set and u₀ ∈ ℝᵈ; the claimed identity of the maxima then follows since ℳ is a nonempty, compact, and row-convex set. By the previous proposition, there exists a minimal ℒ*-row-extreme set 𝒩 ⊂ ℒ*, and, for all u₀ ∈ ℝᵈ, 𝒬u₀ = max_{λ∈𝒩} λu₀, where the maximum is to be understood componentwise. Since sup_{λ∈ℒ*} b_λ = 0, it follows that (𝒩, 0) is a dual representation as in Theorem 2.5(iii). Notice that, in many cases, the cardinality of 𝒩 is much smaller than that of ℒ*. Therefore, concerning computational aspects, the dual representation (𝒩, 0) is often far more tractable than the dual representation (ℒ*, 0), and, by Theorem 2.5, both representations result in the same semigroup envelope and, thus, the same solution to the ODE (3).
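The reduction to corner points can be illustrated numerically: for an interval {q₀ + c·q : c ∈ [c_l, c_h]} of rate matrices (the concrete 2×2 matrices below are illustrative), the componentwise supremum over the whole interval is already attained at the two corner matrices, since each component is affine in c:

```python
import numpy as np

# Corner-point reduction: sup over c in [cl, ch] of (q0 + c*q) u equals, in each
# component, the max over the two endpoint ("corner") matrices.
q0 = np.array([[-2.0, 2.0], [1.0, -1.0]])
q = np.array([[-1.0, 1.0], [0.5, -0.5]])
cl, ch = -1.0, 1.0
u = np.array([3.0, -2.0])

corners = np.maximum((q0 + cl * q) @ u, (q0 + ch * q) @ u)

rng = np.random.default_rng(0)
sampled = np.max([(q0 + c * q) @ u for c in rng.uniform(cl, ch, 1000)], axis=0)
```

For a numerical scheme, this means that only two matrices, rather than a whole continuum, have to be evaluated in every step.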

PROOF OF THE REMAINING IMPLICATIONS OF THEOREM 2.5
Throughout, let ℒ ⊂ ℝ^{d×d} be a set of Q-matrices and b = (b_λ)_{λ∈ℒ} ⊂ ℝᵈ with b_λ ≤ 0 for all λ ∈ ℒ and b_{λ₀} = 0 for some λ₀ ∈ ℒ, such that the map 𝒬u₀ := sup_{λ∈ℒ} (λu₀ + b_λ) is well-defined. For every λ ∈ ℒ, we consider the linear ODE u′(t) = λu(t) + b_λ for all t ≥ 0, (7) with u(0) = u₀ ∈ ℝᵈ. Then, by variation of constants, the solution to (7) is given by S_λ(t)u₀ := e^{tλ}u₀ + ∫₀ᵗ e^{sλ}b_λ ds for t ≥ 0, where e^{tλ} ∈ ℝ^{d×d} is the matrix exponential of tλ for all t ≥ 0. Then, the family S_λ = (S_λ(t))_{t≥0} defines a uniformly continuous semigroup of affine linear operators (see Definition 2.3).
For the family (S_λ)_{λ∈ℒ} or, more precisely, for (ℒ, b), we will now construct the Nisio semigroup and show that it gives rise to the unique classical solution to the nonlinear ODE (3). To this end, we consider the set of finite partitions P := {π ⊂ [0, ∞) | 0 ∈ π, π finite}. The set of partitions with end point t ≥ 0 will be denoted by P_t, that is, P_t := {π ∈ P | max π = t}. For all h ≥ 0 and u₀ ∈ ℝᵈ, we define E_h u₀ := sup_{λ∈ℒ} S_λ(h)u₀, where the supremum is taken componentwise. Note that E_h is well-defined since S_λ(h)u₀ ≤ ‖u₀‖∞ for all λ ∈ ℒ, h ≥ 0, and u₀ ∈ ℝᵈ, where we used the fact that S_λ(h) is a kernel. Moreover, E_h is a convex kernel, for all h ≥ 0, as it is monotone and E_h α = α for all α ∈ ℝ, where we used the fact that there is some λ₀ ∈ ℒ with b_{λ₀} = 0. For a partition π = {t₀, t₁, …, t_m} ∈ P with m ∈ ℕ and 0 = t₀ < t₁ < ⋯ < t_m, we set E_π := E_{t₁−t₀} E_{t₂−t₁} ⋯ E_{t_m−t_{m−1}}. Moreover, we set E_{{0}} := E₀. Then, E_π is a convex kernel for all π ∈ P since it is a concatenation of convex kernels.
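The construction above can be sketched numerically. The following hedged example (two illustrative rate matrices, penalizations b_λ = 0, uniform partitions, `scipy` assumed available) iterates the one-step operator along a partition; refining the partition increases the value, in line with the definition of the Nisio semigroup as a supremum over partitions:

```python
import numpy as np
from scipy.linalg import expm

# One-step operator E_h u = sup over rate matrices of expm(h*q) @ u, and its
# concatenation along a uniform partition of [0, t] (zero penalizations).
Qs = [np.array([[-1.0, 1.0], [2.0, -2.0]]),
      np.array([[-3.0, 3.0], [1.0, -1.0]])]

def E_h(h, u):
    return np.max([expm(h * q) @ u for q in Qs], axis=0)   # componentwise sup

def nisio(t, n, u0):
    u, h = np.array(u0, dtype=float), t / n
    for _ in range(n):
        u = E_h(h, u)                          # concatenation of one-step kernels
    return u

u0 = np.array([1.0, -1.0])
coarse = nisio(1.0, 4, u0)                     # partition with 4 steps
fine = nisio(1.0, 64, u0)                      # refinement: value can only increase
```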
Notice that S(t): ℝᵈ → ℝᵈ is well-defined and a convex kernel for all t ≥ 0, since E_π is a convex kernel for all π ∈ P. In many of the subsequent proofs, we will first concentrate on the case where the family b is bounded, and then use an approximation of the Nisio semigroup by means of other Nisio semigroups. This approximation procedure is specified in the following remark.
It remains to show that the family S is the semigroup envelope of (ℒ, b). We have already shown that S is a semigroup and, by definition, S(t)u₀ ≥ S_λ(t)u₀ for all t ≥ 0, u₀ ∈ ℝᵈ, and λ ∈ ℒ.
Taking the supremum over all π ∈ P_t, it follows that S(t)u₀ ≤ T(t)u₀ for all t ≥ 0 and u₀ ∈ ℝᵈ, for any semigroup T dominating all S_λ. To finish the proof of this implication, it remains to show that the Nisio semigroup S is uniformly continuous and that it gives rise to the unique classical solution to the nonlinear ODE (3).
(a) Since ℒ is bounded, it follows that 𝒬 is Lipschitz continuous. Therefore, the Picard-Lindelöf theorem implies that, for every u₀ ∈ ℝᵈ, the initial value problem u′(t) = 𝒬u(t) for all t ≥ 0, u(0) = u₀, has a unique solution u ∈ C¹([0, ∞); ℝᵈ). We will show that this solution is given by u(t) = S(t)u₀ for all t ≥ 0; that is, the unique solution of the ODE (14) is given by the Nisio semigroup. (b) Since ℒ is bounded, the mapping t ↦ Σ(t)u₀ is continuous for every u₀ ∈ ℝᵈ. The following key estimate and its proof are a straightforward adaptation of the proof of (Nisio, 1976/77, Proposition 5) to our setup. Recall that, by Remark 4.3, the boundedness of the family b implies the boundedness of the set ℒ. Proof. Let u₀ ∈ ℝᵈ and h > 0. Then, the estimate for a single step follows from (8). Notice that, by Lemma 4.5, the mapping [0, ∞) → ℝᵈ, t ↦ Σ(t)u₀ is continuous and therefore locally integrable for all u₀ ∈ ℝᵈ. Next, we show that the estimate holds for E_π for all π ∈ P by induction on m = |π|, where |π| denotes the cardinality of π. If m = 1, that is, if π = {0}, the statement is trivial. Hence, assume that the statement holds for all π′ ∈ P with |π′| = m for some m ∈ ℕ. Let π = {t₀, t₁, …, t_m} ∈ P with 0 = t₀ < t₁ < ⋯ < t_m and π′ := π ⧵ {t_m}. Then, we estimate E_π u₀, where, in the second inequality, we use (15) with h = t_m − t_{m−1} and t = t_{m−1}, and, in the last inequality, the sublinearity of Σ(t). Using the induction hypothesis, we thus obtain the desired estimate.

By (16), it follows that the estimate holds for all π ∈ P_t. Taking the supremum over all π ∈ P_t, we obtain the assertion. □ The following proposition states that the Nisio semigroup (S(t))_{t≥0} is differentiable at zero if the family b is bounded. Proposition 4.12. Assume that b is bounded. Then, for all u₀ ∈ ℝᵈ, lim_{h↘0} (S(h)u₀ − u₀)/h = 𝒬u₀. Proof. Since b is bounded, it follows that ℒ is bounded (see Remark 4.3). Let ε > 0 and u₀ ∈ ℝᵈ. Using Lemma 4.5, the boundedness of ℒ, and (9), there exists some h₀ > 0 such that the corresponding estimates hold, for all 0 < h ≤ h₀, for all λ ∈ ℒ and for all π ∈ P_h. Dividing by h and taking the supremum over all λ ∈ ℒ yields the lower bound (17). Moreover, by Lemma 4.11, dividing again by h > 0 yields a matching upper bound, which, together with (17), implies the assertion, where we used the fact that 𝒬: ℝᵈ → ℝᵈ is convex and thus continuous. Therefore, u is continuously differentiable with u′(t) = 𝒬u(t), for all t ≥ 0, and u(0) = u₀. The Picard-Lindelöf theorem together with the local Lipschitz continuity of the convex map 𝒬: ℝᵈ → ℝᵈ implies the uniqueness of u. □ Corollary 4.14. Let b be bounded. Then, there exists some constant L > 0 such that the corresponding Lipschitz estimate for the semigroup holds. Proof. Since b is bounded, we have that ℒ is bounded, and therefore, 𝒬 is Lipschitz continuous with Lipschitz constant L := sup_{λ∈ℒ} ‖λ‖; the claim then follows for all u₀ ∈ ℝᵈ. In order to end the proof of Theorem 2.5, we have to extend Corollary 4.13 to the unbounded case. We start with the following remark, which contains the key observation needed to finish the proof of Theorem 2.5.
We are now able to finish the proof of Theorem 2.5. The following proposition summarizes the results from this section and proves the remaining implications. Moreover, the Nisio semigroup S of (ℒ, b) coincides with the semigroup envelope of (ℒ, b).
Proof. In view of Proposition 4.9, it remains to show that the Nisio semigroup is uniformly continuous and that u is the unique solution to the ODE u′ = 𝒬u with initial value u(0) = u₀. By Remark 4.15, the initial value problem has a unique classical solution u* ∈ C¹([0, ∞); ℝᵈ), which is given by u*(t) := S*(t)u₀ for all t ≥ 0.
Remark 4.18. We will now derive a dual representation of the semigroup envelope by viewing the semigroup envelope as the cost functional of an optimal control problem, where, roughly speaking, "nature" tries to control the system into the worst possible scenario (using controls within the set ℒ). For λ = (λ⁽¹⁾, …, λ⁽ᵈ⁾) ∈ ℒᵈ and t ≥ 0, let λ(t) ∈ ℝ^{d×d} be the matrix whose ith row is the ith row of e^{tλ⁽ⁱ⁾} for all i ∈ {1, …, d}. Here, the interpretation is that, in every state i ∈ {1, …, d}, "nature" is allowed to choose a different model λ⁽ⁱ⁾ ∈ ℒ. We now add a dynamic component, and define the corresponding cost functional.

COMPUTATION OF PRICE BOUNDS UNDER MODEL UNCERTAINTY
In this section, we demonstrate how price bounds for European contingent claims under uncertainty can be computed numerically in certain scenarios, first, via the explicit primal/dual description (19) of the semigroup envelope and, second, by solving the pricing ODE (3). Throughout, we consider two Q-matrices q₀ ∈ ℝ^{d×d} and q ∈ ℝ^{d×d} and, for c_l, c_h ∈ ℝ with c_l ≤ c_h, the interval [c_l, c_h]. Then, we consider the Q-operator 𝒬: ℝᵈ → ℝᵈ given by 𝒬u₀ := max((q₀ + c_l q)u₀, (q₀ + c_h q)u₀). (20) Then, by Example 3.4, 𝒬 is sublinear and has the (minimal) dual representation ({q₀ + c_l q, q₀ + c_h q}, (0, 0)). Choosing the latter as a dual representation as in Theorem 2.5(iii), we may compute E_t and E_π, for t ≥ 0 and π ∈ P, via (20) and (21). In the sequel, we use (20) and (21) with the rate matrix Δ, which is a discretization of the second space derivative with Neumann boundary conditions, and the rate matrix ∇, which is a discretization of the first space derivative. Then, the rate matrix (σ²/2)Δ + μ∇, for σ > 0 and μ ∈ ℝ, is a finite-difference discretization of the differential operator (σ²/2)∂²ₓ + μ∂ₓ, which is the generator of a Brownian motion with volatility σ and drift μ. Figure 2: Upper and lower price bounds for a bull spread (25) with K = 4 and L = 5 under volatility uncertainty, depending on the current price, in red and green, respectively. In black and blue, the value of the bull spread in the Bachelier model with volatility 1 and 1.5, respectively.
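The discretization matrices described above can be sketched as follows; the exact boundary treatment in the paper may differ, so the construction below (central differences in the interior, reflecting/one-sided stencils at the boundary) is an assumption:

```python
import numpy as np

def second_diff(d, h):
    """Discrete second derivative with Neumann (reflecting) boundary."""
    m = np.zeros((d, d))
    for i in range(1, d - 1):
        m[i, i - 1] = m[i, i + 1] = 1.0
        m[i, i] = -2.0
    m[0, 0] = m[-1, -1] = -1.0          # reflect at the two ends
    m[0, 1] = m[-1, -2] = 1.0
    return m / h**2

def first_diff(d, h):
    """Discrete first derivative: central in the interior, one-sided at the ends."""
    m = np.zeros((d, d))
    for i in range(1, d - 1):
        m[i, i + 1] = 0.5
        m[i, i - 1] = -0.5
    m[0, 0], m[0, 1] = -1.0, 1.0
    m[-1, -2], m[-1, -1] = -1.0, 1.0
    return m / h

d, h = 101, 0.1
Delta, Nabla = second_diff(d, h), first_diff(d, h)
gen = 0.5 * 1.0**2 * Delta + 0.5 * Nabla   # sigma = 1, mu = 0.5
```

For moderate drifts relative to the volatility and grid size, the resulting generator is again a Q-matrix: its rows sum to zero and its off-diagonal entries are nonnegative.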
We start with an example, where we demonstrate how the semigroup envelope can be computed by solving the nonlinear pricing ODE (3). In the following example, we compute the upper and lower semigroup envelope for a discretized version of a Brownian motion (Bachelier model) with drift or volatility uncertainty. The solutions resemble the price bounds resulting from the parameter uncertainty of the underlying asset (the discretized version of a Brownian Motion) for a particular European contingent claim with fixed maturity as a function of the current price of the underlying asset.
Example 5.1. In this example, we compute the semigroup envelope (S( )) ≥0 by solving the ODE ′ =  (0) = 0 ∈ ℝ with the explicit Euler method. The latter could be replaced by any other Runge-Kutta method. We consider the case, where = 101, = 1 10 . The state space is = { | ∈ {0, … , 100}}, which as a discretization of the interval [0,10], the maturity is = 1, and we choose 1,000 time steps in the explicit Euler method. We consider the following two examples.

F I G U R E 3
Upper and lower price bounds for a butterfly spread (24) with = 4 and = 5 under drift uncertainty from Example 5.1(a) in red and green, respectively. In blue and black, the upper and lower price bounds, computed via (26), respectively [Color figure can be viewed at wileyonlinelibrary.com] (a) Let  be given by (20) with 0 ∶= , ∶= , ∶= −1, and ℎ ∶= 1, that is, we consider the case of an uncertain drift parameter in the interval [−1, 1]. We price a butterfly spread, which is given by 0 ( ) = ( − − | − |) + , for = and ∈ {1, … , 100}, with = 4 and = 5. In Figure 1, we depict the upper and lower price bounds as well as the prices corresponding to the Bachelier model with drift −1 and 0 in blue and black, respectively. (b) Now, let 0 ∶= 0, ∶= , ∶= 0.5, and ℎ ∶= 1.5 in (20). That is, we consider the case of an uncertain volatility in the interval [0.5,1.5]. We price a bull spread with = 4 and = 5. In Figure 2, we see the upper and lower price bounds as well as the prices corresponding to the Bachelier model with volatilities 1 and 1.5 in black and blue, respectively.
The following example presents a second algorithm, using the primal/dual representation of the semigroup envelope, for the computation of price bounds for European contingent claims F I G U R E 4 Upper and lower price bounds for a bull spread (25) with = 4 and = 5 under volatility uncertainty from Example 5.1(b) in red and green, respectively. In blue and black, the upper and lower price bounds, computed via (26), respectively [Color figure can be viewed at wileyonlinelibrary.com] under model uncertainty. We compare the results with the ones from the previous example, which were obtained using Euler's method.
Example 5.2. For a fixed maturity ≥ 0, we consider the partitions ∶= { 2 − | = 0, … , 2 }, for ∈ ℕ 0 , of the time interval [0, ]. We are then able to approximate the upper bound for prices of European contingent claims under uncertainty with maturity = 1 by computing, for ∈ ℕ 0 sufficiently large, with  ℎ given by (21) for ℎ ≥ 0. The fundamental system ℎ( 0 + ) , for = ℎ , , appearing in (21) can either be computed via the Jordan decomposition of 0 + , by the approximation with ∈ ℕ 0 sufficiently large or by numerically solving the matrix-valued ODE ′ = ( 0 + ) with (0) = , where = is the × -identity matrix. We illustrate the approximation of the semigroup envelope via (26) in the following two examples, where and are given by (22) and (23). Again, we consider the case, where, = 101, = 1 10 and the maturity is = 1.
(a) As in Example 5.1(a), let 0 ∶= , ∶= , ∶= −1 and ℎ ∶= 1. Again, we compute the price of a butterfly spread, which is given by (24) with = 4 and = 5. In Figure 3, we see the upper and lower price curves from the previous example as well as the price bounds computed in this example. We observe that the price bounds match very well. (b) We consider the case of an uncertain volatility parameter from Example 5.1(b), that is, let 0 ∶= 0, ∶= , ∶= 0.5, and ℎ ∶= 1.5. As in Example 5.1(b), we price a bull spread given by (25) with = 4 and = 5. In Figure 4, we again depict the upper and lower price bounds from the previous example and this example. As in part (a), we observe that the price bounds perfectly match. □

A C K N O W L E D G M E N T S
The author would like to thank Robert Denk and Michael Kupper for their valuable comments, support, and suggestions related to this work. Moreover, the author expresses his gratitude toward three anonymous referees for their helpful suggestions and remarks. Financial support through the German Research Foundation via CRC 1283 is gratefully acknowledged. Open access funding enabled and organized by Projekt DEAL.