Convergence of optimal expected utility for a sequence of binomial models

Abstract We consider the convergence of the solution of a discrete‐time utility maximization problem for a sequence of binomial models to the Black‐Scholes‐Merton model for general utility functions. In previous work by D. Kreps and the second named author a counter‐example for positively skewed non‐symmetric binomial models has been constructed, while the symmetric case was left as an open problem. In the present article we show that convergence holds for the symmetric case and for negatively skewed binomial models. The proof depends on some rather fine estimates of the tail behaviors of binomial random variables. We also review some general results on the convergence of discrete models to Black‐Scholes‐Merton as developed in a recent monograph by D. Kreps.


INTRODUCTION
Mark Davis dedicated a large portion of his impressive scientific work to Mathematical Finance. He shaped this field by masterfully applying the tools of stochastic analysis, which he commanded so well.
The present authors remember very well several discussions during Mark's seven-month stay in Vienna in 2000. Mark repeatedly expressed his amazement at the perfect match of Itô's stochastic calculus with the line of Mathematical Finance initiated by Black, Scholes, and Merton (Black and Scholes, 1973; Merton, 1973). In particular, Mark was astonished how well the martingale representation theorem fits this theory, and he loved this connection. He also appreciated the approximation of the Black-Scholes-Merton model by binomial processes as initiated by Cox et al. (1979). The subtle notions of Itô integrals and the martingale representation theorem in continuous time boil down in the discrete setting to simple linear algebra, as we all know from teaching Mathematical Finance to undergraduates.
Let us look back to the early days of Mathematical Finance. After the pioneering papers Black and Scholes (1973), Merton (1973), and Cox et al. (1979), the next step was taken by Harrison, Kreps, and Pliska in the three articles Harrison and Kreps (1979), Kreps (1981), and Harrison and Pliska (1981), which paved the way from previous ad hoc arguments to a systematic study of the notions of arbitrage, martingale measures, martingale representation, complete markets, and the interconnections between these notions. This opened the arena where Mark made so many important contributions.
In the present note we want to go back to the roots and reconsider the approximation of the Black-Scholes-Merton model by discrete models such as the binomial model. Bachelier already viewed Brownian motion as an infinitesimal version of a symmetric random walk. This random walk view opens a very direct path from simple linear algebra to the martingale representation theorem. The guiding intuition is that a Brownian motion during each infinitesimal time interval has only two choices, namely going up or going down by a properly scaled infinitesimal.
But what happens if we take some other approximation of Brownian motion by discrete processes? The archetypical example is the "trinomial" model. In addition to the up- and down-tick in the binomial model, there is a third intermediate possibility. In the limit you find the same (geometric) Brownian motion as for the binomial model. But if you try to apply the discrete time reasoning from the binomial case as in Cox et al. (1979) to the trinomial model, you immediately run into serious trouble. There is no replication argument available any more and the financial market becomes highly incomplete. The blunt reason is that you are looking for the solution of three linear equations (one per outcome) in two unknowns (the bond and stock holdings), so that in general no replicating portfolio exists; dually, the martingale measure has three unknown weights constrained by only two linear equations, so that it is not unique. As is well known, the unique arbitrage-free option prices of the binomial models are replaced by an interval of arbitrage-free prices in the trinomial model, whose lower and upper bounds are given by the sub- and super-replication prices. Typically these intervals become very wide and are of no practical relevance.
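The contrast between the two models can be made concrete in a single period. The following is a minimal sketch, not taken from the source: the function names, the numerical values (spot 100, ticks 1.1 and 0.9, strike 100), and the grid scan are illustrative assumptions. In the binomial case two equations determine the bond and stock holdings uniquely; in the trinomial case the martingale condition leaves one free parameter, and the resulting prices fill an interval.

```python
def replicate_binomial(s0, up, down, payoff):
    """Solve bond + shares * S1 = payoff(S1) in the up and the down state.

    The bond is the numeraire (zero interest rate); returns (bond, shares, price).
    """
    s_up, s_down = s0 * up, s0 * down
    shares = (payoff(s_up) - payoff(s_down)) / (s_up - s_down)
    bond = payoff(s_up) - shares * s_up
    return bond, shares, bond + shares * s0


def trinomial_price_bounds(s0, up, down, payoff, grid=10_000):
    """Scan the one-parameter family of martingale measures on {s0*up, s0, s0*down}.

    Martingale condition (bond as numeraire): q_up*(up - 1) + q_down*(down - 1) = 0.
    The grid scan only approximates the open interval of arbitrage-free prices.
    """
    ratio = (up - 1.0) / (1.0 - down)   # q_down = ratio * q_up
    q_up_max = 1.0 / (1.0 + ratio)      # boundary case where q_mid = 0
    prices = []
    for i in range(1, grid):            # interior points: all three weights positive
        q_up = q_up_max * i / grid
        q_down = ratio * q_up
        q_mid = 1.0 - q_up - q_down
        prices.append(q_up * payoff(s0 * up)
                      + q_mid * payoff(s0)
                      + q_down * payoff(s0 * down))
    return min(prices), max(prices)


def call(strike):
    """European call payoff with the given strike."""
    return lambda s: max(s - strike, 0.0)


bond, shares, price = replicate_binomial(100.0, 1.1, 0.9, call(100.0))
low, high = trinomial_price_bounds(100.0, 1.1, 0.9, call(100.0))
print(f"binomial: bond={bond:.4f}, shares={shares:.4f}, price={price:.4f}")
print(f"trinomial: arbitrage-free prices fill roughly ({low:.4f}, {high:.4f})")
```

In this toy example the trinomial interval stretches from almost zero up to the binomial price, which illustrates why the no-arbitrage bounds alone carry little practical information.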
But is this really the last word? From an economic point of view this sharp distinction between two similar approximations of the same object seems to be artificial. Can one find a more satisfactory answer? This question recently attracted the attention of David Kreps and led him to take up the theme of option pricing again, where he had made fundamental contributions some 40 years ago. This renewed interest resulted in the monograph Kreps (2019). Kreps' starting point was a simulation of the results of delta-hedging in the framework of a trinomial model. He applied this rather naive strategy to a standard European option and plotted the outcomes of 500 simulations (see figure 1.1 of Kreps (2019)). The result was amazing: with the naked eye one can hardly see the difference between the precise terminal option value and the result of the delta hedge. The visual impression of the outcome of the simulations is that of a complete market.
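An experiment of this kind can be sketched in a few lines. The following is an illustration under stated assumptions, not a reproduction of Kreps' figure: the trinomial ticks (down, zero, up with probability 1/3 each), the volatility 0.2, the at-the-money strike, the 250 trading dates, and all function names are hypothetical choices. The hedger uses the Black-Scholes-Merton delta with the bond as numeraire.

```python
import math
import random


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, sigma, tau):
    """Black-Scholes-Merton call price and delta at zero interest rate."""
    d1 = (math.log(spot / strike) + 0.5 * sigma**2 * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return spot * norm_cdf(d1) - strike * norm_cdf(d2), norm_cdf(d1)


def hedge_error(n, sigma, strike, rng):
    """Delta-hedge one trinomial path; return terminal wealth minus option payoff."""
    # Log-price ticks -step, 0, +step with probability 1/3 each,
    # so that each increment has variance sigma^2 / n.
    step = sigma * math.sqrt(1.5 / n)
    log_price = 0.0
    wealth, _ = bs_call(1.0, strike, sigma, 1.0)   # start with the BSM price
    for k in range(n):
        spot = math.exp(log_price)
        _, delta = bs_call(spot, strike, sigma, 1.0 - k / n)
        log_price += step * rng.choice((-1, 0, 1))
        wealth += delta * (math.exp(log_price) - spot)   # self-financing gain
    return wealth - max(math.exp(log_price) - strike, 0.0)


rng = random.Random(0)
errors = [hedge_error(n=250, sigma=0.2, strike=1.0, rng=rng) for _ in range(500)]
price, _ = bs_call(1.0, 1.0, 0.2, 1.0)
mean_abs_error = sum(abs(e) for e in errors) / len(errors)
print(f"BSM price: {price:.4f}, mean absolute hedging error: {mean_abs_error:.4f}")
```

Although the trinomial market is incomplete, the terminal hedging error in such a run is typically a small fraction of the option price, which is the visual impression described above.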
To analyze this phenomenon in proper generality let us fix some notation as in Kreps' monograph (Kreps, 2019). We work in the space $\Omega = C_0[0,1]$, the space of all continuous functions from $[0,1]$ to $\mathbb{R}$ whose value at 0 is 0. We let $\omega$ denote a typical element of $\Omega$, with $\omega(t)$ the value of $\omega$ at date $t$. Let $\mathbb{P}$ be Wiener measure on $\Omega$.
We consider a Black-Scholes-Merton model of the form $S(t, \omega) = e^{\omega(t)}$ for the stock, taking the bond as numeraire. We know that there is a unique probability measure on $\Omega$, denoted $\mathbb{P}^*$, that is equivalent to $\mathbb{P}$ and under which $S(t)$ is a martingale (Harrison and Kreps, 1979).
Contingent claims are Borel-measurable functions $x : \Omega \to \mathbb{R}$. We let $\mathcal{X}$ denote the space of bounded contingent claims which are continuous with respect to the norm topology on $\Omega = C_0[0,1]$. The well-known "complete markets" result for the Black-Scholes model says that every $x \in \mathcal{X}$ can uniquely be written as $x = \mathbb{E}_{\mathbb{P}^*}[x] + \int_0^1 \alpha(t)\,dS(t)$ for a predictable and $S$-integrable integrand $\alpha$.
Now suppose that for $n = 1, 2, \ldots$ we have different probability measures $\mathbb{P}_n$ defined on $\Omega$, with the following structure: for each $n$, the support of $\mathbb{P}_n$ consists of piecewise linear functions that, in particular, are linear on all intervals of the form $[k/n, (k+1)/n]$, for $k = 0, \ldots, n-1$. The interpretation is that $\mathbb{P}_n$ represents a probability distribution on paths of the log of the stock price in an $n$-th discrete-time economy, in which trading between the stock and bond is possible only at times $t = k/n$ for $k = 0, \ldots, n-1$. At time 1, the bond and stock liquidate in state $\omega$ at prices 1 and $e^{\omega(1)}$.
Consumers in the $n$-th discrete-time economy can implement (state-dependent) self-financing trading strategies $(V(0), \{\alpha(k/n),\ k = 0, \ldots, n-1\})$, where the interpretation is that $V(0)$ is the value of the consumer's initial portfolio, $\alpha(k/n, \omega)$ is the number of shares of stock held by the consumer after she has traded at time $k/n$, and, after time 0, bond holdings are adjusted so that any adjustments in stock holdings at times $k/n$ are financed with bond purchases/sales. In the $n$-th economy, the consumer only knows at time $k/n$ the evolution of the stock price up to and including that date. In the usual fashion, if $V(k/n, \omega)$ is the value of the portfolio formed by this trading strategy at time $k/n$ in state $\omega$, then for all $k = 1, \ldots, n$,
$$V(k/n, \omega) = V(0) + \sum_{j=0}^{k-1} \alpha(j/n, \omega)\left[e^{\omega((j+1)/n)} - e^{\omega(j/n)}\right].$$
We maintain throughout the assumption that, for each $n$, $\mathbb{P}_n$ specifies an arbitrage-free model of a financial market in the usual sense: it is impossible to find in the $n$-th discrete-time model a trading strategy $(V(0), \alpha)$ with $V(0) = 0$, $V(1) \ge 0$ $\mathbb{P}_n$-a.s., and $V(1) > 0$ with $\mathbb{P}_n$-positive probability. This is true if and only if there exists a probability measure $\mathbb{P}^*_n$ that is equivalent to $\mathbb{P}_n$, under which $\{(e^{\omega(k/n)}, \mathcal{F}_{k/n});\ k = 0, \ldots, n\}$ is a martingale, where $\mathcal{F}_{k/n}$ denotes the information available at time $k/n$ (Dalang, Morton, and Willinger; see Dalang et al. (1990)). Such a $\mathbb{P}^*_n$ is called an equivalent martingale measure (emm) for the $n$-th discrete-time model. Of course, in general there will be more than one emm $\mathbb{P}^*_n$. However, $(V(k/n), \mathcal{F}_{k/n})$ is a martingale with respect to any emm $\mathbb{P}^*_n$. In particular, the expectation of $V(1)$ under every emm $\mathbb{P}^*_n$ is $V(0)$.
A basic question treated in detail in Kreps' monograph (Kreps, 2019) is the following: in which precise sense and under which precise assumptions can elements of $\mathcal{X}$ be approximated by claims that are attainable in the discrete-time economies? During a visit of the second named author to Stanford University in the spring term 2019 we jointly took up this theme in the paper Kreps and Schachermayer (2021) and found the following definition to be suitable.
Definition 1.1. The claim $x$ can be asymptotically synthesized with $x$-controlled risk if, for every $\varepsilon > 0$, there exists $N$ such that, for all $n > N$, there is and, in addition,
The main result of Kreps and Schachermayer (2020) states that, under mild conditions which are natural in the present context, $x$-controlled risk can be attained: Then every (continuous and bounded) $x \in \mathcal{X}$ can be asymptotically synthesized with $x$-controlled risk. Moreover, fixing the claim $x$, the sequence of claims $\{x_n\}$ that asymptotically synthesize $x$ can be chosen such that, for $(V_n(0), \alpha_n)$ the trading strategy that gives $x_n$, $V_n(0) \equiv \mathbb{E}_{\mathbb{P}^*}[x]$, the Black-Scholes-Merton price of the claim $x$.
For example, the theorem applies to the trinomial model and, more generally, to a wide range of incomplete approximations of the Black-Scholes model. The message is: replacing the notions of sub- and super-replication by Definition 1.1, we obtain economically meaningful notions of synthesis also in incomplete markets. This is in contrast to the no-arbitrage bounds, which typically only yield huge intervals.
Theorem 1.2 settles the issue of replication of contingent claims. However, this result immediately triggers the next question: what about utility maximization when passing from a discrete approximation to the limiting Black-Scholes-Merton model? This question too is amply discussed in Kreps' monograph Kreps (2019) and was further pursued in another paper Kreps and Schachermayer (2020) by Kreps and the second named author.
Let us recapitulate the setting which is slightly more structured than the assumption of Theorem 1.2 above.
We imagine an expected-utility-maximizing agent who is endowed with initial wealth $x$, trading as above either in the discrete market or in the continuous limit.
The question addressed in Kreps (2019) is: if we place this consumer in the $n$-th discrete-time economy (where the stock and bond trade (only) at times $0, 1/n, 2/n, \ldots, (n-1)/n$), does the optimal expected utility she can attain approach, as $n \to \infty$, what she can optimally attain in the continuous-time Black-Scholes-Merton economy?
Let $u_n(x)$ be the supremal expected utility she can attain in the $n$-th discrete-time economy if her initial wealth is $x$, and let $u(x)$ be her supremal expected utility in the Black-Scholes-Merton economy. Kreps (2019) obtained partial one-sided results, showing that $\liminf_{n} u_n(x) \ge u(x)$. He also proved $\lim_{n} u_n(x) = u(x)$ in the very special cases of having either constant absolute or constant relative risk aversion. But he only conjectured that the second "half", namely $\limsup_{n} u_n(x) \le u(x)$, is true for general (sufficiently regular) $U$.
To tackle this issue in proper generality we first need precise definitions.
Definition 1.3. A utility function is a strictly increasing, strictly concave, and continuously differentiable function $U : \mathbb{R}_+ \to \mathbb{R}$, which satisfies the Inada conditions $\lim_{x \to 0} U'(x) = \infty$ and $\lim_{x \to \infty} U'(x) = 0$.
As usual, we define the corresponding value functions $u_n(x)$ and $u(x)$ as the maximal expected utility an agent can achieve from initial wealth $x$ by admissibly trading in the markets defined by the measures $\mathbb{P}_n$ and $\mathbb{P}$.
The following theorem gives an affirmative answer to Kreps' conjecture under the asymptotic elasticity condition.
To summarize, admitting the condition AE($U$) < 1, this theorem settles the issue of convergence of the optimal expected utility in the discrete approximations of the Black-Scholes-Merton model in an economically satisfactory way. Note that we did not suppose the completeness of the discrete markets modeled by the measures $\mathbb{P}_n$. In other words: the convergence of expected utility behaves well, independently of whether we are in the binomial or in the trinomial approximation. Also note that the assertion of finiteness of both terms in (3), a consequence of the asymptotic elasticity assumption, is a non-trivial result.
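To make the asymptotic elasticity condition tangible, the following worked computation evaluates it for a few standard utility shapes. It assumes the usual definition of AE($U$) as the limit superior of $x U'(x)/U(x)$; the statement of that definition is not reproduced in the text above, so it should be read as an assumption of this sketch. The last example is only meant for large $x$ (it is concave for $x > e^2$) and would have to be modified near zero to satisfy Definition 1.3.

```latex
% Assumed (standard) definition of asymptotic elasticity:
\[
  \mathrm{AE}(U) = \limsup_{x \to \infty} \frac{x\,U'(x)}{U(x)} .
\]
% Power utility with 0 < \alpha < 1:
\[
  U(x) = \frac{x^{\alpha}}{\alpha}, \qquad
  \frac{x\,U'(x)}{U(x)} = \frac{x \cdot x^{\alpha-1}}{x^{\alpha}/\alpha} = \alpha
  \quad\Longrightarrow\quad \mathrm{AE}(U) = \alpha < 1 .
\]
% Logarithmic utility:
\[
  U(x) = \log x, \qquad
  \frac{x\,U'(x)}{U(x)} = \frac{1}{\log x} \longrightarrow 0
  \quad\Longrightarrow\quad \mathrm{AE}(U) = 0 < 1 .
\]
% A borderline shape, for large x only:
\[
  U(x) = \frac{x}{\log x}, \qquad
  \frac{x\,U'(x)}{U(x)} = \frac{\log x - 1}{\log x} \longrightarrow 1
  \quad\Longrightarrow\quad \mathrm{AE}(U) = 1 .
\]
```

The first two examples are the constant relative risk aversion cases; the third indicates how slowly the marginal utility must decay for the condition AE($U$) < 1 to fail.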
But, of course, at this stage the next question pops up. What happens for the (admittedly somewhat pathological) case of utility functions with AE($U$) = 1? For this case Kreps and the second named author found to their surprise that the answer to Kreps' conjecture turns out to be negative. More surprisingly, this pathology already occurs in the framework of the binomial model. To address this issue let us fix the notation for the special case of the binomial model in (2). For arbitrary $p \in (0, 1)$ we consider an i.i.d. sequence ( ) ∞ =1 of Bernoulli variables with where ∈ (0, 1). Denote by the corresponding standardized variables Then , = [ , = , ].
The distribution of the random variable $\omega(1)$ under $\mathbb{P}_n$ equals the binomial distribution of Again we define the value functions $u_n(x)$ and $u(x)$ as and where $\mathbb{P}^*$ and $\mathbb{P}^*_n$ now are the unique equivalent martingale measures pertaining to the Black-Scholes-Merton model and its $n$-th approximation, respectively. When $p \in (0, 1/2)$, the third moment $\mathbb{E}[\zeta^3]$ of the standardized Bernoulli variable $\zeta$ is strictly positive. This is the case where things go astray, as demonstrated by the counterexample in Section 9 of Kreps and Schachermayer (2020). If $p \in (0, 1/2)$, there is a utility function $U$ satisfying the conditions of Definition 1.3 (but with AE($U$) = 1) such that $u(x)$ is a perfectly well-behaved finite function while $\lim_{n \to \infty} u_n(x) = \infty$, for all $x > 0$. This phenomenon happens if $\mathbb{E}[\zeta^3] > 0$, which means that the up-tick of the log-price is larger than the down-tick.
It was left as an open question in Kreps and Schachermayer (2020) what happens in the case $\mathbb{E}[\zeta^3] \le 0$ of a non-positive third moment of the standardized variable $\zeta$, with special emphasis on the symmetric case $\mathbb{E}[\zeta^3] = 0$, when the up-tick of the log-price is equal to the down-tick.
The good news is that in this case everything works out as it should, as stated in the subsequent theorem, which is the main novel contribution of the present paper.
Theorem 1.5. If $u_n$ and $u$ are as above with $p \in [1/2, 1)$, we have $\lim_{n \to \infty} u_n(x) = u(x)$ for all $x > 0$. (10)
The theorem will follow from the subsequent more technical version of (10). As above, let where $\mathbb{P}^*_n$ and $\mathbb{P}^*$ denote the unique equivalent martingale measures of the binomial and the Black-Scholes-Merton model, respectively.
Proof of Theorem 1.5 (admitting Proposition 1.6). We deduce from (Kreps and Schachermayer, 2020, Proposition 2) and standard results on conjugate functions that the reverse inequality to (11) does hold true, that is, $\liminf_{n \to \infty} v_n(y) \ge v(y)$ for the dual value functions $v_n$ and $v$ (12). Admitting Proposition 1.6, formulas (11) and (12) yield $\lim_{n \to \infty} v_n(y) = v(y)$ (13). Using again standard results on conjugate functions (compare Kreps and Schachermayer (2020)) we obtain (10) from (13). □
We therefore are left to show Proposition 1.6, which will be a technically demanding task. Key ingredients for the proof of Proposition 1.6 are estimates for the tails of the standardized binomial distributions in terms of the standard Gaussian tails.
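Let $Z$ be a standard normal random variable and denote its density by $\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ and its upper tail by $\bar{\Phi}(x) = \int_x^\infty \varphi(t)\,dt$.
Proposition 1.7. Suppose $p \in [1/2, 1)$, then there is $C > 0$ such that, for $n \ge 1$, we have
Remark 1.8. The terms (16) and (17) are a lower bound for the area under the density $\varphi(x)$ between $x_{n,k-1}$ and $x_{n,k}$ on the left and $x_{n,k}$ and $x_{n,k+1}$ on the right tail, respectively.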
We prove Proposition 1.6 and Proposition 1.7 first for the symmetric case $p = 1/2$ in Section 2, as this case allows for several simplifications and the main arguments are more transparent. We then provide the slightly more technical details for the asymmetric case $p \in (1/2, 1)$ in Section 3.
Remark 1.9. Of course the history of the Central Limit Theorem goes back to prehistoric times. William Feller wrote in his article (Feller, 1945): Although the problem of an efficient estimation of the error in the normal approximation to the binomial distribution is classical, the many papers which are still being written on the subject show that not all pertinent questions have found a satisfactory solution.
We believe his statement remains valid to the present day; some recent publications on that topic are given below and in the references. Further on Feller says: What is really needed in many applications is an estimate of the relative error, but this seems difficult to obtain.
Here the control of the relative error is crucial for our application to utility maximization. For small values of , namely Proposition 1.7 follows from an old and well-known limit theorem from the proof of the De Moivre-Laplace Central Limit Theorem, see, for example, (Feller, 1968, Theorem VII.3.1, p. 184). Based on a corollary of a theorem given by Chernoff, (Okamoto, 1958, Theorem 1.i, p. 33) yields for $p = 1/2$, in our notation, the inequality $\bar{\Phi}_n(x) < \sqrt{2\pi}\,\varphi(x)$ for the upper tail $\bar{\Phi}_n$ of the standardized binomial distribution. Since $\bar{\Phi}(x) \sim x^{-1} \varphi(x)$ as $x \to \infty$, our estimate improves asymptotically by a factor of $1/x$ as $x \to \infty$. This is important for the utility application.
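The difference between an exponential bound of Chernoff type and a bound in terms of the Gaussian tail can be inspected numerically for the symmetric case. The sketch below is an illustration only: the choice $n = 400$, the grid points, and the function names are assumptions, and the printed ratio is merely an empirical look at the relative error, not the constant of Proposition 1.7.

```python
import math


def binomial_upper_tail(n, k):
    """Exact P[B >= k] for B ~ Binomial(n, 1/2), using integer arithmetic."""
    return sum(math.comb(n, j) for j in range(k, n + 1)) / 2**n


def normal_density(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_upper_tail(x):
    """Standard normal upper tail P[Z >= x]."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


n = 400
for k in (220, 240, 260, 280):
    x = (k - n / 2) / math.sqrt(n / 4)      # standardized grid point
    tail = binomial_upper_tail(n, k)
    exp_bound = math.sqrt(2.0 * math.pi) * normal_density(x)   # equals exp(-x^2 / 2)
    gauss = normal_upper_tail(x)
    print(f"x={x:4.1f}  binomial={tail:.3e}  exp(-x^2/2)={exp_bound:.3e}  "
          f"Gaussian tail={gauss:.3e}  ratio={tail / gauss:.2f}")
```

For growing $x$ the exponential bound overshoots the binomial tail by a factor of order $x$, whereas the ratio to the Gaussian tail stays moderate in this range; this is the relative-error control that matters for the utility application.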
In (Desolneux et al., 2008, Chapter 4) we find an impressive discussion of a large list of inequalities for the binomial distribution.

THE SYMMETRIC CASE
Proof of Proposition 1.6 (symmetric case, admitting Proposition 1.7). As in (Kreps and Schachermayer, 2020, Section 3) we write and where $V(y) = \sup\{U(x) - xy : x > 0\}$ is the conjugate function of $U$. The random variables and are the densities of the (unique) equivalent martingale measures $\mathbb{P}^*$ and $\mathbb{P}^*_n$ with respect to $\mathbb{P}$ and $\mathbb{P}_n$, respectively, that is, = In the symmetric case the calculations from (Kreps and Schachermayer, 2020, Section 6) simplify, and we have that It follows that increases to 1∕8 as $n \to \infty$.
Fix $y > 0$ such that $v(y) < \infty$, since otherwise (11) is certainly true. Denote by ∶ ℝ → ℝ the function and by ∶ ℝ → ℝ the function Clearly, these functions are increasing on ℝ. Note, however, that they are not necessarily concave. We know that where $\varphi(x)$ is the standard normal density and $p_{n,k}$ are the binomial probabilities as in Section 1. As ( ) ≤ ( ) for all $x \in \mathbb{R}$, in order to show (11), it will suffice to show
In order to show (30) the crucial estimate is the uniform integrability of the random variables ( (1)) under . More precisely, we need the following estimates (31) and (32). For $\varepsilon > 0$ there is > 0 such that and uniformly in $n \in \mathbb{N}$. Formulas (31) and (32) correspond to the formulas (Kreps and Schachermayer, 2020, (8.12) and (8.14)).
First we consider > (1) + . If ( , ) > , then $n/2 < k \le n$ and we are in a position to invoke formula (17) of Proposition 1.7, which gives an estimate on the right tail of the binomial distribution as compared to the normal one. More precisely, there is a universal constant $C > 0$ such that, for every $n \in \mathbb{N}$ and $n/2 < k \le n$ and all $x \in (x_{n,k}, x_{n,k+1})$ Thus It follows from (28) that the right-hand side of (34) can be made smaller than $\varepsilon$ for sufficiently large .
Using the well-known weak convergence of $\mathbb{P}_n$ to $\mathbb{P}$ and the uniform integrability conditions we can deduce (29), see (van der Vaart, 1998, Thm 2.20, p. 17).
This finishes the proof of Proposition 1.6. □
Proof of Proposition 1.7 (symmetric case). Let us start with (17). It is enough to prove that there is $n_0 > 0$ such that (17) holds for $n \ge n_0$, since is strictly positive and there are only finitely many remaining cases that can be incorporated in the value of the constant $C$.
As regards the denominator of (37), we have Writing $\log k = \log n + \log(k/n)$ and $\log(n - k) = \log n + \log(1 - k/n)$ and combining (42) log as $n \to \infty$. Here the second term on the right-hand side of (46) is the leading term.
The proof of (16) (16) and (17) we mentioned already in Remark 1.8 how these two inequalities imply (18) and (19). □
For the above proof of Proposition 1.6 the estimates (16) and (17) involving an unspecified constant $C > 0$ are sufficiently strong. But we can do better than that. We may adapt the above argument to yield a constant $C = 1 + \varepsilon$ for $n$ sufficiently large. Indeed, analyzing the above proof of Proposition 1.7, we see that the above argument also works when we split the interval $(1/2, 1)$ not at = 3∕4, but at a point ∈ (1∕2, 1) which is close to 1∕2, to obtain a better constant $C$ for large enough $n$. The detailed argument is given in the proof of the following proposition, which sharpens Proposition 1.7.

DATA AVAILABILITY STATEMENT
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.