Semi-tractability of optimal stopping problems via a weighted stochastic mesh algorithm

In this article, we propose a Weighted Stochastic Mesh (WSM) algorithm for approximating the value of discrete- and continuous-time optimal stopping problems. We prove that in the discrete-time case the WSM algorithm leads to semitractability of the corresponding optimal stopping problems, in the sense that its complexity is bounded in order by $\varepsilon^{-4}\log^{d+2}(1/\varepsilon)$, with $d$ being the dimension of the underlying Markov chain. Furthermore, we study the WSM approach in the context of continuous-time optimal stopping problems and derive the corresponding complexity bounds. Although we cannot prove semitractability in this case, our bounds turn out to be the tightest ones among the bounds known for the existing algorithms in the literature. We illustrate our theoretical findings by a numerical example.

A prime example of an optimal stopping problem in finance is the pricing of American options. Primal and dual approaches have been developed in the literature, giving rise to regression-type Monte Carlo (MC) algorithms for high-dimensional optimal stopping problems. So far, the modern literature on the numerical analysis of high-dimensional optimal stopping problems has focused almost exclusively on the convergence analysis of various simulation-based stochastic algorithms. However, comparing different algorithms based only on their convergence rates is not possible, since these algorithms may have different costs. Therefore, it is important to carry out a complexity analysis of stochastic algorithms for optimal stopping problems. Such a complexity analysis provides the convergence rate of a stochastic algorithm in terms of its cost and hence can be viewed as a universal criterion for comparing different algorithms. Complexity analysis has played, and still plays, an important role in the numerical analysis of algorithms; see Novak and Woźniakowski (2008) and the references therein. The key numerical problem studied in this literature is the computation of integrals by means of deterministic and stochastic (randomized) algorithms. Optimal stopping problems cannot be boiled down to the computation of a single integral but rather require the computation of several nested integrals (dynamic programming principle). Hence, the standard concepts and results from the existing complexity theory cannot be directly transferred to the complexity analysis of optimal stopping problems.
One of the regression algorithms most widely adopted by practitioners is the Longstaff and Schwartz (LS) algorithm. It is based on approximating conditional expectations using least-squares regression on a given basis of functions in each backward induction step. Longstaff and Schwartz (2001) demonstrated the efficiency of their approach through a number of numerical examples, and in Clément, Lamberton, and Protter (2002) and Zanger (2013) general convergence properties of the method were established. In particular, it follows from corollary 3.10 in Zanger (2013) that for a fixed number $L$ of stopping opportunities and a popular choice of polynomial basis functions of degree less than or equal to $Q$, the error of estimating the corresponding value function at one point is bounded by
$$\varepsilon \lesssim c^{L}\left(\sqrt{\frac{Q^{d}\log N}{N}} + Q^{-\nu}\right), \tag{1}$$
where $N$ is the number of paths used to perform the regression, $\nu \ge 1$ is related to the smoothness of the corresponding conditional expectation operator, $d$ is the dimension of the underlying state space, and $c$ is some constant independent of $Q$, $N$, and $d$. On the other hand, due to the computation of a (random) pseudoinverse at every stopping date, the computational cost of the least-squares MC algorithm is approximately proportional to $c_1 L N Q^{2d}$, where $c_1$ is proportional to the cost of an elementary operation (multiplication, for example). This leads to the following estimate for the complexity of the LS algorithm, that is, the amount of "elementary" operations needed to construct an approximation for the value function with accuracy $\varepsilon$.

Proposition 1.1. For $L$ stopping opportunities and underlying dimension $d$, the computational work for achieving an accuracy $\varepsilon$ by the LS algorithm is bounded by
$$\mathcal{C}_{\mathrm{LS}}(\varepsilon, d) \lesssim L\, c_2^{\,L}\, \varepsilon^{-2-3d/\nu}\log(1/\varepsilon), \tag{2}$$
with $c_2 > 1$ a constant depending on $c$ in (1).
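To see where the exponent in Proposition 1.1 comes from, the two error contributions can be balanced against each other. The following is a heuristic sketch only (suppressing the $c^{L}$ factor and logarithmic terms), with $Q$ the polynomial degree, $N$ the number of paths, $\nu$ the smoothness, and $d$ the dimension:

```latex
\begin{aligned}
Q^{-\nu} \approx \varepsilon
  &\;\Longrightarrow\; Q \approx \varepsilon^{-1/\nu},\\
\sqrt{Q^{d}/N} \approx \varepsilon
  &\;\Longrightarrow\; N \approx \varepsilon^{-2} Q^{d} \approx \varepsilon^{-2-d/\nu},\\
\text{cost} \approx L\,N\,Q^{2d}
  &\;\approx\; L\,\varepsilon^{-2-d/\nu}\,\varepsilon^{-2d/\nu}
   \;=\; L\,\varepsilon^{-2-3d/\nu}.
\end{aligned}
```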
If we next consider a continuous-time optimal stopping problem, then we need to approximate it by a discrete one with $L$ stopping dates and then let $L \to \infty$. For instance, let us assume that the error due to the time discretization is of order $L^{-\alpha}$ for some $0 < \alpha < 1$, independent of $d$. Then, for achieving an overall accuracy of order $\varepsilon$, we may take $L = c_3\, \varepsilon^{-1/\alpha}$ for some $c_3 > 0$, which by (2) gives the complexity bound
$$\mathcal{C}_{\mathrm{LS}}(\varepsilon, d) \lesssim \varepsilon^{-1/\alpha}\, c_2^{\,c_3 \varepsilon^{-1/\alpha}}\, \varepsilon^{-2-3d/\nu}\log(1/\varepsilon). \tag{3}$$
It follows that the complexity of the LS algorithm for continuous-time optimal stopping problems may even grow faster than $\exp(\varepsilon^{-1/\alpha})$. Similar complexity bounds can be derived for other simulation-based regression algorithms, such as the value iteration algorithm by Tsitsiklis and Van Roy (2001) (TV). See also, for example, Egloff, Kohler, and Todorovic (2007) for more general regression algorithms, or Goldberg and Chen (2018) for a novel nested-type MC approach with complexity which is independent of $d$ but, unfortunately, exponential in $\varepsilon^{-1}$. An interesting question is whether the complexity bounds (2) and (3) for the discrete- and continuous-time stopping problems, respectively, are attained in worst cases. The appearance of $\varepsilon^{-1/\alpha}$ in the exponential in (3) and of the number of exercise dates $L$ in the exponential in (2) is of course due to the appearance of $L$ in the exponent of the convergence estimate (1). In fact, the latter appearance is observed in all error bounds concerning regression-based backward dynamic programs for optimal stopping in the literature (e.g., Egloff et al., 2007; Zanger, 2013). It also appears in a later result by Zanger (2018), based on dependent samples, and in the convergence analysis by Belomestny and Schoenmakers (2018) in the context of optimal stopping of McKean–Vlasov processes. This factor seems to be unavoidable, because at each backward step the projection error of the estimated continuation function needs to be bounded in relation to the projection error of the true continuation function.
For details, see for instance Zanger (2013), theorem 3.1 versus theorem 3.3, and Zanger (2018), theorem 5.1 versus theorem 5.6. It should also be noted that if we discretize a continuous-time optimal stopping problem, then the conditional variance of the underlying process decreases from one exercise date to the next. However, this decrease is typically of order $1/L$, with $L$ being the number of exercise dates, and as such is not fast enough to remove the exponential dependence on $L$ in the above convergence estimates. Thus, in view of the above considerations, the complexity bounds (2) and (3) for the LS algorithm seem to be sharp in some sense, but a rigorous proof of this assertion seems highly nontrivial and is therefore beyond the scope of this paper.
An important notion in complexity analysis is tractability of a numerical problem. A $d$-dimensional numerical problem, for example, the computation of an integral $\int_{[0,1]^d} f(x)\,dx$, is called (weakly) tractable if there is an algorithm to solve it with complexity $\mathcal{C}(\varepsilon, d)$ satisfying
$$\lim_{\varepsilon^{-1}+d\to\infty} \frac{\log \mathcal{C}(\varepsilon, d)}{\varepsilon^{-1}+d} = 0. \tag{4}$$
Unfortunately, in the case of optimal stopping problems this definition is not very meaningful and rather restrictive. It turns out that for all regression-type algorithms one has, already in the case of discrete-time optimal stopping problems,
$$\limsup_{\varepsilon^{-1}+d\to\infty} \frac{\log \mathcal{C}(\varepsilon, d)}{\varepsilon^{-1}+d} > 0 \tag{5}$$
(based on the convergence rates known in the literature), that is, any discrete-time optimal stopping problem is intractable according to this definition. As an example, consider again the LS algorithm in the case of analytic (hence infinitely smooth) continuation functions. Using results from Trefethen (2017), it can be shown that the error (cf. Equation 1) of the estimated value function in this case has the form
$$\varepsilon \lesssim c^{L}\left(\sqrt{\frac{Q^{d}\log N}{N}} + e^{-\kappa Q}\right) \tag{6}$$
for some $\kappa > 0$, where $Q$ is again the maximal polynomial degree. Similarly to Proposition 1.1, it then follows that a quantity of order $\varepsilon^{-2}\log^{3d}(1/\varepsilon)$ (up to factors depending on $L$) is an upper bound for the LS complexity, and so we get (5) (take $\varepsilon^{-1} = d$ and let $d \nearrow \infty$). Thus, even in the case of analytic continuation functions, tractability of discrete-time optimal stopping problems in the sense of (4) cannot be established by the LS algorithm. However, the latter observation also applies to any other simulation-based algorithm addressed in this paper, including the Weighted Stochastic Mesh (WSM) algorithm that we are going to present below. In fact, the problem with criterion (4) is that it puts too much weight on the dimension $d$ on the one hand, but on the other hand is too relaxed regarding the dependence of $\mathcal{C}(\varepsilon, d)$ on $\varepsilon$. For instance, it is not difficult to see that it renders a problem with an algorithmic complexity of order $d^{2}\exp\bigl(\varepsilon^{-1}/\log\log\cdots\log\varepsilon^{-1}\bigr)$ (weakly) tractable, while an algorithm with complexity $2^{d}/\varepsilon$ is not.
However, running an algorithm with a complexity growing like $\exp\bigl(\varepsilon^{-1}/\log\log\cdots\log\varepsilon^{-1}\bigr)$, hence faster than any $\varepsilon^{-k}$, $k \in \mathbb{N}$, as $\varepsilon \searrow 0$, is in practice impossible (even when $d = 1$). In the setting of optimal stopping problems, the dimension $d$ is typically fixed, though it can be large. Therefore, in this paper we propose a more meaningful definition of tractability, based on a quantity called the tractability index.

Definition 1.2.
(i) For an algorithm with complexity $\mathcal{C}(\varepsilon, d)$, the so-called tractability index is defined as
$$\Gamma := \limsup_{d\to\infty}\,\limsup_{\varepsilon\searrow 0}\, \frac{\log \mathcal{C}(\varepsilon, d)}{d\,\log(1/\varepsilon)}. \tag{7}$$
(ii) We call a problem semitractable if there is an algorithm to solve it which has zero tractability index, that is, $\Gamma = 0$.

For example, it is easily seen from Proposition 1.1 that for continuation functions with smoothness $\nu$ the discrete-time LS algorithm has tractability index $\Gamma_{\mathrm{LS}}^{\nu} = 3/\nu$, whereas in the case of analytic continuation functions (6) implies
$$\frac{\log \mathcal{C}_{\mathrm{LS}}(\varepsilon, d)}{\log(1/\varepsilon)} \lesssim \frac{\log L}{\log(1/\varepsilon)} + 2 + 3d\,\frac{\log\log(1/\varepsilon)}{\log(1/\varepsilon)},$$
hence $\Gamma_{\mathrm{LS}}^{\nu=\infty} = 0$, and semitractability in our sense follows. However, we see from (3) (for $\nu < \infty$), and from a similar expression obtained from (6) in the case of analytic continuation functions, that the LS algorithm has tractability index $\infty$ for continuous-time optimal stopping problems.
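As a complementary illustration, one can compute the tractability index for a complexity bound of the form $\varepsilon^{-4}\log^{d+2}(1/\varepsilon)$, as stated in the abstract for the WSM algorithm. This is a direct calculation under that assumed bound:

```latex
\frac{\log \mathcal{C}(\varepsilon,d)}{d \log(1/\varepsilon)}
  \lesssim \frac{4\log(1/\varepsilon) + (d+2)\log\log(1/\varepsilon)}{d\,\log(1/\varepsilon)}
  = \frac{4}{d} + \frac{d+2}{d}\,\frac{\log\log(1/\varepsilon)}{\log(1/\varepsilon)}
  \;\xrightarrow{\;\varepsilon \searrow 0\;}\; \frac{4}{d}
  \;\xrightarrow{\;d \to \infty\;}\; 0,
```

so the index in Definition 1.2 is zero for any algorithm attaining such a bound.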
In this paper, we introduce the WSM algorithm and show (under mild assumptions) that a discrete-time optimal stopping problem may be solved by this algorithm with tractability index 0, and so is semitractable in the sense of Definition 1.2(ii). Furthermore, we show that continuous-time stopping problems may be solved via this algorithm with finite tractability index equal to 2. The construction of this algorithm in the discrete-time case follows closely the idea of the mesh method by Broadie and Glasserman (2004). By enhancing the latter method with a suitable regularization, we prove that the complexity of the resulting WSM algorithm satisfies (8) (under mild conditions), provided the transition densities of the underlying Markov chain are analytically known or can be well approximated. It turns out that for solving a continuous-time optimal stopping problem we do not need to assume that the transition densities are known, but can use the Gaussian transition densities of the corresponding Euler scheme. This results in an algorithm with complexity of order $c^{d}\varepsilon^{-(2d+14)}$ for some constant $c > 1$.
Although this does not imply semitractability of continuous-time optimal stopping problems, the proposed algorithm is very simple and its complexity remains provably polynomial in $\varepsilon^{-1}$, as opposed to the LS approach. In particular, it follows that the WSM algorithm for continuous-time optimal stopping problems has tractability index 2, and as such has the smallest tractability index among the existing algorithms for continuous-time optimal stopping problems. Let us remark that a complete convergence analysis as well as a complexity analysis of the stochastic mesh method for optimal stopping in discrete and continuous time is still missing in the literature, apart from some preliminary results for the discrete case in Agarwal and Juneja (2013). However, Agarwal and Juneja (2013) does not trace the dependence of the errors on the underlying dimension and the number of stopping times, and is moreover based on the rather restrictive assumption of a compact state space. Furthermore, the WSM algorithm presented in this paper bears some similarities to the random grid algorithm of Rust (1997). However, the Rust (1997) algorithm was constructed and studied for Markov decision problems in discrete time and is not directly applicable to optimal stopping problems. As such, the corresponding convergence analysis in Rust (1997) differs in several respects. For example, it assumes a compact state space and Lipschitz continuity of the transition densities with a Lipschitz constant essentially not depending on the dimension. (The latter assumption is violated by a $d$-dimensional Gaussian kernel with small enough variance, due to Lipschitz constants growing exponentially in $d$, for instance.) The paper is organized as follows. A description of the proposed algorithm is given in Section 2. Section 2.2 is devoted to the convergence and complexity analysis of our algorithm. In Section 3, we turn to continuous-time optimal stopping problems. Section 4 concisely highlights the main achievements of the paper.
Some numerical experiments are presented in Section 5 and all proofs are collected in Section 6.

DISCRETE-TIME OPTIMAL STOPPING PROBLEMS
We begin with the description of the WSM algorithm for discrete-time optimal stopping problems. Let us assume a finite set of stopping dates $\{0, \dots, L\}$ for some natural $L > 0$, and let $(X_j,\ j = 0, \dots, L)$ be a Markov chain in $\mathbb{R}^d$, adapted to a filtration $(\mathcal{F}_j,\ j = 0, \dots, L)$. For a given set of nonnegative reward functions $g_j$, $j = 0, \dots, L$, on $\mathbb{R}^d$, we then consider the discrete Snell envelope process
$$U_j := \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{j,L}} \mathsf{E}_j\bigl[g_\tau(X_\tau)\bigr] = u_j(X_j), \qquad j = 0, \dots, L, \tag{9}$$
where $\mathcal{T}_{j,L}$ stands for the set of $\mathcal{F}$-stopping times with values in the set $\{j, \dots, L\}$, $\mathsf{E}_j := \mathsf{E}[\,\cdot \mid \mathcal{F}_j]$ stands for the $\mathcal{F}_j$-conditional expectation, and the measurable functions $u_j(\cdot)$ exist due to the Markovianity of the process $(X_j)_{j\ge0}$. For simplicity and without loss of generality, we assume that the Markov chain $(X_j)_{j\ge0}$ is time homogeneous, with $l$-step transition density denoted by $p_l(y \mid x)$ and one-step density denoted by $p(y \mid x) := p_1(y \mid x)$ for all $x, y \in \mathbb{R}^d$. Fix some $x_0 \in \mathbb{R}^d$ and assume that $X_0 = x_0$. It is well known that the Snell envelope (9) satisfies the dynamic programming principle
$$u_L(x) = g_L(x), \qquad u_j(x) = \max\Bigl(g_j(x),\ \int u_{j+1}(y)\, p(y \mid x)\, dy\Bigr), \quad j = L-1, \dots, 0. \tag{10}$$
Next we fix some $R > 0$ and define a truncated version of the above dynamic program via
$$\tilde u_L(x) = g_L(x)\,\mathbf{1}_{\{|x| \le R\}}, \qquad \tilde u_j(x) = \max\Bigl(g_j(x),\ \int_{B_R} \tilde u_{j+1}(y)\, p(y \mid x)\, dy\Bigr)\mathbf{1}_{\{|x| \le R\}},$$
where $B_R := \{y \in \mathbb{R}^d : |y| \le R\}$. Thus, by construction, $\tilde u_j$ vanishes outside the ball $B_R$. Also by construction it holds that
$$\tilde u_j(x) \le u_j(x), \qquad j = 0, \dots, L, \tag{11}$$
which is easily seen by backward induction. In view of (11), we may write
$$\tilde u_j(x) = \max\Bigl(g_j(x),\ \int_{B_R} \tilde u_{j+1}(y)\, \frac{p(y \mid x)}{p_{j+1}(y \mid x_0)}\, p_{j+1}(y \mid x_0)\, dy\Bigr)\mathbf{1}_{\{|x| \le R\}}. \tag{12}$$
Now assume that we have a set of trajectories $(X^{(n)}_j,\ j = 0, \dots, L)$, with $X^{(n)}_0 = x_0$, $n = 1, \dots, N$, simulated according to the one-step transition density $p$, and consider the approximation
$$\int_{B_R} \tilde u_{j+1}(y)\, \frac{p(y \mid x)}{p_{j+1}(y \mid x_0)}\, p_{j+1}(y \mid x_0)\, dy \;\approx\; \frac{1}{N} \sum_{n=1}^{N} \tilde u_{j+1}\bigl(X^{(n)}_{j+1}\bigr)\, \frac{p\bigl(X^{(n)}_{j+1} \mid x\bigr)}{p_{j+1}\bigl(X^{(n)}_{j+1} \mid x_0\bigr)}, \tag{13}$$
where, in view of the Chapman–Kolmogorov equation, $p_{j+1}(y \mid x_0) = \int p_j(z \mid x_0)\, p(y \mid z)\, dz$.
Hence, we have approximately
$$\tilde u_j(x) \approx \max\Bigl(g_j(x),\ \frac{1}{N} \sum_{n=1}^{N} \tilde u_{j+1}\bigl(X^{(n)}_{j+1}\bigr)\, \frac{p\bigl(X^{(n)}_{j+1} \mid x\bigr)}{p_{j+1}\bigl(X^{(n)}_{j+1} \mid x_0\bigr)}\Bigr)\mathbf{1}_{\{|x| \le R\}}.$$
We thus propose the following algorithm. We start with
$$\hat u_L\bigl(X^{(n)}_L\bigr) = g_L\bigl(X^{(n)}_L\bigr)\,\mathbf{1}_{\{|X^{(n)}_L| \le R\}}$$
for $n = 1, \dots, N$, and then, for $j = L-1, \dots, 1$ and $m = 1, \dots, N$, set
$$\hat u_j\bigl(X^{(m)}_j\bigr) = \max\Bigl(g_j\bigl(X^{(m)}_j\bigr),\ \frac{1}{N} \sum_{n=1}^{N} \hat u_{j+1}\bigl(X^{(n)}_{j+1}\bigr)\, \frac{p\bigl(X^{(n)}_{j+1} \mid X^{(m)}_j\bigr)}{p_{j+1}\bigl(X^{(n)}_{j+1} \mid x_0\bigr)}\Bigr)\mathbf{1}_{\{|X^{(m)}_j| \le R\}}. \tag{14}$$
By construction, each function $\hat u_j$ vanishes outside the ball $B_R$. Working all the way down to $j = 0$ results in the approximation
$$\hat u_0(x_0) = \max\Bigl(g_0(x_0),\ \frac{1}{N} \sum_{n=1}^{N} \hat u_1\bigl(X^{(n)}_1\bigr)\Bigr)$$
for $u_0(x_0)$, since the weights cancel at $j = 0$. As such, the presented algorithm is closely related to the mesh method of Broadie and Glasserman (2004), apart from the truncation at level $R$ and a special choice of weights.
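The backward induction above can be sketched in code. The following is a minimal illustration for a one-dimensional Brownian-motion chain, where both the one-step and the $j$-step transition densities are Gaussian and hence available in closed form; the payoff, the truncation level, and all parameter values below are illustrative choices, not taken from the paper:

```python
import numpy as np

def wsm_value(g, x0, L, N, R, seed=0):
    """Weighted Stochastic Mesh estimate of u_0(x0) for the chain
    X_{j+1} = X_j + xi_j with xi_j ~ N(0, 1), so p_l(y|x) is the N(x, l) density."""
    rng = np.random.default_rng(seed)
    # Simulate N trajectories X_j^{(n)}, j = 1..L, all started at x0.
    X = x0 + np.cumsum(rng.standard_normal((N, L)), axis=1)

    def p(y, x, l=1):
        # l-step Gaussian transition density of the chain
        return np.exp(-(y - x) ** 2 / (2 * l)) / np.sqrt(2 * np.pi * l)

    # Terminal condition, truncated at radius R.
    u = g(L, X[:, L - 1]) * (np.abs(X[:, L - 1]) <= R)
    # Backward induction over the mesh (dates j = L-1 .. 1).
    for j in range(L - 1, 0, -1):
        xj = X[:, j - 1]          # states at date j
        xnext = X[:, j]           # mesh points at date j+1
        # weights p(X_{j+1}^{(n)} | X_j^{(m)}) / p_{j+1}(X_{j+1}^{(n)} | x0)
        w = p(xnext[None, :], xj[:, None]) / p(xnext, x0, j + 1)[None, :]
        cont = (w * u[None, :]).mean(axis=1)
        u = np.maximum(g(j, xj), cont) * (np.abs(xj) <= R)
    # At j = 0 the weights cancel, so the continuation value is a plain average.
    return max(g(0, np.array([x0]))[0], u.mean())
```

For the submartingale payoff $g_j(x) = |x|$ it is optimal to stop at the last date, so the true value at $x_0 = 0$ is $\mathsf{E}|W_L| = \sqrt{2L/\pi}$, which the (typically high-biased) mesh estimate should roughly reproduce.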

Cost estimation
Let us estimate the cost of carrying out the backward dynamic program (14). One needs to compute $p\bigl(X^{(n)}_{j+1} \mid X^{(m)}_j\bigr)$ for all $j = 1, \dots, L-1$ and $m, n = 1, \dots, N$. This can be done at a cost of order $L N^2 \mathcal{C}(d)$, where $\mathcal{C}(d)$ is the cost of evaluating a (typical) function of $2d$ arguments. In the typical situation, $\mathcal{C}(d)$ is proportional to $d$. The evaluation of (14) for $m = 1, \dots, N$ and $j = 1, \dots, L-1$ then has a cost of order $L N^2 c_*$, with $c_*$ being the cost of an elementary numerical operation, which is negligible if $c_* \ll \mathcal{C}(d)$. So the overall cost of carrying out the backward dynamic program (14) is of order $L N^2 \mathcal{C}(d)$.

Error and complexity analysis
In this section, we analyze the convergence of the WSM estimate (14) to the solution of the discrete optimal stopping problem (9) for $j = 0$ and a fixed $x_0 \in \mathbb{R}^d$ as $N \to \infty$. Let us first bound the distance between $u_j$ and $\tilde u_j$, $j = 0, \dots, L$.
Proposition 2.2. Suppose that the rewards satisfy the moment condition
$$\mathsf{E}\Bigl[\max_{0 \le j \le L} g_j^2(X_j)\Bigr] < \infty \tag{16}$$
and grow at most polynomially,
$$g_j(x) \le c_g (1 + |x|)^{q}, \qquad j = 0, \dots, L, \tag{17}$$
for some $c_g, q > 0$. Suppose further that for some $A, B > 0$ and $l = 1, \dots, L$,
$$p_l(y \mid x) \le A\, l^{-d/2} \exp\bigl(-|x - y|^2 / (B\, l)\bigr) \tag{18}$$
for all $x, y \in \mathbb{R}^d$. One then has an explicit bound on $u_0(x_0) - \tilde u_0(x_0)$ that decays to zero as the truncation level $R$ tends to infinity. Next we control the discrepancy between $\tilde u_0$ and $\hat u_0$.

Proposition 2.3 bounds the discrepancy between $\tilde u_0(x_0)$ and the WSM estimate $\hat u_0(x_0)$ in terms of the number of trajectories $N$ and the truncation level $R$; combining it with Proposition 2.2 and balancing $R$ and $N$ against the target accuracy $\varepsilon$ yields the following complexity bound.

Proposition 2.5. Under the assumptions of Proposition 2.2, the complexity of the WSM algorithm is bounded from above by
$$c_1\, \mathcal{C}_p(d)\, \varepsilon^{-4} \log^{d + c_2}(1/\varepsilon),$$
where $c_1 > 0$ and $c_2 > 1$ are natural constants and $\mathcal{C}_p(d)$ stands for the cost of computing the transition density $p(y \mid x)$ at one point $(x, y)$.
Corollary 2.6. For a fixed $L > 0$, the discrete-time optimal stopping problem (9), with $g_j$ and $(X_j)_{j \ge 0}$ satisfying (16), (17), and (18), is semitractable, provided that the complexity of computing the transition density $p(y \mid x)$ at one point $(x, y)$ is at most polynomial in $d$.

Different approximation algorithms for discrete-time optimal stopping problems can be compared using the tractability index (7). For example, it follows from (2) that the tractability index of the LS approach is equal to $3/\nu$. If the continuation functions are analytic, then the tractability index of the LS approach becomes zero. Moreover, from an inspection of theorem 2.4 in Bally, Pagès, and Printems (2005), we see that the Quantization Tree Method (QTM) has tractability index 2.

Approximation of the transition density
A crucial condition for semitractability in the discrete-exercise case is the availability of the transition density $p(y \mid x)$ of the chain $(X_j)_{j \ge 0}$ in a closed (or cheaply computable) form. However, we can show that if a sequence of approximating densities $p^{(k)}(y \mid x)$, $k \in \mathbb{N}$, converging to $p(y \mid x)$, can be constructed in such a way that the approximation error decays geometrically in $k$, uniformly in $x$ and $y$ (assumption (23)), then, under suitable assumptions on the growth of the cost of computing $p^{(k)}$ (in fact, this cost should be at most polynomial in $d$), one can derive a complexity bound $\mathcal{C}(\varepsilon, d)$ such that
$$\lim_{\varepsilon \searrow 0} \frac{\log \mathcal{C}(\varepsilon, d)}{\log(1/\varepsilon)} \quad \text{is finite and does not depend on } d.$$
The proof would be a (rather straightforward) extension of the present one, which is based on exact transition densities. However, on the one hand, one of the main results of this paper, the tractability index 2 of the continuous-time stopping problem, does not rely on transition density approximation, and on the other hand, such a proof would entail a notational blow-up and might distract the reader from the main lines; therefore the details are omitted.
To construct a sequence of approximations $p^{(k)}(y \mid x)$ satisfying assumption (23), one can use various small-time expansions for transition densities of stochastic processes; see, for example, Azencott (1984) and Li (2013). Let us exemplify this type of approximation in the case of one-dimensional diffusion processes of the form
$$dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t,$$
where $\mu$ is a bounded, twice continuously differentiable function with bounded derivatives, and $\sigma$ is a function with three continuous and bounded derivatives such that there exist two positive constants $\sigma_\bullet$, $\sigma^\bullet$ with $\sigma_\bullet \le \sigma(x) \le \sigma^\bullet$. Consider a Markov chain $(X_j)_{j \ge 0}$ defined as a time discretization of $(X_t)_{t \ge 0}$, that is, $X_j \stackrel{\mathrm{def}}{=} X_{j\Delta}$, $j = 0, 1, 2, \dots$, for some $\Delta > 0$. Under the above conditions, the following representation for the (one-step) transition density of the chain is proved in Florens-Zmirou (1993) (see also Dacunha-Castelle & Florens-Zmirou, 1986, for a more general setting):
$$p_\Delta(y \mid x) = \frac{1}{\sqrt{2\pi\Delta}\,\sigma(y)} \exp\Bigl(-\frac{(F(y) - F(x))^2}{2\Delta}\Bigr)\, \mathsf{E}\Bigl[\exp\Bigl(\Delta \int_0^1 \bar g\,ds\Bigr)\Bigr], \tag{24}$$
where the integrand $\bar g$ is an explicit function, built from $\mu$, $\sigma$, and their derivatives, evaluated along a standard Brownian bridge $B$, and $F(x) = \int_0^x \lambda(u)\, du$ with $\lambda = \sigma^{-1}$. Note that the expectation in (24) is taken with respect to the known distribution of the Brownian bridge $B$. By expanding the exponent in (24) into a Taylor series, we get for $\Delta$ small enough
$$\mathsf{E}\Bigl[\exp\Bigl(\Delta \int_0^1 \bar g\,ds\Bigr)\Bigr] = \sum_{m=0}^{\infty} \frac{\Delta^m}{m!}\, \mathsf{E}\Bigl[\Bigl(\int_0^1 \bar g\,ds\Bigr)^{m}\Bigr].$$
If $\bar g$ is uniformly bounded by a constant $\kappa > 0$, then the above series converges uniformly in $x$ and $y$ for all $\Delta$ small enough. Set $p^{(k)}(y \mid x)$ to be the approximation obtained by truncating this series after $k$ terms. It obviously holds that $p^{(k)}(y \mid x) > 0$ for $\Delta < \Delta_0(k)$, uniformly for all $x, y \in \mathbb{R}$. Hence assumption (23) is satisfied, provided that $\Delta < \Delta_0$ for some $\Delta_0$ depending only on $\kappa$. Similarly, (23) holds if $\bar g \le 0$. To sample from $p^{(k)}$ we can use the well-known acceptance-rejection method, which does not require exact knowledge of the normalizing factor $\int p^{(k)}(y \mid x)\, dy$.
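Acceptance-rejection sampling from an unnormalized density, as needed above, can be sketched as follows. The target here is a toy example ($q(y) \propto e^{-y^4}$), not the series approximation itself, and the proposal and the envelope bound are assumptions chosen purely for this illustration:

```python
import numpy as np

def ar_sample(q_unnorm, proposal_sample, proposal_pdf, bound, n, seed=0):
    """Draw n samples from the density proportional to q_unnorm via
    acceptance-rejection: accept y ~ proposal with probability
    q_unnorm(y) / (bound * proposal_pdf(y)).  No normalizing constant needed."""
    rng = np.random.default_rng(seed)
    out = []
    while len(out) < n:
        y = proposal_sample(rng, n)
        u = rng.uniform(size=n)
        out.extend(y[u * bound * proposal_pdf(y) <= q_unnorm(y)].tolist())
    return np.array(out[:n])

# Toy target q(y) = exp(-y^4) with a standard normal proposal.
# The ratio q/phi = sqrt(2*pi) * exp(y^2/2 - y^4) is maximized at y^2 = 1/4,
# giving the envelope bound sqrt(2*pi) * exp(1/16).
phi = lambda y: np.exp(-y * y / 2) / np.sqrt(2 * np.pi)
q = lambda y: np.exp(-y ** 4)
M = np.sqrt(2 * np.pi) * np.exp(1 / 16)
samples = ar_sample(q, lambda rng, n: rng.standard_normal(n), phi, M, 20000, seed=1)
```

For this target the second moment is $\Gamma(3/4)/\Gamma(1/4) \approx 0.338$, which the empirical variance of the samples should match.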

CONTINUOUS-TIME OPTIMAL STOPPING FOR DIFFUSIONS
In this section, we consider diffusion processes of the form
$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \tag{26}$$
where $b : \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma : \mathbb{R}^d \to \mathbb{R}^{d \times m}$ are Lipschitz continuous and $W = (W^1, \dots, W^m)$ is an $m$-dimensional standard Wiener process on a probability space $(\Omega, \mathcal{F}, \mathsf{P})$. As usual, the (augmented) filtration generated by $(X_t)_{t \ge 0}$ is denoted by $(\mathcal{F}_t)_{t \ge 0}$. We are interested in solving optimal stopping problems of the form
$$u(t, x) = \sup_{\tau \in \mathcal{T}_{t,T}} \mathsf{E}\bigl[g(X^{t,x}_\tau)\bigr], \tag{27}$$
where $g$ is a given real-valued function on $\mathbb{R}^d$, $T \ge 0$, and $\mathcal{T}_{t,T}$ stands for the set of stopping times taking values in $[t, T]$. The problem (27) is related to the so-called free boundary problem for the corresponding partial differential equation. Let us introduce the differential operator
$$\mathcal{L}f(x) = \sum_{i=1}^{d} b_i(x)\,\partial_{x_i} f(x) + \frac{1}{2} \sum_{i,j=1}^{d} \bigl(\sigma\sigma^\top\bigr)_{ij}(x)\,\partial_{x_i}\partial_{x_j} f(x).$$
We denote by $X^{t,x}_s$, $s \ge t$, the solution of (26) starting at time $t$ from $x$: $X^{t,x}_t = x$.
With this notation established, it is worth stating the main issue we address in this section. Our goal is to estimate $u(t, x)$ at a given point $(t_0, x_0)$ with accuracy $\varepsilon$ by an algorithm whose complexity $\mathcal{C}^\star(\varepsilon, d)$ is polynomial in $1/\varepsilon$. As already mentioned in the introduction, some well-known algorithms, such as the regression ones, fail to achieve this goal (at least according to the existing complexity bounds in the literature).
Let us introduce the Snell envelope process
$$U_t := \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{t,T}} \mathsf{E}\bigl[g(\tau, X_\tau) \mid \mathcal{F}_t\bigr],$$
where $g$ (somewhat more general than in (27)) is a given nonnegative function on $\mathbb{R}_{\ge 0} \times \mathbb{R}^d$. In the first step, we perform a time discretization by introducing a finite set of stopping dates $t_i = ih$, $i = 1, \dots, n$, with $h = T/n$ and $n$ some natural number, and next consider the discretized Snell envelope process
$$U^{(n)}_{t_i} := \operatorname*{ess\,sup}_{\tau \in \mathcal{T}^{(n)}_{i,n}} \mathsf{E}\bigl[g(\tau, X_\tau) \mid \mathcal{F}_{t_i}\bigr] = u^{(n)}_i(X_{t_i}),$$
where $\mathcal{T}^{(n)}_{i,n}$ stands for the set of stopping times with values in the set $\{t_i, \dots, t_n\}$. Note that the measurable functions $u^{(n)}_i(\cdot)$ exist due to the Markovianity of the process $X$. The error due to the time discretization is well studied in the literature. We will rely on the following result, which is implied by theorem 2.1 in Bally et al. (2005), for instance:
$$\bigl|u(t_0, x_0) - u^{(n)}_0(x_0)\bigr| \le C\, n^{-1/2},$$
where the constant $C > 0$ depends on the Lipschitz constants of $b$, $\sigma$, and $g$. In order to achieve an acceptable discretization error, we choose a sufficiently large $n$ and then concentrate on the computation of $u^{(n)}_0(x_0)$.
In the next step, we approximate the underlying process using some strong discretization scheme on the time grid $t_i = iT/n$, $i = 0, \dots, n$, yielding an approximation $\bar X$. It is assumed that the one-step transition densities of this scheme are explicitly known. The simplest and most popular scheme is the Euler scheme,
$$\bar X_{t_{i+1}} = \bar X_{t_i} + b(\bar X_{t_i})\,h + \sigma(\bar X_{t_i})\bigl(W_{t_{i+1}} - W_{t_i}\bigr), \qquad i = 0, \dots, n-1,$$
which in general has strong convergence order $1/2$; the one-step transition density of the chain $(\bar X_{t_i})_{i \ge 0}$ is given by
$$\bar p_h(y \mid x) = \frac{1}{(2\pi h)^{d/2} \sqrt{\det \Sigma(x)}}\, \exp\Bigl(-\frac{\bigl(y - x - b(x)h\bigr)^\top \Sigma^{-1}(x)\bigl(y - x - b(x)h\bigr)}{2h}\Bigr), \tag{32}$$
with $\Sigma = \sigma\sigma^\top \in \mathbb{R}^{d \times d}$ and $h = T/n$. Now we turn to the discrete-time optimal stopping problem with possible stopping times $\{t_i = ih,\ i = 0, \dots, n\}$. To this end, we introduce the discrete-time Markov chain $X_i \stackrel{\mathrm{def}}{=} \bar X_{t_i}$, adapted to the filtration $\mathcal{F}_i \stackrel{\mathrm{def}}{=} \mathcal{F}_{t_i}$, set $g_i(\cdot) \stackrel{\mathrm{def}}{=} g(t_i, \cdot)$ (while abusing notation slightly), and consider the discretized Snell envelope process
$$\bar U_i := \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{i,n}} \mathsf{E}\bigl[g_\tau(X_\tau) \mid \mathcal{F}_i\bigr] = \bar u_i(X_i),$$
where $\mathcal{T}_{i,n}$ stands for the set of stopping indices with values in $\{i, \dots, n\}$, and the measurable functions $\bar u_i(\cdot)$ exist due to the Markovianity of the chain. The distance between $\bar u_0$ and $u^{(n)}_0$ is controlled by the next proposition.
where $\lesssim$ stands for an inequality up to a constant depending on the Lipschitz constants of $b$ and $\sigma$ and on the constant in the strong convergence estimate of the Euler scheme.
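An Euler step together with its Gaussian one-step transition density (Equation 32) can be sketched as follows for a one-dimensional example; the drift and diffusion coefficients below (an Ornstein-Uhlenbeck process) are illustrative choices, not taken from the text:

```python
import numpy as np

def euler_paths(b, sigma, x0, T, n, n_paths, seed=0):
    """Simulate n_paths Euler paths of dX = b(X)dt + sigma(X)dW on [0, T]."""
    rng = np.random.default_rng(seed)
    h = T / n
    X = np.full((n_paths, n + 1), float(x0))
    for i in range(n):
        dW = rng.standard_normal(n_paths) * np.sqrt(h)
        X[:, i + 1] = X[:, i] + b(X[:, i]) * h + sigma(X[:, i]) * dW
    return X

def euler_density(b, sigma, h, y, x):
    """One-step Gaussian transition density p_h(y | x) of the Euler chain (d = 1)."""
    var = sigma(x) ** 2 * h
    return np.exp(-(y - x - b(x) * h) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Example: Ornstein-Uhlenbeck dynamics dX = -X dt + dW started at x0 = 1.
X = euler_paths(lambda x: -x, lambda x: 1.0, 1.0, 1.0, 100, 50000, seed=1)
```

Since the OU mean $\mathsf{E}X_T = x_0 e^{-T}$ is known in closed form, the simulated paths provide a quick sanity check of the scheme, and the density function can be checked to integrate to one.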
Since the transition densities of the Euler scheme are explicitly known (see Equation 32), the WSM algorithm can be directly used for constructing an approximation of $\bar u_0(x_0)$ based on the paths of the Markov chain $(X_i)$. To derive the complexity bounds of the resulting estimate, we shall make the following assumptions:

(AX) Assume that there exists a constant $\bar a > 0$ such that
$$\mathsf{E}\Bigl[\max_{0 \le i \le n} \bigl|\bar X_{t_i}\bigr|^2\Bigr] \le \bar a\,\bigl(1 + |x_0|^2\bigr),$$
uniformly in $n$ (hence $h$). This assumption is satisfied under Lipschitz conditions on the coefficients of the stochastic differential equation (26), and can be proved using the Burkholder-Davis-Gundy inequality and the Gronwall lemma.
(AP) Assume furthermore that $(\bar X_{ih},\ i = 0, \dots, n)$ is time homogeneous with transition densities $\bar p_t(y \mid x)$ that satisfy an Aronson-type inequality: there exist positive constants $a$ and $\lambda$ such that for any $x, y \in \mathbb{R}^d$ and any $t > 0$, it holds that
$$\frac{1}{a\, t^{d/2}} \exp\Bigl(-\lambda\, \frac{|x - y|^2}{t}\Bigr) \;\le\; \bar p_t(y \mid x) \;\le\; \frac{a}{t^{d/2}} \exp\Bigl(-\frac{|x - y|^2}{\lambda\, t}\Bigr).$$
This assumption holds if the coefficients in (26) are bounded and $\sigma$ is uniformly elliptic.
The next proposition provides complexity bounds for the WSM algorithm in the case of continuous-time optimal stopping problems.

Discussion. As can be seen from the above proposition, the complexity of the WSM algorithm satisfies
$$\lim_{\varepsilon \searrow 0} \frac{\log \mathcal{C}(\varepsilon, d)}{\log(1/\varepsilon)} \le 2d + 14, \tag{39}$$
and this shows the efficiency of the proposed algorithm as compared to the existing algorithms for continuous-time optimal stopping problems, at least as far as the tractability index is concerned. Indeed, the only algorithm available in the literature with a provably finite limit of type (39) is the QTM of Bally et al. (2005). By letting the number of stopping times and the quantization number tend to infinity in such a way that the corresponding errors in theorem 2.4(b) in Bally et al. (2005) are balanced, we derive a complexity upper bound of order $\varepsilon^{-(6d + c)}$ for some constant $c > 0$. Hence, $\Gamma_{\mathrm{QTM}} = 6$.

SUMMARY
For discrete-time optimal stopping problems, we have established semitractability of the proposed WSM algorithm for rather general Markov chains governed by certain transition kernels. In particular, apart from the assumption on the spatial decay of such kernels and some growth condition on the payoff, no further smoothness assumptions are imposed. As a rule, if both the transition kernels and the rewards are nonsmooth, the continuation functions may be smooth only up to some finite degree. (Examples may easily be constructed.) In the most common case of infinitely smooth continuation functions, many regression algorithms, including the LS and TV algorithms, are also semitractable for discrete-time optimal stopping problems. But when passing to continuous-time stopping problems, the tractability index of the WSM method remains bounded (equal to two), while the tractability index of the regression methods tends to infinity. See Table 1 and Table 2 for the tractability indices of the different methods in the discrete and continuous exercise case, respectively.

NUMERICAL EXPERIMENTS
In the following experiments, we illustrate the WSM algorithm in the case of continuous-time optimal stopping problems. A lower bound for the value function $u(t_0, x_0)$ at a given point $(t_0, x_0)$ via the WSM algorithm can be obtained by evaluating a suboptimal stopping policy on an independent set of trajectories. This policy can be constructed either directly via (13) or by using an interpolation of the likelihood weights.
The fastest and simplest way to do this is to use nearest-neighbor interpolation based on the training set of trajectories; in all experiments below, the number of neighbors was set to 500.
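A minimal sketch of this lower-bound construction follows, using $k$-nearest-neighbor regression of continuation values (with a much smaller $k$ than the 500 neighbors used in the experiments) on a toy one-dimensional random-walk chain; the payoff and all parameters are illustrative assumptions:

```python
import numpy as np

def knn_continuation(xq, xtr, vnext, k):
    """k-NN regression estimate of E[V_{j+1} | X_j = xq] from the path pairs
    (xtr[n], vnext[n]); xq may be an array of query states."""
    d = np.abs(xq[:, None] - xtr[None, :])       # pairwise distances
    idx = np.argpartition(d, k, axis=1)[:, :k]   # k nearest training states
    return vnext[idx].mean(axis=1)

def nn_policy_lower_bound(g, L, n_train, n_test, k=50, seed=0):
    rng = np.random.default_rng(seed)
    Xtr = np.cumsum(rng.standard_normal((n_train, L)), axis=1)   # dates 1..L
    # Backward induction on the training paths: V[j][n] = value at date j, path n.
    V = [None] * (L + 1)
    V[L] = g(Xtr[:, L - 1])
    for j in range(L - 1, 0, -1):
        cont = knn_continuation(Xtr[:, j - 1], Xtr[:, j - 1], V[j + 1], k)
        V[j] = np.maximum(g(Xtr[:, j - 1]), cont)
    # Evaluate the induced (suboptimal) stopping policy on fresh paths.
    Xte = np.cumsum(rng.standard_normal((n_test, L)), axis=1)
    payoff = np.zeros(n_test)
    alive = np.ones(n_test, dtype=bool)
    for j in range(1, L):
        cont = knn_continuation(Xte[alive, j - 1], Xtr[:, j - 1], V[j + 1], k)
        stop = g(Xte[alive, j - 1]) >= cont
        ids = np.where(alive)[0][stop]
        payoff[ids] = g(Xte[ids, j - 1])
        alive[ids] = False
    payoff[alive] = g(Xte[alive, L - 1])         # forced exercise at the last date
    return payoff.mean()
```

Because the policy is evaluated on independent trajectories, the resulting average payoff is (up to Monte Carlo noise) a genuine lower bound for the true Snell value; for the submartingale payoff $|x|$ the optimal value is $\sqrt{2L/\pi}$.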

American put option on a single asset
In order to illustrate the performance of the WSM algorithm in continuous time, we consider the financial problem of pricing an American put option on a single log-Brownian asset, with $r$ denoting the riskless rate of interest, assumed to be constant, and $\sigma$ denoting the constant volatility. The payoff function is given by $g(x) = (K - x)^+$, and a fair price of the option is given by
$$u(t_0, x_0) = \sup_{\tau \in \mathcal{T}_{t_0, T}} \mathsf{E}\bigl[e^{-r(\tau - t_0)}(K - X_\tau)^+\bigr].$$
No closed-form solution for the price of this option is known, but there are various numerical methods which give accurate approximations to $u_0$. The parameter values used are $r = 0.08$, $\sigma = 0.20$, $t_0 = 0$, $K = 100$, $T = 3$. An accurate estimate of the true price obtained via a binomial-tree-type algorithm is 6.9320 (see Kim, Ma, & Choe, 2013). In Figure 1, we show lower bounds due to WSM, the least-squares approach of Longstaff and Schwartz (2001), and the value function regression algorithm of Tsitsiklis and Van Roy (2001) (VF), as functions of the number $n$ of stopping times forming a uniform grid on $[0, T]$. These lower bounds are constructed using a suboptimal stopping rule based on estimated continuation values, evaluated on a new independent set of trajectories. The maximal degrees of the polynomials used as basis functions in LS and VF are indicated by the numbers (2 and 4) in the legend. As can be seen, the WSM lower bounds are more stable as $n$ increases. The VF lower bounds seem to diverge as $n \to \infty$. A similar behavior of regression algorithms for increasing $n$ was observed in Stentoft (2014).

American max-call on five assets
Here, a model with $d = 5$ identically distributed assets is considered, where each underlying has dividend yield $\delta$. The risk-neutral dynamics of the assets are given by
$$dX^i_t = (r - \delta)\, X^i_t\, dt + \sigma\, X^i_t\, dW^i_t, \qquad i = 1, \dots, 5,$$
with independent Brownian motions $W^1, \dots, W^5$, and the payoff of the max-call option is $g(x) = \bigl(\max_i x_i - K\bigr)^+$. The results are shown in Figure 2, where also the related results for the LS algorithm with a polynomial basis of order 2 are reported.
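This five-asset max-call benchmark is commonly run (e.g., in the Broadie-Glasserman setting) with $r = 0.05$, $\delta = 0.10$, $\sigma = 0.20$, $K = x^i_0 = 100$, and $T = 3$; those values are assumptions here, not taken from the text. A quick Monte Carlo estimate of the European price (exercise only at $T$) gives a simple lower bound for the Bermudan max-call value:

```python
import numpy as np

def euro_max_call_mc(S0, K, r, delta, sigma, T, d, n_paths, seed=0):
    """MC estimate of the European max-call price E[exp(-rT)(max_i S_T^i - K)^+]
    for d independent log-normal assets; a lower bound for the Bermudan price."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n_paths, d))
    # Exact log-normal terminal values under the risk-neutral measure.
    ST = S0 * np.exp((r - delta - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
    payoff = np.maximum(ST.max(axis=1) - K, 0.0)
    return np.exp(-r * T) * payoff.mean()

# Assumed benchmark parameters (hypothetical, see lead-in).
est = euro_max_call_mc(100.0, 100.0, 0.05, 0.10, 0.20, 3.0, 5, 200_000, seed=1)
```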