Cover's universal portfolio, stochastic portfolio theory, and the numéraire portfolio

Abstract Cover's celebrated theorem states that the long‐run yield of a properly chosen “universal” portfolio is almost as good as that of the best retrospectively chosen constant rebalanced portfolio. The “universality” refers to the fact that this result is model‐free, that is, not dependent on an underlying stochastic process. We extend Cover's theorem to the setting of stochastic portfolio theory: the market portfolio is taken as the numéraire, and the rebalancing rule need not be constant anymore but may depend on the current state of the stock market. By fixing a stochastic model of the stock market this model‐free result is complemented by a comparison with the numéraire portfolio. Roughly speaking, under appropriate assumptions the asymptotic growth rate coincides for the three approaches mentioned in the title of this paper. We present results in both discrete and continuous time.

and stochastic portfolio theory (SPT henceforth) as initiated by Fernholz (see Fernholz, 2002 and the references therein). After all, both theories ask for general recipes for choosing, in a preference-free way, good (at least in the long run) portfolios among $d$ assets, whose prices over time are given by $S = (S^1_t, \dots, S^d_t)_{t \in \mathbb{T}}$.
Here, the time $t$ varies in $\mathbb{T}$, where $\mathbb{T}$ stands either for $\mathbb{N} = \{0, 1, \dots\}$ (discrete time) or $\mathbb{R}_+ = [0, \infty)$ (continuous time). In many cases, $S$ is modeled by a stochastic process defined on some probability space. We note, however, that one may also consider a model-free approach where $S = (S^1_t, \dots, S^d_t)_{t \in \mathbb{T}}$ is just a deterministic trajectory with values in $(0, \infty)^d$. Indeed, the discrete-time results of Cover (1991) and Cover and Ordentlich (1996) are formulated in this model-free sense. The situation is more subtle in continuous time due to stochastic integration. Jamshidian (1992) extended Cover's universal portfolio to continuous time in a setting of Itô processes satisfying some asymptotic stability conditions. In SPT, one also seeks robust investment strategies. More precisely, the strategies should be constructed using only observable quantities (such as market weights and their quadratic variations) and should not depend on quantities that are nonobservable or difficult to estimate. In particular, no drift estimation is involved, which is usually required in expected utility maximization. These are exactly the principles behind the concept of functionally generated portfolios (see Fernholz, 2002, chapter 3). Although most of the literature assumes an Itô process setting, much of SPT can be developed in a model-free setting, as done by Pal and Wong (2016) in discrete time and by Schied, Speiser, and Voloshchenko (2016) in continuous time. The reason why this works in continuous time is that the value processes of functionally generated portfolios can be defined without stochastic integration.
In this paper, we connect the two theories and additionally provide a comparison with the numéraire portfolio, which corresponds to the classical log-optimal portfolio. Relationships between the two theories were studied in the recent papers by Ichiba and Brod (2014), and Brod (2014) as well as Wong (2015). In particular, Wong (2015) extends Cover's approach to the family of functionally generated portfolios in discrete time and shows that the distribution of wealth in this family satisfies a pathwise large deviation principle.

Summary and discussion of the main results
In this paper, we work under the setting of SPT. Namely, the market portfolio is taken as the benchmark, or "numéraire," so that the primary assets are the market weights, which take values in the open simplex defined by $\Delta^d = \{x \in (0, 1)^d \mid \sum_{i=1}^d x^i = 1\}$. Its closure is denoted by $\bar\Delta^d = \{x \in [0, 1]^d \mid \sum_{i=1}^d x^i = 1\}$. This enables us to analyze strategies which depend on the market weights, and the performance of relative wealth with respect to the market portfolio.

Discrete time
We start by summarizing our results in discrete time. We extend Cover's universal portfolio to a class of $M$-Lipschitz portfolio maps denoted by $\mathcal{L}^M$. Each element of $\mathcal{L}^M$ maps the market weights to long-only portfolio weights in $\bar\Delta^d$ (see Definition 3.1).
Denoting by $(V_t(\pi))_{t=0}^\infty$ the relative wealth process corresponding to a portfolio strategy $(\pi_t)_{t=1}^\infty$, we are interested in comparing the asymptotic growth rates $\lim_{T \to \infty} \frac{1}{T} \log V_T(\pi)$ for certain "optimal" portfolio choices $\pi$. More precisely, under suitable conditions we establish asymptotic equality of the growth rates of the following portfolios:
• the best retrospectively chosen portfolio at time $T$ in the class $\mathcal{L} := \bigcup_{M=1}^\infty \mathcal{L}^M$ (in this context $V_T^{*,M}$ will denote the relative wealth at time $T$ achieved by investing according to the best strategy in $\mathcal{L}^M$ over the time interval $[0, T]$);
• the analog of Cover's universal portfolio whose relative wealth process $(V_t(\nu))_{t=0}^\infty$ is defined in (17) (here $\nu$ is a probability measure on $\mathcal{L}$ with full support on each $\mathcal{L}^M$);
• the log-optimal portfolio among the class of long-only strategies, whose relative wealth process is denoted by $(\hat V_t)_{t=0}^\infty$.
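The first two portfolios can be compared in a model-free way (see Theorem 3.9). To compare them with the log-optimal portfolio, we have to introduce a probabilistic setting. Our main result can then be roughly stated as follows:
Theorem 1.1. Under suitable assumptions on the market weight process (see Theorem 3.10 and Corollary 3.11),
$$\lim_{M \to \infty} \lim_{T \to \infty} \frac{1}{T} \log V_T^{*,M} = \lim_{T \to \infty} \frac{1}{T} \log V_T(\nu) = \lim_{T \to \infty} \frac{1}{T} \log \hat V_T$$
holds almost surely.
Intuitively, this theorem says that a suitable full support mixture of strategies (given by the universal portfolio) is asymptotically as good as both the best strategy chosen with hindsight and the log-optimal portfolio constructed with full knowledge of the underlying process.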

Continuous time
Theorem 1.1, which involves Lipschitz portfolio maps, cannot be extended directly to continuous time because of stochastic integrals. Instead, we consider functionally generated portfolios (see Section 4) whose relative wealth processes can be defined in a pathwise manner (see, e.g., Schied et al., 2016). This choice not only allows model-free considerations but also perfectly connects Cover's theory with SPT in continuous time. By replacing the set L by certain spaces of functionally generated portfolios and assuming that the log-optimal portfolio is functionally generated, we get essentially the same theorem as above.
Apart from the work by Jamshidian (1992), universal portfolio theory has only been studied sparingly in continuous time; see, for example, the paper Ichiba, Papathanakos, Banner, Karatzas, and Fernholz (2011) which studied the performance of the universal portfolio under the "Hybrid Atlas" model. To the best of our knowledge, generalizations to nonparametric families of portfolio maps (in continuous time) have not been considered so far. In this sense, our results significantly extend the continuous-time literature.
Although our approach focuses on the mathematical aspects, universal portfolio strategies have also been studied extensively in an algorithmic framework. See Li and Hoi (2014) for a recent survey and in particular Hazan and Kale (2015).

Discussion of the results
Our model-free approach has clear advantages over classical ones which heavily rely on a particular model choice. Even in the case when the model class (e.g., the Heston model or Lévy models) is correctly specified, model parameters cannot be estimated precisely and always come with a confidence interval. So, in practice, the estimated optimal portfolio is always different from the true optimal one. Our results support the idea that a Bayesian average in the spirit of Cover's universal portfolio is, in the long run, better than a suboptimal estimate.
As for the original theorems of Cover and Jamshidian, a valid criticism is of course that we only establish asymptotic equality on a first-order log-return basis. As such, a lot of important information is lost in the limit. However, one cannot expect to obtain any information on higher order terms unless further quantitative assumptions are made on the considered models. Cover's aim and also the goal of the present paper is to be as model-free as possible. 3 Nevertheless, it is of great theoretical and practical interest to strengthen the asymptotic results to quantitative ones under suitable additional conditions. We hope to address this important question in future research.
The remainder of the paper is organized as follows. In Section 2, we provide a brief overview (in discrete time for convenience) of the main topics of this paper, that is, Cover's theorem, the setting of SPT, and the log-optimal portfolio. In Section 3, we establish Theorem 1.1 in discrete time (see Theorem 3.10 and Corollary 3.11). Section 4 is dedicated to proving the corresponding statements in continuous time in the setting of functionally generated portfolios and, for the comparison with the log-optimal portfolio, under the assumption that the market weights follow an ergodic Itô diffusion (see Theorem 4.11 and Corollary 4.13). Some auxiliary and technical proofs are gathered in the Appendix.

OVERVIEW OF THE THREE PORTFOLIOS
For expositional simplicity, time is discrete in this section.

Cover's universal portfolio
Cover's insight reveals that the "wisdom of hindsight" does not give significant advantages over a properly chosen "universal" portfolio constructed using only historical and current prices of the assets. The relevant optimality criterion here is the asymptotic growth rate of the portfolio.
Let us sketch this (at first glance surprising) result in a particularly easy setting (compare Cover, 1991; Cover & Ordentlich, 1996): Fix $T \in \mathbb{N}$ and think of an investor who at time $T$ looks back and asks which stock she should have bought at time $t = 0$ (by investing her initial endowment and subsequently holding the stock). There is an obvious solution: pick $i \in \{1, \dots, d\}$ which maximizes the normalized logarithmic return
$$\frac{1}{T} \log \frac{S_T^i}{S_0^i}. \tag{2}$$
The problem with this trading strategy is, of course, that we have to make our choice at time $t = 0$ instead of $t = T$. Here is the remedy (compare, e.g., Blum & Kalai, 1999): at time $t = 0$ simply divide the initial endowment, say 1, into $d$ portions of $\frac{1}{d}$, invest one portion in each of the stocks and then hold the resulting portfolio. At time $T$, the normalized logarithmic return satisfies
$$\frac{1}{T} \log(V_T) \ge \frac{1}{T} \log\Big(\frac{1}{d}\,\frac{S_T^i}{S_0^i}\Big), \tag{3}$$
where again $i$ denotes the stock which performed best during the time interval $[0, T]$. Hence, the difference between (2) and (3) can be bounded by $\frac{\log(d)}{T}$, which tends to zero as $T \to \infty$. Hence this buy-and-hold portfolio, which corresponds to a universal portfolio in the sense of Cover, has asymptotically the same normalized logarithmic return as the (only retrospectively known) best performing stock.
Instead of these "pure" investments, Cover considered a more ambitious setting, namely, all constant rebalanced portfolio strategies: let $b = (b^1, \dots, b^d) \in \bar\Delta^d$, that is, $b^j \ge 0$ and $\sum_{j=1}^d b^j = 1$. The value of the corresponding constant rebalanced portfolio $(V_t(b))_{t=0}^\infty$ is defined by holding throughout the proportion $b^j$ of the current wealth in stock $j$, so that $V_0(b) = 1$ and
$$\frac{V_{t+1}(b)}{V_t(b)} = \sum_{j=1}^d b^j\, \frac{S_{t+1}^j}{S_t^j} \tag{4}$$
for each trajectory $S = ((S_t^j)_{j=1}^d)_{t=0}^\infty \subset (0, \infty)^d$ of the stocks. Fix again $T$ and define the quantity $V_T^*$ by
$$V_T^*(S) = \max_{b \in \bar\Delta^d} V_T(b)(S), \tag{5}$$
which is a function of the trajectory $(S_t^1, \dots, S_t^d)_{t=0}^T$. Again, the idea is that, with hindsight, that is, knowing $(S_t^1, \dots, S_t^d)_{t=0}^T$, one considers the best weight $b \in \bar\Delta^d$ which attains the maximum (5).
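The buy-and-hold bound above can be checked numerically. The following is a minimal sketch, not part of the original argument: the simulated price path, the seed, and the helper name `growth_rates` are hypothetical choices made only for illustration. It compares the growth rate of the best stock in hindsight with that of the equal-split buy-and-hold portfolio and reports the gap next to the bound $\log(d)/T$.

```python
import math
import random


def growth_rates(prices):
    """Return (best single-stock rate, equal-split buy-and-hold rate).

    prices[t][i] is the positive price of stock i at time t, t = 0..T.
    """
    d = len(prices[0])
    T = len(prices) - 1
    ratios = [prices[T][i] / prices[0][i] for i in range(d)]
    best = math.log(max(ratios)) / T        # stock chosen with hindsight
    equal = math.log(sum(ratios) / d) / T   # 1/d in each stock, then hold
    return best, equal


# Hypothetical simulated path (an assumption, used only for illustration).
random.seed(0)
d, T = 5, 2000
prices = [[1.0] * d]
for _ in range(T):
    prices.append([p * math.exp(random.gauss(0.0, 0.02)) for p in prices[-1]])

best, equal = growth_rates(prices)
gap = best - equal
print(f"gap = {gap:.6f}, bound log(d)/T = {math.log(d) / T:.6f}")
```

The inequality holds path by path, so any positive price path could be substituted for the simulated one.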
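Cover's goal is to construct a portfolio which generates wealth that performs asymptotically as well as the process $(V_T^*)_{T=0}^\infty$ as $T \to \infty$, uniformly over all price paths. For this reason, the portfolio is said to be universal. In order to do so, let $\nu$ be a probability measure on $\bar\Delta^d$ which replaces the previous uniform distribution over the $d$ stocks. The universal portfolio is built by investing at time 0 the portion $d\nu(b)$ of initial capital in the constant rebalanced portfolio $b$ and by subsequently following the constant rebalanced portfolio process $(V_t(b))_{t=0}^T$. The explicit formula for the wealth is
$$V_T(\nu) = \int_{\bar\Delta^d} V_T(b)\, d\nu(b), \tag{6}$$
where $V_T(b)$ is defined by (4). The portfolio weight of the corresponding universal portfolio, held over the time interval $[t, t+1]$, is given by the wealth-weighted average
$$b_{t+1}^\nu = \frac{\int_{\bar\Delta^d} b\, V_t(b)\, d\nu(b)}{\int_{\bar\Delta^d} V_t(b)\, d\nu(b)}. \tag{7}$$
Let us now recall Cover's celebrated result:
Theorem 2.1 (Cover, 1991). Let $\nu$ be a probability measure on $\bar\Delta^d$ with full support. Then
$$\lim_{T \to \infty} \frac{1}{T} \big(\log V_T^*(S) - \log V_T(\nu)(S)\big) = 0 \tag{8}$$
for all trajectories satisfying the uniform boundedness condition (9).
The proof is given in the Appendix.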
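To make the construction concrete, here is a small sketch for two stocks. It is an illustration under stated assumptions rather than Cover's original algorithm: the measure $\nu$ is replaced by a uniform distribution on a finite grid of weights, and the price relatives (one stock constant, the other alternately doubling and halving) form a hypothetical test path. The code trades the wealth-weighted average weight and checks that the resulting wealth coincides with the $\nu$-average of the constant rebalanced wealths.

```python
import math


def universal_vs_crp(relatives, grid):
    """Track constant rebalanced portfolios (CRPs) and their uniform mixture.

    relatives[t] = (x1, x2): one-period price relatives of two stocks.
    grid: weights b in [0, 1] held in stock 1; nu is uniform on this grid.
    """
    crp = [1.0] * len(grid)   # V_t(b) for every grid point b
    universal = 1.0           # wealth from trading the averaged weight
    for x1, x2 in relatives:
        # Wealth-weighted average of the CRP weights.
        b_univ = sum(b * v for b, v in zip(grid, crp)) / sum(crp)
        universal *= b_univ * x1 + (1.0 - b_univ) * x2
        crp = [v * (b * x1 + (1.0 - b) * x2) for b, v in zip(grid, crp)]
    return universal, crp


# Hypothetical path: stock 1 is constant, stock 2 doubles and halves in turn.
T = 200
relatives = [(1.0, 2.0) if t % 2 == 0 else (1.0, 0.5) for t in range(T)]
grid = [k / 100 for k in range(101)]

universal, crp = universal_vs_crp(relatives, grid)
mixture = sum(crp) / len(crp)   # discrete analog of the integral formula
best = max(crp)                 # best CRP on the grid, known only in hindsight
gap = (math.log(best) - math.log(universal)) / T
print(f"growth-rate gap = {gap:.5f}, bound = {math.log(len(grid)) / T:.5f}")
```

With a finite grid of $N$ points the gap is at most $\log(N)/T$, the same mechanism as in the buy-and-hold example. Handling a continuum of weights requires the full-support argument of the theorem.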
Remark 2.2. As shown by Cover and Ordentlich (1996), the condition (9) can be dropped at least when $\nu$ is the uniform or the Dirichlet$(\frac{1}{2}, \dots, \frac{1}{2})$ distribution on $\bar\Delta^d$ (see also Blum & Kalai, 1999, for an elegant proof in case of the uniform distribution).
Remark 2.3. Let $\mathcal{M}_1(\bar\Delta^d)$ be the set of probability measures on $\bar\Delta^d$. For each $\eta \in \mathcal{M}_1(\bar\Delta^d)$, consider the value $\int_{\bar\Delta^d} V_T(b)(S)\, d\eta(b)$ of the mixture portfolio with initial measure $\eta$. Note that the constant rebalanced portfolio $V(b)$ corresponds to the case where $\eta$ is the point mass at $b$. It is easy to see that
$$\sup_{\eta \in \mathcal{M}_1(\bar\Delta^d)} \int_{\bar\Delta^d} V_T(b)(S)\, d\eta(b) = V_T^*(S),$$
where $V_T^*(S)$ is defined by (5). It follows that the universal portfolio (6) (with initial measure $\nu$) is still asymptotically optimal in the larger class of all such mixture portfolios.

SPT, portfolio maps, and the corresponding universal portfolio
In SPT, we let $(S_t^1, \dots, S_t^d)$ denote the market capitalizations of the $d$ stocks rather than their prices. Then we define the vector of market weights $(\mu_t^1, \dots, \mu_t^d) \in \Delta^d$ by
$$\mu_t^j = \frac{S_t^j}{S_t^1 + \cdots + S_t^d}, \qquad j = 1, \dots, d.$$
This amounts to taking the market portfolio (whose value at time $t$ is $\sum_{j=1}^d S_t^j$) as the numéraire (compare Delbaen & Schachermayer, 1995 and Fernholz & Karatzas, 2010a).
The relative wealth process $(V_t)_{t=0}^\infty$, expressed in units of the market portfolio and starting at $V_0 = 1$, is obtained by the following recursive relation:
$$\frac{V_t}{V_{t-1}} = \sum_{j=1}^d \pi_t^j\, \frac{\mu_t^j}{\mu_{t-1}^j}. \tag{11}$$
In general, we allow all predictable, admissible trading strategies $(\pi_t)_{t=1}^\infty$, where the portfolio weight $\pi_t$ is used over the time interval $[t-1, t]$. In this paper, all trading strategies are fully invested in the equity market, that is, the portfolio weights sum to 1 for all $t$. In particular, the strategies do not lend or borrow money. Henceforth, all wealth processes are measured in units of the market portfolio.
We will focus on trading strategies defined by (deterministic) portfolio maps. These are (Borel) measurable functions
$$\pi : \Delta^d \to \bar\Delta^d \tag{12}$$
which associate to the current relative market capitalization $\mu_t = (\mu_t^1, \dots, \mu_t^d)$ the weights $\pi(\mu_t) = (\pi^1(\mu_t), \dots, \pi^d(\mu_t))$ according to which an agent distributes current wealth among the stocks at time $t$. The constant rebalanced portfolio strategies considered by Cover correspond to the constant functions $\pi : \Delta^d \to \bar\Delta^d$.
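As a minimal illustration of these definitions, the sketch below computes market weights from capitalizations and the relative wealth generated by a portfolio map. It assumes the one-period recursion $V_t / V_{t-1} = \sum_j \pi^j(\mu_{t-1})\, \mu_t^j / \mu_{t-1}^j$. The function names and the short capitalization path are hypothetical.

```python
def market_weights(caps):
    """Map market capitalizations to market weights on the simplex."""
    total = sum(caps)
    return [c / total for c in caps]


def relative_wealth(mu_path, portfolio_map):
    """Relative wealth (in units of the market portfolio), starting at 1.

    The weights portfolio_map(mu_{t-1}) are held over the interval [t-1, t].
    """
    wealth = [1.0]
    for mu_prev, mu_next in zip(mu_path, mu_path[1:]):
        pi = portfolio_map(mu_prev)
        growth = sum(p * m1 / m0 for p, m0, m1 in zip(pi, mu_prev, mu_next))
        wealth.append(wealth[-1] * growth)
    return wealth


# Hypothetical capitalization path for three stocks.
caps_path = [
    [50.0, 30.0, 20.0],
    [55.0, 28.0, 22.0],
    [52.0, 33.0, 21.0],
    [60.0, 30.0, 25.0],
]
mu_path = [market_weights(c) for c in caps_path]

# The identity map is the market portfolio; a constant map is one of Cover's
# constant rebalanced portfolios.
market = relative_wealth(mu_path, lambda x: list(x))
equal = relative_wealth(mu_path, lambda x: [1.0 / len(x)] * len(x))
print(market, equal)
```

The market portfolio has relative wealth identically equal to 1, which is exactly what it means to use it as the numéraire.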
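In this paper, we extend Cover's theory of constant rebalanced portfolios to certain families of portfolio maps. First, we note that Cover's and Jamshidian's definition of a universal portfolio as in (7) and (6) can be easily extended to a general setting. Let $\mathcal{A}$ denote some appropriate space of portfolio maps, $\mathcal{B}(\mathcal{A})$ its Borel $\sigma$-algebra and $\nu$ some probability measure on $\mathcal{A}$.
Definition 2.4. Let $\nu$ be a probability measure on $(\mathcal{A}, \mathcal{B}(\mathcal{A}))$. Then, the corresponding universal portfolio at time $t$ is given by the wealth-weighted average
$$\pi_t^\nu = \frac{\int_{\mathcal{A}} \pi(\mu_{t-1})\, V_{t-1}(\pi)\, d\nu(\pi)}{\int_{\mathcal{A}} V_{t-1}(\pi)\, d\nu(\pi)}. \tag{13}$$
From (11), it is easily seen that the wealth generated by $\pi^\nu$ is given by
$$V_t(\nu) = \int_{\mathcal{A}} V_t(\pi)\, d\nu(\pi). \tag{14}$$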
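The claim that trading the wealth-weighted average of the weights reproduces the averaged wealth can be verified on a finite family of portfolio maps. The following sketch is illustrative only: the family of maps (weights proportional to a power of the market weights), the measure `nu` on it, and the simulated market-weight path are assumptions, not objects taken from the paper.

```python
import random


def diversity_map(p):
    """Hypothetical example family: pi^j(x) proportional to (x^j) ** p."""
    def pi(x):
        w = [xj ** p for xj in x]
        s = sum(w)
        return [wj / s for wj in w]
    return pi


def universal_portfolio(mu_path, maps, nu):
    """Return (wealth of the universal portfolio, wealth of each map)."""
    d = len(mu_path[0])
    wealth = [1.0] * len(maps)   # V_t(pi_k)
    v_univ = 1.0                 # wealth from trading the averaged weights
    for mu0, mu1 in zip(mu_path, mu_path[1:]):
        rel = [m1 / m0 for m0, m1 in zip(mu0, mu1)]
        weights = [pi(mu0) for pi in maps]
        norm = sum(n * v for n, v in zip(nu, wealth))
        # Wealth-weighted average of the individual portfolio weights.
        pi_univ = [
            sum(n * v * w[j] for n, v, w in zip(nu, wealth, weights)) / norm
            for j in range(d)
        ]
        v_univ *= sum(p * r for p, r in zip(pi_univ, rel))
        wealth = [
            v * sum(p * r for p, r in zip(w, rel))
            for v, w in zip(wealth, weights)
        ]
    return v_univ, wealth


# Hypothetical market-weight path generated from lognormal capitalizations.
random.seed(1)
caps = [1.0, 1.0, 1.0]
mu_path = []
for _ in range(500):
    total = sum(caps)
    mu_path.append([c / total for c in caps])
    caps = [c * random.lognormvariate(0.0, 0.05) for c in caps]

maps = [diversity_map(p) for p in (0.0, 0.25, 0.5, 0.75, 1.0)]
nu = [0.2] * len(maps)
v_univ, wealth = universal_portfolio(mu_path, maps, nu)
averaged = sum(n * v for n, v in zip(nu, wealth))
print(v_univ, averaged)
```

Since `v_univ` equals the `nu`-average, it is at least `min(nu) * max(wealth)`, which is the finite-family version of the hindsight comparison.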

The log-optimal portfolio
To define the log-optimal portfolio, we consider a probabilistic setting. The stock price process $S = (S_t^1, \dots, S_t^d)_{t=0}^\infty$ and the corresponding relative market capitalizations $\mu = (\mu_t^1, \dots, \mu_t^d)_{t=0}^\infty$ are now assumed to be stochastic processes defined on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t=0}^\infty, \mathbb{P})$. There is a large literature on the log-optimal portfolio (see, e.g., Becherer, 2001; Karatzas & Kardaras, 2007 and the references given there). For a fixed horizon $T$, this portfolio is by definition the maximizer of the expected logarithmic growth rate
$$\mathbb{E}[\log V_T(\pi)] \tag{15}$$
over all predictable, admissible trading strategies $(\pi_t)_{t=1}^T$. Under mild assumptions on the process $\mu$ a unique optimizer exists; see, for example, Becherer (2001) and Kramkov and Schachermayer (1999).
To connect the log-optimal portfolio with universal portfolios in the sense of Definition 2.4, we need appropriate assumptions. We will assume that $\mu$ is a time-homogeneous Markov process, and we will restrict to long-only portfolios in the optimization of (15). These imply that the optimal portfolio in (15) (over the set of predictable processes taking values in $\bar\Delta^d$) has the form $\pi_t = \pi(\mu_{t-1})$, where $\pi : \Delta^d \to \bar\Delta^d$ as in (12). We denote the corresponding optimizer by $\hat\pi$.
The Markovian assumption can be motivated by the stability of capital distributions of equity markets (see Fernholz, 2002, chapter 5). In SPT, this led to systems of interacting Brownian particles whose dynamics depend on their relative rankings. Under suitable conditions, these systems show behaviors observed in large equity markets. See, for example, Banner, Fernholz, and Karatzas (2005) and Ichiba et al. (2011) for "Atlas"-type models and the references therein. 6 We also refer to Kardaras and Robertson (2012) which studies the growth optimal portfolio in a Markovian setting with uncertainties.

A COMPARISON OF THE THREE APPROACHES-THE DISCRETE TIME CASE
Throughout this section, we work in discrete time and assume that the market weights are described by a $d$-dimensional path $\mu = (\mu_t)_{t=0}^\infty$ with values in $\Delta^d$. We consider as far as possible a model-free approach, but will introduce a probabilistic setting when the log-optimal portfolio is involved.

Definitions of the portfolios
We start by defining rigorously, in the present setting, the three portfolios introduced in Sections 1 and 2.

The best retrospectively chosen portfolio
Consider Cover's theme of choosing retrospectively at time $T$ a strategy which is optimal within a certain class of strategies, in our case portfolio maps $\pi : \Delta^d \to \bar\Delta^d$. A moment's reflection reveals that it does not make sense to allow the choice among all measurable functions $\pi : \Delta^d \to \bar\Delta^d$. Indeed, nothing prevents us from choosing $\pi$ such that $\pi(\mu_t) = e_{i(t)}$, where $e_i$ denotes the $i$th unit vector and $i(t) \in \{1, \dots, d\}$ maximizes $\mu_{t+1}^i / \mu_t^i$. This is asking for too much clairvoyance and does not allow for meaningful results (compare Cover & Ordentlich, 1996; and Blum & Kalai, 1999, section 5).
However, it does make sense (economically as well as mathematically) to restrict to more regular trading strategies. In particular, we work with the following set of $M$-Lipschitz portfolio maps. For $\delta > 0$, we let $\bar\Delta^d_\delta$ denote the set of $x \in \Delta^d$ satisfying $x^j \ge \delta$, for $j = 1, \dots, d$. Also we let $\| \cdot \|_1$ be the usual 1-norm.
Remark 3.3. Instead of Lipschitz functions we could just as well consider other compact function spaces, for example, Hölder spaces equipped with a proper norm. This is done in the context of functionally generated portfolios in Section 4.
The retrospectively chosen best performing portfolio among the above Lipschitz maps is defined as follows:
$$V_T^{*,M} = \sup_{\pi \in \mathcal{L}^M} V_T(\pi). \tag{16}$$
By compactness (see Remark 3.2) and continuity of the map $\pi \mapsto V_T(\pi)$, there exists an optimizer $\pi_T^{*,M} \in \mathcal{L}^M$ (not necessarily unique) such that $V_T^{*,M} = V_T(\pi_T^{*,M})$; thus the sup above can be replaced by max.

The universal portfolio
Our aim is to find a predictable process $\pi = (\pi_t)_{t=1}^\infty$, that is, one which depends only on the history of the market weights, such that the performance of $(V_t(\pi))_{t=0}^\infty$ is asymptotically as good as that of $(V_t^{*,M})_{t=0}^\infty$. This can be achieved by the universal portfolio introduced in Definition 2.4, where the space $\mathcal{A}$ is now $\mathcal{L}^M$ as in Definition 3.1. As $\mathcal{L}^M$ is a compact metric space, we may find a (Borel) probability measure $\nu$ on $(\mathcal{L}^M, \| \cdot \|_\infty)$ with full support; this will be essential for establishing an analog to Theorem 2.1. The (relative) wealth of the universal portfolio is given, as in (14), by
$$V_T(\nu) = \int_{\mathcal{L}^M} V_T(\pi)\, d\nu(\pi). \tag{17}$$

The log-optimal portfolio
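In order to relate the universal portfolio to the (long-only) log-optimal portfolio, we assume that $\mu = (\mu_t)_{t=0}^\infty$ is a time-homogeneous Markov process (see Section 2.3).
Here is a precise statement.
Assumption 3.5. The process $\mu$ is a time-homogeneous, ergodic Markov process with a unique invariant measure $\rho$ on the open simplex $\Delta^d$.
The long-only log-optimal trading strategy $\hat\pi$, as noted above, is given in terms of a portfolio map. Given that $\mu_t = x \in \Delta^d$, we know the conditional law $P(x, \cdot)$ of $\mu_{t+1}$. We therefore choose $\hat\pi(x) \in \bar\Delta^d$ as the maximizer
$$\hat\pi(x) = \operatorname*{arg\,max}_{p \in \bar\Delta^d} \int_{\Delta^d} \log \Big\langle p, \frac{y}{x} \Big\rangle\, P(x, dy) \tag{18}$$
and assume that $\hat\pi(\cdot)$ can be chosen to be measurable (here $\langle \cdot, \cdot \rangle$ denotes the Euclidean dot product and $y/x$ the componentwise quotient). For $x \in \Delta^d$, define the number $L(x)$ as the value of the optimization problem (18), that is,
$$L(x) = \int_{\Delta^d} \log \Big\langle \hat\pi(x), \frac{y}{x} \Big\rangle\, P(x, dy). \tag{19}$$
Considering $\pi(x) = x$ (which corresponds to the market portfolio) we clearly have $L(x) \ge 0$ for each $x \in \Delta^d$. We obtain the a.s. relation
$$\mathbb{E}\Big[\log \frac{\hat V_{t+1}}{\hat V_t} \,\Big|\, \mu_t\Big] = L(\mu_t),$$
where $\hat V = (\hat V_t)_{t=0}^\infty$ denotes the long-only log-optimal wealth process defined by the portfolio map $\hat\pi$ via (11).
Assumption 3.6. Using the above notation we assume that
$$L := \int_{\Delta^d} L(x)\, d\rho(x) < \infty. \tag{20}$$
Applying Birkhoff's ergodic theorem for discrete time Markov processes (see Eberle, 2016, theorem 2.2, section 2.1.4), we have the following result.
Theorem 3.7. Under Assumptions 3.5 and 3.6, we have that, for $\rho$-a.e. starting value $\mu_0 \in \Delta^d$,
$$\lim_{T \to \infty} \frac{1}{T} \log \hat V_T = L, \tag{21}$$
the limit holding true a.s. as well as in $L^1$. More generally, let $\pi : \Delta^d \to \bar\Delta^d$ be any measurable portfolio map such that
$$L^\pi := \int_{\Delta^d} \int_{\Delta^d} \log \Big\langle \pi(x), \frac{y}{x} \Big\rangle\, P(x, dy)\, d\rho(x) \tag{22}$$
is well defined and finite. We then have, for $\rho$-a.e. starting value $\mu_0$, that
$$\lim_{T \to \infty} \frac{1}{T} \log V_T(\pi) = L^\pi \tag{23}$$
a.s. as well as in $L^1$.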
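The one-step optimization behind the long-only log-optimal portfolio map can be sketched for $d = 2$ and a finitely supported conditional law. This is a toy illustration, not the paper's construction: the transition law `law`, the grid search, and the function names are hypothetical. The code maximizes the expected logarithm of $\langle p, y/x \rangle$ over long-only weights $p$ and confirms that the market portfolio $p = x$ yields the value 0, so the optimal value is nonnegative.

```python
import math


def expected_log_growth(p, x, law):
    """Integral of log<p, y/x> against a finitely supported law of y."""
    return sum(
        prob * math.log(sum(pj * yj / xj for pj, yj, xj in zip(p, y, x)))
        for y, prob in law
    )


def log_optimal_weight(x, law, n_grid=1000):
    """Grid search over long-only weights (q, 1 - q) for two assets."""
    candidates = [(k / n_grid, 1.0 - k / n_grid) for k in range(n_grid + 1)]
    best = max(candidates, key=lambda p: expected_log_growth(p, x, law))
    return best, expected_log_growth(best, x, law)


# Hypothetical conditional law of the next market weights given x.
x = (0.5, 0.5)
law = [((0.7, 0.3), 0.4), ((0.35, 0.65), 0.6)]

pi_hat, L_x = log_optimal_weight(x, law)
market_value = expected_log_growth(x, x, law)   # choosing p = x
print(pi_hat, L_x, market_value)
```

In this example the first-order condition gives the interior optimum $q = 5/12$, which the grid search recovers up to the grid resolution. For general $d$ a convex solver would replace the grid.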
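In general, there is little reason why the function $\hat\pi$ should have better regularity properties than being just measurable. On the other hand, we may approximate $\hat\pi$ by more regular functions, in particular by functions in $\mathcal{L}^M$. This will be crucial for comparing the asymptotic growth rates. The following result is intuitively obvious, but the proof turns out to be quite technical and will be given in the Appendix.
Lemma 3.8. Under Assumptions 3.5 and 3.6, the growth rate $L$ can be approximated by the growth rates $L^\pi$ of Lipschitz portfolio maps $\pi$, where $L$ and $L^\pi$ are defined in (20) and (22), respectively. In particular, we have $L = \sup_M \sup_{\pi \in \mathcal{L}^M} L^\pi$.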

Asymptotically equivalent growth rates
We are now ready to compare the asymptotic performance of the three approaches. We first establish an analog of Theorem 2.1.
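Theorem 3.9. Fix $M > 0$ and a Borel probability measure $\nu$ with full support on $\mathcal{L}^M$. For every trajectory $(\mu_t)_{t=0}^\infty$ in $\Delta^d$ we have
$$\lim_{T \to \infty} \frac{1}{T} \big(\log V_T^{*,M} - \log V_T(\nu)\big) = 0. \tag{24}$$
Proof. The inequality "≥" is obvious. For the reverse inequality, we follow the argument of Blum and Kalai (1999). As $\mathcal{L}^M$ is compact and $\nu$ has full support, it is not difficult to see that for any $\eta > 0$, there exists $\delta > 0$ such that every $\eta$-neighborhood of a point $\pi \in \mathcal{L}^M$ has $\nu$-measure bigger than $\delta$. Let a trajectory $(\mu_t)_{t=0}^\infty$ in $\Delta^d$ be given. For a fixed time $T$, let $\pi_T^{*,M} \in \mathcal{L}^M$ be an optimizer of (16). Consider a portfolio map $\pi \in \mathcal{L}^M$ with $\| \pi - \pi_T^{*,M} \|_\infty < \eta$, that is, such that, for every $x \in \Delta^d$ we have $\| \pi(x) - \pi_T^{*,M}(x) \|_1 = \sum_{j=1}^d | \pi^j(x) - (\pi_T^{*,M})^j(x) | < \eta$.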
Choose > 0 small enough so that = < 1 and define, for ∈ Δ , Rearranging, we have It is easy to see that̃maps Δ intoΔ .
Using (26), we have the estimate Fix > 0. Choosing > 0 sufficiently small we can make = small enough such that the final term is bigger than − . Summing up, we have whenever ‖ − * , ‖ ∞ < . Denote by = ( * , ) the ‖ ⋅ ‖ ∞ -ball with radius in L which has -measure at least > 0, where only depends on . As each element of satisfies (28) we have Now (24) is proved by sending in (29) to infinity and letting to zero. □ Note that in Theorem 3.9 we do not need the uniform boundedness condition (9) (compare this result with Wong, 2015, lemma 3.3). We now combine Lemma 3.8 (which is probabilistic) with Theorem 3.9 (which is pathwise) to obtain-under suitable assumptions-equality of the asymptotic performance among the three portfolios. We first consider the space L for a fixed . In Corollary 3.11, we then formulate a result for L = ⋃ L .
Theorem 3.10. Let $\Omega = (\Delta^d)^{\mathbb{N}}$ be the canonical path space equipped with its natural filtration and a probability measure $\mathbb{P}$. Define $\mu = (\mu_t)_{t=0}^\infty$ to be the canonical process, that is, $\mu_t(\omega) = \omega_t$, which takes values in $\Delta^d$ and satisfies Assumptions 3.5 and 3.6. Moreover, let $M > 0$ be a fixed Lipschitz constant for the space $\mathcal{L}^M$. Consider the following objects that are defined for each trajectory $(\mu_t)_{t=0}^\infty$:
(i) Define for each $T \in \mathbb{N}$ the portfolio map $\pi_T^{*,M} \in \mathcal{L}^M$ as well as the corresponding wealth $V_T^{*,M} := V_T(\pi_T^{*,M})$ as in (16).
(ii) Fix a probability measure $\nu$ on $\mathcal{L}^M$ with full support and consider the wealth process of the universal portfolio $(V_t(\nu))_{t=0}^\infty$ as of (17).
where is given in (22). In addition, the first equality holds for all trajectories ( ) ∞ =0 in Δ . Proof. We first note that̂is well-defined; simply use the compactness of L with respect to ‖ ⋅ ‖ ∞ (compare the proof of Lemma 3.8). Note also that by the ergodic theorem (Theorem 3.7), we have for each ∈ L where is defined by (22). In particular, aŝ∈ L by definition, we have That the first equality in (31) holds for all trajectories ( ) ∞ =0 in Δ was shown in Theorem 3.9. For each fixed ∈ ℕ, we obviously have Using (32), (33), and Theorem 3.9 we thus have ℙ-a.s.
On the other hand, by the definition of (̂) ∞ =0 as the log-optimizer within the class L , we have To see this, note that the universal portfolio is given by (13). By the time-homogenous Markovianity it is thus sufficient to dominate the left-hand side of (35) by taking the supremum over elements in L . Combining now (35), Theorem 3.7 and (34) yields that , ℙ-a.s.
Here, the first inequality follows from Fatou's lemma (note here that $\frac{1}{T} \log(V_T(\nu))$ is bounded from below, see, e.g., (29)). From this we see that the quantity $\liminf_{T \to \infty} \frac{1}{T} \log(V_T(\nu))$ is $\mathbb{P}$-a.s. constant and equal to the limiting growth rate of the log-optimizer within $\mathcal{L}^M$. This completes the proof of the theorem. □
Next we will send $M$ to infinity in the following way. For $M = 1, 2, 3, \dots$ choose a measure $\nu^M$ on $\mathcal{L}^M$ with full support. Define $\nu = \sum_{M=1}^\infty 2^{-M} \nu^M$ and the wealth of the universal portfolio $V(\nu)$ as in (17) by
$$V_T(\nu) = \int_{\mathcal{L}} V_T(\pi)\, d\nu(\pi),$$
where $\mathcal{L} = \bigcup_{M=1}^\infty \mathcal{L}^M$. Recall that $(\hat V_t)_{t=0}^\infty$ is the wealth process of the (long-only) log-optimal portfolio (18).
where is defined in (20).

□
Here we establish the supermartingale property used in the previous proof.

THE CONTINUOUS TIME CASE WITH FUNCTIONALLY GENERATED PORTFOLIOS
This section is dedicated to a similar analysis in continuous time and with functionally generated portfolio maps (Fernholz, 2002, chapter 3). Using the pathwise Itô calculus developed by Föllmer (1981), we can define the corresponding wealth processes in a pathwise manner for any continuous market path admitting a quadratic variation process. This allows us to define the best retrospectively chosen portfolio which is not well-defined in general (and in particular for the Lipschitz portfolio maps).

Functionally generated portfolios
We consider the following set of concave functions. For some fixed > 0 and 0 ≤ ≤ 1, we define where 2, (Δ ) denotes the Hölder space of 2-times continuously differentiable functions fromΔ → ℝ whose derivatives are -Hölder continuous. That is, with denoting a multi-index in ℕ 2 . For = 0, the second term in this norm is left away. Note that is only defined on the simplex Δ . In order that the partial derivatives are well-defined, we assume that each is extended to an open neighborhood of Δ such that ( ) = ( ′ ), where ′ is the orthogonal projection of onto Δ . The choice of the extension is irrelevant.
Here is an analytical lemma whose proof is given in the appendix.
To the set of generating functions $\mathcal{G}^{M,\alpha}$, we now associate the set of functionally generated portfolios $\mathcal{FG}^{M,\alpha}$ in the spirit of Fernholz (2002), defined by
$$\mathcal{FG}^{M,\alpha} = \Big\{ \pi^G :\ \pi^{G,j}(x) = x^j \Big( D_j \log G(x) + 1 - \sum_{k=1}^d x^k D_k \log G(x) \Big),\ G \in \mathcal{G}^{M,\alpha} \Big\}. \tag{40}$$
By the concavity of $G$, $\pi^G$ takes values in $\bar\Delta^d$, that is, it is long-only (see, e.g., Fernholz & Karatzas, 2009, remark 11.1). The corresponding wealth processes are denoted by $V^G$ or $V^{\pi^G}$. For these portfolios, it is possible to obtain a pathwise expression for $V^G$. We refer the reader to Schied et al. (2016) for extensions of this pathwise approach to time-dependent and path-dependent generating functions. There, this is achieved by applying the functional Itô calculus developed by Dupire (2009) and Fournié (2010, 2013), which generalizes Föllmer's Itô calculus to path-dependent functionals. In this paper, we only consider functionally generated portfolio maps as defined in (40).
The dynamics of the relative wealth process $V^G$ built by investing according to $\pi^G \in \mathcal{FG}^{M,\alpha}$ are given in this continuous-time case by
$$\frac{dV_t^G}{V_t^G} = \sum_{j=1}^d \frac{\pi^{G,j}(\mu_t)}{\mu_t^j}\, d\mu_t^j = \sum_{j=1}^d D_j \log G(\mu_t)\, d\mu_t^j \tag{41}$$
(compare (11) in the discrete time case), where the right-hand side has to be understood as Föllmer's pathwise integral (cf. equation (6) in Schied et al., 2016). Note that the second equality holds by the definition of $\pi^G$ and the fact that $\sum_{j=1}^d d\mu_t^j = 0$. Using (41) and Föllmer's Itô calculus, we have the following pathwise version of Fernholz's (2002) master equation (also see Schied et al., 2016, theorem 2.9).
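The passage from a generating function to portfolio weights can be illustrated numerically. The sketch below assumes Fernholz's standard formula $\pi^j(x) = x^j \big( D_j \log G(x) + 1 - \sum_k x^k D_k \log G(x) \big)$ and approximates the gradient by central differences. The two generating functions (geometric mean and entropy) and the helper name `generated_weights` are illustrative choices, not taken from the text.

```python
import math


def generated_weights(log_G, x, h=1e-6):
    """Weights generated by G, using a finite-difference gradient of log G."""
    d = len(x)
    grad = []
    for j in range(d):
        up = list(x)
        dn = list(x)
        up[j] += h
        dn[j] -= h
        grad.append((log_G(up) - log_G(dn)) / (2.0 * h))
    inner = sum(xk * gk for xk, gk in zip(x, grad))
    return [xj * (gj + 1.0 - inner) for xj, gj in zip(x, grad)]


def log_geometric_mean(x):
    """log G for G(x) = (x^1 * ... * x^d) ** (1 / d)."""
    return sum(math.log(xj) for xj in x) / len(x)


def log_entropy(x):
    """log G for the entropy function G(x) = -sum_j x^j log x^j."""
    return math.log(-sum(xj * math.log(xj) for xj in x))


x = [0.5, 0.3, 0.2]
equal_w = generated_weights(log_geometric_mean, x)
entropy_w = generated_weights(log_entropy, x)

# Closed form for the entropy-weighted portfolio, used as a cross-check.
H = -sum(xj * math.log(xj) for xj in x)
entropy_exact = [-xj * math.log(xj) / H for xj in x]
print(equal_w, entropy_w)
```

Adding a constant to every component of the gradient leaves the weights unchanged because the market weights sum to one. This is why the particular extension of $G$ away from the simplex does not matter in this computation.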

Definitions of the portfolios
We again consider (i) the best retrospectively chosen portfolio, (ii) the universal portfolio, and (iii) the log-optimal portfolio. To define the log-optimal portfolio, we will restrict to a specific stochastic model introduced in Section 4.2.3. In Section 4.2.4, we derive the asymptotic growth rate for this model class under an additional ergodicity assumption.

The best retrospectively chosen portfolio
We consider the set of functionally generated portfolios  , and a given continuous path ( ) ≥0 satisfying Assumption 4.2. For , > 0 fixed, we define * , , = sup We first prove that an optimizer exists by establishing the following continuity property whose proof can be found in the appendix. Proof. This is simply a consequence of continuity as proved in Lemma 4.4 and compactness of ( , , ‖ ⋅ ‖ 2,0 ) as shown in Lemma 4.1. □

Universal portfolio
To define the analog of Cover's/Jamshidian's portfolio in the present setting, let be a Borel probability measure on ( , , ‖ ⋅ ‖ 2,0 ). Consider the map where is given by (40). Define now on ( , , ‖ ⋅ ‖ ∞ ) a Borel probability measure via the pushforward = * . As in Definition 2.4, we then define the corresponding universal portfolio via Analogous to (14), the value of the universal portfolio is given by Remark 4.6. More precisely, we need to verify that the universal portfolio still allows for pathwise integration and that the value of the portfolio (as a pathwise integral) is given by the right-hand side of (46). These claims can be easily checked using the definitions and results in Schied et al. (2016), so we omit the details.

Functionally generated log-optimal portfolios
By definition, the log-optimal portfolios requires a stochastic model for the market weights. We suppose that = ( 1 , … , ) ≥0 follows a time-homogeneous Markovian Itô diffusion, defined on (Ω,  , ( ) ≥0 , ℙ) with values in Δ , given by where √ ⋅ denotes the matrix square root, is a -dimensional Brownian motion, is a Borel measurable function from Δ → ℝ , and is a Borel measurable function from Δ → + , satisfying The requirements in (49) are necessary to guarantee that the process lies in Δ . Note that ( ) ≥0 given by (47) satisfies the so-called structure condition (see Schweizer, 1995) (because of (48) and the fact that the drift part is of form ∫ 0 ( ) ( ) ). This structural condition characterizes the condition of "no unbounded profit with bounded risk" (NUPBR) in the case of continuous semimartingales (see, e.g., Hulley & Schweizer, 2010). In this setting, the proportions of current (relative) wealth invested in each of the assets are described by processes in the following set: where the process is defined componentwise by = ∫ 0 . Here, denotes the hyperplane corresponding to portfolio weights that are not necessarily long-only, that is, = { ∈ ℝ | ∑ =1 = 1}. Note that the set  , is clearly a subset of long-only strategies in Π. The relative wealth process satisfies In contrast to Section 4.1, this is a usual stochastic integral because we are dealing with general integrands . Note that we can also write = exp , where, for two vectors , ∈ ℝ , ∕ always denotes the componentwise quotient ( 1 1 , … , ).
Next we consider the log-optimal portfolio defined by (15) (but in continuous time now). As in Fernholz and Karatzas (2010b, section 3.1), we derive the ratio of two wealth processes and for , ∈ Π. Using (51) (for the processes and ) and Itô's lemma, this ratio is given by The finite variation part of the expression vanishes for every ∈ Π if we choose ∈ Π such that ( ) By passing from the scaled relative weights ∕ to ordinary portfolio weights via Fernholz and Karatzas (2010b, equation (5)), the generic solution of (53), which we denote bŷ, 8 is given bŷ Let̂be the associated wealth process. From (53), the ratio ∕̂is, for any ∈ Π, a nonnegative local martingale and therefore a supermartingale. Hencêyields the relative wealth process corresponding to the log-optimal portfolio (see, e.g., Fernholz & Karatzas, 2010b;Karatzas & Kardaras, 2007). Indeed, by the supermartingale property and Jensen's inequality Thus [log( )] ≤ [log(̂)] for all ∈ Π. By (52), the expected value of the log-optimal portfolio is given by So far we have optimized over all strategies in Π. In the sequel, we shall mainly consider suprema taken over smaller sets, in particular over  , . Note that in this case the optimizer will still be a function of the market weights due to the Markov property of ( ) ≥0 .
In this context, let us also answer the question of when the log-optimal portfolio is functionally generated. This is needed to relate its asymptotic growth rate to the one of the best retrospectively chosen portfolio and the universal portfolio.
Proposition 4.7. Let $(\mu_t)_{t \ge 0}$ be of the form (47). Then the log-optimal portfolio $\hat{\pi}$ is generated by a differentiable function $G$, that is, if the drift characteristic $\lambda$ satisfies

Proof. The assertion follows from expression (54). □

Asymptotic growth rates for an ergodic market weights process
Assumption 4.8. The process $\mu$ as given in (47) is an ergodic process with a stationary measure on $\Delta^d$.
With this assumption we derive an expression for the asymptotic growth rate $\lim_{T \to \infty} \frac{1}{T} \log V_T^{\pi}$. For the precise notion of ergodicity in continuous time, we refer to Eberle (2016, section 2.2., theorem 2.4, and section 2.2.3). Assumption 4.8 is essentially satisfied under a mean reversion condition. Examples include polynomial models for the market weights staying in the interior of the simplex (see Cuchiero, 2019, theorem 5.1), with the subclass of volatility stabilized models. In the following theorem, we consider portfolio maps that are not necessarily long-only, but can take values in the hyperplane $\{x \in \mathbb{R}^d \mid \sum_{i=1}^d x^i = 1\}$.
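The role of ergodicity can be illustrated with a small simulation. The sketch below assumes a hypothetical two-state Markov chain in discrete time as a stand-in for the market weights, with a state-dependent log-growth increment; it is not the diffusion (47). The names `P_SWITCH` and `GROWTH` and all numbers are illustrative assumptions. The time average of the log-wealth is compared with the integral of the growth increment against the stationary law.

```python
import random

# Hypothetical two-state ergodic chain standing in for the market-weight process.
P_SWITCH = {0: 0.3, 1: 0.2}       # probability of leaving each state in one step
GROWTH = {0: 0.010, 1: -0.005}    # log-growth increment earned in each state


def stationary_mean():
    """Growth increment integrated against the stationary law of the chain."""
    # For a two-state chain the stationary mass of state 0 is p10 / (p01 + p10).
    pi0 = P_SWITCH[1] / (P_SWITCH[0] + P_SWITCH[1])
    return pi0 * GROWTH[0] + (1.0 - pi0) * GROWTH[1]


def time_average(n_steps, seed=0):
    """Simulate the chain and return (1/n) * log-wealth after n steps."""
    rng = random.Random(seed)
    state, log_wealth = 0, 0.0
    for _ in range(n_steps):
        log_wealth += GROWTH[state]
        if rng.random() < P_SWITCH[state]:
            state = 1 - state
    return log_wealth / n_steps


rate = time_average(200_000)
limit = stationary_mean()
print(rate, limit)
```

The simulated rate should be close to the stationary average, mirroring how the ergodic theorem turns the time average of the log-wealth into a spatial integral in the continuous-time argument.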
The proof of Theorem 4.9 relies on the following lemma which is stated and proved in Fernholz (2002, lemma 1.3.2).
Proof of Theorem 4.9. Let us start by proving statement (i). By (51), $\log V_T^{\pi}$ reads as The local martingale part Multiplying the left-hand side with $(\log \log T)/T$, therefore yields Condition (56) and Hence, invoking again the ergodic theorem yields $\mathbb{P}$-a.s. (and also in $L^1(\Omega, \mathcal{F}, \mathbb{P})$) and thus assertion (i). Concerning statement (ii), note from (53) that the scaled relative weights corresponding to the log-optimal portfolio satisfy ( ) Thus, by (57) and (51), $\log \hat{V}_T$ simplifies to In this case, we have which yields by the same argument as above

Asymptotically equivalent growth rates
As in discrete time, we will establish asymptotic equality of the growth rates of all three portfolio types introduced in Section 4.2. First, we compare the best retrospectively chosen portfolio with the universal one. For an analogous result in the context of optimal arbitrage, see theorem 4.5 of Kardaras and Robertson (2012).
Theorem 4.11. Consider a probability measure $\nu$ on $\mathcal{G}^{M,\alpha}$ with full support and set = * with defined in (44). Then where * , , and , ( ) are defined in (43) and (46), respectively.
Proof. The inequality "≥" is obvious. For the converse inequality, we proceed similarly as in the previous section (using only generating functions). As $\nu$ has full support and $\mathcal{G}^{M,\alpha}$ is compact, for every $\delta > 0$ there exists some $\eta > 0$ such that every $\delta$-neighborhood of a point $G \in \mathcal{G}^{M,\alpha}$ has $\nu$-measure bigger than $\eta$. Let $T \ge 1$ and denote by $G^{*}$ the optimizer from Proposition 4.5. Consider now a generating function $G$ such that $\|G - G^{*}\|_{C^{2,0}} \le \delta$. Then an estimate follows from (A9). Fix $\varepsilon > 0$ and invoke assumption (58). Denote by $B = B_{\delta}(G^{*})$ the $\|\cdot\|_{C^{2,0}}$-ball with radius $\delta$ in $\mathcal{G}^{M,\alpha}$, which has $\nu$-measure at least $\eta > 0$, where $\eta$ only depends on $\delta$. We may then estimate using Jensen's inequality and (59). Letting $T \to \infty$ for any given $\varepsilon$ (which determines $\delta$ and in turn $\eta$) yields the assertion. □

To compare the asymptotic performance with that of the log-optimal portfolio, we optimize over portfolio maps in $\mathcal{G}^{M,\alpha}$ and suppose henceforth that $(\mu_t)_{t \ge 0}$ is of the form (47). Under Assumption 4.8 and using Theorem 4.9, define $\hat{G}^{M,\alpha}$ as the arg max in (60) and the corresponding wealth process $(\hat{V}_t^{M,\alpha})_{t \ge 0}$ as the wealth process generated by $\hat{G}^{M,\alpha}$, whenever $\hat{G}^{M,\alpha}$ is well-defined. Since the same $\hat{G}^{M,\alpha}$ is the optimizer for all $T > 0$, $\hat{V}^{M,\alpha}$ corresponds to the log-optimal portfolio among functionally generated portfolios with generating function in $\mathcal{G}^{M,\alpha}$.
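The averaging step in the comparison of the universal portfolio with the best retrospectively chosen one has a transparent analogue in Cover's classical discrete-time setting. The sketch below is an illustration under stated assumptions, not the construction of this section: it uses constant rebalanced portfolios on a finite grid of `N` weights (instead of generating functions in a compact set), a uniform mixing measure, and hypothetical price relatives that are drawn once and then treated as a fixed trajectory. Because each grid point carries mass 1/N, the mixture wealth is at least the best wealth divided by N, so the gap in growth rates is at most log(N)/T.

```python
import math
import random


def crp_wealth(w, relatives):
    """Wealth of the constant rebalanced portfolio holding fraction w in asset 1."""
    v = 1.0
    for x1, x2 in relatives:
        v *= w * x1 + (1.0 - w) * x2
    return v


rng = random.Random(1)
T = 500
# Hypothetical price relatives for two assets; after being drawn they play the
# role of a fixed deterministic trajectory.
relatives = [(rng.uniform(0.9, 1.12), rng.uniform(0.95, 1.06)) for _ in range(T)]

N = 101
grid = [k / (N - 1) for k in range(N)]
wealths = [crp_wealth(w, relatives) for w in grid]

best = max(wealths)            # best retrospectively chosen portfolio on the grid
universal = sum(wealths) / N   # wealth of the uniform mixture over the grid

# Difference of growth rates and the bound implied by the mass 1/N of each point.
gap = (math.log(best) - math.log(universal)) / T
bound = math.log(N) / T
print(gap, bound)
```

The inequality between `gap` and `bound` holds for every trajectory, which is the model-free mechanism behind the result; in the proof above, the mass of a small ball around the optimizer plays the role of 1/N.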
Theorem 4.12. Let $M, \alpha > 0$ be fixed and let $(\mu_t)_{t \ge 0}$ be a stochastic process of the form (47) satisfying Assumption 4.8. Moreover, suppose that Consider a probability measure $\nu$ on $\mathcal{G}^{M,\alpha}$ with full support and set = * with defined in (44).
In particular, holds ℙ-a.s. Due to (61), we can now apply Theorem 4.11 which implies the first equality in (63). Moreover, we have by the definition of * , , for each fixed the inequality , ℙ-a.s.
Using (64) On the other hand, by the definition of $(\hat{V}_t^{M,\alpha})_{t \ge 0}$ as log-optimizer within the class $\mathcal{G}^{M,\alpha}$ [ log holds. Concerning the first inequality, note that the universal portfolio used to build the wealth , ( ) is given by (45). By the time-homogeneous Markov property it is thus sufficient to dominate the left-hand side of (67) by taking the supremum over elements in $\mathcal{G}^{M,\alpha}$. Combining now (67) , $\mathbb{P}$-a.s., where the first inequality follows from Fatou's lemma. From this, we see that is $\mathbb{P}$-a.s. constant and equal to $\lim_{T \to \infty} \frac{1}{T} \log(\hat{V}_T^{M,\alpha})$. Hence the assertion is proved. □

As in the previous section, we can formulate a result that does not depend explicitly on the constant $M$. Setting $\alpha = 1$, we choose for $M = 1, 2, 3, \ldots$ a measure $\nu_M$ on $\mathcal{G}^{M,1}$ with full support. Define $\nu = \sum_{M=1}^{\infty} 2^{-M} \nu_M$ and the process ( ) by In order to compare the performance with that of the global log-optimal portfolio, whenever it is functionally generated, we combine the above results with Proposition 4.7.
Indeed, this simply follows from the continuity of  → asserted in Lemma 4.4 and by choosing $G \in \mathcal{G}^{M,1}$ close enough with respect to $\|\cdot\|_{C^{2,0}}$ to the optimizing function $\hat{G} \in C^2(\Delta^d)$ whose generated portfolio yields $\hat{V}$ due to (68) and Proposition 4.7. By Theorem 4.12, we can therefore conclude (following the proof of Corollary 3.11) that By the considerations of Section 4.2.3 (see also Becherer, 2001, proposition 4.3), it follows that ( ( ) ) ≥0 is a nonnegative supermartingale. It converges $\mathbb{P}$-a.s. to a finite limit as $T \to \infty$. This in turn implies (72) and proves the statement. □

Finally, a similar result can be obtained by restricting the log-optimal portfolio to the class of $C^2$-functionally generated portfolios without imposing the drift condition in Proposition 4.7. We consider the wealth process of the log-optimal portfolio among concave $C^2$-functionally generated portfolios, that is, the process defined as in (60) but with the arg max taken over all concave $C^2$-functionally generated portfolios.