Copula measures and Sklar's theorem in arbitrary dimensions

Although copulas are used and defined for various infinite-dimensional objects (e.g., Gaussian processes and Markov processes), there is no prevalent notion of a copula that unifies these concepts. We propose a unified functional analytic framework, show how Sklar's theorem can be applied in certain examples of Banach spaces, and provide a semiparametric estimation procedure for second-order stochastic processes with an underlying Gaussian copula.


Introduction
The investigation of linear and nonlinear dependence structures between the elements of an arbitrary family of random objects is inherent in many problems, ranging from modelling dependence within Markov processes (see e.g. [13], [22], [18]), Gaussian processes, and general processes with continuous marginals (see e.g. [35]), to the modelling of dependence between semimartingale processes (see e.g. [20], [5]) or the components of a random measure (see e.g. [28]). One of the most powerful tools, capturing the whole structure of statistical dependence for a finite collection of real-valued random variables, is the copula.
Copulas are cumulative distribution functions with uniform marginals, which can generally be interpreted as the dependence structure separated from the laws of the marginals by virtue of Sklar's Theorem. The theory of copulas is rather well developed in the finite-dimensional case (see [24] for an introduction to the topic). In this paper we develop a general theory of copulas in infinite dimensions.
Relying on the one-to-one correspondence of probability measures and distribution functions in finite dimensions, we introduce copulas as probability measures with uniform marginal distributions on product spaces. That is, copulas are treated as laws of families (U_i)_{i∈I}, where I is an arbitrary index set and the U_i ∈ L^0(Ω; R) are real-valued, uniformly distributed random variables on a probability space (Ω, F, P).
We prove Sklar's theorem in this general setting: the first part of this theorem states that each probability law on R^I possesses an underlying copula measure (representing its dependence structure), whereas the second part enables us to merge any copula measure with a freely chosen family of one-dimensional marginals into a law on the product space (with the copula measure as the specified dependence structure). The framework suggested here is well suited for the general setup of real-valued stochastic processes. Nevertheless, with an eye towards applications, e.g. numerical approximations or (functional) data analysis, it is relevant to have sufficient knowledge about copula constructions in more structured spaces, such as function spaces.

The paper is organised as follows. We describe the basic framework of copulas in product spaces and prove Sklar's Theorem in section 2. Section 3 is devoted to copula constructions in function spaces: in subsection 3.1 we introduce a general framework for marginals in measurable vector spaces and describe the abstract construction problem, and subsection 3.2 presents criteria to overcome the latter in various function spaces. Finally, section 4 provides distance estimates for the copula construction: in subsection 4.1 we describe the connection of copulas to Wasserstein spaces, and in subsection 4.2 we derive an estimate of the L^p(T) distance of two processes in terms of the difference of the underlying copulas and the one-dimensional Wasserstein distances of their marginals.
Notation. For any measure μ on a measurable space (B, B) and a measurable function f : (B, B) → (A, A) into another measurable space (A, A), we denote by f_*μ the pushforward measure with respect to f, given by f_*μ(S) := μ(f^{-1}(S)) for all S ∈ A. If B = R^I, where I is an arbitrary index set, and B = ⊗_{i∈I} B(R), we use the shorter notations (π_J)_*μ =: μ_J for a subset J ⊆ I and (π_{i})_*μ =: μ_i for an element i ∈ I, where π_J denotes the projection onto R^J. If J ⊆ I is finite, we denote the corresponding finite-dimensional cumulative distribution functions by F_{μ_J} or F_{μ_i}, respectively. We will frequently refer to the one-dimensional distributions μ_i, i ∈ I (equivalently F_{μ_i}, i ∈ I) as the marginals of the measure μ. Throughout the paper all random variables are considered on a complete probability space (Ω, F, P), and we write L^0(Ω, F; A, A) =: L^0(Ω; A) for the measurable functions f : (Ω, F) → (A, A), i.e., A-valued random variables.

Copulas and Sklar's Theorem in Infinite Dimensions
Following the natural interpretation of copulas as measures in finite dimensions (see section A for a short treatment of copulas in finite dimensions), we suggest defining the concept along the same lines also in infinite dimensions:

Definition 2.1. A copula measure (or simply copula) on R^I is a probability measure C on ⊗_{i∈I} B(R) such that its marginals C_i are uniformly distributed on [0, 1].
For finite index sets I the notions of measures and cumulative distribution functions are in one-to-one correspondence, which is why in this case a copula measure C can be uniquely identified with the copula F_C in the classical sense of Definition A.2. For the same reason the finite-dimensional distributions C_J of an infinite-dimensional copula measure C correspond uniquely to copulas F_{C_J} in the familiar finite-dimensional sense.
We also introduce the important notion of copula processes.
Definition 2.2. We call a random variable U ∈ L^0(Ω; R^I) with uniform marginals on [0, 1] a copula process. That is, the law of a copula process is a copula measure.
Since for each copula measure C we can find a probability space carrying a copula process with law C, the notion of copulas is in one-to-one correspondence with that of copula processes.
As in finite dimensions, the most important result for the use of copulas is Sklar's Theorem:

Theorem 2.3 (Sklar's Theorem). Let I be an index set and let μ be a probability measure on ⊗_{i∈I} B(R) with one-dimensional marginal distributions μ_i, i ∈ I. Then there exists a copula measure C such that for each finite subset J ⊆ I we have

(2.1)  F_{μ_J}((x_i)_{i∈J}) = F_{C_J}((F_{μ_i}(x_i))_{i∈J})  for all (x_i)_{i∈J} ∈ R^J.

Vice versa, let C be a copula measure on R^I and let (μ_i)_{i∈I} be a collection of (one-dimensional) Borel probability measures on R. Then there exists a unique probability measure μ on ⊗_{i∈I} B(R) such that (2.1) holds.
In the following proof and in the rest of the paper we often use, for a one-dimensional Borel measure μ_i on R, the notation

(2.2)  F_{μ_i}^{[-1]}(u) := inf{x ∈ R : F_{μ_i}(x) ≥ u},  u ∈ (0, 1),

for the quantile function.

Proof. To prove the first part, let (X_i)_{i∈I} be a random vector having μ as its law. Let U be a standard uniformly distributed real-valued random variable on the same probability space, such that U is independent of (X_i)_{i∈I}. For a one-dimensional distribution function we denote its left limit by F_{μ_i}(x−) := lim_{y↑x} F_{μ_i}(y). Define the distributional transform process (U_i)_{i∈I} by

U_i := F_{μ_i}(X_i−) + U (F_{μ_i}(X_i) − F_{μ_i}(X_i−)),  i ∈ I,

and let C be the law of (U_i)_{i∈I}. Since each U_i is uniformly distributed on [0, 1] and the finite-dimensional laws C_J fulfil (2.1) by Theorem A.4, C is the copula measure we looked for. Observe that in the case of continuous marginals all finite-dimensional marginals of C are uniquely determined by the unique copulas of the finite-dimensional laws of μ induced by Sklar's Theorem in finite dimensions.
To prove the other direction of Sklar's Theorem, observe that the pushforward measure

(2.3)  μ := ((F_{μ_i}^{[-1]})_{i∈I})_* C

has the desired properties. To see this, we just have to verify that μ has the finite-dimensional distributions induced by (2.1). Observe that, for all i ∈ I, by the monotonicity of the cumulative distribution functions we have, for all x ∈ (−∞, ∞) and u ∈ (0, 1),

F_{μ_i}^{[-1]}(u) ≤ x  if and only if  u ≤ F_{μ_i}(x).

Thus, for J ⊆ I finite, we have for all (x_i)_{i∈J} ∈ R^J

F_{μ_J}((x_i)_{i∈J}) = C({u : F_{μ_i}^{[-1]}(u_i) ≤ x_i for all i ∈ J}) = C_J(∏_{i∈J} [0, F_{μ_i}(x_i)]) = F_{C_J}((F_{μ_i}(x_i))_{i∈J}).

This concludes the proof.
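The distributional transform used in the proof can be checked numerically. The following is a minimal sketch (assuming NumPy; the discrete toy marginal is an illustrative choice): even for a distribution with atoms, the transform F(X−) + V·(F(X) − F(X−)) with an independent uniform V produces a uniform variable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discrete marginal with atoms at 0, 1, 2.
x_support = np.array([0.0, 1.0, 2.0])
probs = np.array([0.2, 0.5, 0.3])
cdf = np.cumsum(probs)                        # F(x) at the atoms
cdf_left = np.concatenate(([0.0], cdf[:-1]))  # left limit F(x-) at the atoms

n = 200_000
idx = rng.choice(3, size=n, p=probs)  # sample X
v = rng.uniform(size=n)               # independent uniform V

# distributional transform: U = F(X-) + V * (F(X) - F(X-))
u = cdf_left[idx] + v * (cdf[idx] - cdf_left[idx])

mean_u = u.mean()  # uniform on [0,1] has mean 1/2 ...
var_u = u.var()    # ... and variance 1/12
```

Despite X taking only three values, the transformed sample is uniform on [0, 1], which is exactly what makes the first part of Theorem 2.3 work without any continuity assumption on the marginals.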
Remark 2.4. If I is a finite set, Theorem 2.3 coincides with Sklar's Theorem A.3 in finite dimensions, identifying the copula measure uniquely with its corresponding cumulative distribution function.
Remark 2.5. From the proof above it follows that, for a copula measure C on R^I and a collection of marginals (μ_i)_{i∈I}, the pushforward measure in (2.3) is a probability measure μ on R^I with this underlying copula and these marginals.
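The pushforward of Remark 2.5 translates directly into a sampling recipe: draw from the copula, then apply the quantile transforms. A minimal sketch, assuming NumPy; the bivariate Gaussian copula with correlation 0.7 and the exponential/uniform marginals are illustrative choices:

```python
import numpy as np
from math import erf

rng = np.random.default_rng(1)

def phi_cdf(z):
    # standard normal cdf via the error function
    return 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))

rho = 0.7
n = 100_000
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
u = phi_cdf(z)                   # copula process: both columns uniform

# quantile transforms F^{[-1]}: Exp(1) and Uniform[0, 2] marginals
x1 = -np.log(1.0 - u[:, 0])
x2 = 2.0 * u[:, 1]

mean_x1 = x1.mean()              # Exp(1) has mean 1
mean_x2 = x2.mean()              # Uniform[0, 2] has mean 1
rank_corr = np.corrcoef(u[:, 0], u[:, 1])[0, 1]
```

The rank dependence (correlation of the uniforms) survives the change of marginals, which is the point of separating copula and marginals.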
In [12] the authors used the notion that two laws μ and ν on (R^I, ⊗_{i∈I} B(R)) have the same dependence structure if there exist two stochastic processes X = (X_i)_{i∈I} and Y = (Y_i)_{i∈I} such that X ∼ μ and Y ∼ ν on the same probability space (Ω, F, P) and X_i and Y_i are similarly ordered (written X_i s.o.∼ Y_i) for all i ∈ I.
In finite dimensions this notion is equivalent to the existence of a common underlying copula by virtue of Sklar's Theorem A.3. This remains valid in infinite dimensions, as we show next. Later, this fact will play a crucial role in transferring the theory of optimal couplings of stochastic processes treated in [12] to our copula setting and thereby in proving approximation results in section 4.
Lemma 2.6. Two probability measures μ and ν on ⊗_{i∈I} B(R) have a common underlying copula measure in the sense of (2.1) if and only if there exist random variables X ∼ μ and Y ∼ ν on a common probability space such that X_i s.o.∼ Y_i for all i ∈ I.

Proof. Let μ and ν have the same underlying copula measure C and let U ∼ C be a corresponding copula process. Define, with the notation introduced in (2.2), the random variables X := (F_{μ_i}^{[-1]}(U_i))_{i∈I} and Y := (F_{ν_i}^{[-1]}(U_i))_{i∈I}. By construction, and analogously to the proof of Sklar's Theorem 2.3, we obtain X ∼ μ and Y ∼ ν, and X_i and Y_i are similarly ordered for each i ∈ I, since both are nondecreasing transformations of the same uniform variable U_i.

Vice versa, let X ∼ μ and Y ∼ ν be two random variables such that X_i s.o.∼ Y_i for all i ∈ I. Then, by Proposition 2.1 in [12], for each i ∈ I there exists a uniformly distributed random variable U_i with X_i = F_{μ_i}^{[-1]}(U_i) and Y_i = F_{ν_i}^{[-1]}(U_i) (observe that the proof of this assertion does not need second moments, as stated in Remark 1 in [12]). If C is the law of U = (U_i)_{i∈I}, we obtain μ = ((F_{μ_i}^{[-1]})_{i∈I})_* C and ν = ((F_{ν_i}^{[-1]})_{i∈I})_* C. This shows that X and Y have the same underlying copula measure C.

The following examples review some existing concepts of copulas which can be embedded into our framework:

Example 2.7 (Complete dependence and independence copulas). The complete dependence copula measure on R^I is the law corresponding to the consistent family of finite-dimensional cumulative distribution functions given by M_J((u_j)_{j∈J}) = min_{j∈J} u_j. Observe that its finite-dimensional distribution functions are the Fréchet–Hoeffding upper bounds for the corresponding finite-dimensional copulas, that is, for all finite J ⊆ I and any copula C on R^I we have

F_{C_J}((u_j)_{j∈J}) ≤ M_J((u_j)_{j∈J})  for all (u_j)_{j∈J} ∈ [0, 1]^J.

The independence copula measure on R^I is the law of the consistent family of finite-dimensional cumulative distribution functions given by Π_J((u_j)_{j∈J}) = ∏_{j∈J} u_j.
Example 2.8 (Inversion method and Gaussian copulas). Given a law μ with continuous marginals F_{μ_i}, i ∈ I, the underlying copula measure C induced by Sklar's Theorem 2.3 is given by its finite-dimensional cumulative distribution functions, for each finite J ⊆ I, via

F_{C_J}((u_j)_{j∈J}) = F_{μ_J}((F_{μ_j}^{[-1]}(u_j))_{j∈J}),  (u_j)_{j∈J} ∈ [0, 1]^J.

This method is known as the inversion method (see e.g. [24]). In this way we can derive, for instance, the copula measures underlying a Gaussian process (that is, each μ_J is Gaussian), which are called Gaussian copulas. Infinite-dimensional Gaussian copulas were applied, for example, in [35] in a machine learning context.
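The inversion method can be illustrated on a Brownian motion, whose marginals are continuous. A minimal numerical sketch, assuming NumPy (the two time points are illustrative): with U_t = Φ(B_t/√t), the copula process has uniform marginals, and the underlying Gaussian copula at times t1 < t2 has correlation √(t1/t2).

```python
import numpy as np
from math import erf

rng = np.random.default_rng(2)

def phi(z):
    # standard normal cdf via the error function
    return 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))

t1, t2 = 1.0, 4.0
n = 100_000
b1 = rng.normal(0.0, np.sqrt(t1), size=n)
b2 = b1 + rng.normal(0.0, np.sqrt(t2 - t1), size=n)  # independent increment

# inversion: U_t = F_t(B_t) = Phi(B_t / sqrt(t))
u1 = phi(b1 / np.sqrt(t1))
u2 = phi(b2 / np.sqrt(t2))

# correlation of the standardised pair equals sqrt(t1 / t2) = 0.5 here
corr = np.corrcoef(b1 / np.sqrt(t1), b2 / np.sqrt(t2))[0, 1]
```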
Example 2.9 (Archimedean copulas). Fix a continuous, strictly decreasing and convex function φ : [0, 1] → [0, ∞] such that φ(1) = 0, and let φ^{[-1]}, its pseudoinverse, be given by

φ^{[-1]}(t) := φ^{-1}(t) for t ∈ [0, φ(0)],  and  φ^{[-1]}(t) := 0 for t > φ(0).

Then the finite-dimensional laws of an Archimedean copula measure are given by

F_{C_J}((u_j)_{j∈J}) = φ^{[-1]}(∑_{j∈J} φ(u_j))

for each finite J ⊆ I. By definition, these probability measures are exchangeable, and it was shown in [11] that Archimedean copulas in infinite dimensions can be related to Dirichlet distributions.
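As a concrete Archimedean instance, the following sketch uses the Clayton generator φ(t) = (t^{−θ} − 1)/θ with θ = 2 (an illustrative choice; since φ(0) = ∞ here, the pseudoinverse coincides with the inverse), and checks the copula axioms numerically:

```python
# Clayton generator phi(t) = (t^{-theta} - 1) / theta, theta > 0,
# with pseudoinverse phi^{[-1]}(s) = (1 + theta * s)^{-1/theta}.
theta = 2.0

def gen(t):
    return (t ** (-theta) - 1.0) / theta

def gen_inv(s):
    return (1.0 + theta * s) ** (-1.0 / theta)

def clayton(u):
    """Finite-dimensional cdf F_{C_J}(u) = phi^{[-1]}(sum_j phi(u_j))."""
    return gen_inv(sum(gen(x) for x in u))

c_margin = clayton([0.3, 1.0])                      # C(u, 1) = u
c_sym = clayton([0.2, 0.7]) - clayton([0.7, 0.2])   # exchangeability
c_mid = clayton([0.5, 0.5])                         # below Frechet bound min(u, v)
```

Setting one coordinate to 1 drops it from the sum (φ(1) = 0), so the uniform-marginal property and the exchangeability of Example 2.9 are immediate.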

Copulas in Function Spaces
In this section we formulate a unified setting for the notion of copulas in the framework of vector spaces.
3.1. Marginals in Vector Spaces. Let V be a vector space over R and V a σ-algebra on this space. Recall that the algebraic dual of V is defined as the vector space Hom(V, R) of all linear maps from V to R.

Definition 3.1. Let X be a random variable on V and let M be a linearly independent set of measurable functions in Hom(V, R) that separates the points of V. Then we call the random variables (m(X) : m ∈ M) the M-marginals of X.
Observe that by the definition above we are able to embed the vector space framework into the framework of product spaces via the embedding

(3.1)  V → R^M,  v ↦ (m(v))_{m∈M},

which is necessary for the application of our copula theory. Some choices of M which are of practical importance are given in the sequel.
Example 3.2 (Marginals in finite dimensions). In the finite-dimensional case, that is V = R^d for some d ∈ N, M is necessarily of the form M = {⟨e_i, ·⟩ : i = 1, ..., d} for a basis e_1, ..., e_d of R^d, where ⟨·, ·⟩ denotes an inner product on R^d. In terms of finite-dimensional copula theory, the natural choice is the standard basis e_1 = (1, 0, ..., 0), ..., e_d = (0, ..., 0, 1).
Example 3.3 (Product space). It is possible to embed the product space setting from section 2 into the framework of measurable vector spaces: the product space V = R^I for some index set I becomes a measurable vector space if we equip it with the product σ-algebra ⊗_{i∈I} B(R). The projections (or evaluation functionals) π_j((v_i)_{i∈I}) := v_j for j ∈ I are measurable (even continuous) by definition, linearly independent, and separate the points. Thus, we can take M = {π_j : j ∈ I}. Observe that we can do this with every space of functions in which the evaluations are linearly independent. For instance, we can take the space of p-integrable functions over a set T, for some natural number p and a measure space (T, A, μ). Observe that in this setting we work in a space of functions rather than of equivalence classes; the reason is that point evaluations are not well defined on Banach spaces of equivalence classes. This serves also as motivation for subsection 3.2.1, where we construct copulas under these circumstances.
Example 3.4 (Path marginals for Banach spaces of functions). Let V be a separable Banach space of real-valued functions on a set T such that the evaluation functionals δ_t f := f(t) (or projections in terms of product spaces) are continuous, and let V = B(V) be the Borel σ-algebra with respect to the corresponding norm topology. In most of these settings the set of evaluations is linearly independent and, due to continuity, consists of measurable functionals. Important examples in this framework are the continuous functions V = C(T) and V = B_K, where B_K is a reproducing kernel Banach or Hilbert space in the sense of [36] or [3]. Observe that, if we revisited Example 3.3 allowing for general topological vector spaces V instead of Banach spaces, then Example 3.3 would be part of this framework.
Example 3.5 (Basis marginals). If V is a Banach space that possesses a Schauder basis (cf. Definition 3.22), we can take M = {f_n : n ∈ N}, where (f_n)_{n∈N} is the sequence of coefficient functionals of the Schauder basis. Examples of Banach spaces that possess such a basis are C([0, 1]), L^p([0, 1]), the sequence spaces l^p and, as a special case, all separable Hilbert spaces with orthonormal bases as Schauder bases. Note that in the latter case we are effectively in the setting of consistent copulas from [16].
Remark 3.6 (Marginals for nonlinear subspaces). Note that, if we are just interested in defining random variables on particular subsets of a vector space V, the set M need not be separating for all elements of V. One example is the construction of random probability measures, as a certain subset of random variables in the Banach space of signed measures on the real line. In this case it suffices to take M = {ν ↦ ν((−∞, x]) : x ∈ R}, that is, we identify a random probability measure with the corresponding random cumulative distribution function.
We will refer to the choice of marginals in Examples 3.3 and 3.4 as path marginals and the corresponding copulas in this framework as path copulas.In contrast, the corresponding constructions in Example 3.5 will be referred to as basis marginals and basis copulas.
Already in finite dimensions, due to different basis specifications, there is not just one choice for M. Unfortunately, copulas are not invariant under a change of the notion of marginals, as shown by the following example. Since the independence copula is simply the product of the one-dimensional uniform distributions and the distribution function of X_2 is the identity on [0, 1], one computes the copula of (X_1, X_2) with respect to {(1, 1), (0, 1)}-marginals, and analogously that of (Y_1, Y_2); the two do not coincide, in view of the distributional choices on the random variables. Thus, (X_1, X_2) and (Y_1, Y_2) do not possess the same copula with respect to {(1, 1), (0, 1)}-marginals, although they share the same copula with respect to {(1, 0), (0, 1)}-marginals.
If we want to construct a measure on a vector space V by virtue of the second part of Sklar's Theorem, the naive procedure now reads as follows:

Construction 3.8.
(i) Choose some set M which satisfies the conditions of Definition 3.1.
(ii) Choose a copula C on R^M (or a copula process (U_m)_{m∈M}) and one-dimensional distributions (μ_m)_{m∈M}, and merge them with Sklar's Theorem into a law μ (or a process) on ⊗_{m∈M} B(R).
(iii) (Construction Problem) Check if μ can be identified with a measure on V via the embedding (3.1).
As anticipated in the introduction, the third point does not necessarily have an affirmative answer. The choice of marginals and dependence structure in (ii) must be based on criteria that guarantee a solution to (iii), which is hereafter referred to as the construction problem for copulas in function spaces.
Consider now the following framework (which covers all the examples mentioned): V is a topological vector space, V = B(V) the corresponding Borel σ-algebra, and M a subset of the dual that satisfies the conditions of Definition 3.1. In addition, assume that each m ∈ M is continuous, that is, M ⊆ V*, where V* denotes the topological dual of V, given by the continuous linear functionals in Hom(V, R). Then Construction 3.8 induced by Sklar's Theorem effectively culminates in the construction of a cylindrical premeasure on that vector space (see for instance [9] or [30] for a treatment of cylindrical measure theory). In the case that V is even a separable Banach space and M ⊆ V*, we however have the following useful criterion for our setting:

Lemma 3.9. Let V be a separable Banach space. Assume that M ⊆ V* is a fundamental set with respect to the weak*-topology, that is, its linear span is weak*-dense. If the probability measure defined in Construction 3.8 is the law of a process X := (X_m)_{m∈M} such that X is almost surely in the range of the embedding (3.1), then it is the image of a Borel measurable random variable X̃ in V under this embedding.
Proof. If (X_m)_{m∈M} is almost surely in the range of the embedding, there is an Ω̃ ⊆ Ω with full measure and a random variable X̃ such that m(X̃(ω)) = X_m(ω) for all ω ∈ Ω̃. Since M is a fundamental set, for every v* ∈ V* there is a sequence of linear combinations of elements of M converging pointwise to v*, and hence v*(X̃) is measurable, since linear combinations and pointwise limits of measurable functions are measurable. We conclude that X̃ is a weakly measurable random variable on a separable Banach space and hence, by the Pettis theorem [26, Theorem 1.1], strongly measurable, that is, measurable with respect to the Borel σ-algebra.
Remark 3.10. Due to the existence of Hamel bases of V* and the Hahn–Banach Theorem (cf. Corollary 5.80 in [2]), the existence of a set M that satisfies the conditions in Definition 3.1 is always guaranteed in locally convex Hausdorff spaces.
We will concentrate in the next subsections on special cases of path and basis constructions which are adequate to solve the construction problem (for instance by virtue of Lemma 3.9), and which are therefore foremost of practical importance.

3.2.1. Path Copulas for p-Integrable Stochastic Processes. We describe in this section how the copula construction induced by Sklar's Theorem 2.3 works for the function space L^p(T, B(T), μ; R) =: L^p(T) for p ∈ N, a measurable set T ⊆ R^d with d ∈ N and a σ-finite measure μ. As mentioned in Example 3.3, we take M = {δ_t : t ∈ T}-marginals, that is, we identify a function f ∈ L^p(T) with all its function values (f(t))_{t∈T}. Moreover, we denote by [f] the corresponding equivalence class of functions coinciding almost everywhere with f, which forms an element in the Banach space of equivalence classes L^p(T).
For a stochastic process X = (X_t)_{t∈T} we say that it is measurable if the mapping (ω, t) ↦ X_t(ω) is measurable with respect to F ⊗ B(T).

Lemma 3.11. Let X = (X_t)_{t∈T} be a measurable stochastic process.
(a) Assume X has values in L^p(T) almost surely. Then X is a Borel measurable random variable in L^p(T) (with respect to the pseudometric induced by the L^p(T)-norm).

Proof. We verify that [X] is a Borel measurable random variable on the Banach space L^p(T). Due to the measurability of the process X, the integrals ∫_T |X_t|^p μ(dt) are indeed measurable. This shows (a).
To show (b), observe that by Fubini's theorem we have

E[∫_T |X_t|^p μ(dt)] = ∫_T E[|X_t|^p] μ(dt)

whenever one of the terms in this equation is finite. Using (a), this shows the assertion.
Lemma 3.11 yields the following simple construction of random variables X such that [X] ∈ L p (Ω; L p (T )): Construction 3.12.
(i) Specify a measurable copula process U = (U_t)_{t∈T}.
(ii) Define marginals (F_t)_{t∈T} with corresponding pth moments (m_t^p)_{t∈T}, such that (t, x) ↦ F_t(x) is jointly measurable and ∫_T m_t^p μ(dt) < ∞.
(iii) Construct the new process X with underlying copula process U and marginals (F_t)_{t∈T} via Sklar's Theorem 2.3, that is,

X_t := F_t^{[-1]}(U_t),  t ∈ T.

This process has values in L^p(T), and by Lemma 3.11, [X] is therefore an element of L^p(Ω; L^p(T)).
Notice that the interpretability of the underlying path copula of X becomes complicated if we transfer to the equivalence class [X]. Indeed, path copulas specify dependence between point evaluations of the random function, which are no longer well defined for equivalence classes. If one really wants to specify dependence between equivalence classes, one should approach this via the notion of basis marginals, as described in Subsection 3.2.4.
From a measure-theoretic point of view, Banach spaces in which evaluation functionals are well defined and continuous are favourable, and we will discuss this for spaces of continuous functions in the next subsection. Recall that a process X = (X_t)_{t∈T} with marginals (F_t)_{t∈T} is continuous in distribution if

(3.9)  lim_{s→t} F_s(x) = F_t(x)  for every t ∈ T and every continuity point x of F_t.

If we assume that all the marginals F_t, t ∈ T, are continuous (in x), (3.9) simplifies to the condition that

(3.11)  t ↦ F_t(x) is continuous for every x ∈ R.

In the latter case we have even joint continuity:

Lemma 3.13. Assume that the marginals F_t, t ∈ T, of an almost surely continuous process X = (X_t)_{t∈T} are continuous. Then (t, x) ↦ F_t(x) is jointly continuous. If the marginals are strictly increasing between the endpoints of their supports, then (t, y) ↦ F_t^{[-1]}(y) is jointly continuous as well.

Proof. Due to Lemma 21.2 from [33] we have that t ↦ F_t^{[-1]} is pointwise continuous. The proof then follows analogously to the arguments of the proof of Proposition 1 in [21].
Since processes with continuous sample paths are continuous in distribution, (3.9) (resp. (3.11)) forms a necessary condition on the marginals.

Theorem 3.14. Let X = (X_t)_{t∈T} be a stochastic process with sample paths that belong almost surely to C(T) and with continuous marginals F_t for all t ∈ T. Then U = (U_t)_{t∈T} defined by

U_t := F_t(X_t),  t ∈ T,

is a copula process for X and almost surely continuous on T. Vice versa, if U is a copula process that is almost surely continuous on T and the F_t, t ∈ T, are marginals which are continuous in distribution and strictly increasing between the endpoints of their supports, then Y = (Y_t)_{t∈T} defined by

Y_t := F_t^{[-1]}(U_t),  t ∈ T,

is a random variable which is almost surely in C(T), with marginals F_t and underlying copula process U. Moreover, if T is compact, Y is measurable with respect to the Borel σ-algebra on C(T).
Proof. Observe that, since we are in the case of continuous marginals, the process (U_t)_{t∈T} defined by U_t = F_t(X_t) for all t ∈ T is a copula process underlying X. Its continuity follows from Lemma 3.13 and the continuity of s ↦ X_s.
To show the second part, observe that (t, x) ↦ F_t^{[-1]}(x) is jointly continuous by Lemma 3.13. Hence Y is a random variable with values in C(T) almost surely, by the continuity of s ↦ U_s. Its Borel measurability for compact T follows from Lemma 3.9.
Remark 3.15. In principle a more abstract set T could be taken, as the precise structure of R^d is not used, but for convenience we stay in the Euclidean setting throughout this paper.

Remark 3.16. In the framework of stochastic processes, the initial value X_0 is often chosen to be deterministic; its distribution is therefore neither continuous nor strictly increasing. Possibly, for some processes, we can still define a continuous underlying copula (if the copula process has a limit from above at 0 which is uniformly distributed), but since this might be hard to check in general, it is reasonable to start the process a little later than at the origin.
Example 3.17.For some t 0 > 0 let X t = B t for t ∈ [t 0 , ∞) be a standard Brownian motion with sample paths in C(R + ).
We note that the copula of Brownian motion was investigated, for instance, in [31] in the framework of Markov copulas. Recall that, for a constant γ > 0, a function f : T → R is called locally γ-Hölder continuous if for each t ∈ T there is a neighbourhood N(t) of t in T and a constant K_t > 0 such that for all s, r ∈ N(t) we have

|f(s) − f(r)| ≤ K_t |s − r|^γ.

For a nonnegative integer k, γ ∈ (0, 1] and m ∈ N, we introduce the Hölder spaces C^{k,γ}(T; R^m) as the spaces of functions f : T → R^m which are continuously differentiable up to order k and whose kth derivative is locally γ-Hölder continuous.
Recall the following fact about locally Hölder continuous functions:

Lemma 3.18. Let I_1, ..., I_m be intervals, let f : I_1 × ... × I_m → R be locally γ-Hölder continuous and let g_i : T → I_i, i = 1, ..., m, be locally δ-Hölder continuous. Then t ↦ f(g_1(t), ..., g_m(t)) is locally γδ-Hölder continuous.

Proof. This is a special case of Theorem 4.3 in [23].
As a consequence of Lemma 3.18 we obtain the following immediately:

Corollary 3.19. Let X = (X_t)_{t∈T} be a process with locally δ-Hölder continuous paths and continuous marginals F_t, t ∈ T, such that (t, x) ↦ F_t(x) is locally γ-Hölder continuous. Let U denote the associated copula process given by U_t = F_t(X_t), t ∈ T. Then almost surely U has locally γδ-Hölder continuous paths. For a copula process U ∈ C^{k,γ}(T; R) almost surely and marginal cumulative distribution functions with correspondingly regular quantile transforms, the analogous statement holds for the transformed process.

By virtue of Corollary 3.19 we can determine the regularity of a copula process underlying a fractional Brownian motion:

Example 3.20. Assume that (U_t)_{t∈(t_0,∞)} is a copula process underlying a fractional Brownian motion (B_t^H)_{t∈[t_0,∞)} for some t_0 > 0 with Hurst parameter H ∈ (0, 1), that is, a centered Gaussian process with covariance function

Cov(B_s^H, B_t^H) = (1/2)(s^{2H} + t^{2H} − |t − s|^{2H}),  s, t ∈ [t_0, ∞).

The process (U_t)_{t∈(t_0,∞)} has locally H-Hölder continuous paths. To see this, we just have to verify the local H-Hölder continuity of (t, x) ↦ Φ_t^H(x), as stated in Corollary 3.19, where Φ_t^H denotes the cumulative distribution function of B_t^H. For s, t ∈ [t_0, ∞) we can estimate (with the constant c = 1/√(2π))

|Φ_t^H(x) − Φ_s^H(x)| ≤ c |x| |t^{−H} − s^{−H}|.

Analogously, for x, y ∈ R we get

|Φ_t^H(x) − Φ_t^H(y)| ≤ c t_0^{−H} |x − y|.

By the triangle inequality we obtain the joint local Hölder continuity of (t, x) ↦ Φ_t^H(x).
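The fBm covariance of Example 3.20 can be turned directly into a sampler. A minimal sketch, assuming NumPy; the value H = 0.3, the grid, and the Cholesky approach (with a small diagonal jitter for numerical stability) are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)

H = 0.3
t = np.linspace(0.1, 1.0, 50)  # start away from 0, as in the example

def fbm_cov(s, u):
    # Cov(B^H_s, B^H_u) = (s^{2H} + u^{2H} - |u - s|^{2H}) / 2
    return 0.5 * (s ** (2 * H) + u ** (2 * H) - np.abs(u - s) ** (2 * H))

K = fbm_cov(t[:, None], t[None, :])
L = np.linalg.cholesky(K + 1e-10 * np.eye(len(t)))  # jitter for stability
path = L @ rng.normal(size=len(t))                  # one fBm sample path

var_diag = np.diag(K)  # Var(B^H_t) = t^{2H}
```

The diagonal of K recovers the marginal variances t^{2H}, which are exactly the quantities matched by the exponential marginals in the next example.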
Example 3.21 (Exponential marginals and fBm copula). Several modelling situations (e.g., when modelling stochastic volatility, interest rates, etc. in financial mathematics) necessitate positive stochastic processes. It is simple to see that copula constructions can lead to well-interpretable alternative methods to model such processes, since we are free to put any continuous family of marginals onto a Gaussian process (this was for example suggested in [35]).
As a simple example, take exponential marginals of the form

G_t(x) := 1 − exp(−x/t^H),  x ≥ 0, t ∈ [t_0, ∞),

for some t_0 > 0 and a (Hurst) parameter H ∈ (0, 1), corresponding to the copula process (U_t)_{t∈T} of a fractional Brownian motion B^H (we take the parameter 1/t^H for the marginals to keep the same variance as the underlying fractional Brownian motion). By the smoothness of G_t^{-1}(y) = −t^H log(1 − y) we obtain that the transformed fractional Brownian motion has underlying Gaussian copula U, is γ-Hölder continuous for all γ < H, and has exponential marginals (with parameters 1/t^H).

In [15] it is argued empirically for lognormal marginals with a fractional Brownian motion copula for the stochastic volatility of asset prices. Our example shows that one can easily modify the marginals (to exponential, say, as in our example here), or to other positively supported distributions, in so-called rough volatility models of asset prices. Moreover, the flexibility of the copula framework also allows one to go beyond the specific dependency yielded by the copula induced by fractional Brownian motion.
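The variance matching in Example 3.21 can be checked on a single marginal. A minimal sketch, assuming NumPy (H and t are illustrative): pushing B_t^H ~ N(0, t^{2H}) through G_t^{-1} yields a positive, exponentially distributed variable whose variance t^{2H} equals that of B_t^H.

```python
import numpy as np
from math import erf

rng = np.random.default_rng(5)

H, t = 0.4, 2.0
scale = t ** H                     # mean of Exp(rate 1/t^H); std of B^H_t
n = 200_000

b = rng.normal(0.0, scale, size=n)                         # B^H_t marginal
u = 0.5 * (1.0 + np.vectorize(erf)(b / (scale * np.sqrt(2.0))))  # U_t
x = -scale * np.log(1.0 - u)       # G_t^{-1}(U_t): exponential transform

mean_x = x.mean()   # target t^H
var_x = x.var()     # target t^{2H} = Var(B^H_t)
```

The sample is nonnegative by construction, illustrating how the copula framework produces positive processes from a Gaussian dependence structure.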

Construction on Schauder Bases. In this section we characterize copula-constructed processes for random variables in Banach spaces with a Schauder basis. This includes the L^p([0, 1])-spaces (with the Haar wavelets as Schauder basis), C([0, 1]) (with the original Schauder basis), the l^p-sequence spaces and therefore, in particular, all separable Hilbert spaces (with an orthonormal basis as Schauder basis). For a detailed account of the theory of bases in Banach spaces we refer to [17].

Definition 3.22. A sequence (e_n)_{n∈N} ⊆ V of linearly independent vectors is called a basis of a locally convex Hausdorff space V if for all v ∈ V there is a unique sequence of coefficients (a_n(v))_{n∈N} ⊆ R such that

v = ∑_{n∈N} a_n(v) e_n,

where the series converges with respect to the locally convex topology on V. A basis of V* is called a weak*-basis of V* if it is a basis with respect to the weak*-topology. If V is a Banach space and v ↦ a_n(v) is continuous with respect to the norm topology for all n ∈ N, we call (e_n)_{n∈N} ⊆ V a Schauder basis.
The continuity of the coefficient functionals (a_n)_{n∈N} is automatically satisfied if V is a separable Banach space (see Theorem 3.1 in [32]). Note that every Banach space that possesses a basis is separable. However, for a separable Banach space the existence of a basis cannot be guaranteed, due to the counterexample by Enflo in [14]. For a Banach space with Schauder basis we can verify that the corresponding coefficient functionals are always contained in the topological dual:

Lemma 3.23. Let V be a Banach space with Schauder basis (e_n)_{n∈N} and coefficient functionals (a_n)_{n∈N}. Then {a_n : n ∈ N} ⊆ V* and for m, n ∈ N we have a_m(e_n) = δ_{mn}.

Proof. Linearity of the coefficient functionals is clear due to the uniqueness of the representation. Moreover, for the same reason, a_n(e_n) = 1 and a_m(e_n) = 0 for m ≠ n gives a valid series representation of e_n for all n ∈ N, and by uniqueness the assertion follows.
That the coefficient functionals are linearly independent and separate the points is a consequence of the following theorem:

Theorem 3.24. Let V be a Banach space. A sequence (a_n)_{n∈N} is a weak* Schauder basis of V* if and only if there exists a Schauder basis (e_n)_{n∈N} of V that has (a_n)_{n∈N} as its coefficient functionals. The coefficient functionals for the basis (a_n)_{n∈N} are then given by the bidual elements (ι_{e_n})_{n∈N}, where ι_v(v*) := v*(v) for v ∈ V and v* ∈ V*.

Proof. See Theorem 14.1 in [32].
Observe that for a Banach space with Schauder basis the corresponding set of coefficient functionals M = {a_n : n ∈ N} therefore satisfies the conditions of Definition 3.1. Thus, for the verification of Construction 3.8 we have the following:

Corollary 3.26. Let V be a Banach space with Schauder basis (e_n)_{n∈N} and let (X_n)_{n∈N} be a stochastic process. Then (X_n)_{n∈N} is the sequence of basis marginals of a Borel measurable random variable X in V if and only if the series ∑_{n∈N} X_n e_n converges in V almost surely.

Proof. This follows directly from Theorem 3.25 and Lemma 3.9.
Let us now describe how we can construct a probability measure on a Banach space with prescribed dependence structure and marginals for the basis components:

Construction 3.27.
(i) Let V be a Banach space with Schauder basis (e_n)_{n∈N} ⊆ V.
(ii) Choose a copula measure C on R^N (which models the dependency between the basis elements) and marginals (μ_n)_{n∈N}, and merge them into the law of a random sequence (X_n)_{n∈N} taking values in R^N via Sklar's Theorem 2.3.
(iii) Define X := ∑_{n∈N} X_n e_n.
(iv) Check if this sum converges in V almost surely (corresponding to Corollary 3.26).
For the verification of (iv) we obtain conditions on the moments of the marginals.
Corollary 3.28. Let X be given as in Construction 3.27 (iii). Then X ∈ L^1(Ω; V, B(V)) if the marginals have finite first moments and

∑_{n∈N} ‖e_n‖ ∫_R |x| μ_n(dx) < ∞.

Proof. This follows immediately from Corollary 3.26 and the triangle inequality.
In the case of sequence spaces we even obtain necessary and sufficient conditions to construct laws with finite moments of a certain order. Denote by (δ_n)_{n∈N} the standard basis of l^p.

Corollary 3.29. Let V = l^p and let X be given as in Construction 3.27 (iii). Then X ∈ L^p(Ω; l^p, B(l^p)) if and only if the marginals have finite pth moments and

∑_{n∈N} ∫_R |x|^p μ_n(dx) < ∞.

Proof. The standard basis (δ_n)_{n∈N} is the sequence of sequences which have components equal to zero everywhere except in the nth entry, where the component is 1. This defines a Schauder basis on l^p with coefficient functionals δ_i* given by δ_i*((x_n)_{n∈N}) = x_i. By monotone convergence we obtain

E‖X‖_{l^p}^p = E ∑_{n∈N} |X_n|^p = ∑_{n∈N} ∫_R |x|^p μ_n(dx).

The assertion follows.
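The moment criterion of Corollary 3.29 can be illustrated numerically for p = 2. A minimal sketch, assuming NumPy; the independence copula, the centred normal marginals with E|X_n|^2 = 2^{-n}, and the truncation level N are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(6)

# Marginals mu_n = N(0, 2^{-n}): sum_n E|X_n|^2 = sum_n 2^{-n} < infinity,
# so X = (X_n) lies in L^2(Omega; l^2) by Corollary 3.29 (independence copula).
N = 30  # truncation level for the illustration
sigmas = np.sqrt(2.0 ** -np.arange(1, N + 1))   # E X_n^2 = 2^{-n}

n_samples = 20_000
X = rng.normal(size=(n_samples, N)) * sigmas    # independent coordinates

second_moment = (X ** 2).sum(axis=1).mean()     # estimates E ||X||_{l^2}^2
target = (2.0 ** -np.arange(1, N + 1)).sum()    # = 1 - 2^{-30}
```

Marginals with E|X_n|^2 constant in n, by contrast, would violate the summability condition, and the series ∑ X_n δ_n would not define an l^2-valued random variable.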
Remark 3.30. Observe that Corollary 3.29 generalises Corollary 4.3 in [16], where the case of separable Hilbert spaces, that is p = 2, was considered and the notion of a Schauder basis reduces to the concept of an orthonormal basis.
The results derived above impose conditions only on the marginals, which makes them useful from a practical viewpoint. Still, the concept of copulas for random variables in the space L^p(Ω; l^p), or equivalently for laws in the Wasserstein space W_p(l^p) (see (4.1) below), is completely characterized by Corollary 3.29. We will obtain another characterization of copulas as underlying solutions to certain restricted optimization problems in these Wasserstein spaces in the next section.

4. Robustness of the Copula Construction
The previous section suggested that copula theory is well suited for the spaces L^p(T) := L^p(T, B(T), μ; R), with μ a finite Borel measure and T ⊂ R^d a compact interval, and for the sequence spaces l^p, due to the simple moment criteria that overcome the construction problem. In this section we provide distance estimates for random variables in these spaces in terms of their copulas and their marginals separately.
Hereafter we shorten the notation as follows: for a random variable X with values in E (where E equals l^p or L^p(T), respectively) we denote by F_X the operator acting componentwise, x ↦ (F_{X_n}(x_n))_{n∈N} (respectively f ↦ (F_{X_t}(f(t)))_{t∈T}) for all x ∈ l^p (and f ∈ L^p(T), respectively), and analogously for F_X^{[-1]}. Moreover, we use the notation U^X for the underlying copula process of X. For convenience we switch between the space L^p(T) of equivalence classes and the corresponding space of p-integrable functions whenever there is no confusion. If we say that an element [X] ∈ L^p(T) has underlying copula process U, we mean that there is a representative X of the corresponding equivalence class that has this path copula. We will also drop the equivalence-class notation from time to time to ease the writing (especially when we work with Wasserstein spaces in the next section) and just write X, whether we mean the equivalence class or the actual stochastic process.

4.1. Copulas and Wasserstein Spaces. In this subsection we characterize copulas for measures in Wasserstein spaces. A law ρ on E × E with marginal distributions ν_1 and ν_2 is called a coupling of ν_1 and ν_2. Recall that the p-Wasserstein space over a separable Banach space E is a complete separable metric space (see e.g. [34]) given by

W_p(E) := { ν probability measure on E : ∫_E ||x||_E^p dν(x) < ∞ },

equipped with the metric

W_p(ν_1, ν_2) := inf { ( ∫_{E×E} ||x − y||_E^p dρ(x, y) )^{1/p} : ρ a coupling of ν_1 and ν_2 }.

If there are two random variables X ∼ ν_1 and Y ∼ ν_2, we also say that (X, Y) is a coupling and write W_p(ν_1, ν_2) = W_p(X, Y). If E = R, the Wasserstein distance has the closed form (see e.g. Theorem 3.1.2 in [27])

W_p^p(ν_1, ν_2) = ∫_0^1 | F_{ν_1}^{[-1]}(u) − F_{ν_2}^{[-1]}(u) |^p du.   (4.2)

Theorem 4.1. Let X, Y be random variables in l^p (in L^p(T), respectively) for some p ∈ N. Then the following are equivalent:
(i) X and Y share the same underlying basis copula (path copula, respectively) C;
(ii) (F_X^{[-1]}(U), F_Y^{[-1]}(U)) is an optimal coupling of X and Y, where U ∼ C;
(iii) the Wasserstein distance between X and Y is given by W_p^p(X, Y) = Σ_{n∈N} W_p^p(X_n, Y_n) (respectively W_p^p(X, Y) = ∫_T W_p^p(X_t, Y_t) dμ(t)).
In particular, if one of the above holds, we have by (4.2) that W_p^p(X, Y) = Σ_{n∈N} ∫_0^1 |F_{X_n}^{[-1]}(u) − F_{Y_n}^{[-1]}(u)|^p du.

Proof. Since the proof for the L^p(T) case is analogous, we only show the assertion for l^p-valued random variables X and Y. Assume (i) holds. By Corollary 3.29, F_X^{[-1]}(U) and F_Y^{[-1]}(U) are measurable random variables taking values in l^p for a copula process U ∼ C. Moreover, they form a coupling, as a consequence of Sklar's Theorem 2.3. To show optimality, observe first that for X ∼ ν_1 and Y ∼ ν_2 we have

W_p^p(X, Y) ≥ Σ_{i∈N} W_p^p(X_i, Y_i).   (4.3)

This general lower bound on the Wasserstein distance is actually achieved in our case since, by (4.2), we obtain

E[ ||F_X^{[-1]}(U) − F_Y^{[-1]}(U)||_{l^p}^p ] = Σ_{i∈N} ∫_0^1 |F_{X_i}^{[-1]}(u) − F_{Y_i}^{[-1]}(u)|^p du = Σ_{i∈N} W_p^p(X_i, Y_i).

This shows (i) ⇔ (ii) and (i) ⇒ (iii). Since (ii) ⇒ (i) is trivial, it suffices to show (iii) ⇒ (i). Equality in (4.3) can only hold if there is an optimal coupling (X, Y) such that equality holds componentwise, so (X_i, Y_i) must also be an optimal coupling for all i ∈ N.
By Proposition 2.1 in [29] we obtain that X_i ∼_{s.o.} Y_i (i.e. X_i and Y_i are similarly ordered) for all i ∈ N. This implies (i) due to Lemma 2.6.

Remark 4.2. Observe that the implications (ii) ⇒ (i) and (iii) ⇒ (i) in Theorem 4.1 must be interpreted in the sense that there is always a representative of the equivalence classes that possesses the same path copula.
Remark 4.3. An analogous result to Theorem 4.1 was established in finite dimensions in [29] and then transferred to the infinite-dimensional setting via the equivalent formulation for two random variables with similarly ordered marginals (which, by Lemma 2.6, is equivalent to them sharing the same copula). However, since the proof there was not given explicitly and the notion of copulas in infinite dimensions was not used, we provided a proof.
Furthermore, the assertion of Theorem 4.1 does not hold for the q-Wasserstein distance over l^p (L^p(T), respectively) if q ≠ p, as was shown in [1] for the finite-dimensional case.
Remark 4.4. Theorem 4.1 is useful because the one-dimensional Wasserstein distance has the closed form (4.2). This expression can often be estimated rather well from above (see for instance Chapter 4.7 in [25] for a discussion of the convergence of empirical measures).
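Since the quantile function of an empirical measure is simply the sorted sample, the closed form (4.2) is easy to evaluate numerically. A small illustrative sketch (the function name and test distributions are our choices):

```python
import numpy as np

def wasserstein_1d(x, y, p=2):
    """W_p between two empirical measures with equally many atoms, via
    the quantile formula (4.2): sorting both samples and pairing order
    statistics realises the comonotone (i.e. optimal) coupling."""
    xs, ys = np.sort(x), np.sort(y)
    return np.mean(np.abs(xs - ys) ** p) ** (1.0 / p)

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 20000)
y = rng.normal(1.0, 1.0, 20000)

# For two normal laws N(m1, s^2) and N(m2, s^2) with equal variance the
# W_2 distance is |m1 - m2|, so the empirical value should be close to 1.
print(wasserstein_1d(x, y, p=2))
```

The same routine, applied coordinatewise (or pointwise in t) and summed, evaluates the series and integral representations of Theorem 4.1.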
Remark 4.5. The copula construction effectively solves, for E = l^p or E = L^p(T) respectively, the optimization problem of minimizing W_p(μ, ν) over all laws ν with a given family of marginals (ν_n)_{n∈N} (respectively (ν_t)_{t∈T}), and the minimizer is given by the law ν of (F_{ν_n}^{[-1]}(U_n))_{n∈N}, where U is the copula process underlying μ. Moreover, Theorem 4.1 implies the following.

Corollary 4.6. Let X, Y be stochastic processes with values in L^2(T) and let (e_n)_{n∈N} be an orthonormal basis of L^2(T). Then X and Y have the same basis copula if and only if X and Y have the same path copula.

4.2. A Robustness Inequality in L^p(T). In order to derive a distance estimate between random variables in L^p(T) := L^p(T, B(T), μ; R), for a finite Borel measure μ and T ⊂ R^d compact, based on the copula and the marginals separately, we impose a smoothness assumption on the marginal distribution functions.
Assumption 1. For all t, the marginals F_t are continuously differentiable and strictly increasing on (F_t^{[-1]}(0), F_t^{[-1]}(1)). Moreover, assume that for the corresponding densities f_t there is a measurable function (t, x) ↦ g_t(x) such that each g_t is ultimately monotone (see, e.g., [6]), that is, it is monotone on [m_t − x_0^t, m_t + x_0^t]^c for some x_0^t ∈ R_+, m_t ∈ R, with g_t bounded away from 0 on [m_t − x_0^t, m_t + x_0^t] by some λ > 0 (independent of t) and f_t(x) ≥ g_t(x) > 0 for all x ∈ (F_{Y_t}^{[-1]}(0), F_{Y_t}^{[-1]}(1)) and all t ∈ T, and there is a β ∈ (0, 1) such that the integrability condition (4.4) holds.

Observe that Assumption 1 is satisfied for Gaussian marginals:

Example 4.8. Assume that Y = W is a zero-mean continuous Gaussian process. Clearly, the densities f_t of W_t are ultimately monotone, and we can choose x_0^t = 0 and g_t = f_t. Then condition (4.4) holds, as for all β ∈ (0, 1) the relevant expectation evaluates to a constant multiple of σ_t^β, where the constant is a moment of a standard normally distributed random variable Z. Thus, since t ↦ σ_t^β is continuous, it is integrable over T and Assumption 1 holds.
Another example for which Assumption 1 holds is the following class of heavy-tailed marginals.
Example 4.9 (Regularly varying marginals). A measurable function h : R_+ → R_+ is regularly varying with tail index α ∈ R if

lim_{x→∞} h(λx)/h(x) = λ^α for all λ > 0.

We write h ∈ R(α). If α = 0, h is called slowly varying. A one-dimensional law given by its cumulative distribution function F is said to have regularly varying tails if the survival function F̄ := 1 − F is regularly varying. Let Y be a càdlàg stochastic process such that its marginals (F_t)_{t∈T} are continuously differentiable, strictly increasing, supported on R_+ and regularly varying with tail index −α_t for α_t > 0, where we assume that t ↦ α_t is continuous. Moreover, assume that the densities f_t are ultimately monotone on [0, y_0^t]^c for some y_0^t ∈ R_+ and jointly continuous in x and t. This enables us to use the monotone density theorem (cf. Theorem 1.7.2 in [6]) to conclude that f_t ∈ R(−(α_t + 1)). Hence, there are slowly varying functions l_t : R_+ → R_+ such that f_t(x) = x^{−(1+α_t)} l_t(x). For convenience, let us assume that (t, x) ↦ l_t(x) is bounded. By choosing β < min_{t∈T} α_t/(1 + α_t) we obtain −(1 − β)(1 + α_t) < −1, and by Karamata's theorem (cf. Proposition 1.5.10 in [6]) we can find x_0^t > y_0^t and some δ > 0 such that the required bound holds. Assume moreover that we can choose t ↦ x_0^t to be continuous (this is possible, for instance, if all l_t's are supported on a compact domain); hence, since f_t(x) > 0 for all x ∈ R_+, each f_t is bounded away from 0 on [−x_0^t, x_0^t] by some λ > 0. Moreover, the required uniformity holds by the continuity of t ↦ x_0^t. Thus, (4.4) in Assumption 1 is valid.

The next theorem gives an idea about the robustness of the copula construction.
Theorem 4.10. Let X = (F_{X_t}^{[-1]}(U_t^X))_{t∈T} and Y = (F_{Y_t}^{[-1]}(U_t^Y))_{t∈T} be càdlàg stochastic processes such that [X] ∈ L^p(Ω × T) for some p ≥ 1 and [Y] ∈ L^{p+ε}(Ω × T) for some ε > 0, and let the marginals F_Y of Y satisfy Assumption 1. Then for all q ≥ 1 there are constants K := K(β, q, p, ε, F_Y) and ρ := ρ(β, q, p, ε) such that the estimate (4.6) holds. The constants are given by

ρ := εqβ / ( p(p + ε)(q + β) − pqβ )   (4.7)

and by (4.8).

Proof. By the triangle inequality we have the decomposition (4.9). From Theorem 4.1 we know that (X, F_Y^{[-1]}(U^X)) is an optimal coupling, which controls the first summand. Let us now estimate the second summand. Set δ := 1 + p(q + β) − qβ. Then we can estimate, for γ = δ/(δ − 1), using Hölder's inequality. Now observe that, since Y_t and F_{Y_t}^{[-1]}(U_t^X) share the same distribution, and by the elementary inequality |x + y|^r ≤ 2^{r−1}(|x|^r + |y|^r) for r ≥ 1, we obtain (4.12). Appealing to the mean value theorem and once more to Hölder's inequality, we obtain (4.14).
We now show that the first factor is finite. Denote the corresponding random variables and choose x_0^t according to Assumption 1 such that g_t is ultimately monotone on [−x_0^t, x_0^t]^c, where without loss of generality m_t = 0. We can argue by continuity and monotonicity of cumulative distribution and quantile functions, as well as Assumption 1, that (4.15) holds. Without loss of generality we can assume g to be symmetric in the tails, that is, g_t(x) = g_t(−x) for |x| ≥ x_0^t. Hence, by Assumption 1 as well as the monotonicity and the symmetry of g, we obtain (4.16). Therefore, for any uniformly distributed U on [0, 1] we obtain (4.17). Thus, (4.15) and (4.17) imply (4.18). Combining (4.11), (4.12), (4.14) and (4.18) we obtain the asserted estimate. The proof is complete.

Remark 4.11. Although the marginals of Y must fulfil Assumption 1, the marginals of X can be chosen more freely: they neither have to be absolutely continuous nor need they satisfy a tail condition. For instance, this allows one to approximate a smooth law of Y with discrete marginal measures (e.g. empirical measures).
Remark 4.12. The parameters q, p, β and ε used in (4.6) compete in the following way: choosing a smaller p but a larger ε improves a potential convergence rate, since it makes the exponent ρ increase (however, this also lets the constant K grow larger). For the same reason we might wish to choose the largest possible value for β. The parameter q can be chosen so as to derive a good approximation rate for the copula processes (for instance via the next Theorem 4.13).
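These trade-offs can be read off directly from formula (4.7). A quick check, which also reproduces the exponent ρ = 1/3 obtained from the parameter choice of Example 4.15:

```python
from fractions import Fraction

def rho(beta, q, p, eps):
    """Exponent (4.7): rho = eps*q*beta / (p*(p + eps)*(q + beta) - p*q*beta)."""
    return (eps * q * beta) / (p * (p + eps) * (q + beta) - p * q * beta)

# Parameter choice of Example 4.15: p = 1, eps = 1, q = 2, beta = 2/3.
print(rho(Fraction(2, 3), 2, 1, 1))   # -> 1/3

# Remark 4.12: increasing eps (or decreasing p) enlarges rho, i.e. improves
# the potential convergence rate.
assert rho(Fraction(2, 3), 2, 1, 2) > rho(Fraction(2, 3), 2, 1, 1)
assert rho(Fraction(2, 3), 2, 2, 1) < rho(Fraction(2, 3), 2, 1, 1)
```

Exact rational arithmetic via `Fraction` avoids floating-point noise when comparing parameter choices.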
The next theorem is useful if the copula processes stem from other processes, as for instance Gaussian or elliptical copulas.

Theorem 4.13. Let F_Ỹ, F_X̃ be marginals with finite q-th moment for q ≥ 1 and define the corresponding processes accordingly. Assume that F_Ỹt is absolutely continuous and strictly increasing, and that the corresponding density function is bounded. In particular, the following holds.

Proof. Using the triangle inequality, we obtain a decomposition into two summands. For the first summand we have a bound by the mean value inequality. For the second summand we have, again by the mean value theorem, a corresponding bound. Moreover, since ||W_q^q(F_X̃t, F_Ỹt)||_{L^1(T)} = W_q^q(X̃, F_Ỹ^{[-1]}(U^X̃)), which by Remark 4.5 can be estimated as

W_q^q(X̃, F_Ỹ^{[-1]}(U^X̃)) ≤ W_q^q(X̃, F_Ỹ^{[-1]}(U^Ỹ)) = W_q^q(X̃, Ỹ) ≤ E[ ||X̃ − Ỹ||^q_{L^q(T)} ] = ||X̃ − Ỹ||^q_{L^q(T×Ω)},

the second assertion also follows.

Here, in the setting of Example 4.14 (see (4.19)), Ỹ = SV for some positive, real-valued random variable S with finite second moment and a Gaussian process V ∼ N(0, C) independent of S (see [7] for the exact description and the relation to finite-dimensional elliptical distributions). First, observe that without loss of generality we can assume V_t ∼ N(0, 1) for every t, since the process (S V_t/√(C(t, t)))_{t∈T} has by Lemma 2.6 the same copula as Ỹ. If S has a finite inverse moment, then Ỹt has a bounded density for each t, since by the formula for the density of a product of two independent random variables we get

f_Ỹt(x) = E[ S^{−1} φ(x/S) ] ≤ φ(0) E[S^{−1}] < ∞,

where φ denotes the standard normal density. Hence, for any other copula process U^X̃ corresponding to another process X̃ in L^2(Ω × T), we have by Theorem 4.13 that the corresponding estimate applies.

Copulas may be suitable to capture tail behaviour in functional data. This can be seen in the following example, which combines the previous ones. Assume that a process Y is given by Y_t := F_t^{[-1]}(U_t), where U is the copula process corresponding to an elliptical process Ỹ given by (4.19), and the marginals F_Y are regularly varying as in Example 4.9. More specifically, we can take F_Y to be Pareto marginals, that is,

l_t(x) = α_t x_min^{α_t} for x ≥ x_min and l_t(x) = 0 for x < x_min,

for some constant x_min > 0, so that the f_t are the densities of a Pareto distribution Par(x_min, α_t), where t ↦ α_t is assumed to be continuous. Assume now α_t > 2 + γ for
some γ > 0. Then Y takes values in L^2(T). Consider a situation in which we can approximate the marginal function F_Y by another marginal function F_n (for example by empirical cumulative distribution functions). The underlying elliptical process is, in the L^2(T)-norm, best approximated over all processes with n-dimensional spectral decomposition by the projection Σ_{i=1}^n Z_i e_i of the first n principal components in the corresponding Karhunen-Loève expansion Ỹ = Σ_{i∈N} Z_i e_i, where Z_i = ⟨Ỹ, e_i⟩_{L^2(T)}. Here (e_i)_{i∈N} is an orthonormal basis of eigenvectors of the covariance operator of Ỹ, where the corresponding eigenvalues (λ_i)_{i∈N} are ordered decreasingly (see for instance Theorem (1.5) in [8] for a proof and [7] for more optimality properties of the principal components for elliptical processes). Then E[ ||Ỹ − Ỹ_n||²_{L^2(T)} ] = Σ_{i>n} λ_i. Assume that U^n is the path copula process underlying Ỹ_n. Let us investigate how well Y_n = F_n^{[-1]}(U^n) approximates Y. We can choose p = 1, ε = 1, q = 2, β = 2/3, and hence ρ = 1/3, m_t = x_min (using the notation of Theorem 4.10) and x_0^t = 0. Thus, by (4.6), the convergence rate is 1/3 of the convergence rate of the principal components and of the rate of convergence induced by the Wasserstein distance of the marginals, which depends on the respective approximation technique for the marginals.
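The principal-component truncation used above can be sketched numerically. Below we discretize a covariance operator (we use the Brownian kernel C(s, t) = min(s, t) purely as a stand-in, since the example leaves C unspecified), diagonalize it, and observe the decay of the L²(T) truncation error as n grows; all concrete choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 200                                  # grid size on T = [0, 1]
t = (np.arange(m) + 0.5) / m
C = np.minimum.outer(t, t) / m           # discretized kernel min(s, t),
                                         # with the quadrature weight 1/m

# Eigendecomposition of the covariance operator; eigh returns ascending
# eigenvalues, so reverse both to get the decreasing order (lambda_i).
lam, E = np.linalg.eigh(C)
lam, E = lam[::-1], E[:, ::-1]

# Sample Gaussian paths via the Karhunen-Loeve expansion and project
# onto the first n principal components.
Z = rng.standard_normal((5000, m)) * np.sqrt(np.clip(lam, 0, None))
paths = Z @ E.T                          # full (rank-m) expansion

def l2_error(n):
    """Mean squared L^2(T) error of the rank-n projection; in theory it
    equals the eigenvalue tail sum over i > n."""
    trunc = Z[:, :n] @ E[:, :n].T
    return np.mean(np.sum((paths - trunc) ** 2, axis=1))

print([round(l2_error(n), 4) for n in (5, 20, 80)])  # decreasing in n
```

The printed errors track the eigenvalue tails Σ_{i>n} λ_i, which for the Brownian kernel decay like 1/n, matching the rate discussion in the example.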
The example above depicts a situation that can be investigated from the point of view of functional data analysis. As in finite dimensions, the estimation of the covariance matrix corresponding to an underlying elliptical copula is conducted via inference on rank correlations, such as Kendall's τ or Spearman's ρ. These objects and their estimation would have to be generalised to the functional setting. Nevertheless, this is a logical next step and hence an appealing strand that is left for future research.

3.2. Solutions to the Construction Problem.

3.2.2. Path Copulas for Continuous Processes. For a given interval T ⊂ R^d, d ∈ N, that is T = I_1 × ... × I_d for some (possibly unbounded) one-dimensional intervals I_1, ..., I_d, we want to establish a 'Sklar-like' theorem in the space of real continuous functions C(T) := C(T; R). If T is compact, we equip this space with the norm ||f||_∞ := sup_{t∈T} |f(t)|, making C(T) a separable Banach space.

3.2.3. Path Regularity and Copulas. Let again T = I_1 × ... × I_d be an interval in R^d for some d ∈ N.
of coefficient marginals satisfies all the conditions of Definition 3.1. The embedding (3.1) now reads

v ↦ (a_n(v))_{n∈N}.   (3.15)

Theorem 3.25. Let V be a Banach space with Schauder basis (e_n)_{n∈N} and coefficient functionals. The following are equivalent for a real sequence (a_n)_{n∈N}:
(i) (a_n)_{n∈N} is in the range of the embedding (3.15);
(ii) Σ_{n=1}^∞ a_n e_n converges in the norm topology to an element of V;
(iii) sup_{N∈N} || Σ_{n=1}^N a_n e_n || < ∞.

Proof. See Theorem 4.13 in [17].

Example 4.14. Assume that U^Y is an elliptical copula corresponding to an elliptical random variable Ỹ in L^2(T), that is Ỹ = SV.   (4.19)

Example 4.15 (Approximating Pareto marginals on an elliptical copula). Assume that a process Y is given by Y_t := F_t^{[-1]}(U_t). Thus, using also Example 4.14 and combining (4.20), (4.21) and (4.22), we obtain the stated estimate.

Remark 4.7. If a density function (x, t) ↦ f_t(x) is continuous and ultimately monotone, that is, there are x_0^t > 0, m_t ∈ R such that f_t is monotone on [m_t − x_0^t, m_t + x_0^t]^c, the best candidate for the choice of g in Assumption 1 is f_t itself.