30 Years of space–time covariance functions
Funding information: Fondo Nacional de Desarrollo Científico y Tecnológico, Grant/Award Number: 1170290; Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung, Grant/Award Number: 175529
Abstract
In this article, we provide a comprehensive review of space–time covariance functions. As for the spatial domain, we focus on either the d‐dimensional Euclidean space or on the unit d‐dimensional sphere. We start by providing background information about (spatial) covariance functions and their properties along with different types of covariance functions. While we focus primarily on Gaussian processes, many of the results are independent of the underlying distribution, as the covariance only depends on second‐moment relationships. We discuss properties of space–time covariance functions along with the relevant results associated with spectral representations. Special attention is given to the Gneiting class of covariance functions, which has been especially popular in space–time geostatistical modeling. We then discuss some techniques that are useful for constructing new classes of space–time covariance functions. Separate treatment is reserved for spectral models, as well as to what are termed models with special features. We also discuss the problem of estimation of parametric classes of space–time covariance functions. An outlook concludes the paper.
This article is categorized under:
- Statistical and Graphical Methods of Data Analysis > Analysis of High Dimensional Data
- Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods
- Statistical and Graphical Methods of Data Analysis > Multivariate Analysis
1 INTRODUCTION
Covariance functions describe the second‐order dependence of random processes. Their popularity in spatial and space–time statistics, as well as in probability theory and machine learning, is due to the fact that the properties of Gaussian random fields are completely determined by their first‐ and second‐order moments. Thus, covariance functions are crucial to modeling, estimation, and kriging prediction of Gaussian random fields.
Space–time covariance functions as models for dependence have been central to many branches of applied and theoretical sciences. Applications include climate modeling (Alexeeff, Nychka, Sain, & Tebaldi, 2016; Berliner, Levine, & Shea, 2000; Crippa et al., 2016; Edwards, Castruccio, & Hammerling, 2019; Genton & Kleiber, 2015; Gething et al., 2010; Guinness & Fuentes, 2016; Heaton et al., 2019; Kühl, Gebhardt, Litt, & Hense, 2002; Sang, Jun, & Huang, 2011), environmental statistics (Bevilacqua, Fassò, Gaetan, Porcu, & Velandia, 2016; Calus, Bijma, & Veerkamp, 2004; Cameletti, Lindgren, Simpson, & Rue, 2012; De Cesare, Myers, & Posa, 2001b; De Iaco, Myers, & Posa, 2002a; Fassò, Finazzi, & Ndongo, 2016; Finazzi & Fassò, 2014; Haslett & Raftery, 1989; Legarra, Misztal, & Bertrand, 2004; Meyer, 1998), image analysis (Benali et al., 1997; De Iaco et al., 2002a; Hengl, Heuvelink, Tadić, & Pebesma, 2012; Jain & Jain, 1981; Meiring, Monestiez, Sampson, & Guttorp, 1997), probability forecast (Giebel, Brownsword, Kariniotakis, Denhard, & Draxl, 2011; Gneiting & Katzfuss, 2014; Gneiting, Larson, Westrick, Genton, & Aldrich, 2006; Zhang, Wang, & Wang, 2014), meteorology (Bourotte, Allard, & Porcu, 2016; Gneiting, 2002b; Handcock & Wallis, 1994; Jun & Stein, 2007; Li, Genton, & Sherman, 2007; Reich, Eidsvik, Guindani, Nail, & Schmidt, 2011) oceanography (Bertino, Evensen, & Wackernagel, 2003; Farmer & Clifford, 1986; Halliwell Jr & Mooers, 1979; White & Bernstein, 1979), extremes (Davis, Klüppelberg, & Steinkohl, 2013a; Davis & Mikosch, 2008; Huerta & Sansó, 2007; Huser & Davison, 2014; Kabluchko, 2009), machine learning (Garg, Singh, & Ramos, 2012; Genton, 2001; Sarkka, Solin, & Hartikainen, 2013), demography (De Iaco, Palma, & Posa, 2015), forestry (Buttafuoco & Castrignanò, 2005; Jost, Heuvelink, & Papritz, 2005), atmospheric sciences (Bardossy & Plate, 1992; Brown, Diggle, Lord, & Young, 2001) turbulence (Kraichnan, 1964; Shkarofsky, 1968), and finance (Fernández‐Avilés & 
Montero, 2016; Porcu, Montero, & Schlather, 2012), to mention just a few.
The first formulations of space–time covariance functions trace back to the early 1990s, albeit exploiting simple mathematical structures. For instance, an easy way to build a space–time covariance function is through the product of a spatial and a temporal covariance function. Such covariance functions are called separable: They are easy to construct and allow for considerable computational gains (details are deferred to subsequent sections). However, they are very limited in describing the interaction between space and time, in many cases implying unphysical dependence among process variables. Thus, the 1990s saw an increasing number of efforts to construct nonseparable covariance functions: The first approaches to building nonseparable space–time covariance functions can be found in Christakos (1990, 1992) and Dimitrakopoulos and Luo (1994). More recently, there has been a wealth of contributions based on direct construction in space and time domains (Cressie & Huang, 1999; De Iaco et al., 2002a; Gneiting, 2002b; Porcu, Gregori, & Mateu, 2006), through spectral densities in frequency space (Fuentes, Chen, & Davis, 2008; Stein, 2005b, 2005c), or on the basis of physical principles and dynamic modeling approaches (Baxevani, Podgórski, & Rychlik, 2011; Brown, Karesen, Roberts, & Tonellato, 2000; Christakos, 1990, 1992).
The modeling of space–time covariance functions has had a recent resurgence thanks to the increasing interest in modeling global data. Here the spatial domain is taken to be a sphere, and so covariance functions must respect this topology. As for space–time stochastic processes on the sphere, we refer the reader to the recent approaches in Porcu, Bevilacqua, and Genton (2016), Berg and Porcu (2017), and Jeong and Jun (2015). Generalizations to multivariate space–time processes have been considered in Alegría, Porcu, Furrer, and Mateu (2019). The richness in modeling stochastic processes over spheres or spheres cross time is reflected in the diversity of research in this area: mathematical analysis (Barbosa & Menegatto, 2017; Beatson, Zu Castell, & Xu, 2014; Chen, Menegatto, & Sun, 2003; Gangolli, 1967; Guella, Menegatto, & Peron, 2016a, 2016b, 2017; Hannan, 1970; Menegatto, 1994, 1995; Menegatto, Oliveira, & Peron, 2006; Schoenberg, 1942), probability theory (Baldi & Marinucci, 2006; Clarke, Alegria, & Porcu, 2018; Hansen, Thorarinsdottir, Ovcharov, & Gneiting, 2015; Lang & Schwab, 2013), spatial point processes (Møller, Nielsen, Porcu, & Rubak, 2018), spatial geostatistics (Christakos & Papanicolaou, 2000; Gerber, Mösinger, & Furrer, 2017; Gneiting, 2002b; Hitczenko & Stein, 2012; Huang, Zhang, & Robeson, 2012), space–time geostatistics (Berg & Porcu, 2017; Christakos, 1991a, 2000; Christakos, Hristopulos, & Bogaert, 2000; Porcu et al., 2016), and mathematical physics (Istas, 2005; Leonenko & Sakhno, 2012; Malyarenko, 2013).
This paper provides a review of space–time covariance functions. The plan of the paper is as follows: Section 2 provides the necessary background on the metrics used on the different spaces, together with some classes of functions defined on the positive real line that become building blocks for more complex covariances. Section 3 is devoted to properties of space–time covariance functions through their spectral representations. Section 4 provides several strategies for constructing space–time covariance functions. Sections 5 and 6 discuss covariance functions having special properties or motivated by certain physical principles. Section 7 turns to multivariate space–time covariance functions. Section 8 discusses the estimation problem for space–time covariance functions. The article concludes with a perspective on future developments.
2 BACKGROUND
2.1 Spaces, distances, and covariance functions
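We consider random fields {Z(s, t), s ∈ 𝒟, t ∈ 𝒯}, where 𝒟 is the spatial domain, and 𝒯 is time. In this article, we shall work with either the case 𝒟 = ℝ^d (the d‐dimensional Euclidean space) or 𝒟 = 𝕊^d, the unit d‐dimensional sphere. As for the domain 𝒯, time will be considered for most of this article in a continuous fashion (𝒯 = ℝ), unless explicitly stated otherwise.
The covariance function of Z is the mapping C : (𝒟 × 𝒯) × (𝒟 × 𝒯) → ℝ, defined as
$$C((s_1, t_1), (s_2, t_2)) = \mathbb{C}\mathrm{ov}\big(Z(s_1, t_1), Z(s_2, t_2)\big).$$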
Covariance functions are a linear measure of dependence between the random variables Z at the space–time locations (s1, t1) and (s2, t2). As such, they must be positive‐definite functions. Conversely, it is true that any positive definite function is the covariance function associated with a Gaussian process.
Under the assumption of a Gaussian process Z, the mean and covariance functions completely characterize its distribution. Moreover, the values of Z at any finite collection of space–time points follow a multivariate Gaussian distribution, with the mean vector and the covariance matrix determined by the mean function μ and the covariance function C.
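When 𝒟 = ℝ^d, we consider the classical Euclidean distance, denoted as ‖·‖ throughout. When 𝒟 = 𝕊^d, the natural distance on the sphere is the geodesic or great circle distance, defined as the mapping θ : 𝕊^d × 𝕊^d → [0, π] so that
$$\theta(s_1, s_2) = \arccos\big(\langle s_1, s_2 \rangle\big).$$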
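As a small illustration of the geodesic distance, the following sketch computes the great circle distance between two points of the unit sphere embedded in ℝ^3, assuming the usual form θ(s1, s2) = arccos(⟨s1, s2⟩). The function names and the longitude/latitude helper are hypothetical and only serve the example.

```python
import math

def great_circle_distance(s1, s2):
    """Geodesic distance arccos(<s1, s2>) between two unit vectors."""
    dot = sum(a * b for a, b in zip(s1, s2))
    # Clamp to [-1, 1] to guard against rounding error before arccos.
    return math.acos(max(-1.0, min(1.0, dot)))

def lonlat_to_unit(lon_deg, lat_deg):
    """Map longitude/latitude (degrees) to a point on the unit sphere in R^3."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

north_pole = lonlat_to_unit(0.0, 90.0)
equator_point = lonlat_to_unit(0.0, 0.0)
theta = great_circle_distance(north_pole, equator_point)  # a quarter of a great circle
```

The distance always lies in [0, π], which is why the functions ψ used on the sphere take their spatial argument in that interval.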
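When 𝒟 = ℝ^d, Z is called weakly stationary if there exists a mapping K : ℝ^d × ℝ → ℝ such that
$$C((s_1, t_1), (s_2, t_2)) = K(s_1 - s_2, t_1 - t_2), \qquad (1)$$
and, additionally, isotropic in space and symmetric in time if there exists a mapping φ such that
$$C((s_1, t_1), (s_2, t_2)) = \varphi(\|s_1 - s_2\|, |t_1 - t_2|). \qquad (2)$$
When 𝒟 = 𝕊^d, the definition (1) is not meaningful because translations do not make sense on spheres. Thus, for a second‐order process Z defined over 𝕊^d × ℝ, we define weak stationarity and geodesic isotropy when there exists a continuous mapping ψ : [0, π] × ℝ → ℝ such that ψ(0, 0) < ∞ and
$$C((s_1, t_1), (s_2, t_2)) = \psi(\theta(s_1, s_2), t_1 - t_2). \qquad (3)$$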
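On the sphere, very often (in particular, in the analysis of climate data) the hypothesis of geodesic isotropy is replaced by that of axial symmetry: there exists a continuous mapping such that the covariance depends on the two spatial locations only through their latitudes and their difference in longitude, and on the two time instants only through the temporal lag.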
2.2 Positive‐definite functions and related building blocks
We introduce some classes of continuous functions, defined on the positive real line, that will be useful for the construction of parametric classes of space–time covariance functions. A function f : [0, ∞) → ℝ+ is called completely monotonic if it is continuous on [0, ∞), infinitely differentiable on (0, ∞), and satisfies (−1)^n f^(n)(t) ≥ 0 for all t > 0 and n ∈ ℕ. Here, f^(n) denotes the nth derivative, with f^(0) = f, and the value f(0) is required to be finite.
(4)
We denote by 𝒮 the set of Stieltjes functions. It has been proved that 𝒮 is a convex cone (Berg, 2008), with the inclusion relation 𝒮 ⊂ 𝒞, where 𝒞 is the set of completely monotone functions. The relation (4) shows that the function f(t) = 1/(1 + t), t ≥ 0, is a Stieltjes function. Using the fact that f ∈ 𝒮 if and only if 1/f is a completely Bernstein function (for a definition, see Porcu & Schilling, 2011), we can get a wealth of examples of Stieltjes functions, as the book by Schilling, Song, and Vondracek (2012) provides an entire catalogue of completely Bernstein functions. We finally note that completely Bernstein functions are infinitely differentiable over (0, ∞) and have a completely monotonic derivative. Similarly, Bernstein functions have a completely monotone derivative, but a different integral representation (Berg, 2008).
The Matérn family is defined as
$$\mathcal{M}_{\nu}(t) = \frac{2^{1-\nu}}{\Gamma(\nu)}\, t^{\nu}\, \mathcal{K}_{\nu}(t), \qquad t \ge 0, \qquad (5)$$
where 𝒦_ν is the MacDonald function (Gradshteyn & Ryzhik, 2007). The function t ↦ ℳ_ν(√t) is completely monotonic on the positive real line for all ν > 0 (Miller & Samko, 2001). The appeal of this class is the parameter ν that governs the smoothness of the covariance function at the origin (Stein, 1999), and thus the smoothness of a Gaussian field on ℝ^d in the mean square sense. The Matérn family also has a simple form for its spectral densities. Some special cases, for specific half‐integer values of ν, are reported in Table 1.
Table 1. Special cases of the Matérn function ℳ_ν(t) for half‐integer ν. SP(k) means that the sample paths of the associated Gaussian field are k times differentiable

| ν | ℳ_ν(t) | k | SP(k) |
|---|---|---|---|
| 0.5 | e^{−t} | 0 | 0 |
| 1.5 | e^{−t}(1 + t) | 1 | 1 |
| 2.5 | e^{−t}(1 + t + t²/3) | 2 | 2 |
| 3.5 | e^{−t}(1 + t + 2t²/5 + t³/15) | 3 | 3 |
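The half‐integer special cases can be checked numerically. The sketch below assumes the unit‐range parametrization suggested by the first rows of Table 1 (so that ν = 0.5 gives e^{−t}) and uses the finite‐sum closed form that is available when ν = p + 1/2; the function name is hypothetical.

```python
import math

def matern_half_integer(t, p):
    """Matérn correlation for nu = p + 1/2 (p = 0, 1, 2, ...), unit range.

    Uses the finite-sum closed form valid for half-integer smoothness:
    exp(-t) * p!/(2p)! * sum_i (p+i)! / (i! (p-i)!) * (2t)^(p-i).
    """
    coeff = math.factorial(p) / math.factorial(2 * p)
    poly = sum(
        math.factorial(p + i) / (math.factorial(i) * math.factorial(p - i))
        * (2.0 * t) ** (p - i)
        for i in range(p + 1)
    )
    return math.exp(-t) * coeff * poly

t = 0.7
values = [matern_half_integer(t, p) for p in range(4)]  # nu = 0.5, 1.5, 2.5, 3.5
```

For general ν the MacDonald function is needed, which is not part of the Python standard library, so this sketch is restricted to the half‐integer case.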
We finish this section by introducing a class of continuous and positive‐definite functions that vanish outside the interval [0, 1] (or a suitably rescaled interval).
, defined as
(6)
denotes the Askey family of functions (Askey, 1973), defined by
(7)
The Matérn and generalized Wendland classes are, strictly speaking, correlation functions and can be scaled according to a variance parameter σ² > 0 and range parameter ϱ > 0 to obtain the covariance functions σ²ℳ_ν(t/ϱ) and the correspondingly rescaled generalized Wendland function, respectively.
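The rescaling by a variance and a range parameter can be sketched as follows. The example takes the Askey function in its usual truncated power form (1 − t)_+^μ, which is an assumption about the notation of Equation (7); the helper names and parameter values are hypothetical.

```python
def askey(t, mu):
    """Askey truncated power function (1 - t)_+^mu for t >= 0."""
    return (1.0 - t) ** mu if t < 1.0 else 0.0

def scaled_covariance(correlation, sigma2, rho):
    """Turn a correlation function of t into sigma2 * correlation(t / rho)."""
    return lambda t: sigma2 * correlation(t / rho)

# Variance 4 and compact support on [0, 10] instead of [0, 1].
cov = scaled_covariance(lambda t: askey(t, mu=2.0), sigma2=4.0, rho=10.0)
```

The range parameter moves the point beyond which the covariance vanishes from 1 to ϱ, while the value at the origin becomes σ².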
3 PROPERTIES OF COVARIANCE FUNCTIONS
3.1 Descriptive properties
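A function C is positive definite if, for any finite collection of space–time points (s_i, t_i), i = 1, …, n, and any real constants a_1, …, a_n,
$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j\, C((s_i, t_i), (s_j, t_j)) \ge 0. \qquad (8)$$
A space–time covariance function C is called separable if
$$C((s_1, t_1), (s_2, t_2)) = C_S(s_1, s_2)\, C_T(t_1, t_2), \qquad (9)$$
and a simple alternative is the sum
$$C((s_1, t_1), (s_2, t_2)) = C_S(s_1, s_2) + C_T(t_1, t_2), \qquad (10)$$
where C_S and C_T are spatial and temporal covariance functions, respectively. In all the other cases where (9) does not hold, C is called nonseparable. Notably, C in (10) is not strictly positive definite even if both C_S and C_T are.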
In settings with temporally collocated data (every spatial location is observed at each time), separable models allow for ease of computation and dimensionality reduction, as the space–time covariance matrix is obtained through the Kronecker product of the marginal spatial and temporal ones. However, separability is an unrealistic assumption for many applications since it implies limited interactions between the spatial and temporal variations. Also, the computational benefits are lost when there is no complete collocation of the observations. Accordingly, various techniques have been introduced for generating different classes of nonseparable spatiotemporal covariance models.
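The Kronecker structure of a separable model with temporally collocated data can be illustrated with a minimal sketch. The exponential margins, the locations, and the helper names below are hypothetical choices made only for the example.

```python
import math

def exp_cov(points, scale):
    """Exponential covariance matrix exp(-|x_i - x_j| / scale) for 1-D points."""
    return [[math.exp(-abs(a - b) / scale) for b in points] for a in points]

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [
        [a * b for a in row_a for b in row_b]
        for row_a in A
        for row_b in B
    ]

sites = [0.0, 0.5, 2.0]   # hypothetical spatial locations on a line
times = [0.0, 1.0]        # hypothetical observation times, shared by all sites
C_S = exp_cov(sites, scale=1.0)
C_T = exp_cov(times, scale=2.0)

# Rows and columns are ordered site-major: (s_0,t_0), (s_0,t_1), (s_1,t_0), ...
C = kron(C_S, C_T)

def index(i_site, j_time):
    """Position of the pair (site i, time j) in the space-time ordering."""
    return i_site * len(times) + j_time
```

Only the 3 × 3 and 2 × 2 marginal matrices need to be factorized or inverted, since the inverse of a Kronecker product is the Kronecker product of the inverses. This shortcut is lost as soon as some site is not observed at some time.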
Nonseparability can account for complex interaction between space and time. To fix concepts, Rodrigues and Diggle (2010) define positive (negative) nonseparability when, respectively,
, or
. If such inequalities hold for all
, then C is called uniformly (positive or negative) nonseparable. Generalizations of these concepts are included in De Iaco and Posa (2013).

A space–time covariance function C is called fully symmetric if C((s1, t1), (s2, t2)) = C((s1, t2), (s2, t1)) for all space–time points. When 𝒟 = ℝ^d and C is stationary, then C is defined through the function K in Equation (1), and under full symmetry, we have K(h, u) = K(−h, u) = K(h, −u) = K(−h, −u). Obviously, isotropy in space (whatever the space, ℝ^d or 𝕊^d) and symmetry in time imply full symmetry. Separable covariance functions are also fully symmetric, and hence tests for full symmetry can be used to reject separability (Gneiting, Genton, & Guttorp, 2007). A direct way to build a nonseparable covariance function is by considering a product‐sum model (De Cesare, Myers, & Posa, 2001a; De Iaco, Myers, & Posa, 2001, 2011): For three positive weights a_i, i = 1, 2, 3, such a model is obtained through
$$C(h, u) = a_1\, C_S(h)\, C_T(u) + a_2\, C_S(h) + a_3\, C_T(u).$$
Several generalizations of this construction have been considered by De Iaco et al. (2001, 2002a), De Iaco, Myers, and Posa (2002b) and Gregori, Porcu, Mateu, and Sasvári (2008), to mention a few. More constructions will be discussed in subsequent sections.
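A product‐sum model is easy to prototype. The sketch below assumes the usual form a1 C_S(h) C_T(u) + a2 C_S(h) + a3 C_T(u) with positive weights; the exponential margins and the weights are hypothetical choices, not taken from the cited papers.

```python
import math

def product_sum(c_s, c_t, a1, a2, a3):
    """Product-sum covariance a1*C_S(h)*C_T(u) + a2*C_S(h) + a3*C_T(u)."""
    return lambda h, u: a1 * c_s(h) * c_t(u) + a2 * c_s(h) + a3 * c_t(u)

c_s = lambda h: math.exp(-abs(h))        # hypothetical spatial margin
c_t = lambda u: math.exp(-abs(u) / 2.0)  # hypothetical temporal margin
C = product_sum(c_s, c_t, a1=0.5, a2=0.3, a3=0.2)

h, u = 1.0, 2.0
# For a separable model these two quantities would coincide.
lhs = C(h, u) * C(0.0, 0.0)
rhs = C(h, 0.0) * C(0.0, u)
```

The gap between `lhs` and `rhs` is one simple numerical symptom of nonseparability, while the model remains fully symmetric because it only depends on |h| and |u| here.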
A covariance function is compactly supported, with spatial radius a and temporal radius b, if it is identically equal to zero outside the finite space–time range, (a, b). Let
. Then,
C((s1, t1), (s2, t2)) is identically equal to zero whenever
. We thus say that C is compactly supported over
.
When
then a scalar compact support can be attained if
, where a1 and a2 are positive scaling factors. Such a situation is termed geometric anisotropy, and space–time models of this type have been proposed by Dimitrakopoulos and Luo (1994). Finally, let C be a space–time covariance function that is spatially isotropic (either in ℝ^d or in 𝕊^d). We call a temporally dynamical radius, h, the continuous mapping from [0, ∞) to (0, ∞) such that for each u_o ∈ [0, ∞), the margin C(·, u_o) is compactly supported on a ball embedded in ℝ^d with radius h(u_o). Clearly, both Askey (7) and generalized Wendland (6) classes are special cases of dynamical compact support, when
h ≡ b > 0 is the constant function. We call functions C with such a property dynamically supported (Porcu, Bevilacqua, & Genton, 2019). Clearly, if C is compactly supported over
, then it is also dynamically supported with dynamical radius being identically equal to b.
A fully symmetric covariance C has a dimple if Z(s_here, t_now) is more correlated with Z(s_there, t_then) than with Z(s_there, t_now). For isotropic covariance functions, this implies that, for a fixed r_o > 0, the functions φ(r_o, ·) in (2) and ψ(r_o, ·) in (3) are no longer monotonically decreasing (for this last case, we obviously require r_o ≤ π), thus resulting in a possibly counterintuitive property. A first description of a dimple is due to Kent, Mohammadzadeh, and Mosammam (2011). More recently, a description of dimples through contour curves has been provided by Cuevas, Porcu, and Bevilacqua (2017).
A recent review in De Iaco, Posa, Cappello, and Maggio (2019) digs into other descriptive properties: Explicit distinction is made between partial, additive and total separability, as well as the concepts of axial, full and quadrant symmetries on the plane. Some tests on separability of space–time covariance functions can be found in Scaccia and Martin (2002, 2005, 2011), Fuentes (2006), Bevilacqua, Mateu, Porcu, Zhang, and Zini (2010), Mitchell, Genton, and Gumpertz (2006), Constantinou, Kokoszka, and Reimherr (2017), Lu and Zimmerman (2005), Aston, Pigoli, and Tavakoli (2017), Li et al. (2007), De Iaco, Posa, and Myers (2013), De Iaco, Palma, and Posa (2016), and Cappello, De Iaco, and Posa (2018). Tests for axial symmetry are provided by Scaccia and Martin (2002, 2005). For a further review, the reader is referred to Kyriakidis and Journel (1999).
3.2 Spectral representations
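When 𝒟 = ℝ^d, and the covariance function is weakly stationary and continuous, Bochner's theorem (Bochner, 1955) establishes a one‐to‐one correspondence between positive‐definite functions and the Fourier transforms of positive and bounded measures:
$$K(h, u) = \int_{\mathbb{R}^d} \int_{\mathbb{R}} e^{\mathrm{i}(\langle h, \omega \rangle + u \tau)}\, F(\mathrm{d}\omega, \mathrm{d}\tau),$$
where F is a positive and bounded measure on ℝ^d × ℝ.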
, defined as
(11)
(see Gneiting, 2002b, theorem 1). This generalizes the criterion provided by Cressie and Huang (1999).

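The case 𝒟 = 𝕊^d and 𝒯 = ℝ has only been elucidated recently. Berg and Porcu (2017) have shown that the continuous function ψ in (3) is positive definite if and only if
$$\psi(\theta, u) = \sum_{k=0}^{\infty} b_{k,d}(u)\, \frac{C_k^{(d-1)/2}(\cos\theta)}{C_k^{(d-1)/2}(1)}, \qquad (\theta, u) \in [0, \pi] \times \mathbb{R}, \qquad (12)$$
where {b_{k,d}(·)}, k = 0, 1, …, is a sequence of temporal covariance functions with the additional requirement that ∑_k b_{k,d}(0) < ∞ in order to guarantee the variance of Z, σ² = ψ(0, 0), to be finite. Here, C_k^λ denotes the Gegenbauer polynomial with exponent λ > −1/2 and order k = 0, 1, … (Dai & Xu, 2013).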
(13)
, satisfies
. Then, the following assertions are equivalent:
- ψ(θ, u) is the covariance function of a random field on 𝕊^d × ℝ;
- The function
C
τ : [0, π] → ℝ, defined as
is the covariance function of a random field on

for almost every
τ ∈ ℝ;
- For all k = 0, 1, 2, …, the functions b_{k,d}, defined through (13), are continuous, positive definite on ℝ, and ∑_k b_{k,d}(0) < ∞.
Spectral representations for the case of axial symmetry are available as well, and can be made explicit by using the arguments in Berg and Porcu (2017) in concert with the spectral expansions in Jones (1963). For a thorough account of axial symmetry, see Porcu, Castruccio, Alegria, and Crippa (2019).
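To make the Berg–Porcu expansion concrete, the sketch below evaluates a truncated series on the sphere 𝕊², where the normalized Gegenbauer polynomials reduce to Legendre polynomials. The coefficient functions b_k(u) = 2^{−(k+1)} exp(−(k+1)|u|) are a hypothetical choice: each is a temporal covariance function and their values at zero are summable. The truncation level is also an assumption of the example.

```python
import math

def legendre(k, x):
    """Legendre polynomial P_k(x) via the three-term recurrence."""
    p_prev, p = 1.0, x
    if k == 0:
        return p_prev
    for n in range(1, k):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def psi(theta, u, n_terms=30):
    """Truncated expansion sum_k b_k(u) P_k(cos theta) on the sphere S^2.

    b_k(u) = 2**-(k+1) * exp(-(k+1)|u|) is a hypothetical choice of
    summable temporal covariance functions.
    """
    x = math.cos(theta)
    return sum(
        2.0 ** -(k + 1) * math.exp(-(k + 1) * abs(u)) * legendre(k, x)
        for k in range(n_terms)
    )

variance = psi(0.0, 0.0)        # equals the sum of the b_k(0) kept in the series
lagged = psi(math.pi / 3, 0.5)  # covariance at geodesic lag pi/3 and time lag 0.5
```

Because the coefficients depend on the temporal lag, the resulting function does not factor into a spatial and a temporal part, so the construction gives a nonseparable model on the sphere cross time.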
4 CLASSES OF COVARIANCE FUNCTIONS
4.1 The persistent value of the Gneiting functions
The Gneiting class has been proposed in a wealth of applications involving space–time geostatistics (Diggle, 2013; Gelfand, Schmidt, Banerjee, & Sirmans, 2004), extreme events (Huser & Davison, 2014), applications to radar–rain gauge merging (Sideris, Gabella, Erdin, & Germann, 2014), solar irradiance forecasting (Yang et al., 2013), meteorology (Spadavecchia & Williams, 2009), particulate matter (Cameletti, Ignaccolo, & Bande, 2011), bubonic plague epidemics (Christakos, Olea, & Yu, 2007), ground‐level ozone (Gilleland & Nychka, 2005), cellular traffic at city scales (Chen, Jin, Qiang, Hu, & Jiang, 2015), and pricing in financial markets (Espen & Jurate, 2012), to cite a few. Beyond these specific applications, the Gneiting functions are also the building blocks for more sophisticated covariance functions: From extreme space–time modeling (Huser & Davison, 2014) to a deeper study of directional properties of space–time random fields (Sherman, 2011), dynamical models (Wikle & Hooten, 2010), dynamic factor analysis (Lopes, Salazar, & Gamerman, 2008), predictive modeling (De Luna & Genton, 2005), projective space–time processes (Wang & Gelfand, 2014), covariate‐dependent space–time modeling (Reich et al., 2011), and anisotropic and nonstationary covariance functions (Porcu, Gregori, & Mateu, 2006; Schlather, 2010).
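We provide a different exposition for the function here: We consider the family of functions G_α : [0, ∞) × [0, ∞) → ℝ, defined through
$$G_{\alpha}(x, t) = \frac{1}{h(t)^{\alpha}}\, f\!\left(\frac{x}{h(t)}\right), \qquad x, t \ge 0. \qquad (14)$$
4.2 Gneiting class across different metric spaces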
We now list relevant findings related to the Gneiting function representation (14).
For the Euclidean space, we have the following cases.
- If f is completely monotonic and α ≥ d/2, then G_α(‖·‖², |·|²) is positive definite if and only if exp(−c h(|·|²)) is positive definite on the real line for all c > 0. The sufficiency has been proved by Gneiting (2002b). The necessary part of the assertion has been proved by Zastavnyi and Porcu (2011).
- If f is a Stieltjes function and h a Bernstein function, then G_α(‖·‖², |·|²) is positive definite for all α > 0 and for all d = 1, 2, …. This result was recently proved by Menegatto, Oliveira, and Porcu (2019).
- Assume f is a generalized Wendland function, as in Equation (6). Let h be a continuous and positive function on the positive real line, with h(0) = 1 and such that 1/h(·) is increasing and concave on the positive real line, with lim_{t → ∞} ψ(t) = 0. Then, G_{−α}(‖·‖, |·|) is positive definite provided ν ≥ (d + 5)/2 + κ and α ≤ (d + 3)/2 + 2κ. This result was proved by Porcu, Bevilacqua, and Genton (2019).
For the sphere, we have the following cases.
- If f is completely monotonic and h a Bernstein function, then G_α(|·|², θ) is positive definite provided α ≥ d/2 (Porcu, Bevilacqua, & Genton, 2016b).
- If f is completely monotonic, h positive, increasing and concave, then G_α(θh²(|·|²), |·|²) is positive definite (Porcu et al., 2016b).
- If f is a Stieltjes function and h a Bernstein function, then G_α(θ, |·|²) is positive definite for all α > 0 and for all d = 1, 2, … (White & Porcu, 2019b).
- f is a generalized Wendland function, h is positive, decreasing and convex (Porcu, Bevilacqua, & Genton, 2019).


A bridge between Gneiting functions and semi‐metric spaces has been recently provided by Menegatto et al. (2019).
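As a numerical illustration of the first Euclidean case, the sketch below builds a Gneiting‐type covariance with f(x) = exp(−x), which is completely monotonic, and h(t) = 1 + t, which is a Bernstein function, taking α = d/2 and assuming the form G_α(x, t) = f(x/h(t))/h(t)^α. The space–time points and helper names are hypothetical, and the plain Cholesky routine is only a convenience for checking that the resulting matrix is positive definite.

```python
import math

def gneiting(x, t, f, h, alpha):
    """G_alpha(x, t) = f(x / h(t)) / h(t)**alpha for x, t >= 0."""
    ht = h(t)
    return f(x / ht) / ht ** alpha

def cov(p, q, d=2):
    """Space-time covariance G_{d/2}(||s1 - s2||^2, |t1 - t2|^2)."""
    (s1, t1), (s2, t2) = p, q
    sq_dist = sum((a - b) ** 2 for a, b in zip(s1, s2))
    return gneiting(sq_dist, (t1 - t2) ** 2,
                    f=lambda x: math.exp(-x),  # completely monotonic
                    h=lambda t: 1.0 + t,       # Bernstein function
                    alpha=d / 2)

def cholesky(A):
    """Plain Cholesky factorization; raises ValueError if A is not positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

# Hypothetical space-time points ((x, y), t) in R^2 x R.
points = [((0.0, 0.0), 0.0), ((1.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
          ((1.0, 1.0), 2.0), ((0.5, 0.5), 0.5)]
C = [[cov(p, q) for q in points] for p in points]
L = cholesky(C)  # succeeds because C is a valid covariance matrix
```

A successful factorization on one configuration of points is of course not a proof of positive definiteness; it is only a quick sanity check of an implementation.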
4.3 Final remarks on the Gneiting class
According to Rodrigues and Diggle (2010), the Gneiting class is always negative nonseparable. Also, Kent et al. (2011) and more recently Cuevas et al. (2017) show conditions on the functions f and h such that a dimple can happen. In particular, Cuevas et al. (2017) offer a dual view of the dimple problem related to space–time correlation functions in terms of their contours. They find that the dimple property in the Gneiting class of correlations is in one‐to‐one correspondence with nonmonotonicity of the parametric curve describing the associated contour lines. Further, they show that given such a nonmonotonic parametric curve associated with a given level set, all the other parametric curves at smaller levels inherit the nonmonotonicity.
4.4 Scale mixtures: A smart trick
Let (Ω, 𝒜, μ) be a measure space. Let C_S(·; ξ) and C_T(·; ξ) be continuous mappings such that, for any fixed ξ ∈ Ω, C_S(·; ξ) and C_T(·; ξ) are stationary (if 𝒟 = 𝕊^d, then stationary and isotropic) covariance functions in their respective spaces. Let
$$C(h, u) = \int_{\Omega} C_S(h; \xi)\, C_T(u; \xi)\, \mu(\mathrm{d}\xi). \qquad (16)$$
A wealth of examples is available thanks to such a construction. For 𝒟 = ℝ^d, points (A.1) and (A.2) in Section 4.1 are proved through scale mixtures. The quasi‐arithmetic class (Porcu, Mateu, & Christakos, 2010) is obtained through scale mixtures as well. The nonseparable models proposed by Fonseca and Steel (2011), Schlather (2010), Apanasovich and Genton (2010), Porcu et al. (2006), Porcu, Mateu, and Bevilacqua (2007), Porcu and Mateu (2007), and Alegría et al. (2019) are all obtained through scale mixture techniques.
The criterion in Cressie and Huang (1999) is a scale mixture as well: Compare (16) with (15) and let
, with
being the Borel sigma‐algebra in ℝ
d. Also, let
in (16) and, finally,
. Similarly, the so‐called half spectral approach proposed by Stein (2005c) is a special case of scale mixtures.
The product‐sum model of De Iaco et al. (2002a, 2002b), De Iaco and Posa (2012), Myers, De Iaco, Posa, and De Cesare (2002), and De Iaco et al. (2011) is not a scale mixture, but is instead based on the properties of positive‐definite functions seen as a convex cone. Notably, some weights in the linear combinations in the product‐sum model can be negative while preserving positive definiteness. This permits covariance functions with oscillatory behavior. Such a challenge has been faced by De Iaco et al. (2001) and Gregori et al. (2008), who provide general results in a similar context. Peron, Porcu, and Emery (2018) have proposed linear combinations with negative weights for the case of the sphere.
4.5 Stein's spectral densities and related approaches
, a vector of strictly positive components, we define
(17)
The function f is clearly strictly positive, but to make it a spectral density, according to (1), f must be integrable in ℝ^d × ℝ. Conditions for integrability are given by Stein (2005b), who then provides the geometric properties of a Gaussian random field having this parametric family of spectral densities. Finding closed‐form solutions of the covariance function through Fourier inversion from (17), however, is challenging. Stein (2005b) gives some special cases that provide spatial or temporal margins of the Matérn type. A similar philosophy is followed by Fuentes et al. (2008). Porcu, Gregori, and Mateu (2009) consider Archimedean functionals that allow one to compose spatial and temporal marginal spectral densities to build new classes of space–time spectra. Stein's approach (17) becomes a special case of Archimedean compositions, for a specific choice of the Archimedean functional. Extensions of Stein's approach to nonstationary cases have been attained through spatial adaptation of the parameter vector, which becomes a function of the spatial coordinates. Pintore and Holmes (2004) consider the simple case of the (square of the) product of two adapted spectral densities. Such an approach is then extended by Porcu et al. (2009) to more sophisticated compositions that are substantially similar to the Archimedean functionals.
When 𝒟 = 𝕊^d, spectral modeling becomes much more challenging. One starting point, however, is the Berg–Porcu spectral representation in Equation (12). A spectral approach requires a sequence
of temporal spectral densities. To ensure that
as required in the Berg–Porcu characterization, one can make use of Parseval's theorem, which provides the spectral condition
. Spectral models on the sphere cross time are still very much an open area of research.
5 LAGRANGIAN REFERENCE FRAME AND TRANSPORT EFFECT
Environmental, atmospheric, and geophysical processes are often influenced by prevailing winds or ocean currents (Gneiting et al., 2007). Thus, the covariance function is no longer fully symmetric. In this situation, the idea of a Lagrangian reference frame is useful. Gneiting et al. (2007) summarize the physical justification of such a framework when
. For instance, the random rotation might represent a prevailing wind as in Gupta and Waymire (1987). It might be a westerly wind considered by Haslett and Raftery (1989), or again, it might be updated dynamically according to the current state of the atmosphere.
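Consider a random velocity vector V in ℝ^d, that is distributed according to a probability distribution, and a spatial Gaussian random field that is weakly stationary with stationary spatial covariance, C_S. Then, the resulting space–time covariance (that is weakly stationary, but not fully symmetric) is obtained through
$$C(h, u) = \mathbb{E}_{V}\, C_S(h - V u), \qquad (h, u) \in \mathbb{R}^d \times \mathbb{R}.$$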
has been recently tackled in Alegría and Porcu (2017): Take a random orthogonal (d × d) matrix ℛ. Let Z be a Gaussian process on
with geodesically isotropic covariance
. Define
(18)
The fact that the resulting covariance is still geodesically isotropic in the spatial component is a nontrivial property and is shown formally, for some specific choice of the random rotation ℛ, in Alegría and Porcu (2017), at least for the case of the sphere
. Some comments are in order. The resulting field Y in Equation (18), however, is not Gaussian (it is Gaussian conditional on ℛ). Also, obtaining closed forms for the associated covariance is generally difficult. Alegría and Porcu (2017) provide some special cases.
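For the Euclidean case, the transport effect can be sketched with a discrete random velocity, for which the expectation in the Lagrangian construction C(h, u) = E[C_S(h − Vu)] is an exact finite sum. The one‐dimensional setting, the Gaussian‐shaped spatial covariance, the velocities, and their probabilities are all hypothetical choices made for the illustration.

```python
import math

def lagrangian_cov(c_s, velocities, probs):
    """C(h, u) = E_V[C_S(h - V u)] for a discrete random velocity V (1-D space)."""
    return lambda h, u: sum(p * c_s(h - v * u) for v, p in zip(velocities, probs))

c_s = lambda h: math.exp(-h * h)   # hypothetical stationary spatial covariance
C = lagrangian_cov(c_s, velocities=[1.0, 2.0], probs=[0.7, 0.3])

downwind = C(1.0, 1.0)    # spatial lag aligned with the transport direction
upwind = C(1.0, -1.0)     # same spatial lag, reversed time lag
```

The two values differ markedly, which shows the lack of full symmetry, while C(h, u) = C(−h, −u) still holds as it must for any stationary covariance.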
6 CLASSES WITH SPECIAL FEATURES
Covariance functions derived from dynamic models have a long history that can be traced back to Yaglom (1948), Gandin and Boltenkov (1967) and Monin and Yaglom (1967). Representing the space–time process through stochastic integrals allows one to take into account physical properties of the process and adapt to specific applied problems. A wealth of examples and innovative ideas are provided by Christakos (1990, 1991b, 1992, 2000), and Christakos and Hristopulos (1998). More recently, such dynamical representations have been considered by Brown et al. (2000) through the concept of blurring.

Very interesting examples have been considered (under discrete‐time setting) by Baxevani, Podgórski, and Rychlik (2003), and a wealth of sophisticated examples is provided by Baxevani, Caires, and Rychlik (2009), Baxevani et al. (2011), and Ailliot, Baxevani, Cuzol, Monbet, and Raillard (2011). Other models with special features have been recently proposed by Hristopulos and Tsantili (2016) and Hristopulos and Agou (2019).
Stationary models of covariance functions have been used as building blocks to create more complex models for dependence, for instance, to take into account nonstationarity in space. For purely spatial processes, this idea was first proposed by Paciorek and Schervish (2006). The most general version of these models is provided in Porcu et al. (2010), who also extend the Paciorek–Schervish approach to space–time and in turn generalize Stein (2005a). Another path to nonstationarity has been pursued through spatial adaptation, that is, the parameters of a given family of covariance functions are allowed to vary smoothly with the spatial location (Kleiber & Nychka, 2012; Nychka, Wikle, & Royle, 2002). A nonstationary version of the Gneiting class G_α has been provided by Porcu, Mateu, and Bevilacqua (2007) and more recently by Schlather (2010). Another strategy to construct nonstationary covariance functions is through convolutions, as in Rodrigues and Diggle (2010), but surprisingly, we have not found a similar extension in the space–time framework. Finally, we note that the strategy of spatially adapting through spectral approaches has been adopted by Pintore and Holmes (2004), Fuentes et al. (2008), and Porcu et al. (2009).
We are not aware of any extensions of the type above when the spatial domain is the unit sphere embedded in a three‐dimensional Euclidean space. Certainly, the spatial adaptation of parameters could play an important role, and indeed a first attempt has been made in the spatial setting by Alegría, Cuevas, Diggle, and Porcu (2018) with the so‐called ℱ class that replaces the Matérn covariance function for processes defined over the sphere. Nonstationary models through spectral representations have been characterized by Estrade, Fariñas, and Porcu (2019). Finally, other models based on differential operators, but coupled with the chordal distance, have been proposed by Jun (2011), Jun and Stein (2007), and Hitczenko and Stein (2012).
7 MULTIVARIATE SPACE–TIME COVARIANCE FUNCTIONS
The literature on multivariate covariance functions has grown considerably, and we refer the reader to Genton and Kleiber (2015) for a comprehensive review. Here, we focus on multivariate covariance functions that are isotropic in the spatial component and symmetric in the temporal one. Throughout, the argument x ≥ 0 will denote either the Euclidean or the geodesic distance (in this last case, x = θ ∈ [0, π]), depending on the domain 𝒟 where the process is defined. We consider an m‐variate space–time random field Z(s, t) = (Z_1(s, t), …, Z_m(s, t))ᵀ, for s ∈ 𝒟 and t ∈ ℝ, that is isotropic in space and stationary and symmetric in time. Let C = [C_ij], i, j = 1, …, m, defined on X × ℝ, be a continuous matrix‐valued mapping, whose elements are defined as C_ij(x, u) = ℂov(Z_i(s1, t + u), Z_j(s2, t)), where X is either [0, ∞) (with x = ‖s1 − s2‖) or [0, π] (and x = θ(s1, s2)). According to that, C is isotropic (respectively geodesically isotropic, if 𝒟 = 𝕊^d) in space and stationary in time (Porcu et al., 2016b). The diagonal elements of C, denoted as C_ii, are called marginal covariances, whereas the off‐diagonal members C_ij are called cross‐covariances. Observe that the marginal covariance functions are positive definite, while the cross‐covariances, in general, are not. Certainly, any parametric representation of C must respect the non‐negative definite condition analogous to Equation (8).
Appendix A in Alegría et al. (2019) contains rich material about the spectral representations associated with C, for both the Euclidean and the spherical case, with $x = \|\cdot\|$ and $x = \theta$, respectively. For the Euclidean case, such spectral representations have been presented by Alonso‐Malaver, Porcu, and Giraldo (2015), while the spherical case has been addressed in Appendix A of Alegría et al. (2019).
[…] and […], being merely spatial and temporal matrix‐valued covariances, respectively, such that […] […] is a univariate spatial covariance, and […] a univariate temporal covariance function. Finally, Alegría et al. (2019) call the mapping C m‐separable if […], and a matrix A, as previously defined. Clearly, the special case […] offers complete space–time m‐separability as previously discussed.
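As a minimal sketch of complete space–time separability, assume that every entry of the matrix‐valued covariance factors as $C_{ij}(x, u) = A_{ij}\, C_S(x)\, C_T(u)$, with $A$ a positive definite $m \times m$ matrix and $C_S$, $C_T$ exponential covariances. This factorization, the exponential choice, and all function names below are illustrative assumptions, not the construction of Alegría et al. (2019). On a finite set of sites and times the covariance matrix is then a Kronecker product.

```python
import numpy as np

def exp_cov(dist, scale):
    """Exponential covariance exp(-dist / scale) (illustrative choice)."""
    return np.exp(-np.asarray(dist) / scale)

def separable_mv_cov(sites, times, A, scale_s=1.0, scale_t=1.0):
    """Covariance matrix of a fully separable m-variate space-time model.

    The entry for (variable i, site a, time p) versus (variable j, site b, time q)
    is A[i, j] * C_S(||s_a - s_b||) * C_T(|t_p - t_q|); rows are ordered with the
    variable index slowest and the time index fastest.
    """
    d_s = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
    d_t = np.abs(times[:, None] - times[None, :])
    C_s = exp_cov(d_s, scale_s)   # purely spatial factor
    C_t = exp_cov(d_t, scale_t)   # purely temporal factor
    return np.kron(A, np.kron(C_s, C_t))

rng = np.random.default_rng(0)
sites = rng.uniform(size=(6, 2))          # 6 hypothetical planar sites
times = np.arange(4.0)                    # 4 time points
A = np.array([[1.0, 0.6], [0.6, 2.0]])    # hypothetical colocated covariance, m = 2
Sigma = separable_mv_cov(sites, times, A)
min_eig = np.linalg.eigvalsh(Sigma).min()
```

Because the eigenvalues of a Kronecker product are products of the eigenvalues of its factors, positive definiteness of $A$, $C_S$, and $C_T$ carries over to the full matrix, and determinants and inverses can be computed factor by factor.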
7.1 Building nonseparable multivariate space–time covariance functions
The construction principles for multivariate space–time covariance functions are nicely summarized in Alegría et al. (2019), and we report here the essential content thereof.

[…] is often imposed. There has been substantial criticism of this model, as reported by Gneiting, Kleiber, and Schlather (2010) and Daley, Porcu, and Bevilacqua (2015). For example, the smoothness of any component of the multivariate field is restricted to that of the roughest underlying univariate process. Moreover, the number of parameters can quickly become massive as the number of components increases.
[…]. The other case can be obtained analogously using translations instead of rotations. Let Z be an m‐variate Gaussian field on […] with covariance […]. Let ℛ be a random orthogonal (d × d) matrix with a given probability law. Let […] be a univariate space–time covariance parameterized by […]. Let […], for $i, j = 1, \ldots, m$. For $|\rho_{ij}| \le 1$ and $\rho_{ii} = 1$, and $\sigma_{ii} > 0$, find the parametric restrictions such that $C : X \times \mathbb{R} \to \mathbb{R}^{m \times m}$, defined through […]
[…], Alegría et al. (2019) consider the case […] for $d \le n + 1$. Other examples and generalizations are provided by Alegría et al. (2019). In the Euclidean case, with $x = \|\cdot\|$, multivariate Gneiting functions have been proposed by Bourotte et al. (2016), where the functions $f_{ij}$ in the Gneiting compositions (14) belong either to the Matérn or to the generalized Cauchy class.
Scale mixture techniques can be adapted to the multivariate space–time setting by using the construction principle in Porcu and Zastavnyi (2011). Latent dimension approaches have been proposed by Porcu et al. (2006), Apanasovich and Genton (2010), and Porcu and Zastavnyi (2011) for the Euclidean case. The spherical case has been addressed in Alegría et al. (2019).
8 ESTIMATION OF SPACE–TIME DEPENDENCIES
Most of the approaches and estimation techniques proposed in the last 10 years are motivated by reaching a compromise between statistical efficiency and computational complexity. The latter has become an important issue, given the availability of massive (and multivariate) data sets that are often defined over large portions of the globe and repeatedly measured over time. A beautiful illustration of geostatistical estimation techniques for massive spatial data sets is provided by Sun, Li, and Genton (2012). This section departs from their treatment in two directions. First, we update the approaches described in Sun et al. (2012) with extensions to the space–time setting. Second, we briefly discuss estimation techniques that have been proposed for the spheres cross time problem (Porcu, Alegría, & Furrer, 2018). To set up the discussion, we first update the spatial methods with space–time approaches related to separability of covariance functions, covariance matrix tapering, composite likelihoods, spectral techniques, and approximating the random field with a Gaussian Markov random field.
Estimation methods for space–time covariances for large data are conceptually similar to classical approaches. However, they differ in the way statistical information (read: full maximum likelihood) is sacrificed in favor of computational gains. It is difficult to judge which method is statistically best, as the answer depends on many critical aspects. Approximating covariance functions with computationally tractable ones is a convenient form of deliberate misspecification of the true underlying covariance structure. Classical examples are covariance tapering, where the misspecification consists of using a direct product of the true covariance function and a compactly supported correlation function, and the use of compactly supported covariance functions. Spectral methods rely on truncation of the spectral expansion and so lose information on the geometric properties and the sample paths of the associated random field. Finally, Markov random fields coupled with SPDEs approximate a continuous process with a process defined over a lattice for which the conditional distributions only depend on nearby neighbors, leading to sparseness of the precision matrix, that is, the inverse of the covariance matrix.
A review of composite likelihood methods has been provided by Varin, Reid, and Firth (2011). Composite likelihood for space–time data is specifically addressed by Bevilacqua, Gaetan, Mateu, and Porcu (2012b), and a numerical comparison of composite likelihoods for space–time is provided by Bevilacqua and Gaetan (2015). Extensions of composite likelihood to multivariate space–time have been proposed by Bourotte et al. (2016). Also, there have been a number of extensions to space–time wrapped Gaussian fields (Alegría, Bevilacqua, & Porcu, 2016), space–time multivariate Markov models (Gao & Song, 2011), Bayesian versions of space–time composite likelihood (Benoit, Allard, & Mariethoz, 2018; Pauli, Racugno, & Ventura, 2011; Ribatet, Cooley, & Davison, 2012), and hidden Markov models (Ranalli, Lagona, Picone, & Zambianchi, 2018). Composite likelihood for space–time extremes has been adopted by Huser and Davison (2014), Davison, Padoan, and Ribatet (2012), Padoan, Ribatet, and Sisson (2010), Davis, Klüppelberg, and Steinkohl (2013b), Castruccio, Huser, and Genton (2016), and Genton, Ma, and Sang (2011). The review by Varin et al. (2011) presents Vecchia's block likelihood (Caragea & Smith, 2006; Katzfuss & Guinness, 2017; Stein, Chi, & Welty, 2004; Vecchia, 1988) as a composite likelihood approach. A recent space–time composite likelihood approach has been proposed by Bai, Song, and Raghunathan (2012). Finally, there has also been some work on tests based on composite likelihood (Bevilacqua et al., 2010). Computational aspects related to space–time covariance functions have been discussed by De Cesare, Myers, and Posa (2002), De Iaco, Myers, Palma, and Posa (2010), and De Iaco and Posa (2012).
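To make the idea concrete, the following is a minimal sketch of a pairwise (composite) likelihood for a zero‐mean Gaussian space–time field. The separable exponential covariance, the lag cut‐offs `max_h` and `max_u`, and all function names are illustrative assumptions and do not reproduce the estimators of any of the papers cited above.

```python
import numpy as np

def st_cov(h, u, sigma2, a_s, a_t):
    """Separable exponential space-time covariance (illustrative assumption)."""
    return sigma2 * np.exp(-h / a_s) * np.exp(-u / a_t)

def pairwise_neg_loglik(params, z, sites, times, max_h=0.5, max_u=1.0):
    """Negative pairwise log-likelihood for zero-mean Gaussian data.

    Sums bivariate Gaussian negative log-densities over all pairs of
    observations whose spatial and temporal lags are at most max_h and max_u.
    Observations must have distinct (site, time) coordinates.
    """
    sigma2, a_s, a_t = params
    total = 0.0
    for i in range(len(z) - 1):
        h = np.linalg.norm(sites[i + 1:] - sites[i], axis=1)
        u = np.abs(times[i + 1:] - times[i])
        keep = (h <= max_h) & (u <= max_u)
        c = st_cov(h[keep], u[keep], sigma2, a_s, a_t)
        zi, zj = z[i], z[i + 1:][keep]
        det = sigma2**2 - c**2   # determinant of the 2 x 2 covariance of the pair
        quad = (sigma2 * (zi**2 + zj**2) - 2.0 * c * zi * zj) / det
        total += np.sum(np.log(2.0 * np.pi) + 0.5 * np.log(det) + 0.5 * quad)
    return total

rng = np.random.default_rng(3)
sites = np.repeat(rng.uniform(size=(5, 2)), 4, axis=0)   # 5 sites, 4 times each
times = np.tile(np.arange(4.0), 5)
z = rng.standard_normal(20)   # placeholder data, for illustration only
value = pairwise_neg_loglik((1.0, 0.3, 1.0), z, sites, times)
```

Each term only requires a 2 × 2 determinant and quadratic form, so the cost grows with the number of retained pairs rather than cubically in the sample size; the function could be handed to a generic optimizer such as `scipy.optimize.minimize` to obtain parameter estimates.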
Tapering of covariance functions has been well understood in the spatial setting. The literature can be separated into tapering with ultimate focus on prediction (e.g., Furrer, Genton, & Nychka, 2006) or on estimation (e.g., Furrer, Bachoc, & Du, 2016; Kaufman, Schervish, & Nychka, 2008), with asymptotic results based on infill‐domain asymptotics and increasing‐domain asymptotics, mostly in the framework of purely spatial processes. Extensions to Bayesian versions have been provided by Shaby and Ruppert (2012), and Sang and Genton (2014) studied covariance tapering for max‐stable processes. The extension to space–time tapering has been studied only to a limited extent (Fassò, Finazzi, & Bevilacqua, 2011; Finazzi & Fassò, 2014), and some preliminary ideas of covariance tapering can be found in Guerci (2014).
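A minimal sketch of spatial covariance tapering follows, assuming an exponential "true" covariance and a Wendland‐type taper $(1 - r)_+^4 (1 + 4r)$, which is a valid correlation function in dimensions up to three. The sites, ranges, and names are hypothetical choices for illustration.

```python
import numpy as np

def wendland_taper(h, radius):
    """Compactly supported Wendland-type correlation, zero for h >= radius."""
    r = np.asarray(h) / radius
    return np.where(r < 1.0, (1.0 - r) ** 4 * (1.0 + 4.0 * r), 0.0)

rng = np.random.default_rng(1)
sites = rng.uniform(0, 10, size=(200, 2))   # hypothetical planar sites
h = np.linalg.norm(sites[:, None] - sites[None, :], axis=-1)

C_true = np.exp(-h / 2.0)                          # assumed true covariance
C_tap = C_true * wendland_taper(h, radius=1.5)     # direct (elementwise) product

sparsity = np.mean(C_tap == 0.0)                   # fraction of exact zeros
min_eig = np.linalg.eigvalsh(C_tap).min()          # stays positive definite
```

The elementwise product of two positive definite matrices is positive definite, so the tapered matrix remains a valid covariance while most of its entries are exactly zero and sparse matrix algorithms become applicable.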
In the framework of theoretical considerations for maximum likelihood estimation, there are two different schools on how the sampling locations are considered. On the one hand, there is the classical increasing‐domain asymptotics school, where the domain grows while the density of the locations does not change. On the other hand, there is infill asymptotics, where the spatial domain is fixed and the sampling locations become increasingly dense within that domain. Under infill asymptotics, not all parameters are estimable. Infill asymptotics seems natural for sampling within a particular environmental framework (sediment samples of a lake, global meteorological variables), but the theory is rather cumbersome and still lacks elements for multivariate processes. It is not clear what the optimal asymptotic framework for space–time processes should be. Considering a space–time increasing‐domain asymptotic setting amounts to considering a process that is defined over the (d + 1)‐dimensional Euclidean space. Thus, the general results provided by Mardia and Marshall (1984) in terms of consistency and normality of the ML estimator of the parameters of a given family of covariance functions still hold. This path is taken, in space–time, by Bevilacqua, Gaetan, Mateu, and Porcu (2012a). A natural way to do space–time asymptotics would be to take an infill asymptotic approach for space while adopting an increasing‐domain approach for the temporal component. We are not aware of any contributions of this type in the literature. Space–time infill asymptotics with the space–time Matérn covariance function has been considered by Ip and Li (2017), and very recently by Faouzi, Porcu, and Bevilacqua (2020), who considered a class of space–time covariance functions having dynamical radii.
Dynamical approaches have been especially popular for the analysis of climate data, where climate model output is produced over a regular grid on the sphere and repeatedly across time. As noted by Porcu et al. (2018), a very popular approach to drastically reduce the complexity is to separate the spatial and temporal components and to describe the dynamics of the process by specifying its evolution as a function of the past. Variability is then achieved by assuming a random spatial innovation. For climate data, temporal dynamics have been modeled through covariates only (Furrer, Sain, Nychka, & Meehl, 2007; Geinitz, Furrer, & Sain, 2015). Other relevant references are Cressie and Wikle (2011), Castruccio and Stein (2013), Fassò et al. (2016), and Finazzi and Fassò (2014). Recent work on satellite data has proposed to couple the dynamical approach with dimension reduction techniques, and in particular, fixed rank kriging (FRK; see Cressie & Johannesson, 2008; Nguyen, Katzfuss, Cressie, & Braverman, 2014), to further reduce the parameter dimensionality and to achieve a fit for very large data sets (fixed rank filtering, see Kang, Cressie, & Shi, 2010; Cressie, Shi, & Kang, 2010).
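The dynamical idea can be sketched with the simplest possible evolution, a first‐order autoregression in time driven by spatially correlated innovations. The scalar coefficient `phi`, the exponential innovation covariance, and the planar sites are assumptions made for illustration; the models in the references above are considerably richer.

```python
import numpy as np

def innovation_cov(sites, scale):
    """Exponential spatial covariance of the innovations (illustrative choice)."""
    d = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
    return np.exp(-d / scale)

def simulate_dynamic_field(Q, n_times, phi, rng):
    """Simulate Z_t = phi * Z_{t-1} + eps_t with eps_t ~ N(0, Q), |phi| < 1."""
    L = np.linalg.cholesky(Q)
    n = Q.shape[0]
    Z = np.empty((n_times, n))
    # Start from the stationary distribution N(0, Q / (1 - phi^2)).
    Z[0] = L @ rng.standard_normal(n) / np.sqrt(1.0 - phi**2)
    for t in range(1, n_times):
        Z[t] = phi * Z[t - 1] + L @ rng.standard_normal(n)
    return Z

def implied_cov(Q, phi, lag):
    """Space-time covariance implied by the model: phi^|lag| * Q / (1 - phi^2)."""
    return phi ** abs(lag) * Q / (1.0 - phi**2)

rng = np.random.default_rng(2)
sites = rng.uniform(size=(25, 2))
Q = innovation_cov(sites, scale=0.3)
Z = simulate_dynamic_field(Q, n_times=50, phi=0.7, rng=rng)
```

With a scalar autoregressive coefficient the implied covariance is the product of a spatial and a temporal factor, so this particular specification is space–time separable; only the spatial matrix `Q` ever needs to be factorized.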

[…] the Fourier process for wavenumber k and latitude […] the spectrum. For any pair of latitudes (ϕ, ϕ′), the function ρ(k; ϕ, ϕ′) defines a spectral correlation (also called coherence). The computational aspects of these approaches have been analyzed by Jun and Stein (2008) and Castruccio and Stein (2013). Recent extensions have been provided by Castruccio and Genton (2014, 2016), Castruccio and Guinness (2017), Jeong, Castruccio, Crippa, and Genton (2017), and Horrell and Stein (2015).
8.1 Implicit models
[…], the authors study the SPDE defined through
[…] (19)
[…] is a Gaussian white noise process on […]. Clearly, specific definitions and assumptions are needed depending on whether […] is a planar surface, a sphere, or in general a manifold.
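For orientation, and as an assumption about the general form of (19) rather than a quotation of it, the stationary SPDE studied by Lindgren et al. (2011) on $\mathbb{R}^d$ is commonly written as follows, with $\kappa > 0$ a scale parameter, $\Delta$ the Laplacian, $\mathcal{W}$ Gaussian white noise, and $\alpha$ tied to the Matérn smoothness $\nu$:

$$
(\kappa^2 - \Delta)^{\alpha/2} X(s) = \mathcal{W}(s), \qquad s \in \mathbb{R}^d, \qquad \alpha = \nu + d/2 .
$$

Its stationary solutions have a Matérn covariance,

$$
C(h) = \frac{\sigma^2}{2^{\nu - 1}\Gamma(\nu)} \, (\kappa \|h\|)^{\nu} K_{\nu}(\kappa \|h\|), \qquad \sigma^2 = \frac{\Gamma(\nu)}{\Gamma(\nu + d/2)\,(4\pi)^{d/2}\,\kappa^{2\nu}},
$$

where $K_\nu$ is the modified Bessel function of the second kind. The parameterization used in (19) may differ, for example through an additional scaling of the noise term.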
In order to provide a computationally convenient approximation of (19) for integer‐valued α, Lindgren et al. (2011) propose a computationally efficient Hilbert space approximation. Namely, the weak solution to (19) is found in an approximation space spanned by a set of basis functions. The computational efficiency is then attained by imposing local basis functions, that is, basis functions which are compactly supported. This all boils down to approximating the field X with a Gaussian Markov random field with a highly sparse precision matrix. This idea is then generalized in Bolin and Lindgren (2011) through nested SPDE models and in Bolin and Kirchner (2019) to arbitrary α > d/2. This approach has then been coupled with the Bayesian framework by Cameletti et al. (2012) to provide a space–time model. A direct space–time formulation of the SPDE approach is also suggested in Lindgren et al. (2011) and elaborated in Krainski et al. (2018). Extension of the SPDE approach to space–time has been considered recently by Vergara, Allard, and Desassis (2018). We note that all SPDE models impose an inherent discretization on the problem, and a precise covariance function of the approximation might not be available.
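The sparsity of the resulting precision matrix can be illustrated with a deliberately simplified sketch. It uses a finite‐difference discretization on a one‐dimensional lattice with α = 2, not the finite element construction of Lindgren et al. (2011); the white‐noise scaling, the boundary treatment, and the function name are assumptions made for illustration.

```python
import numpy as np

def lattice_precision(n, kappa, h=1.0):
    """Precision matrix of a 1-D Gaussian Markov random field.

    Discretizes (kappa^2 - Laplacian) X = W on n lattice nodes with spacing h,
    using second differences and treating values outside the lattice as zero.
    White noise averaged over a cell of length h has variance 1 / h, so the
    precision of X is h * K^T K, where K is the discretized operator.
    """
    lap = (np.diag(-2.0 * np.ones(n))
           + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / h**2
    K = kappa**2 * np.eye(n) - lap      # tridiagonal operator matrix
    return h * K.T @ K                  # pentadiagonal precision matrix

Q = lattice_precision(n=30, kappa=1.0)
cov = np.linalg.inv(Q)                  # implied covariance, dense
nonzero_fraction = np.count_nonzero(Q) / Q.size
```

Only entries within two lattice steps of the diagonal are nonzero in `Q`, which encodes conditional independence given nearby neighbors, whereas the implied covariance matrix has no zero band and is available only through inversion.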
9 OUTLOOK
Although we present a comprehensive review of space–time covariance functions, we foresee much new, creative work and many open problems. A list of open problems related to space–time modeling is provided by Porcu et al. (2018).
A promising field of research is represented by the statistical analysis of processes that exhibit cyclic behaviors over time and/or space. This has been advocated in recent papers: Random fields defined over the product of a spatial domain (a path or a planar surface) and time wrapped over the circle have been considered by Benigni and Furrer (2012) to analyze improvised explosive device attacks along a main supply route in Baghdad, and by Shirota and Gelfand (2017) to analyze daily crime events in San Francisco. Similar approaches have been adopted by Mastrantonio et al. (2019), who consider Bayesian hierarchical modeling where seasonality is modeled through conditioning sets. A similar approach under the Bayesian framework has been adopted by White and Porcu (2019a). Very recently, Porcu, Cleanthous, Georgiadis, White, and Alegría (2019) have considered random fields defined over the hypertorus, which is in turn obtained through the product of hyperspheres of possibly different dimensions. This work opens many questions related to the statistical analysis of such processes.
Many applications are concerned with prediction of the spatial process, and the modeling and estimation of the covariance function is just a means to an end. Hence, the covariance function and the parameters are not necessarily needed for interpretation, so approximations such as misspecifications are suitable provided that the computational gain justifies the predictive loss. As argued by, for example, Furrer et al. (2016), it is important that the misspecification is the same for the estimation and prediction. But, other than that, it is virtually impossible to provide guidelines for useful approximation routes. Quite often, available computing resources determine the maximum possible flexibility of covariance functions (e.g., dimension of the parameter space) or estimation precision (e.g., the number of likelihood evaluations that can be carried out). The parametric covariances of Gerber et al. (2017) and Heaton et al. (2019) are mere attempts to capture the overall dependency structure. If excessive smoothing is not acceptable, nonparametric models are a promising option (Gerber, de Jong, Schaepman, Schaepman‐Strub, & Furrer, 2018), but such models are beyond the scope of this review.
ACKNOWLEDGMENTS
We thank the Associate Editor and two Referees, whose work has improved earlier versions of the manuscript. Partial support was provided by FONDECYT grant 1130647, Chile, and by the Millennium Science Initiative of the Ministry of Economy, Development, and Tourism, grant “Millennium Nucleus Center for the Discovery of Structures in Complex Data” for Emilio Porcu. This work was supported by the Swiss National Science Foundation (Grant 175529) for Reinhard Furrer.
CONFLICT OF INTEREST
The authors have declared no conflicts of interest for this article.
AUTHOR CONTRIBUTIONS
Emilio Porcu: Conceptualization; investigation; methodology; supervision; visualization; writing‐original draft; writing‐review and editing. Reinhard Furrer: Conceptualization; investigation; methodology; supervision; visualization; writing‐review and editing. Douglas Nychka: Conceptualization; investigation; methodology; supervision; visualization; writing‐review and editing.
RELATED WIREs ARTICLE
Covariance structure of spatial and spatiotemporal processes





[…] is positive definite on the real line for every fixed […], with […], and k is positive and integrable in […]