## 1. Introduction

[2] Multivariate spatial problems are frequently encountered in environmental applications, where we usually have a set of, say, *n* monitoring stations or gauged sites and, at each of those sites, measurements of *p* pollutants. Therefore, for a given time period, concatenating the measurements into a single column vector, the observations comprise an *np* × 1 vector. (In practice, not all pollutants are observed at all locations, creating misalignment or missing data issues.) Here we concentrate on the spatial aspect of these observations. Usually, on the basis of the information from the gauged sites, interest lies in predicting the *p* different processes at locations where they are not measured.

[3] Multivariate spatial modeling can be implemented in various ways. For instance, one could employ a Markov random field approach, such as that applied by *Mardia* [1988] and, more recently, by *Gelfand and Vounatsou* [2003, and references therein]. This approach is most commonly used for areal data and for regular grid or lattice data. It results in a joint multivariate distribution for the data, which is typically assumed to be normal.

[4] A different formulation is to envision the data as arising from a multivariate spatial process. This approach presumes that a realization of the process consists of *p* dependent surfaces over the study region; we only observe the values of these surfaces at a set of *n* locations. Again, a multivariate distribution results for the observed data. However, specification of a valid multivariate process is more demanding in that the process is determined through its associated finite-dimensional distributions. Consistency of such determinations with regard to the entire uncountably infinite-dimensional joint distribution is a nontrivial matter. However, with Gaussian processes, consistency only requires the use of a valid cross-covariance function [see, e.g., *Wackernagel*, 1998, and references therein]. With interest in prediction at arbitrary new locations, multivariate processes seem the more attractive distributional framework.

[5] In what follows, we work with a multivariate Gaussian process employing a stationary cross-covariance specification. In a series of papers by N. Le, J. Zidek, and colleagues [see, e.g., *Brown et al.*, 1994], this restriction is removed by viewing the joint covariance matrix for the data as varying through an inverse Wishart distribution centered at the covariance matrix which would arise using a stationary cross-covariance function. While the restriction is removed, so is the process notion; we merely achieve a multivariate distribution for the data. In fact, the randomly realized covariance matrix has no connection to a covariance function. The connection with spatial separation is sacrificed; e.g., we can expect to see many pairs of locations for which *d*_{ij}, the distance between **s**_{i} and **s**_{j}, is less than *d*_{i′j′} and yet cov[**Y**(**s**_{i}), **Y**(**s**_{j})] < cov[**Y**(**s**_{i′}), **Y**(**s**_{j′})]. Whether this consequence is desirable is likely application dependent. Alternative approaches to removing stationarity are mentioned by *Gelfand et al.* [2003, section 4]. Currently, approaches to nonstationarity through deformation [see *Sampson and Guttorp*, 1992; *Schmidt and O'Hagan*, 2003, and references therein] have only been implemented for univariate data with independent replications of the process.
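The loss of the distance–covariance link under such a construction can be illustrated numerically: drawing the joint covariance from an inverse Wishart centered at a stationary exponential covariance yields a matrix whose entries need no longer decrease with distance. The following sketch (plain `numpy`/`scipy`; the exponential center, the *n* = 5 univariate sites, and the degrees-of-freedom choice are all our own illustrative assumptions, not taken from the papers cited) shows the construction:

```python
import numpy as np
from scipy.stats import invwishart

rng = np.random.default_rng(1)

# Stationary "center": exponential covariance over n = 5 sites (univariate case).
locs = rng.uniform(size=(5, 2))
D = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
Sigma0 = np.exp(-D)  # covariance decays monotonically with distance

# Inverse Wishart centered at Sigma0: E[Sigma] = scale / (df - n - 1).
n, df = 5, 12
Sigma = invwishart(df=df, scale=(df - n - 1) * Sigma0).rvs(random_state=1)

# Sigma is just a random positive definite matrix: its entries are not tied
# to a covariance function, so pairs with d_ij < d_kl but
# Sigma[i, j] < Sigma[k, l] can occur, severing the link between
# covariance and spatial separation.
```

The realized `Sigma` remains symmetric and positive definite, but that is all that is guaranteed; no monotone relationship with `D` survives the draw.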

[6] To formalize the definition of a valid cross-covariance function, we need to model not only the dependence of measurements across locations (the usual spatial setting) but also the dependence among measurements at each location. That is, if **Y**^{T}(**s**) = [*Y*_{1}(**s**), *Y*_{2}(**s**), ⋯, *Y*_{p}(**s**)] and we assume stationarity, we need to specify a *p* × *p* matrix function *C*(**s** − **s**′), where *C*(**s** − **s**′)_{ll′} = cov[*Y*_{l}(**s**), *Y*_{l′}(**s**′)], such that for any *n* and any locations **s**_{1}, **s**_{2}, ⋯, **s**_{n}, the resulting *np* × *np* covariance matrix for **Y**^{T} = [**Y**(**s**_{1}), **Y**(**s**_{2}), ⋯, **Y**(**s**_{n})] is positive definite. Note that *C* need not be symmetric; rather, *C*(**s**′ − **s**) = *C*^{T}(**s** − **s**′). Moreover, the goal is to enable flexible and computationally tractable covariance structures. Since environmental data are rarely found to follow a Gaussian distribution, we assume that a suitable transformation is carried out in order to achieve approximate normality. Our approach can be extended to accommodate heavier-tailed distributions through scale mixing of Gaussian processes.
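To make the positive definiteness requirement concrete, the following minimal `numpy` sketch assembles the *np* × *np* covariance matrix from a matrix-valued cross-covariance and checks its eigenvalues. The separable exponential form *C*(**h**) = **T** exp(−ϕ∥**h**∥) and all parameter values here are purely illustrative assumptions, not the specification developed in this paper:

```python
import numpy as np

def cross_cov(h, T, phi):
    """Illustrative stationary cross-covariance: C(h) = T * exp(-phi * ||h||).
    T is a p x p positive definite matrix; this separable choice is valid."""
    return T * np.exp(-phi * np.linalg.norm(h))

def joint_cov(locs, T, phi):
    """Assemble the np x np covariance of Y = [Y(s_1), ..., Y(s_n)]
    from the p x p blocks C(s_i - s_j)."""
    n, p = len(locs), T.shape[0]
    Sigma = np.zeros((n * p, n * p))
    for i in range(n):
        for j in range(n):
            Sigma[i*p:(i+1)*p, j*p:(j+1)*p] = cross_cov(locs[i] - locs[j], T, phi)
    return Sigma

locs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
T = np.array([[1.0, 0.5], [0.5, 2.0]])   # covariance among the p = 2 variables
Sigma = joint_cov(locs, T, phi=0.7)

# Positive definiteness check: all eigenvalues strictly positive.
assert np.all(np.linalg.eigvalsh(Sigma) > 0)
```

A check like this can screen a candidate function for a particular location set, but validity demands that it pass for every *n* and every configuration of locations, which is why constructive classes of valid cross-covariance functions are needed.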

[7] With regard to the problem of prediction using multivariate spatial processes, let {**Y**(**s**): **s** ∈ *D*} represent a multivariate spatial random field, where **Y**(**s**) ∈ ℜ^{p} and *D* ⊂ ℜ^{d}, with, usually, *d* = 1, 2, or 3. A well-known approach for prediction at an ungauged location is cokriging, where the predictor of the process at, say, **s**_{0} is a linear combination of all the available data values of all *p* variables [*Cressie*, 1993]. In cokriging the prediction is based on a known cross-covariance matrix, an assumption which, in practice, is not realistic. Usually the matrix is estimated from the data, so the prediction does not take into account any uncertainty associated with estimating the covariance structure. In this regard, *Cressie* [1993] discusses potential problems with cross variograms with respect to the scaling of the different variables in the vector **Y**(.).
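For a zero-mean Gaussian field with a (pretended) known cross-covariance, the simple cokriging predictor at **s**_{0} is the conditional mean **c**_{0}^{T}**Σ**^{−1}**Y**, a linear combination of all *np* data values. The sketch below implements this under an assumed separable exponential cross-covariance; the data values, sites, and parameters are all hypothetical illustrations:

```python
import numpy as np

def exp_cross_cov(h, T, phi):
    # Illustrative separable cross-covariance: C(h) = T * exp(-phi * ||h||).
    return T * np.exp(-phi * np.linalg.norm(h))

def simple_cokrige(s0, locs, Y, T, phi):
    """Zero-mean (simple) cokriging: E[Y(s0) | data] = c0^T Sigma^{-1} y,
    where Sigma is the np x np covariance of the stacked data vector y and
    c0 collects the cross-covariances between Y(s0) and each Y(s_i)."""
    n, p = len(locs), T.shape[0]
    Sigma = np.zeros((n * p, n * p))
    c0 = np.zeros((n * p, p))
    for i in range(n):
        c0[i*p:(i+1)*p, :] = exp_cross_cov(locs[i] - s0, T, phi)
        for j in range(n):
            Sigma[i*p:(i+1)*p, j*p:(j+1)*p] = exp_cross_cov(locs[i] - locs[j], T, phi)
    y = Y.reshape(-1)   # stack Y(s_1), ..., Y(s_n) into an np-vector
    return c0.T @ np.linalg.solve(Sigma, y)

locs = np.array([[0., 0.], [1., 0.], [0., 1.]])
Y = np.array([[1.0, 0.5], [0.8, 0.2], [1.2, 0.7]])   # p = 2 variables per site
T = np.array([[1.0, 0.5], [0.5, 2.0]])
pred = simple_cokrige(np.array([0.5, 0.5]), locs, Y, T, phi=1.0)
```

With no nugget, the predictor interpolates: at an observed site it returns the observed vector exactly. The criticism in the text applies here directly: `T` and `phi` are treated as known, so `pred` carries no estimation uncertainty.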

[8] We adopt a Bayesian approach here in order to more accurately capture the uncertainty in our model specification. In particular, all unknowns in the model are assumed to arise at random from some (prior) distribution. Using Bayes's theorem, this uncertainty is propagated to (posterior) inference about process unknowns and predictions under the process. Wider interval estimates will usually result, but we would argue that they more realistically express the variability associated with the inference. We choose fairly vague specifications for the prior distributions of our unknowns. Thus our inference becomes fairly insensitive to these choices. Still, a thorough analysis of the data and associated prediction requires some examination of prior sensitivity.

[9] As a result, we are left with the selection of a specific multivariate Gaussian spatial process model, i.e., with the selection of a valid cross-covariance function. We adopt an approach based on the linear coregionalization model first introduced by *Matheron* [1982], with cross covariograms of the form Σ_{m=1}^{r}**T**_{m}*g*_{m}(∥**s** − **s**′∥), where the *g*_{m}s are known variograms, the **T**_{m}s are unknown nonnegative definite matrices, and *r* (≤ *p*) gives the number of structures. Each matrix **T**_{m} may be thought of as a scalar product matrix between variables. Therefore covariation between variables is captured in a lower-dimensional space (*r* < *p*), defined by principal components analysis of **T**_{m} [*Goulard and Voltz*, 1992]. *Goulard and Voltz* [1992] describe a least squares technique to fit such models. Their aim is to estimate the unknown matrices **T**_{m}; therefore they plug in empirical variogram estimates for the functions *g*_{m}, ignoring the uncertainty in the estimation of these variogram functions. For more details on coregionalization models and associated classical estimation procedures, see *Wackernagel* [1998]. Our use of the linear coregionalization model is not dimension reduction. Rather, we set *r* = *p* to obtain a rich constructive class of valid cross-covariance functions, as we describe in section 3.1.
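The constructive flavor of the coregionalization model with *r* = *p* can be sketched as follows: taking rank-one **T**_{m} = **a**_{m}**a**_{m}^{T} (the columns of a full-rank matrix **A**) together with stationary correlation functions, one per structure, yields a valid *np* × *np* covariance in which each component has its own decay parameter. Here we use exponential correlations exp(−ϕ_{m}*d*) in place of the variograms *g*_{m}; the matrix `A` and the decays are illustrative assumptions, not fitted values:

```python
import numpy as np

rng = np.random.default_rng(0)

def lmc_joint_cov(locs, A, phis):
    """Linear coregionalization with r = p:
    C(h) = sum_m a_m a_m^T exp(-phi_m ||h||), a_m the m-th column of A.
    Equivalently Y(s) = A w(s) with independent unit-variance processes w_m,
    each with its own decay phi_m."""
    n, p = len(locs), A.shape[0]
    D = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)  # n x n distances
    Sigma = np.zeros((n * p, n * p))
    for m in range(p):
        Tm = np.outer(A[:, m], A[:, m])   # rank-one coregionalization matrix
        Rm = np.exp(-phis[m] * D)         # correlation for latent process m
        Sigma += np.kron(Rm, Tm)          # site-major p x p blocks
    return Sigma

locs = rng.uniform(size=(4, 2))
A = np.array([[1.0, 0.0], [0.6, 0.8]])           # full rank, p = 2
Sigma = lmc_joint_cov(locs, A, phis=[0.5, 3.0])  # distinct decays per structure

# Valid covariance: symmetric and positive definite for any location set.
assert np.all(np.linalg.eigvalsh(Sigma) > 0)
```

Because each latent process contributes its own ϕ_{m}, the components of **Y**(**s**) need not share a spatial range, which is the flexibility exploited in section 3.1.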

[10] *Mardia and Goodall* [1993] describe modeling of multivariate spatiotemporal data. They propose a separable covariance structure; in other words, the covariance structure of the multivariate process can be factorized as the product of the correlation across locations and the covariance among variables. That is, for the data vector **Y**, **Σ**_{Y} = **R** ⊗ **T**, where *R*_{ij} = ρ(**s**_{i} − **s**_{j}; ϕ), with ρ being a valid correlation function in two dimensions, and where **T** is the covariance matrix for **Y**(**s**). The linear coregionalization model includes this form as the special case in which all of the *g*_{m} are identical. One implication of this separable form is that the cross covariances are symmetric, i.e., cov[*Y*_{k}(**s**_{i}), *Y*_{l}(**s**_{j})] = cov[*Y*_{l}(**s**_{i}), *Y*_{k}(**s**_{j})], which can be restrictive. Also, cov[*Y*_{l}(**s**), *Y*_{l′}(**s**′)]/cov[*Y*_{l}(**s**), *Y*_{l}(**s**′)] does not depend on **s** − **s**′, i.e., on the spatial scale. Another limitation is that the spatial range for each component of the process **Y**(**s**) is the same. That is, if ρ is isotropic, say, ρ(**s**_{i} − **s**_{j}) = ρ(*d*_{ij}; ϕ), where ϕ is a decay parameter (equivalently a range parameter), then cov[*Y*_{l}(**s**_{i}), *Y*_{l}(**s**_{j})] = *T*_{ll}ρ(*d*_{ij}; ϕ). As *l* varies, we have a different process variability but a common process range. Finally, *Mardia and Goodall* [1993] suggest inference based on the maximum likelihood estimator, which will be problematic if the likelihood is multimodal.
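The separable special case and its implications can be seen directly from the Kronecker product. In the sketch below (an illustrative exponential ρ and hypothetical values for **T** and ϕ), every cross-covariance block is *R*_{ij}**T**, so the blocks are symmetric and all covariance ratios across variables are constant in distance:

```python
import numpy as np

def separable_cov(locs, T, phi):
    """Separable specification Sigma_Y = R (x) T, with R_ij = exp(-phi * d_ij).
    Every component shares the single decay phi, hence a common spatial range."""
    D = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
    R = np.exp(-phi * D)
    return np.kron(R, T)   # site-major blocks: block (i, j) is R_ij * T

locs = np.array([[0., 0.], [1., 1.], [2., 0.]])
T = np.array([[1.0, 0.4], [0.4, 1.5]])
Sigma = separable_cov(locs, T, phi=0.8)

# Symmetric cross-covariances: cov[Y_k(s_i), Y_l(s_j)] == cov[Y_l(s_i), Y_k(s_j)],
# because each off-diagonal block R_ij * T is symmetric whenever T is.
p = T.shape[0]
blk = Sigma[0*p:1*p, 1*p:2*p]
assert np.allclose(blk, blk.T)
```

The common-range restriction shows up as a constant ratio: cov[*Y*_{1}(**s**_{i}), *Y*_{1}(**s**_{j})]/cov[*Y*_{2}(**s**_{i}), *Y*_{2}(**s**_{j})] = *T*_{11}/*T*_{22} for every pair of sites, regardless of *d*_{ij}.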

[11] This paper is organized as follows. Section 2 briefly describes the source of our data set. Then, in section 3 we present the Bayesian model based on a linear combination of independent univariate spatial processes, the linear coregionalization model described above. The main advantage of this model is that each of the *p* different processes is allowed to have a different spatial range. We then discuss prior distributions which might be assigned to the parameters of this model. We can think of the multivariate distribution of our data as arising from a suitable sequence of conditioning, following *Royle and Berliner* [1999]. In fact, we identify univariate spatial conditional models to do this. Section 3.3 presents the reparameterization of our multivariate model in terms of univariate conditional processes and discusses its implementation in the latter form. In section 4 we apply the proposed model to a data set comprising daily average measurements of CO, NO, and NO_{2} from a set of gauged sites in California, USA. Finally, section 5 offers some concluding discussion.