Probabilistic Linear Inversion of Satellite Gravity Gradient Data Applied to the Northeast Atlantic

We explore the mantle density structure of the northeast Atlantic region using constrained linear inversion of the satellite gravity gradient data based on statistical prior information and assuming a Gaussian model. The uncertainty of the residual gravity gradient signal is characterized by a covariance matrix obtained using geostatistical analysis of controlled‐source seismic data. The forward modeling of the gravity gradients in the 3D reference crustal model is performed using a global spherical harmonics analysis. We estimate the model covariance function in the radial and angular directions using a variogram method. We compute volumetric gravity gradient kernels for a spherical shell covering the northeast Atlantic region down to the mantle transition zone (410 km depth). The solution of the linear inverse problem in the form of the mean density model and the posterior covariance matrix follows a least squares approach. The results indicate that on average the seismic velocity variation is proportional to the density variation in the northeast Atlantic region. However, a noticeable mismatch or anti‐correlation exists in some areas, such as the Greenland‐Iceland‐Faroe ridge and southwestern Norway. The predicted low‐density anomalies at depths of 100–150 km underneath the northeast Atlantic Ocean are correlated with the distribution of Cenozoic submarine volcanoes and seamount‐like features of the seafloor.

The morphology of the northeast Atlantic oceanic basins and, particularly, its margins was inferred from a number of seismic reflection and refraction profiles, mostly at the continent-ocean transition. The present-day configuration of the lithosphere in the northeast Atlantic is the result of continental rifting, break-up and subsequent seafloor spreading that formed a number of oceanic sub-basins from about 56 Ma (Figure 2), for example, Gaina, Nasuti, et al. (2017). It is postulated that seafloor spreading was initiated after the North Atlantic Igneous Province (NAIP; e.g., White & McKenzie, 1989) was emplaced affecting a substantial part of the Greenland and Eurasian lithosphere. Both intrusives and extrusive magmatic rocks associated with NAIP have been mapped in detail (Eldholm & Grue, 1994). Variable physical properties of the igneous crust in the deep ocean basin were linked to the Miocene activity of the Iceland plume Parkin & White, 2008). The geodynamic evolution over the subsequent 50 Myr may have developed in a pulsating fashion, with periods of high and low magmatic and tectonic activity, as documented in Iceland and mid-ocean ridges (Ito, 2001;Jones et al., 2002;Parnell-Turner et al., 2014), by pervasive seamount volcanism  and episodic uplift and basin inversion at the passive continental margins (Rudge et al., 2008;White & Lovell, 1997).
The long-term magmatic and tectonic evolution of the northeast Atlantic region is controlled by processes in the mantle. The observed broad anomaly of high values of the gravitational potential ( Figure 3) was attributed to a superposition of shallow and deep mantle sources (Cochran & Talwani, 1978). Disentangling these anomalous density sources is not fully possible due to nonuniqueness of the inverse gravimetric problem. To better utilize the physical constraints provided by gravity data, it is important to assess the contribution of individual density sources (with a presumably known probability) and provide a statistical measure for the uncertainty of the final density model including both the lithosphere and sublithospheric upper mantle.
The density structure of the lithosphere and asthenosphere in the northeast Atlantic region has been previously addressed in a number of studies based on analysis of gravity data. Haase et al. (2017), Tan et al. (2018), and Shulgin and Artemieva (2019) are among recent publications. Tan et al. (2018) focus on the lithospheric density structure of the oceanic region around Jan Mayen Island and discuss possible impact of the Iceland hotspot on the temperature and density structure of this region. Shulgin and Artemieva (2019) performed density modeling using a tesseroid approach and discussed the thermochemical structure of the upper mantle. As constraints, they used the UNASeis crustal model of Artemieva and Thybo (2013). A spectral method based on global spherical harmonics analysis, combined with local isostasy constraints, has been applied by Root, Ebbing, et al. (2016) to investigate the relation between seismic velocity and density structure of the lithosphere of the British Isles and adjacent part of the northeast Atlantic region. There have been few attempts to apply 3D inversion of gravity gradients, partly due to difficulties to incorporate geophysical prior information to reduce the nonuniqueness of the inverse problem.
The 3D gravity gradient inversion method has been traditionally used in mineral and petroleum exploration focusing on the reconstruction of localized density anomalies (e.g., Pilkington, 2014;Wan & Zhdanov, 2013). The flat Earth approximation and models in Cartesian coordinates have been traditionally used. The vertical component of the gravitational tensor has been shown to be most informative and was often utilized in various applications (Pilkington, 2014). The useful additional information containing in the gravity gradient tensor data is that, for a particular case of a point mass (or noninterfering localized density anomalies), its eigenvectors indicate the source location (Mikhailov et al., 2007;Pedersen & Rasmussen, 1990). More complex models require a priori constraints and regularization.
In the framework of 3-D inversion, Zhdanov (2015) discussed various stabilizing functionals applied to model parameters to obtain a solution satisfying certain a priori geological or geophysical knowledge, such as the minimum-norm, maximum smoothness, total variation, minimum entropy and minimum-gradient support functionals (Boulanger & Chouteau, 2001;Last & Kubik, 1983;Portniaguine & Zhdanov, 1999). Li and Oldenburg (1996) suggested a stabilizing function for the potential field inversion in the form of the inverse distance (1/r β ) to counteract the rapid decay with distance of the integral kernels. Zhdanov et al. (2011) and Wan and Zhdanov (2013) proposed a method for rapid imaging using gravity gradient data incorporating the depth weighting of Li and Oldenburg (1996) and a known background density model. Chasseriau and Chouteau (2003) and Barnoud et al. (2016) considered the inverse gravity problem in a Bayesian formulation implying that the density variation and the observed data can be considered as Gaussian random fields characterized by their mean and the covariance matrix.
Unlike in local applications working with gravity gradients, the 3-D geometry is required for the large-scale lithospheric modeling since the data are defined in the local Cartesian coordinate system at each observation  (Bouman et al., 2013). Conversion of 3-D gravity gradients to the flat Earth geometry for the purpose of regional modeling is possible but involves a bias (Bouman et al., 2013). The a priori geophysical information, such as spatial correlations of heterogeneities in the Earth's mantle, is also more convenient to describe using a spherical statistical model. Liang et al. (2014) used a tesseroid approach and performed linear inversion using a combination of angular and radial weighting functions as constraints. Afonso et al. (2019) performed constrained nonlinear inversion of GOCE gradients (together with other geophysical constraints) for several lithospheric layers. An efficient method for forward and inverse gravity modeling of crustal density structure based on spherical harmonic analysis was introduced by Wieczorek and Phillips (1998). Novák and Grafarend (2006) and Root, Novák, et al. (2016) have further extended this method for various geophysical applications. Martinec and Fullea (2015) used the GOCE gravity gradient data to model density structure of the Congo basin based on a spherical harmonic approach. Fullea et al. (2021) performed a joint inversion of gravity gradients constrained by seismic waveforms and petrological modeling for temperature and composition of the upper mantle at the global scale. The modeling method presented in this study combines the computationally efficient spherical harmonics method and the local parameterization to perform a probabilistic linear inversion of satellite gravity gradient data in 3-D geometry using a spherical shell model of the study region.
In the following, we describe a method to perform a constrained lithospheric-scale inversion of gravity gradient data. The a priori constraints include: (a) data covariance matrix; (b) prior model covariance matrix including a model for spatial variability of mantle heterogeneity; and (c) a stabilizing functional in the form of spatial weighting function. We construct a seismically constrained 3D crustal reference model including ice, water, sediments, and crystalline crustal layers based on spherical kriging interpolation including a model for the average density of the crystalline crust. The forward problem is solved on a global scale using a spherical harmonic approach. Then, we use the GOCE grids to infer the density variation (and associated Gaussian probability distribution) in the upper mantle within the northeast Atlantic region.  (Straume et al., 2019), (c) combined Moho depth model Szwillus et al., 2019) based on kriging interpolation (see Section 4.3 for details), and (d) Crustal age (Gaina, Nasuti, et al., 2017

Theoretical Background of 3-D Inversion of Gravity Gradients
In this chapter, we describe our probabilistic inversion approach including weighted parameters. The assumed linear model and Gaussian probability distribution for data and density allows using standard least squares solution. We give formulas corresponding to the two strategies of the solution: (a) inversion in model space and (b) inversion in data space. This formulation includes both data and model covariance matrices and the spatial weighting matrix. The choice of the weighting function is important to discuss in the context of potential field inversion. After that, we demonstrate analytically that the use of the weighting function based on the integrated sensitivity kernel (Zhdanov, 2015) results in correct reconstruction of source depth for a point mass using gravity gradient density imaging. Some useful theoretical background information can be found in Appendices A, B, and C.

Solution of the Inverse Problem
We assume a linear model (G) that connects data (d) and model parameters (m) in the standard form described in Appendix C (Equation C4). The Gaussian probability distribution for the data can be written with a priori information in the form of data correlation matrix Σ d and variance 2 .
Another applied constraint comes with a spatial weighting matrix W (Li & Oldenburg, 1996;Zhdanov et al., 2011). So that, the weighted model parameter vector can be written (2) The Gaussian probability distribution for the model parameters can be expressed in the form where −1 is the covariance of the weighted parameters and m 0 is the prior density model.
It is more practical to deal with covariance of unweighted model. We assume that the covariance of weighted density is related to the covariance of density as where α −1 is a coefficient approximately equal to the mean of the squared diagonal elements w of the weighting matrix W with the size [NxN]; N is the number of model parameters: Σ m is the correlation matrix, 2 is the variance.
The joint posterior probability can be written in the form The mean and the covariance matrix that maximize the posterior probability can be written as solution of a least squares problem. The solution corresponding to the inversion in data space follows Tarantola (2005): Here, Δ = ( − 0) are the weighted model parameter pertubations. G w = GW −1 = W −1 G T is the weighted integral kernel. I d is the identity matrix with the size equal to the number of data points. The data residuals are denoted as Δd = d − Gm 0 . The adjoint weighted kernel † is important quantity in potential field migration (Zhdanov, 2015), and it can be defined using the probabilistic formulation of Tarantola (2005) The posterior distribution Equation 6 can also be found using the inversion in model space: where the adjoint kernel ̃ † is defined using the posterior model correlation matrix Σ : and I m is the identity matrix with the size equal to the number of model parameters.

of 36
The ensemble of random realization corresponding to the posterior probability density function can be obtained using the Cholesky decomposition of the posterior covariance matrix = Δ + Here, Δm is the solution for the mean density distribution, ξ is a vector of mutually independent random numbers with zero mean and unit variance, and L is the matrix of Cholesky factors such as ̃= . The Cholesky factorization can only be applied to a symmetric positive definite matrix. In practice, the posterior covariance matrix calculated using Equations 9 or 12 does not necessarily follows this property. Here, we use a covariance model which is found to be more accurate for the spherical geometry and ensures positive definiteness of the covariance matrix: where 2 is the model variance, Σ Ω and Σ r are the angular and the radial correlation matrices, respectively.

Spatial Weighting and Depth Resolution
Rapid decay of sensitivity kernels with distance implies that the density anomalies predicted by the potential field inversion would concentrate at the surface (Li & Oldenburg, 1996). In this work, we introduce the weighting matrix W (Equation 2) which acts as a stabilizing functional for the inverse problem and provides depth weighting to each parameter according its contribution to the data points used in the inversion. Following Zhdanov (2015), we define W as a diagonal matrix with elements corresponding to the square root of the integrated sensitivity kernel: where ‖ ‖ = √ ∬ ds is the norm of the data perturbation at the observation plane S due to the variation of the model parameter k. Substituting Equation C4 into Equation 16, the weights can be expressed using the Green functions for corresponding gradient tensor components: To better understand the density imaging using satellite gradiometry data, we consider a simplified inversion approach in the form of potential field migration (Zhdanov et al., 2011). We assume that both the data and model covariance matrices are scaled identity matrices. Then, using definition (Equation 2), Equation 11 can be written in the integral form as This is the potential field migration equation which relates the gravity gradient kernels and the migration density field (Zhdanov et al., 2011). In order to obtain a reasonable estimate of density anomalies using the potential field migration, a spatial weighting function has to be applied. Several such functions have previously been suggested in general form |r − r′| −β , where β = 1‥3. In this work, the weighting function is derived based on the integrated sensitivity kernel Equation 17. This allows us to investigate how the spatial weighting affects the reconstructed depth of a localized mass anomaly using analytical integration of Equation 18.
The important property of the gravity gradient tensor is that its eigenvector, corresponding to the eigenvalue with maximum magnitude, indicates the direction from the observer to the point source (Pedersen & Rasmussen, 1990). For demonstration, we rewrite Equation C1 in cylindrical coordinates and use the Green's function associated with the largest eigenvalue of the gravity tensor which is proportional to |r − r′| −3 . Then, we substitute this expression to Equation 18 assuming that the point mass is located at (0, 1), the observation plane is at (ζ, 0) and z-axis is positive downwards: Here, the data represent the largest eigenvalue of the gravity gradient tensor The integration of Equation 19 results in The analytical integration of the sensitivity kernel in Equation 17 provides the explicit form of the weighting function: Substituting this equation into Equation 21, the weighted migration density distribution becomes The extreme value of this distribution corresponding to the maximum of the migration density can be found by differentiation with respect to depth: Thus, we obtain the center of the weighted migration density field located at the source depth z = 1, as it was expected. Therefore, the form of spatial weighting function, chosen in this work, allows for reconstructing correct depth of the point mass.

Implementation of the Gravity Gradient Inversion
A least squares solution in the model space Equations 11 and 12 can be obtained using an iterative solution technique (Liang et al., 2014;Barnoud et al., 2016) such as the method of conjugate gradients and LSQR (Paige & Saunders, 1982). For three-dimensional problems, the system matrix quickly grows and this approach becomes computationally challenging. Alternatively, we can follow the formulation of the inversion in the data space using Equations 8 and 9. In this case, the matrix to be inverted is of the data size and can be performed using the truncated singular value decomposition (SVD) method (Aster et al., 2018;Chasseriau & Chouteau, 2003). The inversion in data space can be also applied for one point (or blocks of points) at a time (Tarantola, 2005). This method does not require inversion of large matrix, and, therefore, can be numerically efficient. Assuming that the data are uncorrelated, we can modify the probability function of model parameters by incorporating new data, such as additional measurements, or combine different gravity gradient components. The corresponding least squares solution based on Equations 8 and 9 (Tarantola, 2005) can be written using our notation as The k index denotes current data point with the mean d k and variance σ d . G k is the system matrix Equation C4 with the number of rows corresponding to the number of data components (a row vector for single-component data).
The memory requirements can be substantially reduced by noticing the local nature of gravity gradient kernels. Therefore, only significant G k elements can be selected. We found that model parameters beyond 600-700 km distance from the observation point have a very small contribution to the data, and, thus, corresponding matrix elements of G k can be ignored (G k becomes a sparse matrix) without a substantial change of the solution. The vector q is analogous to the adjoint operator G † (Equation 10). The matrix H reduces to the scalar for single-component data or represents a small square matrix with the size of the number of data components. In practice, a block of data points can be selected instead of one point at a time to form rectangular matrices q and G k . Note that the mean weighted density ( ) and the covariance matrix ( ) are updated at each iteration k. The final density model m = W −1 m w is obtained after the last data point has been assimilated. Equations 25-29 allow for a rapid density imaging method similar to the potential field migration suggested by Zhdanov et al. (2011). The presented recursive method illustrates the effect of new data integration with respect to model update.

Synthetic Inversion Examples
The 3D synthetic density model represents a spherical shell with the lateral extent of 90° × 30° and the depth extent 0-410 km. The model is parameterized using 20 layers comprised of spherical prisms with a 3° × 1° resolution. We compute the synthetic density distribution shown in Figure 4a as a nonhomogeneous Gaussian random field with zero mean using Equation 14 and the spherical covariance model Equation 15. The model variance is set to with the standard deviation (SD) of 20 kg m −3 . We calculated the synthetic gravity gradient tensor components in Figure 5 using Equations A6, C1, and C2 and assuming the satellite observation height of 225 km.
The first method is the inversion in data space (Equations 8 and 9) using the full data covariance and full model covariance matrices. The inversion results in Figure 4b were obtained using the truncated SVD method (Aster et al., 2018) to invert the system matrix. 10 of 36 The second method we have applied is the recursive least squares inversion in the data space with the full model covariance matrix and the diagonal data covariance matrix ( Figure 4c). Here, we use Equation 25 and ignore the correlations in the data but include correct spatial weighting for each gravity gradient component defined by Equation 17.

of 36
The third inversion approach follows the model space formulation (Equations 11 and 12). We have used the LSQR method with diagonal covariance matrices (σ m = ±20 kgm −3 ) and the zero mean density model. An optimal damping parameter was found based on the standard L-curve criterion (Aster et al., 2018). The reconstructed model is shown in Figure 4d. The inversion recovers the lateral position, the depth and average intensity of density anomalies is less well-recovered. The vertical smearing of density anomalies limits the depth resolution in the model.
The constraints on spatial correlations in the models in Figures 4b and 4c allow for recovering not only average position (center of mass) of the density anomalies but also their correct aspect ratio. The spatial correlations control the shape, relative strength, and depth of individual density anomalies. In the high-resolution real data application, the SVD decomposition can be replaced by the recursive approach. The effect of including data correlations in our synthetic setup improves the data fit (Table 1). Whereas the pattern of density anomalies stays very similar to the model obtained with the approximate recursive method.
The synthetic data (shown in Figures 5b, 5d, 5f, and 5h) corresponds to the reconstructed density model in Figure 4c. The tensor components indicate similar relative misfit below the assumed data uncertainty. See also Table 1.

Application to Lithospheric-Scale Density Modeling
The application of gravity inversion to lithospheric-scale modeling generally includes two steps. First, it is to extract the residual signal corresponding to density distribution or geological interfaces of interest, (e.g., top basement or Moho depth) assuming that the remaining part of the Earth's density model is known. The second step includes the 3-D inversion of the residual signal. Here, we assume that the total gravity and gravity gradient signal in the North Atlantic region reflects the contribution of the following main components: (a) topography (elevation and bathymetry) and the associated Bouguer correction; (b) ice thickness variation; (c) sedimentary thickness variation; (d) crustal thickness and density variations; (e) lithospheric thickness variation resulted from the rifting process and subsequent cooling; and (f) density variation in the upper mantle. Multichannel seismic reflection data and wide-angle seismic data provide constraints on the sedimentary thickness and crustal thickness variation. The lithospheric thermal age model is shown in Figure 2 and follows Gaina, Nasuti, et al. (2017). The long-wavelength signal due to lower-mantle density variations was neglected since it is found to be small in gravity gradients compared to this signal in the geoid and the gravity anomaly field (see Figure 3).
We model the components (1-5) on the global scale using the spherical harmonic approach which implies a layer-based parameterization (described in Appendix B). The global modeling avoids edge effects and naturally includes the far-field signal in the regional density modeling. We treat the density variation in the upper mantle (component 6) as an unknown in the probabilistic linear inversion described in Chapter 2. The model parameters are sought on a 3-D grid within a spherical shell covering the Northeast Atlantic region. We use the volumetric gravity gradient kernels (described in Appendix C) both to solve the forward problem and to perform the inversion for density variation within the spherical shell. T zz synth 6.4 (5.7) 6.5 (5.8) 6.6 (5.9) 6.8 (6.1) 4.7 4.6 5.1 5.9 Note. The root mean square (rms) misfit is relative to rms-variation of corresponding component (in percent). The values are obtained using pointwise recursive method in the data space and the LSQR method in the model space. The rms misfit corresponding to the truncated SVD inversion method is given in parentheses.

Gravity and Gravity Gradient Signals of Lithospheric Layers and Data Used
The topography and bathymetry data were extracted from the international bathymetry and topography data including the ice thickness model over Greenland https://www.gebco.net ( Figure 2). The total spherical Bouguer correction includes both the topography and the ice layer over Greenland (Figures 6a and 9a). The topography correction is performed using the reduction density of 2,850 kg m −3 . The density of ice is assumed 970 kg m −3 . The global sediment thickness grid is combined using the marine sedimentary thickness by Straume et al. (2019) and the CRUST1.0 sediment thickness on land (Laske et al., 2013;Szwillus et al., 2019). Regional sedimentary thickness and crustal thickness data are based on the seismic reflection and refraction database compiled during NAGTEC project . We use spherical kriging interpolation approach  to merge global and regional data and avoid edge effects. The assumed sediment density is 2,400 kg m −3 . The gravity effect of sedimentary cover (shown in Figures 6c, 6d, and 9b) probably represents the upper bound estimate since the sediment density should be increasing with depth due to compaction. The difference of gravity gradient signal assuming the constant and quadratic density profile in sediments is shown in Figure 10. In our model, the estimated effect of variable sediment density on the gravity inversion results is up to 20% (see also Supplementary Information S1). The average density of crystalline crust follows the results of statistical interpolation by Szwillus et al. (2019). The density variation within the crustal layer is found to have a second-order effect compare to the gravity signal due to the crustal thickness variation (shown in Figures 7a and 7b). The lith- We estimated spherical harmonic coefficients for each crustal density layer using the global spherical harmonic analysis (Sneeuw, 1994;Wieczorek, 2007). Then, we calculated the gravitational potential (Stokes) coefficients, from which the gravitational potential and its functionals can be obtained (Novák & Grafarend, 2006). The residual gravity and gravity gradient anomalies correspond to the observed data after the effects of each individual crustal layer has been subtracted (Figures 8a and 8b and 11).
The input GOCE data (shown in Figure 1) have the form of grids in the local (north-oriented) reference frame (LNOF; Bouman et al., 2015). The satellite descended to a height of about 225 km during the last phase of the GOCE mission. The full range of observation performed between 250 and 225 km height was transformed to the lowest level at the stage of data processing (Bouman et al., 2015). The gravity field solutions (grids) contain a detailed signal at a height of ≈225 km with a spatial resolution of ≈80 km.

Residual Gravity and Gravity Gradient Signal
The goal of the following analysis is to extract from the observed gravity data the gravity signal corresponding to the 3D heterogeneous density structure of the lithosphere and sub-lithospheric upper mantle. We subtract from the observed data shown in Figure 1 the gravity signal of the crust (Figure 9c) including the topography The residual gravity and radial gravity gradient anomalies (Figures 8a and 8b and 11) reflect the effects of dense and cold continental roots and low-density regions adjacent to the Iceland hotspot region. The range of variation is from about −100 mGal to 75 mGal and ±2-3 E, for the gravity and the gravity gradient, respectively.
The obtained residual gravity gradients ( Figure 11) can be linked to the density variation unaccounted by the 3D reference model. The amplitude values of the residual gradients (±2-3 E) are about twice of the observed gradients, and the crustal correction constitutes about two-third of the total signal. Note that the superposition of the crustal and the mantle gravity signals should result in a small gravity anomaly due to isostatic compensation.
The interpretation of the residual signal is most transparent in the case of the radial gravity gradient ( Figure 11). The lithospheric and upper mantle signal is more pronounced in the residual gravity gradient signal (Figures 8b and 11) than in the residual gravity anomalies (Figure 8a). The residual signal shows a correlation with the upper mantle seismic velocity anomalies in the global tomography model by Schaeffer and Lebedev (2013). In particular, at the depth of 80 km (Figure 8d) in the oceanic region, both to the north and south of Iceland, and within the cratonic areas. The negative residual signal at the eastern Greenland margin, the southern Norway region and northern British Isles implies a low-density lithosphere at depth. The Fennoscadian craton appears colder and thicker compared to the Greenland lithosphere in both seismic tomography model and gravity data. Note that the observed data in Figures 1 and 3 show anti-correlation with the seismic anomalies in Figures 8c and 8d.

of 36
Both the density variation of the crystalline crust and the sedimentary layer density affect the observed gravity gradient signal. The lateral density variation in sediments can be related to the compaction process. We use quadratic density profiles Equation B5 and estimate the gravity gradient signal using Equation B12. Compaction would reduce the density contrast between sediments and crystalline crust, and would result in a less negative anomaly over sedimentary basins. The various degree of compaction would also result in the lateral density variation. Figure 10 shows that the sedimentary density variation has a non-negligible effect (up to ± 0.5 Eötvös) when compared to the observed signal (±1 Eötvös). So that, the residual information represents not only the mantle density but also the contributions of the sediment density variations. This signal constitutes about 10%-20% of the residual anomaly with largest values corresponding the sediment depocenters. The values of the residual gradients over the deep sedimentary basins off western Norway and northeast Greenland in Figures 11 and 10 provide an estimate that up to 20% of the residual gravity gradient signal can be related to the sediment density variations. The numerical results illustrating the effect of variable density of the sediment layer is presented in more detail in the Supplementary Information S1. 16 of 36

Variogram Method
The nonuniqueness of the inverse problem can be addressed using various approaches to incorporate prior information. In the probabilistic framework, the data and model covariance matrices are standard to describe the Figure 10. The difference between the radial gravity gradient signal of the sedimentary layer assuming constant and depthdependent (quadratic) density profile. Figure 11. Residual gravity gradient anomalies (T rr ).

of 36
prior knowledge. Assuming a stationary Gaussian random process, the covariance function K is related to the variogram (ℎ) estimated at the distance h between a pair of points as where 2 is the variance of the physical variable v. The variogram, estimated using N grid points at the spherical distance h, can be expressed as where δv(x) is the parameter variation with respect to the mean value.
The covariance matrix C ij is related to the covariance function as C ij = K(x i , x j ), where x i , x j are coordinates of two points. The covariance matrix can also be defined using variance and correlation function as = 2 Σ( , ) .

Data Covariance
The uncertainty of the Moho depth dominates the uncertainty of the residual gravity signal. A rough estimate made for a continental lithosphere would provide about 0.1-0.2 Eötvös signal (at a satellite height) corresponding to a Moho depth change of about 1 km. This effect is generally larger when compared to the variation of other sources of gravity signal within the crust and upper mantle (such as the density distribution within the sedimentary and crystalline crustal and layers). Therefore, we neglect other factors that may have contributed to the uncertainty. We assume a linear model and obtain the data covariance matrix using the uncertainty (covariance) of the seismic Moho depth C s and the 2-D Green function G s at the regional average radial distance (Tarantola, 2005) ̃= .
The normalization is required to ensure that patches with the same spherical surface area have the same probability: In our case, the largest uncertainty in the data (residual gravity gradient signal) arises from the uncertainty of the crustal thickness model due to nonhomogeneous coverage of wide-angle seismic data. This prior information is incorporated in the inversion using the data covariance matrix C d . Assuming a linear model between data and model parameters, the variance of crustal thickness can be translated to the data variance ( Figure 12) using Equation 32. We find the probability distribution for the crustal thickness following the geostatistical method described in Szwillus et al. (2019). Since the full original data set is not available, we extracted the actual Moho depth values from corresponding published grids Mooney, 2015;Szwillus et al., 2019) along the profile coordinates with a step of 10 km, assuming that the values at the data locations have not been biased by the interpolation in the regional grid. Then, we applied a spherical kriging interpolation to the obtained data set.
The data covariance matrix C d was obtained from the Moho variance grid using Equation 32 and the gravity gradient kernel Equation C2. For simplicity, we consider only diagonal elements of this matrix and imply that the data points are independent. This assumption is not generally required but allows for an efficient numerical implementation. The diagonal elements of −1∕2 (standard deviation) shown in Figure 12 imply the largest variance over southwestern Greenland ( 1∕2 > 1 Eötvös). The passive margin of Norway is densely covered by seismic profiles and has small data uncertainty ( 1∕2 < 0.5 Eötvös).

Model Covariance
The prior information about the length scale and spatial variability of the density field can be incorporated using the covariance matrix. The model parameter covariance inside the sphere is represented by a superposition of the radial and angular covariance functions (Equation 15). The form of the correlation matrices can be defined by corresponding correlation functions that can be obtained using the variogram method. There are a number of functions that can be used to approximate empirical variogram (Lantuéjoul et al., 2019;Terdik, 2015). In this work, the angular correlation function is approximated using the inverse distance model (Terdik, 2015): here, the correlation length depends on the parameter |a| < 1.
The variogram in the radial direction is approximated using an exponential-type correlation function (Monin & Yaglom, 1975) (their Equation 11.20) with the correlation length defined by the parameters b 1 and b 2 . The sensitivity of S-wave velocities and density to temperature changes, inferred from empirical data, can be used to obtain the model constraints in terms of spatial correlations.
The model covariance matrix C m can be obtained using the variogram method described in Section 4.3.3. We applied this method to the seismic shear velocity variation in the regional tomography model by Rickers et al. (2013); Fichtner et al. (2018) (CSEM North Atlantic, December 2019) which has a better spatial resolution compared to global models. We assume that the spatial correlations of seismic velocity and density anomalies are similar. Figure 13a shows the radial correlation function (Σ r ), estimated for a bin size of 20, 30, and 40 points, at the depth of 50-410 km, after the 1D mean has been removed. The estimated correlation becomes negative beyond a radial distance of about 150 km. This type of spatial variation can be approximated in the analytic form by a combination of exponential and cosine functions Equation 35 (Monin & Yaglom, 1975). We have also estimated the correlation depending on the spherical distance up to 0.5 rad (or about 3,000 km) at four different depth intervals between 85 and 245 km depth (Figure 13b). The angular correlation function (Σ Ω ) can be represented by the inverse distance model Equation 34 (Lantuéjoul et al., 2019) with the coefficient a between 0.9 and 0.99 depending on the depth.

Inversion of Residual Gravity Gradients
The density model was parameterized as a spherical shell with a lateral grid resolution of 50-70 km and the depth resolution of about 20 km. The single-component inversion was performed using the residual radial gravity gradient signal (T rr ) shown in Figure 11. In the inversion setup, we assume a zero density variation as a prior model.

of 36
The calculated data misfit roughly follows data uncertainty and shows higher values over Greenland and Norwegian continental margins ( Figure 14). The range of the misfit is about ±0.5 Eötvös which is generally within the data uncertainty 2 1∕2 (cf. Figure 12). The range of the signal is ±3 Eötvös. The predicted mean density model is presented as an image of density variation at the depth of 150 km ( Figure 15).
The general pattern of the density perturbation ( Figure 15) reflects the distribution of residual anomalies (Figure 11). The predicted negative density anomalies in the upper mantle correlate with shear wave anomalies identified within a wider region along mid-ocean ridges where seamount volcanism may have been active since the Oligocene (ca. 30 Ma). In addition, the thinned continental crust of Rockall Plateau, where older seamounts were emplaced (ca. 50-60 Ma), is underlain by a low-density upper mantle. The largest negative values of the mantle density anomaly (about −40 kg m −3 ) at the west Greenland margin corresponds to the early Cenozoic location of the Iceland hotspot (Torsvik et al., 2015). The Greenland-Iceland-Faroe ridge is underlain by a positive density anomaly in the shallow lithosphere. The negative density anomalies are observed underneath the Caledonian deformation front in northern British Isles and southern Norway. The mass deficit under the central-east Greenland margin and across Greenland is expressed in the density model at middle upper mantle depths.
The comparison with regional high-resolution seismic tomography model (V SH ) by Rickers et al. (2013) and Fichtner et al. (2018), shown at the same depth in Figure 16, indicate a similar correlated pattern of positive and negative anomalies. The negative anomalies are located along the mid-Atlantic ridge, and to a lesser extent under northern British Isles and the Norwegian margin. The positive anomalies are below the cratonic parts of Greenland and Fennoscandia. The dissimilarities between the tomography and gravity models at a particular depth can be partly related to the nonuniqueness of the inverse gravity problem such as along the Iceland-Faroe Ridge and at the west Greenland margin. A more detailed discussion on the predicted mantle density variation with its relation to crustal structure and seismic velocities along three regional transects follows in the next chapter.
We test the reliability of the mean density model with respect to various modeling assumptions using the three inversion methods described in Section 3.1 applying for both single-component (local coordinates) and three-component (global ECEF reference frame) gravity gradient data. The numerical grid is 3° × 1° laterally and 30 km over depth.
The inversion results for the radial gravity gradient component are shown in Figures 17a-17c. The inversion results in the model space are shown in Figure 17a. In the data space, we use the truncated SVD method with the full data covariance matrix (Figure 17b) and the pointwise recursive method with the diagonal data covariance  (Figure 17c), respectively. The reconstructed density structure is similar in the two models. However, the SVD method results in a more continuous shape of density anomalies at depth. The reconstructed gravity gradient tensor components (T zz , T xx , and T zx ) corresponding to the model in Figure 17f are shown in Figure 18. The statistical evaluation of the results is given in Table 1. The inversion resolves best the radial component, as expected. The vertical component is resolved best among the three components. The efficiency of data fit by various methods trades off with the model roughness and applied regularization (parameter α). The comparison of the three density models in Figures 17a-17c indicates that incorporating the full model covariance helps to resolve some important geological features, such as a sub-horizontal low-density channel in the asthenosphere suggested by seismic tomography (Rickers et al., 2013). Figures 17d-17f. The weighting function was calculated using Equation 17 in the recursive matrix-free method (Figure 17f). The two other methods that require inversion of the system matrix use the inverse distance weighting function (see Section 2.2; Figures 17a and 17d and 17b and 17e). Comparing the single-component and three-component inversion results, we notice an increase in magnitude of some density anomalies. The lateral and depth extent of the density anomalies changes slightly. The general pattern of density anomalies remains the same indicating that results are robust. The pointwise method required most computation time (for this small 3D model). The advantage of this method in the case of multicomponent inversion is that it allows incorporating correct weighting coefficients for each tensor component. The mean misfit over the gravity gradient components is optimized by including other tensor components and results in a smaller mean rms misfit value (see Table 1).

Mantle Density and Seismic Velocity Anomalies Along Lithospheric Transects
Several regional seismic tomography studies have shown an irregular-shape low-velocity seismic anomaly in the upper mantle in the northeast Atlantic region resolved in both S-wave (Fichtner et al., 2018;Lebedev et al., 2018;Legendre et al., 2012;Pilidou et al., 2005;Rickers et al., 2013; Figure 16) and P-wave velocity models (Bijwaard & Spakman, 1999;Hosseini et al., 2020;Jakovlev et al., 2012), and it has been suggested that it can be linked to the Cenozoic Iceland plume activity. The anti-correlation of the seismic velocity and long-wavelength gravity anomalies have previously been discussed by, for example, Jones et al. (2002); Rickers et al. (2013) and other studies. Our results are in agreement with their conclusion that the long-wavelength positive gravity anomalies (and corresponding dynamic topography) are associated with the low-density material in the asthenosphere.
The regional S-wave tomography model, described in Rickers et al. (2013) and Fichtner et al. (2018), is based on full-waveform inversion of surface and body waves. Previously, this model has been used to infer geodynamic processes in the northeast Atlantic region (e.g., Schoonman et al., 2017). Here, we compare the S-wave velocity anomalies in the tomography model with our density model along three lithospheric-scale transects. Each synthetic transect is characterized by good constraints on crustal structure based on controlled-source seismic data. The synthetic transects, whenever it is possible, follow available seismic lines and illuminate key tectonic features of the study area. The transects are shown for the upper 300 km, the region where most of density variation resides. The density model shown along the transects includes the effect of lithospheric cooling.

Profile 1
The west-east Profile 1 (Figure 19) runs across Greenland (0-750 km), along the Greenland-Iceland-Faroe Ridge (GIFR; 750-2100 km) and northern British Isles (2100-3000 km) where it intersects the main Caledonian suture zone (Barton, 1992). The profile crosses the thick and cold Greenland craton where crustal thickness is mainly constrained by receiver function data (Dahl-Jensen et al., 2003). Further east, the profile runs along the western portion of GIFR with thick high-velocity igneous or transitional crust (Korenaga et al., 2000;Yuan et al., 2020). The crustal thickness of the mainly volcanic Iceland Plateau reaches about 40 km, as it was estimated from the receiver function analyses (Kumar et al., 2007) and wide-angle profiles by Darbyshire et al. (1998) and Staples et al. (1997). In a regional context, Iceland is part of GIFR which represents a complex region of excessive magmatic crustal accretion due to overlapping rift systems, interlinked rifts and transform zones with a variable uplift and subsidence history (Hjartarson et al., 2017). In such excessively magmatic regions, high-density ultramafic rocks can be emplaced at or above Moho Richards et al., 2013), and significantly decrease the density contrast at the crust-mantle interface. The P-wave velocity and density structure along the eastern part of GIFR is apparently similar to the western part; although, the velocity model might be poorly resolved due to short source-receiver offsets.
The lithospheric density images derived using the gravity gradient inversion provide complementary information to seismological data. The thin extended continental crust within the Faroe Basin (Raum et al., 2005) is underlain 22 of 36 by slightly denser upper mantle compared to GIFR to the west and northern British Isles to the east. The Greenland lithosphere has a cold and dense cratonic root to a depth of about 300 km according to both the seismic and gravity data ( Figure 19). The asthenosphere beneath Iceland has a low-velocity and corresponding low-density anomaly of a relatively smaller magnitude (about 15-20 kg m −3 ). At the east Greenland margin (profile distance 500-800 km or about −40°E) along the profile the low-density anomalies (above 200 km depth) correlates with the region affected by the Cenozoic volcanism related to the emplacement of the North Atlantic Igneous Province ( Figure 15). Similarly, the high-amplitude low-density anomaly at the northern Britain margin (profile distance 2,300-3,000 km or about −12°E) corresponds to Paleocene volcanic centers in the Faroes region. This density anomaly is outlined by the inferred Caledonian suture zone. A possible explanation for a negative density anomaly beneath the Caledonian deformation front can be related to thin lithosphere and/or the presence of trapped oceanic crust which can be rich in plagioclase at elevated lithospheric temperatures. The lithospheric density structure of GIFR can be related to ultramafic melts crystallized at or below the Moho in combination with fragments of continental lithosphere and/or pyroxenite-rich mantle (Foulger et al., 2020;Yuan et al., 2020).
The velocity anomaly beneath Iceland appears much stronger in the velocity model compared to the density image along Profile 1. This can be explained in terms of partial melting which has a considerably larger effect on reduction of shear waves speed than on density (Mavko, 1980;Takei, 2017). The positive density anomaly beneath the Greenland craton generally follows the LAB model based on joint geophysical-petrological inversion by Fullea et al. (2021). This density anomaly is only weakly expressed in the velocity model, and can be related to the variation of chemical composition of the mantle lithosphere, seismic anisotropy. It is useful to note that this result is obtained assuming that the density distribution in the crystalline crust and sediments is known. The effect of depth-dependent density structure in the sedimentary cover is shown in Figure 10 and in Supporting Information S1 accompanying this paper.

Profile 2
Profile 2 ( Figure 20) generally follows the previously compiled lithospheric transect across the North Atlantic by Mjelde et al. (2008). The central and east Greenland lithosphere have a contrasting density structure in the model. In the upper lithosphere, the P-wave velocity of thick continental crust thickness of central-east Greenland 23 of 36 (>35 km) was constrained based on the analysis of broadband seismic data (Kraft et al., 2019). The transition to low-density mantle beneath east Greenland margin observed in Figure 20c is associated with the shallow lithosphere-asthenosphere boundary in the seismic tomography model (Figure 20d). The hyper-extended continental crust of Jan Mayen microcontinent (1,700 km distance) is underlain by a low-density asthenosphere. This may explain a relatively elevated topography of Jan Mayen (Tan et al., 2017(Tan et al., , 2018. The upper mantle beneath the extinct AEgir spreading ridge (Breivik et al., 2014) has coincident low-density and low-velocity anomalies (profile distance 2,000 km). The shallow lithosphere of the continental margin of Norway appears with a similar positive anomalous density as the Greenland lithosphere whereas the seismic velocities are relatively low here. The positive density anomalies underlying the cratonic lithosphere follows the LAB model. The velocity model includes negative anomalies at the LAB across the Norwegian continental margin. This mismatch can be caused by a combination of several factors, such as the unaccounted density variation in the sedimentary cover and crystalline crust in our gravity model, seismic anisotropy or changes of the mantle chemical composition.
As in Profile 1, the correlation of the lithospheric low-density anomalies with the distribution of Cenozoic volcanism is observed along Profile 2. The density image suggests that the low-density mantle anomaly beneath the East Greenland margin and the Jan Mayen region has the same deep asthenospheric source. The density and Figure 17. Comparison of single T rr -component inversion (a-c) and three-component (T zz , T xx and T zx ) inversion results (d-f) using three inversion approaches. LSQR inversion method in the model space (a and d); Truncated SVD inversion method in the data space with the full data covariance matrix (b and e); Pointwise recursive inversion in the data space with the diagonal data covariance matrix (c and f). See also Table 1. 24 of 36 seismic velocity structure suggests branching of a deep thermal anomaly in the shallow upper mantle toward east Greenland and the mid-Atlantic Ridge as also suggested by Celli et al. (2021).

Profile 3
Profile 3 approximately follows the Mid-Atlantic Ridge (Figure 21), across Iceland (500-1,100 km) and Jan Mayen microcontinent (1,500-1,600 km) and ends at the passive continental margin of Svalbard. The segmentation of igneous crustal thickness along Profile 3 can be associated with the alternation of low-density and lowshear velocity anomalies in the asthenosphere. This variation can be linked to the excess crustal accretion at the spreading ridge influenced by the Iceland hotspot Ito, 2001;Tan et al., 2018). This hotspot is presently located in the vicinity of the active Mid-Atlantic Ridge, and has been a ridge-centered mantle melt anomaly during late Cenozoic (Ito, 2001). The density distribution within the primary melt source region in the upper mantle remains unknown. The intense low-velocity in the upper 100 km beneath the thick igneous crust of Iceland (Gudmundsson, 2003) does not correspond to a similar low-density anomaly in our model shown in Figure 21c. Petrological models suggest that the mantle source of Iceland lavas is a peridotite (80%-90%) mixed with basalt (10%-20%), and would imply a density increase of 10-20 kg m −3 which might partly counteract the effect of temperature (Brown & Lesher, 2014).
This result might also reflect the insufficient resolution in the shallow lithosphere beneath Iceland in our inversion. The density of the crust and uppermost mantle along the Greenland-Iceland-Faroe ridge can be substantially different than assumed in our crustal model. Several studies have recently proposed that basaltic upper crust can be underlain by extended continental crust reworked by intrusive magmatism (Foulger et al., 2020;Torsvik et al., 2015). However, in this case, the crustal density would decrease emphasizing the difference between the seismic tomography image and the gravity model. Incorporating additional geophysical data within a more general nonlinear probabilistic inversion framework (e.g., Afonso et al., 2008) can be helpful to further constrain the density structure in this region.
The low density anomaly is obtained in the mantle both north and south of Iceland. The negative density anomaly beneath Reykjanes ridge is not well expressed in the shear velocity image as it can be related to limited resolution at the edges of the regional tomography model. A deep-seated density anomaly north of Iceland might extend over large distance in the shallow asthenosphere toward Svalbard margin where a thin lithosphere is predicted using various geophysical data (Minakov, 2018;Selway et al., 2020;Vågnes & Amundsen, 1993).

Relation Between Density and Seismic Velocity Anomalies
The density variation beneath hotspot regions is a key physical parameter for understanding thermo-chemical convection and distribution of volcanism. The density is also important for numerical modeling of the lithosphere evolution and for understanding the present-day distribution of lithospheric stresses. Modeling of lithospheric stresses is often performed using a thin sheet approximation. Using this approach, the depth-integrated litho- Figure 19. Inversion results along W-E Profile 1 across the northeast Atlantic. (a) Observed free-air gravity anomaly; (b) Crustal geometry: the Moho depth, top basement and topography; (c) density variation with respect to the 3D reference model; (d) seismic shear velocity perturbation in the regional S-wave tomography model (Fichtner et al., 2018;Rickers et al., 2013); (e) the uncertainty of crustal thickness (1 SD) with location of seismic refraction profiles (green) and synthetic lithospheric transects (red). LAB depth (dashed line) is taken from global model by Fullea et al. (2021). 26 of 36 spheric stresses are related to lateral variations of the gravitational potential energy which is a double integral of the mass density with depth (Ghosh et al., 2009;Medvedev, 2016;Schiffer & Nielsen, 2016).
The geophysical properties of the upper mantle rocks can be predicted along the lithospheric geotherm using mineral physics calculations (e.g., Stixrude & Lithgow-Bertelloni, 2005). The relation between the seismic velocity and density variations can be established based on their temperature partial derivatives (δρ/δT, δv s /δT). The conversion coefficient (δρ/δv s ) is about 0.2 for adiabatic mantle and a realistic range of chemical composition (Karato, 2008). The parametric relation between the seismic velocity and density in nonadiabatic and partially molten upper mantle is not available. The deviation of the actual relation between the density anomaly and seismic velocity perturbation from the theoretical value 0.2 may indicate the contribution of other factors unaccounted by the idealized model.
The joint probability density of the mean density perturbation (dρ/ρ 0 ) and the seismic velocity perturbation (dv/v 0 ) for the three selected profiles (shown in Figure 22) indicates that the theoretical relation δρ/δv s ≈ 0.2 is recovered for a young oceanic lithosphere for the velocity perturbation −3 to +2%. For a more pronounced negative velocity anomalies (−3% to −8%) the maximum of joint probability distribution shifts toward smaller dρ/ρ 0 values. This can be interpreted in terms of presence of melt since its direct effect on density is negligible compared to the attenuation of seismic shear velocities (Mavko, 1980;Takei, 2017).

Model Resolution and Uncertainty
The diagonal elements of the posterior covariance matrix provide variance of the resulting density model. Figure 23 indicates the maximum variance reduction at the depth of about 150 km. The prior model covariance is assumed constant, with the SD of 20 kg m −3 . The variance reduction with depth reflects the shape of the weighted integrated kernel. The largest sensitivity is located at the depths of 100-150 km, which makes the method sensitive to the variation of the lithosphere-asthenosphere boundary geometry. (c) density variation with respect to the 3D reference model; (d) seismic shear velocity perturbation in the regional S-wave tomography model (Fichtner et al., 2018;Rickers et al., 2013); (e) the uncertainty of crustal thickness (one standard deviation) with location of seismic refraction profiles (green) and synthetic lithospheric transects (red). LAB depth (dashed line) is taken from global model by Fullea et al. (2021).
Our probabilistic inversion approach implies that the density variation in the target region is a result of a Gaussian process. The estimated mean model of this process is shown in Figure 15 and Figure 17. A way to evaluate the density model parameter space is to generate an ensemble of random realizations using the posterior covariance. This approach can help to test various geological hypotheses proposed in recent publications against interpretation of the GOCE gravity gradient data, such as the composition and nature of the GIFR lithosphere (Foulger et al., 2020;Yuan et al., 2020). A random realization can be constructed using Cholesky decomposition of the covariance matrix and a random vector Equation 14. Note, that the prior covariance matrix, defined by our spherical statistical model Equation 15, is positive definite. If the update of the covariance matrix using data inversion does not change this property, the posterior covariance matrix will be also positive definite. Six random models (shown in Figure 24) are centered at the estimated mean model whereas the variance is 150-400 kg 2 m −6 depending on the depth. The full exploration requires a much larger number of realizations. Here, we just demonstrate the length scales and the general pattern of more robust features such as the anomalous low-density mantle beneath mid-Atlantic ridge and a high-denser lithosphere of the Fennoscandian craton. The model density variations in the asthenosphere have a sheet-like structure where the low-density material extends over the oceanic basin toward the passive margins.
The limitations of the density imaging method presented here are related to our two main assumptions: (a) the uncertainties of the crustal model are adequately described by the data covariance model, (b) the spatial weighting function and the covariance model ensure correct mean depth and aspect ratio of the density anomalies. The validity of our linear Gaussian model depends on the first assumption. The depth resolution of the gravity inversion method depends on the second assumption.
We believe that the progress in statistical description of geological prior knowledge is essential for improved imaging of the Earth's deep structure, in particular, using satellite observations. This implies the development of more accurate data and model covariance models and their efficient use in the inversion. In this paper, only the empirical variance of the Moho depth was used. The data covariance obtained using Equation 32 represents a combination of Moho covariance and the sensitivity kernel. The knowledge of data correlations can be useful as an additional constraint for the solution of the inverse problem. (c) density variation with respect to the 3D reference model; (d) seismic shear velocity perturbation in the regional S-wave tomography model (Fichtner et al., 2018;Rickers et al., 2013); (e) the uncertainty of crustal thickness (one standard deviation) with location of seismic refraction profiles (green) and synthetic lithospheric transects (red). LAB depth (dashed line) is taken from global model by Fullea et al. (2021).
Several approaches exist to avoid computer memory limitations in probabilistic inversions involving full covariance matrices. The use of approximate statistical models in the form of sparse covariance matrices can be beneficial. An approach alternative to the one presented in this paper, would be to separate the density model into blocks using the marginalization method (Geng et al., 2018;Tarantola, 2005).
Another issue concerns our strong assumption of the Gaussian probability distribution for data and model parameters. The current state of knowledge might be inadequate to describe some complex structures (such as the Greenland-Iceland-Faroe ridge) with a Gaussian model. Future progress in statistical description of geological a priori knowledge will lead to more realistic Earth's density models. Additional information such as the state of isostatic compensation and the lithospheric stress regime would be important next steps to explore the ensemble of acceptable density models.

Conclusions
Satellite gravity gradients contain useful information on the density structure of the crust and upper mantle. In this work, we present a probabilistic 3-D linear inversion method and apply this method to image the density heterogeneity within the lithosphere and sub-lithospheric upper mantle in the northeast Atlantic region. The prior information is incorporated through the  covariance matrices providing a statistical description of the data and model parameters in three dimensions and a radial (depth) weighting function proportional to the inverse of the integrated sensitivity kernel. This approach represents a novel way of performing a constrained inversion of satellite gravity gradient data.
The predicted density variation in the upper mantle is generally consistent with seismic velocity anomalies implying a mostly thermal origin of density heterogeneities in the northeast Atlantic. The notable exception is the upper mantle beneath the Greenland-Iceland-Faroe Ridge and southern Norway which appears close to or denser than the background mantle in our model. No strong low-density anomaly is observed under the present-day location of Iceland. Low-density asthenospheric anomalies north and south of Iceland (20-40 kg m −3 ) correlate well with the distribution of Cenozoic seamounts and seamount-like features of the ocean floor.

Appendix A: The Definition of the Gravity Vector and the Gravity Gradient Tensor
In this appendix, we provide a summary of general concepts and equations that form the basis of our forward modeling method used to obtain the results presented in Chapters 3 and 4.
The gravity vector can be expressed using a scalar potential as The gravitation tensor is defined as double gradient of the gravitational potential

of 36
The gravitational potential due to density distribution ρ(r′) at point r can be computed using the volume integral where γ is the gravitational constant. The scalar Green function G is in the form of inverse distance function: The integral forms for the gravity vector and gravity gradient tensor can be obtained by substituting Equations A3 into A1 and A2 The way of solving numerically the integrals Equations A3, A5, and A6 depends on application.
The following relation can be used in the integral Equation A3 The expression for the gravity vector using a non-Cartesian basis e j is where h j are the normalization metric coefficients. For spherical coordinates, the metric coefficients are h r = 1, h ϕ = r, and h λ = r cos ϕ.
The gravity gradient tensor can be written in dyadic notation Here, the derivatives of the basis vectors e i → (e r , e ϕ , e λ ) (radial outward, north, east direction) with respect to spherical coordinates x i → (r, ϕ, λ) are The scalar Green function G can be expressed using surface spherical harmonics (Y n,m ) (Martinec, 2003): where =̂ , ′ = ′̂′ , ̂′ = [cos ′ cos ′ , cos ′ sin ′ , sin ′ ] ; n, m is the spherical harmonic degree and order, respectively; the asterisk denotes complex conjugation.
We substitute this equation in Equation A3. Together with Equations A15 and A16 it gives We obtain an expression for the gravity potential of a thin density layer approximating the radial integral by the first three terms of Taylor series (Novák & Grafarend, 2006;Root, Novák, et al., 2016 Here, h 1 (ϕ, λ), h 2 (ϕ, λ) describe the bottom and top boundary topography of the spherical layer with density ρ(ϕ, λ) and R 0 is the reference radius.
A depth-dependent density within the layer can be modeled using a second-order polynomial function (Root, Novák, et al., 2016): with the density at the top of the sedimentary layer ρ and polynomial coefficients a, b. In this case, the function The external anomalous gravitational potential in geocentric spherical coordinates can be represented in terms of surface spherical harmonics (Balmino, 1994;Hofmann-Wellenhof & Moritz, 2006): where M = 4πρ E R 3 /3 is the total mass of the Earth with the average density of ρ E , R is the geocentric radius, and ̄ are fully normalized spherical harmonic coefficients.
The Stokes coefficients for the gravitational potential can be obtained by equating Equations (B8) and (B9): In the spherical harmonics representation, the first and the second radial derivatives of the gravitational potential are Equation B13 requires differentiation of spherical harmonics (D ϕ Y n,m ). This and other derivatives of spherical harmonics can be obtained by standard recursion relations (Koop, 1993).
The physical components of the gravity gradient tensor can be obtained by substituting the derivatives with respect to the spherical coordinates into Equation A18. The gradient operation, applied twice, provides the expression of the gravity gradient tensor in the local Cartesian coordinate frame: The Green function corresponding to the gravity anomaly on the sphere is The zero-order approximation for the forward problem (Wild-Pfeiffer, 2008) which is equivalent to the sum of point masses is found sufficiently accurate for the geometry of our problem. The integral Equation A6 written in the discrete matrix form for the gravity gradient tensor is where d is the data vector, G is the system matrix, and m is the density vector. The geocentric radius for each layer is referenced to the center of each volume element.
The comparison of modeling results with the GOCE data requires the coordinate transformation according to the traditional sign convention used in geophysics for the gravity gradients. In particular, the radial gravity gradient anomaly is assumed positive for a positive mass anomaly in the local coordinate frame. The conversion of gravity gradient tensor in the local Cartesian system spherical coordinates (LNOF) to the x-, y-, z-tensor components in global Cartesian system (ECEF) can be performed by tensor rotation: where Q(ϕ, θ) is the transformation matrix (Bouman et al., 2013).