Influence of GIA Uncertainty on Climate Model Evaluation With GRACE/GRACE‐FO Satellite Gravimetry Data

Global coupled climate models are in continuous need for evaluation against independent observations to reveal systematic model deficits and uncertainties. Changes in terrestrial water storage (TWS) as measured by satellite gravimetry missions GRACE and GRACE‐FO provide valuable information on wetting and drying trends over the continents. Challenges arising from a comparison of observed and modelled water storage trends are related to gravity observations including non‐water related variations such as, for example, glacial isostatic adjustment (GIA). Therefore, correcting secular changes in the Earth's gravity field caused by ongoing GIA is important for the monitoring of long‐term changes in terrestrial water from GRACE in particular in former ice‐covered regions. By utilizing a new ensemble of 56 individual realizations of GIA signals based on perturbations of mantle viscosities and ice history, we find that many of those alternative GIA corrections change the direction of GRACE‐derived water storage trends, for example, from gaining mass into drying conditions, in particular in Eastern Canada. The change in the sign of the TWS trends subsequently impacts the conclusions drawn from using GRACE as observational basis for the evaluation of climate models as it influences the dis‐/agreement between observed and modelled wetting/drying trends. A modified GIA correction, a combined GRACE/GRACE‐FO data record extending over two decades, and a new generation of climate model experiments leads to substantially larger continental areas where wetting/drying trends currently observed by satellite missions coincide with long‐term predictions obtained from climate model experiments.


Introduction
Readily available freshwater resources are gradually declining in many parts of the world due to increasing demands from human societies for use in agriculture, industry and households (Jasechko & Perrone, 2021).In addition, natural replenishment of those resources is gradually changing under anthropogenic climate change, with shifting precipitation patterns all over the world (Deininger et al., 2020).As a fundamental variable for the interactions between the land-surface and the atmosphere, water kept in the soil or in other hydrological storage compartments has a critical role in the flux of latent, radient, and sensible heat (Mahfouf, 1991).Long-term wetting or drying conditions of terrestrial water storage (TWS) in certain regions of the world in consequence of persistent hydrometeorologic changes would carry important information for decision makers concerned with both national and transboundary water management.However, interactively coupled models used to predict future climatic conditions still exhibit a large spread in the representation of soil moisture (Qiao et al., 2022), snow (Mudryk et al., 2020) and other land water related variables (Song et al., 2021).It is therefore important to use a diverse set of independent observations for evaluating model performance in order to obtain a realistic view on both capabilities and deficits of numerical models.
The satellite missions Gravity Recovery And Climate Experiment (GRACE (Tapley et al., 2004), in operation from 2002 to 2017) and GRACE-Follow-On (GRACE-FO (Landerer et al., 2020), since 2018) have been observing TWS changes from space with global coverage.Jensen et al. (2019) were the first to demonstrate the potential of this data set for the evaluation of long-term drying and wetting trends in water storage as represented in global coupled climate models taking part in the Coupled Model Intercomparison Project Phase 5 (CMIP5), and Jensen et al. (2020) extended this investigation toward analyzing the temporal evolution of the annual cycle of water stored on ground in model experiments of CMIP6.However, besides the successful identification of "hot spot" regions in which water storage trends in the GRACE data record are likely an indication of a long-term evolution of the global water cycle, Jensen et al. (2019) also revealed several challenges: (a) short observation time spans (only 14 years of data were used for that study) are dominated by inter-annual variations that mask long-term climate-related trends, and (b) discrepancies between observed TWS (full integrated water column, possibly superimposed by non-hydrological mass changes) and modeled terrestrial water storage mTWS (CMIP models provide only soil moisture and snow and therefore do not provide information about all storages that were found to be relevant to explain the GRACE observations) regionally hamper the interpretation of results.Apart from climate model uncertainties, these challenges might be the reason for a disagreement between models and observation on the direction of the trend in several regions, such as in Eastern Canada, the Amazon region, the southern part of Africa, or eastern China (see Jensen et al. (2019), Figure 7).With the length of the data record steadily increasing (20 years of GRACE/-FO observations at the time of writing), limitations imposed by the observation time span will become less critical over the remaining lifetime of GRACE-FO and potential successor missions (Daras et al., 2023;Massotti et al., 2021;Wiese et al., 2022).In addition, a new generation of climate models taking part in the CMIP6 intercomparison can be expected to bring further improvements from the modeling perspective possibly leading to reduced model uncertainties.
However, the task of signal separation persists when vertically integrated gravity observations are to be used for assessing modelled water storage compartments.Jensen et al. (2019Jensen et al. ( , 2020) ) identified uncertain regions due to anthropogenic impacts on the water cycle, that are typically not taken into account by climate models (e.g., groundwater pumping and surface water management with dams and reservoirs), and surface water storage in natural lakes and wetlands.In addition, GRACE data is also affected by non-hydrological mass transport processes such as earthquakes.The necessary reduction of glacial isostatic adjustment (GIA) effects from the GRACE mass change data was identified as another important source of uncertainty for GRACE-derived TWS trends.GIA, primarily manifesting itself in a continued vertical motion of regions that were heavily glaciated during the time of the last glacial maximum approximately 20,000 years ago, is often corrected for in geodetic data analysis.However, models used for this correction carry uncertainties related to the specific treatment of, for example, ice history, mantle rheology, sea level variations, and coastline migration, as it might be deduced from the frequent updates of such correction models that took place in particular during the years right after the launch of the GRACE mission (Tapley et al., 2019).Frequently used GIA correction models include ICE-6D_D by Argus et al. (2014), the empirical GIA model LM17.3 utilized in Jäggi et al. (2019), or regionally optimized estimations as described by Sasgen et al. (2017).
The issue of GIA uncertainties shall be investigated in this study to answer the following questions: How does the treatment of GIA signals influence (a) the determination of TWS trends from GRACE/FO data and (b) the (dis-) agreement between drying/wetting trends in satellite gravimetry and CMIP6 climate models?Is the apparent disagreement between observations and models in the highly GIA-affected region in Canada potentially caused by errors in the GIA reduction?
To investigate these questions, we derive gridded linear trends from the combined GRACE and GRACE-FO TWS time series for the time span 2002/04-2021/11 and compare them to long-term trends (1850-2100) of modeled TWS (mTWS) approximated by the multi-model median (MMMed) trends of the sum of the two variables soil moisture and snow provided by an ensemble of CMIP6 models.Uncertainties of GIA modeling are derived from an ensemble of 56 GIA models that differ both in ice history and Earth structure parametrization (Bagge et al., 2021).To assess the influence of the choice of the GIA correction on the GRACE-derived water storage trends, each of the GIA models is successively subtracted from the GRACE time series to come up with 56 different TWS trend estimates.The sensitivity of the sign of the trend to the choice of the GIA model is then analyzed for each continental grid cell.Subsequently, for each of these TWS trend maps the agreement of the direction of the trend is compared to the mTWS MMMed CMIP6 trend to assess the relevance of the GIA correction for the comparison of GRACE with climate model output.

Terrestrial Water Storage From GRACE/GRACE-FO
We use GRACE and GRACE-FO gridded level 3 TWS data from the time span 2002/04 to 2021/11 based on the Release 06 processing of GFZ provided on the GravIS portal (Boergens et al., 2020) mapped to a 2°latitudelongitude global grid.Post-processing steps include the subtraction of the long-term mean field, anisotropic filtering (Horvath et al., 2018;Kusche et al., 2009), the replacement of the coefficients c 10 , c 11 and s 11 from a model-guided approximation method for the geocenter motion (Swenson et al., 2008), the coefficients c 20 , c 30 , c 21 and s 21 from estimates based on SLR (Cheng & Ries, 2017;Dahle et al., 2019;Loomis et al., 2020), the correction of the aliased signal of the S2 Tide (Landerer et al., 2020;Stammer et al., 2014), and the correction of signals associated with three major earthquakes (Sumatra-Andaman 2004, Chile 2010, and Tohoku-Oki 2011) (Boergens et al., 2020).To account for the gravity changes induced by GIA, secular rates of gravity change provided by Peltier et al. (2018) were subtracted in order to arrive at gridded estimates of TWS.We recall that all those postprocessing steps are part of the GRACE-based gridded terrestrial water storage available from GravIS, and will be thus implicitly included in any hydrological application of GRACE observations that is based on data from that website.Therefore, we consider the GIA correction by Peltier et al. (2018) as a reasonable starting point for our study.
For each grid cell, a linear trend is estimated together with the annual and semi-annual harmonics.The trend map is displayed in Figure 1 with purple colors illustrating a long-term drying and green colors indicating a wetting trend.Please note that due to the lack of continental ice dynamics in the CMIP6 models, Greenland and Antarctica are excluded from any further analysis.

GIA-Induced Gravity Rates
Within the GRACE data processed for the GravIS, GIA is treated as a deterministic correction model that is subtracted without the consideration of its inherent uncertainty, since readily applicable error estimates are typically not available.GIA-induced gravity rates given by Peltier et al. (2018) are based on the ICE-6G_D ice history and the VM5a radially symmetric viscosity profile in the mantle constrained by a multitude of observational data-sets.The resulting gravity rates are widely used in GRACE/GRACE-FO satellite gravimetry applications (Tapley et al., 2019) and elsewhere in geodesy and geophysics, but in view of sometimes limited observational evidence and various assumptions (e.g., ice history) and simplifications (e.g., spherically symmetric viscosity distribution in the mantle) applied, those models will naturally represent an imperfect description of the GIA-induced gravity signals.
To further investigate residual uncertainties in GIA-induced gravity rates and to demonstrate the impact of poorly constrained GIA corrections on, for example, climate applications of GRACE/GRACE-FO data, we utilize an ensemble of 56 different GIA realizations from the Viscoelastic Lithosphere and Mantle model (VILMA) (Bagge et al., 2021).Compared to the work by Peltier et al. (2018), the GIA ensemble applied here is advantageous in terms of (a) the utilization of 3D Earth structures considering lateral variations in rheology rather than using radially symmetric viscosity distributions, (b) the consideration of different Earth structure parametrizations as well as (c) the consideration of different ice histories over the last glacial cycle.
The 3D Earth structures used for this work were previously published by Bagge et al. (2020).The viscosity distributions are derived from seismic tomography data from Schaeffer and Lebedev (2013) for depths shallower than 200 km, and from the 2010 update of Grand (2002) for below with a smooth transition and additional geodynamic constraints (e.g., static gravity anomalies, heat flux, mineral physics, Haskell average).The Earth structures account for lateral variations in lithospheric thickness and viscosity in the asthenosphere, upper mantle and transition zone.The upper part of the lower mantle also considers lateral variations, but in the lower part of the lower mantle we use global mean value as 1D layers to reduce the computational burden.For the same reason, we also set all viscosities lower than 6 ⋅ 10 19 Pa s to this threshold.This allows for larger integration time steps and hence less computing time.The Earth structures differ primarily in the applied conversion algorithm from temperature to viscosity.First, we vary the activation enthalpy factor (r) in Arrhenius' law (0.2857, 0.3, 0.4, 0.5, 0.75, 1.0) resulting in a more pronounced lateral variation of the Earth structure and notable differences in lithospheric thickness.Secondly, we alter the radial viscosity profile resulting in different viscosity jumps between upper mantle and transition zone at 410 km depth (s16, sc06, sc06b).The combination of the two types leads to 18 distinctly different Earth structures.
In addition to the various viscosity distributions, we use three distinct ice histories, namely ICE-5G (Peltier, 2004), ICE-6G_C (Argus et al., 2014;Peltier et al., 2015), and the regionally refined NAICE (Gowan et al., 2016) solution, which is a combination of a regional ice sheet model over North America and ICE-6G_C for the remaining globe.Since NAICE has significantly less ice over North America than ICE-6G_C, the residual was uniformly added to the other ice bodies of ICE-6G_C to ensure that globally integrated ice volume and sea level vary consistently.Although the ICE-5G history has been replaced by newer versions from the same authors, we nevertheless include it into the ensemble to obtain a better understanding about the consequences of possible systematic errors in ice histories, since ICE-5G shows a significant different ice distribution and volume particularly in North America.The 18 different 3D Earth structures are combined with the three different ice histories, resulting in an overall 3D GIA ensemble of 54 unique members, which are complemented by two radially symmetric Earth models, that is, the ICE-6G_C ice history combined with the 1D Earth structure VM5a, and the ICE-5G ice history combined with the 1D Earth structure VM2, that are widely used elsewhere (Argus et al., 2014;Peltier, 2004;Peltier et al., 2015).In the following analyses, we will thus consider a GIA ensemble with 56 members in total.
The newly constructed ensemble differs from the work of Caron et al. (2018) by utilizing laterally variable 3D Earth structures instead of a vast number of just radially symmetric viscosity profiles.Our approach is closely related to the work of Pan et al. (2022) who similarly utilized an ensemble of GIA models with horizontally and vertically variable Earth structures to assess in particular variations in global mean sea-level since the last glacial maximum.All 56 GIA simulations considered in this study are performed with the VILMA model, that has been applied in many GIA-related studies in the past (Bagge et al., 2021;Dobslaw et al., 2020;Hoening et al., 2023;Huang et al., 2023;Klemann et al., 2008Klemann et al., , 2015;;Martinec et al., 2018;Schachtschneider et al., 2022).The numerical code solves the sea level equation in the spherical domain and accounts for continuum-mechanics, viscoelasticity, rotational feedback, self-gravitation, mass conservation, and the migration of coastlines.As VILMA accounts for gravitational consistent mass redistributions, we thus directly calculate present-day gravity changes for all ensemble members.Please note that the solution ICE-6G(vm5a) from Peltier et al. (2018) and m_vm5_1d_ice6g from Bagge et al. (2021) use the same 1D Earth structure VM5a as model parametrization, but in detail the ice history differs between ICE-6G_D (Peltier et al., 2018) and ICE-6G_C (VILMA).Furthermore, due to model implementation differences in the GIA codes employed, the model results are notably different.Important differences include the impact of compressibility on the Poisson ratio considered by Peltier et al. (2018), the maximum degree of the spherical harmonics expansion within the model integration of 512 versus 170 (VILMA), and the number of iterations for each glacial cycle (three iterations in VILMA), respectively.
Since all the GIA ensemble members considered for this paper are based on a single computational code, systematic errors cannot be addressed in our study.Nevertheless, we note that VILMA has repeatedly participated in model benchmark studies that involved a wide range of different modelling approaches (Martinec et al., 2018;Spada et al., 2011).We therefore consider VILMA as sufficiently mature that robust conclusions can be drawn from the presented experiments.We also note that VILMA incorporates a feedback from changes in the rotational pole on the linear momentum of the viscoelastically deforming solid Earth (Martinec & Hagedoorn, 2014).This VILMA-inherent pole tide (Wahr et al., 2015) has been removed to make the modelling results consistent with the GRACE processing standards.Further details on those recent VILMA improvements will be published in a separate paper.
Gravity rates taken from GIA models can be expressed in terms of equivalent water height (EWH) following Wahr et al. (1998) and the conversion from gravity potential to gridded EWH values was carried out using the software package GROOPS (Mayer-Gürr et al., 2021).We deliberately choose this rather unusual expression of GIA-induced gravity rates in order to tightly connect to the climate model evaluation presented later in this paper.As expected, Figure 2 reveals that the largest GIA signals corresponding to trends as large as several cm/a in EWH are present in North America and Fennoscandia.Additional strong signals in Greenland and Antarctica are not discussed throughout this article due to generally missing continental ice dynamics in CMIP6 climate models.
In regions heavily affected by GIA, the positive trends tend to be stronger for the majority of the ensemble members compared to Peltier et al. (2018), since the differences of Peltier et al. (2018) minus the ensemble mean, are rather negative (Figure 2b).However, it should be noted that this particularly depends on the ice history applied, and some ensemble members also exhibit smaller trends in these areas.The ensemble standard deviation (Figure 2d) has values of up to 0.02 m/a in Canada and the spread (i.e., the difference between the minimum and maximum values per grid cell) reaches values of more than 0.04 m/a (Figure 2c).We emphasize that these values are well within the range of the GIA trend signal itself (Figure 2a).

Climate Model Data From CMIP6
We use climate model output from the sixth phase of the Coupled Model Intercomparison Project (CMIP6) (Eyring et al., 2016).In CMIP6, the so-called "historical" model experiments cover the time span from 1850 to 2014, and subsequent projections following different "shared socioeconomic pathways (SSP)" are provided from 2015 to 2100.As in Jensen et al. (2020), we choose to concatenate the projection "SSP585" representing the upper end in the range of scenarios (O'Neill et al., 2016) with the historical runs for a continuous time series from 1850 to 2100.
Since there is no TWS variable readily provided by the CMIP6 models, an approximation of this quantity is derived from the sum of the variables "soil moisture content (mrso)" [kg m 2 ] and "surface snow amount (snw)" [kg m 2 ] in line with our previous work Jensen et al. (2019Jensen et al. ( , 2020)).In the following, this sum will be referred to as modelled TWS or mTWS and the corresponding grids are remapped to a common 2°global resolution.It should be noted that there are systematic differences between TWS derived from GRACE and mTWS.The mTWS includes soil moisture and surface snow but no (deep) groundwater, even though part of the groundwater storage change can be expected to be implicitly contained in the soil moisture variations, and no large (artificial) changes in surface waters.Anthropogenic impacts like massive pumping of groundwater or interference with surface waters as in the construction of dams may result in a significant change in the TWS time series determined by GRACE, but has no effect on the mTWS time series.However, Jensen et al. (2019Jensen et al. ( , 2020) ) show that the influence of soil moisture and surface snow on the TWS time series is large enough to enable a comparison between mTWS and TWS despite the systematic differences.
The same ensemble of 17 CMIP6 models as in Jensen et al. (2020) that have provided the relevant variables mrso and snw at the time of writing is also selected for this study, excluding some highly correlated models sharing the same sub-models to avoid biasing the ensemble toward a specific model family.A detailed list of the models that have been excluded can be found in the supplementary material of Jensen et al. (2020) in Figure S2.The linear trend of the mTWS time series is estimated for each model individually and subsequently the median of the trend values is calculated for each continental grid cell outside Greenland and Antarctica.It should be noted that some of the 17 models have several different runs (mostly differing by their initial conditions, and rarely also in model parameterization, initialization method, or the applied external forcing) leading to a total of 105 ensemble members.However, when computing the median the individual runs are weighted so that the influence of all models on the median is the same, regardless of how many runs from each model are available.As the computation of the median for each grid cell globally smooths out extremes leading to a reduced range of values in the median compared to individual model trend maps, a rescaling of the median trend map was performed based on its empirical cumulative distribution function as described by Jensen et al. (2020).The resulting map is in the following referred to as the Multi Model Median (MMMed) trend and it is shown in Figure 3a.The purple color indicates the simulation of long-term drying, while the green color stands for the models predicting a wetting trend.Since there are large inter-model differences in the individual trend estimates, following Jensen et al. ( 2019) we additionally display the "model consensus" in Figure 3b with dark colors indicating high consensus regions, that is, at least 13 out of 17 models (above 75%) agree on the sign of the drying (orange) or wetting (blue) trend.

GIA Influence on TWS Trends From GRACE/GRACE-FO
Before looking specifically into the TWS trends from GRACE, we note that not all GIA realizations considered in this study are equally probable in representing the truth.Present-day GIA-induced gravity rates are the consequence of the viscous response of the solid Earth to the ice history over at least the most recent glacial cycle, and numerous-although sometimes rather imprecise or even contradicting-opportunities exist to validate such models.This includes geologic evidence on relative sea-level in the past, geophysical evidence on the temperature distribution in the mantle, geochemical evidence on mantle composition, and geodetic evidence on present-day crustal deformation.It is beyond the scope of this paper to thoroughly evaluate GIA models and we do not aim to identify a single best-fitting correction model for GIA-induced gravity rates.Instead, we will only attempt to classify the initially 56 GIA realizations in order to arrive at a subset of ensemble members that could be reasonably well considered as being equally probable in representing the truth.
As a first categorization step, we remove all models using the ICE-5G ice history.ICE-5G has been discouraged from further use by its authors for reasons of obvious flaws in the ice distribution that are inconsistent with geological evidence that became only available after its publication.Please note that the very same recommendation also applies to even older ice histories like ICE-4G (Peltier, 2002).In a second step, we utilize globally distributed measurements of present-day uplift rates from various space geodetic techniques contributing to ITRF2020 (Altamimi et al., 2023) to identify 20 ensemble members (Table 1) that fit best to the geodetic record.Generally speaking, 3D models with higher lateral variability and lithospheric thickness (i.e., higher activation enthalpy factor) for all three viscosity jumps between the upper mantle and transition zone fit better to the observed uplift rates than other Earth structures considered.
We note that the combination of the ICE-6G_C ice history and the radially symmetric VM5a earth structure reveals the smallest misfit to the geodetic data record.This is not surprising, since this combination has been optimized in view of (an earlier version of) the geodetic data that is now used for evaluation.While somewhat longer time-series are now available, the global distribution of geodetic instruments attached to the crust has not changed dramatically over the past decade, so that systematic errors in areas not considered during the optimization cannot be revealed with this validation approach.Instead, we argue that 3D viscosity structures derived from seismic tomography are more realistic than a simplified radial profile of mantle viscosities.We also mention that the ICE-5G(VM2) combination produces excellent results when validated against geodetic data (rmse = 1.24 mm/a), which would bring it at the top of all models considered despite of the fact being now deprecated based on geological evidence.All the ICE-5G experiments with 3D Earth structures, however, perform considerably worse than their ICE-6G and NAICE counterparts.
We will now assess how exchanging the Peltier et al. (2018) GIA model, which is by default subtracted in the GravIS data set, with alternative GIA corrections from the VILMA ensemble will affect the sign of the TWS trend (Figure 4).The darker the color, the higher the number of individual GIA ensemble members that lead to a change in TWS trend relative to Peltier et al. (2018) when either all 56 or just the 20 best-fitting alternative GIA corrections are considered.In a relative sense, both subfigures agree surprisingly well with each other.There are areas, as in central Europe, where the sign is not changed by any GIA model at all.In arid areas like Mongolia, the TWS signals observed by GRACE are tiny so that minor changes in the GIA model might already lead to a change in sign, so that those features do not carry any significance.
On the other hand, we also note that a substantial number of GIA ensemble members do indeed change the sign of the GRACE-based TWS trend in Canada and the Northern U.S.This area is of particular interest because of the mismatch between GRACE and the CMIP5-based mTWS trends as revealed by Jensen et al. (2019).TWS from GravIS using the Peltier et al. (2018) model are slightly positive in the Labrador region of Eastern Canada (Figure 1).Larger solid Earth signals from VILMA as given in Figure 2b lead to a sign change in TWS with the majority of the GIA ensemble members for this area.If those alternative GIA corrections would be applied, GRACE would, in fact, observe a large-scale drying in Labrador and neighboring regions, instead of wetting as indicated currently by the GravIS data.
The number of models that do change the sign of the TWS trend differs greatly across North America.At the coast of Labrador and along the St. Lawrence River, roughly 15 out of the 20 best-fitting VILMA ensemble members would lead to a different sign of the TWS trend when replacing Peltier et al. (2018).This number is somewhat smaller in other parts of North America, but still an appreciable number of ensemble members would lead to a different sign of the secular trend.On the one hand, this underlines the importance of the GIA correction for any application of GRACE over the North American continent, which is confirmed by the fact that the influences of GIA and the TWS trends are mainly in the same order of magnitude of about ±4 cm/a.On the other hand, the results also underline the importance of explicitly accounting for errors in any GIA background model applied, and for properly incorporating the associated GIA uncertainty into the error bars provided by GravIS.

GIA Influence on GRACE-Based Climate Model Evaluation
In the following, we investigate whether the possible modifications of the GRACE trend by changing the GIA correction will have an influence on the outcome of a climate model validation exercise that is based on the satellite gravity data.Following Jensen et al. (2019), for each grid cell the sign of the GravIS TWS trend is compared to the sign of the mTWS trend computed as the Multi Model Median (MMMed) of the CMIP6 models.
Please note that we are deliberately comparing trends calculated from 250 years of climate model experiments with roughly two decades of satellite observations.As elaborated in detail by Jensen et al. (2019), we hypothesize that climate-driven wetting and drying is already imprinted in the gravity record after a comparably short observation period, and it is our aim to identify hot-spot regions where long-term shifts in the hydrometeorological regime predicted by a majority of coupled climate models are already visible in present-day observations.Such assessments would help to bolster the credibility of climate models in general and their predictions on the future evolution of, for example, precipitation as the main driver for changes in mTWS.To assess the influence of the GIA model on this comparison, all GIA models in the ensemble are again subsequently subtracted from the GravIS trend prior to the comparison of the sign.
Figure 5a shows the agreement between the MMMed trend from CMIP6 with 56 different TWS trends from GravIS after subtracting each of the VILMA ensemble members instead of Peltier et al. (2018).The darker the color, the more GIA ensemble members from VILMA lead to a sign agreement of the trend between GRACE and the CMIP6 MMMed.It should be noted that both white (0) as well as dark brown ( 56) indicate no impact of the choice of the GIA model.All hotspot regions identified by Jensen et al. (2019) (as discussed in more detail in the next Section) are fully confirmed independently of the GIA correction considered.Other areas like the northern part of the Amazon catchment or the Southeastern U.S. do not show an agreement with the climate models no matter which GIA model is utilized.In all those areas, the consequences of GIA are too weak to become a relevant contributor to gravity changes monitored by the satellites.
On the other hand, however, we identify quite a number of places in notably GIA-affected regions where only a fraction of the VILMA ensemble members under consideration will align the trends from GRACE with the climate models.This is particularly apparent in Manitoba and Ontario with roughly 30 VILMA ensemble members leading to an agreement, and to a lesser extent also in Labrador, where 10 to 20 GIA ensemble members lead to GRACE-based TWS trends with the same sign as in the MMMed of CMIP6.While globally in 27% of the continental area this agreement is influenced by the choice of GIA correction, this fraction reaches about 65% for the area of Canada.
Figure 5b repeats the same analysis focusing on the 20 best-fitting GIA models, and similarly to Figure 4 the two maps agree quite well with each other.Here 22% of the global land areas and 59% of Canada are affected by the choice of the GIA model.These numbers are somewhat smaller compared to the full ensemble of 56 models, most likely because of the selection process resulting in a slightly enhanced coherence of the best-fitting models.

Alternative GIA Corrections in CMIP6 Evaluation
As outlined above, the particular selection of the GIA correction can have a notable influence on GRACE applications in the climate sciences.In the following, we will elaborate this further by selecting two specific realizations out of the 56 VILMA ensemble members.The first realization, called m_1.0_s16_3d_ice6g, belongs to the "s16" group of Earth structures and has shown the best fit of the present-day uplift rates with observational data in the validation of the 3D models from any of the ICE-6G models considered.The second realization, called m_0.5_s16_3d_ice6g has the same viscosity contrast between the upper mantle and transition zone ("s16" group) but is characterized by a smaller lithospheric thickness and lower viscosities in the asthenosphere underneath Laurentia.While it has a slightly larger misfit against space geodetic data when evaluated globally, it is still among the 20 best-fitting models.Furthermore, it is one of the models with the largest influence on the GRACE/CMIP6 evaluation and has been chosen here to demonstrate the span of possible impacts of the GIA correction on this comparison.
Both VILMA ensemble members are subsequently applied as GIA corrections to the GRACE data, and the resulting TWS trends are then used as the observational reference to evaluate CMIP6 climate model experiments (Figure 6).A red color signals drying, that is, a negative trend in the climate model MMMed, while a blue color corresponds to wetting, that is, a positive trend.A dark color indicates that the GravIS TWS trend confirms the sign of the MMMed mTWS trend, while a light color stands for a disagreement between models and observations.In comparison to using the correction by Peltier et al. (2018), see Figure 6a, we find that with m_1.0_s16_3d_ice6g most of the GRACE signals south of Hudson Bay now show a drying trend that is also predicted by the climate models (Figure 6b).For a better comparison to the Peltier et al. ( 2018) correction, the right part of Figure 6b summarizes the impact of substituting the GIA model on the agreement between the CMIP6 and GravIS trends.Here dark green/red areas indicate agreement/disagreement for the alternative GIA model, which was not the case before.With the lower lithospheric thickness and lower viscosities in the asthenosphere in experiment m_0.5_s16_3d_ice6g, even the TWS trends in Labrador become negative and thus align with the CMIP6 models.We also note that changing the GIA correction leads to disagreement between GRACE and climate models at selected places in Russia and also at further isolated locations in Asia and South America.Those pattern are, however, much less spatially coherent than in the Labrador region, and are likely related to generally weak TWS trends in those areas so that already very small changes to the GIA correction have an impact on the results from the CMIP6 model evaluation.
Please keep in mind that coincidence with CMIP6 shall not be used as a selection criterion for GIA corrections, which still need to be selected based on validation with external data from both modern instrumental records as well as geological evidence.Furthermore, in this analysis we did not mask out regions of low model consensus (cf. Figure 3), therefore the comparison might be also somewhat influenced by uncertainties in the MMMed trend estimate.Nevertheless, the analysis underlines the sensitivity of GRACE applications to GIA particular in North America and other previously and currently glaciated regions, and consequently the importance of an accurate GIA correction for reliable estimates of water storage and ice mass changes.

Improvements Since CMIP5 Assessment by Jensen et al. (2019)
As a final note, we show a comparison of the GRACE/GRACE-FO-based CMIP6 model assessment using alternative GIA corrections with the results previously obtained by Jensen et al. (2019) with CMIP5 and a much shorter gravity record (Figure 7a).Those results were based on a TWS trend estimation from only 14 years GRACE data from ITSG-Grace2018 (Kvas et al., 2019;Mayer-Gürr et al., 2018) and 21 sufficiently uncorrelated climate model experiments from the CMIP5 archive.For the analysis of hot spot regions, the comparison is limited to grid cells with a high model consensus, that is, a large number of models agreeing on the sign of the trend.In a slight deviation from Jensen et al. (2019), we define 75% instead of 71% of the ensemble members as "high consensus" to be consistent with the smaller CMIP6 ensemble used in the present study (cf. Figure 3).
Figure 7b shows the agreement in sign of the long-term trend between CMIP6 and the longer time series of TWS products from GravIS using the GIA model m_0.5_s16_3d_ice6g.We note a substantial increase of coherent regions where models and observations agree on the trend when compared to the previous assessment with the agreement area having increased from 48.0% for the CMIP5 comparison to 59.1% for the CMIP6 comparison.Most notable, the Mediterranean region and large swaths of central and eastern Europe show a drying climate in the GRACE records that is also predicted by the majority of CMIP6 models.We also see a coherent  2019), (b) multi-model median from CMIP6 and GRACE/GRACE-FO TWS grids from GravIS.Analysis is limited to grid cells in which at least 75% of models agree on the sign of the trend, cf. Figure 3.For legend see Figure 6.
pattern of wetter climate conditions in central Africa that has not been detected in our previous work.This increase in coherence among models and satellite data documents the progress that has been recently achieved both in terms of reducing systematic errors and noise in the satellite gravity data as well as in refining numerical climate models.The latter is underlined by a better consensus of the model ensemble regarding the direction of the trend with 43.7% of the continental area being identified as "high consensus" regions for the CMIP6 ensemble versus only 34.5% for CMIP5.

Conclusions
Correcting secular changes in the Earth's gravity field caused by ongoing Glacial Isostatic Adjustment (GIA) are not only important for the quantification of continental ice-mass loss in Greenland and Antarctica, but also for the monitoring of long-term changes in terrestrial water storage from GRACE/GRACE-FO in particular in the former ice-covered regions of North America and Fennoscandia.By utilizing a new ensemble of 56 individual realizations of the GIA signals with the viscoelastic Earth model VILMA using 18 different 3D Earth structures, three different ice histories and two standard 1D models, we find that many of those alternative GIA corrections are able to change the direction of GRACE-derived TWS trends, for example, from showing mass gain into drying conditions in particular in Western Canada.The sensitivity of GRACE to GIA uncertainties underlines the importance of continued GIA research required to further improve the understanding of this important geodynamic process, and subsequently to increase the reliability of GIA corrections applied to GRACE and other geodetic data.The change in the sign of the mTWS trends induced by an alternative GIA correction subsequently also impacts the conclusions to be drawn when using GRACE as observational basis for the evaluation of climate models as it influences the dis-/agreement between observed and modelled wetting/drying trends.We recall, however, that any decision about the choice of a suitable GIA correction should be based on observational constraints and geophysical expert knowledge and must not be rooted on a better fit with numerical climate models.
We select two particular alternative GIA realizations from VILMA in combination with a newly processed mass trend map from GRACE and GRACE-FO to evaluate these wetting and drying trends in climate model experiments collected by the latest international climate model intercomparison project (CMIP6).Compared to our previous assessment based on CMIP5 and a somewhat shorter GRACE record, we find that the spatially coherent regions of agreement between observations and climate models have grown substantially.This is both related to improvements in the climate model experiments, where in particular deeper soil schemes have led to a considerable more realistic mTWS representation, a combined GRACE/GRACE-FO gravity time-series of roughly 20 years that better allows discriminating long-term trends from interannual-to-decadal climate variability, and also to refined GIA corrections, that lead to a better agreement between GRACE/GRACE-FO and the climate models particularly in the Labrador region.Achievements from very different branches of the geosciences (i.e., refined processing of GRACE/GRACE-FO satellite sensor data, revised upper mantle viscosities, more elaborate soil moisture schemes) eventually lead to a better correspondence of observed and modelled trends in mTWS, thereby providing (a) further credibility to the predictive skill of numerical climate models, and (b) further empirical evidence that ongoing changes in the Earth's climate due to continuously increasing concentrations of carbon dioxide and other Greenhouse gases do have a profound (and measurable) impact on hydrological storages already today.

Figure 1 .
Figure 1.Secular trend in terrestrial water storage (TWS) calculated from GRACE/GRACE-FO monthly Level-3 grids for the period 2002/04 until 2021/11 as available from GravIS, where the GIA model by Peltier et al. (2018) has been subtracted.

Figure 2 .
Figure 2. Statistical indicators of the VILMA GIA ensemble with 56 members as expressed in EWH: ensemble mean (a), differences of ensemble mean against (Peltier et al., 2018) (b), ensemble spread, that is, difference between the min and max values of the ensemble (c), and ensemble standard deviation (d) for each grid cell (excluding Antarctica).

Figure 3 .
Figure 3. Multi-model median (MMMed) calculated from 17 different CMIP6 climate models for the secular trend in mTWS as expressed in equivalent water height (EWH) indicating drying (purple) and wetting (green) conditions (a); and the model consensus characterized by the number of models out of the 17 ensemble members that agree on the sign of the trend for drying (orange) and wetting (blue) conditions, with dark/light colors indicate high/low model consensus, that is, more/less than 13 models agreeing (b).

Figure 4 .
Figure 4. Change of sign of the GRACE-based TWS trend when substituting the GIA correction: number of GIA models when considering the (a) full VILMA ensemble consisting of 56 members, and (b) the 20 best-fitting models only.In the colorbar the White/dark red values in the colorbar refer to no/all GIA models.

Figure 5 .
Figure 5. Agreement of the GravIS TWS trend after subtracting alternative GIA models to the MMMed mTWS trend: number of TWS trends after subtracting each model of the GIA ensemble that show agreement with the mTWS MMMed trend, both for the full ensemble (a) and for the 20 best-fitting GIA models (b).In the colorbar the white and dark red values refer to no/ all GIA models.

Figure 6 .
Figure 6.Left: Agreement of the MMMed mTWS trend compared and the GravIS TWS trend using the different GIA models: (a) Peltier et al. (2018), (b) ensemble member m_1.0_s16_3d_ice6g, and (c) ensemble member m_0.5_s16_3d_ice6g.Right: The effect of the substitution of the GIA model on the agreement compared to Peltier et al. (2018).

Figure 7 .
Figure 7. Agreement on the sign of the long-term trend in TWS between numerical climate models and satellite gravimetry: (a) multi-model median from CMIP5 and GRACE TWS grids from ITSG-Grace2018, similar to Jensen et al. (2019), (b) multi-model median from CMIP6 and GRACE/GRACE-FO TWS grids from GravIS.Analysis is limited to grid cells in which at least 75% of models agree on the sign of the trend, cf.Figure3.For legend see Figure6.

Table 1
GIA Model Ensemble With 20 Members as Selected Based on a Validation Against Globally Distributed Uplift Rates Obtained From Space Geodetic Data Note.A larger activation enthalpy factor (AEF) causes larger lateral viscosity variations and lithospheric thicknesses.The radial viscosity profile (RVP) controls the viscosity jump between the upper mantle and transition zone.AEF and RVP are only applicable for 3D structures.The example models presented in Figure6are marked in bold print.