Sensitivity of the Atlantic Meridional Overturning Circulation to Model Resolution in CMIP6 HighResMIP Simulations and Implications for Future Changes

A multimodel, multiresolution ensemble using Coupled Model Intercomparison Project Phase 6 (CMIP6) High Resolution Model Intercomparison Project (HighResMIP) coupled experiments is used to assess the performance of key aspects of the North Atlantic circulation. The Atlantic Meridional Overturning Circulation (AMOC), and related heat transport, tends to become stronger as ocean model resolution is enhanced, better agreeing with observations at 26.5°N. However, for most models the circulation remains too shallow compared to observations and has a smaller temperature contrast between the northward and southward limbs of the AMOC. These biases cause the northward heat transport to be systematically too low for a given overturning strength. The higher‐resolution models also tend to have too much deep mixing in the subpolar gyre. In the period 2015–2050 the overturning circulation tends to decline more rapidly in the higher‐resolution models, which is related to both the mean state and to the subpolar gyre contribution to deep water formation. The main part of the decline comes from the Florida Current component of the circulation. Such large declines in AMOC are not seen in the models with resolutions more typically used for climate studies, suggesting an enhanced risk for Northern Hemisphere climate change. However, only a small number of different ocean models are included in the study.


Introduction
The Atlantic Meridional Overturning Circulation (AMOC) is a key component of the three-dimensional ocean circulation. It transports warm and salty water northward into the upper layers of the Atlantic Ocean and Nordic Seas and exports cold and dense water southward into the deep Atlantic Ocean (Buckley & Marshall, 2016). It plays an important role in the Northern Hemisphere climate, via northward heat transport (NHT) and surface fluxes (Bryan, 1962;Ganachaud & Wunsch, 2000;Jackson et al., 2015;Weijer et al., 2019).
The relative abundance of new and continuous observations of ocean circulation in the North Atlantic, at 26.5°N as part of the RAPID-MOCHA array over 2004 to present (Cunningham et al., 2007;Smeed et al., 2017) and further north at OSNAP over 2014 to present (Lozier et al., 2017) and OVIDE over 2002 to present (Mercier et al., 2015) are challenging previous theories of AMOC (Lozier et al., 2019). Combining these with theoretical models  and reanalyses Karspeck et al., 2017) can give insight into key processes controlling the mean state and variability.
The representation of the AMOC in Coupled Model Intercomparison Project (CMIP, Eyring et al., 2016)-type models (both coupled and ocean only) has been studied extensively (e.g., Weaver et al., 2012;Weijer et al., 2020). However, simulating the AMOC consistently with observations with these models remains a challenge (Cheng et al., 2013;Reintges et al., 2017;C. D. Roberts et al., 2014), even when using the same surface forcing (Danabasoglu et al., 2014(Danabasoglu et al., , 2016. This illustrates that the AMOC includes a complex set of processes sensitive to model settings and properties, including (but not limited to) (a) the representation of buoyancy fluxes and air-sea feedbacks (Kostov et al., 2019), (b) ocean parameters and parameterizations (Danabasoglu et al., 2014(Danabasoglu et al., , 2016Marshall et al., 2017), and (c) systematic model biases that can prevent a realistic representation of ocean stratification and, in turn, of deep water formation (Heuzé, 2017;Sgubin et al., 2017). All of these aspects are themselves likely to be sensitive to both model formulation and model resolution (i.e., grid spacing).
The effect of horizontal resolution in coupled atmosphere-ocean-sea-ice models on the AMOC is debatable in the current literature, with few systematic multimodel comparisons. Some studies show a strengthening of the AMOC with increased resolution (Hirschi et al., 2020;, other studies show an AMOC weakening (Delworth et al., 2012;Putrasahan et al., 2019;Winton et al., 2014), and some even show both local increases and decreases (Sein et al., 2018). The ocean horizontal resolution has also been shown to influence processes that affect the AMOC, such as (a) the temperature and salinity of the subpolar gyre (Menary et al., 2015), (b) the path of the Gulf Stream and its extension (Sein et al., 2018), and (c) deep water formation and sinking (Katsman et al., 2018;Menary et al., 2018). Vertical resolution can also be important in reducing spurious mixing from overflows (Zhang et al., 2011). The impact of atmospheric resolution alone on AMOC in coupled climate simulations has few multimodel studies. Grist et al. (2018) showed small changes to Atlantic NHT relative to atmosphere resolution changes, while for individual models C. D. Roberts et al., 2018; there is some evidence for changes in mean state.
A close relation between AMOC variability and changes in deep convection exists in models and is inferred from paleoreconstructions, especially in the Labrador Sea (Brodeau & Koenigk, 2016;Caesar et al., 2018;Danabasoglu, 2008;Eden & Willebrand, 2001;Thornalley et al., 2018) and biases in deep convection are a likely candidate for biases in AMOC strength, variability, and sensitivity to atmospheric changes. Reduced biases in North Atlantic water mass characteristics thus may increase realism in AMOC simulation in high-resolution (HR) models. For instance, resolving Labrador Sea eddies, which were found instrumental in water mass transformation and densification of the boundary current in an idealized model of a subpolar marginal sea , may change the sinking branch of the AMOC and its sensitivity to changes on ocean and atmosphere background state. Overflows from the Nordic Seas are still poorly represented in even the highest-resolution global models available, but the upper branch of the AMOC, being mainly sourced by warm and salty water from the Indian Ocean via Agulhas Rings , is already resolved at eddy-present resolutions around 25 km. A key change in eddy-present and eddy-rich ocean models (around 10 km resolution) is the response of currents to winds. In low-resolution (LR) models increased energy input by winds is directly transferred to increase the mean kinetic energy of the currents. At higher resolution (HR), the increase in wind input is often compensated by increased energy transfer to the mesoscale eddy field, making mean currents less sensitive to wind changes when the eddy field is resolved. This so-called eddy-saturation effect has been demonstrated to exist in observations (Beal & Elipot, 2016;Meredith et al., 2019). Most CMIP models cannot directly simulate the Agulhas leakage because of their coarse resolution, so there is low confidence in their projections of leakage change. However, HR models potentially give better projections of changes in Agulhas leakage.
In this study, we focus on the role of model horizontal resolution in both the atmosphere and ocean. We use the CMIP6 High Resolution Model Intercomparison Project (HighResMIP) experimental design for our simulations (Haarsma et al., 2016). This protocol has the advantage of being relatively short and with all models starting from the same observed initial conditions, enabling model simulations at HRs that would otherwise be unaffordable. However, due to the short spin-up time, the simulations do suffer from ongoing model drift, and we attempt to account for that using control simulations with constant forcing.
This work complements that of Jackson et al. (2020), in which there is more of a focus on the mechanisms of AMOC decline in the CMIP6 simulations using the HadGEM3-GC31 models. That study makes use of a subset of the HighResMIP multimodel ensemble discussed here to provide additional evidence of robustness of the proposed mechanisms.
Our study has several objectives. Our first goal is to describe the multimodel, multiresolution simulation of the AMOC using a common protocol, the associated effectiveness of this protocol (e.g., impact of model drift), and the systematic differences that horizontal resolution causes. Second, we investigate the reasons for the uncertainties in the model representations of the AMOC. Third, we examine whether the model historic performance explains or relates to the magnitude of future changes.
In section 2, we describe the models and simulations used in this study, with the observational data and model metrics of the AMOC shown in section 3. Section 4 describes the results, including the mean state, variability, and future change in the AMOC. Section 5 is a discussion of the implications of the results.

Simulations
All simulations follow the coupled CMIP6 HighResMIP experimental design (Haarsma et al., 2016), which is composed of the following experiments: i spinup-1950, a short (30-50 year) experiment starting from EN4 ocean temperature and salinity data averaged over the period 1950-1954(Good et al., 2013, which provides initial conditions to both control-1950 and hist-1950 simulations, ii control-1950, which uses constant 1950s forcing and continues for at least 100 years, iii hist-1950, which uses time-varying external forcings from 1950-2014 and provides initial conditions to highres-future, and iv highres-future, which prescribes external forcings following the SSP585 future scenario for the period 2015-2050. The forcings include a simplified aerosol optical properties formulation (Stevens et al., 2017), which incorporates anthropogenic changes over time. We use results from the control-1950 simulations to examine the baseline mean state, long-term stability, and model drift from initial conditions; the hist-1950 simulations to compare with observations and examine forced changes over time; and highres-future to examine projected circulation changes.

ROBERTS ET AL.
We focus on the "standard" LR and HR models, that typically comprise an ocean grid spacing of 1°(noneddying) and (1/4)°(eddy present), respectively, together with a range of atmosphere resolutions (see Table 1). Two models (CMCC-CM2-(V)HR4 and MPI-ESM 1-2) use the same eddy-present ocean with two different atmosphere resolutions, two other models have similar additional configurations (ECMWF-IFS-MR, HadGEM3-GC31-MM), and two models have been run with an eddy-rich ocean resolution (HadGEM3-GC31-HH, CESM1-3-HH). Henceforth, we will use the CMIP6 resolution naming convention (in km) as shown in Table 1.
We note that five of the seven coupled models included here use the NEMO ocean model, and hence, there is a lack of exploration of structural uncertainty in this work. Several models have produced multiple ensemble members (column 4 of Table S3), and these are used where possible to indicate AMOC spread due to internal variability.

Observational Data and Comparison Methods
We use data from the RAPID-MOCHA observational array at 26.5°N (Johns et al., 2011;Smeed et al., 2017) for characteristics of the AMOC, NHT and their components over the period 2004-2017. NHT is defined by where ρ is seawater density, c p is the specific heat capacity of seawater, v is meridional velocity, and θ is potential temperature and where the double integral is taken over the full area of the transbasin section.
We use the RapidMoc package (C.D. Roberts, 2017; C. D. Roberts, Garry, & Jackson, 2013) to calculate model quantities comparable to these observational values, with input parameters shown in Table S2 in the supporting information. Note that some of the calculations, such as the Ekman transport, may be sensitive to the exact latitude used, as well as how the model grid aligns with these latitudes. We use Deep Mixed Volume (DMV; Brodeau & Koenigk, 2016;Koenigk et al., 2020) in the Labrador Sea (70-40°W, 45-72°N) and Nordic Seas (20°W to 20°E, 65-82°N) to evaluate the deep mixing processes with implications for dense water formation.

Structure of Overturning Circulation
The spatial structure of the AMOC in both depth and density (as a function of sigma-2) space from the control-1950 simulations, averaged over years 50-100, are shown in Figure 1, where the upper figures show models with changes in both atmosphere and ocean resolution, while the lower figures show changes in one component only (Table 1). Each row shows the mean state for each resolution, together with the difference between them.
For models with a change in ocean resolution, 1. in depth-space, two models have a strong increase of AMOC strength with increased resolution, one has a weak increase and two have a decrease. There is also a typical decrease around 50-60°N; the mechanism for this is explained in Hirschi et al. (2020) and is due to zonal variations in the depth of the subpolar gyre circulation. 2. in sigma-2 space, the higher latitude/more dense part of the AMOC increases with resolution apart from MPI-ESM 1-2, with maximum change in the subpolar gyre between 40°N and 60°N. This may indicate more dense water formation and/or a difference in the source water properties at these latitudes in the HR models. 3. There is a hint of a reduced or change in density of flow into the Atlantic at 30°S in the HR models.
For models with the same resolution ocean but increased atmosphere resolution, in sigma-2 space the change in the circulation is typically smaller than that seen when the ocean resolution increases; also, both in sigma-2 and depth-space, MPI-ESM 1-2 has a weakening of AMOC with higher atmospheric resolution (as described in Putrasahan et al., 2019), while the other models typically show a small strengthening (including ECMWF-IFS and HadGEM3-GC31 with only an atmosphere resolution change, not shown). The structures of the mean Atlantic meridional stream function (in Sv) calculated in depth and sigma-2 space for HR control-1950 simulations, for years 50-100, and the differences between these and the lower-resolution (LR) counterparts. Columns 1 and 3 are the mean AMOC for each model stated, and columns 2 and 4 are the differences between the resolutions stated. Rows 1-4 show models with a change in ocean resolution from 100 to 25 km (HR-LR models). Note that CMCC-CM2 and MPI-ESM 1-2 only include a change in atmospheric resolution, and HadGEM3-GC31-HH shows only a change in ocean resolution. The bold line shows the 0 Sv contour, and contour interval is 4 Sv.
The simulation of the Denmark Strait and Iceland-Scotland ridge overflows remains a major problem with these models, as with previous model generations (Yeager & Danabasoglu, 2012). The Estimating the Circulation and Climate of the Ocean state estimate (ECCO Version 4 Release 3; https://ecco.jpl.nasa.gov) as shown in Johnson et al. (2019) suggests sigma-2 densities of over 37.8 in the Greenland-Iceland-Norway (GIN) Seas north of 65°N and flows to the south of the ridges of up to 37.4. Although some of the models capture the GIN Seas density, they all have considerably less dense flow of at most 37.0 to the south of the ridges, showing that they have too much mixing in the overflows. The CESM1-3-LL model uses a parameterization of overflows (Danabasoglu et al., 2010) to reduce the mixing and allow greater sinking. This may be why CESM1-3-LL has a stronger and deeper overturning in density space, despite having a weaker overturning in depth space.

Characteristics at 26.5°N and Comparison With Observations 4.2.1. Mean State
The vertical structure of observed and simulated overturning stream functions at 26.5°N are compared in Figure 2 using RAPID-MOCHA and the hist-1950 simulations over 2004-2014. Previous work has demonstrated that the depth structure of overturning stream functions observed by the RAPID-MOCHA array can be sensitive to the choice of reference level for the geostrophic approximation (C. D. Roberts, Garry, & Jackson, 2013). To account for this potential inconsistency, all simulated profiles are calculated using the python RapidMoc package (C. D. Roberts, 2017), which implements a RAPID-style calculation that includes the geostrophic approximation for ocean interior transports and a consistent level of no motion. Using this methodology generally improves the agreement between models and observations in the deep ocean with little impact on the depth and strength of the overturning maximum. Further details on the implementation of this approximation in an ocean general circulation model are provided by C. D. Roberts, Garry, and Jackson (2013).
Although some models accurately simulate the strength and depth of the AMOC maximum, most models underestimate the depth of the return flow such that the overturning cell is too shallow. This result is consistent with previous coupled model studies (e.g., C. D. Roberts, Waters, et al., 2013). The notable exception is the "eddy-rich" HadGEM3-GC31-HH model, which has an overturning stream function that is very close to the observed profile, perhaps related to reduced biases (see M. J. Roberts et al., 2019 and later), while CESM1-3-HH captures the maximum and CESM1-3-LL may be helped by the overflow parameterization. The CMCC-CM2 models are also notable as the only models with overturning maxima that are much higher than observed. However, the vertical shear is also much larger than observed such that the return flow is still too shallow. Models generally compare less favorably with RAPID observations in the deep ocean when stream functions are computed from model velocities (see C. D. Roberts, Garry, & Jackson, 2013;C D. Roberts et al., 2018).
These errors in the depth structure are related to the density of the return flow, and hence the depth at which the Deep Western Boundary Current (DWBC) flows. Analysis of zonal sections of velocity at 26.5°N (not shown) indicates that several of the HR models (EC-Earth3P, ECMWF-IFS, and MPI-ESM 1-2) have the lowest part of the DWBC core at around 3,000 m, and hence, this is the depth at which the Figure 2 profile will reach zero. In contrast, in the CMCC-CM2-(V)HR4 it reaches to nearly 5,000 m, with the other models falling in between these ranges. In the LR models the DWBC has a much less well-defined core, perhaps partly due to the Gent and McWilliams (1990) scheme flattening the isopycnals (Lüschow et al., 2019). The depth of the return flows indicated by these profiles is consistent with the differences between the models seen in Figure 1, in which models with the densest deep water have also a larger return flow at greater depths.
The temperature and salinity biases at 26.5°N for the models from the hist-1950 simulations, compared to EN4 analysis, are shown in Figure 3. Many of the models are too fresh near the surface, and too saline in the thermocline between 500 and 1,500 m. The HadGEM3-GC31-LL model is a typical example, and how this bias evolves over time is shown in M. J. Roberts et al. (2019). The CMCC-CM2 models are too saline at all depths, which would be consistent with their deep water being too dense and overturning too deep and intense; CESM1-3-HH is also too saline near the surface-both these models have versions of the same atmosphere model. The HadGEM3-GC31-HH model is among the best models in minimizing the salinity bias (and the AMOC profile bias), which was also seen in the small drift shown in M. J. Roberts et al. (2019), and perhaps this suggests an important role for explicitly representing mesoscale processes in minimizing this bias (Griffies et al., 2015;Small et al., 2014).
The components of the AMOC transport in the upper ocean are shown in Figure 4 for both models and observations. As for the overturning stream functions, this decomposition is performed on models using the RapidMoc python package (C. D. Roberts, 2017) and follows the RAPID observations methodology (McCarthy et al., 2015). The upper AMOC limb is split into contributions from transport in the wind-driven Ekman component (ekman), the upper mid-ocean (umo) component (based on thermal wind and a mass conservation constraint), and the western boundary Florida Current (fc).
The exact decomposition using RapidMoc depends on the longitudes chosen for each model (Table S2), and the sections were examined to ensure that the Florida Current (fc) did fall within the longitudes specified-in particular the Bahamas are generally not resolved in LR models, and in such cases the longitudes are chosen to span the width of the boundary current. However, the Antilles Current, generally stronger (though seasonal) in the HR models, is not counted in the Florida Current component. Based on these settings, the Florida Current transport is often larger in the LR models and above the observed value. The boundary current in these 100 km ocean models is one to two grid points wide and constrained near the Florida coast (not shown), while at HR it is faster, narrower and less coastally constrained. The LR models typically have a larger compensation in the Upper Mid-level Ocean (umo) component. The Ekman component is typically underestimated by all the models apart from CMCC-CM2 and CESM1-3, and (apart from HadGEM3-GC31) the LR models also have stronger Ekman contributions. This points to errors in the atmospheric wind forcing, with the proviso that the alignment of the latitude of the section with the model grid may impact the Ekman transport in particular (this specifically impacts the MPI-ESM 1-2 model). There is no clear pattern in whether the changes in AMOC strength between resolutions come from the fc or umo component (although the change in the ekman component is generally small). We also note that there are some models (e.g., EC-Earth3P and CNRM-CM6-1) where the AMOC calculated in this way for this time period decreases with model resolution, consistently with Figure  The main sites of deep convective mixing and dense water formation in the North Atlantic are the subpolar gyre (Labrador and Irminger Seas) and the Nordic Seas north of Iceland. To examine which of these regions contribute to dense water (and hence to the AMOC mean state), Figure 5 shows the mean AMOC from hist-1950 models plotted against the DMV (Brodeau & Koenigk, 2016;Koenigk et al., 2020), the volume of water in the mixed layer below a certain depth, here 700 m in the Nordic Seas and 1,000 m in the Labrador Sea. The observed DMV is based on ARGO float mixed layer depth for the period 2000-2015, both using climatological and maximum mixed layer depth. In terms of DMV, these observed values are only consistent with a small subset of models, most of which (apart from CNRM-CM6-1) having relatively weak AMOC. For some models (HadGEM3-GC31 and ECMWF-IFS) the HRs have markedly more activity in the Labrador Sea than in the Nordic Seas, which becomes significant in future projections-see later and Jackson et al. (2020). There is a strong relationship between DMV in both regions and the AMOC strength across models, with those models producing more dense water in the Labrador or GIN Seas having a greater AMOC strength. However, the observations show a much stronger AMOC for a given DMV than the models, suggesting that there are other deficiencies in the models. Further details of mixed layer depths and surface fluxes can be found in Koenigk et al. (2020).
The strength of the link between AMOC variations at different latitudes (Bingham et al., 2007) is central to disentangling the role of Labrador Sea waters on the AMOC, as well as to determine if the RAPID observations at 26.5°N are representative of the large-scale AMOC stream function, and this is discussed further in supporting information S1.
Another way of illustrating the AMOC and gyre circulation structure is to project the meridional transport on potential temperature-salinity (θ-S) plane, as V(θ,S) (e.g., Bailey

Journal of Advances in Modeling Earth Systems
The unit of V(θ,S) is Sverdrup over area in (θ-S) space, Δθ × ΔS. V(θ,S) connects the vertical profiles of the AMOC stream function (Figure 2), salinity, and temperature ( Figure 3) and depicts the circulation in a water mass/property perspective (Döös et al., 2012). Integrating V(θ,S) ( Figure S2) along isopycnals yields the overturning stream function with respect to density ( Figure 6) for the hist-1950 simulations, along with the results from an Atlantic simulation that has been shown to represent the observed structure of the AMOC and subtropical gyre (Xu et al., 2016). At 26.5°N, the stream function typically exhibits a maximum overturning near 36.4-36.64 kg m −3 (denoted by symbols in Figure 6a) and a secondary maximum overturning near 34-35 kg m −3 , representing the diapycnal AMOC and the diapycnal transformation associated with the subtropical gyre, respectively. Three simulations (EC-Earth3Pp1, EC-Earth3P-HRp1, and ECMWF-IFS-LR), however, do not have significant Antarctic Intermediate Water (AAIW) contribution to the AMOC (see S2 and Figure S2 for details). In consequence, their maximum stream function is found at the otherwise secondary overturning density (denoted by symbols with black edge in Figure 6a) and the AMOC transport is much weaker. As in Figure 1 we generally see the HR models (apart from CNRM-CM6-1) having a stronger transport in the densest layers, and the density of the peak transport is indicated in Figure S2.
With V(θ,S) and the corresponding density structure of the stream function in Figure 6a, one can further define characteristic θ and S of the meridional flow across this latitude. Figures 6b and 6c display the transport-weighted θ and S for the northward flow above the maximum overturning density and the southward flow below the maximum overturning density. The results show that all simulations except EC-Earth3P and EC-Earth3P-HR (which have weak AMOC due to a lack of AAIW contribution) exhibit a colder northward flow, and all simulations exhibit a warmer southward flow. In other words, these models exhibit a systematic smaller temperature contrast between the upper and lower layers, which leads to a weaker heat transport for a given AMOC transport (as discussed in section 4.2.2). On the other hand, despite significant variations in the upper layer, the hist-1950 simulations show no such systematic bias in the characteristic S in the northward and southward flows (in the upper layer, there could be some compensation between fresher near surface water and saltier thermocline water, as in Figure 3). For a comparison, the black line/dots are results from the (1/12)°Atlantic simulation in (Xu et al., 2016) in which the water mass structure of the AMOC and subtropical gyre is shown to be consistent with the observations. Circles are eddy-none model resolutions, triangles are eddy-present and stars are eddy-rich.

Heat Transport
While the volume transport associated with AMOC is clearly important for the ocean circulation and water masses, the meridional heat transport associated with both the transport and the flow temperatures is key for the Atlantic and Arctic climate state and surface fluxes (Docquier et al., 2019;Grist et al., 2018;Jackson et al., 2015). The relationship between annual mean AMOC transport and heat transport (total, gyre, and overturning components), for models using hist-1950 (1950-2014) and for RAPID-MOCHA (2004-2017, is shown in Figure 7. For the models with LR (around 100 km), it is rare for the total NHT to lie within the observed range (1.05-1.4 PW) apart from CESM1-3-LL, and this improves as ocean resolution increases to 50, 25, and 8-10 km for most models, as also found in other studies with similar models (Docquier et al., 2019;Grist et al., 2018).
In Figure 7 we also show a decomposition of the total transport into overturning and gyre components (McCarthy et al., 2015). For all models apart from CMCC-CM2-(V)HR4, the gyre component of the heat transport is consistent with, or slightly larger, than that observed (black triangles) at all model resolutions. In the CMCC-CM2 models the gyre component is very small, which is consistent with a reduced umo component (as shown in Figure 4) and a regime dominated by the overturning circulation (McCarthy et al., 2012). The total NHT is the sum of gyre and overturning components in Figure 7, and hence, it follows that the main cause of the too weak NHT across most models is due to the overturning.

Journal of Advances in Modeling Earth Systems
For a given AMOC transport, the associated heat transport as shown in Figure 7 is consistently smaller than that observed, apart from CESM1-3-HH. This is expected given that the modeled temperature contrast between the northward limb ( Figure 6b) and the southward limb (Figure 6c) is also smaller than observed, perhaps with the exception of EC-Earth3P where the peak AMOC is higher in the column (Figure 2). The zonal mean biases in Figure 3a are broadly consistent with this, with most of the models (CMCC-CM2 and CESM1-3-HH being exceptions) being too cold near the surface by up to 3-4 K and too warm in the thermocline between 500 and 1,500 m and deeper. This systematic model offset compared to the observations in PW/Sv means that only models with stronger than observed AMOC span the range of observed NHT. Among the models shown here, the eddy-rich CESM1-3-HH and HadGEM3-GC31-HH models would seem to be overall the most consistent with the observations in terms of the range of AMOC and NHT, given that these simulations also exhibit the closest vertical structure of the AMOC as in Figure 2; however, note that the CESM1-3-HH model has a strong near-surface warm bias which helps to reduce the PW/Sv offset described above. The CESM1-3-LL model stands out in the LR models, suggesting how important the dense overflows are for the North Atlantic circulation .
The generally too-weak temperature difference between the northward and southward flows was also found in the GFDL-CM2.1 and NCAR-CCSM4 climate models (Danabasoglu et al., 2014;Msadek et al., 2013), with the primary cause diagnosed as an overly diffuse thermocline. Both these temperature errors and the shallower bias in vertical structure of the AMOC contribute to the NHT bias, while for CESM1-3 the warmer surface water allows the overturning component of the NHT to agree with observations. Taking a simple relationship, if the temperature difference between northward and southward flow is 10% too low (1-2 K), then the AMOC transport would have to compensate by~2 Sv to obtain a similar NHT. This is consistent with the biases seen in the models.
Given the experimental design used here, with a short multidecadal spin-up from observed initial conditions, these temperature and salinity biases may seem large. This could partly be caused by strong initial adjustment, perhaps due to inconsistencies between the initial ocean state, the forcing and the atmosphere models used. Over longer timescales (for example in a CMIP6 preindustrial control simulation) the model would tend to produce its own climatology and hence perhaps different biases from those shown here. This is an area for future work.

Variability of AMOC and NHT
The time series of the AMOC at 26.5°N from the spinup-1950, hist-1950 and highres-future experiments, for all available ensemble members, are shown in Figure 8.
For the 100 km models, some of the initial trends in hist-1950 are due to model drift common to the control-1950 experiment (as shown in Figure 9 when the hist-1950 and control-1950 time series are subtracted to give near-zero trend). For the HR models, the drift after the spinup-1950 is generally smaller so that the gradual decline over time in most of the models is also reflected in Figure 9. The impact that physics changes can make to the AMOC are illustrated by the EC-Earth3P-LR p1 and p2 models. The subannual variability of the AMOC is discussed in supporting information S3.

Response to Future Forcing
Using the HighResMIP experimental design requires a different approach to analyzing projected future changes, since as discussed above the simulations are not equilibrated in the control period. We will assume in the following that any long-term model drift is common to both the control-1950 and hist-1950/highresfuture simulations, such that the difference between these experiments will show the response to time-varying forcing. More detailed analysis of these simulations, perhaps examining heat uptake or other measures of drift and using the ensemble members where available, will be needed to assess the robustness of this assumption in general.
The change in AMOC over time, calculated as a percentage change using the year-by-year difference between hist-1950/highres-future and control-1950 time series divided by the control-1950 long-term mean, is shown in Figure 9 for each ocean resolution. The time series have been smoothed with a 30 year running 10.1029/2019MS002014

Journal of Advances in Modeling Earth Systems
mean and the ensemble mean is shown as thick lines, in an attempt to remove shorter term variability. All the simulations have a decrease in AMOC over time (though not all have a decrease over the historical period), and this tends to become more pronounced in the future period 2015-2050. All but one of the 100 km ocean models have an AMOC weakening at year 2050 of less than 20%, while all the 25 km models have a stronger than 20% weakening; all of these latter models use NEMO. It should be noted that the 50 km MPI-ESM 1-2 models have AMOC decline between the 100 km and 25 km results, but the CESM-1-3-HH model at 10 km has a smaller decline of about 10%.
The spatial patterns of change in AMOC are shown in Figure 10 for depth-space and sigma-2 space, respectively, as a percentage change relative to the mean control-1950 AMOC value at 26.5°N.
In depth-space the 100 km models have changes of around 10-20% or less, while the HR models all have larger changes than this apart from CESM1-3-HH. The differences are typically largest between 30°N and 40°N, broadly where the AMOC is maximum in the control-1950 state, and for the HR models the weakening is In sigma-2 space, the reductions in the 100 km models are less than 25% and are concentrated between 40°N and 60°N, while for HRs the changes are above 40% (apart from CESM1-3) and distributed over a wider range of latitudes. All the models show an increase in AMOC to the north and an increase in the stream function at lower densities, which can be interpreted as a shift to lighter density water entering the subpolar gyre and GIN Seas.

Journal of Advances in Modeling Earth Systems
The change in the AMOC components at 26.5°N is shown in Table 2 as a percentage of their control-1950 AMOC moc_rapid values. The main component of decline is the Florida Current fc contribution, which typically decreases by 10-20%, with all models showing a future reduction. For the LR models, some of this decline is balanced by a larger change in the Upper Mid-Ocean umo return flow (which becomes less negative), such that the total change is less than that seen in most of the HR models. CNRM-CM6-1-HR, CMCC-CM2, CESM1-3, and HR HadGEM3-GC31 models have an opposite signed umo response, which contributes to a larger total AMOC decline. Reductions in the umo component in future projections of CMIP5 models have been linked to a weakening subtropical return flow from a weakening wind stress curl (Beadling et al., 2018).
The future changes in AMOC are summarized in Figure 11 as a function of model resolution. The range of AMOC decline in the 25 km HR models lies outside that of the LR model range, for the simulations including the future projection, with CESM-1-3-HH a notable exception at 10 km resolution. This reemphasizes the need for using a diverse set of models, since all the 25 km ocean models used here are NEMO-based, so although the model science configurations and the atmospheric models are different (Table 1) there is a lack of structural uncertainty. Figure 11 also suggests that the only way to potentially explore the stronger AMOC/stronger decline regime is to use a HR model. For the NHT changes, there is some overlap between ranges from the LR and HR models (not shown), but clearly, the larger AMOC weakening will produce larger NHT reductions. Given the decisive influence of NHT on North Atlantic surface temperatures and ocean heat content (Grist et al., 2010(Grist et al., , 2018 and the influence of these later on the surrounding continents (e.g., Ghosh et al., 2017;Ossó et al., 2018;Sutton & Hodson, 2005), determining correctly the magnitude of NHT future decline can help to produce better projections of future European climate. Note. Numbers in bold indicate significant differences between the two time series at the 95% level using Welch's t test.

Discussion
The CMIP6 HighResMIP experimental design has made it possible to compare a multimodel ensemble of coupled climate models with ocean resolutions of~100 km, typical of CMIP, and HRs of 50, 25, 10, and 8 km all using the same forcing and with minimal model tuning. We summarize the key results of our analysis and discuss them below.
In the model simulations analyzed here, the AMOC typically increases at 26.5°N with a higher ocean resolution, more consistently in density-space than depth-space, with a corresponding increase in NHT. Changes to the atmosphere resolution alone give generally smaller changes and are more model dependent. These increases in AMOC and NHT with HR ocean tend to make the models agree better with transport characteristics of RAPID-MOCHA observations at 26.5°N, while as shown in Jackson et al. (2020) some properties in the subpolar North Atlantic can be better in LR models.
For most of the models used here (CNRM-CM6-1 and CESM1-3 may be the exceptions), they seem not to be capable of representing the observed AMOC strength at 26.5°N without having too much dense water formation in the Labrador and/or GIN Seas, as measured by the DMV metric. Hypotheses for this may include (1) poor representation of deep overflows, perhaps with consequences for the deeper stratification in the Labrador Sea (Yeager & Danabasoglu, 2012), which is perhaps a factor in the good CESM1-3-LL performance; (2) errors in the transport of saline water into the Labrador Sea, as described in Jackson et al. (2020); (3) inherent issues with the NEMO-based models, perhaps including bathymetry and steering of boundary currents affecting the connectivity with the subtropics; (4) errors in surface fluxes contributing to biases in water mass properties; and (5) lack of eddy activity to export the densest Labrador Sea Water Georgiou et al., 2019). Reconciling the results of Jackson et al. (2020) in which both resolutions of HadGEM3-GC31 have reasonable agreement with the OSNAP observations (Lozier et al., 2019), with the disagreement between modeled and observed DMV (and with observed mixed layer depths in the Labrador Sea), suggests that there is something more to understand with regard to how and where the water is transformed and feeds into AMOC.
The HR models tend to project a stronger decline in AMOC in the SSP585 projections to 2050 compared to coarser resolution models. This is due to a larger decrease in dense water formation in the subpolar gyre and Labrador Sea (Brodeau & Koenigk, 2016;Jackson et al., 2020) as the climate warms. The decline in these HR models is large (though not unprecedented) compared to the AMOC changes by 2050 in RCP8.5 in CMIP5 (Weaver et al., 2012). This suggests that the inclusion of HR models could lead to projections of a greater AMOC decline, although Weijer et al. (2020) suggest an increase in AMOC decline in CMIP6 models compared to CMIP5 that is not due to resolution.
Given the lack of model diversity in this study, it is not possible to be confident that either the LR or HR group of models gives a more trustworthy projection of future AMOC and NHT changes. However, from the perspective of a risk-based framework for future climate, the enhanced range of future changes from plausible model realizations is useful. The HR models in this study suggest a requirement to plan for stronger AMOC decline than do the LR models, with potential consequences for the Northern Hemisphere climate (Jackson et al., 2015). Since the HR models have improved representation of mesoscale features such as boundary currents, they may also be better tools to advise on future monitoring strategies and attempting to isolate signals of change due to external forcing.
In the small set of models and resolutions included here, it may be noteworthy that the eddy-rich models CESM1-3-HH (Small et al., 2014) and HadGEM3-GC31-HH (M. J. Roberts et al., 2019) are among the best-performing models. They generally have the best depth profiles of AMOC, the smallest zonal mean temperature and salinity biases (for HadGEM31-GC31-HH) and the most consistent NHT with observations, with CESM1-3-HH also capturing the overturning component of the NHT. From two samples it is not possible to ascribe the improved performance to resolution alone, but taken together with Griffies et al. (2015), Gutjahr et al. (2019) and Small et al. (2014) it does add evidence that advancing to similar resolutions may be key to unlocking further improvements across a range of measures and timescales (Hewitt et al., 2017).
However, we have also shown that factors other than resolution are important. CESM1-3-LL is the best-performing LR model here and uses an overflow parameterization. The EC-Earth3P model underwent some parameter changes (Versions p1 to p2), which improved the LR model performance. There are also 10.1029/2019MS002014

Journal of Advances in Modeling Earth Systems
clearly important factors other than resolution causing the large spread in AMOC mean strength and change in CMIP5 models (e.g., Stocker et al., 2013, Figure 12.35), and hence, further work to understand the role of, for example, model structural uncertainty and sub-grid-scale parameterizations is also needed.
Future work should include comparing HighResMIP and CMIP6 simulations (Weijer et al., 2020) to assess the importance of the initial state and the amount of model drift on AMOC decline, as well as factors such as more complex aerosol schemes (Delworth & Dixon, 2006;Menary et al., 2013). Long preindustrial simulations will be in better equilibrium than the models here; however, this is likely at the expense of more embedded bias and a smaller range of model resolutions, though a greater model diversity. Such work should also include assessment of AMOC variability, which was beyond the scope of this study, but the current evidence that models may underestimate it at 26.5°N (C. D. Roberts et al., 2014) means that separating any emerging climate change signal is made more difficult.
For a future HighResMIP protocol, it may be useful to spawn ensemble members from different points of the control-1950 simulation, in order to test different initial states closer to equilibrium and evaluate whether the results shown here are robust.
These results emphasize the need for model diversity. The NEMO-based models have a variety of parameterizations and parameter settings (more so at HRs) but generally rely on similar tools to generate quantities like ocean bathymetry. The models also suffer from considerable biases in the Atlantic as a whole as well as the subpolar gyre, and improved understanding on how to reduce such biases (Treguier et al., 2005) may help to increase confidence in the future projections.

Conflict of Interest
There are no real or perceived financial conflicts of interests for any author.

Data Availability Statement
The data used in this work are available from ESGF (https://esgf-index1.ceda.ac.uk/search/cmip6-ceda/) via the references. Data from the CESM1-3 model produced by iHESP are currently available via Globus file transfer, from NCAR's "CESM HighResMIP contribution" endpoint and will soon be on ESGF.