Persistent Model Biases in the Spatial Variability of Winter North Atlantic Atmospheric Circulation

The three leading modes of the North Atlantic atmospheric circulation explain about 70% of the winter climate variability. Although climate models generally can capture these modes, biases may induce large uncertainties in regional climate predictions. Here, we evaluate the leading winter modes simulated by CMIP5‐PMIP3 and CMIP6‐PMIP4 models from the last millennium to future scenarios in comparison with historical reanalysis and paleo‐reconstructions. The models generally have a good representation of the average spatial pattern of the North Atlantic Oscillation (NAO) while showing a larger spread in performance for the East Atlantic and Scandinavian patterns. In contrast to historical reanalysis, the simulated NAO pattern tends to be rather stationary under various climate states over the years 861–2100. Such underestimated spatial variability in the simulated NAO is directly related to the biased spatial shifts in NAO‐related regional temperature and precipitation changes, inducing uncertainties in climate projections over the North Atlantic sector.

Supporting Information may be found in the online version of this article.
Shifts in the spatial structure of the NAO can further alter the NAO-related regional climate changes (Comas-Bru & McDermott, 2014; Mellado-Cano et al., 2019; Zubiate et al., 2017). Therefore, an evaluation of the simulated variability in these leading modes is important to assess the reliability of North Atlantic-European climate predictions in climate models (Deser et al., 2017; Gonzalez et al., 2019).
Comparison between paleo-simulations and paleoclimate reconstructions allows a more robust paleo-evaluation than merely comparing with the limited time period of observed natural climate variability and response to natural forcing (Harrison et al., 2015). Since paleoclimate records are generally sparse and do not allow for assessing the spatial structure of the leading climate modes (Swingedouw et al., 2017), we can use monthly and seasonally resolved climate field reconstructions for comparison with CMIP-PMIP simulations over the last millennium (Franke et al., 2017; Sjolte et al., 2018, 2020; Valler et al., 2022). These paleo-reconstructions are generated by data assimilation with an optimized combination of proxy records and model simulations, offering a unique opportunity for a paleo-evaluation of CMIP-PMIP models for their spatial skills of simulating the leading North Atlantic climate modes.
In this study, we perform a spatial evaluation of the leading North Atlantic climate modes in the past1000 runs, historical runs, and two future scenarios of 13 CMIP-PMIP models (Table S1 in Supporting Information S1) by comparing model simulations with historical reanalysis after 1850 and paleo-reconstructions during the pre-industrial period. We also evaluate the spatial variability of the NAO because changes in its spatial structure are related to the projected patterns of NAO-related regional climate changes (Hu et al., 2022; Jung et al., 2003; Vicente-Serrano & López-Moreno, 2008).

Data Sets
The available past1000 and historical simulations from 9 CMIP5-PMIP3 and 4 CMIP6-PMIP4 models are selected for analysis (Table S1 in Supporting Information S1), with atmospheric horizontal resolutions ranging from 64 × 56 to 320 × 160 (longitude × latitude) grid points, which are commonly rather coarse to enable longer paleo-simulations. Eleven of these models also provide future projections under low- and high-emission scenarios: Representative Concentration Pathway (RCP) 2.6 and RCP8.5 in CMIP5, and Shared Socio-Economic Pathway (SSP) 126 and SSP585 in CMIP6. Limited by data availability, the past1000 runs of MPI-ESM-LR and EC-Earth3-Veg-LR are alternatively chosen from MPI-ESM-P and EC-Earth 3.1 (following the PMIP3 protocol, Zhang et al., 2021). For the evaluation of historical simulations, version 3 of the Twentieth Century Reanalysis (20CRv3, Slivinski et al., 2019) is chosen as reference. For the comparison of past1000 simulations, we use an updated version of our previous North Atlantic climate reconstructions (Sjolte et al., 2018) over 1241-1970 (SEA18v2, methods in Text S1 and Figure S1 in Supporting Information S1) and another reconstruction covering the period since the 1600s (EKF400v2, Valler et al., 2022).

Data Comparison of Models, Historical Reanalysis, and Paleo-Reconstructions
Our data comparison focuses on the winter (December-February) NAO, EA, and SCA patterns, which can be derived from the three leading empirical orthogonal functions (EOFs) of sea level pressure (SLP) over the North Atlantic sector (20°N-70°N, 90°W-40°E) (Hurrell et al., 2003). EOF analysis is a statistical method that identifies the leading modes of variability from the eigenvectors of the cross-covariance matrix of a climate field, yielding spatial patterns and the corresponding principal-component time series that measure temporal variations (Hurrell et al., 2003). We use detrended SLP anomalies to conduct the EOF analysis for the periods of 1851-2000, 1603-1850, and 1241-1850 to compare the models with historical reanalysis (20CRv3) and two paleo-reconstructions (EKF400v2, SEA18v2). For the comparison of the overall structure, the spatial patterns are displayed as correlation maps between the SLP fields and the standardized time series of the principal components (EOF-based indices). Then, spatial similarities between the patterns in the models and the reference data are quantified using the Taylor diagram and the Taylor skill score (Taylor, 2001). The score depends on R, the correlation coefficient between the two patterns; on the ratio between the standard deviations of the two patterns; and on R0, the maximum attainable correlation, which is set to 1 for simplification. The higher the Taylor skill score, the better the spatial performance of the models.
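The EOF step and the skill score described above can be illustrated with a minimal NumPy sketch. It is not the authors' code. The function names (`leading_eofs`, `taylor_skill`), the sqrt(cos(latitude)) area weighting, and the assumption that the input is already detrended are illustrative choices that the text does not specify. Taylor (2001) gives two forms of the skill score that differ only in the exponent applied to the correlation terms; which form was used here is not recoverable from the text, so the exponent is left as a parameter (`power=1` or `power=4`).

```python
import numpy as np

def leading_eofs(slp, lat, n_modes=3):
    """EOF analysis of a (time, lat, lon) SLP array that is assumed detrended.

    Returns standardized principal components (time, n_modes), correlation
    maps (n_modes, lat, lon), and explained-variance fractions (n_modes,).
    """
    nt, ny, nx = slp.shape
    anom = slp - slp.mean(axis=0)                      # anomalies about the time mean
    # Assumption: sqrt(cos(lat)) area weighting, a common convention.
    w = np.sqrt(np.cos(np.deg2rad(lat)))[None, :, None]
    x = (anom * w).reshape(nt, ny * nx)
    u, s, _ = np.linalg.svd(x, full_matrices=False)    # EOFs via SVD
    var_frac = s**2 / np.sum(s**2)
    pcs = u[:, :n_modes]
    pcs = (pcs - pcs.mean(axis=0)) / pcs.std(axis=0)   # EOF-based indices
    # Correlation of each index with the unweighted anomaly at every grid point
    flat = anom.reshape(nt, ny * nx)
    flat_std = flat.std(axis=0)
    flat_std[flat_std == 0] = np.nan                   # constant points give NaN
    corr = (pcs.T @ (flat / flat_std)) / nt
    return pcs, corr.reshape(n_modes, ny, nx), var_frac[:n_modes]

def taylor_skill(model_map, ref_map, r0=1.0, power=1):
    """Taylor (2001) skill score between two spatial patterns.

    power=1 and power=4 correspond to the two forms given by Taylor (2001).
    """
    m, r = np.ravel(model_map), np.ravel(ref_map)
    ok = np.isfinite(m) & np.isfinite(r)
    m, r = m[ok], r[ok]
    corr = np.corrcoef(m, r)[0, 1]                     # pattern correlation R
    sdr = m.std() / r.std()                            # standard-deviation ratio
    return 4.0 * (1.0 + corr)**power / ((sdr + 1.0 / sdr)**2 * (1.0 + r0)**power)
```

Because the sign of an EOF is arbitrary, a simulated pattern may need to be multiplied by -1 so that it has the same polarity as the reference pattern before the score is computed.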

Spatial Variability of the NAO
To track the spatial variability of the NAO, the EOF analysis is performed in moving 30-year time windows without overlaps during 861-1850 for the past1000 runs (33 windows), 1851-2000 for the historical runs (5 windows), and 2011-2100 for two future scenarios (6 windows in total). Some windows are shorter because the simulations differ in length; for example, the SSP126 and SSP585 runs start from 2015. The NAO centers of action are then identified as the grid points with the largest positive and negative SLP anomalies associated with the NAO mode, and the NAO-tilt is measured by the angle between the line linking these two centers and the meridian through the southern center of action (Figure S2 in Supporting Information S1, Wang et al., 2012; Yao, 2019). To ensure the robustness of the EOF-based results, the 30-year NAO patterns during 1851-2000 are compared with teleconnection maps of SLP, an alternative way to present the leading atmospheric variability modes (Hurrell et al., 2003; Moore et al., 2013). The teleconnection maps show, for each grid point, the strongest negative correlation found in its one-point correlation map with all other grid points, indicating the strength of contemporaneous correlations within a given pressure field (Wallace & Gutzler, 1981). In the next step, the patterns of changes in near-surface air temperature and total precipitation amount associated with different tilts of the NAO are compared, which are shown as the composite anomalies between positive and negative NAO (NAO+ − NAO−). Only the years in which the standardized NAO index exceeds ±1, within time windows that have a NE-SW or NW-SE tilt greater than 10°, are selected.
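The windowing and tilt measurement can be sketched as follows. This is a simplified illustration under stated assumptions: the angle is computed in plain longitude-latitude space, whereas the text does not say whether a planar or great-circle geometry was used, and the sign convention (positive for NE-SW, negative for NW-SE) is chosen here for convenience. The function names are hypothetical.

```python
import numpy as np

def window_bounds(start, end, length=30):
    """Non-overlapping windows as (first_year, last_year) pairs."""
    return [(y0, min(y0 + length - 1, end)) for y0 in range(start, end + 1, length)]

def nao_centers_and_tilt(nao_map, lat, lon):
    """Locate the two centers of action of a (lat, lon) NAO pattern.

    Returns (north, south, tilt), where north and south are (lat, lon) tuples
    and tilt is the angle in degrees between the line joining the centers and
    the meridian through the southern center. Assumed sign convention:
    tilt > 0 means the northern center lies east of the southern one (NE-SW).
    """
    i_max = np.unravel_index(np.nanargmax(nao_map), nao_map.shape)
    i_min = np.unravel_index(np.nanargmin(nao_map), nao_map.shape)
    # The EOF sign is arbitrary, so the centers are ordered by latitude.
    pts = sorted((float(lat[i]), float(lon[j])) for i, j in (i_max, i_min))
    south, north = pts
    dlat = north[0] - south[0]
    dlon = north[1] - south[1]
    tilt = float(np.degrees(np.arctan2(dlon, dlat)))
    return north, south, tilt

def classify_tilt(tilt, threshold=10.0):
    """Label a window by tilt, using the 10 degree threshold from the text."""
    if tilt > threshold:
        return "NE-SW"
    if tilt < -threshold:
        return "NW-SE"
    return "none"
```

With these definitions, `window_bounds(861, 1850)` yields the 33 past1000 windows and `window_bounds(1851, 2000)` the 5 historical windows mentioned above.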

Spatial Patterns of the Leading North Atlantic Climate Modes
Spatial structures of the three leading EOF modes of winter SLP over the North Atlantic show typical NAO, EA, and SCA patterns in the 20CRv3 reanalysis (Figure 1). The EOF modes in the two reconstructions (EKF400v2 and SEA18v2) are similar to those in 20CRv3 (Figure 1), showing their capability to reconstruct these leading climate modes (Text S2 and Table S2 in Supporting Information S1). When comparing the overall average patterns during 1851-2000 with 20CRv3, most of the models show a good performance in reproducing the NAO pattern, with correlation coefficients over 0.90 in Taylor diagrams (Figure S3 in Supporting Information S1) and Taylor skill scores over 0.80 (Figure 2a). The performances for the EA and SCA patterns are not as good as for the NAO, showing more scattered distributions in Taylor diagrams and lower Taylor skill scores. The skill scores for the model-reconstruction comparison are lower than those for the comparison with the 20CRv3 reanalysis (Figures 2b and 2c), showing larger discrepancies across the models. Overall, the NAO, EA, and SCA patterns keep their roles as the leading modes of climate variability in all the data sets during these three periods of the analysis.
For the overall mean patterns of the simulated NAO during 1851-2000, 1603-1850, and 1241-1850, the NAO centers of action are placed in different locations in different data sets (Figures 2d-2f). Some models show a NE-SW tilt of the NAO, while other models show a NW-SE tilt. The performance of the simulated NAO depends on the modeled location of the centers of action: models with NAO centers of action close to those in the reference data show high skill scores, and models with centers far from those in the reference data show low scores. For example, MIROC-ESM has a clear westward shift of the NAO and a lower score than other models. In addition, there is no clear correspondence between the performance and the horizontal resolution of the models. EC-Earth3-Veg-LR and MRI-CGCM3 both have 320 × 160 (longitude × latitude) grid points, but EC-Earth3-Veg-LR is one of the best-performing models, whereas MRI-CGCM3 shows a low score. MPI-ESM-LR is also one of the best-performing models, although it has only 192 × 96 grid points.

Tracking the Spatial Variability of the NAO
The leading EOF modes of SLP in the moving 30-year windows are not spatially stationary from the last millennium to future scenarios; for example, they show varying correlations when compared with the patterns in the last 30 years of the historical period (Figure S4 in Supporting Information S1). Overall, EOF1 is more stable than EOF2 and EOF3, showing the presence of the NAO in most of the time windows. The movement of the NAO centers of action is further tracked by mapping their locations in all the 30-year time windows (Figure 3). The northern center generally moves between Greenland and Scandinavia, while the southern center exhibits larger zonal variations between the Azores and the Iberian Peninsula in all data sets. However, some models tend to persistently have a biased location of the NAO centers of action compared with 20CRv3. For instance, MIROC-ESM and MIROC-ES2L tend to place the northern center in Greenland and the southern center around 30°W, showing an overall westward shift of the NAO pattern. Also, MRI-CGCM3 tends to place the northern center over Scandinavia, showing an eastward shift of the NAO pattern. These models have lower skill scores for their overall average patterns during 1851-2000 than other models (Figure 2).
Angles between the two NAO centers allow us to quantify the changes in NAO-tilt (Figure S5 in Supporting Information S1). During 1851-2000, the NAO dipole in 20CRv3 swings between two types of tilt: 1851-1880 (NW-SE), 1881-1910 (NW-SE), 1911-1940 (NE-SW), 1941-1970 (NW-SE), and 1971-2000 (NE-SW). However, none of the models can reproduce the same changes as seen in 20CRv3; they show either only one tilt in all time windows, no tilt in some time windows, or the opposite tilt to 20CRv3. To ensure that these results are not statistical artifacts of the EOF approach, we also compare the 30-year EOF-based NAO modes with SLP teleconnections in the two historical runs with the best-performing overall average NAO during 1851-2000 (Figure S6 in Supporting Information S1). The NAO centers of action align with the maximum teleconnections of SLP, confirming the robustness of the shifts in NAO-tilt. In addition, the simulated NAO patterns are more stationary than in 20CRv3, with standard deviations of the NAO-tilt angle that are 6° to 30° smaller than that of 20CRv3, corresponding to an underestimation of 18% to 86% relative to the standard deviation of 20CRv3 (Figure S7 in Supporting Information S1). The NAO patterns in 7 models exhibit a persistent preference for one type of tilt, which occurs more than twice as often as the other tilt across all the time windows throughout their past1000, historical, and future simulations under various climate states over 861-2100 (Table S3 in Supporting Information S1). Such underestimated spatial variability in the NAO pattern can potentially cause biases in the NAO-related climate conditions (Section 3.3).
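The teleconnection measure used for this robustness check (Wallace & Gutzler, 1981) can be sketched as the most negative entry of each grid point's one-point correlation map. The function name below is hypothetical, and the sketch assumes a coarse grid with no constant or missing grid points, because the full point-by-point correlation matrix is held in memory.

```python
import numpy as np

def teleconnectivity(slp):
    """Strongest negative correlation of each grid point with all others.

    slp: array of shape (time, lat, lon). Returns an array of shape
    (lat, lon) with the minimum of each one-point correlation map.
    """
    nt, ny, nx = slp.shape
    flat = slp.reshape(nt, ny * nx)
    r = np.corrcoef(flat, rowvar=False)   # (npoints, npoints) correlation matrix
    return r.min(axis=1).reshape(ny, nx)
```

Pairs of grid points with strongly negative values mark the two ends of a pressure seesaw, so the centers of an EOF-based NAO dipole would be expected to sit near the most negative values on such a map.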
As mentioned above, EOF2 and EOF3 are not as stable as EOF1, which means that the EA and SCA patterns cannot always be significantly separated (Figure S4 in Supporting Information S1). For the reconstructions, noise in the proxy data or biases in the assimilation procedure may lead to a mix of these two patterns. For the models, the required length of time window to capture these two patterns may vary over time due to the internal variability, while the length of time window in our analysis is fixed to 30 years. It is also possible that the simulated EA and SCA patterns do not exist as persistently as the NAO pattern due to model biases. In addition, models have problems capturing the phases of these three leading modes even in the historical runs with the best-performing overall average patterns (EC-Earth3-Veg-LR, MPI-ESM-LR) (Figure S8 in Supporting Information S1). Since the multidecadal mobility of the NAO and its associated regional climate changes are linked to the combined states of the EA and SCA phases (Comas-Bru & McDermott, 2014; Moore et al., 2013), these uncertainties of the EA and SCA patterns might potentially induce additional biases into the spatial variability in the simulated NAO.
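The separation test cited in the figure captions (North et al., 1982) is commonly applied as a rule of thumb: the sampling error of an eigenvalue λ is approximately λ·sqrt(2/N), and a mode is treated as separated when its error bar does not overlap those of its neighbors. The sketch below follows that common reading. The function name is hypothetical, and using the number of winters in the window as N (with no correction for serial correlation) is an assumption.

```python
import numpy as np

def north_separated(eigvals, n_samples):
    """Apply the North et al. (1982) rule of thumb to descending eigenvalues.

    Returns (err, separated): the sampling error of each eigenvalue and a
    boolean flag that is True when a mode's error bar overlaps neither
    neighbor's error bar.
    """
    lam = np.asarray(eigvals, dtype=float)
    err = lam * np.sqrt(2.0 / n_samples)
    lower, upper = lam - err, lam + err
    overlap = lower[:-1] <= upper[1:]        # mode k overlaps mode k+1
    separated = np.ones(lam.size, dtype=bool)
    separated[:-1] &= ~overlap
    separated[1:] &= ~overlap
    return err, separated

# Made-up variance fractions (%) for a single 30-winter window
err, sep = north_separated([50.0, 20.0, 18.0, 5.0], n_samples=30)
print(sep)  # [ True False False  True]
```

In this made-up example the second and third modes have overlapping error bars, which is the situation in which EA-like and SCA-like patterns can mix within a short window.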

Regional Climate Changes Associated With the Shifted Spatial Patterns of NAO
The NAO-related regional climate changes in 20CRv3 exhibit different spatial patterns when the NAO dipole shows the NE-SW tilt versus the NW-SE tilt (Figure 4). The NAO-tilt is associated with changes in the pressure field, and the NAO-related heat and moisture transport changes accordingly with the NAO-tilt. Thus, the NAO with a NE-SW tilt exerts a stronger amplitude of the near-surface air temperature in northern Europe, while the amplitude of the temperature over Greenland is strengthened with the NW-SE tilt. Precipitation patterns also shift with the changes in atmospheric pressure connected to the NAO-tilt. The EKF400v2 reconstruction shows similar shifts in temperature associated with the NAO-tilt as in 20CRv3 but with smaller precipitation amplitudes. The SEA18v2 reconstruction exhibits a rather stationary pattern, although its EOF-based NAO modes do have two tilts (Table S3 in Supporting Information S1). The reason is that only Greenland ice cores have been assimilated in winter, and they can potentially skew the variability toward a fixed spatial structure (Text S2 in Supporting Information S1).
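The composites behind these patterns (NAO+ − NAO−, using years whose standardized NAO index exceeds ±1) can be sketched as follows. The function name is hypothetical, and the sketch assumes that the years passed in have already been restricted to time windows with the tilt of interest. The significance testing mentioned in the figure captions is not included.

```python
import numpy as np

def nao_composite(field, nao_index, threshold=1.0):
    """Composite difference NAO+ minus NAO- of a (time, lat, lon) field.

    Years with a standardized index above +threshold form the NAO+ set and
    years below -threshold form the NAO- set.
    Returns (difference_map, n_positive, n_negative).
    """
    idx = np.asarray(nao_index, dtype=float)
    z = (idx - idx.mean()) / idx.std()          # standardized NAO index
    pos, neg = field[z > threshold], field[z < -threshold]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("no years exceed the threshold in one of the phases")
    return pos.mean(axis=0) - neg.mean(axis=0), len(pos), len(neg)
```

For temperature or precipitation, `field` would hold the winter-mean values for the same years as `nao_index`.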
To investigate how the overall spatial skills and tilting features can affect the regional climate changes associated with the simulated NAO, we select four models for comparison. Two models have the overall best-performing NAO in historical simulations (Figure 2a), showing NE-SW (EC-Earth3-Veg-LR) and NW-SE (MPI-ESM-LR) tilt (Table S3 in Supporting Information S1). The other two models have lower skill scores, with a preference for NE-SW (MRI-CGCM3) and NW-SE (MIROC-ES2L) tilt. All four models reproduce patterns of the NAO-related temperature changes in their past1000 experiments similar to those in 20CRv3, showing stronger amplitudes over northern Europe and Greenland when the NAO shows the NE-SW and NW-SE tilt, respectively (Figure S9 in Supporting Information S1). Similar patterns also occur in historical runs and future scenarios, although with less certainty than in the past1000 runs. It could be that a period of 100-150 years, with its limited number of sampled years, is insufficient to capture the climate signal as effectively as 1,000 years. The simulated NAO-related precipitation patterns also show different shifts between the NE-SW and NW-SE tilt (Figure 5). The two better-performing models (EC-Earth3-Veg-LR and MPI-ESM-LR) show similar patterns to those in 20CRv3 with the same tilt. The other two models (MRI-CGCM3 and MIROC-ES2L) present overall eastward and westward shifted patterns, aligning with their biases in the NAO (Figure 3). Precipitation patterns in these four models all bear strong resemblances to their NAO, with a preference for the same tilt in the past, present, and future. These results indicate that the models generally can capture the climate anomalies associated with their preferred tilt of the NAO. However, non-resolved variability in the location of the NAO centers of action can potentially introduce shifts in projected temperature and precipitation anomalies. If the projected NAO happens to exhibit a wrong tilt, the projected temperature and precipitation patterns might also be altered in an incorrect direction.

Discussion
As an intrinsic climate variability mode, the NAO is affected by stochastic processes in the atmosphere and external forcing (Hurrell et al., 2003). Consequently, accurate simulation of its realistic variability is difficult (Smith et al., 2016). Although the 13 CMIP-PMIP models evaluated in this study perform well overall in reproducing the average NAO pattern, the centers of action in 9 models fail to capture the same tilt as the 20CRv3 reanalysis in more than half of the 30-year time windows during 1851-2000 (Figure S5 in Supporting Information S1). The angles of the NAO-tilt in all 13 models have standard deviations 18% to 86% lower than that of 20CRv3 (Figure S7 in Supporting Information S1), showing an underestimated spatial variability. Shifts in the NAO-tilt may be linked to the NAO-phase (Wang et al., 2012): the eastward movement of the NAO from the period of 1958-1977 to the period of 1978-1997 is accompanied by a shift from a strong negative NAO to a positive NAO (Jung et al., 2003; Luo et al., 2010). But such correspondence remains uncertain due to the lack of statistical significance (Vicente-Serrano & López-Moreno, 2008).
The NAO is influenced by atmospheric-oceanic interactions (Hurrell et al., 2003; Pan, 2005; Rodwell et al., 1999). Sea surface temperature can modulate the zonal movement of the NAO through the Atlantic Multidecadal Oscillation (Börgel et al., 2020). Biases in the interactions between sea surface temperature and the NAO (Jing et al., 2020) and the absence of a significant Atlantic Multidecadal Oscillation (Mann et al., 2020) can potentially explain the shifts in the location of the NAO centers of action in models. In addition, in 20CRv3 and the two best-performing historical simulations, the zonal wind at 200 hPa presents a similar spatial shift with the NAO-tilt (Figure S10 in Supporting Information S1), indicating a potential linkage between the movements of the NAO and jet streams (Yao, 2019). In CMIP models, although the representation of jet streams and storm tracks has improved, with a smaller amplitude of the biases in CMIP6 than in CMIP3 and CMIP5, spatial biases still exist (Harvey et al., 2020; Priestley et al., 2020). These biases are also potentially associated with the biased spatial shift of the NAO, because the variability of the NAO reflects the storm track activity over the North Atlantic region (Löptien & Ruprecht, 2005; Vallis & Gerber, 2008).
The preferences of the CMIP-PMIP models for showing a persistent spatial structure of the NAO in their past1000, historical, and future simulations highlight the importance of the structural differences in the models, which contribute larger uncertainties to the winter NAO projections than internal variability does (McKenna & Maycock, 2021). Overall, our results indicate that the CMIP-PMIP models evaluated in this study can reproduce the climatology of the NAO pattern and its associated regional climate changes. However, the rather persistent NAO-tilt in models highlights a major concern for the reliability of regional climate predictions. The projected temperature and ...
• ... of modeled North Atlantic climate modes over the years 861-2100 with reanalysis and reconstructed data
• The North Atlantic Oscillation (NAO) in 13 Coupled Model Intercomparison Project-Paleoclimate Modelling Intercomparison Project models shows underestimated spatial variability compared to historical reanalysis
• Projections of regional temperature and precipitation may show biased patterns due to the underestimated spatial shifts in simulated NAO

Figure 1. Spatial patterns of the three leading empirical orthogonal functions of winter sea level pressure (SLP) in historical reanalysis (20CRv3) and paleo-reconstructions (EKF400v2, SEA18v2) during three periods for comparison. The modes are characterized as the North Atlantic Oscillation (upper panel), East Atlantic Pattern (middle panel), and Scandinavian Pattern (lower panel). Spatial patterns are displayed as the correlation maps between SLP fields and the standardized time series of principal components. Only the data with a significance level p < 0.05 are shown. The percentage in the upper right corner shows the explained variance of each mode, with * indicating that this mode is significantly separated from its neighboring modes (North et al., 1982).

Figure 3. Spatial variability of the North Atlantic Oscillation (NAO) dipole during past, present, and future. Triangles and circles represent the northern and southern centers of NAO action, respectively. The moving 30-year time windows are color-coded by the starting years. The three numbers in the bracket at the top of each panel show the number of time windows during 861-1850 (past1000 runs), 1851-2000 (historical runs), and 2011-2100 (future scenarios). The number for 2011-2100 includes two different scenarios, RCP2.6/SSP126 and RCP8.5/SSP585 (marked with *). The hollow symbols with × denote that the NAO patterns in these time windows are not significantly separated from the neighboring modes (North et al., 1982). There are fewer visible markers in the SEA18v2 reconstruction because it has a rather fixed NAO structure with overlapping markers.

Figure 4. Spatial patterns of the changes in winter (DJF) near-surface air temperature (tas, unit: °C) and total precipitation amount (prcp, unit: mm/DJF) associated with the North Atlantic Oscillation (NAO) in historical reanalysis (20CRv3) and paleo-reconstructions (EKF400v2, SEA18v2). The patterns associated with the northeast-southwest and northwest-southeast tilt of the NAO are shown as the composite differences between the NAO+ and NAO− (NAO+ − NAO−). Only the data significant at the level p < 0.05 are plotted. A bootstrapping test with 1,000 random samples is applied for 20CRv3, and a two-tailed Student's t-test is applied for EKF400v2 and SEA18v2. The red stars mark the two centers of sea level pressure differences between the NAO+ and NAO−, serving as an extra indicator for the tilt of the NAO.

Figure 5. Spatial patterns of the changes in winter (DJF) total precipitation amount (prcp, unit: mm/DJF) associated with the North Atlantic Oscillation (NAO) in four experiments of the four selected Coupled Model Intercomparison Project-Paleoclimate Modelling Intercomparison Project models. The patterns associated with the northeast-southwest and northwest-southeast tilt of the NAO are shown as the differences between the NAO+ and NAO− (NAO+ − NAO−). Only the data significant at the level p < 0.05 are plotted. A two-tailed Student's t-test is applied for the past1000 runs. A bootstrapping test with 1,000 random samples is applied for historical and future simulations. The red stars mark the two centers of sea level pressure differences between the NAO+ and NAO−, serving as an extra indicator for the tilt of the NAO.