On the representation of atmospheric circulation modes in regional climate models over Western Europe

Atmospheric circulation is a key driver of climate variability, and the representation of atmospheric circulation modes in regional climate models (RCMs) can enhance the credibility of regional climate projections. This study examines the representation of large‐scale atmospheric circulation modes in Coupled Model Inter‐comparison Project phase 5 RCMs driven by ERA‐Interim and by two general circulation models (GCMs). The study region is Western Europe and the circulation modes are classified using the Promax rotated T‐mode principal component analysis. The results indicate that the RCMs can replicate the classified atmospheric modes as obtained from ERA5 reanalysis, though with biases dependent on the data providing the lateral boundary condition and the choice of RCM. When the boundary condition is provided by ERA‐Interim, which is more consistent with observations, the simulated map types and the associated time series match well with their counterparts from ERA5. Further, on average, the multi‐model ensemble mean of the analysed RCMs, driven by ERA‐Interim, indicated a slight improvement in the representation of the modes obtained from ERA5. Conversely, when the RCMs are driven by the GCMs, which do not assimilate observational data, the representation of the atmospheric modes, as obtained from ERA5, is less accurate than when the RCMs are driven by ERA‐Interim. This suggests that the biases stem from the GCMs. On average, the representation of the modes was not improved in the multi‐model ensemble mean of the five analysed RCMs driven by either of the GCMs. However, when only the best‐performing RCMs were selected, the ensemble mean indicated, on average, a slight improvement. Moreover, the presence of the North Atlantic Oscillation (NAO) in the simulated modes also depends on the lateral boundary conditions. The relationship between the modes and the NAO was replicated only when the RCMs were driven by reanalysis.
The results indicate that the forcing model is the main factor in reproducing the atmospheric circulation.


1 | INTRODUCTION
Climate models are tools to both understand the climate and make future climate projections. General circulation models (GCMs) are applicable when studying the climate system on a global scale. They have a coarse horizontal resolution that is typically not sufficient to study local and regional climates. Thus, the GCMs are downscaled to finer horizontal resolution either statistically or dynamically (Benestad, 2016). The downscaled GCMs enable the understanding of local and regional climate responses to global change. In statistical downscaling, the empirical relationship between climate variables at smaller and larger scales (e.g., precipitation and sea level pressure [SLP]) is used to downscale a climate variable (Wilby and Wigley, 1997). In dynamical downscaling (i.e., the focus of this study), a GCM or reanalysis data provides the lateral boundary conditions for a regional climate model (RCM). The dynamical downscaling method incorporates both the physics and statistics of the climate system to obtain regional climate information. Thus, it is considered a better alternative to statistical downscaling (Rosen, 2010). A more detailed characterization of regional climate can be provided by RCMs (Paeth and Diederich, 2011). Nonetheless, RCMs inherit biases from the GCM data that provides the lateral boundary conditions (Prein et al., 2019). A good representation of large-scale atmospheric circulation modes in RCMs will enhance the accuracy of the future climate projections made with the RCMs (Fernandez-Granja et al., 2021). Thus, within the regional context of Western Europe, this study addresses the representation of large-scale atmospheric circulation modes in a suite of European Coordinated Regional Climate Downscaling Experiment (EURO-CORDEX) RCMs.
The classification of atmospheric circulation modes, using climate simulations, commonly employs GCMs (e.g., Huth, 2000; Sheridan and Lee, 2010; Ibebuchi, 2022a). Studies have identified that GCMs are capable of simulating the large-scale atmospheric circulation modes as observed, though with biases (Cannon, 2020; Herrara-Lormendez et al., 2022; Ibebuchi, 2022a), such as lack of blocking in the North Atlantic sector and misrepresentation of the North Atlantic westerlies (Simpson et al., 2020). The aforementioned circulation biases are regional constraints in studying and projecting changes in dynamically downscaled surface variables such as precipitation (e.g., Zhang and Soden, 2019; Fernandez-Granja et al., 2021). Evaluation of RCMs commonly consists of a systematic comparison between temporal and spatial distributions of observed and simulated statistics of climate variables (e.g., Jacob et al., 2007; Paeth, 2011; Ibebuchi et al., 2022). Although the added value of dynamical downscaling towards a better representation of large-scale circulation features is unclear (Prein et al., 2019), improvements, for example in rain shadow effects, have been reported through the better representation of smaller-scale processes such as orography (Clark et al., 2010). However, there are also concerns that RCMs might not be physically consistent with the driving GCM because increasing resolution might alter the structure of climate parameters, such as the wind field (Benestad, 2016). Within the regional context of North America, Prein et al. (2019) reported that biases from GCMs can significantly impact the representation of weather types in RCMs. de Castro et al. (2007) found that RCMs can fairly reproduce climate regimes in Europe. Landgren et al. (2013) found that the RCM choice and the classification method used constrained the capability of the RCMs to replicate the observed atmospheric circulation modes over Scandinavia.
This study uses the fuzzy obliquely rotated T-mode (i.e., variable is a time step and observation is a specific grid point) principal component analysis (PCA) (Richman, 1981, 1986; Huth, 1996; Compagnucci and Richman, 2008; Ibebuchi, 2022a, 2022b) to classify large-scale atmospheric circulation modes in Western Europe. Studies have applied T-mode PCA to classify large-scale atmospheric circulation modes in Europe (e.g., Huth, 1996; Huth et al., 2008). However, the representation of atmospheric circulation modes in the EURO-CORDEX RCM ensemble, based on the data providing the initial and lateral boundary conditions for the RCMs and the quality of the RCMs, and analysed from the aspect of (fuzzy) synoptic classifications of circulation types (CTs), has not been addressed. Hence, using atmospheric modes from ERA5 reanalysis data as the reference, this study focuses on the representation of the circulation modes in five EURO-CORDEX RCMs with respect to (i) the data providing the lateral boundary condition (here ERA-Interim and two GCMs are used for the same set of RCMs) and (ii) the choice of RCM. Also, the physical relationship between the North Atlantic Oscillation (NAO) and the simulated modes is investigated. There have been reports of specific scenarios (e.g., quality of model combinations) where the multi-model ensemble might either improve or constrain the underlying simulated physics in the model output (Tebaldi and Knutti, 2007). Hence, this study goes further to address this concern by examining whether the classified large-scale atmospheric circulation modes in Western Europe are well represented or misrepresented in the multi-model ensemble mean of the participating RCMs.
2 | DATA AND METHODOLOGY

2.1 | Data
Physically consistent reanalysis data for SLP and for 850 hPa specific humidity and wind vectors are obtained from ERA5 (Hersbach et al., 2020). The horizontal resolution of the ERA5 data is 0.25° in longitude and latitude. Simulated SLP data is obtained from five EURO-CORDEX CMIP5 RCMs driven by the MPI-ESM-LR and CNRM GCMs. Table A1 contains an overview of the climate simulations. Also, three of the five RCMs that are available in the longer time frame (i.e., the simulations are available from 1979) with the initial and lateral boundary conditions provided by ERA-Interim are selected. Since ERA-Interim is based on the combination of different observations and numerical short-term weather forecasts, the simulations are more consistent with the observed climate. Hence, evaluating the quality of the RCMs when the lateral boundary condition is provided by ERA-Interim provides further information on whether the biases stem from the GCM or from weaknesses of the RCM. The RCMs have a horizontal resolution of 0.11° in longitude and latitude. Several studies have evaluated the performance of a range of climate variables from the selected GCM-RCM combinations in Table A1 over different parts of Europe (e.g., Feldmann et al., 2008; Teichmann et al., 2013; Müller et al., 2018; Vautard et al., 2021) and found that the climate models can represent the European climate, though with biases such as the misrepresentation of atmospheric blocking frequency over Europe, sea surface temperature biases, misrepresentation of teleconnections such as the NAO, orographic effects, and cloud parameterizations. All data sets are obtained for the 1979-2005 period, over which they overlap, and at a daily temporal resolution. Bilinear interpolation is used to interpolate the SLP fields to a common 0.25° longitude-latitude grid.
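As an illustration of the regridding step, the following is a minimal sketch of bilinear interpolation of a gridded field to a single target point. It assumes a regular grid with ascending coordinate axes; the function names `_bracket` and `bilinear` are hypothetical, and the sketch is not the tool used in this study.

```python
from bisect import bisect_right


def _bracket(axis, value):
    """Return the index of the lower neighbour and the fractional offset."""
    if not axis[0] <= value <= axis[-1]:
        raise ValueError("target point lies outside the source grid")
    # Clamp so that a point on the upper edge uses the last grid cell.
    i = min(bisect_right(axis, value) - 1, len(axis) - 2)
    t = (value - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def bilinear(field, lats, lons, lat, lon):
    """Interpolate field[i][j], defined at (lats[i], lons[j]), to (lat, lon)."""
    i, ty = _bracket(lats, lat)
    j, tx = _bracket(lons, lon)
    # Interpolate along longitude on the two bracketing latitude rows,
    # then along latitude between those two results.
    south = field[i][j] * (1 - tx) + field[i][j + 1] * tx
    north = field[i + 1][j] * (1 - tx) + field[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty
```

Applying `bilinear` to every node of the target 0.25° grid would regrid one daily SLP field; in practice a dedicated regridding library would normally be used instead.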

2.2 | Method
To classify the CTs in Western Europe (15°W to 19°E and 30° to 60°N), obliquely rotated PCA is applied to the T-mode standardized SLP data sets (Richman, 1981; Huth, 1996; Ibebuchi, 2022a, 2022b). The spatial extent designated as Western Europe is selected to capture synoptic features (e.g., the North Atlantic anticyclone and the Northern Hemisphere mid-latitude cyclones) and adjacent oceans that are also within the spatial extent of the EURO-CORDEX domain. Singular value decomposition is used to obtain the PC scores, eigenvectors and eigenvalues. The eigenvectors localize in time the spatial patterns captured by the PC scores (Compagnucci and Richman, 2008). To make the eigenvectors responsive to rotation, they are multiplied by the square root of the corresponding eigenvalues (Richman and Lamb, 1985); the output is the PC loadings, which can be longer than unit length. The PC loadings are obliquely rotated iteratively using Promax (Hendrickson and White, 1964) at a power of 2 and above (up to 4) and keeping at least two components. According to Richman (1986), the PCs are rotated to reflect the patterns embedded in the similarity matrix (i.e., the correlation matrix). Hence, each rotated PC loading vector is compared with the correlation vector (i.e., the vector from the correlation matrix) indexed by the highest loading magnitude on that PC loading vector. A combination of the number of rotated components and the Promax power is designated as optimal when all the rotated PC loadings match their correlation vectors with a congruence coefficient (Equation 1) of at least 0.92 (i.e., the threshold that designates a good match).
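The unrotated part of this procedure can be sketched as follows. This is a minimal illustration, assuming the SLP data are supplied as an array of shape (time steps, grid points) and that NumPy is available; the function name `t_mode_pca` is hypothetical, and the Promax rotation itself is not included.

```python
import numpy as np


def t_mode_pca(slp, n_components):
    """Unrotated T-mode PCA of an SLP array of shape (n_time, n_grid).

    Returns loadings (n_time, k), scores (n_grid, k) and eigenvalues (k,).
    """
    # T-mode layout: rows are grid points (observations),
    # columns are time steps (variables).
    x = np.asarray(slp, dtype=float).T
    # Standardize each time step across the grid points.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    n_grid = z.shape[0]
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    # Eigenvalues of the (n_time x n_time) correlation matrix.
    eigvals = s**2 / (n_grid - 1)
    k = n_components
    # Loadings: eigenvectors scaled by the square root of the eigenvalues.
    loadings = vt[:k].T * np.sqrt(eigvals[:k])
    # Scores: standardized spatial patterns.
    scores = u[:, :k] * np.sqrt(n_grid - 1)
    return loadings, scores, eigvals[:k]
```

With all components retained, the product of the scores and the transposed loadings recovers the standardized data, which is a convenient check before any rotation is applied.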
$$r_c = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} \qquad (1)$$
where $r_c$ is the congruence coefficient, X and Y are two distinct (PC loading) vectors, and the sums run over the vector elements. Further, the largest number of components and the Promax power at which all the rotated components have the largest congruence match with the correlation patterns are finally selected as optimal. The modes are expected to resemble the correlation patterns to pass the test for physical interpretability (Richman, 1981).
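The congruence coefficient of Equation 1 can be computed with a few lines of code. The sketch below assumes the standard (uncentred) form of the coefficient and two vectors of equal length; the function name `congruence` is hypothetical.

```python
import math


def congruence(x, y):
    """Congruence coefficient between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0:
        raise ValueError("congruence is undefined for a zero vector")
    return numerator / denominator
```

Unlike the Pearson correlation, the vectors are not centred before the products are formed, so the coefficient is sensitive to the mean of the vectors as well as to their shape.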
The oblique rotation relaxes the orthogonality constraint in the PC scores and maximizes the number of near-zero loadings; hence the retained and rotated components have a simple structure that is physically interpretable (Richman, 1981). For the retained components a hyperplane threshold of ±0.2 (Richman and Gong, 1999) is used to separate loadings within the zero-interval from signal. Hence each retained component forms two classes with loadings above and below the hyperplane threshold. A component comprising both the positive and negative phase of the loadings is defined as the mode, whereas the SLP composite of the days assigned to a given phase of the mode (i.e., days with loadings beyond the hyperplane threshold in that phase) is defined as the CT or map type. A day can also be grouped under more than one CT, using the hyperplane threshold width to define signal and the probability of group membership. Thus, an overlapping solution is possible, implying that a day can be assigned to more than one CT insofar as the loadings under the CT in question are associated with signal, that is, the loadings are outside the hyperplane. The preference for an overlapping solution is due to the continuous nature of atmospheric circulation patterns.
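The overlapping assignment of days to CTs can be sketched as below. The example assumes that the rotated loadings are stored as one row per day and one column per retained mode; the function name and the label format (e.g., "CT1+") are hypothetical choices for illustration.

```python
def assign_circulation_types(loadings, threshold=0.2):
    """Assign each day to every CT whose loading lies outside the hyperplane.

    loadings: list of per-day lists, one rotated loading per retained mode.
    Returns one list of labels per day; a list may be empty or hold several CTs.
    """
    assignments = []
    for day in loadings:
        labels = []
        for mode, value in enumerate(day, start=1):
            if value > threshold:
                labels.append(f"CT{mode}+")   # positive phase of the mode
            elif value < -threshold:
                labels.append(f"CT{mode}-")   # negative phase of the mode
        assignments.append(labels)
    return assignments
```

A day whose loadings all fall within ±0.2 receives no label, while a day with signal on two modes is counted under both CTs, which is the overlapping behaviour described above.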
The map types (SLP composite) and time series (i.e., PC loadings) as classified from ERA5 are used as a reference and then matched with their counterparts as classified from the climate models using the congruence coefficient (Equation 1) as a measure of goodness-of-match. The congruence match measures both the phase and the amplitude of the vectors that are compared (Richman, 1986). Following Richman (1986) the congruence coefficients are defined as follows: 0.98-1.00 (excellent match); 0.92 to <0.98 (good match); 0.82 to <0.92 (borderline match); 0.68 to <0.82 (poor match); <0.68 (terrible match). The comparison is done between the classification output from ERA5 and (a) that of individual RCMs driven by ERA-Interim; (b) individual RCMs driven by MPI-ESM and CNRM, respectively; (c) the multi-model ensemble mean of the RCMs driven by reanalysis and by the GCMs, respectively. Thus, the analysis aims to measure the sensitivity of the results to (a) the choice/quality of the RCM; (b) uncertainty introduced by the choice of GCM; (c) reduction of inter-model uncertainties using the multi-model ensemble mean. Also, the representation of the relationship between climate drivers such as the NAO and the classified modes is investigated.
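The match categories listed above translate directly into a small lookup. The sketch below uses the thresholds given in the text; the function name `match_category` is hypothetical.

```python
def match_category(rc):
    """Label a congruence coefficient using the ranges quoted in the text."""
    if rc >= 0.98:
        return "excellent"
    if rc >= 0.92:
        return "good"
    if rc >= 0.82:
        return "borderline"
    if rc >= 0.68:
        return "poor"
    return "terrible"
```

For instance, a simulated map type with a congruence coefficient of 0.90 against its ERA5 counterpart falls in the borderline range.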

3 | RESULTS AND DISCUSSIONS
3.1 | Capability of the climate models to replicate the observed atmospheric circulation modes
By matching the rotated PC loadings from ERA5 and the climate simulations, respectively, to the correlation vectors that they are indexed to, the first four Promax rotated components all have congruence coefficients greater than 0.92. At a Promax power of 2, the highest magnitude of congruence matches for all four components was attained. Thus, the four optimal components rotated at a Promax power of 2 are analysed. The optimal Promax power can be related to Table A2, which shows that the modes do not deviate much from orthogonality since the maximum off-diagonal correlation between the PC scores is 0.04 (and 0.1 at a Promax power of 4; not shown). Figures 1-4 show the classified CTs when the classification is applied to the RCMs driven by ERA-Interim, the GCMs (i.e., MPI-ESM and CNRM), and the multi-model ensemble mean of the RCMs, respectively. The simulated CTs are compared to the same ERA5 CTs that are used as the reference. The ERA5 CTs in Figure 1 are similar to the synoptic weather patterns over Europe and the North-East Atlantic as detected by James (2007) using the classical Grosswetterlagen of Hess and Brezowsky (1952). Overall, the map types were replicated in each case with one-to-one correspondence, as obtained from ERA5. There are biases (i.e., mismatch in the isopleths of the maps) specific to both the choice of RCM and the GCM-RCM combination. To quantify these biases, that is, how well the simulated map types match with the ERA5 map types, Table 1 shows the congruence coefficients between the anomaly map types from ERA5 and the simulated anomaly map types from the climate models. First, it can be seen that the congruence match is higher for the CTs classified from RCMs driven by ERA-Interim than when the GCMs provide the lateral boundary condition.
This is an indication that the major biases constraining the model chain in replicating the map types stem from the driving GCM. A similar result was reported by Herrara-Lormendez et al. (2022), who found that in Europe there is a better agreement in the representation of synoptic circulations among reanalysis products compared to GCMs. Also, Table A3 and Figure A1 show that the explained variances of the analysed components from ERA5 and the climate models are quite close and comparable across the RCMs driven by a given data set. The major uncertainty arises when the data driving the RCM is changed, mostly for the first retained component that explains most of the variability.

3.2 | Representation of the atmospheric circulation modes in the RCMs driven by ERA-Interim
When the RCMs driven by ERA-Interim are considered in Table 1 and Figure 1, there are indeed indications that the choice of RCM can introduce errors in the map types. Specifically, the COSMO model underperformed in representing the negative phase of mode 4 (i.e., CT4−), which is associated with the dominance of an anticyclone over the majority of the study region. A cursory inspection of the maps of CT4− in Figure 1 indicates that this appears to be due to the magnitude of the high pressure over Germany and the adjacent regions in the COSMO RCM. Other models also have some shortcomings (e.g., RACMO). Comparing the other simulated maps in Figure 1 with ERA5 reveals some disparities in the isopleths; nonetheless, the maps mostly match in the good to excellent range (Table 1). Figure 4 indicates that in the multi-model ensemble mean of the RCMs driven by ERA-Interim, on average, the representation of the classified modes improved slightly. Hence, it can be inferred that in this case (i.e., driving the RCMs with data closer to observation) combining models might increase the skill, reliability, and consistency (Tebaldi and Knutti, 2007). Moreover, Palmer et al. (2005) reported that predictions for modes of atmospheric variability such as the El Niño Southern Oscillation (ENSO) were improved in multi-model ensembles compared to single-model forecasts.
FIGURE 1 Circulation types from ERA5 and the regional climate models driven by ERA-Interim. The circulation types are the SLP composites of the days grouped under a given class.
From Table 2, the loadings (amplitude) of mode 1 and mode 2 match mostly in the good range with ERA5 loadings. The accuracy drops for mode 3 and mode 4. A plausible reason is that the first two modes (and their temporal evolution), which explain 56% of the variability in the SLP field (Table A3), are well represented in the climate models compared to modes that explain less variability (and can be relatively prone to contamination by noise). Figure 5 shows that the inter-annual variability in the amplitude of the modes from ERA5 is in phase with the simulated modes when ERA-Interim provides the lateral boundary condition. Further, from Table 2, on average, the multi-model ensemble helps in reducing the model uncertainties in the temporal variations of the amplitudes of the simulated modes.
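For reference, an unweighted multi-model ensemble mean of the kind discussed in this section can be sketched as follows. The sketch assumes that the mean is taken grid point by grid point over fields that already share a common grid; the function name `ensemble_mean` is hypothetical, and the exact stage at which the models are averaged in this study is not restated here.

```python
def ensemble_mean(fields):
    """Unweighted mean of several 2-D fields (nested lists) on a common grid."""
    if not fields:
        raise ValueError("at least one model field is required")
    n_rows, n_cols = len(fields[0]), len(fields[0][0])
    for field in fields:
        if len(field) != n_rows or any(len(row) != n_cols for row in field):
            raise ValueError("all fields must share the same grid shape")
    n_models = len(fields)
    # Average the models at every grid point with equal weights.
    return [
        [sum(field[i][j] for field in fields) / n_models for j in range(n_cols)]
        for i in range(n_rows)
    ]
```

Each model contributes equally in this sketch, so a poorly performing member affects the mean as much as a well performing one.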

3.3 | Representation of the atmospheric circulation modes in the RCMs driven by GCMs
From Table 1, when MPI-ESM and CNRM provide the lateral boundary condition, the congruence matches between the simulated and observed maps are mostly within the borderline range to the terrible range. On average, CNRM performs better than MPI-ESM, which has been reported to exhibit circulation biases over Europe (e.g., Müller et al., 2018). Also, systematic deficiencies across CMIP GCMs have been reported (e.g., Cannon, 2020; Simpson et al., 2020). The representation of the modes in the GCM-RCM combination performs differently across the RCMs. Thus, even when the quality of the RCMs is evaluated (and confirmed to represent the modes as observed), the driving GCM can introduce errors dependent on the GCM-RCM pairing (e.g., Fernandez-Granja et al., 2021). For example, from Table 1, there are cases where the GCM-RCM combination replicates the maps in a good range for some RCMs but in less than the good range in other RCMs, with significant margins (e.g., CT1−, CT3− and CT4− for MPI-ESM and CT3+ for CNRM). The uncertainties in representing the modes across the analysed RCMs are relatively higher under MPI-ESM.
FIGURE 2 Circulation types from ERA5 and the regional climate models driven by the MPI-ESM GCM. The circulation types are the SLP composites of the days grouped under a given class.
Further, from Table 1, the multi-model ensemble mean appears to follow a pattern: improving one phase of the same atmospheric mode and failing to improve the other phase, mostly for MPI-ESM. On average, Table 1 shows that for both GCMs, the ensemble mean of the RCMs does not add value in improving the skill of the RCMs. This is unlike when ERA-Interim provides the boundary condition for the same set of RCMs. However, on average, the RACMO and COSMO models indicate good performance in representing the modes when the boundary condition is provided by CNRM (Table 1); hence, the ensemble mean of both RCMs was computed. Table 1 shows that in this case, that is, taking only the ensemble mean of the best-performing RCMs, on average, a slight improvement in the representation of the modes is plausible. The results suggest that when the independent models do not capture the atmospheric circulation modes with congruence matches in at least a good range, overall, an ensemble of the RCMs might fall short in the representation of the modes. Perhaps a more sophisticated approach that combines the models based on a weighted average, where the weights are determined based on the relationship between historical forecasts and observations (e.g., Krishnamurti et al., 2000), might be optimal compared to the unweighted mean. This is, however, beyond the focus of this study and interested readers are referred to Krishnamurti et al. (2000) and Robertson et al. (2004). Alternatively, as introduced in this section, the models can be decomposed using the fuzzy rotated T-mode PCA, and the best-performing models are noted and pre-selected so that their ensemble mean might improve the skill of the (best-performing) RCMs combined.

3.4 | Representation of climate drivers that modulate the regional atmospheric modes
Finally, climate drivers such as ENSO, the NAO, and so forth, can modulate the large-scale modes of atmospheric circulation in different regions of the world.
In this section, simple correlation analysis is used to examine if any of the climate drivers modulate the amplitude of the classified CTs over time. Further, it is examined if the relationship is represented in the climate models. For all the considered climate drivers (e.g., ENSO and northern hemisphere teleconnection patterns that are available at https://www.cpc.ncep.noaa.gov/data/teledoc/telecontents.shtml), a statistically significant relationship at a 95% confidence level was found only between mode 1 and the NAO (Table 3). This is expected given that the NAO is the major mode of variability that modulates the climate of Europe (e.g., Hurrell, 1995; Ricardo et al., 2002; Scaife et al., 2008). Table 3 shows that mode 1 has statistically significant correlations with the NAO both from the ERA5 and the RCMs driven by ERA-Interim. The NAO mode is not present in the GCM-RCM combination since the daily time sequence in the amplitude of the modes does not match well with the reanalysis (cf. Table 2). Figure 6 shows that the inter-annual variations in the amplitude of mode 1 can be modulated by the anomalies of the NAO and this is represented equally in the RCMs driven by ERA-Interim. A physical justification of the relationship between mode 1 and the NAO can be confirmed in Figure 7. The statistical correlation implies that CT1+/CT1− (cf. Figure 1) is related to the positive/negative NAO phase. During the positive phase of the NAO, the subtropical anticyclone is located over the central part of the North Atlantic while a low-pressure system is centred over Iceland.
FIGURE 4 Circulation types from ERA5 and the multi-model ensemble mean output of the regional climate models driven by ERA-Interim, MPI-ESM and CNRM GCMs.
TABLE 1 Congruence match between the anomaly maps from ERA5 CTs and the corresponding CTs from the RCMs.
The positive phase of the NAO is also associated with the northward shift of the mid-latitude cyclone, coupled with enhanced westerly wind over the North Atlantic. Conversely, during the negative phase of the NAO, the reverse condition is expected since westerly winds are weak, coupled with the intrusion of Arctic air into Europe. Given that these features are obvious in Figure 7, it can be concluded that the signal of the NAO is represented in mode 1.
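The correlation analysis of this section can be illustrated with the sketch below, which computes the Pearson correlation between a mode amplitude series and a climate index and estimates its significance with a permutation test. The permutation test is an assumption made for this example, since the significance test behind Table 3 is not specified here; the function names, the number of permutations and the seed are likewise hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value for the correlation, from shuffling one series."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            exceed += 1
    # Add one to both counts so that the estimate is never exactly zero.
    return (exceed + 1) / (n_perm + 1)
```

A relationship would be called significant at the 95% confidence level when the returned p-value is below 0.05. Shuffling treats the values as exchangeable, which ignores any serial correlation in the series.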

4 | CONCLUSIONS
In this study, Promax rotated T-mode PCA is used to classify the large-scale atmospheric circulation modes in Western Europe. The representation of the circulation modes in Western Europe was examined in five EURO-CORDEX CMIP5 RCMs, driven by two GCMs. The classification method allows an overlapping solution (i.e., a given day can be grouped under more than one CT) considering the continuous nature of large-scale atmospheric signals. The classification method also results in physically interpretable modes (cf. Figure 7) that are characterized by two asymmetric states, which are typical of atmospheric modes of variability (e.g., El Niño and La Niña, which are opposing states of the ENSO mode). Thus, the method can be optimal for CT classification that can be reproduced and compared across different data sets (e.g., observations, reanalysis and climate models). The conclusions of the analysis in this study are as follows:
• Overall, regardless of the data providing the boundary condition, the climate models can replicate the large-scale atmospheric circulation modes as obtained from ERA5. Moreover, evaluation of the RCMs (i.e., when they are driven by ERA-Interim) results in atmospheric modes that are quite comparable to their counterparts from ERA5, suggesting that the RCMs have the skill to reproduce the atmospheric circulation modes in Western Europe.
• When the RCMs are driven by a GCM, the biases associated with the representation of the modes depend on (a) the choice of RCM and (b) the data providing the lateral boundary conditions. However, in this work, the lateral boundary data from global models determine most of the RCM's ability to represent the classified modes, that is, GCMs have large deficits in simulating large-scale atmospheric circulation compared to ERA-Interim. Between the two analysed GCMs, on average, the lateral boundary conditions derived from CNRM are better suited to reproduce the correlation of large-scale patterns compared to boundary conditions from MPI-ESM. On average, the multi-model ensemble mean of the analysed RCMs slightly improved the representation of the large-scale atmospheric circulation modes when the RCMs are driven by ERA-Interim. No improvement was attained in the ensemble mean of the five RCMs driven by the GCMs, but a slight improvement was attained if only the ensemble mean of the best-performing RCMs was considered. Thus, there is no guarantee that multi-model ensembles will improve the skill of the RCMs in simulating the large-scale atmospheric circulation modes. Only with careful consideration of the ensemble members might one obtain a benefit.
• The signal of climate drivers that modulate the regional atmospheric modes, such as the NAO, is present in the RCMs when driven by ERA-Interim, but absent when the RCMs are driven by the GCMs. Although the GCM-RCM chain can reproduce the climatological mean of circulation patterns, there is nearly no skill in reproducing the temporal sequence of circulation patterns.

ACKNOWLEDGMENT
Open Access funding enabled and organized by Projekt DEAL.

APPENDIX A
See Figure A1 and Tables A1-A3.