Climate Models Underestimate Dynamic Cloud Feedbacks in the Tropics

Cloud feedbacks are the leading cause of uncertainty in climate sensitivity. The complex coupling between clouds and the large‐scale circulation in the tropics contributes to this uncertainty. To address this problem, the coupling between clouds and circulation in the latest generation of climate models is compared to observations. Significant biases are identified in the models. The implications of these biases are assessed by combining observations of the present day with future changes predicted by models to calculate observationally constrained feedbacks. For the dynamic cloud feedback (i.e., due to changes in circulation), the observationally constrained values are consistently larger than the model‐only values. This is due to models failing to capture a nonlinear minimum in cloud brightness for weakly descending regimes. Consequently, while the models consistently predict that these regimes increase in frequency in association with a weakening tropical circulation, they underestimate the positive cloud feedback associated with this increase.

In order to isolate cloud responses to circulation changes, Bony et al. (2004) proposed binning cloud radiative effects (CREs) by circulation regime and using this framework to decompose cloud feedbacks into dynamic (due to changes in the frequencies of the circulation bins) and thermodynamic (due to changes in the CRE for a given circulation regime) components. Applying this decomposition, Yuan et al. (2008) studied the relationship between CREs and circulation in observations and found that the dynamic component of inter-annual variability in tropical CREs was relatively small. Similarly, the global mean dynamic feedback has been found to be small in climate models (Byrne & Schneider, 2018;Wyant et al., 2006), which Byrne and Schneider (2018) attributed to a combination of mass conservation in the atmosphere and the quasi-linear relationship between CRE and circulation in CMIP5 models. On the other hand, Mackie and Byrne (2023) found that some cloud resolving models predict a non-negligible dynamic cloud feedback component, which was attributed to nonlinearities in the relationship between CRE and circulation.
In this study we compare the relationships between tropical clouds and circulation in the latest generation of global climate models to observations. We find that, while climate models provide a reasonable approximation of the observed circulation, they are unable to reproduce the observed relationships between clouds and circulation. Decomposing simulated tropical cloud feedbacks into dynamic and thermodynamic components reveals that the thermodynamic component dominates both the multi-model mean and the inter-model spread. However, constraining the model cloud feedbacks using observed relationships for the current climate results in a significant and consistent increase in the dynamic component, which suggests that the climate models may be underestimating this component and consequently the total cloud feedback in the tropics.

Methods
Similarly to previous studies (e.g., Bony et al., 2004; Byrne & Schneider, 2018), this study uses the 500 hPa vertical velocity (ω500) to quantify the large-scale circulation. "Observations" of ω500 are based on reanalyses. Since ω500 is difficult to observe and is not directly assimilated by reanalyses, three different reanalyses (ERA5, JRA55 and MERRA2) are used. This allows us to estimate how much of the uncertainty in the relationships between CRE and circulation arises from uncertainty in ω500. The ω500 values are sorted into circulation regime bins with a width of 2 hPa day⁻¹.
ERA5 (Hersbach et al., 2020) is the ECMWF's 5th generation global reanalysis and covers the period from 1950 to the present day. It has a spatial resolution of approximately 31 km, with 137 vertical levels for the atmospheric component. Sea surface temperatures are fixed, based on the Met Office Hadley Centre HadISST2 product from 1950 to 2007 and the Met Office OSTIA product from 2007 onwards. The following analysis uses ω 500 from the hourly analysis fields on a regular 0.25° latitude-longitude grid.
The Japanese 55-year Reanalysis (JRA55; Kobayashi et al. (2015)) is produced by the Japan Meteorological Agency. It spans 1958 to the present day. JRA55 has a resolution of approximately 55 km, with 60 vertical levels in the atmosphere. The atmospheric analysis is performed 4 times per day at 00, 06, 12, and 18 UTC. Sea surface temperatures are based on the Centennial In Situ Observation-based Estimates of the Variability of SSTs and Marine Meteorological Variables (COBE) SST data set, which has a resolution of one degree. For the following analysis, ω 500 is approximated as the level that is closest to 500 hPa at each point in space and time, before being regridded using bilinear interpolation onto a regular 0.5° latitude-longitude grid.
The Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA2; Gelaro et al. (2017)) focuses on the modern satellite era, encompassing 1979 to the present day. It is produced by NASA's Global Modeling and Assimilation Office and has an approximate resolution of 0.5° × 0.625°, with 72 atmospheric vertical levels. Due to the lack of a single high resolution daily SST data set, MERRA2 uses a combination of SST datasets. Vertical velocity at 500 hPa is retrieved as an hourly average quantity at the native MERRA2 resolution before being regridded using bilinear interpolation onto a regular 0.5° latitude-longitude grid.
Observed top of atmosphere (TOA) CREs are obtained from the Clouds and the Earth's Radiant Energy System Energy Balanced And Filled (CERES-EBAF; Loeb et al. (2018)) edition 4.1 data set, which is based on satellite measurements. CREs are calculated using the "clear-sky for total region" clear-sky estimates to ensure consistency with the climate models.
The reanalyses' vertical velocities and CERES-EBAF TOA CREs are matched on the regular latitude-longitude CERES-EBAF grid (i.e., 1°, monthly means) and subsequently averaged to a 2° grid to ensure they can be compared to climate models at a consistent resolution.
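The regime-sorting step implied by the matched ω500 and CRE fields can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `composite_by_regime`, the bin limits `lo` and `hi`, and the optional area `weights` are assumptions introduced here. Only the 2 hPa day⁻¹ bin width comes from the text. The returned frequencies sum to one, so integrals over ω500 become sums over bins.

```python
import math

def composite_by_regime(omega, cre, weights=None, bin_width=2.0, lo=-100.0, hi=100.0):
    """Sort paired (omega500, CRE) samples into vertical-velocity bins.

    omega   : omega500 samples in hPa/day (positive values are subsidence)
    cre     : cloud radiative effect samples in W m-2, paired with omega
    weights : optional sample weights, e.g. grid-cell areas (assumption)
    Returns bin centres, the frequency of each bin (sums to 1) and the
    weighted-mean CRE in each bin (None where a bin is empty).
    """
    if weights is None:
        weights = [1.0] * len(omega)
    n_bins = int(round((hi - lo) / bin_width))
    w_sum = [0.0] * n_bins
    c_sum = [0.0] * n_bins
    for w, c, a in zip(omega, cre, weights):
        k = math.floor((w - lo) / bin_width)
        if 0 <= k < n_bins:  # samples outside [lo, hi) are ignored
            w_sum[k] += a
            c_sum[k] += a * c
    total = sum(w_sum)
    if total == 0.0:
        raise ValueError("no samples fall inside the binning range")
    centres = [lo + (k + 0.5) * bin_width for k in range(n_bins)]
    pdf = [s / total for s in w_sum]
    mean_cre = [c / s if s > 0.0 else None for c, s in zip(c_sum, w_sum)]
    return centres, pdf, mean_cre

# Made-up samples, for illustration only.
centres, pdf, mean_cre = composite_by_regime(
    [-1.0, 1.0, 1.5, 21.0], [-40.0, -20.0, -30.0, -10.0], lo=-2.0, hi=22.0
)
```

Applying the same function to each reanalysis in turn would give one P(ω) and one C(ω) per reanalysis, from which a spread can be read off.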
This study focuses on the most recent collection of climate models from the sixth phase of the Coupled Model Intercomparison Project (CMIP6; Eyring et al., 2016). To facilitate a fair comparison with the observations, the focus here is on feedbacks derived from the difference between the atmosphere-only AMIP simulations, which use observed sea surface temperatures, and the AMIP4K simulations, in which those sea surface temperatures are uniformly increased by 4 K (Webb et al., 2017). This analysis includes all 13 models that provided AMIP4K simulations, which are listed along with some of their properties in Table 1. Qin et al. (2022) demonstrated that the cloud feedbacks calculated from these atmosphere-only simulations are highly correlated with those calculated from fully coupled simulations. Indeed, similar results to those presented here are obtained for cloud feedbacks derived from the atmosphere-ocean coupled abrupt4xCO2 and piControl simulations (Figures S12 and S13 in Supporting Information S1).
Climate model data are regridded using area-weighted bilinear interpolation onto the same regular 2° latitude-longitude grid as the reanalyses and satellite datasets. Cloud feedbacks for each climate model are calculated after the data have been regridded, by dividing the change in net TOA CRE by the change in surface temperature following Cess and Potter (1988).
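As a rough illustration of this feedback calculation, the sketch below divides the change in area-mean net CRE by the change in area-mean surface temperature. The cosine-latitude weighting, the use of the same averaging domain for both quantities, and the function names are assumptions made here for the example. The text does not specify them.

```python
import math

def area_mean(field, lats):
    """Cosine-latitude weighted mean of field[i][j] (i: latitude, j: longitude)."""
    num = 0.0
    den = 0.0
    for row, lat in zip(field, lats):
        w = math.cos(math.radians(lat))  # relative grid-cell area on a regular grid
        num += w * sum(row)
        den += w * len(row)
    return num / den

def cloud_feedback(cre_control, cre_warm, ts_control, ts_warm, lats):
    """Change in mean net CRE per unit surface warming, in W m-2 K-1."""
    d_cre = area_mean(cre_warm, lats) - area_mean(cre_control, lats)
    d_ts = area_mean(ts_warm, lats) - area_mean(ts_control, lats)
    return d_cre / d_ts

# Made-up 2 x 2 fields, for illustration only.
lats = [-1.0, 1.0]
feedback = cloud_feedback(
    [[-20.0, -20.0], [-20.0, -20.0]],      # control net CRE (W m-2)
    [[-19.0, -19.0], [-19.0, -19.0]],      # warmed net CRE (W m-2)
    [[300.0, 300.0], [300.0, 300.0]],      # control surface temperature (K)
    [[304.0, 304.0], [304.0, 304.0]],      # warmed surface temperature (K)
    lats,
)
```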
The subsequent analysis focuses on the tropics, defined for the purposes of this study as the entire region between 30°S and 30°N. Since the CERES-EBAF data set covers 2001 to the present day and the AMIP runs cover 1979-2014, we must compromise between keeping the observed and climate model periods consistent and minimizing the sampling uncertainty for each. For the results presented here we use 2001-2014 for observations and the entire 1979-2014 AMIP period. Using a longer observational or shorter model period has no systematic impact on our results.

Figure 1a compares the multi-model mean ω500 distribution from the AMIP simulations to the average obtained from the three reanalyses. The observed distributions are reasonably consistent with each other (Figure S1 in Supporting Information S1): each is skewed, with weak subsidence regimes being most common and strongly ascending regimes occurring more frequently than strongly subsiding regimes. This is consistent with previous studies (e.g., Bony et al., 2004; Su et al., 2011; Wyant et al., 2006). For the climate models, separate values are shown for the observation period and the entire historical period. These are very similar to each other and have similar distributions to the observations. However, the multi-model mean circulation is more intense than observed (cf. Table 1), with fewer occurrences of weakly descending regions around the mode of the distributions and an increased frequency of both stronger subsidence and more strongly ascending regimes. In general, these differences are within the range of the spread between the different reanalyses. Only the GISS-E2-1-G and IPSL-CM6A-LR models have notably different ω500 distributions (Figure S3 in Supporting Information S1). GISS-E2-1-G has a more intense circulation, with the probability density function (PDF) showing a much smaller mode for weakly descending regimes, while IPSL-CM6A-LR has a secondary mode for weakly ascending regimes.

Figure 1. (a) Distribution of ω500, and (b) longwave and (c) shortwave cloud radiative effects as a function of ω500, for the reanalyses and for the climate models listed in Table 1, for the years indicated in the legend.

Figures 1b and 1c show LW and SW CREs as a function of ω500, respectively. In both LW and SW the observed CREs increase in magnitude with increasing ascent strength for ascending and weak subsidence regimes. For stronger subsidence regimes, the CRE shows little variability with subsidence strength. There is very little variability in these relationships across the different reanalyses and the relationships are consistent with previous studies (e.g., Bony et al., 2004; Wyant et al., 2006). However, the SW CRE shows a local minimum in magnitude for weak subsidence regimes (around 20 hPa day⁻¹). This subtle feature occurs in all three reanalyses (Figure S1 in Supporting Information S1) in all years (Figure S2 in Supporting Information S1), and is linked to changes in the prevailing cloud regime with circulation regime changes (Figure S9 in Supporting Information S1). It has not been highlighted by previous studies, yet it is key to the observationally constrained increase in dynamic cloud feedback that is described later.
The multi-model mean climate model distributions of CRE with ω 500 in Figure 1 are very similar for the two time periods shown. In the LW, the magnitude of the CRE is underestimated across all circulation regimes. This bias is significant (larger than the uncertainty arising from the different reanalyses) across almost all regimes, except for the infrequent most strongly ascending regimes, where the observational uncertainty increases due to the limited sampling. The bias is smallest for weak subsidence regimes, and grows with increasingly strong ascent or subsidence. In the SW, the magnitude of the CRE is overestimated in ascending and weak subsidence regimes, and underestimated in stronger subsidence regimes. With the exception of IPSL-CM6A-LR (Figures S5b and S5d in Supporting Information S1), the models underestimate or completely miss the local minimum in magnitude for weak subsidence regimes, resulting in a relatively large bias for these regimes.
We have shown that climate models are unable to reproduce the observed relationship between CREs and ω500 in the tropics. Next we consider how this might affect the tropical cloud feedbacks predicted by these models.
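Following Bony et al. (2004), the total tropical cloud feedback for each climate simulation can be decomposed into dynamic, thermodynamic and co-variation components:

$$\delta\overline{C} = \underbrace{\int C(\omega)\,\delta P(\omega)\,d\omega}_{\text{dynamic}} + \underbrace{\int P(\omega)\,\delta C(\omega)\,d\omega}_{\text{thermodynamic}} + \underbrace{\int \delta C(\omega)\,\delta P(\omega)\,d\omega}_{\text{co-variation}} \qquad (1)$$

where $\delta\overline{C}$ is the change in the tropical-mean CRE (divided by the change in surface temperature to give a feedback), C(ω) may be the SW, LW or net CRE as a function of the 500 hPa vertical velocity in the AMIP simulation, P(ω) is the simulated PDF of the 500 hPa vertical velocity, and δC(ω) and δP(ω) represent the differences between the future climate and current climate values for the CRE and PDF of ω500, respectively (i.e., the difference between the value in the AMIP4K simulation and the AMIP simulation).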
Using the observational analysis, observationally constrained cloud feedbacks can be calculated by replacing those terms that represent the climate models' representation of the current climate with functions derived from observations as follows:

$$\delta\overline{C}_{\mathrm{obs}} = \int C_{\mathrm{obs}}(\omega)\,\delta P(\omega)\,d\omega + \int P_{\mathrm{obs}}(\omega)\,\delta C(\omega)\,d\omega + \int \delta C(\omega)\,\delta P(\omega)\,d\omega \qquad (2)$$

Here C_obs(ω) is the relationship between CRE and ω500 in observations and P_obs(ω) is the observed ω500 PDF. The only difference between the right-hand sides of Equations 1 and 2 is that P and C are taken from observations instead of models; δC and δP are the same, still derived from differences between simulations. The co-variation term is unchanged. Hereafter we shall refer to the terms in Equation 1 as model-only feedbacks and the terms in Equation 2 as observationally constrained feedbacks.
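On discrete ω500 bins, the decomposition and its observationally constrained counterpart reduce to sums over bins. The sketch below assumes that P and δP are per-bin frequencies, so the integral over ω becomes a plain sum. The function name `decompose_feedback` and all numbers are invented for illustration. Passing the model's C and P gives the model-only terms, and passing observed C and P with the same δC and δP gives the observationally constrained terms.

```python
def decompose_feedback(C, P, dC, dP, d_ts=1.0):
    """Dynamic, thermodynamic and co-variation terms on omega500 bins.

    C, P   : present-day CRE (W m-2) and frequency in each bin
    dC, dP : warm-minus-control changes in CRE and frequency in each bin
    d_ts   : surface temperature change (K) used to normalise the terms
    """
    dynamic = sum(c * dp for c, dp in zip(C, dP)) / d_ts
    thermodynamic = sum(p * dc for p, dc in zip(P, dC)) / d_ts
    covariation = sum(dc * dp for dc, dp in zip(dC, dP)) / d_ts
    return dynamic, thermodynamic, covariation

# Made-up three-bin example (not values from the study).
C_mod, P_mod = [-60.0, -30.0, -40.0], [0.2, 0.5, 0.3]
C_obs, P_obs = [-55.0, -25.0, -45.0], [0.25, 0.5, 0.25]
dC, dP = [1.0, 2.0, 1.0], [-0.05, 0.10, -0.05]  # dP sums to zero

model_only = decompose_feedback(C_mod, P_mod, dC, dP)
constrained = decompose_feedback(C_obs, P_obs, dC, dP)
```

In the model-only case the three terms add up exactly to the change in mean CRE between the two simulations, which gives a simple consistency check. The co-variation term is identical in both calls because it depends only on δC and δP.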
Figure 2 compares net tropical cloud feedbacks derived from the AMIP4K and AMIP simulations. Focusing first on the model-only feedbacks, the thermodynamic component dominates the inter-model spread, which is consistent with previous studies (e.g., Byrne & Schneider, 2018). For the multi-model mean, the thermodynamic component is much larger than the dynamic component. This is not consistent with Byrne and Schneider (2018), who found dynamic and thermodynamic cloud feedback components of similar magnitude over the tropics in coupled simulations in CMIP5. However, this comparison is subject to large sampling uncertainty due to the large spread in thermodynamic cloud feedbacks between models.
For each of the models in Figure 2, the observationally constrained and model-only thermodynamic cloud feedbacks are very similar. This is because the models predict δC(ω) terms which remain relatively constant across different ω500 regimes. However, the observationally constrained dynamic component is consistently larger than the model-only dynamic component, with the exception of the IPSL-CM6A-LR model. The observational constraint also reduces inter-model spread in the dynamic cloud feedback, with an inter-model standard deviation of 0.04 W m⁻² K⁻¹ for the observationally constrained dynamic cloud feedback compared to a value of 0.07 W m⁻² K⁻¹ for the model-only dynamic cloud feedback. In the multi-model mean, the dynamic component increases from 0.06 to 0.14 W m⁻² K⁻¹. The thermodynamic component increases by a negligible amount, resulting in the total cloud feedback increasing from 0.25 to 0.33 W m⁻² K⁻¹. Note that the IPSL-CM6A-LR simulation, which is the only simulation where the observational constraint reduces the dynamic component, also comes closest to reproducing the local minimum in the magnitude of the SW CRE for weakly descending regimes (Figures S5b and S5d in Supporting Information S1).
Vertical error bars on the observationally constrained components in Figure 2 show the range of estimates from the individual reanalyses. The observationally constrained dynamic cloud feedback is larger than the model-only value for all reanalyses, with a lower limit for the multi-model mean observationally constrained total cloud feedback of 0.31 W m⁻² K⁻¹ and an upper limit of 0.38 W m⁻² K⁻¹. A similar increase in the dynamic cloud feedbacks is obtained if the domain is limited to the Pacific Ocean (i.e., 30°S-30°N, 165°E-235°E), or for atmosphere-ocean coupled simulations (Figures S11-S13 in Supporting Information S1). We found no sensitivity to the width of the ω500 bins used to construct the C(ω) functions, for bin widths from 0.5 to 10.0 hPa day⁻¹.
To understand the reasons for the increase in dynamic cloud feedback due to the observational constraint, we analyze the individual terms that contribute to these differences. From Equations 1 and 2, the difference between the observationally constrained and model-only dynamic cloud feedback can be expressed as follows:

$$\int C_{\mathrm{obs}}(\omega)\,\delta P(\omega)\,d\omega - \int C(\omega)\,\delta P(\omega)\,d\omega = \int \left[C_{\mathrm{obs}}(\omega) - C(\omega)\right]\delta P(\omega)\,d\omega \qquad (3)$$

Figure 2. Cloud feedback components for each of the CMIP6 AMIP4K simulations. Feedbacks are calculated following Cess and Potter (1988), based on differences between AMIP4K and AMIP experiments. Dynamic, thermodynamic and co-variation components are calculated following the decomposition proposed by Bony et al. (2004). "Obs" denotes observationally constrained feedbacks that are calculated using the observed ω500 distributions and CRE-ω500 relationships for the present climate. Vertical error bars on these "Obs" terms indicate the range of estimates obtained using each of the reanalyses individually.

Figure 3a shows the difference between the observationally constrained and model-only multi-model mean SW and LW dynamic cloud feedbacks as a function of ω500 (i.e., C_obs(ω)δP(ω) − C(ω)δP(ω)). The legend shows the values of the integral over ω500 (i.e., the total difference between the observationally constrained and model-only feedbacks). In the SW, the differences are positive for most circulation regimes, with large positive differences for most of the common weakly ascending and subsidence regimes, and only a small region around 30 hPa day⁻¹ where the difference is negative. Consequently, the total SW difference value is relatively large (0.09 W m⁻² K⁻¹). In the LW, the differences as a function of ω500 are of a similar magnitude to the SW, but the positive differences for weakly ascending and weak subsidence regimes are canceled by similar decreases for stronger subsidence regimes, resulting in a smaller total change (−0.02 W m⁻² K⁻¹).
Figure 3b shows the separate factors that produce the lines shown in Figure 3a, that is, the δP(ω) term, and SW and LW values for the C_obs(ω) − C(ω) term. The multi-model mean bias in LW CRE is positive and near-constant across all circulation regimes, resulting in the near-cancellation between circulation regimes that increase and decrease in frequency seen in Figure 3a. However, the multi-model mean bias in the SW CRE changes with circulation regime. It is positive for ascending and weak subsidence regimes and negative for stronger subsidence regimes, with a particularly large positive bias for weak subsidence regimes that the models predict will show the largest increases in frequency under climate change. For the regimes that are predicted to decrease in frequency, the bias in the SW CRE decreases and becomes negative, so that the total effect of these regimes is relatively small. To summarize, the increase in the net observationally constrained feedback is primarily driven by an increase in the SW. This is due to the changes in the SW CRE model bias with circulation regime. In particular, the multi-model mean difference in SW CRE between the regimes that increase and decrease in frequency as the tropical circulation weakens is too small. Consequently, the models underestimate the positive feedback due to the changes in frequency of these regimes.
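The cancellation argument can be checked numerically. Because δP(ω) is the difference between two normalised PDFs, it sums to zero over all bins. A CRE bias that is constant across regimes therefore contributes nothing to the dynamic feedback difference, whereas a regime-dependent bias does. The short sketch below illustrates this with invented three-bin values. The function name and the numbers are assumptions and are not taken from the study.

```python
def dynamic_difference(C_obs, C_mod, dP):
    """Per-bin contributions (C_obs - C) * dP and their sum over bins."""
    per_bin = [(co - cm) * dp for co, cm, dp in zip(C_obs, C_mod, dP)]
    return per_bin, sum(per_bin)

# Made-up values: dP sums to zero, as any difference of two PDFs must.
dP = [-0.04, 0.10, -0.06]
C_mod = [-60.0, -30.0, -40.0]

# Case 1: bias of +5 W m-2 in every regime (constant bias).
C_obs_const = [c + 5.0 for c in C_mod]
_, total_const = dynamic_difference(C_obs_const, C_mod, dP)

# Case 2: bias of +2, +8 and -4 W m-2 (regime-dependent bias), with the
# largest positive bias in the bin whose frequency increases most.
C_obs_var = [-58.0, -22.0, -44.0]
per_bin_var, total_var = dynamic_difference(C_obs_var, C_mod, dP)
```

In the first case the total vanishes to within rounding. In the second case the positive bias in the bin that gains frequency and the negative bias in a bin that loses frequency both add to the total.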
The change in the SW CRE multi-model mean bias with circulation regime is due to errors in the models' representation of the observed nonlinearity, so ultimately it is this nonlinearity that causes the models to underestimate the dynamic cloud feedback. This can be demonstrated directly by calculating a linearized observationally constrained dynamic cloud feedback, where the nonlinearity in C_obs(ω500) is removed. This is achieved by replacing the SW C_obs(ω500) term in Equation 2 with a new term C_lin(ω500), which is obtained by using linear interpolation to generate new values for the SW CRE in weak subsidence regimes. In particular, if the values between 0 and 30 hPa day⁻¹ are replaced by interpolating between the circulation regimes outside this range, then the linearized observationally constrained SW (and consequently net) dynamic cloud feedback is very similar to the model-only value for most climate models (Figure S9 in Supporting Information S1).
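One way to build such a linearized relationship is sketched below. It assumes bin centres sorted in ascending order and interpolates between the nearest bins on either side of the 0-30 hPa day⁻¹ range. The exact choice of end points in the study is not stated, so this choice, the function name `linearise` and the sample values are assumptions.

```python
def linearise(centres, C_obs, lo=0.0, hi=30.0):
    """Replace C_obs for lo <= omega <= hi by a straight line.

    The line joins the nearest bins just outside [lo, hi]; values outside
    the range are returned unchanged. `centres` must be in ascending order.
    """
    inside = [k for k, w in enumerate(centres) if lo <= w <= hi]
    if not inside:
        return list(C_obs)
    k0, k1 = inside[0] - 1, inside[-1] + 1  # neighbouring bins outside the range
    if k0 < 0 or k1 >= len(centres):
        raise ValueError("need at least one bin on each side of the range")
    w0, w1 = centres[k0], centres[k1]
    c0, c1 = C_obs[k0], C_obs[k1]
    C_lin = list(C_obs)
    for k in inside:
        frac = (centres[k] - w0) / (w1 - w0)
        C_lin[k] = c0 + frac * (c1 - c0)
    return C_lin

# Made-up SW CRE values with a local minimum in magnitude near 15 hPa/day.
centres = [-5.0, 5.0, 15.0, 25.0, 35.0]
C_obs = [-50.0, -40.0, -30.0, -38.0, -46.0]
C_lin = linearise(centres, C_obs)
```

Summing C_lin times δP over bins in place of C_obs times δP then gives the linearized dynamic term.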

Discussion and Conclusions
We have demonstrated that, if climate model predictions of tropical circulation changes are reasonable, then errors in the way that climate models represent the relationship between clouds and circulation mean that they underestimate the dynamic component of the tropical cloud feedback by 0.06-0.13 W m⁻² K⁻¹, depending on the reanalysis used to determine the "true" relationship between clouds and circulation. We have also shown that this underestimation is driven by the failure of the models to capture the nonlinearity of the observed SW CRE-ω500 relationship for weak subsidence regimes. A similar result is also obtained for coupled atmosphere-ocean simulations (Figure S12 in Supporting Information S1).
The realism of the observationally constrained dynamic cloud feedbacks depends on the climate models' ability to predict changes in the tropical circulation regime PDF (δP(ω)). The decrease in circulation intensity that manifests in the δP(ω) function derived in this study is thought to be reasonably reliable; several generations of climate models have consistently predicted a weakening of the tropical circulation (e.g., Bony et al., 2013;Vecchi and Soden, 2007, Figure S6 in Supporting Information S1). Moreover, this weakening is well understood, having been shown to be caused by increases in lower-tropospheric water vapor and the dry static stability of the atmosphere (Held & Soden, 2006). However, observational records appear to show a strengthening of the Walker circulation in recent decades that is not captured by climate models. This strengthening may be due to aerosol and/or transient ocean dynamical effects (e.g., Heede & Fedorov, 2021) or models may respond incorrectly to the climate forcing (e.g., Lee et al., 2022). If models predicted a strengthening of the tropical circulation with future warming then the failure of the models to capture the observed SW CRE-ω 500 relationship as documented in this study would lead to the models overestimating the dynamic cloud feedback instead of underestimating it.
The nonlinear relationship between the SW CRE and ω 500 is thought to be linked to the occurrence of different cloud regimes. Weak subsidence regimes (10-30 hPa day −1 ), where the magnitude of the SW CRE is smallest, occur in regions where shallow cumulus cloud is typically found. Stronger subsidence regimes (30-50 hPa day −1 ), where the magnitude of the SW CRE is larger, coincide with typical stratocumulus regions ( Figure S10 in Supporting Information S1). This is due to the association between subsidence and inversion strength (Myers & Norris, 2013). Moreover, shallow cumulus cloud typically has a smaller magnitude SW CRE than stratocumulus due to having lower cloud cover (e.g., Tselioudis et al., 2021). In this context, the climate model SW CRE biases for the subsidence regimes are consistent with those previously identified for stratocumulus and shallow cumulus clouds (e.g., Crnivec et al., 2023;Konsta et al., 2022), with the climate models overestimating the brightness of shallow cumulus cloud (corresponding to 10-30 hPa day −1 ) and underestimating the brightness of stratocumulus (30-50 hPa day −1 ). Consequently, improving the representation of these cloud regimes in climate models may be a path to reducing the climate model dynamic cloud feedback biases. Indeed, IPSL-CM6A-LR, which has a notably different dynamic cloud feedback, was highlighted by Konsta et al. (2022) for having notably different low cloud biases to other climate models.
Using observations to constrain the tropical cloud feedback results in a total cloud feedback between 0.31 and 0.38 W m⁻² K⁻¹, with a best estimate of 0.33 W m⁻² K⁻¹, which is 0.08 W m⁻² K⁻¹ larger than the value obtained directly from the climate models. Both are within the uncertainty range of −0.06 ± 0.73 W m⁻² K⁻¹ for cloud feedback over the tropical oceans estimated by Williams and Pierrehumbert (2017). Increasing the tropical cloud feedback in line with the observationally constrained values leads to an increase in the multi-model mean ECS of between 0.1 and 0.2 K (see Text S3 in Supporting Information S1 for further details).
In the future, global storm-resolving climate simulations are expected to better capture the coupling between clouds and circulation. This may lead to dynamic cloud feedbacks that are closer to the observationally constrained values documented in this study. Future work could also attempt to use natural monthly variability or longer-term trends to estimate dynamic cloud feedbacks directly from observations.

Data Availability Statement
The climate model data used in this study are available from the Earth System Grid Federation https://esgf-node.llnl.gov/projects/cmip6/. ERA5 data are available from the Copernicus Climate Data Store https://cds.climate.copernicus.eu. MERRA2 data were downloaded from https://disc.gsfc.nasa.gov/datasets?project=MERRA-2. JRA55 data were downloaded from the NCAR research data archive https://rda.ucar.edu/. CERES-EBAF data were