Coupled Climate Models Systematically Underestimate Radiation Response to Surface Warming

A realistic representation of top‐of‐the‐atmosphere (TOA) radiation response to surface warming is key for trusting climate model projections. We show that coupled models with freely evolving ocean‐atmosphere interactions systematically underestimate the observed global TOA radiation trend during 2001–2022 in 552 simulations. Locally, even if a simulation spontaneously reproduces observed surface temperature trends, TOA radiation trends are more likely under‐ than overestimated. This response bias stems from the models' inability to reproduce the observed large‐scale surface warming pattern and from errors in the atmospheric physics affecting short‐ and longwave radiation. Models with a better representation of the TOA radiation response to local surface warming have a relatively low equilibrium climate sensitivity. Our bias metric is a novel process‐based approach which links a model's current response to climate change to its behavior in the future.


Introduction
Virtually all future projections of climate change rely on coupled climate simulations in which ocean and atmosphere interact freely. Recent research debates why these models struggle to reproduce observed surface warming patterns; potential causes are errors in the simulated spatial and temporal patterns of internal variability, errors in the forced response, or a biased representation of radiative forcing (e.g., Coats & Karnauskas, 2018; Heede et al., 2020; Heede & Fedorov, 2021; Olonscheck et al., 2020; Raghuraman et al., 2021, 2023; Seager et al., 2019, 2022; Watanabe et al., 2021). When atmosphere-only models are forced with the observed temporal and spatial evolution of sea surface temperatures, they are able to reproduce observed interannual variations in global and local TOA radiation (Andrews et al., 2018, 2022; Loeb et al., 2020). This is interpreted as supporting trust in the atmospheric component of the climate models (Sherwood et al., 2020). However, research has also pointed out that radiative feedbacks (the response of the global mean radiation to a given global mean warming) are as uncertain, and overlap as little with observations, in atmosphere-only models with prescribed surface warming as in fully coupled models (Chung et al., 2010; Uribe et al., 2022). Hence, it remains unclear whether coupled climate models are able to reproduce the observed coupling between TOA radiation and global and local surface warming.
Investigating the realism of the simulated coupling between TOA radiation and surface warming is challenging because we are confined to a 22-yr observed record (NASA/LARC/SD/ASDC, 2023). Interannual variability is large within these 22 years, but a trend is clearly detectable (Raghuraman et al., 2021). However, this trend depends strongly on the number of years over which it is computed; hence, the internal variability of 22-yr trends is still large.
We tackle this problem by comparing the observational record CERES EBAF 4.2 and the gridded observational surface temperature data set HadCRUT5 in its infilled mode (Morice et al., 2021) to eleven single-model initial-condition large ensembles, each with 30-100 realizations of the years 2001-2022 (Deser et al., 2020; Table S1 in Supporting Information S1). Four large ensembles are CMIP5-generation and seven are CMIP6-generation models. Using multiple large ensembles enables us to consistently quantify agreement between models and observations, accounting for internal variability, the forced response, and differences in the model configuration. The switch from historical emissions to emission scenarios occurs in 2005 for CMIP5 and in 2014 for CMIP6. We use the emission scenario RCP8.5 for CMIP5 models and SSP2-4.5 or SSP3-7.0 for CMIP6 models. The inconsistent use of scenarios is caused by the limited availability from the large ensembles and is justified by very similar global-mean emissions before 2030 (Maher et al., 2019). We analyze the net radiation at the top of the atmosphere (net TOA = rsdt − rsut − rlut in CMIP notation; here referred to as "TOA radiation") and the 2 m surface air temperature ("tas", here referred to as "surface temperature"). From CERES EBAF Edition 4.2, we use the TOA net flux for all-sky conditions. CERES uses a 1° × 1° resolution, and we regrid HadCRUT5 and all simulations to that same horizontal grid by bilinear interpolation.
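As an illustration of the net TOA definition above, the following minimal Python sketch combines the three CMIP fluxes and forms an area-weighted global mean on a regular 1° × 1° grid. The function names and the synthetic flux values are assumptions for illustration only; the cosine-of-latitude weighting is a standard choice for regular grids and is not necessarily the exact procedure used for the results reported here.

```python
import numpy as np

def net_toa(rsdt, rsut, rlut):
    """Net downward TOA flux: incoming shortwave minus reflected
    shortwave minus outgoing longwave (CMIP variable names)."""
    return rsdt - rsut - rlut

def global_mean(field, lat):
    """Area-weighted mean of a (lat, lon) field on a regular grid.
    Weights are proportional to cos(latitude) (assumed weighting)."""
    weights = np.cos(np.deg2rad(lat))
    zonal_mean = field.mean(axis=-1)
    return np.sum(zonal_mean * weights) / np.sum(weights)

# Synthetic, spatially uniform fluxes in W m-2 (illustrative values only)
lat = np.arange(-89.5, 90.0, 1.0)   # cell centers of a 1-degree grid
lon = np.arange(0.5, 360.0, 1.0)
shape = (lat.size, lon.size)
rsdt = np.full(shape, 340.0)
rsut = np.full(shape, 100.0)
rlut = np.full(shape, 239.0)

toa = net_toa(rsdt, rsut, rlut)
toa_global = global_mean(toa, lat)
```

With real data, the same two functions would be applied to each annual-mean field of each ensemble member after regridding.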
We introduce a new metric of surface-TOA coupling that differs from radiative feedbacks and shows that coupled climate models consistently underestimate observed TOA radiation trends even if they reproduce observed surface temperature trends (Sections 2 and 3). This too weak local coupling between surface warming and TOA radiation is caused by errors in both the atmospheric model component itself and the simulated spatial pattern of warming (Section 4). We find that this systematic response bias in surface-TOA coupling strength correlates with the long-term climate response: less-sensitive models are able to reproduce observations at the TOA better than more-sensitive models (Section 5).

Hypotheses for Underestimated TOA Radiation Trends
The global-mean TOA radiation observed by CERES falls within the range of the simulations for all 11 coupled models for anomalies relative to 2001-2022 (Figures 1a-1k; Figure S1 in Supporting Information S1). Although coupled model simulations have their own internal variability, the large number of ensemble members allows a few realizations to also reproduce the observed evolution in TOA radiation, as measured by the correlation coefficient between the observations and ensemble members. However, we find that all simulations systematically underestimate the observed 2001-2022 TOA radiation trend (Figure 1l). The maximum trend (t_max) ranges from 0.18 to 0.45 W m⁻² dec⁻¹ across models, compared with 0.46 W m⁻² dec⁻¹ in the observations. Positive TOA radiation trends indicate an increasing uptake of energy into the climate system. We here estimate the uncertainty of the observed trend as two standard errors of the linear regression, which gives ±0.13 W m⁻² dec⁻¹. This estimate lies between previous estimates: Loeb et al. (2021) quantified the difference between global-mean ocean- and TOA-based radiation trends as ±0.09 W m⁻² dec⁻¹ over the period 2005-2019, whereas Raghuraman et al. (2021) estimate an observational uncertainty of ±0.20 W m⁻² dec⁻¹ over the period 2001-2020, which is still substantially smaller than the observed trend itself. No model simulates trends higher than observed. When accounting for the observational uncertainty of ±0.13 W m⁻² dec⁻¹, 97% of the simulations fall outside this range, all on the lower side. The discrepancy between models and observations stems from both larger regions under- than overestimating the observed TOA radiation trends (33%-61% of the global area with negative trends across the full range of simulations vs. 33% in the observations) and greater magnitudes of negative trends (−0.82 to −0.59 W m⁻² dec⁻¹) than observed (−0.57 W m⁻² dec⁻¹; compare Figure S2 in Supporting Information S1). We conclude that all models systematically underestimate the observed global mean TOA radiation trend.
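The trend and uncertainty estimate described above can be sketched as follows. This is a minimal example under stated assumptions: the trend is an ordinary least-squares slope, the uncertainty is twice the textbook standard error of that slope (without any correction for autocorrelation), and the function names and numbers in the example are hypothetical.

```python
import numpy as np

def decadal_trend(years, values):
    """Return the OLS trend per decade and two standard errors of it."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    x = years - years.mean()
    y = values - values.mean()
    slope = np.sum(x * y) / np.sum(x ** 2)          # units per year
    residuals = y - slope * x
    # Standard error of the slope with n - 2 degrees of freedom
    se = np.sqrt(np.sum(residuals ** 2) / (x.size - 2) / np.sum(x ** 2))
    return 10.0 * slope, 2.0 * 10.0 * se

def fraction_outside(sim_trends, obs_trend, obs_uncertainty):
    """Fractions of simulated trends below and above the observed range."""
    sim = np.asarray(sim_trends, dtype=float)
    below = np.mean(sim < obs_trend - obs_uncertainty)
    above = np.mean(sim > obs_trend + obs_uncertainty)
    return below, above

# Synthetic 22-yr series: linear increase plus alternating noise
years = np.arange(2001, 2023)
series = 0.05 * (years - 2001) + 0.1 * (-1.0) ** np.arange(years.size)
trend, two_se = decadal_trend(years, series)

# Hypothetical ensemble trends compared with an observed 0.46 +/- 0.13
below, above = fraction_outside([0.10, 0.20, 0.40, 0.50], 0.46, 0.13)
```

Applied to every ensemble member, `fraction_outside` gives the share of simulations that fall on either side of the observed range.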
We test five hypotheses that, in principle, could explain the observation-model discrepancy: (1) the interannual variability in TOA radiation is underestimated in all models; (2) the observed trend is an extreme outlier in the distribution of the real climate system; (3) local mean-state (climatological) biases prevent the models from simulating strong enough TOA radiation trends; (4) the effective radiative forcing is too weak in the models; and (5) the local coupling between surface warming and TOA radiation trends is too weak in the models.
We reject hypothesis (1) because the models overestimate rather than underestimate global and local interannual variability (Figures S2 and S3 in Supporting Information S1). For the global mean, the observed standard deviation across the 22 years, detrended with the multi-model ensemble mean, lies well within the simulated range (0.38 W m⁻² in the observations, 0.24-0.47 W m⁻² in the models). Regions of strong trends tend to also show high interannual variability in both observations and models (Figure S2 in Supporting Information S1; see the Supplementary text for details on quantifying variability). In large regions, simulated interannual variability is of greater magnitude than observed, and the exact regions of large interannual variability and strongest trends differ considerably between models and between models and observations (Figure S2 in Supporting Information S1). Averaged globally, the models (2.63-3.74 W m⁻²) tend to overestimate the observed local interannual variability (2.68 W m⁻²) while underestimating the observed trend, which is in contrast to the fluctuation-dissipation theory (Cox et al., 2018; Leith, 1975). This rejects the hypothesis that the models have too little variability and therefore hardly include the observed trend in the ensemble spread. Note that we here evaluate the magnitude of simulated interannual variability, which differs from the simulated internal variability of trends that we account for in this study but which cannot be robustly estimated from the short CERES-observed record.
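The global-mean variability measure used above (standard deviation across the 22 years after removing an ensemble-mean estimate of the forced signal) can be sketched as follows. The function name, the use of the sample standard deviation (ddof=1), and the synthetic ensemble are assumptions for illustration; the details of the actual calculation are given in the Supplementary text.

```python
import numpy as np

def interannual_std(series, forced_estimate):
    """Standard deviation of a time series after subtracting an
    estimate of the forced signal (here: an ensemble mean)."""
    residual = np.asarray(series, dtype=float) - np.asarray(forced_estimate, dtype=float)
    return residual.std(ddof=1)

# Synthetic ensemble: common forced trend plus member-specific noise
rng = np.random.default_rng(0)
years = np.arange(2001, 2023)
forced = 0.03 * (years - 2001)
ensemble = forced + rng.normal(0.0, 0.3, size=(50, years.size))

forced_estimate = ensemble.mean(axis=0)
member_std = np.array([interannual_std(m, forced_estimate) for m in ensemble])
simulated_range = (member_std.min(), member_std.max())
```

An observed series treated the same way yields one value that can be compared with `simulated_range`.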

Geophysical Research Letters
Concerning hypothesis (2), the trend in observed ocean heat uptake matches the trend in TOA radiation (Loeb et al., 2021), indicating that if internal variability were the main driver of the trend, it would have been caused by coherent ocean-atmosphere interactions able to force strong 22-yr trends. The only mode capable of such strong variability might be the Pacific Decadal Oscillation, which changed sign within the 22 years and thus cannot explain the continued trend. ENSO events are, on average, neutral in their net radiation although they have a strong pattern effect during the event (Ceppi & Fueglistaler, 2021). Further, model analyses have shown that the observed 2001-2020 trend is larger than any trend in 20-yr periods of unforced simulations (Raghuraman et al., 2021). This renders hypothesis (2) possible but unlikely, and impossible to prove wrong with only one realization of the real world. Furthermore, we cannot exclude the possibility that the observed trend is wrong, because CERES was not built to be fully stable in time.
Hypothesis (3) is difficult to rebut without specific model simulations correcting for mean-state biases. We relate the models' ability to simulate the observed trends to the magnitude of their mean-state bias and do not find any relationship (Figures S4 and S5 in Supporting Information S1). That is, models with a smaller mean-state bias in TOA radiation and surface temperature are not more likely to reproduce the observed trends than models with a larger mean-state bias. Regions of large mean-state biases can, but need not, overlap with regions of strong observation-model discrepancies. With our model set-up, we do not find a first-order correspondence between the models' mean-state bias and their inability to simulate observed global or local TOA radiation trends.
We cannot reject hypothesis (4) that the model-underestimated TOA radiation trends are caused by too weak effective radiative forcing in all models. However, it is unlikely that errors in radiative forcing cause the consistent underestimation of TOA radiation trends across models, because the prescribed emissions differ between CMIP5 and CMIP6 models, with radiative forcing in CMIP6 being lower than in CMIP5 at the end of the historical period (Fredriksen et al., 2023; Fyfe et al., 2021), and the models also show substantial spread in their effective radiative forcing within each model generation (Smith et al., 2020; Raghuraman et al., 2023). In addition, we find that the simulated and observed surface temperature responses to radiative forcing agree much better with each other during 2001-2022 than the simulated and observed TOA radiation trends (see Section 3).
In summary, although hypotheses (1)-(4) may still contribute to the discrepancy between observed and simulated TOA radiation trends, we find them unlikely to play a dominant role. Instead, we show in the following that the model bias in TOA radiation trends is due to a too weak local surface-TOA coupling (hypothesis 5), which is driven by both erroneously simulated surface warming patterns and atmospheric physics.

Underestimated Local Surface-TOA Coupling
To understand the bias in simulated global mean TOA radiation trends and its relation to surface warming, we first show that the models do not underestimate the observed TOA radiation trend because of underestimated global surface temperature trends (Figure 1m). We find that either observed global surface temperature trends fall comfortably within the ensemble, or the models overestimate the observed trend. We now investigate the patterns in the observation-model discrepancy for trends in surface temperature and TOA radiation (Figures 2a and 2b; see Figure S6 in Supporting Information S1 for the observed trend patterns). The discrepancy is calculated as the difference at every grid point between the simulated 22-yr trend of each ensemble member and the observed 22-yr trend, then averaged across the members of each model and across models. Most models do not reproduce observed strong TOA radiation trends such as in the subtropical eastern Pacific and subtropical east Atlantic (Figure S7 in Supporting Information S1). The congruity of large-scale regions with a negative or positive discrepancy in both simulated surface temperature trends and TOA radiation trends, primarily over tropical to mid-latitude oceans, suggests a connection between surface warming and TOA radiation. We investigate this "local surface-TOA coupling" by regressing the discrepancy between simulated and observed trends of surface temperature and TOA radiation. To minimize errors in both variables, we use orthogonal regression, for which we normalize both axes by division with the standard deviation across the discrepancies of all ensemble members (Figures 2c-2e; Figures S10 and S11 in Supporting Information S1). We now focus on three regions that show striking model-observation discrepancies in TOA radiation, namely the subtropical eastern Pacific (Loeb et al., 2020), the subtropical east Atlantic (Myers et al., 2018), and the difference between the West-Pacific warm pool and the Tropics (Dong et al., 2019; Fueglistaler, 2019).
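The regression described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code of this study: both discrepancy axes are divided by their standard deviation across ensemble members, an orthogonal (total least squares) line is fitted through the centroid via a singular value decomposition, and slope and intercept are converted back to physical units. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def orthogonal_regression(dx, dy):
    """Orthogonal regression of dy on dx after normalizing both axes
    by their standard deviation. Returns slope, the y-intercept at
    x = 0 (both in physical units), and r squared."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    sx, sy = dx.std(ddof=1), dy.std(ddof=1)
    u, v = dx / sx, dy / sy
    centered = np.column_stack([u - u.mean(), v - v.mean()])
    # First right-singular vector = direction of largest variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    slope_normalized = direction[1] / direction[0]
    slope = slope_normalized * sy / sx
    intercept = dy.mean() - slope * dx.mean()
    r_squared = np.corrcoef(dx, dy)[0, 1] ** 2
    return slope, intercept, r_squared

# Synthetic discrepancies (simulated minus observed 22-yr trends)
rng = np.random.default_rng(1)
d_tas = rng.normal(-0.1, 0.15, size=200)                    # degC per decade
d_toa = 2.0 * d_tas - 0.6 + rng.normal(0.0, 0.2, size=200)  # W m-2 per decade
slope, intercept, r_squared = orthogonal_regression(d_tas, d_toa)
```

The intercept is the value of the TOA radiation discrepancy at x = 0, that is, for a member whose surface temperature trend matches the observed one. Note that with both axes scaled to unit variance, the fitted slope in physical units has magnitude sy/sx, so the normalization choice directly sets the slope.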
In the subtropical eastern Pacific, the discrepancies between observed and simulated trends in surface temperature and TOA radiation correlate well (r² across all models is 0.49, ranging from 0.10 to 0.79 for each model). The observed trends lie at the edge of the simulated ensemble spread. Importantly, all coupled models occasionally reproduce the strong observed trend in surface temperature, while some models never, and others only rarely, simulate the observed trend in TOA radiation. Thus, simulating the observed local trend in surface temperature is necessary but not sufficient for simulating the observed local trend in TOA radiation. Coupled models underestimate the sensitivity of the local TOA radiation to surface temperature trends: the regression line across all ensemble members does not cross (0,0) but at (0, <0). This is true when regressing all models' ensemble members as shown here, but also for every single model ensemble, with the regression line passing x = 0 at y = −1.6 to −0.5 W m⁻² dec⁻¹.
For the subtropical east Atlantic, a similar interpretation holds: the discrepancies between observed and simulated trends in surface temperature and TOA radiation correlate (r² across all models is 0.28, ranging from 0.09 to 0.36 for each model), but observed trends lie at the edge of the simulated range in a subset of the models. The observations are closer to the center of the range of simulated surface temperature trends than of TOA radiation trends, which reiterates that simulating the observed trend in surface temperature is necessary but not sufficient for simulating the observed trend in TOA radiation. The coupled models overestimate the sensitivity of the local TOA radiation to surface temperature trends: the regression line across all ensemble members for each model does not cross (0,0) but at (0, >0). This is true when regressing all models' ensemble members, but also for 10 out of the 11 single model ensembles, with the regression line passing x = 0 at y = 0.2 to 1.7 W m⁻² dec⁻¹.
For the Tropics, it is debated whether coupled models are capable of simulating correct relative warming patterns between the West-Pacific warm pool and the East Pacific or global Tropics (Andrews et al., 2018; Coats & Karnauskas, 2018; Dong et al., 2019, 2020; Fueglistaler, 2019; Heede et al., 2020; Olonscheck et al., 2020; Seager et al., 2019; Zhou et al., 2016). In the 22 years we focus on here, the difference between the West-Pacific warm pool and the tropical mean is reasonably simulated in the models: the CERES-observed trend lies within the simulated range of each model. This can be interpreted in two ways: either the observations are largely driven by interannual variability that is well captured by the models or, as argued above, the models overestimate interannual variability and hence cover up potential model bias (Heede et al., 2020; Olonscheck et al., 2020; Seager et al., 2019).
To understand the local sensitivity of TOA radiation to surface temperatures more broadly, we expand the analysis to the grid-point scale (Figure 2f). Assuming that a simulation replicates the observed surface warming (x = 0 in Figures 2c-2e), we find that the observed trend in TOA radiation is more often under- than overestimated. This holds both for the area affected (64% vs. 36% of the global area, blue vs. red) and for the magnitude of the under- and overestimation, with a global mean underestimation of 0.47 W m⁻² dec⁻¹, consistent with the underestimation of global mean TOA radiation trends (Figure 1l). We call the discrepancy between the observed and simulated response of the TOA radiation to the observed surface temperature trends a "response bias". We quantify the response bias by the y-intercept of the regression line at x = 0 at every grid point (contours in Figures 2f-2h), and the local surface-TOA coupling strength by both the coefficient of determination and the regression coefficient (stippling in Figures 2f-2h). We find that regions with a large local surface-TOA coupling strength primarily show a negative response bias in net TOA radiation. This highlights that models consistently underestimate local and global TOA radiation trends, in larger regions and with a stronger magnitude than the occasional local overestimation. Importantly, our finding is not only true using all ensemble members of all models but also within each single model, representing different model physics, tuning strategies, climate sensitivities, and cloud feedbacks (Hourdin et al., 2017; Schmidt et al., 2017; Zelinka et al., 2020) (Figure S12 in Supporting Information S1). The underestimation occurs on large spatial scales representing different types of clouds, lapse rate, water vapor, and circulation conditions as well as different mean-state bias magnitudes.
In summary, we identify a robust response bias apparent in all coupled models studied here: given that a simulation replicates the observed surface warming locally, the simulated local TOA radiation trend more often under- than overestimates the observed trend, both in larger regions and with larger magnitudes. While global surface warming is generally simulated well, the global TOA radiation trend is systematically and strongly underestimated.
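Given grid-point maps of the regression intercept, the coefficient of determination, and the regression coefficient, the area statistics and the stippling criterion can be sketched as below. The thresholds (coefficient of determination > 0.25, regression coefficient > 1 W m⁻² °C⁻¹) follow the caption of Figure 2; the function names, the cosine-of-latitude area weighting, and the synthetic bias map are assumptions for illustration.

```python
import numpy as np

def area_statistics(bias, lat):
    """Area fractions with negative and positive response bias and the
    area-weighted global-mean bias for a (lat, lon) map."""
    weights = np.cos(np.deg2rad(lat))[:, np.newaxis] * np.ones_like(bias)
    total = weights.sum()
    under = weights[bias < 0].sum() / total
    over = weights[bias > 0].sum() / total
    mean_bias = (bias * weights).sum() / total
    return under, over, mean_bias

def stippling_mask(r_squared, slope, r2_min=0.25, slope_min=1.0):
    """Grid points with strong local surface-TOA coupling."""
    return (r_squared > r2_min) & (slope > slope_min)

# Synthetic bias map: negative south of 30N, positive north of it
lat = np.arange(-89.5, 90.0, 1.0)
lon = np.arange(0.5, 360.0, 1.0)
bias = np.where(lat[:, np.newaxis] < 30.0, -0.5, 0.2) * np.ones((lat.size, lon.size))
under, over, mean_bias = area_statistics(bias, lat)
```

In this synthetic case the region south of 30°N covers three quarters of the globe, so `under` is 0.75 and the global-mean bias is −0.325.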

Causes for the Too Weak Local Surface-TOA Coupling
We investigate two possible causes of the identified response bias. First, besides the local surface warming, the leading factor controlling clouds, and hence TOA radiation, is the lower-tropospheric inversion strength. The inversion strength is set by the warming of the free troposphere which, in turn, is set by surface warming in regions of deep atmospheric convection (e.g., Barsugli et al., 2006; Dong et al., 2019; Fueglistaler & Silvers, 2021; Liu et al., 2018; Wood & Bretherton, 2006). Hence, the first reason for the error in local coupling could be the inability of the models to replicate observed surface warming patterns. Second, atmospheric processes could render the TOA response wrong even for a correct local surface warming and a correct surface warming pattern. TOA radiation is influenced by a myriad of processes such as radiative transfer, heat transport and large-scale circulation, the local boundary layer structure, humidity, cloud types, and cloud microphysics (e.g., Myers et al., 2021; Wood & Bretherton, 2006; Zhou et al., 2017).
To separate the effect of the surface warming pattern from that of the atmosphere alone, we compare the coupled models' simulations with atmosphere-only simulations that are forced with the observed sea surface temperature evolution ("AMIP" simulations; Figures S15 and S16 in Supporting Information S1). Similar to the coupled models, we find that three out of five AMIP ensembles also underestimate the observed TOA radiation trend, but at a lower magnitude and within observational uncertainty. In the model mean, about one third of the models' error in surface-TOA coupling stems from atmospheric processes alone, but with substantial intermodel differences (see Supporting Information S1). Our finding supports recent research showing that radiative feedbacks in AMIP simulations do not match observations substantially better than the ones in coupled models (Raghuraman et al., 2021; Uribe et al., 2022). The relevance of the atmosphere in explaining the response bias is seemingly in contrast to some studies arguing that the local atmosphere is likely not the dominant reason for biases in TOA radiation because AMIP-type simulations reproduce observed variations of TOA radiation (Andrews et al., 2018, 2022; Loeb et al., 2021; Zhou et al., 2021). However, these studies do not investigate trends but interannual variability, which matches the observations astonishingly well (Figure S17 in Supporting Information S1). Initial-condition ensembles of updated AMIP-type simulations are urgently needed to assess climate models' response biases with respect to radiation. We conclude that we cannot rule out the atmosphere as a cause for the local surface-TOA coupling response bias and that, in some models, errors in the atmospheric physics might be a dominant cause.
The other reason for the response bias stems from the inability of the models to reproduce the observed surface warming pattern. Our analysis in Figure 2 reveals that simulating the observed local surface warming is necessary but not sufficient for simulating the observed local trend in TOA radiation. This is in line with a growing body of literature arguing that the spatial pattern of surface warming is key to the global TOA radiation trends ("pattern effect", e.g., Senior and Mitchell (2000); Andrews et al. (2015, 2022)). A contributing driver for the model bias in the surface warming pattern could be model biases in aerosol emissions (Takahashi & Watanabe, 2016; Smith et al., 2016), but the main cause for the discrepancy is unknown.
We further investigate the cause of the response bias by separating the net TOA radiation response bias into its shortwave and longwave components (Figures 2g-2h). The shortwave response bias shows a similar pattern to the total response bias, especially in mid- and high latitudes, whereas the shortwave response bias in the Tropics is largely compensated or modified by the longwave response bias (Raghuraman et al., 2021). For all (net, shortwave, and longwave) fluxes, larger regions show a negative rather than a positive discrepancy of the simulated TOA radiation for a correct surface warming (blue vs. red). In the shortwave radiation, the underestimation has substantially larger magnitudes (0.56 vs. 0.18 W m⁻² dec⁻¹) such that in the global mean the shortwave radiation trends show much larger discrepancies than the longwave radiation trends (Figure S18 in Supporting Information S1).
In summary, we find that the response bias stems from both differences in the large-scale surface warming patterns and errors in atmospheric physics affecting short- and longwave radiation.

Response Bias Reflects in Climate Sensitivity
The global-mean underestimation of the TOA radiation response to surface warming is correlated with the models' short-term, global-mean effective feedback parameter λ_eff and effective climate sensitivity (EffCS; Rugenstein & Armour, 2021). Coupled models with a greater response bias have a less negative 2001-2022 λ_eff and a higher EffCS (Figure 3). We quantify the global feedback parameter λ_eff by calculating the anomalies in ΔN − ΔF and regressing them against ΔT. While ΔN and ΔT are easy to quantify for observations and each ensemble member, the effective radiative forcing F is uncertain for observations and models. We calculate F from the RFMIP simulations available for CanESM5 (1.2022)), or other periods and observational products (Huber et al., 2011). The spread of ensemble members in λ_eff is substantial, which highlights that a feedback defined with just 22-yr regressions has limited explanatory power. Importantly, ensemble members which reproduce the observed feedback do so by combining a wrong TOA radiation trend with a wrong surface warming. Contrary to interpreting radiative feedbacks as a potential emergent constraint, our analysis suggests that models with a smaller surface-TOA coupling bias tend to have more negative feedbacks and hence a lower EffCS. Our new metric takes into account whether a model, given the opportunity to act out its full spectrum of internal variability, is able to reproduce local observed TOA radiation trends given the correct observed local surface warming. Using the concept of local radiative feedbacks results in a similar interpretation (Figure S19 in Supporting Information S1). The metric is a measure of a response bias and, even though it is not based on one single well-understood process, gives us indications about the erroneous processes, in contrast to approaches that use global-mean surface warming (Flynn & Mauritsen, 2020; Jiménez-de-la Cuesta & Mauritsen, 2019; Tokarska et al., 2020) or radiative feedbacks (Dessler, 2013; He et al., 2021; Uribe et al., 2022) to constrain λ_eff or EffCS.
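The feedback estimate described above (regressing ΔN − ΔF against ΔT) can be sketched as follows. This is a minimal illustration that assumes an ordinary least-squares regression of annual global-mean anomalies; the function name and the synthetic anomaly series are hypothetical, and in practice the forcing ΔF is uncertain, as noted in the text.

```python
import numpy as np

def effective_feedback(delta_n, delta_f, delta_t):
    """Slope of (delta_n - delta_f) regressed against delta_t,
    in W m-2 per degC (assumed ordinary least squares)."""
    response = np.asarray(delta_n, dtype=float) - np.asarray(delta_f, dtype=float)
    slope, _ = np.polyfit(np.asarray(delta_t, dtype=float), response, 1)
    return slope

# Synthetic anomalies constructed with a known feedback of -1.2
years = np.arange(2001, 2023)
delta_t = 0.02 * (years - 2001)        # degC
delta_f = 0.04 * (years - 2001)        # W m-2
delta_n = delta_f - 1.2 * delta_t      # W m-2
lambda_eff = effective_feedback(delta_n, delta_f, delta_t)
```

A more negative `lambda_eff` means that the climate system radiates more energy back to space per degree of warming.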
Our response bias metric is a new line of evidence that low-EffCS models more realistically reproduce climate change over the last 22 years than high-EffCS models (e.g., Modak & Mauritsen, 2021; Myers et al., 2021). Models with a stronger response bias, that is, a weaker TOA radiation response to surface warming, accumulate too much energy in the atmosphere, which is reflected in a higher EffCS. These models might have a less realistic pattern of warming, leading to wrong magnitudes of ocean heat uptake and radiative feedbacks, and/or larger errors in their atmospheric physics, reflected in wrong radiative feedbacks but also in a wrong representation of forcing.
To actually constrain EffCS and, more importantly, warming over the next decades, we need to disentangle these issues and increase trust in the models' ability to faithfully represent both the surface warming pattern and the atmospheric physics.
The code used to both process the data and create the figures for this paper can be publicly accessed at Olonscheck and Rugenstein (2024).
... and innovation programme under grant agreement 820829. M.R. was further funded by the National Aeronautics and Space Administration under Grant 80NSSC21K1042. Computational resources were made available by the German Climate Computing Centre (DKRZ). We thank the two anonymous reviewers for valuable comments, Cathy Hohenegger for a pre-submission review, and Timothy Andrews, Yue Dong, Eric Maloney, Mark Richardson, Shiv Priyam Raghuraman, and Jochem Marotzke for valuable comments on the draft paper. We further thank the US CLIVAR Working Group on Large Ensembles for providing the Multi-Model Large Ensemble Archive, and the various modeling groups for carrying out the large ensemble simulations used here. Open Access funding enabled and organized by Projekt DEAL.

Figure 1. Observed and simulated global-mean radiation anomaly at TOA. (a)-(k) Annual-mean CERES observations are shown in black and model simulations in color. The ensemble member with the maximum correlation coefficient to the observations (r_max) is depicted in gray and the ensemble member with the maximum 22-yr trend (t_max) is highlighted in color. The observed 22-yr trend is 0.46 ± 0.13 W m⁻² dec⁻¹. The number of ensemble members is shown in the panel title. The triangle shows where the historical simulations are continued with the scenario simulations. The mean of the entire period is subtracted for observations and models. (l)-(m) Observed (black) and simulated (colored) 2001-2022 trends in (l) global mean TOA radiation and (m) global mean surface temperature. Each filled dot represents one ensemble member; black circles represent the ensemble mean. The vertical dashed line and gray shading show the observed trend ±2 standard errors of the 22-yr linear regression. Positive TOA radiation trends indicate an increasing uptake of energy into the climate system. See Table S1 in Supporting Information S1 for details on the model ensembles and for numerical values of correlation coefficients and trends.

Figure 2. Relationship between surface temperature and TOA radiation. (a) and (b) Discrepancy between each ensemble member and observed 2001-2022 trends averaged across models in (a) surface temperature and (b) TOA radiation. Black boxes frame regions of interest. (c)-(e) Orthogonal regression across all ensemble members between (a) and (b) averaged for (c) the subtropical eastern Pacific (150°W-110°W, 10°N-40°N), (d) the subtropical east Atlantic (30°W-10°W, 25°N-40°N), and (e) the West-Pacific warm pool (90°E-180°E, 20°N-22°S) minus the entire Tropics (30°N-30°S). Individual ensemble members are shown as dots colored for each model as shown in the label bar in panel (c). The multi-model ensemble mean is shown as a black filled dot. (f)-(h) Pattern of the response bias of TOA radiation trends to observed surface temperature trends measured as the y-intercept of the regression line at x = 0 (compare (c)-(e)) for (f) net TOA radiation, (g) TOA shortwave radiation, and (h) TOA longwave radiation. Stippling highlights regions where the coefficient of determination is >0.25 and the regression coefficient is >1 W m⁻² °C⁻¹. The percentages indicate the global area for which the models overestimate (red) or underestimate (blue) the observed TOA radiation response to surface warming, and the number in black shows the magnitude of the global mean response bias in W m⁻² dec⁻¹. Figures S8-S14 in Supporting Information S1 show results for individual models.