Using Explainable Artificial Intelligence to Quantify “Climate Distinguishability” After Stratospheric Aerosol Injection

Stratospheric aerosol injection (SAI) has been proposed as a possible response option to limit global warming and its societal consequences. However, the climate impacts of such intervention are unclear. Here, an explainable artificial intelligence (XAI) framework is introduced to quantify how distinguishable an SAI climate might be from a pre‐deployment climate. A suite of neural networks is trained on Earth system model data to learn to distinguish between pre‐ and post‐deployment periods across a variety of climate variables. The network accuracy is analogous to the “climate distinguishability” between the periods, and the corresponding distinctive patterns are identified using XAI methods. For many variables, the two periods are less distinguishable under SAI than under a no‐SAI scenario, suggesting that the specific intervention modeled decelerates future climatic changes and leads to a less novel climate than the no‐SAI scenario. Other climate variables for which the intervention has negligible effect are also highlighted.

• An explainable artificial intelligence framework is introduced to quantify the "climate distinguishability" after a climate intervention
• The distinctive patterns between the pre- and post-intervention climates are not predefined but are learned directly from the data
• For the climate simulations analyzed, stratospheric aerosol injection is shown to reduce distinguishability for some climate variables

Supporting Information:
Supporting Information may be found in the online version of this article.
DOI: 10.1029/2023GL106137

… definition of an appropriate metric (e.g., plotting the ensemble mean differences, using statistical metrics like the t-test to assess local significance, calculating Euclidean distance, etc.; Diffenbaugh et al., 2008; Langdon & Lawler, 2015; Tye et al., 2022; Richter et al., 2022).
Here, a new, explainable artificial intelligence (XAI) framework is proposed to quantify how "novel" an SAI climate might be. Model simulations from the Community Earth System Model version 2 (CESM2) are considered under two future scenarios spanning the years 2015-2069: an intermediate climate change scenario where global temperatures continue rising, and an identical climate change scenario except where SAI is deployed to limit warming to 1.5°C relative to the preindustrial era (Richter et al., 2022). The "climate distinguishability" between the pre- and post-SAI worlds is then quantified by tasking an artificial neural network to distinguish between the two across a variety of climate variables. The more successful the network is at this task, the more "distinguishable" the pre- and post-SAI worlds are in terms of their climate.
Specifically, to quantify the climate distinguishability after SAI, we train a neural network to distinguish between maps of a variable of interest that originate from the SAI climate (i.e., the SAI climate is defined as the 2040-2059 climate under the SAI scenario; see blue box in Figure 1a) versus maps that originate from the pre-deployment/reference climate (the reference climate is defined as the 2020-2039 climate under the intermediate climate change scenario: the Shared Socioeconomic Pathway 2-4.5; SSP2-4.5; O'Neill et al., 2017; see gray box in Figure 1a). Although the prediction itself is not useful in this setting (i.e., we already know which map originates from which set of simulations), the accuracy of the network informs us about the climate distinguishability between the two periods. In this way, the degree of climate distinguishability is quantified with a single number: the accuracy of the network. To put this number into context, the network accuracy is compared with its "baseline" value, which is the network accuracy in the case where there was no intervention. That is, the aforementioned prediction task is repeated but this time the network is trained to distinguish between the reference climate and the future SSP climate without intervention (i.e., the future SSP climate is defined as the 2040-2059 climate under the SSP2-4.5 scenario; see magenta box in Figure 1a).
The main advantages of the proposed framework are: (a) it provides a new way to quantify with a single number the impact of the intervention in terms of introducing novel climate conditions; and (b) it is purely data-driven; thus, the form of change between the two compared climates does not need to be predefined. Instead, the data determines "the ways" that the two climates are different. To gain insight into the patterns that distinguish the two climates, XAI tools are used. XAI aims to elucidate the decision-making process of deep learning models and has been increasingly applied in the geosciences in recent years (see Mamalakis, Barnes, & Ebert-Uphoff, 2022; Mamalakis et al., 2022a, 2022b; McGovern et al., 2019; Toms et al., 2020). In Section 2, details about the data, the prediction task of the framework and the methods used are provided, while results are presented in Section 3. The major conclusions and future research directions are discussed in Section 4.

Data
Data from an ensemble of Earth system model simulations: "Assessing Responses and Impacts of Solar climate intervention on the Earth system with Stratospheric Aerosol Injection" (ARISE-SAI; publicly available at https://www.cesm.ucar.edu/community-projects/arise-sai; Richter et al., 2022) are used in this study. The ARISE-SAI experiment consists of two sets of parallel simulations performed with the CESM2 (Danabasoglu et al., 2020), using the Whole Atmosphere Community Climate Model version 6 as its atmospheric component (CESM2(WACCM6); Gettelman et al., 2019; Tilmes et al., 2020; Richter et al., 2022): (a) 10 ensemble members from 2015 to 2069 under the SSP2-4.5 (O'Neill et al., 2017), which represents an intermediate climate change scenario; and (b) 10 ensemble members from 2035 to 2069 under an SAI deployment scenario. In the latter, SO2 is injected every day at roughly 21 km height at 180° longitude and 30°S, 15°S, 15°N, and 30°N using a "controller" algorithm (Kravitz et al., 2017; MacMartin et al., 2014). The SAI simulations aim to keep the global-mean surface air temperature near 1.5°C above the preindustrial temperature. For more detailed information on the ARISE-SAI experiment, the reader is referred to Richter et al. (2022).
Climate distinguishability is quantified for 21 climate variables that are listed in Table S1 in Supporting Information S1. Prior to training the network, all variables are bi-linearly re-gridded to a 2.5° by 2.5° resolution grid from an approximate 1° by 1° resolution to reduce the dimensionality of the prediction task. Since this re-gridding is applied to the climate data of both scenarios, it does not affect the conclusions about the impacts of SAI.
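The re-gridding step can be illustrated with a minimal sketch. It assumes regular, hypothetical 1° and 2.5° cell-center grids (the actual model grid is only approximately 1°), ignores longitude periodicity, and uses a hypothetical helper `regrid_bilinear`; it is not the tool used in the study.

```python
import numpy as np

def regrid_bilinear(field, lat_src, lon_src, lat_dst, lon_dst):
    """Bilinear interpolation on a rectilinear grid, done as two 1-D linear passes."""
    # Pass 1: interpolate every source latitude row onto the target longitudes.
    along_lon = np.stack([np.interp(lon_dst, lon_src, row) for row in field])
    # Pass 2: interpolate every target-longitude column onto the target latitudes.
    return np.stack(
        [np.interp(lat_dst, lat_src, col) for col in along_lon.T], axis=1
    )

# Hypothetical regular grids (cell centers), chosen only for illustration.
lat_src = np.arange(-89.5, 90.0, 1.0)    # 180 latitudes
lon_src = np.arange(0.5, 360.0, 1.0)     # 360 longitudes
lat_dst = np.arange(-88.75, 90.0, 2.5)   # 72 latitudes
lon_dst = np.arange(1.25, 360.0, 2.5)    # 144 longitudes

# A synthetic field that is linear in latitude and longitude;
# bilinear interpolation reproduces such a field exactly inside the domain.
field_src = 2.0 * lat_src[:, None] + 3.0 * lon_src[None, :]
field_dst = regrid_bilinear(field_src, lat_src, lon_src, lat_dst, lon_dst)

print(field_src.shape, "->", field_dst.shape)
```

Under these assumed grids, each map shrinks from 180 × 360 to 72 × 144 grid cells, which is the kind of dimensionality reduction the text refers to.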

Prediction Task
The reference climate is defined as the CESM2(WACCM6) output over the period 2020-2039 under the SSP2-4.5 scenario, following the original study of ARISE-SAI (Richter et al., 2022). A network is then trained to distinguish between the reference climate (see gray box in Figure 1a) and the climate under SAI over the period 2040-2059 (see blue box in Figure 1a). Specifically, given a randomly chosen map of a variable of interest as an input (e.g., a map of annual mean surface temperature or annual maximum precipitation, see Table S1 in Supporting Information S1), a fully connected network is tasked with estimating the probability that the map originated from the 2040-2059 SAI climate. A probability value less than 0.5 indicates that the map is predicted to belong to the reference climate, while a probability value greater than 0.5 indicates that the map is predicted to belong to the SAI climate; see Figure 1b. Framing the prediction task in this way requires the network to identify patterns that serve as robust and distinctive indicators to separate the pre- and post-deployment periods. It is important to note that the patterns used by the network could be of any form: local, global or any type of combination of patterns, which emphasizes the generic nature of the suggested framework.
To place climate distinguishability under SAI into context, it is compared to the climate distinguishability under the scenario of no intervention. This is done by repeating the same approach, but by tasking the network to distinguish between the reference climate and the climate in the period 2040-2059 under the SSP2-4.5 scenario (see magenta box in Figure 1a). The network's accuracy from this second task serves as a "baseline" value of climate distinguishability for the variable analyzed and is compared with the results from the first task to help assess the potential benefits (or risks) of deploying SAI. For details on the training approach and the architectures of the networks, please see Text S1 in Supporting Information S1.
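The idea of reading classification accuracy as distinguishability can be sketched on synthetic data. The example below is only an illustration under stated assumptions: a single-layer logistic classifier stands in for the fully connected networks of the study, the "maps" are white noise with an optional uniform shift on a few grid cells, and all names (`make_maps`, `train_logistic`, `distinguishability`) and sizes are hypothetical.

```python
import numpy as np

def make_maps(rng, n_maps, n_grid, shift):
    """Synthetic flattened 'maps': white noise plus a shift on the first 10 grid cells."""
    maps = rng.normal(size=(n_maps, n_grid))
    maps[:, :10] += shift
    return maps

def train_logistic(x, y, n_steps=500, lr=0.1):
    """Plain gradient descent on the logistic loss (a stand-in for a neural network)."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(n_steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # probability of the 'future' class
        err = p - y
        w -= lr * (x.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def distinguishability(shift, seed=0, n_train=400, n_test=200, n_grid=50):
    """Test accuracy when separating 'reference' maps from 'future' maps."""
    rng = np.random.default_rng(seed)

    def dataset(n):
        ref = make_maps(rng, n, n_grid, 0.0)    # label 0: reference climate
        fut = make_maps(rng, n, n_grid, shift)  # label 1: future climate
        return np.vstack([ref, fut]), np.concatenate([np.zeros(n), np.ones(n)])

    x_train, y_train = dataset(n_train)
    x_test, y_test = dataset(n_test)
    w, b = train_logistic(x_train, y_train)
    predicted_future = (x_test @ w + b) > 0.0   # same as probability > 0.5
    return float((predicted_future == y_test).mean())

acc_changed = distinguishability(shift=1.0)     # future differs from reference
acc_unchanged = distinguishability(shift=0.0)   # future statistically identical
print(f"accuracy with a signal: {acc_changed:.2f}, without: {acc_unchanged:.2f}")
```

In this toy setting, a future climate with a robust shift is classified well above chance, while a statistically identical future stays near 50%, which mirrors the comparison between the two tasks described above.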

Explainable AI Method
The local attribution method Deep SHAP (Lundberg & Lee, 2017) is utilized to explain the predictions of the network. Specifically, for a given input map, Deep SHAP identifies the regions that help increase (or decrease) the network-derived probability that the map belongs to the SAI world. Thus, for a variable that the network can accurately distinguish between pre- and post-deployment periods, the Deep SHAP-identified regions constitute robust and distinctive SAI signals (see also Barnes et al., 2020; Labe et al., 2023). Deep SHAP has been chosen for two reasons: (a) it allows the user to define the baseline for which the attribution is derived (see Mamalakis et al. (2023) on the importance of baselines); and (b) it satisfies the completeness property (Sundararajan et al., 2017), which holds that the attributions add up to the difference between the network output at the current sample and the one at the baseline. For details on the Deep SHAP algorithm, please see Text S2 in Supporting Information S1. The method Integrated Gradients (Sundararajan et al., 2017) was also used to explain the network's predictions, and the results were very similar to those based on Deep SHAP (not shown).
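The completeness property can be checked numerically. The sketch below does not implement Deep SHAP (its algorithm is described in Text S2); instead it uses a simple Riemann-sum version of Integrated Gradients on a tiny, randomly initialized network, with an all-zeros baseline. The network, its weights, and the baseline are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 6, 8

# A tiny, randomly initialized network (hypothetical; not the study's architecture).
W1 = 0.5 * rng.normal(size=(n_hidden, n_in))
b1 = 0.1 * rng.normal(size=n_hidden)
w2 = 0.5 * rng.normal(size=n_hidden)
b2 = 0.0

def network(x):
    """Output probability: sigmoid(w2 . tanh(W1 x + b1) + b2)."""
    h = np.tanh(W1 @ x + b1)
    return 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))

def network_gradient(x):
    """Analytic gradient of the output probability with respect to the input."""
    h = np.tanh(W1 @ x + b1)
    p = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))
    return p * (1.0 - p) * (W1.T @ (w2 * (1.0 - h**2)))

def integrated_gradients(x, baseline, n_steps=1000):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    alphas = (np.arange(n_steps) + 0.5) / n_steps
    grads = np.array(
        [network_gradient(baseline + a * (x - baseline)) for a in alphas]
    )
    return (x - baseline) * grads.mean(axis=0)

x = rng.normal(size=n_in)      # the "current sample"
baseline = np.zeros(n_in)      # assumed baseline input
attributions = integrated_gradients(x, baseline)

# Completeness: attributions sum to f(x) - f(baseline), up to integration error.
completeness_gap = attributions.sum() - (network(x) - network(baseline))
print(f"completeness gap: {completeness_gap:.2e}")
```

The choice of baseline changes every attribution value, which is why a method that lets the user set it explicitly is useful in this setting.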

Results
We start by presenting the results for the case of annual maximum daily precipitation in Figure 2. We first discuss the results for a future climate with no intervention. The global-mean annual maximum precipitation exhibits an increase throughout the century, albeit with large ensemble spread (magenta lines, Figure 2a). The largest increases occur in the deep tropics, especially over the tropical Pacific (Figure 2b; see also O'Gorman & Schneider, 2009; Kharin et al., 2013; Pfahl et al., 2017). The network successfully distinguishes between the reference climate and the SSP future climate 85% of the time, which is significant at a 0.01 level (Figure 2d). Moreover, the probability assigned by the network that a map corresponds to the future SSP climate increases linearly with the actual year of the map and maximizes in the out-of-sample years 2060-2069 (Figure 2d). This suggests that there are robust signals of climate change that become more evident with time. It also suggests that the learned patterns generalize successfully, since the network can correctly classify the years 2060-2069, although those years were not used during training (see Text S1 in Supporting Information S1). Based on the results from the XAI method Deep SHAP, the network mainly uses precipitation extremes over the tropical eastern Pacific (and to a lesser degree over the Southern Ocean and the tropical Atlantic) to make its predictions (Figure 2f). Interestingly, the network does not use precipitation over the western Pacific or Australia, even though the corresponding ensemble mean difference between the two periods is of high magnitude (Figure 2b). This implies high internal variability of precipitation extremes over these regions, which does not make them robust indicators from a signal-to-noise perspective.
Under the SAI scenario, the overall accuracy of the network is only 58% (Figure 2e), which is not statistically different from a random chance-based model (at a 0.01 significance level, a random chance-based model would perform with up to 69% accuracy, derived using a binomial distribution). The network-derived probability that a map corresponds to the SAI climate is almost independent of the actual year of the map (Figure 2e), which indicates that there are no robust long-term climate signals under SAI that the network could use for distinguishing from the reference climate. This is also suggested by the XAI results; note the incoherent and noisy attributions in Figure 2g. Generally, the results in Figure 2 indicate that although the CESM2(WACCM6) simulates a robust increase in future extreme daily precipitation under the SSP2-4.5 scenario, possible deployment of SAI could largely preserve the conditions of the reference (i.e., pre-deployment) climate.
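The chance-level threshold mentioned above comes from a binomial distribution. A minimal sketch of how such a critical accuracy can be computed is given below; the function name `critical_accuracy` and the sample sizes are hypothetical, since the number of test maps behind the 69% value is not stated in this section.

```python
from math import comb

def critical_accuracy(n_samples, alpha, p_chance=0.5):
    """Smallest accuracy k/n with P(X >= k) <= alpha for X ~ Binomial(n, p_chance).

    Returns a value above 1 if even a perfect score would not be significant.
    """
    tail = 0.0
    k_crit = n_samples + 1
    # Accumulate the upper tail from k = n downward until it exceeds alpha.
    for k in range(n_samples, -1, -1):
        tail += comb(n_samples, k) * p_chance**k * (1.0 - p_chance) ** (n_samples - k)
        if tail > alpha:
            break
        k_crit = k
    return k_crit / n_samples

# Hypothetical sample sizes: the threshold approaches 50% as more maps are tested.
for n in (10, 40, 160):
    print(n, critical_accuracy(n, alpha=0.01))
```

The threshold depends strongly on the number of test samples, so an accuracy such as 58% can be indistinguishable from chance for a small test set even though it exceeds 50%.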
Next, consider the annual mean surface temperature over land (Figure 3). Under the SSP scenario, a clear increase in surface temperature is shown throughout the century that is evident globally (Figures 3a and 3b). Accordingly, the network accuracy in distinguishing between the reference and the future SSP climate is high, on the order of 93%. Many regions around the globe are highlighted by Deep SHAP as robust distinctive patterns; for example, Mexico, southern South America, southern Africa, Indonesia, and southern Australia (Figure 3f). Under the SAI scenario, although the global mean temperature is similar to the one under the reference climate, there are robust patterns of regional cooling that make the two climates distinguishable 91% of the time (Figure 3e). Regional cooling happens mainly over southern South America, eastern Africa, eastern Australia, and Greenland (Figure 3c), regions that the network uses to distinguish between the reference and the SAI climates (see Figure 3g). Overall, these results indicate that the CESM2(WACCM6) projects that a potential SAI deployment would lead to a less warm climate than SSP; however, the annual mean surface temperature over land in an SAI world would also be highly distinguishable from the reference climate. Importantly, the distinctive patterns in the two scenarios are quite different, with warming patterns dominating the SSP scenario, while regional cooling patterns are robust distinctive patterns under SAI.
The same analysis as in Figures 2 and 3 has been repeated for 21 variables in total (see Table S1 in Supporting Information S1), and the results are summarized in Figure 4. For all variables, the network accuracy under the SSP scenario (magenta circles in Figure 4a) is statistically significant. This means that even under the intermediate climate change scenario SSP2-4.5, the CESM2(WACCM6) projects that the Earth system would exhibit climatic conditions that are distinguishable from the reference climate in the coming decades. Regarding the intervention, for most variables SAI would lead to a less distinguishable climate than the SSP scenario, decelerating greenhouse-gas-driven climate changes in surface temperature extremes, precipitation, drought occurrence, sea level pressure, and Arctic sea ice (see also W. Lee et al., 2020; W. R. Lee et al., 2023; Tye et al., 2022; Xu et al., 2020). On the other hand, there are several variables for which SAI is projected to have minimal impact relative to climate change, including soil moisture, evapotranspiration, and ocean acidity. Importantly, however, an SAI climate would also be novel. That is, even with SAI, the new future climate would be distinguishable from the reference climate for most of the Earth system variables examined (note that the network accuracy (light blue circles) is higher than the random chance-based accuracy).
The network accuracy depicted in Figure 4a does not provide information on the distinctive patterns and how they might be modified by SAI. For example, as is shown in Figure 3 for annual mean surface temperature over land, the climate distinguishability under the SSP and the SAI scenarios is similar, but the corresponding distinctive patterns are very different. To explore this aspect further, the spatial correlations between the XAI heatmaps under the SSP and SAI scenarios are presented in Figure 4b. First, note that there are a few variables for which the spatial correlation is high, such as ocean acidity and ocean heat content, which means that the anticipated SSP-driven distinctive patterns are projected to remain almost unchanged under SAI. Yet, in most cases, the correlation is not statistically different from zero, which means that SAI is projected to introduce novel distinctive patterns relative to those from the SSP scenario. There are several such patterns worth mentioning since they appear consistent across many variables (see Figures S1 and S2 in Supporting Information S1). For instance, both the annual average and the annual maximum surface temperatures exhibit strong regional cooling over the northern Atlantic (see also Richter et al., 2022). Regional cooling is also a robust SAI signal over the Southern Hemisphere middle latitudes (see also Figure 3). We emphasize that regional cooling might not necessarily yield optimal outcomes for the affected countries (Abatayo et al., 2020), highlighting that regional changes are no less important than global average changes. Additionally, although annual precipitation, drought duration and evapotranspiration are all indistinguishable between pre- and post-deployment periods from a global average perspective (see also Richter et al., 2022), regional drying over the Amazon and tropical Africa are robust SAI signals that render the pre- and post-deployment fields distinguishable.
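The spatial correlation between two attribution heatmaps, as summarized in Figure 4b, can be illustrated with a short sketch. It assumes an unweighted Pearson correlation over all grid cells of a 72 × 144 (2.5°) grid and uses synthetic heatmaps; whether the study applies any area weighting is not stated here, and the helper name `spatial_correlation` is hypothetical.

```python
import numpy as np

def spatial_correlation(map_a, map_b):
    """Unweighted Pearson correlation between two maps defined on the same grid."""
    return float(np.corrcoef(map_a.ravel(), map_b.ravel())[0, 1])

rng = np.random.default_rng(0)
shape = (72, 144)  # assumed 2.5-degree latitude x longitude grid

# Synthetic stand-ins for attribution heatmaps (not real model output).
pattern = rng.normal(size=shape)
noise = rng.normal(size=shape)

same = spatial_correlation(pattern, pattern)                 # identical patterns
unrelated = spatial_correlation(pattern, noise)              # independent patterns
partly_shared = spatial_correlation(pattern, 0.5 * pattern + noise)

print(f"same: {same:.2f}, unrelated: {unrelated:.2f}, partly shared: {partly_shared:.2f}")
```

A correlation near one corresponds to the case where the SSP-driven patterns persist under SAI, while a correlation near zero corresponds to SAI introducing distinctive patterns unrelated to those of the SSP scenario.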
The results in Figure 4 illustrate the diverse impacts of SAI on different components of the Earth system. They also confirm that SAI would likely lead to a different "vector" of change compared to the SSP scenario (see also Irvine et al., 2017), as shown by the introduction of novel distinctive patterns for most variables. Further, the results show that SAI would likely lead to a less novel climate than the climate without any intervention.

Conclusions
In this study, a new framework was used that allows quantification (with a single number) of the degree of climate distinguishability between a reference climate and future climate states from both SAI and no-SAI worlds. The framework is based on the use of machine learning and leverages XAI tools to identify robust distinctive patterns under the different scenarios. The framework is purely data driven, nonlinear, nonlocal, and it accounts for underlying uncertainties in the data that may originate from internal stochastic variability or uncertainties in Earth system model physics.
This framework was applied to data from ensembles of simulations that were developed to examine the potential impacts of SAI on Earth's climate; namely, the ARISE-SAI project (Richter et al., 2022). SAI was shown to have a diverse range of impacts, including minimizing changes due to greenhouse gas forcing in temperature and precipitation extremes, while having negligible effect on ocean acidification. Overall, a future SAI world would be more similar to the current climate than a world without SAI, even though most of the variables examined here would still be distinguishable due to new patterns of regional change. This raises the possibility of SAI leading to novel (and perhaps unwanted) changes in specific components of the Earth system or in certain regions of the world.
There are some potential limitations of the data-driven framework used here. One is the dependence of the results on the amount of data. Neural networks are known to be "data-thirsty" models (LeCun et al., 2015), so it is possible that certain patterns that were not identified as robust indicators during training could become robust with more data. However, the dependence on the amount of data is present in virtually all climate settings involving questions of signal-to-noise and statistical significance. Another limitation is the possible dependence of the results on the network architecture. To address this issue, many different architectures and combinations of hyperparameters were examined before training the network, as described in Text S1 in Supporting Information S1. That way, the data served as the guide for the best architecture for each climate variable. Yet, it is possible that some of the results of this study depend on the adopted architectures.
Our work highlights the need to further research the impacts of possible climate intervention approaches (see also Cheng et al., 2019, 2022; Simpson et al., 2019; Tye et al., 2022; Xu et al., 2020), including the ARISE-SAI simulations (Hueholt et al., 2023; Keys et al., 2022; Labe et al., 2023). In doing so, the notion of "quantifiable climate distinguishability" will be a relevant and informative metric that can be used to assess impacts and expand the design space of possible interventions (W. Lee et al., 2020). Further investigation could include assessing the climate distinguishability by considering multiple variables at the same time (i.e., the network input consists of many channels each of which refers to a different variable) to assess potential impacts on the dependence structure of different components of the Earth system and the occurrence of compound events. Future work could also focus on analyzing the output of more than one Earth system model and of more than one climate intervention strategy.

Figure 1. Schematic of our framework to quantify stratospheric aerosol injection (SAI) impacts using explainable artificial intelligence (XAI). (a) Assessing climate distinguishability between reference and future climates. Note that the pre-2040 period under an intermediate climate change scenario is used as the reference climate, in accordance with Richter et al. (2022). (b) Schematic of the prediction task to quantify climate distinguishability after SAI and the use of XAI to derive the distinctive patterns between the reference and SAI climates.

Figure 2. Results of our framework for annual maximum daily precipitation. (a) Series of global-mean annual maximum precipitation (in mm/d) under the SSP2-4.5 scenario and the ARISE-SAI scenario. All 10 ensemble members and the ensemble mean are shown. (b) Ensemble mean difference between the annual maximum precipitation in the 2040-2059 SSP2-4.5 climate and the reference climate. (d) Network-generated probability that different annual maximum precipitation maps originated from the 2040-2059 SSP2-4.5 climate. The actual year of each map is provided on the horizontal axis. The overall accuracy of the network is shown in the bottom right corner. (f) Distinctive patterns that were used by the network to separate the reference climate from the 2040-2059 SSP2-4.5 climate, as estimated using the method Deep SHAP. The presented attributions correspond to the average attributions across the 2060-2069 network predictions and all testing members, using the years 2035-2044 as baseline. Panels (c, e, g) are the same as panels (b, d, f), but the network is trained to separate the reference climate from the 2040-2059 ARISE-SAI climate.

Figure 3. Same as in Figure 2, but results are for the annual mean surface temperature over land.

Figure 4. (a) Accuracy of the network in distinguishing between the reference climate and the future SSP2-4.5 climate (magenta) or the future ARISE-SAI climate (light blue), for all variables considered in the study (see Table S1 in Supporting Information S1). Results from individual testing members (smaller circles) and the ensemble mean (bigger circles) are presented. The critical values for the 10% and 1% significance levels are derived using a binomial distribution. (b) Correlation coefficient between attribution heatmaps that correspond to predicting in the two scenarios. Results from individual testing members (smaller circles) and the ensemble mean (bigger circles) are presented.