Precipitation Over Southern Africa: Moisture Sources and Isotopic Composition

Southern Africa, with its vast arid to semiarid areas, is considered vulnerable to precipitation changes and amplifying weather extremes. However, during the last 100 ka, huge lakes existed in the currently dry central Kalahari. It has been suggested that these lakes could have existed due to altered atmospheric circulation patterns, leading to an increase in precipitation or to changes in the annual precipitation distribution. Past climate changes are recorded in paleo‐archives; yet, for a proper interpretation of paleo‐records, for example, from sedimentological archives or fossils, it is essential to put them in context with recent observations. This study’s objective is, therefore, to analyze spatially differing annual precipitation distributions at multiple locations in southern Africa with respect to their stable water isotope composition, moisture transport pathways, and sources. Five different precipitation distributions are identified by end‐member modeling and respective rainfall zones are inferred, which differ significantly in their isotopic compositions. By calculating backward trajectories, different moisture source regions are identified for the rainfall zones and linked to typical circulation patterns. Our results furthermore show the importance of the seasonality, the amount effect, and the traveled distance of the moisture for the general isotopic composition over the whole of southern Africa. The identified patterns and relationships can be useful in the evaluation of isotope‐enabled climate models for the region and are potentially of major importance for the interpretation of stable water isotope composition in paleo‐records in future research.

Studies have shown that rainfall intensity and seasonality dominate over animal grazing intensity in determining the productivity of the grasslands (du Toit et al., 2018).
In historical times, livestock has been a traditional livelihood across southern Africa (Sadr, 2015) and past climate change supposedly triggered large-scale human migration (McLeman, 2014) although this view has recently been questioned (Hannaford, 2020). The prehistoric dispersal of ancient modern humans is commonly related to climate change too (Blome et al., 2012;Timmermann & Friedrich, 2016) and hunters and gatherers appeared in southern Africa not later than a hundred thousand years ago (Henshilwood et al., 2011;Robbins et al., 2016). It was suggested that out-of-Africa migration started in southern Africa and was triggered by a more humid climate opening gateways to East Africa (Rito et al., 2019).
On the other hand, ancient people occupied certain sites such as the Tsodilo Hills in the northern Kalahari over tens of thousands of years (Robbins et al., 2010), indicating that access to water and food was sustainable over long periods. A reconstruction of the paleo-hydrological setting of the Tsodilo site with multiple lake phases concluded that periods with higher precipitation/lower evaporation were more extended than previously assumed (Thomas et al., 2003). In the now dry central Kalahari, huge lakes existed repeatedly during the last 100 ka (Burrough, Thomas, & Bailey, 2009; Riedel et al., 2014; Schmidt et al., 2017). It has been emphasized that, besides climate, tectonic processes may have played a major role, particularly through the redirection of river systems (Carney et al., 1994; Riedel et al., 2014; Ringrose et al., 2005).
Against this background, a considerable number of studies have focused on short- and long-term climate variability for a better understanding of past climate changes (Chevalier & Chase, 2016; Dieppois et al., 2016; Hart et al., 2013; Zhang et al., 2015) and on projections of future climate scenarios (Dunning et al., 2018; Howard & Washington, 2020; Maúre et al., 2018; Mayaud et al., 2017), eventually providing essential information on which political decisions can be based (Department of Environmental Affairs, 2017). The spatiotemporal behavior of the tropical rain belt has received particular attention for explaining major past and potential future precipitation changes over Africa (Collins et al., 2011; Mamalakis et al., 2021; Nicholson, 2009; Schneider et al., 2014).
Southward migration or extension of the tropical rain belt has been related to intensified austral summer precipitation over the Kalahari and the subsequent development of vast lakes in the structural basins of the Kalahari (Burrough, Thomas, & Singarayer, 2009). The behavior of the southern westerlies also plays a major role in the regional climate system (Perren et al., 2020; Figure 2). A northward shift of the westerlies would enhance winter precipitation over large parts of southern Africa (Chase & Meadows, 2007; Cockcroft et al., 1987; Stuut et al., 2004; van Zinderen Bakker, 1976), and the development of ponds and lakes in the pans and basins of the Kalahari during the Last Glacial Maximum has been associated with it (Riedel et al., 2014; Schüller et al., 2018).
Naturally, observational precipitation data from southern Africa are much more comprehensive than proxy data inferred from paleo-archives, and interlinking these research domains remains an enormous challenge. Identifying moisture sources and trajectories and assessing their relevance for precipitation variability over southern Africa (Gimeno et al., 2020; Hewitson et al., 2004; Leketa & Abiye, 2020; Rapolaki et al., 2020) represents a possible bridge between the present and the past, in particular via the analysis of stable water isotopes (Bowen et al., 2018; Munksgaard et al., 2019; Wanke et al., 2018; West et al., 2014). Based on data from Global Network for Isotopes in Precipitation (GNIP) stations, Bowen and Revenaugh (2003) interpolated the isotopic composition of modern meteoric precipitation at the global scale and produced a map with a good resolution for Africa.

Contemporary Climate Setting
Southern Africa is influenced by both tropical and midlatitude circulation systems (Tyson & Preston-Whyte, 2000; Figure 2) and thus the controls on climate variability are complex (Reason et al., 2006). The major regional climate features are the seasonally migrating Intertropical Convergence Zone (ITCZ) and the Congo Air Boundary (CAB; Dieppois et al., 2016; Gasse et al., 2008; Howard & Washington, 2019). The interplay of these systems with circulation features such as the Botswana High and the Angola Low triggers austral summer precipitation over most of southern Africa (Figure 2). The Botswana High interlinks with the South Indian Ocean Anticyclone (Cherchi et al., 2018; Miyasaka & Nakamura, 2010; Xulu et al., 2020), controlling the migration of so-called tropical-temperate troughs (TTTs) over southern Africa. The TTTs originate from the South Indian Convergence Zone (Cook, 2000) and are visible as cloud bands in satellite images (Macron et al., 2014). During austral summer, the TTTs add significantly to the rainfall amount over southern Africa. In addition to the regional circulation patterns, remote teleconnections, in particular with the El Niño-Southern Oscillation (ENSO; Dieppois et al., 2016), also affect southern African precipitation. The strength of the Botswana High is considered to be in phase with ENSO, being stronger during low phases (El Niño) of the Southern Oscillation and weaker during high phases (La Niña; Driver & Reason, 2017; Reason, 2019). During El Niño, the TTTs are shifted east of the continent, which generally leads to droughts over southern Africa (Tyson & Preston-Whyte, 2000). Conversely, La Niña leads to above-normal rainfall because the TTTs shift back over the continent (Tyson & Preston-Whyte, 2000). The drought pattern, however, is not yet fully understood. During a couple of El Niño events, droughts over southern Africa did not materialize (Driver et al., 2019; Pascale et al., 2019). It is debated whether the Angola Low modulates the summer rainfall and eventually prevents drought conditions during certain El Niño events (Crétat et al., 2019; Lyon & Mason, 2007; Pascale et al., 2019). Yet, the role of the Angola Low, for example, for moisture transport from the Southern Atlantic, is understudied, while the role of the Botswana High is better understood (Reason, 2019).
[Figure caption fragment, truncated in the source: … Trabucco and Zomer (2019) showing the dry lands, major rivers, and shaded relief, Global Multi-Resolution Terrain Elevation Data (GMTED; …). DOI: 10.1029/2022JD037005]
Figure 2. Some aspects of the present-day African climate during the (a) austral summer and (b) austral winter (after Driver & Reason, 2017; Gasse et al., 2008; Nicholson, 1996). The sea surface temperatures (SSTs) are shown for July and January 2020 (data source: NASA/JPL, 2020). Seasonality of continental precipitation is shown by percentage of annual precipitation of the summer (DJF) and winter (JJA) months (data source: Fick & Hijmans, 2017) and the rainfall regimes over southern Africa (Chase & Meadows, 2007) are indicated by green (solid + dashed) lines: winter rainfall zone (WRZ), year-round rainfall zone (YRZ), and summer rainfall zone (SRZ). Low-level wind and pressure patterns: Angola heat low (AL), Botswana High (BH), Congo Air Boundary (CAB), Intertropical Convergence Zone (ITCZ), northerly East African monsoon (NEM), South Atlantic Anticyclone (SAA), southerly East African monsoon (SEM), and South Indian Anticyclone (SIA). In section: southern Hadley Cell (SHC). Oceanic surface currents are indicated by temperature: warm (red arrow) and cold (blue arrow).
Total summer precipitation over southern Africa decreases from north to south into the southern Kalahari, and from east to west across the continent (Conway et al., 2015;Hewitson et al., 2004;Jury, 2012), from annually ca. 1,300 mm in central Mozambique (Silva & Matyas, 2014) to less than 50 mm in the Namib Desert (Mendelsohn et al., 2002;Figure 3a). The aridity of southern Africa is primarily caused by the descending dry air of the southern Hadley Circulation (Lu & Vecchi, 2015), but the Namib is also controlled by the cold Benguela Current and the upwelling along the coast of southwestern Africa (van Zinderen Bakker, 1975). Additionally, the precipitation over the eastern coast of southern Africa is enhanced by moisture from the warm Agulhas Current (Nkwinkwa Njouodo et al., 2018;Reason, 2001). A fraction of the moisture of this summer precipitation, penetrating inland from the east, originates in the Southern Ocean and the Southern Atlantic, however (Leketa & Abiye, 2020;Rapolaki et al., 2020).
Austral winter precipitation is controlled by the southern westerlies and falls predominantly in westernmost South Africa and, to a lesser extent, along the southern coast of South Africa (Chevalier & Chase, 2016). The influence of the Antarctic Oscillation on the austral winter precipitation is under discussion (Pohl et al., 2010).
Except for the transitional areas between summer and winter rainfall, and the southern coast of South Africa, seasonality is pronounced, with a stable dual pattern of wet and dry season (Chevalier & Chase, 2016;Tyson & Preston-Whyte, 2000). The Köppen-Geiger classification reflects this very well (Engelbrecht & Engelbrecht, 2016;Peel et al., 2007; Figure 3b).

Precipitation Distribution Groups and Seasonality
To determine different groups of annual precipitation distributions (PDs) in southern Africa, defined for the purposes of this study as Africa south of 15°S, monthly precipitation data (in mm) from WorldClim 2.1 (Fick & Hijmans, 2017) were analyzed using an end-member (EM) modeling algorithm. The identification of distinct annual precipitation patterns and their spatial distribution allows us to frame basic differences in precipitation regimes. The precipitation data were downloaded from http://worldclim.org and represent average monthly climate data for the period 1970-2000. To reduce computing times, a spatial resolution of 10 arcmin (∼340 km², resulting in 25,881 points) was chosen for the classification.
The main PDs underlying the precipitation data set were unmixed and their contribution to the data set was determined using the EM modeling approach of Dietze et al. (2012) with the R package EMMAgeo (Dietze & Dietze, 2019). On the one hand, a high goodness of fit (r²) for variables and samples is desirable; on the other hand, the interpretation should be kept as simple as possible (Weltje, 1997). This suggests choosing a model with high coefficients of determination (R²) but a low number of EMs. After these considerations, a robust 3-EM model was identified as the best choice.
For a differentiated consideration of the results according to the seasonality of precipitation, after prior examination of the different identified PD groups, the months of the year were classified at each location into three categories: precipitation season, off-season, and transition times (both from season to off-season and from off-season to season). While months of the precipitation season are defined by >10% of annual precipitation (P_a), off-season months have <5% P_a and months during the transition times have 5%-10% P_a.
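The three-way month classification can be sketched in a few lines of Python. This is only an illustration of the thresholds stated above; the function name and the example precipitation values are hypothetical and do not come from the study's data or code. Months falling exactly on 5% or 10% are assigned to the transition class here, following the stated 5%-10% range.

```python
def classify_months(monthly_precip_mm):
    """Label each of 12 months by its share of annual precipitation (P_a)."""
    if len(monthly_precip_mm) != 12:
        raise ValueError("expected 12 monthly values")
    annual = sum(monthly_precip_mm)
    if annual <= 0:
        raise ValueError("annual precipitation must be positive")
    labels = []
    for p in monthly_precip_mm:
        share = 100.0 * p / annual  # percent of annual precipitation
        if share > 10.0:
            labels.append("season")
        elif share < 5.0:
            labels.append("off-season")
        else:
            labels.append("transition")  # 5%-10% of P_a, bounds included
    return labels


# Made-up summer-rainfall-type year (January to December), in mm.
example = [120, 100, 80, 40, 10, 5, 5, 5, 15, 40, 80, 100]
labels = classify_months(example)
```

For the made-up year above, the months with more than 10% of the annual total are labeled as precipitation season, the dry mid-year months as off-season, and the two shoulder months as transition.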

Water Samples and Stable Isotope Analysis
Our compiled database includes 852 monthly samples (composites from the total precipitation during one calendar month) during the period 1958-2013 from the eight GNIP (IAEA/WMO, 2021) stations in southern Africa (Figure 4; Table S1 in Supporting Information S1) and 232 self-collected water samples (50 precipitation, 166 rivers, 9 lakes, 6 springs, and 1 ocean) across southern Africa (Figure 4), collected during the period 2016-2021.
The self-collected samples were analyzed for stable isotope ratios of oxygen (¹⁸O/¹⁶O) and hydrogen (²H/¹H) using a PICARRO L1102-i isotope analyzer, which is based on the wavelength-scanned cavity ring down spectroscopy (WS-CRDS) technique (Gupta et al., 2009). Calibration of the measurements was done by linear regression with the standard calibration materials Vienna Standard Mean Ocean Water (VSMOW), Standard Light Antarctic Precipitation (SLAP), and Greenland Ice Sheet Precipitation (GISP) from the IAEA. The ratios are expressed in the conventional delta notation (δ¹⁸O, δ²H) in per mil (‰) relative to VSMOW as defined by Craig (1961b) and Gonfiantini (1978). A total of six replicate injections were performed for each sample and the arithmetic mean and standard deviation (1 sigma) were calculated, resulting in a reproducibility of the replicate measurements of generally better than 0.1‰ for oxygen and 0.5‰ for hydrogen. Values for the second-order isotope parameter deuterium excess (d-excess) for both data sets were calculated using the formula d = δ²H − 8 δ¹⁸O (Dansgaard, 1964).
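As a minimal sketch of the two calculations described above, the replicate statistics and the d-excess formula could be written as follows. The function names and the replicate values are hypothetical. The sample standard deviation (n − 1 denominator) is an assumption, since the text only states "1 sigma".

```python
import statistics


def summarize_replicates(values):
    """Arithmetic mean and standard deviation of replicate injections.

    ASSUMPTION: sample standard deviation (n - 1); the text only says 1 sigma.
    """
    return statistics.mean(values), statistics.stdev(values)


def d_excess(delta_2h, delta_18o):
    """Deuterium excess d = delta2H - 8 * delta18O (Dansgaard, 1964), in per mil."""
    return delta_2h - 8.0 * delta_18o


# Made-up replicate injections for one sample (per mil vs. VSMOW).
d18o_mean, d18o_sd = summarize_replicates([-5.02, -4.98, -5.00, -5.01, -4.99, -5.00])
d2h_mean, d2h_sd = summarize_replicates([-30.2, -29.8, -30.0, -30.1, -29.9, -30.0])
d = d_excess(d2h_mean, d18o_mean)
```

With these made-up replicates, the means are about −5.0‰ and −30.0‰, giving a d-excess of about 10‰, and both standard deviations fall within the reproducibility limits quoted in the text.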
For further analysis, only samples with values for both δ¹⁸O and δ²H for a given month/site were used. Samples from the same location and day were combined and their means used in the statistical analysis. As the respective precipitation amounts are unknown, arithmetic means were used instead of the usually preferred amount-weighted means.
To test for significance of differences among the isotopic compositions of the different PD groups and among their precipitation seasonality, an analysis of variance (ANOVA) with post hoc Tukey HSD test (Tukey, 1949) was used in R (R Core Team, 2019).
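To illustrate the first step of this test, the one-way ANOVA F statistic can be computed by hand as the ratio of between-group to within-group mean squares. The sketch below is plain Python and not the R workflow used in the study. It omits the p-value and the Tukey HSD post hoc comparison, and the function name and numbers are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more samples than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up values standing in for three groups of isotope measurements.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom indicates that at least one group mean differs. Identifying which pairs differ is the job of the post hoc test.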

Lagrangian Moisture Source Diagnostic
To identify and compare the moisture sources of the collected water samples around southern Africa, backward trajectories were calculated with the LAGRANTO model, using wind fields (u, v, w) from ERA-5 (Hersbach et al., 2020), the latest three-dimensional reanalysis of the European Centre for Medium-Range Weather Forecasts (ECMWF), available hourly with a horizontal grid spacing of 31 km on 137 vertical levels from the surface up to 0.01 hPa. Backward trajectories were calculated, starting at the specific sampling locations and at 18 different pressure levels (1,013.25 hPa, and from 1,000 to 200 hPa in steps of 50 hPa). Depending on the type of water sample, multiple back trajectories were calculated for different timespans: precipitation, 1 day; streamlet, 7 days; upper river, 14 days; middle river, 21 days; lower river or lake, 28 days (for the sake of computation time). Trajectories were started every 3 hr for the 1 day and 7 days timespans and every 6 hr for timespans of 14 days and longer (to reduce computation time). The springs and ocean samples were excluded from the trajectory analysis. Thus, a total of 203 different target locations and times were used. Additionally, to analyze differences over time at one place, the station at Pretoria (South Africa) was chosen, as it has comparably few gaps in the data set. For the period from 1996 to 2001, every 6 hr, backward trajectories were calculated from the GNIP station location, based on preprocessed ERA-5 data with a grid spacing of 0.5°.
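The trajectory start configuration described above can be written down compactly. The following sketch only encodes the numbers given in the text and is not LAGRANTO input syntax. The names are hypothetical, and the count of start times assumes that the end of the timespan is excluded, which the text does not specify.

```python
# Sample type -> (timespan in days, start interval in hours), as given in the text.
TRAJECTORY_SETUP = {
    "precipitation": (1, 3),
    "streamlet": (7, 3),
    "upper river": (14, 6),
    "middle river": (21, 6),
    "lower river or lake": (28, 6),
}


def start_pressure_levels_hpa():
    """1,013.25 hPa plus 1,000 to 200 hPa in steps of 50 hPa."""
    return [1013.25] + [float(p) for p in range(1000, 150, -50)]


def n_start_times(sample_type):
    """Number of start times per sample.

    ASSUMPTION: the end point of the timespan is not counted.
    """
    days, step_hr = TRAJECTORY_SETUP[sample_type]
    return days * 24 // step_hr


levels = start_pressure_levels_hpa()
```

The list of start levels has 18 entries, matching the number of pressure levels stated in the text.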
All trajectories were calculated 10 days (240 hr) backward in time and specific humidity, among other variables, was recorded along the trajectories at 1 hr intervals. The trajectory length of 10 days was chosen as the maximum timescale on which integrity of the traced air parcel can be assumed (Pfahl & Wernli, 2008) and, according to the analysis of Nieto and Gimeno (2019), also seems to be an appropriate integration time for the region of southern Africa. Trajectories for our collected samples were calculated inside the box bounded by 30.0°W, 70.0°E, 50.0°S, and 10.0°N, as the fraction of moisture sources outside this box is minimal (<2% for GNIP-Pretoria, according to trajectory calculations for this location).
To identify evaporative moisture sources along the trajectories, a method for moisture source attribution after Sodemann et al. (2008) was used. A moisture uptake is attributed to surface evaporation along the trajectory if it occurs within the atmospheric boundary layer. We also used the scaling factor of 1.5 for the boundary layer height (BLH), which was recommended by Sodemann et al. (2008), because the BLH tends to be underestimated by models (Zeng et al., 2004). Furthermore, only those trajectories exceeding a relative humidity threshold of 80% at the location of sample collection are selected for the analysis, as it is assumed that clouds are then existent, and precipitation is likely to occur.
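The two selection criteria in this paragraph can be sketched as follows. This is a strongly simplified illustration and not the full attribution scheme of Sodemann et al. (2008), which involves additional bookkeeping along the trajectory that is not reproduced here. All names are hypothetical. The use of parcel height in meters, the evaluation of the boundary layer test at the end of each time step, and the default minimum uptake of zero are assumptions.

```python
def boundary_layer_uptakes(q, parcel_height_m, blh_m, blh_scale=1.5, min_uptake=0.0):
    """Flag moisture uptakes along one trajectory.

    Inputs are ordered forward in time (oldest point first, arrival last).
    Returns a list of (index, dq, in_boundary_layer) for every step with dq > min_uptake.
    ASSUMPTION: the boundary layer test uses the parcel position at the end of the step.
    """
    uptakes = []
    for i in range(1, len(q)):
        dq = q[i] - q[i - 1]  # change in specific humidity over the step
        if dq > min_uptake:
            in_bl = parcel_height_m[i] <= blh_scale * blh_m[i]
            uptakes.append((i, dq, in_bl))
    return uptakes


def passes_rh_threshold(rh_at_arrival_percent, threshold=80.0):
    """Keep only trajectories exceeding the relative humidity threshold on arrival."""
    return rh_at_arrival_percent > threshold


# Made-up trajectory: specific humidity (g/kg), parcel height (m), BLH (m).
q = [4.0, 4.5, 4.5, 5.5, 5.0]
height = [500.0, 800.0, 2500.0, 2600.0, 1500.0]
blh = [1000.0, 1000.0, 1000.0, 1200.0, 1000.0]
events = boundary_layer_uptakes(q, height, blh)
```

In this made-up case, the first uptake occurs below the scaled boundary layer height and would be attributed to surface evaporation, while the second occurs above it and would not.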
For statistical analysis, the evaporation locations along each trajectory were weighted by their contribution to the precipitation of that specific trajectory at the target location. The trajectories for each target location were then weighted by their contribution to the cumulated precipitation at the specific target location. The contribution of each trajectory was measured by its negative change in specific humidity −dq in the last timestep, with only trajectories with dq < 0 taken into account (Sodemann et al., 2008). Finally, for every traced variable, a weighted mean value over the identified source regions was calculated for further statistical analysis.
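The trajectory weighting described above amounts to a weighted mean with weights −dq, restricted to trajectories that lose moisture in the last time step. A minimal sketch, with hypothetical names and made-up numbers, could look like this:

```python
def trajectory_weights(dq_last_step):
    """Weight each trajectory by -dq in the last time step; dq >= 0 gets weight 0."""
    return [-dq if dq < 0 else 0.0 for dq in dq_last_step]


def weighted_mean(values, weights):
    """Weighted mean of a traced variable over all contributing trajectories."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("no trajectory contributes to precipitation")
    return sum(v * w for v, w in zip(values, weights)) / total


# Made-up example: three trajectories arriving at one target location.
dq_last = [-0.4, 0.1, -0.1]            # g/kg change in the last time step
source_latitude = [-20.0, -35.0, -30.0]  # traced variable per trajectory
weights = trajectory_weights(dq_last)
mean_source_latitude = weighted_mean(source_latitude, weights)
```

The second trajectory gains moisture in its last step and therefore does not contribute, so the result is dominated by the first trajectory.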
To analyze whether the landcover at the source locations shows an impact on the isotopic composition of the precipitation, landcover classes along the trajectories were extracted from the Global Landcover 2000 database (Mayaux et al., 2003; resolution: 1 km at the equator). In order to analyze the continental isotope effect, the shortest path distance to the coast was calculated in ArcGIS Pro (ESRI, 2018). Additionally, we used the approach of Walter and Lieth (1960) to define humid and arid months, where arid months are defined as those with the monthly precipitation P in mm representing less than twice the average monthly temperature T_a in degrees Celsius (i.e., P < 2 T_a), and months are classified as humid when the precipitation exceeds twice the average temperature (P > 2 T_a). This classification was also based on the WorldClim 2.1 climate data of monthly precipitation (mm) and, additionally, the average near-surface temperature (°C), with a spatial resolution of 30 arcsec (∼1 km²).
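The humid/arid criterion can be expressed directly in code. The sketch below uses a hypothetical function name. Because the text defines only P < 2 T_a and P > 2 T_a, the case of exact equality is returned as a separate label rather than assigned to either class.

```python
def classify_month_aridity(precip_mm, temp_c):
    """Walter and Lieth (1960) style criterion: arid if P < 2*T_a, humid if P > 2*T_a."""
    if precip_mm < 2.0 * temp_c:
        return "arid"
    if precip_mm > 2.0 * temp_c:
        return "humid"
    return "boundary"  # P == 2*T_a is not covered by the stated definition


# Made-up monthly values: (precipitation in mm, mean temperature in degrees Celsius).
labels = [classify_month_aridity(p, t) for p, t in [(10.0, 20.0), (60.0, 20.0), (40.0, 20.0)]]
```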
For exploration of the relationships between the measured isotope values and diagnosed water vapor source conditions, Pearson correlation coefficients and linear regression models were determined, assuming Gaussian distributions for all variables. Here, only samples were used for which moisture uptakes below the BLH have been attributed for more than 60% of the final precipitation (Ra > 0.6), to get reliable statements about the relationship between isotope ratios and moisture source conditions, which is only possible if sources can be detected for the greater part of the precipitation (Pfahl & Wernli, 2008).
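The filtering and correlation step can be sketched as below. The dictionary keys, function names, and sample values are hypothetical and chosen only to illustrate the Ra > 0.6 criterion followed by a Pearson correlation. The actual analysis was carried out on the study's own variables.

```python
import math


def filter_by_attribution(samples, min_fraction=0.6):
    """Keep samples whose attributed moisture fraction Ra exceeds the threshold."""
    return [s for s in samples if s["ra"] > min_fraction]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Made-up samples: Ra, delta18O (per mil), traveled distance (km).
samples = [
    {"ra": 0.90, "d18o": -2.0, "dist_km": 500.0},
    {"ra": 0.70, "d18o": -4.0, "dist_km": 1000.0},
    {"ra": 0.50, "d18o": -1.0, "dist_km": 3000.0},  # dropped: Ra <= 0.6
    {"ra": 0.65, "d18o": -6.0, "dist_km": 1500.0},
]
kept = filter_by_attribution(samples)
r = pearson_r([s["dist_km"] for s in kept], [s["d18o"] for s in kept])
```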
To assess the relative importance of meteorological variables for the stable isotope data, we applied the machine learning Random Forest (RF) regression algorithm, using the cforest() function of the R package party (Hothorn et al., 2006; Strobl et al., 2007, 2008). This function was chosen because our predictor variables are of different types, so that the options to construct unbiased RF and to calculate the variable importance following the permutation principle could be used. As some of the predictor variables are furthermore highly correlated, the conditional importance was applied. Again, if the sample size allowed it, only samples with Ra > 0.6 were analyzed. To ensure stable variable importance, after trial of different numbers, a large number of trees (ntree = 10,000) was set (Behnamian et al., 2017). With this approach, at least the top five variables showed a stable ranking over multiple model runs. Furthermore, the number of variables per level for the chosen number of trees was optimized by tuning in terms of the out-of-bag (OOB) RMSE. The RF analysis represents an objective way to choose the covariate setting for multiple regression in order to gain maximum multiple regression information and, in contrast to the regression analysis, RF also allows including categorical variables in the importance analysis.

Precipitation Distribution Groups
The unmixing of the PDs resulted in a model with three robust EMs (spatial distribution in Figure S1 in Supporting Information S1) explaining 64% (mean R² = 0.64 between the original and modeled data) of the data variance (Figure S2 in Supporting Information S1).
Based on this 3-EM model, five different PDs were identified for southern Africa, which cluster in space and match the spatial distribution of the percentage of summer precipitation over southern Africa (Figure 2). The corresponding inferred rainfall zones (RFZ) have been named according to their main precipitation season and location (Figure 5):
• a summer rainfall zone in the western part (SRZw = EM1),
• a summer rainfall zone in the eastern part (SRZe = EM3),
• a mixed summer rainfall zone (SRZm) at the intersection of SRZw and SRZe,
• a winter rainfall zone (WRZ = EM2), and
• a year-round rainfall zone (YRZ) at the intersections of SRZw and/or SRZe with WRZ.
The assignment (cf., decision tree in Figure S3 in Supporting Information S1) of the raster points to each RFZ is shown in Figure 5a. The average PDs for each RFZ (Figures 5b1-5b5) were determined by the mean monthly precipitation amount relative to the annual precipitation. In the SRZw, YRZ, and WRZ, during March, there is a remarkably high variation in the precipitation fraction, which is caused by outliers from the western and southern part of Namibia. The PDs of the RFZ differ in their general distribution as summarized in Table 1.

Stable Water Isotopes
The isotopic data of our water samples are shown in Figure 6 plotted against the global meteoric water line (GMWL: δ²H = (8.17 ± 0.06) δ¹⁸O + (10.35 ± 0.65) with r² = 0.99; Craig [1961a], refined by Rozanski et al., 1993). Most of our collected samples plot roughly along the GMWL and AMWL and are within the range of the samples of the eight GNIP stations from southern Africa (long-term statistics in Table S2 in Supporting Information S1). Samples that plot below are mainly from lakes or the larger rivers (Figure 6). There are also quite a few precipitation samples from the GNIP data set in the top right quadrant of the plot that deviate more strongly from the GMWL. These are mostly precipitation samples from the WRZ during the off-season of precipitation (Figure S4 in Supporting Information S1).
Note. For YRZ, there is no precipitation seasonality, and the fraction of the annual precipitation (P_a) is listed under transition time.
The following analysis regarding the isotopic compositions of the RFZs could only be done for the SRZw, SRZe, and WRZ. The YRZ and SRZm had to be excluded from the statistical analysis due to insufficient sample size. There are no GNIP stations in these regions, and none (YRZ) or only a few (SRZm; n = 6) of the collected precipitation samples belonged to these groups.
A comparison of the δ¹⁸O frequency distributions of the precipitation samples from the RFZs shows that these differ in median and variation (Figure 7a). SRZw differs from SRZe and WRZ and shows a comparably greater range and lower mean. Within the RFZ, there are also differences among the precipitation seasons (season, off-season, and transition times; Figure 7b). In the SRZ region, the isotopic ranges are slightly greater during the precipitation season compared to the off-season or transition times, while in the WRZ, the off-season samples show a greater range than the transition times or seasonal samples.
This general picture is supported by ANOVA and post hoc Tukey test concerning the δ¹⁸O means (with 95% family-wise confidence level). The δ¹⁸O group mean of SRZw differs significantly from the means of the SRZe and WRZ (Table S3 in Supporting Information S1) and in all three RFZ the δ¹⁸O values of seasonal rainfall differ significantly from those of off-season rainfall (Table S4 in Supporting Information S1). Additionally, in the SRZw, the season and transition times differ significantly, while in the SRZe it is the off-season and the transition times that differ significantly. In the WRZ, the δ¹⁸O means are significantly different for all seasons (two-sided with p < 0.05).
Figure 8 shows the varying signature in d-excess among the precipitation samples from the three analyzed RFZs for the different precipitation seasons in order to assess the potential impact of moisture source conditions. A varying d-excess in water samples might indicate moisture of different sources, for example, from a warmer ocean versus a cooler ocean or colder high latitudes (Leketa et al., 2018). However, the ranges of the d-excess values among the RFZs and among the precipitation seasons do not differ significantly, except for samples from the WRZ during off-season, which are most scattered.

Moisture Sources
The identified moisture sources for the different RFZ during the precipitation season are shown in Figure 9, and the proportions from the different areas are compiled in Table 2. Moisture source areas for off-season and the transition times as well as respective information about the proportions from the different sources are given in Figures S5 and S6 and Table S5 in Supporting Information S1.
The SRZw (Figure 9a) is the region with the highest moisture uptake above the continent (87.5%) during the precipitation season, mainly above northern Namibia (22%), northern Botswana (11.4%), southern Angola (25.6%), and southern Zambia (11.6%). The transport pattern shows three main directions: from the Atlantic Ocean with the low- and midlevel westerlies mainly along the southern and western coast of Africa, with the low-level south-easterly trade winds from the Indian Ocean south of Madagascar (Agulhas Current), and with the midlevel north-easterly trade winds from above the Congo basin.
The main moisture uptake during the precipitation season for the SRZe (Figure 9b) is located over the continent (56.6%), mainly above eastern South Africa, and over the Indian Ocean (35.0%). The majority of the trajectories come from the Atlantic Ocean with the low- and midlevel westerlies along the southern and eastern coast of Africa, but there are also a few coming with the low-level south-easterly trade winds from the Indian Ocean south of Madagascar.
For the SRZm (Figure 9c), the main moisture uptake during the precipitation season is located over the southeastern continent (>50%) above northern South Africa, eastern Botswana, and southern Zimbabwe and Mozambique as well as over the Indian Ocean (30.7%) and the Mozambique Channel (14.7%) near the eastern coast of Africa. The majority of the trajectories come from the Atlantic Ocean with the low- and midlevel westerlies along the southern and western coast of Africa, but there are also quite a few coming from the Indian Ocean with the low-level south-easterly trade winds and the northerly East African monsoon along the eastern coast.
These estimates of continental recycling appear to be high compared to previous estimates for other African regions (e.g., Arnault et al., 2021), which might point to potential biases in the Lagrangian source analysis. However, a previous intercomparison study between different methodologies by Winschall et al. (2014) has shown that, at least for one case study, our trajectory method yields similar results to other numerical water tagging approaches and that the methodological uncertainties associated with the latter can also be large.
Note. YRZ is not included, due to an insufficient number of samples. The African continent is furthermore subdivided into regions south of 15°S (southern Africa) and further north. Madagascar was merged with continental Africa because the contribution was very small (<0.4%).
In the WRZ (Figure 9d), the main moisture uptake during the precipitation season originates over the South Atlantic Ocean (>90%) west of South Africa. The main transport pattern is with the low-level westerlies from the Atlantic Ocean.
For the YRZ, there is just one sample for this subgroup (here: precipitation season = month with >10% of P a ), thus this RFZ was not analyzed as results would not have been representative.
In summary, during the precipitation season, the SRZw has the main moisture source region above the north-western region of southern Africa, while the SRZe and SRZm have their main moisture sources above the eastern region of southern Africa and additionally from the Indian Ocean near the western coast south of Madagascar. The WRZ has almost all the moisture sources from the Atlantic Ocean.
To additionally analyze differences between the precipitation seasons at one place, moisture source regions have been determined for the precipitation at Pretoria for a period of 5 years. The main moisture source regions (southeastern continental Africa and Indian Ocean) and transport pathways (westerlies along the southern and eastern coast of Africa) are similar within the three precipitation seasons; however, during the main precipitation season, there are also trajectories coming from the northern Indian Ocean along the East African coast ( Figure S7 in Supporting Information S1). These results match with the source regions and transport pathways identified from the "spatial data set" for SRZe (Figure 9 and Figures S5 and S6 in Supporting Information S1). The main uptake above the African continent is concentrated on the southern part of southern Africa (south of 15°S) for all three precipitation seasons and is lowest during the transition times, while the uptake above the Indian Ocean is highest during these times (Table 3). The proportion of oceanic and continental uptake is similar for the precipitation season and the off-season. However, during the precipitation season, the uptake above southern Africa is a bit higher compared to the off-season, as is the uptake above the Indian Ocean.

Conditions at Moisture Sources and During Transport
Various variables were recorded along the backward trajectories to assess whether these have an impact on the final isotopic composition. Correlation and simple linear regression analysis were used to explore the statistical relationships between the measured isotope values and these variables of the diagnosed moisture source and sink conditions. Here, we focused in particular on the variables representing the isotope effects described by Dansgaard (1964): latitude effect, altitude effect, continental effect, amount effect, and seasonal effect. The results of the analysis are shown in Tables 4 and 5.

Table 4 Summary of Regression Parameters of Selected Variables, Representing the Isotope Effects, on the Collected Precipitation Samples (With Ra > 0.6)

Note. Variables are values at the target location (sink = Snk), weighted mean values of the trajectories at the target location (wmSnk), or the difference between the weighted mean source value and the value at the target location (wmSrc to Snk). Signif. codes: **0.001; *0.01.

For our collected samples, analyses show the importance of the amount effect and the traveled distance of the moisture. A multivariate regression with precipitation amount and traveled distance as independent variables showed a multiple R² of 0.63 (adjusted R² = 0.58; p-value = 0.00063) for δ¹⁸O, and for δ²H a multiple R² of 0.54 (adjusted R² = 0.48; p-value = 0.0028). For the GNIP data set, analyses also showed traces of the amount effect and of the traveled distance of the moisture. A multivariate regression with precipitation amount and traveled distance as independent variables showed a multiple R² of 0.33 (adjusted R² = 0.27; p-value = 0.012) for δ¹⁸O, and for δ²H a multiple R² of 0.32 (adjusted R² = 0.25; p-value = 0.015).
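The adjusted R² values reported above follow from the multiple R² through the standard correction for the number of predictors. The short sketch below reproduces that arithmetic. The sample sizes are assumptions, not values stated for the regressions themselves: n = 18 is taken from the number of collected precipitation samples with Ra > 0.6, and n = 25 from the number of GNIP Pretoria samples with Ra > 0.6, both mentioned in the text.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 of a linear model fitted to n samples with p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("n must be larger than p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Two predictors: precipitation amount and traveled distance.
# The sample sizes n are assumptions (see the paragraph above).
cases = [
    ("collected samples, d18O", 0.63, 18),
    ("collected samples, d2H", 0.54, 18),
    ("GNIP Pretoria, d18O", 0.33, 25),
]

for label, r2, n in cases:
    print(f"{label}: adjusted R2 = {adjusted_r2(r2, n, p=2):.2f}")
```

Under these assumed sample sizes, the correction gives 0.58, 0.48, and 0.27, in agreement with the adjusted values quoted in the text.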
To gain insight into the relative importance of the traced variables for predicting the stable isotope data, RF analysis was performed on all available variables, including the weighted means of the variables recorded along the trajectories at the evaporation source locations, the weighted means of the variables at the sink (sample) locations, and general variables at the sink location.
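As an illustration of how such weighted-mean predictors can be formed, the following minimal sketch averages a variable over the moisture uptake points of one sample. It assumes that each uptake point is weighted by its fractional contribution to the precipitation at the sink; the weighting scheme, the field names (`rh`, `contribution`), the numbers, and the sign convention of the source-to-sink difference are all hypothetical and are not taken from the study.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean; weights need not sum to one."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical uptake points diagnosed along the trajectories of one sample:
# relative humidity (%) at the uptake location and the assumed fractional
# contribution of that uptake to the precipitation at the sink.
uptakes = [
    {"rh": 80.0, "contribution": 0.50},
    {"rh": 60.0, "contribution": 0.30},
    {"rh": 70.0, "contribution": 0.20},
]

wm_src_rh = weighted_mean(
    [u["rh"] for u in uptakes],
    [u["contribution"] for u in uptakes],
)

snk_rh = 55.0  # hypothetical relative humidity at the sink (sample) location
src_to_snk = wm_src_rh - snk_rh  # assumed sign convention: source minus sink

print(f"wmSrc RH = {wm_src_rh:.1f} %, wmSrc to Snk = {src_to_snk:.1f} %")
```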
For our collected samples, including samples of all analyzed water types (with Ra > 0.6; n = 117), the top 10 importance scores of the variables for the prediction of δ¹⁸O, δ²H, and d-excess are shown in Figure 10 (top plots: a-c). It was not possible to analyze a subset of only the precipitation samples due to the small sample size (n = 30 or n (Ra > 0.6) = 18) of the subset. In all three models, the RFZ of sampling was among the most important variables. For the δ¹⁸O and δ²H models, the main water type (precipitation, river, and lake) and the month of the sampling were also among the most important variables. Furthermore, for δ¹⁸O and δ²H, the top variable in the importance ranking had a score about 30%-50% higher than the second most important variable.
For the GNIP station at Pretoria (n = 50), the top 10 importance scores of the variables for the prediction of δ¹⁸O, δ²H, and d-excess are shown in Figure 10 (bottom plots: d-f). It was not possible to analyze a subset of only the samples with Ra > 0.6 due to the small sample size (n (Ra > 0.6) = 25) of the subset.
For δ¹⁸O and δ²H, the models show almost the same top 10 variables with only some variation in the order of importance; the top three, however, have the same ranking. The most important variable in these two models is the weighted mean fraction of moisture uptake at the source locations that contributes to the precipitation at the sampling location. The second and third most important variables are the weighted mean relative humidity at the source location and the total amount of precipitation at the sample location.
For d-excess, the most important predictor variable is the convective precipitation at the sampling location, followed by the large-scale precipitation at the sampling location with a score a bit more than half as large.

Classification of Rainfall Zones
Our results are generally in line with previous studies. Rainfall seasonality in southern Africa has commonly been classified into three zones: summer, winter, and year-round (e.g., Chase & Meadows, 2007; Dieppois et al., 2016; Tyson & Preston-Whyte, 2000). Our findings allow a further differentiation of the summer rainfall zone into western (SRZw), eastern (SRZe), and transitional (SRZm) zones, as suggested by Liebmann et al. (2012), who studied the spatial and temporal distribution of rainfall by means of harmonic analysis. With a focus on South Africa, certain studies subdivided the SRZ into early, mid, late, and very late precipitation (Botai et al., 2018; EUMETSAT, 2011; Garnas et al., 2016). The early + midsummer and late + very late summer rainfall zones roughly match our SRZe and SRZm, respectively. Regarding Botswana, Maruatona and Moses (2021) considered the onset and length of the rainfall season to define three zones, which roughly match our results. It has to be emphasized, however, that the cited zonings are based on mesoscale analyses, while our classification considers macroscale rainfall variation. Moreover, our approach does not depend on a rather subjective definition of the onset and termination of the precipitation season, and no long time series are needed.

Note. Variables are values at the target location (sink = Snk), weighted mean values of the trajectories at the target location (wmSnk), or the difference between the weighted mean source value and the value at the target location (wmSrc to Snk). Latitude and elevation effect were not considered, as there is only one sampling location. Signif. codes: **0.001; *0.01.

Figure 10. ... (Strobl et al., 2009). Snk = sink (sample location); wmSnk = weighted mean values at sink; wmSrc = weighted mean values at source locations. Detailed meaning of the variables is given in Table S6 in Supporting Information S1.

Spatial and Temporal Differences in the Isotopic Compositions
It is a novel finding that the δ¹⁸O group mean of SRZw is significantly different from the means of SRZe and WRZ, while the latter two rainfall zones do not differ significantly from each other. There were not enough data for corresponding analyses of SRZm and YRZ. It is noteworthy that, due to the high number of GNIP samples in our study, these dominate the data analysis, while our own precipitation samples have relatively little influence. Our results indicate that within the different RFZ, the isotopic compositions differ significantly between seasonalities (Figure 7b). Isotope values are on average lighter during rainfall seasons compared to off-season and transitional periods. This is consistent with a global analysis of isotopic seasonality in precipitation (Feng et al., 2009) as well as with regional studies from northern Namibia (Wanke et al., 2018) and South Africa (de Wet et al., 2020). Isotope values were highest from the end of the dry season until the onset of the rainy season. The values varied strongly during the early rainy season and were less variable and lightest during the rainfall peak.

Moisture Source Regions
The identified rainfall zones have different predominant moisture source regions. The moisture for SRZw comes mainly from the north-western subcontinent, while SRZe and SRZm receive moisture mainly from the eastern continent and to about one third from the Indian Ocean. The Mozambique Channel contributes 3-4 times more moisture to SRZm than to SRZw and SRZe. The WRZ receives moisture almost exclusively from the Southern Atlantic Ocean. A global scale analysis by Gimeno et al. (2010) identified the Southern Atlantic Ocean, the Indian Ocean (especially above the Agulhas Current), and tropical southern Africa (only during JJA) as main source regions, thus partly matching our results. However, Gimeno et al. (2010) identified only those regions as main moisture sources where the vertically integrated moisture flux divergence (E − P) reached a certain threshold. For Namibia, Kaseke et al. (2016), based on gradients in isotope values, identified either both the Atlantic and Indian Oceans as source regions or the Indian Ocean alone, depending on the isoscape method used. Miralles et al. (2016) suggested that the Kalahari moisture largely originates from the continent and that recycling of the precipitation is highly relevant. This is in line with our results. They calculated that three quarters of the evaporation during the main growing season (∼our main precipitation season) originates from transpiration.

Impact of Moisture Source and Transport on Stable Isotope Ratios
The differences in the moisture sources of the three analyzed RFZ are also evident in the d-excess versus δ¹⁸O plot. The points from the WRZ are most clustered (except during off-season, when outliers with negative d-excess are most likely associated with evapoconcentration). This is because the Atlantic Ocean is the only source of moisture here and because oceanic values generally do not vary much (ocean source regions of our study are rather uniform in δ¹⁸O, ranging from −1‰ to +1‰ [LeGrande & Schmidt, 2006]). Furthermore, the source conditions are apparently quite stable in terms of relative humidity and SST. Points from SRZe and SRZw are somewhat more scattered, possibly due to the multiple and different moisture sources (Leketa et al., 2018) from above the African continent and the Indian Ocean (for SRZe). In particular, the recycling of moisture originating from the evaporation of surface waters, as well as ocean evaporation into dry air, increases the d-excess because of kinetic isotope fractionation. However, local subcloud evaporation of the precipitation can also decrease the d-excess values due to evapoconcentration (Froehlich et al., 2008; Stewart, 1975), which is visible in the off-season plot in the WRZ (Figure S4 in Supporting Information S1).
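For readers less familiar with the quantity, d-excess can be computed from a pair of isotope values with the standard definition d = δ²H − 8 × δ¹⁸O (Dansgaard, 1964). The sketch below applies it to two hypothetical samples; the numbers are illustrative only and are not taken from the data set of this study.

```python
def d_excess(delta2h: float, delta18o: float) -> float:
    """Deuterium excess in permil: d = d2H - 8 * d18O."""
    return delta2h - 8.0 * delta18o


# Hypothetical isotope values in permil (not from the study).
samples = {
    "rain close to the global meteoric water line": (-30.0, -5.0),
    "evaporatively enriched rain": (5.0, 2.0),
}

for name, (d2h, d18o) in samples.items():
    print(f"{name}: d-excess = {d_excess(d2h, d18o):+.1f} permil")
```

The second sample shows how evapoconcentration, which enriches δ¹⁸O more strongly relative to the slope of 8, can push d-excess to negative values.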
Several processes during transport and precipitation have been shown to influence the isotopic composition. In addition to the source conditions, the traveled distances between sources and sinks, as well as the seasonality and the amount of rainfall, principally control the values in our rainfall zones. The temperature effect is negligible for the tropics and subtropics, as was also demonstrated for Pretoria (Gat et al., 2001) and Johannesburg (Leketa & Abiye, 2020). The relevant role of the amount effect in our rainfall zones is indicated by the high correlation of isotope values with the total precipitation during the rainy season. Our data also support the general assumption that different traveled distances between sources and sinks lead to discriminable isotope values, predominantly due to preferential rainout of ¹⁸O along a trajectory. That is, the depletion of the moisture in heavy isotopes depends significantly on travel/rainout time. This is particularly supported by the data set of our collected samples. Although this study compares isotopes in precipitation across southern Africa from a broad perspective, it has to be noted that our precipitation samples are relatively few and somewhat geographically clustered. Local studies on isotopes in precipitation from, for example, Cape Town, South Africa (Harris et al., 2010) enlarge the data set locally but do not change the general pattern.
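The rainout argument above can be illustrated with a simple Rayleigh distillation model, in which the remaining vapor becomes progressively depleted in ¹⁸O as condensate is removed. This is a textbook idealization and not the method used in this study; the initial vapor composition and the fractionation factor (α ≈ 1.0098, roughly the equilibrium liquid-vapor value for ¹⁸O near 20°C) are assumed values chosen for illustration.

```python
def rayleigh_vapor_delta(delta0: float, f: float, alpha: float) -> float:
    """
    Delta value (permil) of the remaining vapor under Rayleigh rainout.

    delta0: initial vapor delta value in permil
    f:      fraction of the initial vapor still remaining (0 < f <= 1)
    alpha:  liquid-vapor fractionation factor (> 1 for heavy isotopes)
    """
    if not 0.0 < f <= 1.0:
        raise ValueError("f must be in the interval (0, 1]")
    return (delta0 + 1000.0) * f ** (alpha - 1.0) - 1000.0


# Assumed illustrative values, not taken from the study.
DELTA0_18O = -12.0   # initial vapor d18O in permil
ALPHA_18O = 1.0098   # approximate equilibrium factor near 20 degrees C

for remaining in (1.0, 0.8, 0.6, 0.4, 0.2):
    delta = rayleigh_vapor_delta(DELTA0_18O, remaining, ALPHA_18O)
    print(f"remaining vapor fraction {remaining:.1f}: d18O = {delta:.2f} permil")
```

In this idealization, a longer travel or rainout time corresponds to a smaller remaining vapor fraction and hence to lower δ¹⁸O in the vapor and in the precipitation formed from it.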
The RF analysis showed that the distinction between the water types (precipitation, river, lake, and spring) is crucial for considering the influence of moisture source and moisture transport on the stable isotope ratios. Isotope values of small rivers are, however, very similar to those of precipitation. When analyzing d-excess, in contrast, the type of water does not seem to play a role. In general, spatial and temporal distributions of the water samples have to be considered in more detail for a comprehensive analysis. This applies to both isotopic ratios and d-excess. Consequently, the data set needs to be considerably enlarged to obtain a more complete picture.
The RF analysis of the GNIP data set of Pretoria showed that the most important factors are the evaporation rate and the humidity at the sources as well as the total amount of precipitation at the sink. Furthermore, the humidity gradient above the ocean determines the importance of nonequilibrium effects during evaporation (Pfahl & Wernli, 2008) and should thus mainly affect d-excess and less the individual ratios. For d-excess, however, we found the most important variables to be the amount of convective precipitation and large-scale precipitation. This importance could potentially be explained by corresponding differences in subcloud evaporation between the precipitation types (Aggarwal et al., 2016).

Summary and Conclusions
In this study, we have analyzed stable isotope data of water samples from across southern Africa and their moisture sources. We identified five different annual PDs in southern Africa that cluster in space and inferred respective RFZs: a WRZ, three summer rainfall zones (SRZw, SRZe, and SRZm), and a YRZ.
However, due to sampling numbers, only the isotopic compositions of the SRZw, SRZe, and WRZ could be further analyzed. The isotopic composition of rainfall from the SRZw had the largest range and differed significantly from that of the WRZ and the SRZe. We also identified significant seasonal differences within the three analyzed RFZ, as the mean isotope composition during the main precipitation season was significantly lower than during the off-season.
Our Lagrangian analysis suggested that these spatial differences are associated with notably different moisture source regions. While the moisture for SRZw originates mainly from the continent, the SRZe moisture sources are both the continent and the Indian Ocean, and the origin of the moisture that precipitates in the WRZ is the South Atlantic Ocean alone. These findings about the moisture source regions are important, as regions with only one or two main source regions might become more strongly affected by a changing water cycle due to climate change than regions where the moisture originates from multiple sources (Gimeno et al., 2010).
For the general isotopic composition across southern Africa, the amount effect and the traveled distance of the moisture were found to be important. For the precipitation over Pretoria, analyses also showed traces of the amount effect and the importance of the traveled distance of the moisture. The variable importance analysis by RF furthermore pointed to the spatial and temporal distribution of the samples as well as the water type as important variables. However, the conclusiveness of this statistical RF analysis is limited because of the limited sample size. Future analyses should therefore be based on even more isotopic measurements of precipitation from all RFZ and from all precipitation seasons. Yet, our findings enhance the understanding of the stable isotopes in the atmospheric water cycle over southern Africa and can be useful in the evaluation of isotope-enabled climate models for the region.
Our results provide important insights for the interpretation of paleo-data in southern Africa, as knowledge of the different precipitation regimes, their source areas, and resulting variations in the isotopic composition are important factors for paleo-data interpretation.

Data Availability Statement
The monthly precipitation data used for the identification of the rainfall zones as well as the average near-surface data used for the Walter and Lieth (1960) approach in the study are available at WorldClim 2.1 (Fick & Hijmans, 2017) via http://worldclim.org. All isotope data from GNIP stations (IAEA/WMO, 2021) used in this study are available at the WISER portal (Water Isotope System for Data Analysis, Visualization and Electronic ...

Acknowledgments
Financial support by the Deutsche Forschungsgemeinschaft (DFG-GZ: RI 809/34-2 and HA 4368/3-2) is gratefully acknowledged. Many thanks go to Elisha M. Shemang from the Botswana International University of Science & Technology, who supported the water sampling, and to Florian Cordt for the trajectory calculations for the GNIP-Pretoria station. The authors furthermore thank Annette Rudolph for advice on Random Forest analyses and Arne Ramisch for discussions on end-member modeling and compositional data. Finally, we thank the two anonymous reviewers for their thorough examination of our manuscript and many helpful comments for further improvements. Open Access funding enabled and organized by Projekt DEAL.