Contrasting Drought Propagation Into the Terrestrial Water Cycle Between Dry and Wet Regions

Drought's intensity and duration have increased in many regions over the last decades. However, the propagation of drought‐induced water deficits through the terrestrial water cycle is not fully understood at a global scale. Here we study responses of monthly evaporation (ET) and runoff to soil moisture droughts occurring between 2001 and 2015 using independent gridded datasets based on machine learning‐assisted upscaling of satellite and in‐situ observations. We find that runoff and ET show generally contrasting drought responses across climate regimes. In wet regions, runoff is strongly reduced while ET is decoupled from soil moisture decreases and enhanced by sunny and warm weather typically accompanying soil moisture droughts. In drier regions, ET is reduced during droughts due to vegetation water stress, while runoff is largely unchanged as precipitation deficits are typically low in these regions and ET decreases are buffering runoff reductions. While these water flux drought responses are controlled by the large‐scale climate regimes, they are additionally modulated by local vegetation characteristics. Land surface models capture the observed water cycle responses to drought in the case of runoff, but not for ET where the ET deficit (surplus) is overestimated (underestimated), related to a misrepresentation of the general soil moisture‐evaporation interplay. In summary, our study illustrates how the joint analysis of machine learning‐enhanced Earth observations can advance the understanding of global eco‐hydrological processes, as well as the validation of land surface models.

, and lead to a decrease of runoff. Runoff is indispensable for aquatic ecosystems and for food and energy production through irrigation (Destouni et al., 2013), and it can be strongly and quickly reduced under drought, challenging freshwater management (Fuentes et al., 2022; Orth & Destouni, 2018). In addition, soil moisture drought can trigger ET anomalies which may further affect the land water balance in general and runoff in particular. Even though ET from soils and other surfaces is also relevant, ET consists mainly of plant transpiration. Therefore, ET anomalies are primarily associated with vegetation functioning, including that of agricultural crops and natural vegetation. In wet regions, drier-than-usual soil moisture conditions are typically accompanied by abnormally high temperature and radiation, which induce ET surpluses (Orth & Destouni, 2018). By contrast, in dry regions, the soil becomes too dry under drought to satisfy vegetation water demand, which likely induces reductions in vegetation productivity and growth (Mishra & Singh, 2010). ET anomalies induce anomalies in the regional atmospheric water content, which can be transported into downwind regions and cause remote water balance effects (Hoek van Dijke et al., 2022; Schumacher et al., 2022). Understanding the propagation of soil moisture drought into the terrestrial water cycle can help to identify areas of particularly high vulnerability to water deficit propagation. Moreover, this study can inform potential drought mitigation measures, for example, irrigation or dam regulations, to maintain the ecosystem's environmental and socio-economic services.
Land surface models (LSMs) simulate ET and runoff which are essential water fluxes that feed back to climate extremes such as droughts (Pitman, 2003). LSMs typically incorporate a suite of physical processes through parameterizations for simulating ET and runoff in normal or extreme conditions. Therefore, the models are inherently uncertain given (a) potentially ignored mechanisms and processes under droughts (Ukkola et al., 2016b), (b) inaccurate or simplified representations of drought-related processes (Clark et al., 2015;Warren et al., 2015), and (c) inaccurate parameterizations which can be related to calibration of the model against observations of only a single hydrological variable (Dembélé et al., 2020;Koppa et al., 2019). Previous studies have found relatively good performance of ET simulations from LSMs in terms of normal conditions, but ET during water-stressed conditions is not properly reproduced (Berg & Sheffield, 2018;Best et al., 2015;Ukkola et al., 2016b). This is likely related to the inaccurate soil moisture controls on vegetation, and related to the neglect of sub-surface flow variations such as groundwater which then feeds back to ET (Destouni & Verrot, 2014;Ghajarnia et al., 2021). Similarly, LSMs can capture total runoff based on high-quality precipitation forcing data (Fallah et al., 2020;Zhou et al., 2012), but some of the models provide a poor simulation of drought propagation into runoff, related to too-fast drought recovery due to an oversensitivity to precipitation (Van Loon et al., 2012). Different from physically-based LSMs, machine learning-based datasets are directly derived from observations and do not require physical assumptions in their models, and hence are new opportunities to assess the uncertainty of LSMs. 
Machine learning-based soil moisture, evapotranspiration, and runoff datasets benefit from the growing suite of satellite-based Earth observations and in-situ measurements, as well as from machine learning techniques that have recently become available and can be used to understand land surface drought responses (e.g., Jung et al., 2019;O & Orth, 2021;Ghiggi et al., 2021). Machine learning algorithms can learn complex relationships between ground measurements and meteorological conditions to extrapolate the ground measurements to unobserved regions using globally available meteorology data (Papale & Valentini, 2003;Tramontana et al., 2016). Machine-learning-driven ET and soil moisture products are less affected by uncertainties in vegetation rooting depths and water uptake depths which are required in physically-based models.
In this study, we detect droughts based on minimum warm-season soil moisture between 2001 and 2015 in each grid cell across global vegetated areas, using the SoMo.ml dataset (O & Orth, 2021). We then analyze the related responses of ET from FLUXCOM and of runoff from the G-RUN ensemble (Ghiggi et al., 2021). We thereby benefit from the recent availability of all major land water state and flux variables as state-of-the-art, observation-based, machine learning-extrapolated global datasets. We determine ET and runoff anomalies during drought development and recovery periods, and then investigate their relationship with climate regimes, vegetation types, soil characteristics, topography and human activities. Further, we compare the drought propagation into ET and runoff between observation-based data and an ensemble of state-of-the-art LSMs from the TRENDY ensemble. In this context, we additionally employ a hybrid hydrological model which aims to combine the flexibility of machine learning with physical constraints and is hence in between data-driven and process-based modeling approaches (Kraft et

Observation-Based Data
An overview of all employed datasets is presented in Table 1. The three main focused variables of this study, ET, runoff and soil moisture, represent water fluxes and storage, and are obtained from global gridded datasets which upscale in-situ measurements using machine learning techniques, and are independent from each other in terms of their model schemes. We use SoMo.ml soil moisture to detect drought periods (O & Orth, 2021). SoMo.ml comes with 0.25° spatial resolution and covers the period from 2000 to 2019. It distinguishes three soil layers (0-10 cm, 10-30 cm, 30-50 cm), and we use a depth-weighted average of the three layers in this study. The underlying Long Short-Term Memory machine learning model has been demonstrated to learn relationships between in-situ soil moisture measurements and meteorological data, and to extrapolate soil moisture dynamics to unobserved regions (O & Orth, 2021;.
We use ET from FLUXCOM to analyze its anomalies during drought conditions. FLUXCOM ET is based on ensembles of machine learning approaches for upscaling in-situ measurements from FLUXNET eddy covariance towers. We use the FLUXCOM ET product which is based exclusively on remote-sensing data to model ET, and is therefore independent of meteorological and soil moisture data. FLUXCOM ET from the remote sensing setup provides 8-daily data at 0.0833° spatial resolution from 2001 to 2015.
We use runoff from the G-RUN ensemble dataset to analyze the respective anomalies during drought (Ghiggi et al., 2021). The G-RUN ensemble product comes at 0.5° spatial resolution and covers the time period 1902 to 2019. It is the second version of G-RUN (Ghiggi et al., 2019) and uses random forests and global meteorological data to upscale a comprehensive dataset of international in-situ runoff observations. G-RUN ensemble runoff has been shown to compare well with independent observations from large river basins and with other runoff products over the period 1982-2010 (Ghiggi et al., 2021). In summary, the input data used to produce the machine-learning datasets of soil moisture, ET, and runoff are largely independent (Table S1 in Supporting Information S1). While SoMo.ml soil moisture and G-RUN runoff share the ERA5 climate forcing, G-RUN additionally uses 20 other precipitation and air temperature datasets, and SoMo.ml uses skin temperature instead of air temperature as an input variable. Note that, in addition to the largely independent input data, the standard procedures of validation and evaluation of these ML-based datasets against different reference datasets and in-situ measurements ensure the applicability of these data streams in our analyses.
We validate our main analysis in two ways: (a) To complement our drought analysis based on soil moisture deficits in the top 50 cm of the soil, we additionally use terrestrial water storage from GRACE which also includes deep soil moisture and groundwater dynamics to detect drought (Landerer & Swenson, 2012;Swenson & Wahr, 2006); (b) To validate the ET drought responses detected with FLUXCOM's global gridded data, we use flux tower ET measurements based on the eddy covariance method and obtained from the FLUXNET2015 dataset. We calculate monthly ET anomalies by removing long-term trends and mean seasonal cycles after the quality control and gap filling (Pastorello et al., 2020). We focus on 39 sites with more than 8 years of continuous data since 2001.

Land Surface Modeled Data
We compare our observation-based results with drought responses simulated by state-of-the-art LSMs from the TRENDY v7 ensemble. All three variables, ET, runoff and soil moisture, are derived in monthly resolution from each TRENDY model: CABLE-POP, CLM5.0, ISAM, JSBACH, JULES, LPJ-GUESS, LPX, ORCHIDEE-CNP, and VISIT. In particular, we use simulations from Scenario 3 which fully account for changes in CO2, climate and land use (Le Quéré et al., 2018; Sitch et al., 2015). To be consistent with observation-based data, we convert the units of soil moisture, ET and runoff from LSMs from originally kg m−2, kg m−2 s−1, and kg m−2 s−1 to mm, mm/day, and mm/day, respectively. The TRENDY models provide data at different spatial resolutions ranging from 0.5° to 2°, such that we downscale the coarser outputs to 0.5° spatial resolution to match the observational datasets. The downscaling is done by assigning each 0.5° grid cell the value of the surrounding lower-resolution grid cell.
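The unit conversion and the value-replicating downscaling can be illustrated with a minimal sketch. It assumes a water density of 1000 kg m−3 (so that 1 kg m−2 corresponds to a 1 mm layer) and an integer ratio between the coarse and fine grid spacing; the function names and the example values are hypothetical and not taken from the TRENDY processing chain.

```python
SECONDS_PER_DAY = 86_400


def flux_to_mm_per_day(flux_kg_m2_s):
    """Convert a water flux from kg m-2 s-1 to mm/day.

    Assumes 1 kg of water over 1 m2 equals a 1 mm layer, so only the
    time unit has to be rescaled.
    """
    return flux_kg_m2_s * SECONDS_PER_DAY


def replicate_to_finer_grid(coarse, factor):
    """Copy each coarse cell value into a factor x factor block of finer cells."""
    fine = []
    for row in coarse:
        # Repeat every value `factor` times along the longitude axis.
        fine_row = [value for value in row for _ in range(factor)]
        # Repeat the widened row `factor` times along the latitude axis.
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


# Illustrative ET values (kg m-2 s-1) on a 2-degree grid.
coarse_et = [[1.0e-5, 2.0e-5],
             [3.0e-5, 4.0e-5]]

# 2 degrees -> 0.5 degrees means each coarse cell becomes a 4 x 4 block.
fine_et = replicate_to_finer_grid(coarse_et, factor=4)
fine_et_mm_day = [[flux_to_mm_per_day(v) for v in row] for row in fine_et]
```

Replicating values does not add spatial information; it only places all models on a common grid so that they can be compared cell by cell with the 0.5° observation-based data.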
In addition, we use ET, runoff and soil cumulative water deficit simulated from the hybrid hydrological model, H2M, which combines physical process representations with machine learning algorithms (Kraft et al., 2022). The model consists of a simple hydrological scheme, which represents the water storage of snow, soil cumulative water deficit, and groundwater, and ensures the conservation of water across compartments. It uses a recurrent neural network to generate spatio-temporally varying parameters which are derived by calibration against FLUXCOM ET, G-RUN runoff, GRACE terrestrial water storage, and GLOBSNOW snow water equivalent. Thus, the model is data-driven yet physically constrained by the water balance equations which improves the performance beyond physical model-based LSMs.

Table 1
Overview of Employed Datasets

Data category | Variables | Dataset | Description | Reference
Modeled soil moisture and water fluxes | Soil moisture (with different depths in Table S2 in Supporting Information S1); Evaporation; Runoff | TRENDY v7 | Output from physically-based land surface models | Le Quéré et al. (2018) and Sitch et al. (2015)
Soil moisture and water fluxes from the hybrid modeling | Soil cumulative water deficit; Evaporation; Runoff | H2M | Output from the hybrid hydrological model | Kraft et al. (2022)
Water fluxes and ancillary data from the eddy covariance measurements | Evaporation; Runoff; Temperature; Net radiation; Precipitation | FLUXNET2015 | Water fluxes and ancillary in-situ data derived using the eddy covariance method | Pastorello et al. (2020)
Dynamic ancillary data | Terrestrial water storage | GRACE | Global water storage anomalies relative to a mean storage | Landerer and Swenson (2012) and Swenson and Wahr (2006)
Dynamic ancillary data | Temperature; Precipitation; Solar radiation; VPD | CRU-JRA v2.0 | Climate forcing data for land surface models | Harris et al. (2014) and Kobayashi et al. (2015)
Static ancillary data

Auxiliary Data
To study the meteorological conditions associated with drought, we use 0.5°-resolution ERA5-Land meteorological data including 2-m air temperature, short-wave incoming solar radiation (hereafter "solar radiation"), precipitation, and vapor pressure deficit (VPD) (Muñoz-Sabater et al., 2021). The ET and runoff responses to drought are analyzed across different climate regimes which we characterize by the aridity index. We calculate aridity in the traditional way, as the ratio of long-term equivalent ET to long-term precipitation. Equivalent ET (in mm) is estimated by multiplying long-term average net radiation by the inverse of the latent heat of vaporization (Budyko et al., 1974; Orth & Destouni, 2018). Higher aridity values denote drier climate conditions. The aridity index used in the flux tower ET analysis is calculated using flux tower net radiation and precipitation. To understand the multifaceted controls of the spatial patterns of ET and runoff anomalies at drought peaks, we consider a range of land surface characteristics listed in Table 1, including variables related to climate (aridity index), vegetation (tree cover fraction, anisohydricity index, and leaf area index (LAI)), topography obtained at original 250 m resolution (medians and standard deviations of elevation, slope, roughness and aspect for each 0.5° grid cell), soil type (fractions of silt, clay and sand), and human activities (population and irrigation density).
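As a minimal sketch of the aridity calculation, the ratio of equivalent ET to precipitation can be written as follows. The constant latent heat of vaporization (2.45 MJ kg−1), the 365-day year, the input units (net radiation in W m−2, precipitation in mm per year), and the function names are assumptions made for illustration.

```python
LATENT_HEAT_VAPORIZATION = 2.45e6   # J kg-1, assumed constant
SECONDS_PER_YEAR = 365 * 86_400     # assumed 365-day year


def equivalent_et_mm_per_year(net_radiation_w_m2):
    """Water depth (mm/yr) that the long-term mean net radiation could evaporate."""
    energy_j_m2_per_year = net_radiation_w_m2 * SECONDS_PER_YEAR
    # Dividing by the latent heat gives kg m-2 per year, i.e. mm per year.
    return energy_j_m2_per_year / LATENT_HEAT_VAPORIZATION


def aridity_index(net_radiation_w_m2, precipitation_mm_per_year):
    """Ratio of equivalent ET to precipitation; higher values mean drier climates."""
    return equivalent_et_mm_per_year(net_radiation_w_m2) / precipitation_mm_per_year


# Illustrative grid cell: 100 W m-2 net radiation and 500 mm/yr precipitation.
example_aridity = aridity_index(100.0, 500.0)
```

With these illustrative inputs the index exceeds 1, which places the cell in the water-limited (dry) regime discussed in the results.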

Data Processing
Our analysis focuses on gridded global datasets with 0.5° spatial resolution and monthly time steps. Data from daily products are aggregated to the monthly time scale by calculating averages across all days of each month. The study time period is 2001-2015 as constrained by the concurrent availability of all relevant datasets. Our analyses focus on grid cells where the fraction of total vegetation cover from 2001 to 2015 is higher than 5% to exclude non-or low-vegetated areas such as deserts and lakes. When studying ET, runoff and other hydro-climate conditions under drought, we focus exclusively on anomalies, which are obtained by removing the mean seasonal cycles and long-term trends. After extracting all variables from the time period between 2001 and 2015, mean seasonal cycles are calculated using 15-year data for specific months, and long-term trends are derived by using a locally-weighted smoothing filter with a window size of 40% of the time series length.
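The anomaly calculation can be sketched for a single grid cell as below. This is an illustration under stated assumptions, not the processing code of the study: the seasonal cycle is removed before the trend (the order is not specified above), the series starts in January and has at least three values, and the smoother is a simplified locally weighted linear fit with tricube weights and no robustness iterations. All function names are hypothetical.

```python
def remove_seasonal_cycle(series, period=12):
    """Subtract the multi-year mean of each calendar month."""
    climatology = []
    for month in range(period):
        values = series[month::period]
        climatology.append(sum(values) / len(values))
    return [v - climatology[t % period] for t, v in enumerate(series)]


def lowess_trend(series, frac=0.4):
    """Simplified locally weighted linear smoother (tricube weights)."""
    n = len(series)
    k = min(n, max(3, round(frac * n)))          # points used in each local fit
    trend = []
    for i in range(n):
        lo = max(0, min(i - k // 2, n - k))      # window of k points, clipped to the series
        idx = range(lo, lo + k)
        h = max(i - lo, lo + k - 1 - i)          # distance to the farthest point in the window
        w = [(1 - (abs(j - i) / h) ** 3) ** 3 for j in idx]
        sw = sum(w)
        mx = sum(wj * j for wj, j in zip(w, idx)) / sw
        my = sum(wj * series[j] for wj, j in zip(w, idx)) / sw
        sxx = sum(wj * (j - mx) ** 2 for wj, j in zip(w, idx))
        sxy = sum(wj * (j - mx) * (series[j] - my) for wj, j in zip(w, idx))
        slope = sxy / sxx if sxx > 0 else 0.0
        trend.append(my + slope * (i - mx))      # local line evaluated at time step i
    return trend


def monthly_anomalies(series, frac=0.4):
    """Remove the mean seasonal cycle, then the smoothed long-term trend."""
    deseasonalized = remove_seasonal_cycle(series)
    trend = lowess_trend(deseasonalized, frac)
    return [d - t for d, t in zip(deseasonalized, trend)]


# Illustrative 15-year monthly series that contains only a repeating seasonal cycle.
cycle = [1.0, 2.0, 4.0, 7.0, 9.0, 10.0, 9.5, 8.0, 6.0, 4.0, 2.0, 1.0]
anomalies = monthly_anomalies(cycle * 15)
```

For a series that consists only of a repeating seasonal cycle, the resulting anomalies are zero up to rounding, which is a quick sanity check before applying the procedure to real data.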

Drought Detection
We study ET and runoff responses to drought within vegetation growing seasons, so that we remove monthly time periods where the temperature from ERA5-Land is lower than 5°C. We then select the most extreme drought event for each grid cell based on the lowest soil moisture value in our study period from 2001 to 2015 using observation-based and land surface modeling data, respectively. We analyze ET and runoff anomalies at these drought peak months, and additionally focus on the development and recovery periods by considering the 3 months before and after (Orth & Destouni, 2018).
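A minimal sketch of the peak selection for one grid cell follows. It assumes monthly lists of soil moisture and air temperature of equal length; the function name and the example values are hypothetical. Whether cold months inside the 3-month window are retained is not specified above, and this sketch keeps them.

```python
def find_drought_peak(soil_moisture, temperature_c, min_temperature_c=5.0, half_window=3):
    """Return the index of the driest warm month and the surrounding window.

    Months colder than `min_temperature_c` are excluded from the peak search.
    """
    warm_months = [t for t, temp in enumerate(temperature_c) if temp >= min_temperature_c]
    if not warm_months:
        return None, []
    peak = min(warm_months, key=lambda t: soil_moisture[t])
    start = max(0, peak - half_window)
    end = min(len(soil_moisture) - 1, peak + half_window)
    return peak, list(range(start, end + 1))


# Illustrative series: the lowest value (index 1) falls in a cold month and is skipped.
sm = [0.30, 0.10, 0.28, 0.25, 0.20, 0.15, 0.22, 0.27, 0.29, 0.30]
temp = [2.0, 3.0, 8.0, 12.0, 15.0, 18.0, 16.0, 10.0, 6.0, 1.0]
peak_index, window = find_drought_peak(sm, temp)
```

ET and runoff anomalies would then be read at `peak_index` for the peak response and across `window` for the development and recovery periods.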
We also study the drought duration which is defined as (a) the drought development period starting when soil moisture is decreasing below the seasonal mean (=start of dry anomaly) until drought peak, and (b) the drought recovery period which extends from drought peak until the soil moisture is for the first time above the seasonal mean again (=end of dry anomaly). By distinguishing long and short drought duration for the development and recovery periods, we can better understand the role of drought types on regulating water fluxes responses.
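The two duration definitions can be sketched as follows, assuming a monthly soil moisture anomaly series (deviation from the seasonal mean) and a known peak index. The function name is hypothetical, and returning `None` when the series ends before recovery is a choice made for this sketch.

```python
def drought_durations(sm_anomaly, peak):
    """Return (development, recovery) lengths in months around a drought peak.

    Development: months from the start of the dry anomaly to the peak.
    Recovery: months from the peak to the first month with a positive anomaly,
    or None if the series ends before soil moisture recovers.
    """
    start = peak
    while start > 0 and sm_anomaly[start - 1] < 0:
        start -= 1                      # walk back while the anomaly stays negative
    end = peak
    while end < len(sm_anomaly) - 1 and sm_anomaly[end] <= 0:
        end += 1                        # walk forward until the anomaly turns positive
    recovered = sm_anomaly[end] > 0
    return peak - start, (end - peak) if recovered else None


# Illustrative anomaly series with its peak at index 4.
anomaly = [0.2, 0.1, -0.1, -0.3, -0.6, -0.4, -0.1, 0.05, 0.2]
development, recovery = drought_durations(anomaly, peak=4)
```

Splitting events at, for example, the median development or recovery length would then give the long and short duration subsets.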
In the analysis of flux tower ET located in the northern hemisphere, since soil moisture is not always measured at each site, we use a cumulative water deficit index (CWD) to detect drought peaks. First, we remove monthly time periods where the eddy tower temperature is lower than 5°C. CWD is calculated by accumulating site-measured precipitation (P) and ET for each year (Yu et al., 2022): where t indicates the monthly time step. The initial value of CWD is set to zero. Potential data gaps are filled with ET from the GLEAM v3.5a product and precipitation from (a) 0.25°-resolution gridded ERA5 data and (b) machine-learning downscaled precipitation product (Besnard et al., 2019). CWD is reset to zero at the end of each year to close the annual water balance.
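Since the CWD equation is not shown here, the following sketch assumes one common formulation, CWD_t = min(0, CWD_{t−1} + P_t − ET_t), in which the deficit is negative and capped at zero. The sign convention, the cap, and the assumption that the series starts in January with complete calendar years may differ from Yu et al. (2022); the function name is hypothetical.

```python
def cumulative_water_deficit(precip, et, months_per_year=12):
    """Accumulate monthly P - ET into a deficit that never exceeds zero.

    Assumed form: CWD_t = min(0, CWD_{t-1} + P_t - ET_t), starting from zero
    and reset to zero at the start of every year.
    """
    cwd = []
    current = 0.0
    for t, (p, e) in enumerate(zip(precip, et)):
        if t % months_per_year == 0:
            current = 0.0               # yearly reset closes the annual water balance
        current = min(0.0, current + p - e)
        cwd.append(current)
    return cwd


# Illustrative monthly values in mm.
precip = [50.0, 20.0, 10.0, 80.0]
et = [30.0, 60.0, 40.0, 30.0]
cwd = cumulative_water_deficit(precip, et)
drought_peak = min(range(len(cwd)), key=lambda t: cwd[t])   # month with the largest deficit
```

Under this convention the drought peak is the month with the most negative CWD.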

Attribution Analysis
Attribution analysis is conducted to understand the spatial patterns of ET and runoff anomalies associated with the drought peaks. For this purpose, we train random forests to model ET and runoff anomalies at drought peaks, respectively, across all global grid cells with several ancillary land surface data (see Table 1). We use cross-validation to ensure a useful model performance, with an out-of-bag R² higher than 0.5 (Breiman, 2001). Then we evaluate the relevance of individual variables using the Shapley Additive Explanations (SHAP) attribution method, which is a robust explainable machine learning method (Lundberg & Lee, 2017). SHAP is a game-theoretic approach to explain the output of the random forest model by accounting for contributions of individual variables to the overall prediction. To understand the most important controls of ET responses to drought, we calculate SHAP values to quantify the marginal contribution of each predictor to the target variable ET, and rank the variable importance by the sum of absolute contributions across all grid cells. To understand the most important controls of runoff responses to drought, we repeat the attribution analysis with runoff anomalies in place of ET anomalies as the target. When studying spatial patterns of ET anomalies under drought we also use runoff drought anomalies as predictors, and vice versa.
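The idea behind the Shapley-based ranking can be illustrated with an exact computation for a small toy model. This sketch is not the SHAP library and not a random forest: `toy_model`, its coefficients, the zero baseline, and the sample values are hypothetical stand-ins, and features absent from a coalition are simply replaced by baseline values.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley contribution of each feature to model(x) - model(baseline)."""
    n = len(x)

    def value(coalition):
        # Features outside the coalition are set to their baseline value.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def rank_predictors(model, samples, baseline, names):
    """Rank predictors by the sum of absolute Shapley contributions over all samples."""
    importance = [0.0] * len(names)
    for x in samples:
        for i, contribution in enumerate(shapley_values(model, x, baseline)):
            importance[i] += abs(contribution)
    return sorted(zip(names, importance), key=lambda item: item[1], reverse=True)


def toy_model(z):
    """Hypothetical stand-in for a trained model of ET anomalies."""
    aridity, tree_cover, vpd_anomaly = z
    return -2.0 * aridity + 1.0 * tree_cover - 0.5 * vpd_anomaly


names = ["aridity", "tree_cover", "vpd_anomaly"]
samples = [[1.0, 0.5, 2.0], [0.5, 1.0, 0.0]]      # two illustrative grid cells
ranking = rank_predictors(toy_model, samples, [0.0, 0.0, 0.0], names)
```

Exact enumeration scales exponentially with the number of predictors, which is why tree-specific approximations are used in practice for random forests with many predictors.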

Detecting Soil Moisture Droughts
The months and years when the driest soil moisture values are detected across the globe are shown in Figure 1. Regions over Africa, the Middle East and Greenland are excluded due to the sparse vegetation. The month-of-year of drought peak occurrence varies across latitudes: In the northern hemisphere, drought occurs from June to October as a consequence of the interplay of limited water input and higher ET in summer and autumn months. Near the equator, drought rather occurs from January to May, corresponding to the meteorological dry seasons in northern South America, Central Africa, India and Southeast Asia. In the Southern Hemisphere, drought also occurs mostly in meteorological dry months, for example, from July to December in the Amazon, except for the southern parts of South America, South Africa and Australia. Drought peak months across Australia are variable and are modulated by the local climate regimes (Peel et al., 2007). Spatial patterns of drought years show more heterogeneity than the month-of-year results, while larger clusters correspond well with drought events reported in previous studies, such as the 2003 European drought (Fink et al., 2004), the 2010 western Russian drought (Barriopedro et al., 2011), the 2012 Midwest drought in the United States (Rippey, 2015), and the 2010 Amazon drought (Lewis et al., 2011).

Water Cycle Response to Drought in Observation-Based Data
The global distributions of ET and runoff anomalies at soil moisture drought peaks are shown in Figure 2. Compared to long-term average conditions, ET shows both increases and decreases under drought (Figure 2a). Strongly positive ET anomalies are found in the high latitudes and the tropics, while strongly negative ET anomalies occur mostly in the subtropics and mid-latitudes. Negative ET anomalies are larger in an absolute sense and more widespread than positive ET anomalies. By contrast, runoff anomalies are negative during drought peaks across most of the globe with the strongest negative values located in the Amazon and Asian tropics (Figure 2b). Exclusively focusing on ET and runoff negative anomalies which can affect regional ecosystems and also socio-economic systems, we find that ET negative anomalies are slightly stronger than runoff negative anomalies across the globe (Figure 2c), and the preferential propagation of soil moisture deficits into runoff in northern Europe confirms results from a previous study (Orth & Destouni, 2018).
Latitudinal patterns of ET and runoff anomalies in Figure 2d present two peaks of ET surpluses in boreal regions around 65°N and around the equator. These regions are typically wet ( Figure S1a in Supporting Information S1) and energy-limited (Denissen et al., 2021;W. Li et al., 2021) such that even during periods with soil moisture deficits, the soil moisture content is sufficient to sustain plant photosynthesis and associated transpiration (O . Further, soil moisture drought in high latitudes is typically accompanied by sunny and warm weather that benefits boreal ecosystem productivity, which is often limited by low temperatures and waterlogging (Ohta et al., 2014). Similarly, tropical regions are also wet and often have limited radiation supplies (W. Li et al., 2021). Interestingly, runoff anomalies show opposite patterns to ET anomalies in boreal and tropical regions as a consequence of severe negative anomalies of soil moisture ( Figure S1b in Supporting Information S1) and ET surpluses (Condon et al., 2020;Zhao et al., 2022). In low latitudes around 0°-40°N and 15°S-35°S, ET reductions typically exceed runoff reductions which indicates a considerable green-water vulnerability to drought in these areas ( Figure 2d). Although subtropical regions show mostly low runoff reductions, some sub-regions such as southern China and eastern South America exhibit much stronger runoff decreases where their unique topography could influence rainfall-infiltration processes.

Understanding the Observed Water Cycle Response to Drought
Next, we perform an attribution analysis to understand the controlling factors of the spatial patterns of ET and runoff drought responses shown in Figures 2a and 2b. We find that tree cover fraction, VPD anomalies, runoff anomalies and aridity are the four most important predictors for the ET responses to soil moisture drought (Figure S2a in Supporting Information S1). Higher VPD is associated with higher ET deficits at drought peaks, because plants close stomata to prevent water loss when VPD is high (Fu et al., 2022; Novick et al., 2016). Similarly, aridity and tree cover fraction are also found to be main controls of the spatial patterns of runoff responses to soil moisture drought (Figure S2b in Supporting Information S1), together with precipitation and soil moisture anomalies. ET anomalies also play a role in regulating the spatial variability of runoff anomalies at drought peaks, where negative relationships between ET and runoff anomalies are expected, as available precipitation is partitioned into both fluxes, and ET reductions could buffer runoff deficits. We note that such an attribution analysis can only reveal plausible land surface characteristics controlling the water cycle drought response, but it cannot detect actual causal relationships. Further, many of the variables identified as controls of the spatial patterns of the ET and runoff drought responses are not employed in the derivation of the ET and runoff products. This means that our attribution results are not an artifact of the derivation of the data products.
After identifying aridity and tree cover fraction as major modulators of the ET and runoff drought responses, we investigate these two drivers further in Figure 3 by grouping the global ET and runoff drought responses (Figures 2a and 2b) according to classes of aridity and tree cover fraction. Figure 3 confirms systematic gradients of ET and runoff drought anomalies across aridity and tree cover classes. ET increases in low aridity (wet) regions where vegetation is not limited by water availability and ET is enhanced by drought-associated increases in atmospheric water demand (Green et al., 2020). ET decreases during soil moisture droughts in dry regions where aridity is higher than 1 (Figure 3a). In these regions, water availability often limits vegetation functioning even under normal conditions (O et al., 2022). Higher ET surpluses (or lower deficits) are found in regions with abundant tall vegetation. This can be explained as follows: (a) tall trees likely have deeper-reaching roots to access deeper soil moisture and groundwater (Stocker et al., 2023), or (b) they have better water-saving strategies during pre-drought periods, such that they can benefit more from the drought-related radiation and temperature increases to enhance transpiration. Note that results for regions with aridity greater than 4 are potentially uncertain, because runoff estimates have large uncertainty in extremely dry regions (Ghiggi et al., 2021) and fewer in-situ soil moisture observations are available there (O et al., 2022); however, the results barely change when these regions are excluded (Figure S3 in Supporting Information S1).
Different from ET, runoff responses to drought show the strongest deficits in very wet regions with high tree cover. This is related to severe soil moisture deficits which are then amplified by the concurrent ET surpluses, leaving a smaller fraction of available water for runoff (Figure S1 in Supporting Information S1). Additionally, the precipitation deficits in wet regions are typically larger than in dry regions (Figure S4c in Supporting Information S1). Also, in areas with dense tree cover, more precipitation water is likely intercepted, enhancing ET and decreasing the water amount available for runoff (Owens et al., 2006). Figure 3 displays median ET and runoff anomalies, but we note that the variability of ET and runoff anomalies within each aridity-tree cover class is substantial (Figures S5 and S6 in Supporting Information S1). This is related to land surface heterogeneity and the influences of other controls of the water cycle drought response (Figure S2 in Supporting Information S1).
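The grouping behind such class medians can be sketched as follows. The class boundaries are hypothetical, since the actual aridity and tree cover classes are defined in Figure 3 and not in the text, and the input lists stand for per-grid-cell values.

```python
from bisect import bisect_right
from statistics import median

ARIDITY_EDGES = [0.5, 1.0, 2.0, 4.0]      # hypothetical class boundaries
TREE_COVER_EDGES = [0.25, 0.5, 0.75]      # hypothetical class boundaries (fraction)


def class_medians(aridity, tree_cover, anomaly):
    """Median anomaly for each (aridity class, tree cover class) combination."""
    groups = {}
    for a, tc, value in zip(aridity, tree_cover, anomaly):
        key = (bisect_right(ARIDITY_EDGES, a), bisect_right(TREE_COVER_EDGES, tc))
        groups.setdefault(key, []).append(value)
    return {key: median(values) for key, values in groups.items()}


# Illustrative grid cells: two wet, densely forested cells and three dry, sparse ones.
aridity = [0.3, 0.4, 1.5, 1.6, 1.7]
tree_cover = [0.8, 0.9, 0.1, 0.2, 0.1]
et_anomaly = [0.2, 0.4, -0.5, -0.3, -0.1]
medians = class_medians(aridity, tree_cover, et_anomaly)
```

Storing the full list of values per class, rather than only the median, would also allow the within-class spread to be reported.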
Furthermore, we study the role of drought duration for the observed ET and runoff responses. For this purpose, we repeat the analysis of Figure 3 for subsets of droughts with different development and recovery period lengths, and find overall similar ET and runoff drought responses across aridity-tree cover classes (Figure S7 in Supporting Information S1). ET and runoff anomalies are more negative in cases of longer duration for both drought development and recovery, reflecting more pronounced soil moisture stress (Figures S7a and S7b in Supporting Information S1). Interestingly, the intensity of the ET drought response is more strongly related to the drought development duration and less affected by the drought recovery period, while the opposite is observed for the runoff drought response (Figures S7c-S7f in Supporting Information S1). The lower influence of drought recovery duration on ET implies that ET recovery utilizes incoming precipitation water after the drought peak, which in turn can delay soil moisture and runoff recovery. Moreover, we reproduce Figure 3 with the second strongest drought in each grid cell, which occurs at least 6 months before or after the strongest drought, and find no systematic differences, except for slightly lower magnitudes of ET and runoff anomalies (Figure S8 in Supporting Information S1). This suggests that the studied spatial variations of drought influence are generally representative of other severe drought events occurring in the same grid cells. Still, lower magnitudes of ET and runoff anomalies in the second strongest drought are expected due to less soil water stress compared to the strongest drought.
Moving beyond the focus on peak drought anomalies, we also study changes of water fluxes over the whole course of droughts. In wet regions, apparent ET surpluses can be found from 1 month before drought peaks, which corresponds to the appearance of increased temperature and radiation (Figure 4a; Figures S4d and S4e in Supporting Information S1). In very dry regions, negative ET anomalies can be found already 3 months before drought peaks, corresponding to concurrently low soil water availability (Figure 4a and Figure S4a in Supporting Information S1). Runoff reductions start 1 or 2 months before drought peaks, in line with precipitation anomalies (Figure 4b and Figure S4c in Supporting Information S1).
When focusing on the recovery period, we do not find substantial ET anomalies after drought peaks despite the fact that the soil moisture deficit is still severe (Figure S4b in Supporting Information S1). The quick recovery of ET can be attributed to the precipitation events which initiate drought recovery; through ET recovery, this incoming water directly compensates for high VPD (Figure S4f in Supporting Information S1). Runoff deficits continue for one to two months (Figure 4b), following low soil moisture (and hence baseflow) and the preferential partitioning of precipitation water input to ET. Furthermore, we also map the contrasting ET and runoff anomalies during drought development and recovery periods in Figure S9 in Supporting Information S1. ET surpluses are found predominantly in the drought development period and in high latitude and tropical regions, while runoff reductions are most substantial in these regions in both drought phases.
In this observation-based analysis, droughts are identified from soil moisture deficits in the top 50 cm, such that deep soil moisture or groundwater are not directly considered. For this reason, we repeat the drought detection using terrestrial water storage measured from the GRACE satellite mission, and similarly we determine the ET and runoff anomalies when terrestrial water storage is the lowest. We find very similar results of observation-based ET and runoff anomalies during drought periods (Figure S10 in Supporting Information S1). Interestingly, in wet regions, GRACE-detected droughts involve an earlier onset of ET surpluses and less pronounced ET surpluses at drought peaks. In these regions, tall vegetation can benefit from its deep-reaching roots during the early stages of soil moisture droughts, while deeper water sources such as groundwater are still available in the drought development periods but not during drought peaks (Fan et al., 2017; Mu et al., 2021). In addition, an earlier onset of runoff reductions in wet regions is found for droughts detected through terrestrial water storage. This is probably related to reductions in the deeper sub-surface storage (level) of groundwater and associated variations in the groundwater flows that feed runoff (Destouni & Verrot, 2014), which are not fully captured in the case of topsoil droughts. Overall, our results highlight that topsoil droughts do not affect the water cycle fundamentally differently from droughts of total water storage and confirm the robustness of our findings. Note that in the next section about ET and runoff responses to drought in LSMs, we detect droughts using total soil moisture, which is not fully comparable with the soil moisture depths considered in the cases of SoMo.ml and GRACE in the observation-driven analyses (see Table S2 in Supporting Information S1 for soil moisture depths in LSMs). Nevertheless, the soil moisture depth is uncertain due to model assumptions on vegetation types and their reference rooting depths in LSMs. At the same time, our results using SoMo.ml and GRACE do not differ much, demonstrating the validity of our approach for further comparing the observation-based drought responses with those simulated by LSMs, for which we employ total soil moisture to detect droughts.
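The drought-peak compositing described above can be illustrated with a minimal sketch for a single grid cell. It rests on assumptions not spelled out in this section: anomalies are taken as deviations from the mean seasonal cycle of a monthly series starting in January, and the drought peak is taken as the month with the lowest storage anomaly (soil moisture or terrestrial water storage). The function names `monthly_anomalies`, `drought_peak`, and `composite_around_peak` are hypothetical.

```python
def monthly_anomalies(series):
    """Deviation of each month from the mean seasonal cycle.

    Assumes a monthly series that starts in January (illustrative assumption).
    """
    clim = []
    for m in range(12):
        vals = series[m::12]          # all Januaries, all Februaries, ...
        clim.append(sum(vals) / len(vals))
    return [v - clim[i % 12] for i, v in enumerate(series)]


def drought_peak(storage_anom):
    """Index of the month with the lowest storage anomaly (assumed peak definition)."""
    return min(range(len(storage_anom)), key=storage_anom.__getitem__)


def composite_around_peak(flux_anom, peak, window=2):
    """Flux anomalies from `window` months before to `window` months after the peak."""
    return {
        lag: flux_anom[peak + lag]
        for lag in range(-window, window + 1)
        if 0 <= peak + lag < len(flux_anom)
    }
```

The same three steps apply whether the storage series is topsoil moisture or terrestrial water storage, which is what makes the two drought definitions directly comparable in such a setup.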

Water Cycle Response to Drought in Land Surface Models
Next, we analyze the output from state-of-the-art LSMs from the TRENDY ensemble. As shown in Figure 4c, the models overestimate drought-related ET reductions in dry regions. By contrast, drought-related runoff reductions simulated by LSMs show similar patterns and magnitudes as in the observation-based results, with the largest decreases in the wettest regions during 2 months before and after drought peaks. Runoff recovers slightly more quickly than in the observation-based result, which can be related to oversimplifications of sub-surface hydrological processes and to less or no simulated ET surpluses leaving more water for runoff (Ukkola et al., 2016a). Compared with these multi-model averages, the results from individual models show similar response patterns for wet versus dry regions but different magnitudes of simulated ET and runoff anomalies (Figure S11 in Supporting Information S1). Global and latitudinal ET and runoff anomalies under drought simulated by LSMs show that the strong runoff reductions in boreal and tropical regions are properly captured, whereas ET reductions in these regions are overestimated and ET surpluses are not reproduced (Figure S12 in Supporting Information S1). Also, the global spatial patterns of runoff anomalies are better captured by LSMs than those of the ET anomalies, even though the overall agreement of the patterns with observation-based results is limited in both cases (Figures S13 and S14 in Supporting Information S1). Given that global distributions of drought peak months and years from LSMs are largely similar to observation-based results (Figures S15 and S16 in Supporting Information S1), the LSMs' biased representation of drought propagation into ET deficits is not strongly associated with the soil moisture drought timing. Moreover, ensemble-mean ET and runoff anomalies from the LSMs during drought development and recovery periods are shown in Figure S17 in Supporting Information S1. Runoff reduction patterns are overall well captured in LSMs, while ET deficits during drought development and ET surpluses during drought recovery are overestimated in many regions compared with observation-based results.
To test if a different modeling approach yields similar global drought propagation patterns as established LSMs, we consider simulations from the hybrid hydrological model H2M in Figure S18 in Supporting Information S1. As a hybrid model, it combines data-driven machine learning approaches with physically-based modeling, so that it differs from common hydrological models that can be integrated in LSMs. It provides an independent perspective and is less affected by potentially missing or incomplete representations of relevant processes that challenge physically-based models. We find stronger ET decreases in medium-dry to dry regions in H2M and also in LSMs. H2M ET is driven by data and has a closed water balance in contrast to FLUXCOM ET, so that the potential underestimation of extreme magnitudes of ET can be partly offset by forcing water balance closure. However, as in the case of the TRENDY LSMs, H2M does not accurately reproduce the observed contrast between positive and negative ET responses to drought peaks across humid and arid regions. The slight misrepresentation in H2M of the positive changes in wet regions found in ground observations (Figure S18 in Supporting Information S1) is possibly related to biases in implicit physical assumptions such as causal pathways of the water cycle, and to biases due to trade-offs between the physical constraints (Kraft et al., 2022).
We also find similar biases in runoff drought anomalies from H2M compared with the TRENDY simulations, even though the positive ET response in humid regions is captured better while the recovery of ET anomalies is slower than that in the reference data. Note, however, that better agreement with the reference data results is somewhat expected as the H2M model is calibrated against FLUXCOM ET and other observation-based products and hence not as independent as the TRENDY models.
Since the independent observation-based datasets employed here involve uncertainties and might not always complement each other to fully close the water balance, we seek to confirm the FLUXCOM ET drought responses, which differ considerably from the modeled results. Therefore, we additionally analyze in-situ measurements of ET from flux towers using the eddy covariance technique. To detect drought peaks at the towers, we determine minima in the cumulative water deficit, as soil moisture measurements are not consistently available. We find substantial variability of drought peak ET anomalies across sites, highlighting the role of local climate (Figure S19 in Supporting Information S1). Overall, there is a tendency for more positive than negative ET anomalies at wet sites, and negative anomalies are dominant at dry sites. Therefore, ET measured at eddy covariance towers confirms the ET drought responses found from the global gridded dataset despite the impact of subgrid-scale land surface heterogeneity. We find similar results with different gap-filling methods when detecting drought peaks using the cumulative water deficit.
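As an illustration of detecting a drought peak from the cumulative water deficit, the sketch below uses one common formulation, which is an assumption here because this section does not define the quantity: the deficit accumulates precipitation minus ET over time and is capped at zero, so that it is non-positive and its minimum marks the most severe deficit. The names `cumulative_water_deficit` and `deficit_peak` are hypothetical.

```python
def cumulative_water_deficit(precip, et):
    """Running water deficit, capped at zero (assumed formulation).

    precip and et are equally long sequences in the same units (e.g., mm/month).
    """
    cwd = []
    running = 0.0
    for p, e in zip(precip, et):
        # A surplus can refill the deficit but cannot build up a positive store.
        running = min(0.0, running + p - e)
        cwd.append(running)
    return cwd


def deficit_peak(cwd):
    """Index of the time step with the most negative cumulative deficit."""
    return min(range(len(cwd)), key=cwd.__getitem__)
```

The ET anomaly at the returned index can then be compared across wet and dry sites in the same way as for the gridded data.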

A Spotlight on the ET Drought Response Across Observations and Models
We then study the general representation of the ET-soil moisture coupling in LSMs inferred from the correlation between them across all growing season months of the entire study period, as this can help to understand the biases in the simulation of respective drought anomalies. Figure 5a shows that ET responds positively to soil moisture changes in water-limited regions, including central North America, central Eurasia, Australia, eastern and South Africa. In contrast, a negative ET-soil moisture relationship is found in energy-limited regions in boreal Eurasia, tropical regions, eastern North America, north and central Europe and central eastern Asia. This negative relationship results from soil moisture anomalies typically behaving opposite to temperature and radiation anomalies which are actually controlling ET anomalies in these regions. Although LSMs capture the positive relationships between ET and soil moisture in dry regions, they cannot represent the negative coupling in humid regions (Figure 5b), which is also the case for most individual models ( Figure S20 in Supporting Information S1). When we relate the peak-drought ET biases from LSMs to the biases of their ET-soil moisture coupling (Figure 5c), we find that higher ET-soil moisture correlation biases coincide with exaggerated negative ET anomalies at drought peaks. This result illustrates that deficiencies in capturing overall land-atmosphere interactions affect the estimation of water flux anomalies during droughts.
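The coupling metric used in this comparison can be sketched for one grid cell as a Pearson correlation between monthly ET and soil moisture, restricted to growing season months. The growing season mask is a hypothetical input, the sign convention of the bias (model minus observation) is an assumption, and the names `pearson`, `et_sm_coupling`, and `coupling_bias` are illustrative only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def et_sm_coupling(et, sm, growing):
    """Correlation of ET with soil moisture over months flagged as growing season."""
    pairs = [(e, s) for e, s, g in zip(et, sm, growing) if g]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


def coupling_bias(r_model, r_obs):
    """Modeled minus observed coupling (assumed sign convention)."""
    return r_model - r_obs
```

Under this convention, a model that simulates positive coupling where observations show negative coupling yields a large positive bias, which is the situation described for humid regions.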
These deficiencies could be a joint result of several individual uncertainties in the LSMs related to, for example, the representation of vegetation water stress, soil hydraulics and structure, atmospheric boundary layer parameterizations, or parameterizations related to plant functional types, as illustrated in previous studies (De Kauwe et al., 2015; Powell et al., 2013; Ukkola et al., 2016b; Zhao et al., 2022). The modeled drought response can be affected by an incomplete representation of water stress considering, for example, solely soil moisture or solely VPD. In our results, the bias of ET-soil moisture coupling in models is found in all individual models (Figure S21 in Supporting Information S1), implying that the diverse vegetation water stress functions in different models do not dominate biases in ET-soil moisture interactions. The misrepresentation of the ET-soil moisture coupling in wet regions could be related to the missing consideration of biophysical processes such as waterlogging which inhibits vegetation growth and transpiration, especially in energy-limited tropical and boreal regions (Ohta et al., 2014). Thereby, water stress applies not only in the case of dry soils but also for very wet soils. Further, the misrepresentation of soil hydraulic conductivity in models contributes largely to the underestimation of dry-regions soil ET during drought (Zhao et al., 2022), which helps to explain the overall stronger ET deficits in dry regions (Figure 4c). Since ET depends strongly on vegetation structure (as represented by e.g., leaf area), the misrepresentation of LAI sensitivity to soil moisture can also partly explain the ET-soil moisture deficiencies (W. Li et al., 2022).
Figure 5. (c) Relationships between biases of simulated evaporation anomalies at drought peak (y-axis) and the respective differences between modeled and observed ET-soil moisture coupling (x-axis) as shown in (a)
Our findings are largely in line with those of Zhao et al. (2022) regarding the biases of LSMs in modeling ET under drought. At the same time, other aspects of our results differ from theirs. For example, we do not confirm that drought-related positive ET anomalies are globally widespread and mainly controlled by changes in precipitation and terrestrial water storage. These differences are related to, and highlight the relevance of, (a) drought definitions and (b) approaches to estimate ET.

Conclusions
In conclusion, we find that (a) ET and runoff responses to drought across wet and dry regions are contrasting, (b) vegetation characteristics additionally regulate water trajectories under drought aside from climate, and (c) LSMs systematically overestimate the green water flux ET deficits under drought.
Figure 6. Schematic illustration of the interplay between soil moisture and surface water flux anomalies. Soil moisture drought, resulting from anomalous meteorological conditions, induces evaporation (ET) deficits in dry regions via vegetation water stress and reduced transpiration. In wet regions, ET is decoupled from soil moisture and enhanced by associated temperature and radiation increases. In both cases, ET anomalies feed back to soil moisture through enhancing or mitigating the initial deficit. Runoff is reduced during drought as a result of reduced water input and its reduction is aggravated by ET surpluses through increases of soil moisture deficits specifically in wet regions. Dashed lines are shown for completeness as these feedback processes also exist, while our study mainly focuses on the process indicated with solid lines.
ET and runoff, the two main terrestrial water fluxes, show contrasting responses to soil moisture droughts (Figure 6). Drought propagation into runoff is stronger and longer-lasting in wet regions than in dry regions. This result is consistent with that for runoff anomalies divided by the absolute values of seasonal means, indicating that spatially varying runoff anomalies under drought are driven less by seasonal means of runoff (Figure S22 in Supporting Information S1) and more by spontaneous precipitation and soil moisture anomalies, and long-term aridity (Figure S2 in Supporting Information S1). Meanwhile, results of ET anomalies (Figure 4a) and ET anomalies divided by ET seasonal means (Figure S22 in Supporting Information S1) both show that the propagation of soil moisture deficits into reduced ET is only found in dry regions, while in wet regions, vegetation functioning is not limited by water availability and benefits from sunny and warm weather conditions typically accompanying soil moisture droughts in these regions. These emerging large-scale signals are mainly related to regional climate, that is, aridity, and are modulated by heterogeneous land surface characteristics (e.g., fraction of tree cover and topography). Drought duration also plays a role in regulating the magnitude of ET and runoff anomalies, with stronger water flux anomalies in long-duration drought events. The interplay of these drivers relevant at different spatial scales determines the observed drought propagation into the water cycle, and explains its spatial heterogeneity.
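The normalization of flux anomalies by seasonal means mentioned above can be sketched as follows. This is a minimal example under the assumption that "seasonal mean" refers to the mean seasonal cycle (one long-term mean per calendar month) of a monthly series starting in January; the function name `relative_anomalies` is hypothetical.

```python
def relative_anomalies(series):
    """Monthly anomalies divided by the absolute value of the seasonal mean.

    Returns NaN where the seasonal mean is zero, since the ratio is undefined.
    """
    clim = []
    for m in range(12):
        vals = series[m::12]
        clim.append(sum(vals) / len(vals))
    out = []
    for i, v in enumerate(series):
        c = clim[i % 12]
        out.append((v - c) / abs(c) if c != 0 else float("nan"))
    return out
```

Expressing anomalies relative to the seasonal mean removes the effect of generally higher fluxes in wet regions, so that remaining spatial differences reflect the drought response rather than the background flux magnitude.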
Further, these results are obtained with machine learning-based datasets. While these datasets have particular shortcomings such as inaccurate or incomplete underlying predictor variables, our findings are in line with previous research covering main aspects of our analysis. More importantly, our analyzed datasets provide global gridded estimates of key land surface variables independent from physically-based models, and global patterns of ET and runoff drought responses derived from individual datasets indicate that these data follow the water balance assumption even without constraining the water balance in individual models. Therefore, they present a robust quantification and validation of global drought response patterns which so far could only be obtained with models.
Land surface models can overall represent the timing and magnitude of drought-related runoff anomalies. However, they largely fail to capture the ET surplus observed during drought in wet regions and overestimate drought propagation into ET reductions in dry regions. These problems are due to biases in the land-atmosphere coupling in models, which may further be related to potentially missing or misrepresented biogeochemical and biophysical processes and aggravated by problems in simulating vegetation structure during droughts. Overall, our results characterize regions with drought-vulnerable water fluxes, which should be taken into account when developing freshwater management strategies to overcome water shortages under drought.