Flash Drought: Review of Concept, Prediction and the Potential for Machine Learning, Deep Learning Methods

This paper reviews the Flash Drought (FD) concept, the uncertainties associated with FD prediction, and the potential of Machine Learning (ML) and Deep Learning (DL) for future applications. For this, 121 relevant articles covering different aspects of FD (definitions, key indicators, distinguishing characteristics, and the current methods for FD assessment, i.e., monitoring, prediction, and impact assessment) are examined. FD is typically a short-term drought event characterized by the rapid progression of heat waves and precipitation deficits, causing cascading impacts on the land and surface hydrology. FD prediction is constrained by the lack of consistent FD definitions and key indicators, the limited predictability of FD at the subseasonal-to-seasonal (S2S) timescale, and uncertainties associated with the current prediction methods. Some of these uncertainties stem from our limited understanding of the physical processes. Others are related to errors in the input datasets (imperfect representation of indicators), parameter uncertainty (the parameterization scheme adopted by the prediction model), multicollinearity, and nonlinear and non-stationary interactions among different indicators. Combining traditional methods and multisource fusion data with ML and DL methods shows promise for better understanding FD evolution and improving prediction.

The FD capturing ability of such drought indices-based methods is limited by their inefficiency in dealing with multicollinearity and non-linear relations between the independent and dependent variables (Feng et al., 2019; Prodhan et al., 2021). As an alternative, physical models have been used to capture FD events. For example, observational data and the Variable Infiltration Capacity (VIC) model were used to simulate soil moisture (SM) for FD projection over India. However, the efficiency of such modeling efforts is constrained by a lack of good-quality input data and by insufficient knowledge of the underlying complex physical processes (Huntingford et al., 2019; Yuan et al., 2020). Thus, both drought indices-based and physical model-based methods have their limitations, and improving FD prediction remains an area of extensive research. Recently, Machine Learning (ML) and Deep Learning (DL) methods have evolved to address multi-dimensional relations efficiently and to extract information from complex datasets (Feng et al., 2019; Yuan et al., 2020). Within the context of this review, ML applies computational algorithms for classification, prediction, clustering, and pattern recognition in a target data set by transforming, sorting, and splitting the input data set. DL, on the other hand, refers to sophisticated ML algorithms with multiple hierarchical layers to fit complex functions. Building on the success of these methods in different fields of earth sciences (Liakos et al., 2018; Huntingford et al., 2019; Reichstein et al., 2019; Yuan et al., 2020), recent drought studies have applied ML and DL methods in a stand-alone manner and in combination with physical-based models to achieve better prediction, physical consistency, reduced uncertainty, and reduced computational demands (Park et al., 2016; Prodhan et al., 2021; Rhee & Im, 2017).
Notably, only limited discussion of the role of ML and DL methods in FD prediction is available in the research literature thus far. Therefore, this review focuses mainly on the potential of ML and DL methods to improve FD prediction by reducing the associated uncertainties.
The motivation of this paper is to synthesize the concept of FD, highlight the knowns and unknowns of FD, identify the gaps in current approaches for FD prediction, and explore the scope of ML and DL methods in bridging these gaps by addressing the following questions:
1. What are the definitions, key indicators, and distinguishing characteristics of FD?
2. What are the limitations of current methods used for FD prediction?
3. What is the role of ML and DL methods in improving long-term drought prediction?
4. What are the challenges of applying ML and DL methods for FD prediction?
The workflow for the literature review is summarized in Figure 1. Section 2 introduces the definition, key indicators, and distinguishing characteristics of FD, Section 3 describes the current methods used in FD assessment and highlights their limitations, Section 4 focuses on the efficiency of ML and DL methods for long-term drought prediction, Section 5 focuses on the potential of ML and DL methods with associated challenges and uncertainties in improving FD prediction, and Section 6 provides the conclusion. Finally, the Supporting Information S1 (Text S1 and Figure S1) outlines the literature review methodology.

Definitions, Key Indicators, and Distinguishing Characteristics of Flash Drought
This section discusses the definitions, indicators, and distinguishing characteristics of FD.

Definitions of Flash Drought
There is no single definition of FD in the literature. The term has been described differently based on its impacts, the rate of onset, the duration of the event, and a combination of rate of onset and duration. The timeline of different definitions of FD has been described in detail by Lisonbee et al. (2021). We compare the different definitions of FD in the following sections.

FD Definition Based on the Impacts
The initial definitions of FD described it as the rapid loss in crop yield due to high temperature, moisture deficiency, and dryness (Senay et al., 2008; Svoboda et al., 2002). Early studies suggested that FD occurred during the growing season, causing devastating impacts on the agricultural system (Haile et al., 2020; Hunt et al., 2014; Sanchez et al., 2016; Yuan et al., 2015, 2018). In recent studies, the impact of FD on a non-cereal crop (bamboo plantation) and on the overall terrestrial ecosystem has been used to define FD as an event with a rapid onset and insufficient early warning, causing a widespread impact on ecosystem productivity.

FD Definition Based on the Rate of Onset
The rapid onset of drought events is the most definitive feature of FD. Several studies have defined FD based on the rate at which different drought indicators, such as Soil Moisture (SM), the Standardized Precipitation Evapotranspiration Index (SPEI), the Standardized Evaporative Stress Ratio (SESR), and the Evaporative Demand Drought Index (EDDI), decline below a pre-defined threshold for at least 15-30 days (Ford et al., 2015; Koster et al., 2019; Liu, Zhu, Zhang, et al., 2020; Pendergrass et al., 2020; Noguera et al., 2020, 2021). It is crucial to distinguish the onset of FD from the intensification of a pre-existing traditional drought. For this, a decline in SM from the 40th to the 20th percentile (Ford & Labosier, 2017) or an increase in US Drought Monitor (USDM) drought severity by three or more categories within 8 weeks (Ford et al., 2015; Otkin et al., 2018) has been used to define FD.
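To make the onset-rate criterion concrete, the following minimal sketch flags a candidate FD onset when an SM percentile series falls from at or above the 40th percentile to at or below the 20th percentile within a short window. The 40th and 20th percentile thresholds follow the definition cited above; the four-step window, the function name `find_rapid_onsets`, and the sample series are illustrative assumptions rather than values taken from the cited studies.

```python
from typing import List, Tuple


def find_rapid_onsets(sm_percentile: List[float],
                      upper: float = 40.0,
                      lower: float = 20.0,
                      max_steps: int = 4) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs where the SM percentile drops from
    >= upper to <= lower within max_steps time steps (e.g., pentads).

    max_steps is an assumed window length, not a value from the review.
    """
    onsets = []
    n = len(sm_percentile)
    i = 0
    while i < n:
        if sm_percentile[i] >= upper:
            # Look ahead at most max_steps steps for a value at or below `lower`.
            for j in range(i + 1, min(i + max_steps, n - 1) + 1):
                if sm_percentile[j] <= lower:
                    onsets.append((i, j))
                    i = j  # resume the search after this onset
                    break
        i += 1
    return onsets


# Hypothetical pentad-scale SM percentiles (not observed data).
series = [60, 55, 45, 30, 18, 15, 12, 35, 50, 48, 44, 41, 38, 30, 25, 19]
print(find_rapid_onsets(series))  # [(0, 4), (11, 15)]
```

A slower decline that takes longer than the window to cross both thresholds is not flagged, which is the property that separates rapid onset from the gradual development of a traditional drought.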

FD Definition Based on the Duration of Drought Events
Several studies have termed short-term drought events (lasting from less than four weeks up to two months) as FD (Hunt et al., 2009; Mo & Lettenmaier, 2015). These short-term FD events were categorized into two subcategories: (a) heat-driven, in which high temperature triggers FD by causing an increase in evapotranspiration (ET) and a decrease in SM (here, precipitation (pr) is also a key indicator but not the initiating one), and (b) water-deficit-driven, in which a lack of pr triggers FD by causing a reduction in ET and an increase in temperature (Mo & Lettenmaier, 2015; Zhang et al., 2017). However, the rigid criteria of both the heat-driven and water-deficit-driven approaches led to underestimation of FD. Further, definitions of FD based solely on the duration of the event may ignore the change in key indicators with time and fail to capture the severity of FD events (Qing et al., 2021). A recent study suggested that rapid onset and intensification should be the major defining factors for FD, irrespective of the duration of the event.

FD Definition Based on the Rate of Onset and Duration of the Event
Few studies have included both onset rate and duration to define FD. Pentad values of the evaporative index SESR were used with a set of criteria to ensure that the identified FD event had a sudden onset and a rapid intensification that was not slowed by precipitation, low temperature, or cloud cover effects (Gou et al., 2022). The pentad-scale Standardized Evapotranspiration Deficit Index (SEDI) was used by Li, Wang, Wu, Xu, et al. (2020) for defining FD and tracking its propagation (duration of at least 25 days and at most 60 days). The existing definitions primarily focus on the sudden onset and rate of intensification of FD, whereas few address FD severity. An SM-based flash drought intensity index (FDII) was used to capture the drought severity for the 2012 FD event in the US. The inclusion of severity in the FD definition is important for capturing the evolution of FD (Qing et al., 2021).
The efficiency of capturing FD is sensitive to the choice of indicator used to define FD (Cook et al., 2018;Osman et al., 2021). Therefore, it is important to understand the sensitivity of FD to different indicators (discussed in the next subsection).

Key Indicators Used to Define Flash Drought
Different indicators such as triggering indicators, local variables, drought indices, climatic drivers, and vegetation indicators have been used for FD assessment. These indicators are summarized in the Supporting Information S1 (Figure S2).
Most FD definitions have focused on triggering indicators: temp, pr, SM, ET, and PET. These are further classified as initiating or driving indicators. By assessing the rapid changes in these indicators, the development of an FD event can be captured. For example, FD was identified based on rapid changes in the initiating indicators temp and pr and the driving indicators ET and SM (Mo & Lettenmaier, 2015). The inclusion of ET and SM improved the assessment of FD development. Another study by Mahto and Mishra (2020) highlights the contribution of key indicators (negative pr anomalies, positive temp anomalies, monsoon breaks, and delayed onset of the monsoon) in triggering FD events.
Several studies have also utilized drought indices to capture the rapid changes in triggering indicators. Examples include pr- and ET-based indices (Standardized Precipitation Index, SPI, and SPEI; Hunt et al., 2014; Li et al., 2021; Noguera et al., 2020); SM-based indices such as Soil Moisture Stress (SMS; Otkin et al., 2016; Sehgal et al., 2021), the Soil Moisture Volatility Index (SMVI; Osman et al., 2021), and the Relative Rate of Dry Down (RRD; Sehgal et al., 2021); the Evaporative Stress Index (ESI; Otkin et al., 2016; Nguyen et al., 2019, 2021); the ET- and PET-based EDDI (Noguera et al., 2021); the Atmospheric Evaporative Demand (AED)-based Evaporative Stress Ratio (ESR; Christian et al., 2021; Lisonbee et al., 2021); and vegetation indices such as the Vegetation Drought Response Index (VegDRI; Otkin et al., 2016), the Normalized Difference Vegetation Index (NDVI; Park et al., 2016), and the Leaf Area Index (LAI; Chen et al., 2021; Feng et al., 2019; Li et al., 2021; Lisonbee et al., 2021). A composite index (ESI, EDDI, and SPI) outperformed the individual indices in identifying FD events over a cropping region (Parker et al., 2021). To improve FD assessment, the pr-, temp-, and vegetation-based Very Short-Term Drought Index (VSDI), Microwave Integrated Drought Index (MIDI), and Scaled Drought Condition Index (SDCI) were used (Park et al., 2018). The performance of the USDM was analyzed for FD identification across US regions (Otkin et al., 2016). Results suggested that although the USDM successfully determined the spatial extent of a developed FD event, it lagged in identifying the rapid onset. This delay in capturing the onset and the rate of intensification is attributed to the discrepancy between the different datasets involved in the USDM. As a result, while the USDM's strength is its multi-variable inputs, its ability as an early warning tool for FD monitoring needs additional investigation.
Studies have also focused on a composite approach that includes climatic drivers together with local variables (AghaKouchak, 2014; Hao et al., 2014) and drought indices (AghaKouchak et al., 2015; Hao & AghaKouchak, 2013) for capturing FD. A combination of key indicators (temp, pr, ET, and SM) and local variables (WS, RH, and VPD) with climatic drivers such as the El Niño-Southern Oscillation (ENSO) and the Indian Ocean Dipole (IOD) can provide a lead time of two to three weeks over the USDM in detecting FD (Gerken et al., 2018). A study by Nguyen et al. (2021) showed that positive IOD and ENSO, along with a negative Southern Annular Mode (SAM), helped explain the development of the 2019 FD event over the Australian region. Similarly, considering La Niña and the associated precipitation and temperature anomalies could help assess the development of FD over US regions.
In a recent study by Hu et al. (2021), a probabilistic and multivariate method based on meteorological variables and composite indices (SPEI and the Standardized Soil Moisture Index, SSI) was applied to understand the propagation of FD. However, the efficient application of such an approach requires high spatio-temporal resolution data for the different indicators.

Characteristics of Flash Drought
A major difference between FD and traditional drought is the rate of onset and rapid intensification of the events. But certain other factors (Figure 2) also distinguish FD from traditional drought.
The following four additional characteristics have been considered for analyzing FD events.
1. FD's development stage is rapid, and depending on the precipitation and temperature anomalies, it may either return rapidly to normal conditions within a short duration (a season; Hunt et al., 2014; Mo & Lettenmaier, 2016; Senay et al., 2008) or persist as a long-term drought (Otkin et al., 2018; Nguyen et al., 2019). In comparison, traditional droughts develop slowly and can last for years. Further, the development of the latter is typically governed by precipitation deficiency. In contrast, FD is driven by reduced precipitation and high temperature, leading to rapid evaporative stress and reduced SM (as shown in Figure 2).
2. Most FD events develop during the warm, crop-growing summer season (drier than normal conditions in the post-cold winter season), unlike traditional drought, which may occur throughout the year. Further, climate drivers such as La Niña and El Niño were found to be relevant for a better understanding of FD (Mahto & Mishra, 2020; Pendergrass et al., 2020).
3. Evaporative Stress (ES) is the main driver of FD, and owing to its dependence on vegetation and soil type, the occurrence of FD is governed by the land cover and terrain of the region. Croplands are more vulnerable to FD than forested lands (Jin et al., 2019). For instance, the Great Plains-Corn Belt region of the US is highly susceptible to FD conditions due to its shallow root zone and high ET. Such semi-arid regions are susceptible to changes in SM, which reduce ET and affect the atmospheric moisture availability (Pendergrass et al., 2020). This further increases the ES and reduces the chances of precipitation by affecting deep atmospheric convection (positive feedback due to land-atmosphere coupling), leading to the rapid intensification of FD (Hoerling et al., 2014). This situation is even more prominent in humid and sub-humid regions (Mukherjee & Mishra, 2022; Qing et al., 2021).
On the other hand, highly elevated arid and forested mountainous regions appear to be less susceptible to FD. This is because such regions have low soil moisture profiles and are sparsely vegetated with deeper root zones (limited ET). This prevents any rapid increase in ES and is not favorable for FD (Ojima, 2021). But a recent study based on land-atmosphere-vegetation coupling suggests that semi-arid and arid regimes (vegetated regions) are more susceptible to FD than humid regimes. This is because the low precipitation and SM profile of semi-arid and arid regions favor a stronger coupling of latent heat flux (LE) and SM, which further decreases SM and leads to the development of FD.
On the contrary, adequate precipitation and strong SM-precipitation coupling in humid regions compensate for high LE and offset further SM reduction. However, high solar radiation and vapor pressure deficit may increase transpiration over highly vegetated humid regions (Oogathoo et al., 2020). This will compensate for even slight moderation in PET and temperature and may favor FD onset over these regions.
4. Across the globe, the occurrence of FD may increase manifold with the projected increase in hot and dry conditions under future climate change scenarios (Wang et al., 2016; Yuan et al., 2019). Further, the excessive stress coupled with low precipitation conditions after the recovery of FD may act as a precursor for increased heatwave events in the future (Christian et al., 2020; Hoerling et al., 2014).
Reviewing these characteristics, it is apparent that an efficient FD prediction is necessary for better management of FD risk. For this, it is important to analyze the current approaches used for FD assessment and highlight their limitations. These aspects are summarized in the following section.

Monitoring, Prediction, and Impact Assessment Methods of Flash Drought
The section summarizes the studies undertaken for assessing FD with a particular focus on monitoring, prediction, and associated impact. The section structure is shown in Figure 3.
FD studies can be classified into different categories based on their intent: monitoring, prediction, and impact assessment (Figure 3). These categories can be further subdivided based on the modeling approach (physical model, indices-based model, and fusion of models) and on the indicators and their interactions with the climate/vegetation drivers used to describe FD.

FD Monitoring
Efficient monitoring of FD relies on the availability of high temporal resolution datasets. The daily values of key variables are usually converted to pentad or weekly scales for FD identification, as this minimizes the effect of noisy variables. The pentad-scale SESR index has been highlighted as suitable for FD monitoring. SESR was used to capture the rapid onset (below the 40th percentile between individual pentads) and duration (below the 20th percentile for ≥6 pentads) of FD events. In addition, SESR can capture information from several climate variables: temp, pr, SM, SRAD, WS, VPD, and latent and sensible heat fluxes.
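The conversion from daily values to pentads and the standardization step behind an index such as SESR can be sketched as follows. This review does not spell out the SESR formula, so the sketch assumes the common form in which the evaporative stress ratio (ESR, taken here as ET/PET) is converted to a z-score for each pentad of the year across all available years. The function names and the toy numbers are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def pentad_means(daily: List[float]) -> List[float]:
    """Average consecutive 5-day blocks; a trailing partial block is dropped."""
    n_full = len(daily) // 5
    return [mean(daily[5 * k: 5 * k + 5]) for k in range(n_full)]


def standardize_by_pentad(esr_by_year: List[List[float]]) -> List[List[float]]:
    """Z-score each pentad against the same pentad in all years.

    esr_by_year[y][p] is the pentad-mean ESR for year y and pentad p.
    The population standard deviation is used (an assumption); a pentad
    with zero variance is returned as 0.0 to avoid division by zero.
    """
    n_pentads = len(esr_by_year[0])
    sesr = [[0.0] * n_pentads for _ in esr_by_year]
    for p in range(n_pentads):
        values = [year[p] for year in esr_by_year]
        mu, sigma = mean(values), pstdev(values)
        for y, v in enumerate(values):
            sesr[y][p] = (v - mu) / sigma if sigma > 0 else 0.0
    return sesr


# Toy example: 11 daily ESR values give two full pentads.
print(pentad_means([0.5, 0.6, 0.7, 0.6, 0.6, 0.4, 0.3, 0.3, 0.2, 0.3, 0.9]))

# Three hypothetical years with two pentads each.
print(standardize_by_pentad([[0.2, 0.5], [0.4, 0.5], [0.6, 0.5]]))
```

Standardizing per pentad removes the seasonal cycle, so a given SESR value means the same degree of anomaly in spring as in late summer. The percentile criteria quoted above would then be applied to the standardized series and to its pentad-to-pentad changes.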
The performance of these tools varies across seasons and regions. Their efficiency depends on their latency, spatial resolution, length of record, and sensitivity to rapid changes. For example, tools with weekly latency (USDM, ESI, VHI) might not capture a rapid shift in drought conditions. The spatial and temporal resolution of the datasets used by these tools is also critical for monitoring FD. For instance, MODIS data have a relatively high temporal resolution but a spatial resolution of 250 m, whereas Landsat has a spatial resolution of 30 m, but its infrequent revisits inhibit its monitoring abilities (cf. Salehi et al., 2021, who compared RS-based ET data at different resolutions and suggested a fusion method for generating high-resolution data). This provides an opportunity for developing new datasets with improved resolution, coverage, and latency for improving FD monitoring across the globe.
These tools use different indicators for capturing FD (Mukherjee & Mishra, 2022; Otkin et al., 2018; Parker et al., 2021). But this composite approach may not yield a consistent, coherent signal and can lead to sluggishness in detecting FD. Among these indicators, SM stands out in its ability to monitor FD. A global-scale study by Qing et al. (2021) suggests that FD identification is sensitive to the soil depth at which SM is considered and to the SM threshold used to define FD. Top-layer (top 10 cm) SM can identify more FD events than root-zone SM, as the top layer is the first to be affected by the increase in ES and below-normal precipitation. But root-zone SM is relatively better at representing vegetation stress. The FDSI, based on Soil Moisture Active Passive (SMAP) SM data (at a shallow depth of 0-5 cm), was used to identify emerging FD hotspots (Sehgal et al., 2021). FDSI could identify emerging FD events with up to two weeks of lead time. This supports the relevance of the FLASH platform, which uses FDSI for real-time monitoring of FD. But a tool based on root-zone SM might perform better in capturing FD intensification and severity (Mukherjee & Mishra, 2022).
The application of SM for FD monitoring is limited by the lack of high-resolution measurements (Sehgal et al., 2021). To address this, studies (Ford et al., 2015; Ford & Labosier, 2017; Liang & Yuan, 2021; Mo & Lettenmaier, 2020; Otkin et al., 2019) have explored the efficiency of SM data generated by Land Data Assimilation Systems (LDAS) in capturing FD. These data captured the onset of FD well, but FD's progression was falsely detected. In a study by Sun et al. (2019), auxiliary and microwave satellite datasets were applied to generate high-resolution (similar to MODIS, 5 km) daily SM data and capture warm-season FD conditions. In another study by Liu, Zhu, Zhang, et al. (2020), microwave remote sensing-based SM was utilized to identify FD over China.
Besides satellite-based data, SM simulated by hydrological models is also used in FD monitoring tools. But the simulation abilities are sensitive to the choice of parameters, calibration efficiency, and quality of the input data set. To improve the simulation ability, SM simulations from the hydrological VIC model were assimilated with RS data for FD monitoring (Yan et al., 2018). However, this approach depends on the availability of model simulations, which affects its robustness for FD monitoring (Sehgal et al., 2021).

FD Prediction
Despite its significance, fewer studies have focused on FD prediction. Most of these studies have focused on SM-based indices for FD prediction (Deangelis et al., 2020; Liang & Yuan, 2021). Recent studies have highlighted that understanding the interaction between local indicators (temp, pr, ET, SM, VPD) and climatic drivers such as ENSO and IOD can provide greater context for FD events (Nguyen et al., 2021; Vogt et al., 2018). Another study by Liang and Yuan (2021) focused on multiscale land-atmosphere-ocean interactions and suggested that FD is a sub-seasonal phenomenon; therefore, intraseasonal land-atmosphere-ocean interactions can potentially improve FD prediction.
In another approach, drought indices such as SPEI, SRI, and SEDI can be applied in a probabilistic framework aided by run theory (Yevjevich, 1967) and copulas for understanding sub-seasonal drought propagation (Ho et al., 2021). Example studies include a probabilistic framework to estimate SSI and capture the 2012 FD event in the US (AghaKouchak, 2014). This study highlights the efficiency of SM-based indices over stand-alone precipitation-based indices for FD prediction, which is consistent with the FD monitoring findings. A Climate Forecast System (CFS)-based weekly forecast of the USDM highlights that the accuracy of FD prediction depends on the efficiency of the model and datasets used for forecasting the USDM (Lorenz et al., 2018). In a recent approach, an NLDAS-2-based sub-seasonal FD prediction tool was developed to improve the skill of the Monthly Drought Outlook (MDO). The MDO is a framework that uses the USDM map as an initial condition and predicts drought tendency for the upcoming month. However, the prediction skill of these monthly-scale FD prediction tools is limited by their inability to capture weekly changes in climate anomalies (Pendergrass et al., 2020). These studies highlight that increased interseasonal variability and anthropogenic warming will increase the risk of FD in the future.
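The run-theory step mentioned above can be illustrated with a minimal sketch: a drought event is treated as a run of consecutive time steps during which an index stays below a truncation level, and each run is summarized by its duration, severity (cumulative deficit), and peak. The threshold of -1.0, the function name `drought_runs`, and the sample series are illustrative assumptions; the copula step that would model the joint distribution of duration and severity is not shown.

```python
from typing import Dict, List


def drought_runs(index: List[float], threshold: float = -1.0) -> List[Dict[str, float]]:
    """Identify runs where the index stays below `threshold`.

    For each run, report the start position, the duration (number of steps),
    the severity (sum of threshold - value over the run), and the peak
    (minimum index value). The default threshold is an assumption.
    """
    runs = []
    start = None
    # A sentinel equal to the threshold closes a run that reaches the series end.
    for t, value in enumerate(index + [threshold]):
        if value < threshold:
            if start is None:
                start = t
        elif start is not None:
            segment = index[start:t]
            runs.append({
                "start": start,
                "duration": len(segment),
                "severity": sum(threshold - v for v in segment),
                "peak": min(segment),
            })
            start = None
    return runs


# Hypothetical pentad-scale standardized index values (not observed data).
series = [0.2, -0.5, -1.2, -1.8, -1.1, -0.3, 0.4, -1.5, -2.0]
for run in drought_runs(series):
    print(run)
```

For sub-seasonal applications, the same bookkeeping would be applied to pentad or weekly index values, so that short, intense runs are not averaged away as they would be at the monthly scale.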

FD Impact Assessment
FD has an immediate impact on the vegetation state; therefore, most impact studies have assessed the decline in SM to capture vegetation stress with the progression of FD. Assessment of FD impacts on the ecosystem (yield, species growth, and pathogen outbreaks) is critical for better mitigation of its impacts. Otkin et al. (2019) highlighted the potential of multisource datasets in capturing the impact of FD on vegetation conditions by utilizing SMAP- and NLDAS-based LAI and SM indices. In another study by Kimball et al. (2019), information from SMAP-based SM and carbon products was applied to study the impact of FD-induced rapid changes in SM on vegetation productivity. Owing to the susceptibility of croplands relative to natural ecosystems, Zhang and Yuan (2020) studied the impact of FD on the carbon and water fluxes of cropland, based on water use efficiency and gross primary productivity (GPP). Both water use efficiency and GPP decreased with the rapid decline in SM. Jin et al. (2019) utilized composite indices, namely NDVI, the Enhanced Vegetation Index (EVI), the Land Surface Water Index (LSWI), Sun-Induced Fluorescence (SIF), and GPP, to study the direct and simultaneous impacts of the 2012 FD event in the US on vegetation condition. There was a significant reduction in these indices during the recovery period of the FD event. Assessing the changes in crop phenology with the development of drought can help better manage the agricultural system (Prodhan et al., 2021). For instance, the impact assessment of FD on the water use efficiency of bamboo vegetation helped identify that reducing vegetation density can be an effective way to alleviate the impact of FD.
Most of the studies mentioned above have analyzed the vegetation conditions during the recovery phase of FD. A few studies have also focused on the pre- and post-drought conditions of the ecosystem. But the response of the ecosystem to FD is absent from these impact assessment studies (Crausbay et al., 2017). A recent study by Chen et al. (2021), using the Community Earth System Model, version 2 (CESM2), highlighted that vegetation greening leads to SM depletion and favors FD development. This has negative feedback on vegetation growth, especially in regions with low precipitation and limited moisture availability. The inclusion of these vegetation indicators and the underlying feedback mechanism would likely aid a better understanding of the FD mechanisms (Jin et al., 2019).

Limitations of Current Methods Used for FD
The FD studies have focused on capturing FD using drought indices-based, physical-based, and coupled models. But the efficacy of these models is limited; the physical-based and coupled models have significant uncertainties associated with different parameters, simplification of processes, and input datasets. The drought-indices-based models are also oversimplified and may fail to acknowledge the complex feedback, relationships, and multicollinearity among different indicators. Further, these models have limited ability to perform multiscale and multisource analysis for capturing FD.
Performing multiscale analysis is most challenging for FD prediction, as it requires understanding the future behavior of different processes associated with FD. But the current monthly-scale FD prediction models fail to capture FD events driven solely by weather noise (Hoerling et al., 2014; Liang & Yuan, 2021; Nguyen et al., 2021). For instance, three Sea Surface Temperature (SST)-related climate drivers, IOD, ENSO, and SAM, were used to predict ESI and capture FD. But the study fails to capture the changes in precipitation and temperature caused by the chaotic nature of the weather (predictable within one to two weeks; Nguyen et al., 2021). Therefore, the development of a weekly prediction tool for FD can help improve the predictability of these climate anomalies. However, Pendergrass et al. (2020) suggested that even weekly prediction products may not be able to capture these anomalies; instead, the products must be updated daily to improve S2S FD prediction. Further, the inconsistent initialization protocol, the lack of dynamic representation of physical processes, and the unavailability of long-term, high spatio-temporal resolution datasets affect the efficiency of current FD prediction models.
In addition, the current FD prediction approaches are built on geophysical indicators. The relevance of social media data for understanding FD progression and severity has not been explored so far. Social media has the potential to provide relevant information for capturing the progression and spatial extent of FD (cf. Wagler & Cannon, 2015; Kim et al., 2019, who have studied traditional drought evolution using social media datasets). This can help understand people's perception of the severity of FD events and how they respond to them, which may be important for region-specific management of FD impacts.
To address these limitations, this study postulates that multiscale data-driven methods such as Machine Learning (ML) and Deep Learning (DL) can aid FD assessment. They can help quantify the strength of the relationship between different indicators and climatic drivers at varying spatio-temporal scales and improve S2S FD predictability. For instance, the USDM forecast by an ML-based framework captured the 2017 FD events over the USA with a 12-week lead time (Brust et al., 2021). Similarly, a recent study by Zhang et al. (2021) applied ML and DL methods for monitoring FD over China based on SM percentile and nine climate variables. In another study by Zhu and Wang (2021), an ML-based method was used to estimate root-zone SM for FD prediction over humid and semi-humid regions of China. These studies bolster the potential of data-driven methods for identifying key indicators and predicting FD in data-scarce regions.
Although ML and DL methods show promising results for drought prediction, most of the studies have applied them for long-term traditional drought and only limited studies have focused on their relevance in FD prediction. Therefore, the following section discusses the advances in ML and DL methods for traditional drought prediction and the potential for applying these methods for FD prediction.

ML-Based Methods for Drought Prediction
This section discusses studies that have applied ML and DL methods for traditional drought prediction, focusing on how these can eventually be applied for FD analysis. A comparative analysis of ML and DL methods is also presented. Random Forest (RF; Park et al., 2016), among other ML methods, has been applied to observation data, both in a stand-alone manner and in fusion with RS data, for predicting drought. Hydrological drought prediction based on the Standardized Runoff Index (SRI) using four ML models, namely Artificial Neural Network (ANN), Adaptive Neuro-Fuzzy Inference System (ANFIS), Support Vector Machine (SVM), and Decision Tree (DT), suggests that all four models performed well, with SVM outperforming the other methods at the 3-12 month timescale (Achite et al., 2022).

ML-Based Drought Prediction
The SVM-based model performed better than other ML methods in capturing drought intensity for the cropping season (Khan et al., 2020). An ensemble approach based on SST and three ML methods, Support Vector Regression (SVR), RF, and Extreme Learning Machine (ELM), was used to estimate the SPEI index, highlighting the effectiveness of ML-based ensemble approaches in drought prediction. ML approaches such as RF, Boosted Regression Trees (BRT), Cubist, and SVM fused with RS data have proven robust in identifying key indicators and successfully predicting meteorological and agricultural drought (Feng et al., 2019; Park et al., 2016). The efficiency of ML methods lies in their ability to deal with multicollinearity and non-linear relations between different indicators (Tufaner & Özbeyaz, 2020). For instance, both RF and DT methods can provide a promising prediction of meteorological drought for different lead times by using RS-based hydrometeorological variables and large-scale indicators such as pr, PET, land surface temp, NDVI, AOI, and ENSO (Rhee & Im, 2017). These techniques also hold promise in the context of FD.
The ML methods can also assist in identifying key indicators for drought prediction by capturing the complex interactions between different indicators (Rhee & Im, 2017; Tian et al., 2018). For instance, Recursive Feature Elimination (RFE) applied using SVM, ANN, and KNN methods successfully identified the optimum range of key drought parameters (Khan et al., 2020). Deo et al. (2017) suggested that drought prediction can be made more robust by combining wavelet transformation with ML methods such as ELM, ANN, SVR, and Least Squares Support Vector Regression (LSSVR). Wavelet-based drought models can decompose the drought indicators into time-frequency components and capture key information for drought assessment. ANN and SVM models developed by coupling wavelet transforms with bootstrap and boosting ensembles were used to predict SPI-based drought for different lead times. Wavelet transformation helped screen the data, whereas both bootstrapping and boosting improved the prediction accuracy of the ML models by building a multi-layer sequence of models that addresses statistical properties such as bias and variance and captures important information from the training data (Belayneh et al., 2016). Fung et al. (2020) applied a similar ensemble approach with a different ML method, SVR (ensembled with fuzzy support and boosting support), for agricultural drought prediction, which improved the prediction accuracy at a one-month lead time.
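The time-frequency decomposition mentioned above can be illustrated with a small example. The following is a minimal sketch, assuming an orthonormal Haar wavelet and a made-up eight-step drought index series; the cited studies may use other wavelet families, decomposition levels, and software. The function names (haar_step, haar_inverse_step, haar_decompose) and the sample values are hypothetical and are not taken from any of the reviewed papers.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail) coefficient lists, each half the
    length of the input. The input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level, rebuilding the original samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def haar_decompose(signal, levels):
    """Multi-level decomposition.

    Returns [coarsest approximation, coarsest detail, ..., finest detail].
    The signal length must be divisible by 2**levels.
    """
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return [approx] + details[::-1]


# Hypothetical standardized drought index values (illustrative only).
index_series = [0.5, 0.3, -0.2, -0.8, -1.4, -1.1, -0.3, 0.2]
components = haar_decompose(index_series, levels=2)
```

In a wavelet-ML workflow of the kind described, the approximation and detail components would typically be passed to the learner as separate predictors, so that slow trends and short-lived fluctuations are modeled individually.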

DL-Based Drought Prediction
In earlier applications, the prediction accuracy of ML methods was limited by the over-fitting of lag components involved in time-series data and by the assumption of stationarity. Developments in DL methods have helped overcome these limitations. As a result, studies have applied DL methods for identifying key indicators for drought prediction (Majhi et al., 2020; Xiao et al., 2019; Gao et al., 2020). DL methods such as Long Short-Term Memory (LSTM) and Deep Neural Network (DNN) have shown an edge in drought prediction due to their gated architecture and ability to retain historical information. ML methods use uniform weighting across the time steps, whereas DL methods can use decaying weights to improve drought prediction. Further, DL methods have shown superiority over other methods in dealing with multiscale and multisource data.
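The contrast between uniform and decaying weights over time steps can be made concrete with a toy calculation. The sketch below is not an LSTM and does not reproduce any cited model; it only compares a uniform average of lagged values with an exponentially decaying one. The decay rate, the function names, and the soil moisture values are assumptions chosen for illustration.

```python
def uniform_weights(n_lags):
    """Equal weight for every lag; the weights sum to 1."""
    return [1.0 / n_lags] * n_lags


def decaying_weights(n_lags, decay=0.5):
    """Weights proportional to decay**k for lag k (k = 0 is the most
    recent step), normalized to sum to 1. The decay rate is hypothetical."""
    raw = [decay ** k for k in range(n_lags)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_summary(series, weights):
    """Weighted sum of the last len(weights) values of the series.

    weights[0] is applied to the latest value, weights[1] to the one
    before it, and so on.
    """
    recent = series[-len(weights):][::-1]
    return sum(w * x for w, x in zip(weights, recent))


# Hypothetical soil moisture fractions with a sharp drop at the last step.
soil_moisture = [0.6, 0.6, 0.5, 0.2]
flat = weighted_summary(soil_moisture, uniform_weights(4))
recency = weighted_summary(soil_moisture, decaying_weights(4, decay=0.5))
```

With these values the decaying summary sits closer to the latest observation than the uniform one, which is the behavior that matters when a drought signal changes abruptly.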
Due to the promising performance of DL methods, significant progress has been made in developing DL-based drought prediction frameworks. For instance, Agana et al. (2017) adopted a Deep Belief Network (DBN) DL architecture for long-term drought prediction based on its efficiency in image classification and speech recognition. The DBN method outperformed traditional data-driven methods such as Multi-Layer Perceptron (MLP) and SVR. The LSTM-based DL method can reduce biases associated with drought prediction (Agana et al., 2017; Kaur & Sood, 2020; Prodhan et al., 2021; Yuan et al., 2020). The prediction accuracy of LSTM was higher when the observational data set was fed into the algorithm relative to the simulated data. In contrast, a recent study posits that observations may contain inherent errors and missing values. Dikshit et al. (2020) highlighted that interpolated datasets such as the Climate Research Unit (CRU) data set for different meteorological variables (temp, pr, PET, cloud cover), together with lagged climatic variables such as ENSO, Pacific Decadal Oscillation (PDO), Southern Oscillation Index (SOI), and SST, can efficiently forecast meteorological drought at varying lead times when used in an LSTM framework. This enhanced performance was also due to LSTM's ability to deal with multisource and multiscale data for efficient drought prediction at both long-term and short-term lead times. Another recent study by Kaur and Sood (2020) adopted a combination of ML and DL methods, such as Genetic Algorithm-based optimized ANN (ANN-GA) and DNN, to perform dimension reduction for screening the input data set and then pass only an information-rich data set into SVR for drought prediction. This approach also yielded a positive outcome on FD predictions.
Incorporating information from a physical-based model into a data-driven model has created a paradigm shift for drought prediction. A composite approach based on an Explainable Artificial Intelligence (XAI) model using SHapley Additive exPlanations (SHAP) was able to efficiently predict a 12-month SPEI index at different spatio-temporal resolutions. This study highlighted the importance of the explainable DL model in identifying the relevant climate variables and understanding their interactions and contributions to the development of a drought event. Furthermore, explainable ML and DL models can improve the accuracy and transparency of ML and DL-based drought prediction. The comparative analysis of classical drought prediction methods with ML and DL methods is presented in Table 1.
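The attribution idea behind SHAP can be shown with an exact Shapley value calculation on a toy model. This is a brute-force sketch, assuming three hypothetical predictors and a made-up linear index; it does not use the SHAP library, which relies on approximations because the exact computation grows exponentially with the number of features. The names shapley_values and toy_index and all coefficients are illustrative assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline values.
    The cost is exponential in len(x), so this is for tiny examples only.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_index(z):
    """Hypothetical drought severity score from three anomalies:
    precipitation, temperature, and evaporative demand."""
    precip, temp, evap = z
    return -0.8 * precip + 0.5 * temp + 0.3 * evap


anomalies = [-1.0, 2.0, 1.0]   # made-up standardized anomalies
reference = [0.0, 0.0, 0.0]    # climatological baseline (assumed)
contributions = shapley_values(toy_index, anomalies, reference)
```

For a linear model each contribution equals the coefficient times the departure from the baseline, and the contributions sum to the difference between the prediction and the baseline prediction, which gives a simple check on the calculation.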
The proficiency of ML and DL methods in traditional drought prediction (as described in Table 1) motivates an analysis of their potential to improve FD prediction. The scope of ML and DL in FD prediction and the associated challenges are discussed in the following section.

Potential of Applying ML and DL Methods for FD Prediction
In this section, the challenges encountered in FD prediction are initially discussed. This is followed by a summary of the role of ML and DL methods in improving these predictions. Finally, the uncertainties involved in adopting ML and DL approaches are also discussed.

Challenges in FD Prediction
The challenges in FD prediction are outlined in Figure 4 and discussed here.
The lack of a single definition to describe FD is a foundational problem. Due to this ambiguity in the FD definition, there is no standard method or protocol to encapsulate FD conditions. As a result, it is challenging to ensure that the definition used to predict FD does not lead to a fragmented FD understanding.
Because FD is characterized by rapid onset and intensification, it is difficult to identify appropriate time-sensitive indicators that capture the rapid changes in the drought state and aid FD prediction. Studies have highlighted that long-term drought indices such as SPI and SPEI (three months, six months) can capture traditional drought efficiently but fail to capture FD events, especially with increasing lead time (Lisonbee et al., 2021). This becomes more challenging when only a single drought indicator is used to capture the rapid onset and predict FD. To address this limitation, this review suggests that a combination of different indices based on high temporal resolution data can help in capturing the rapidly changing condition of key indicators during FD development (Nguyen et al., 2021; Otkin et al., 2019).
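The need for a time-sensitive indicator can be illustrated with a simple rate-of-decline rule. The following is only a sketch under stated assumptions: it flags an event when a soil moisture percentile series falls from at or above one threshold to at or below another within a fixed number of time steps. The thresholds (40 and 20), the window of four steps, the function name detect_rapid_onset, and the sample series are hypothetical choices, not a definition endorsed by this review.

```python
def detect_rapid_onset(percentiles, start_above=40.0, end_below=20.0, max_steps=4):
    """Find rapid declines in a percentile series.

    Returns a list of (start_index, end_index) pairs where the series is
    >= start_above at start_index and first drops to <= end_below no more
    than max_steps steps later. All thresholds are illustrative.
    """
    events = []
    n = len(percentiles)
    i = 0
    while i < n:
        if percentiles[i] >= start_above:
            hit = None
            last = min(i + max_steps, n - 1)
            for j in range(i + 1, last + 1):
                if percentiles[j] <= end_below:
                    hit = j
                    break
            if hit is not None:
                events.append((i, hit))
                i = hit + 1  # resume the search after this event
                continue
        i += 1
    return events


# Hypothetical percentile series at a fixed sub-monthly time step.
fast_decline = [60, 55, 45, 30, 15, 12, 10]
slow_decline = [60, 50, 45, 42, 38, 35, 30, 25, 18]
fast_events = detect_rapid_onset(fast_decline)
slow_events = detect_rapid_onset(slow_decline)
```

The slow series reaches the same low percentile but over a longer period, so the rule does not flag it; a coarse monthly or seasonal index would tend to blur this distinction.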
Land-atmosphere-ocean interactions can improve FD prediction (Bolles et al., 2021). The low-level Humidity Index (HI; refers to the relative humidity in the boundary layer) and Convective Triggering Potential (CTP; refers to the air mass column above the boundary layer) have been used to study the evolution and spread of FD. In the case of positive values of CTP and HI, boundary layer turbulence during daytime mixes the dry air mass aloft with lower-level surface air, which further reduces the relative humidity and creates evaporative stress. This suppresses precipitation development in the boundary layer and lower troposphere, which may trigger FD. But since FD is a short-term (sub-seasonal) drought event, it becomes challenging to identify the appropriate timescale when the ocean anomaly does not control the land surface signal. For this, considering land-atmosphere-ocean interactions at different timescales (S2S) can aid FD prediction.
But S2S scale prediction of FD is challenging owing to the inherent unpredictability of weather noise-driven FD events at the S2S scale, inconsistent initialization of soil conditions, and unrealistic representation of dynamic vegetation across many prediction models (Hoerling et al., 2014; Liang & Yuan, 2021; Woloszyn et al., 2021). For this, there is a need to develop robust prediction models at a daily or at least weekly scale by including multisource high-resolution datasets and improved vegetation processes to understand the mechanism of FD (Pendergrass et al., 2020).
In the past, news stories and survey-based reports were utilized for FD impact assessment. But real-time social media information from Facebook and Twitter has not so far been utilized for capturing FD development and communities' response to it. This can be attributed to the inability of traditional methods to extract relevant information from such unformatted data. Due to the lack of social media-based FD studies, it is unclear whether such time-sensitive citizen science data can be useful in understanding the evolution time scales of these droughts.

Role of ML and DL in Improving FD Prediction
A summary of the role of ML and DL methods for FD prediction (as shown in Figure 4) suggests that these methods reduce the computational cost of dealing with multisource datasets at the different temporal and spatial resolutions required for FD prediction (Feng et al., 2019; Park et al., 2016; Tian et al., 2018). Furthermore, they provide a multiscale-multisource framework for understanding the sensitivity of FD to various definitions and key indicators. Also, ML and DL methods have the potential to unravel the non-linear relationship between the predictands and predictors of FD. This helps improve the fidelity of teleconnections, which can improve FD predictability.
ML and DL methods can also estimate the drought indices at the different time scales (sub-seasonal to seasonal) required for FD prediction. Ensemble ML and DL methods with data screening options such as wavelet transformation, boosting, and bootstrapping can further improve the statistical interpretation and extract hidden information that may improve FD prediction. ML and DL methods can provide a framework for categorization, extraction, and processing of relevant multi-scale information for capturing FD (Dikshit et al., 2022). Such an approach can be used for extracting relevant social media information to capture FD progression, spatial extent, and severity. ML and DL methods combined with explainable algorithms could potentially improve the trust, transparency, and accuracy of FD prediction. For example, Dikshit et al. (2022) show that DNN and interpretable ML and DL methods made drought prediction more robust.

Uncertainty in ML and DL Based FD Prediction
The following uncertainties are associated with ML and DL-based FD prediction (as shown in Figure 4).
(a) The efficient implementation of ML and DL-based FD prediction methods will depend on the availability of long-term, high temporal and spatial resolution data of key variables. ML and DL methods perform better with in situ data. Still, in the case of satellite-based or simulated data, their efficiency is determined by how well the datasets represent FD indicators such as pr, temp, and SM. For instance, an assessment of the user's and producer's accuracy of ML methods used to predict meteorological drought suggests that ML methods show higher producer's accuracy than traditional methods. However, the user's accuracy was low, indicating that ML methods driven by observation and long-range forecast data failed to capture the non-drought conditions and overestimated drought events. The uncertainty is higher in the case of FD prediction, owing to its sensitivity to rapid changes in the key indicators (within weeks). Therefore, the lack of high temporal resolution data may limit the FD prediction accuracy. Furthermore, if coarse resolution data are used for prediction, the additional uncertainty associated with the downscaling and disaggregation methods applied degrades the drought prediction accuracy (Alizadeh & Nikoo, 2018; Feng et al., 2019; Park et al., 2016; Rhee & Im, 2017). The other risks involved in the wider application of ML and DL in FD studies are (b) biases associated with datasets of different spatio-temporal resolutions, (c) simplistic downscaling, (d) neglect of sensitive indicators, (e) missing, irrelevant, and duplicate social media data that may lead to overfitting or underfitting (Ofli et al., 2020), and (f) inability to capture the physics behind the phenomenon.
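The distinction between user's and producer's accuracy in item (a) can be made explicit with a short calculation. This sketch uses the standard definitions: producer's accuracy is the fraction of reference cases of a class that were predicted as that class, and user's accuracy is the fraction of predictions of a class that were correct. The function name and the ten sample labels are hypothetical and do not come from the cited studies.

```python
def users_producers_accuracy(reference, predicted, label):
    """Return (user's accuracy, producer's accuracy) for one class.

    user's accuracy     = correct predictions of label / all predictions of label
    producer's accuracy = correct predictions of label / all reference cases of label
    NaN is returned where the denominator is zero.
    """
    correct = sum(1 for r, p in zip(reference, predicted) if r == label and p == label)
    n_reference = sum(1 for r in reference if r == label)
    n_predicted = sum(1 for p in predicted if p == label)
    users = correct / n_predicted if n_predicted else float("nan")
    producers = correct / n_reference if n_reference else float("nan")
    return users, producers


# Made-up labels: 4 observed drought periods and 6 normal periods.
reference = ["drought"] * 4 + ["normal"] * 6
# The hypothetical model finds every drought but also flags 4 normal periods.
predicted = ["drought"] * 8 + ["normal"] * 2
users, producers = users_producers_accuracy(reference, predicted, "drought")
```

In this example the producer's accuracy for the drought class is 1.0 while the user's accuracy is 0.5, the pattern described above: every observed drought is detected, but half of the predicted droughts are false alarms, so drought occurrence is overestimated.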
This study suggests that an efficient way to reduce the associated uncertainties and improve FD prediction is to adopt multisource, fusion-based data development and an improved Land Data Assimilation System with changing vegetation and soil conditions, which help the prediction model obtain the 3H (High Spatio-Temporal Resolution, High Spatial Coverage, and High Accuracy) required to improve FD prediction. For instance, a fusion-based land data assimilation and DL approach was applied to generate continuous SM with 3H (Huang et al., 2022). Such continuous datasets can be used for more accurate initialization and retrospective forecasts to improve the efficiency of FD prediction models. Further, an amalgamation of physical-based knowledge with different ML and DL methods can be used for performing a vast range of tasks such as classification, augmentation, and generation of multiscale data, downscaling of the input data set, screening of key indicators, representing land-atmosphere-ocean interaction, and considering the feedback mechanism between the vegetation system and the prevailing FD condition. A possible schematic for ML and DL-based FD prediction is summarized in Figure 5.

Conclusion
A systematic review of the FD concept (definitions, key indicators, and distinguishing characteristics), current methods of FD prediction with their associated uncertainties, and the potential of ML and DL methods in reducing these uncertainties is presented. A total of 121 peer-reviewed research studies have been systematically analyzed.
One of the overarching conclusions of the review is the lack of a single generalized definition of FD in the current literature. Several indicators, climatic drivers, and physical processes are found to be typically associated with FD development. Therefore, it is likely that a single definition may not have the potential to describe FD completely. As a result, the various FD definitions allow one to focus on FD through different lenses. However, no matter what the definition is, it is important to ensure that the rapid onset, intensification, and severity of FD can be captured and differentiated from traditional drought events.
Compared to traditional long-term drought, FD develops quickly due to the rapid increase in temperature, below-normal precipitation, and increase in ET, often leading to a rapid decline in SM within days to weeks. These rapid changes in key variables often have a devastating impact on the ecosystem. The semi-arid and sub-humid agricultural regions are more susceptible to FD than the elevated and arid mountainous regions. This is because ES is the main driver of FD, and semi-arid, sub-humid agricultural land has higher ES due to the shallow root zone, high ET, and sensitivity to SM; this creates a positive feedback in which a further increase in ES favors FD development. In contrast, the rapid loss of ET is limited by the deeper root zone and the low availability of SM in arid and mountainous regions. Studies suggest that FD risk is projected to increase with increasingly hot and dry conditions. Thus, it is important to predict FD and identify the potential hotspots for managing FD impacts.
The current methods of FD prediction are challenged in capturing the rapid development and propagation of FD for three reasons. (a) They are unable to deal with the complex relationship between key indicators and climate drivers. This is an issue that requires more process-scale investigations. (b) Their predictability is also limited by the lack of long-term, high-quality data and by uncertainty in the knowledge of complex physical processes. This problem is associated with data sparsity, which is likely to be addressed through synthetic data development. (c) They do not adequately capture multiscale and multisource analysis, which is required for efficiently dealing with the diverse range of FD definitions, indicators, climate drivers, social indicators, and vegetation indicators associated with FD prediction. This is a problem associated with multiscale uncertainty that needs to be minimized.
Various studies have highlighted the efficiency of ML and DL approaches over traditional methods in predicting long-term drought events. But their proficiency in rapid and short-term drought prediction (such as FD) is still in its nascent stages. This motivated us to highlight the potential of ML and DL methods and provide support for a wider-scale application for improving FD prediction. The efficiency noted in ML and DL methods is likely due to the following. (a) They can deal with multicollinearity and with non-linear and multiscale relationships among different key indicators. (b) Both ML and DL methods can perform screening of numerous key variables and predict drought at varying lead times and timescales by using several indicators, climate drivers, and vegetation indicators. Studies also suggest that DL methods may have a potential advantage over classical ML methods in dealing with the assumption of stationarity and the over-fitting of lag components, and in utilizing multisource memory through their gated architecture and retention of historical information. (c) Knowledge from the physical-based models can be incorporated into ML and DL methods based on the explainable algorithms. Such amalgamation of physical-based models and ML and DL methods can aid better understanding of the physical processes associated with FD. This can help improve FD prediction. (d) Multisource fusion and land data assimilation system (reanalysis) data can be utilized by the ML and DL methods to deal with the lack of availability of long-term high-resolution data.
Therefore, this study suggests that a possible ML and DL-based framework encompassing (a) a wide range of FD definitions, (b) LDAS and RS-based multisource fusion data of key indicators, (c) multiscale land-atmosphere-ocean interaction, (d) the feedback mechanism between pre- and post-vegetation condition and FD, (e) physical model-based knowledge, and (f) social media platform-based real-time information can be adopted to capture FD development and severity. It is suggested that future research can help integrate the different components and continue to take advantage of data-driven ML approaches to help better characterize and predict FD evolution.

Data Availability Statement
This paper is a review article, and no datasets were used for this review.