ResU‐Deep: Improving the Trigger Function of Deep Convection in Tropical Regions With Deep Learning

Modeling deep convection accurately in tropical regions is important. However, biases remain in current trigger functions. To alleviate the overestimation of convection frequency and the incorrect depiction of the diurnal cycle, we propose a deep convection trigger function, ResU‐Deep, based on the U‐net framework with three modifications to better suit the problem of deep convection identification: (a) adding an upsampling process to the encoder part, (b) replacing the double convolution block with a residual‐convolutional block, and (c) adding a dynamic weight to the loss function. Thirty‐three environmental variables within tropical regions are used in ResU‐Deep, including 31 features from the ECMWF atmospheric reanalysis (ERA5) data set and two historical convection fields. The Tropical Rainfall Measuring Mission 3B42 data set is used as the precipitation observation. Central America, North Africa, South and East Asia, and the West Pacific Ocean within 0°∼30°N are selected as the study regions for their high frequency of deep convection activity. ResU‐Deep, which incorporates surrounding information, is trained and evaluated separately in the four regions and achieves F1‐scores of 58%, 53%, 60%, and 63% for occurrence, outperforming single‐column‐based machine learning methods. A unified model has similar performance in the four regions. Further comparisons are made with convective available potential energy‐based trigger functions in the Southern Great Plains. Results show that ResU‐Deep can capture the trends and peaks of diurnal cycles over complex terrain in large regions. According to a feature importance test, the contribution levels of environmental features differ among the four regions, indicating that the model can learn the mechanisms of deep convection in each specific region, thus improving the prediction accuracy.

Although parameterization schemes can address problems such as coarse grid resolution or insufficient computing resources to resolve all the length and time scales of deep convection in GCMs, the trigger function is still a major source of error in the diurnal cycle simulation. For example, GCMs rain too frequently at a reduced intensity, the diurnal cycle may not be well captured during summer seasons and over the ocean, and trigger functions perform satisfactorily only when the convection intensity is high (Covey et al., 2016; Dai, 2006; Dai & Trenberth, 2004; Lee et al., 2007; Suhas & Zhang, 2014; Terai et al., 2018). The lack of regional climate and geographical features is a major limitation of the trigger functions used in GCMs. First, many factors, such as the vertical velocity, large-scale dynamical constraints, solar radiation, surface heat flux and moisture fields, have large impacts on the triggering of deep convection (Liu & Moncrieff, 1998; Guichard et al., 2004; J. M. Wallace, 1975). Besides, the convection diurnal cycle may be modulated by regional features or complex terrains such as the land-sea contrast or the mountain-valley geography, as well as by the regional climate (Garreaud & Wallace, 1997; Xu & Zipser, 2012; Yang & Slingo, 2001). According to Doswell et al. (1996), considering only a few parameters is not enough. Trigger functions should consider the different factors that can determine the occurrence of deep convection. The most important factors may vary by region due to different climatologies of convective environments. For example, in the central United States, the Rocky Mountains are an essential factor for the diurnal cycle (Scaff et al., 2020; B. Wallace & Minder, 2021). In East Asia, summer deep convection is mainly modulated by changes in the meteorological environment such as CAPE, the height of neutral buoyancy, transport of moisture, and vertical wind shear (Luo et al., 2011; Wu et al., 2013; Xu, 2013). On warm tropical oceans, relative humidity, surface evaporation rate and moist entropy flux are the main factors introducing biases in the convection (Raymond & Flores, 2016). These factors are not fully considered in the classical trigger functions in GCMs. Therefore, considering regional trigger functions with related variables may help to improve the identification of deep convection. Besides, building a unified model that includes all the region-specific inhomogeneity is also a practical solution. However, it is hard to find the quantitative relationships between all these variables with traditional methods, which are both labor-consuming and computation-intensive.
DOI: 10.1029/2022MS003521

Because it can capture the complex non-linear relationship between different variables, machine learning has been applied to the development of GCMs and their parameterization schemes to improve simulation accuracy (Gentine et al., 2018; Han et al., 2020; O'Gorman & Dwyer, 2018; Rasp et al., 2018; Scher, 2018; Ukkonen & Mäkelä, 2019; P. Wang et al., 2022; T. Zhang et al., 2021). Among them, Ukkonen and Mäkelä (2019) used the ECMWF atmospheric reanalysis (ERA5) and variables from climate models to train convective trigger functions based on logistic regression, decision trees and fully-connected neural networks in Europe and Sri Lanka. T. Zhang et al. (2021) used data from the Atmospheric Radiation Measurement program to build deep convection trigger functions based on XGBoost trees (T. Chen et al., 2016). These methods were single-column oriented and neglected the impact of the surrounding environment. P. Wang et al. (2022) used non-local 3 × 3 column inputs and found that they can improve offline prediction of several sub-grid processes. With the help of deep learning, especially convolutional neural networks (CNN), high-resolution simulation, incorporation of surrounding information, and learning from multi-source observations can be achieved, while the computational cost is relatively low compared with running GCMs. Unlike machine learning methods such as random forest or the multilayer perceptron, a CNN can automatically learn the spatial dependence at each point with convolution kernels. As mentioned previously, deep convection can be affected by the surrounding environment and is related to different variables. Therefore, it is natural and reasonable to choose a CNN to model the trigger function of deep convection.
In this paper, we propose a CNN-based method to construct a novel location-aware convection trigger function (ResU-Deep) with the fifth generation ERA5 data set and the Tropical Rainfall Measuring Mission (TRMM) data set in northern tropical regions (Hersbach et al., 2018; TRMM, 2011). Unlike classical trigger functions used in GCMs, ResU-Deep can take many related variables into account and build trigger functions in different regions with distinct climate and geographical features. ResU-Deep is a modified version of U-net with improved training efficiency, upsampling for feature fusion, and a weighted loss function for imbalanced class training. Environmental variable fields related to dynamic, thermal, or thermodynamic processes are used to train the model. Furthermore, we compare our model with the variants of CAPE trigger functions on the Southern Great Plains (SGP), and validate our model on the Southern Hemisphere tropics in Manaus (MAO). Results show that ResU-Deep improves the diurnal cycle simulation and successfully interprets the discrepancy of physical mechanisms between different regions.
The paper is organized as follows. In Sections 2 and 3, we introduce the data and methods used in this study. Section 4 evaluates the model performance in terms of its modeling ability, the comparison against single-column-oriented machine learning methods, and the performance of a unified model. Besides, we explore the time dependence of deep convection occurrence. We also show the improvement in capturing the diurnal cycle, including the comparison against the CAPE-based trigger functions in SGP and the feasibility in the Southern Hemisphere tropics. The interpretation of the physical mechanisms related to deep convection from the model is described in Section 5. Conclusions and discussions are given in Section 6.

Data
The European Centre for Medium-Range Weather Forecasts (ECMWF) fifth generation atmospheric reanalysis (ERA5) data and the TRMM 3B42 data from NASA are used to construct the trigger function of deep convection in this study (Hersbach et al., 2018; TRMM, 2011). Replacing the ERA-Interim reanalysis, the ERA5 data sets provide high-quality hourly estimates of various atmospheric, ocean-wave, and land-surface quantities at 0.25° × 0.25° spatial resolution. The TRMM 3B42 data set is the output of the TRMM Multi-satellite Precipitation Analysis algorithm, which provides a nearly-zero-bias precipitation estimate between 50°S∼50°N with the same spatial resolution as the ERA5 data set but with 3-hr temporal resolution. The TRMM rainfall product has been widely used to study the spatial distribution and the seasonal and diurnal cycles of rainfall, and can capture the spatial and temporal variation of rainfall (Dai, 2006; Sakaguchi et al., 2018; Xu & Zipser, 2011). Considering the spatial and temporal resolution of these two data sets, we choose 0.25° × 0.25° spatial resolution and 3-hr temporal resolution for the variables in this study.
For a data-driven deep learning model, the data can be divided into predictors and predictands. In this study, variables provided by ERA5 data are used as the predictors. According to previous studies (Holloway & Neelin, 2009; Ukkonen & Mäkelä, 2019; T. Zhang et al., 2021), variables related to convection can be categorized into a thermal part, a dynamic part, and a humidity part. Besides, the vertical structure of the atmosphere also plays a crucial role in the development of convection. Therefore, different pressure levels are chosen to represent the lower, middle, and upper troposphere. Moreover, considering the temporal dependence benefits the simulation of the diurnal cycle because convective systems develop over time (Futyan & Del Genio, 2007; Machado et al., 1998). Therefore, the deep convection occurrence fields at t − 1 and t − 2 (3 and 6 hr earlier) are both added to the predictors. Since the terrain can influence the development and occurrence of convection, the altitude at each grid is also added to the model. All the variables used as predictors for deep convection in the ResU-Deep trigger are listed in Table 1.

Table 1

Category            Predictors                                          Abbreviation
Energy              Convective available potential energy (J/kg) a      CAPE
                    Convective inhibition (J/kg) b                      CIN
Humidity related    Specific humidity at 300 hPa (kg/kg)                sh_300
                    Specific humidity at 500 hPa (kg/kg)                sh_500
                    Specific humidity at 700 hPa (kg/kg)                sh_700
                    Relative humidity at 300 hPa (%)                    rh_300
                    Relative humidity at 500 hPa (%)                    rh_500
                    Relative humidity at 700 hPa (%)                    rh_700
                    Total …
…                   Temperature at 500 hPa (K)                          t_500
                    Temperature at 700 hPa (K)                          t_700
                    Surface temperature (K)                             …

Since the diurnal variation of deep convection has been extensively studied from rainfall observation, we use the precipitation rate obtained from TRMM 3B42 to estimate the occurrence of deep convection in tropical regions (Gray & Jacobson, 1977; Janowiak et al., 1994). When the precipitation rate at a grid is greater than or equal to 0.5 mm/hr, the grid is labeled as deep convection occurring at that time (Song & Zhang, 2017; Suhas & Zhang, 2014; T. Zhang et al., 2021). The occurrence or non-occurrence of deep convection in a given time interval at each grid cell is the predictand of our model. The grid is labeled with class 1 for the occurrence of deep convection and with class 0 for the non-occurrence. Using 0.5 mm/hr as the threshold for deep convection may introduce some uncertainties, since convective systems contain both shallow and deep convective precipitation. However, since large precipitation in summer over the tropics and mid-latitudes is commonly associated with deep convective events, this threshold is a simple but reasonable criterion for defining labels for a large-scale and long-period data set. The test results for different thresholds from 0.3 to 0.7 mm/hr are listed in Appendix C.
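The labeling rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the small example field are hypothetical and are not taken from the study's code; only the 0.5 mm/hr threshold and the class 0/1 convention come from the text.

```python
from typing import List

# Threshold from the text: a rate >= 0.5 mm/hr marks deep convection (class 1).
DEEP_CONVECTION_THRESHOLD_MM_PER_HR = 0.5


def label_deep_convection(
    precip_rate: List[List[float]],
    threshold: float = DEEP_CONVECTION_THRESHOLD_MM_PER_HR,
) -> List[List[int]]:
    """Return a 0/1 grid: 1 where the precipitation rate reaches the threshold."""
    return [[1 if rate >= threshold else 0 for rate in row] for row in precip_rate]


def occurrence_ratio(labels: List[List[int]]) -> float:
    """Fraction of grid cells labeled as deep convection."""
    cells = [value for row in labels for value in row]
    return sum(cells) / len(cells)


# Hypothetical 2 x 3 precipitation field in mm/hr (illustrative values only).
example_field = [[0.0, 0.5, 1.2], [0.49, 0.3, 0.0]]
example_labels = label_deep_convection(example_field)
```

Passing a different `threshold` (for example 0.3 or 0.7) mirrors the sensitivity test that the text attributes to Appendix C.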
Because deep convection occurs more commonly in summer, we choose 15 boreal summer months (June, July, August) from 2015 to 2019 for this study. Figure 1 shows the distribution of the number of precipitation and deep convective precipitation events, and the ratio between them, at each grid in tropical regions during the 15 months. The four regions in blue boxes have high ratios, which means deep convection is likely to occur during precipitation. According to Xu and Zipser (2012), the continental, monsoon and oceanic regimes are the three archetypical regimes of deep convection. To include all three archetypes, we choose these regions as the study areas.
Figure 2 further shows the precipitation information for each of the 5 years. The precipitation ratio in blue bars shows the ratio of deep convective precipitation amount to total precipitation amount, and the frequency ratio in green bars shows the ratio of deep convective precipitation events to total precipitation events. The gray line indicates the occurrence ratio of deep convection in each year, which is the occurrence number of deep convection divided by the number of 3-hr intervals in the period of five boreal summers. The occurrence rates in the four regions over the 5 summer seasons are different, but all the rates are low, ranging from 4% to 12%. This means the designed model should tackle the imbalanced training classes between positive labels (occurrence) and negative labels (non-occurrence). Furthermore, since neither the precipitation ratio nor the frequency ratio changes much during the 5 years, different partitions of the data set will not result in a significant difference. Therefore, data from 2015 to 2017 are used as training data, data from 2018 are used for validation, and the performance is tested on data from 2019. All of them have a temporal resolution of 3 hr and a spatial resolution of 0.25° × 0.25°.

ResU-Deep Convection Trigger Function
In this study, we adopt U-net to construct the ResU-Deep convection trigger function. U-net has been widely used for image segmentation with satisfying performance (Ronneberger et al., 2015). Since identifying the grids with deep convection is also a segmentation task, that is, the areas where deep convection occurs are separated from the whole region, it is reasonable to adopt U-net as our basic structure. To suit the special task of deep convection identification, we propose the following three modifications on U-net to derive our ResU-Deep convection trigger function:
• We use a convolution plus a residual block (He et al., 2016) to replace the double convolution block to improve the training efficiency with huge volumes of data;
• We add upsampling blocks to the encoder part to fuse different scales of features while learning the mechanism of convection;
• We add a location-aware weight to the loss function to tackle the class imbalance.
Figure 3 describes the flow diagram of our proposed ResU-Deep trigger function. The network is mainly composed of two parts: the left half is the encoder, which serves as the feature extraction tool, and the right half is the decoder, which performs the upsampling and outputs the results. Since the volume of training data is huge, a traditional U-net would suffer from a slow learning process and a large number of iterations. The computational challenge grows with the region and time interval that we want to cover in the training process. To accelerate the training process, we use a residual block (He et al., 2016) to replace the first convolution block to form a residual-convolutional (Convres) block. Besides, we choose a convolution kernel size of 3 (around 83.25 km), which is also the common size used in CNNs. The convolutional kernel slides over the input variable fields to extract the spatial features. Furthermore, as the encoder goes deeper, the width and height of the input variable fields are halved, while the number of channels doubles. Thus, the kernel has a larger receptive field at deeper layers, which can learn the features of convective systems with larger spatial ranges. By adding the upsampling process in the encoder part, the model can fuse features from different scales, thus improving the learning. Moreover, according to Figure 2, the sample numbers of the positive and negative classes are imbalanced. We modify the binary cross entropy loss function with a location-aware weight according to the ratio between positive and negative classes in each region, which enhances the learning of the positive class. Without this weight, the model would pay more attention to the non-occurrence due to the overwhelming number of negative samples and neglect the features related to the positive samples. The model results are not sensitive to the slightly different class ratios in each year. For more explanation of the architecture, please refer to Appendix A.

Figure 2. The amount ratio of deep convective precipitation, the frequency ratio of deep convection events, and the occurrence ratio of deep convection in the four regions during 5 years in June, July, and August. The ratios of deep convection events averaged over the 5-year summer seasons are 8.48%, 4.18%, 12.19%, and 11.74% in the four regions, respectively.
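To make the idea of a class-weighted loss concrete, the following Python sketch shows a generic weighted binary cross entropy. It is an assumption-laden illustration, not the ResU-Deep loss itself: the exact form of the location-aware weight is deferred to Appendix A in the text, so the choice of the negative-to-positive ratio as the weight, and the names `positive_class_weight` and `weighted_bce`, are hypothetical.

```python
import math
from typing import Sequence


def positive_class_weight(occurrence_rate: float) -> float:
    """Weight for class 1 as the negative-to-positive ratio (an assumed choice)."""
    return (1.0 - occurrence_rate) / occurrence_rate


def weighted_bce(
    probs: Sequence[float],
    labels: Sequence[int],
    pos_weight: float,
    eps: float = 1e-7,
) -> float:
    """Mean binary cross entropy with the positive-class term scaled by pos_weight."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() stays finite
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# First of the four regional occurrence rates quoted in the Figure 2 caption (8.48%).
w_pos = positive_class_weight(0.0848)
```

With a weight above 1, a missed occurrence is penalized more heavily than an equally confident false alarm, which is the behavior the weighted loss is meant to encourage when positives are rare.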
The ResU-Deep trigger function is trained and evaluated with the ERA5 and TRMM 3B42 data sets. Thirty-one variable fields and two historical convection fields are used as features and fed into the model as 33 channels. Therefore, the model can learn both spatial and temporal relationships during the training process. The output field has the same height and width as the input data, but it has a single output channel, which means the model directly gives the probability of occurrence at each grid point at a specific time.

Performance Metrics
The ResU-Deep convection trigger function is a binary classifier, as the output is either positive or negative. A positive result means that deep convection occurred at a specific grid at a given time, while a negative result means the opposite. Therefore, a 2 × 2 contingency table can be used to count the four possible outcomes: true positive (TP), true negative (TN), false positive (FP), and false negative (FN). TP and TN respectively describe the correct prediction of occurrence and non-occurrence. FP and FN denote the overprediction of occurrence and of non-occurrence, respectively.
Based on the contingency table, precision (P), recall (R) and F1 score (F1) can be calculated and used as performance metrics, as shown in Equations 1-3. Precision is the ratio between TP and the total positive predictions. Overprediction of occurrence results in a low precision value. Recall is the ratio between TP and all positive cases. It is low when occurrence cases are underpredicted. The results for both class 0 (non-occurrence) and class 1 (occurrence) are reported in Section 4.1. The F1 score is the harmonic mean of precision and recall and reflects the robustness of the model.

$$P = \frac{TP}{TP + FP} \tag{1}$$

$$R = \frac{TP}{TP + FN} \tag{2}$$

$$F1 = \frac{2 \times P \times R}{P + R} \tag{3}$$

Besides, we use macro precision, recall and F1 score to evaluate the performance of both classes. The macro score is the average of the scores of the two classes, as expressed in Equation 4.

$$X_{\mathrm{macro}} = \frac{X_0 + X_1}{2} \tag{4}$$

where X stands for P, R, or F1, and the subscripts denote class 0 and class 1.
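The metric definitions in this section translate directly into code. The sketch below is illustrative: the function names and the toy label vectors are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here rather than something stated in the text.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Precision, recall, and F1 for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def macro_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Average each metric over class 0 and class 1."""
    class0 = binary_metrics(y_true, y_pred, positive=0)
    class1 = binary_metrics(y_true, y_pred, positive=1)
    return {name: (class0[name] + class1[name]) / 2 for name in class1}


# Toy labels: 3 occurrences and 5 non-occurrences (illustrative values only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
occurrence_scores = binary_metrics(y_true, y_pred, positive=1)
macro_scores = macro_metrics(y_true, y_pred)
```

In this toy case the occurrence class has two true positives, one false positive, and one false negative, so its precision, recall, and F1 all equal 2/3, while the non-occurrence class scores 0.8 on each.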
Based on these metrics, the precision-recall (PR) curve and the area under the Receiver Operating Characteristic (ROC) curve (AUC) can be used to interpret the binary classification results. Since the output of our model is the probability of the occurrence of deep convection, choosing different probabilities as the threshold gives different results. For a classification task, choosing the threshold always involves a trade-off between precision and recall. The PR curve shows this trade-off under different threshold values. We choose the threshold at which the model has a balanced performance of both precision and recall in all four regions. To focus on the occurrence prediction, we plot the PR curve for class 1. The ROC curve also depicts the classification ability with varied thresholds. A larger area under the ROC curve indicates a higher predicting ability.
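The threshold selection described above can be illustrated with a small sweep. This is a sketch under stated assumptions: "balanced" is interpreted here as the threshold minimizing the gap between precision and recall, which is one plausible reading and not necessarily the authors' exact criterion, and all function names and toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def precision_recall_at(
    probs: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Precision and recall of class 1 when probabilities >= threshold count as occurrence."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 1)
    fp = sum(1 for y, yhat in zip(labels, preds) if y == 0 and yhat == 1)
    fn = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 0)
    # Convention chosen here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pr_curve(
    probs: Sequence[float], labels: Sequence[int], thresholds: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """List of (threshold, precision, recall) points."""
    return [(t, *precision_recall_at(probs, labels, t)) for t in thresholds]


def balanced_threshold(
    probs: Sequence[float], labels: Sequence[int], thresholds: Sequence[float]
) -> float:
    """Threshold whose precision and recall are closest to each other."""
    points = pr_curve(probs, labels, thresholds)
    return min(points, key=lambda point: abs(point[1] - point[2]))[0]


# Toy model output and labels (illustrative values only).
toy_probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
chosen = balanced_threshold(toy_probs, toy_labels, [0.2, 0.5, 0.7])
```

Lowering the threshold raises recall at the cost of precision, and raising it does the opposite; in the toy data the two metrics meet at a threshold of 0.5.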

Modeling Results
Figure 4 shows the comparison between the observation and the modeling results made by the ResU-Deep trigger function, the precision and recall score for both classes at each grid, and the validation loss curves during the training process. We use the results in Central America as an example.
Figure 4a describes the observed and modeled occurrence of deep convection at 0000 UTC on 15 August 2019, and the difference between them. The modeled results reflect most of the occurrence cases with similar locations and shapes. However, the blurring effects brought by the CNN remain. For example, the ResU-Deep trigger cannot infer some of the isolated convective areas. Also, the margins of the detected convection clusters are smoother than in the real cases. Figure 4b depicts the total number of deep convection occurrences during June, July and August 2019. The modeling result is similar to the observation with some overestimation, and it shows that deep convection often occurs in Central America along the American Cordillera, the Guiana Highlands, and the Intertropical Convergence Zone over the East Pacific Ocean. The white areas in Figure 4c are the locations where the model fails the prediction. The major reason is the extremely low number of positive samples at these grids. The average occurrence among them is only 3.40 times during June, July, and August in 2019, with a 3-hr time interval. This means the occurrence rate is only 0.46% under the threshold of 0.5 mm/hr. Therefore, it is difficult for our model to learn the characteristics of deep convection at these grids, and it tends to always predict non-occurrence since there are not enough positive samples to describe them. Figure 4e …

The bar plot with colored background in Figure 5 shows the performance for modeling occurrence and non-occurrence. For the occurrence case, the model has F1 scores of 0.58, 0.53, 0.60, and 0.63 for the four study regions, respectively. The performance for convection occurrence is lower than for non-occurrence, even after the weighted loss function is applied, which is a general problem brought about by imbalanced classes in deep learning and a problem that needs to be solved in the future. With the upsampling module in the encoder part and the dynamic weighted loss function, the F1 score for the occurrence improves by 13%, 11%, 6%, and 6% in the four regions, respectively, while the performance for non-occurrence shows no noticeable improvement.
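The 0.46% occurrence rate quoted for the grids where the model fails can be checked by hand. The calculation below assumes that June through August spans 30 + 31 + 31 = 92 days and that each day contributes eight 3-hr intervals; the symbol N for the number of intervals is introduced here for illustration.

```latex
N = (30 + 31 + 31) \times 8 = 736,
\qquad
\frac{3.40}{N} = \frac{3.40}{736} \approx 0.0046 = 0.46\%
```

The result matches the rate stated in the text, which confirms that the rate is measured per 3-hr interval within a single summer season.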

The Time Dependence of Deep Convection
The model's performance in predicting the initiation and end of deep convection with and without time dependence has also been compared. The results are shown in Figure 6. Note that the colorbar is set within the range of 0 to 0.5 for better comparison, and areas where the accuracy score is greater than 0.5 are not distinguished in the colorbar (they appear dark purple in the figure). The two models have similar performance in the South and East Asia and West Pacific Ocean regions. However, models trained with time dependence perform better in Central America and North Africa than those trained without it. Furthermore, given temporal information, the model can predict regions with little or no deep convection with higher accuracy, for example, the northeast of North Africa, which shows the usefulness of temporal information in predicting rare events.

In this part, we compare against single-column-oriented machine learning methods such as NN and the XGBoost tree classifier (XGBoost) to see the performance improvement brought by adding the surrounding information. We follow the process in Ukkonen and Mäkelä (2019) to adjust the parameters. The input data of both single-column-oriented methods are the 33 variables shown in Table 1 at each grid. Therefore, only the information at the grid itself is included during the machine learning process, which differs from our proposed method that considers the horizontal dependence. … precision, recall, and F1 scores for the three methods. The two single-column-oriented methods do not achieve a good balance between recall and precision, especially for the occurrence, even after adjusting the class weight during the training process, resulting in lower F1 scores.

Comparison Against a Unified Model
To make ResU-Deep more applicable to GCMs, we train a unified model and evaluate its performance in the four regions. We use data from all four regions from 2015 to 2019 in June, July, and August. The first 3 years are used to train the model, and data from 2018 are used for validation. Then we apply the unified model to the four regions separately with data from 2019. The results are shown in Figure 7.
Judging from the PR curves and the AUC values for the occurrence class, the unified model shows no obvious decrease in performance compared with the region-specific models. For macro precision, recall and F1 scores, the unified model outperforms the regional model in the West Pacific Ocean. The results indicate that more training data can improve the performance and increase the model's robustness.

Diurnal Cycle Simulation
Apart from identifying the deep convection occurrence, the simulation of the diurnal cycle of convection is another important application area for the trigger function (Lee et al., 2008; Rio et al., 2009; Suhas & Zhang, 2014; Xie et al., 2019). The diurnal cycle is one fundamental mode of climate variability and reflects the energy budget, hydrology and surface temperature. Therefore, diurnal cycle simulation is often used as a benchmark for evaluating the trigger function.

In Central America, the modeling results in regions east of the American Cordillera, around 22.5°N∼28°N, show a 3-hr lead relative to the observation. Besides, the model fails to simulate the variation of peak time near the equator over the ocean between 115°W and 85°W. The Gulf Stream and the Equatorial Counter Current influence the climate in these two regions, respectively. When the wind blows across the sea surface temperature (SST) front toward warmer waters, it changes the atmospheric layers at different levels, resulting in anomalous turbulent heat fluxes out of the ocean and enhancing atmospheric convection (Putrasahan et al., 2017). Besides, prominent climatic asymmetries are present in the eastern tropical Pacific, with a cold tongue region with scarce convective activity near the equatorial waters, in contrast to the intense convective activity north of the equator (Deser & Wallace, 1990; Philander et al., 1996). According to Amador et al. (2006), a cloud belt with high cloud coverage between the equator and 10°N inhibits solar radiation from reaching the ocean surface. Besides, this region is characterized by a strong meridional SST contrast and an upwelling zone with seasonal and interannual variations.

In North Africa, the model captures the peak time in regions such as the Congo Basin and the Arabian Peninsula. The simulation results in the Sahara region show a 3-hr lag at some grids. One possible reason is the rarity of precipitation events in these regions, which results in extremely low sample numbers of deep convection and thus limits the model's ability to learn the regular patterns of deep convection against the overwhelming number of negative samples.

In East Asia, some biases exist on the Deccan Plateau and around the Himalayas. In these two areas, the precipitation and moisture transport are complex (Dong et al., 2016; Sen Roy, 2009), and the interaction between them is elusive. Adding more variables or adjusting the convolutional kernel sizes in line with the mechanism may alleviate the biases. However, further information is required to guide the modeling in these regions.

The modeling results in the West Pacific Ocean are not as good as on land, showing a 3-hr lag. One possible reason is the importance of terrain information in modeling deep convection in the ResU-Deep trigger function. The altitude in the West Pacific region is uniform, so the model cannot draw on the useful information that complex terrain provides. Besides, this region experiences complex sea-air interaction. Belonging to the Western Pacific Warm Pool, it undergoes a complex and unique ocean circulation. Several major tropical ocean currents, such as the North Equatorial Current, the Mindanao Current, and the Kuroshio, originate in or flow through this region, and the Northern and Southern Hemisphere currents converge here. As stated in Hu et al. (2020), small perturbations in such regions can induce significant effects on tropical convection and global climate.
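The peak-time comparison discussed in this section, including the 3-hr leads and lags, can be sketched as follows. This is a simplified illustration for a single grid cell: the function names are hypothetical, time step 0 is assumed to fall at 0000 UTC, and ties in frequency are resolved by taking the earliest slot, which is a choice made here for simplicity.

```python
from collections import Counter
from typing import Iterable, List

STEPS_PER_DAY = 8   # eight 3-hr intervals per day
HOURS_PER_STEP = 3


def diurnal_frequency(event_steps: Iterable[int]) -> List[int]:
    """Count events in each 3-hr slot of the day.

    `event_steps` holds the indices of the 3-hr time steps (0, 1, 2, ...) at
    which deep convection occurred; step 0 is assumed to be 0000 UTC.
    """
    counts = Counter(step % STEPS_PER_DAY for step in event_steps)
    return [counts.get(slot, 0) for slot in range(STEPS_PER_DAY)]


def peak_hour_utc(event_steps: Iterable[int]) -> int:
    """UTC hour of the slot with the most events (earliest slot on ties)."""
    freq = diurnal_frequency(event_steps)
    return freq.index(max(freq)) * HOURS_PER_STEP


def circular_lag_hours(model_peak: int, observed_peak: int) -> int:
    """Signed lag (model minus observation) wrapped into [-12, 12) hours."""
    return (model_peak - observed_peak + 12) % 24 - 12


# Toy event lists for one grid cell (illustrative values only).
observed_steps = [5, 13, 21, 6, 2]
modeled_steps = [6, 14, 22, 5]
lag = circular_lag_hours(peak_hour_utc(modeled_steps), peak_hour_utc(observed_steps))
```

A positive value means the modeled peak comes later than the observed one (a lag) and a negative value means it comes earlier (a lead); wrapping the difference keeps a peak at 0000 UTC and one at 2100 UTC 3 hr apart rather than 21 hr apart.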

Comparison Against the Variants of CAPE Trigger Functions
We further compare the diurnal cycle and peak time with results obtained from CAPE-based trigger functions, which are commonly used in the deep convection schemes of climate models such as the Community Atmosphere Model and the Energy Exascale Earth System Model (Bogenschutz et al., 2018; Y. Wang & Zhang, 2016; Xie et al., 2018). Five variants of the CAPE trigger function, namely undilute CAPE, undilute dCAPE, dilute CAPE, dilute dCAPE and dCAPE_ULL, are evaluated and compared in the Southern Great Plains (SGP) in the central United States. We choose the SGP to examine the modeling ability of the ResU-Deep trigger function at mid-latitudes.
The SGP of North America spans Texas, Oklahoma, and Kansas. It experiences extremely diverse climatic regimes (Kloesel et al., 2018), and has some of the deepest and most intense thunderstorms outside the tropics (Liu & Zipser, 2015; Maupin et al., 2021; Schumacher & Rasmussen, 2020). To focus on this region, we build a ResU-Deep trigger with a smaller size around SGP, with the range of 32°N∼40°N and 105°W∼96…

CAPE is the vertical integral of the buoyancy of an air parcel lifted from the boundary layer to the neutral buoyancy level (G. Zhang, 2009) and is defined by

$$\mathrm{CAPE} = \int_{Z_f}^{Z_n} g \, \frac{T_{vp} - T_{ve}}{T_{ve}} \, dz$$

where g is the gravitational acceleration, T_vp and T_ve are the virtual temperatures of the air parcel and the environment, and Z_f and Z_n are the heights of the free convection level and the neutral buoyancy level, respectively. Convection is initiated when CAPE exceeds a threshold value.
Based on the CAPE trigger function, Xie and Zhang (2000) added the large-scale forcing into the CAPE closure to develop the dCAPE trigger function, which refers to the CAPE generation rate from large-scale forcing in the free troposphere. In its definition, T and q are the temperature and specific humidity, and adv(T) and adv(q) are the temperature and moisture changes caused by the large-scale advection in the parcel and the ambient environment (G. J. Zhang, 2002). Considering the impact of convection on the El Niño-Southern Oscillation (ENSO), Neale et al. (2008) introduced the entrainment effect into CAPE and modified CAPE into the dilute CAPE. The entropy S of an ascending parcel is governed by an entrainment equation in which m is the mass of the air parcel, ɛ is the entrainment from the environment per unit height, and S̄ is the entropy of the environmental air. Once the entropy S at each level is determined, the temperature and specific humidity of the air parcel are modified accordingly. For dCAPE_ULL, the ULL trigger alleviates the boundary layer constraint by scanning the source layer from the surface up to 600 hPa. For dilute CAPE and CAPE, we use 70 J/kg as the criterion. For dilute dCAPE and dCAPE, we use 65 J/kg/h as the criterion.
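A threshold-based CAPE trigger can be sketched numerically. The example below assumes the standard buoyancy-integral form of CAPE, g (T_vp − T_ve) / T_ve integrated from Z_f to Z_n, evaluated with the trapezoidal rule on a discrete profile. The function names, the value of g, and the three-level profile are illustrative assumptions; the 70 J/kg and 65 J/kg/h criteria are the ones quoted in the text.

```python
from typing import Sequence

G = 9.81                 # gravitational acceleration (m/s^2), assumed value
CAPE_THRESHOLD = 70.0    # J/kg, criterion quoted for CAPE and dilute CAPE
DCAPE_THRESHOLD = 65.0   # J/kg/h, criterion quoted for dCAPE and dilute dCAPE


def cape_trapezoid(
    heights_m: Sequence[float],
    t_parcel_k: Sequence[float],
    t_env_k: Sequence[float],
) -> float:
    """Trapezoidal integral of g * (Tvp - Tve) / Tve over the given heights.

    The profile is assumed to run from the level of free convection to the
    level of neutral buoyancy, so the integrand is non-negative.
    """
    buoyancy = [G * (tp - te) / te for tp, te in zip(t_parcel_k, t_env_k)]
    cape = 0.0
    for i in range(1, len(heights_m)):
        dz = heights_m[i] - heights_m[i - 1]
        cape += 0.5 * (buoyancy[i] + buoyancy[i - 1]) * dz
    return cape


def exceeds_trigger(value: float, threshold: float = CAPE_THRESHOLD) -> bool:
    """Convection is triggered when the value exceeds the threshold."""
    return value > threshold


# Hypothetical three-level profile: heights in m, virtual temperatures in K.
heights = [1000.0, 2000.0, 3000.0]
parcel = [290.0, 285.0, 278.0]
environment = [290.0, 282.0, 278.0]
cape_value = cape_trapezoid(heights, parcel, environment)
triggered = exceeds_trigger(cape_value)
```

The same `exceeds_trigger` helper can be applied to a dCAPE value by passing `DCAPE_THRESHOLD`; computing dCAPE itself requires the advective tendencies and is not attempted here.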
Figure 10 shows the peak time of the diurnal cycle at each grid and the RMSE values for the time step of the maximum frequency of deep convection. Figure 11 compares the diurnal cycle between the CAPE-based trigger functions and the model output. The number of convective events is normalized by the grid size in this region. The ResU-Deep, dilute dCAPE and dCAPE trigger functions capture the peak time of the diurnal cycle at each grid well compared with the other three trigger functions. However, all the CAPE-based trigger functions except the dilute dCAPE overestimate the occurrence of deep convection. From this perspective, our model is able to model the occurrence of deep convection at mid-latitudes with fewer positive samples (4%), and the model can be built on much smaller regions or grid sizes for specific research purposes.

Model Validation on the Southern Hemisphere Tropics
The MAO observation site is in the Amazon basin at 3°06′S, 60°01′W. We choose MAO to test the applicability of our model to the tropical region in the Southern Hemisphere all year round instead of only in June, July, and August. We use ERA5 and TRMM data from 2011 to 2015. The first 3 years are used to train the trigger function, data from 2014 are used for validation, and data from 2015 are used to test the performance. The model is trained on the area 0°∼30°S, 95°W∼25°W, which has the same size as the regions used for the models built in the Northern Hemisphere.
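The year-based split described above can be sketched as follows. This is a minimal illustration, assuming 3-hourly time stamps covering 2011 to 2015; the function name `split_by_year` and the use of NumPy datetimes are hypothetical choices, not the authors' code.

```python
import numpy as np

def split_by_year(times, train_years, val_years, test_years):
    """Return boolean masks selecting train/validation/test samples by calendar year."""
    # Truncate each time stamp to its year (datetime64[Y] counts years since 1970).
    years = times.astype("datetime64[Y]").astype(int) + 1970
    return {
        "train": np.isin(years, train_years),
        "val": np.isin(years, val_years),
        "test": np.isin(years, test_years),
    }

# Assumed 3-hourly time axis for 2011-2015 (all year round, as for the MAO experiment).
times = np.arange(np.datetime64("2011-01-01T00"),
                  np.datetime64("2016-01-01T00"),
                  np.timedelta64(3, "h"))
masks = split_by_year(times, [2011, 2012, 2013], [2014], [2015])
```

Splitting by whole years keeps the validation and test periods strictly after the training period, so no time step is shared between the subsets.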
Figure 12 compares the observation and modeling results of the peak time in the diurnal cycle at each grid, similar to Figure 8. The model can capture the peak time in the diurnal cycle at most grids, especially along the coastlines, which shows that ResU-Deep can be applied effectively to the tropical region in the Southern Hemisphere all year round. However, there are some biases over the East Pacific Ocean south of 15°S. Several reasons may cause this bias. The main factor might be the extremely low number of occurrence samples in this region, which hinders the model from learning the mechanism of deep convection. Besides, the environmental factors around this region, such as the Humboldt Current and the Andes, are complex. Adding more features describing the ocean and the ocean-atmosphere relationship may help the model learn the mechanisms and reduce the bias.

Interpretation of Feature Importance
Unlike trigger functions derived directly from physical equations, the ResU-Deep trigger function is a black-box model, which is a major concern when applying deep learning methods. To understand the physical mechanisms behind this model, the importance of each feature should be interpreted. By gaining insight into the contribution of each variable, the model can give a general description of the relationship between the occurrence of deep convection and each variable. Among the ways to explain feature importance, we choose the relative RMSE value to quantify one variable's contribution. In this study, we randomly permute one variable at a time and send it into the trained model together with the other variables to predict the occurrence of deep convection. We repeat this process 10 times and average the results to reduce the uncertainty. The RMSE between the predicted results and the labeled data can then be obtained. The relative RMSE is calculated from RMSE_permute, the RMSE after one variable is randomly permuted, and RMSE_origin, the RMSE between the label and the prediction with all variables at their true values. A high relative RMSE means that eliminating one variable causes a great loss to the original model, indicating that the variable contributes more to the model. Therefore, the higher the relative RMSE, the more important the variable is. However, since this model cannot detect causality between deep convection and these variables, we cannot conclude whether the variables are indicators or results of deep convection events. Nevertheless, it is clear that variables with higher contributions are more closely related to deep convection than other variables in the ResU-Deep trigger.
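The permutation test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the relative RMSE is taken as (RMSE_permute − RMSE_origin) / RMSE_origin, the variable is shuffled along the sample axis, and `predict` stands in for the trained model. None of these details is confirmed by the text, and the function names are hypothetical.

```python
import numpy as np

def rmse(pred, target):
    """Root mean square error between predictions and labels."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def relative_rmse(predict, features, labels, var_index, n_repeats=10, seed=0):
    """Relative RMSE increase when one variable is randomly permuted.

    features has shape (samples, variables, ...); predict maps features to
    predictions with the same shape as labels. Assumes RMSE_origin > 0.
    """
    rng = np.random.default_rng(seed)
    rmse_origin = rmse(predict(features), labels)
    permuted_scores = []
    for _ in range(n_repeats):
        shuffled = features.copy()
        order = rng.permutation(features.shape[0])
        # Assumption: break the variable's link to the labels by shuffling samples.
        shuffled[:, var_index] = features[order, var_index]
        permuted_scores.append(rmse(predict(shuffled), labels))
    rmse_permute = float(np.mean(permuted_scores))
    return (rmse_permute - rmse_origin) / rmse_origin
```

With this definition, a variable the model ignores scores close to zero, and the ranking of variables is unchanged if the ratio RMSE_permute / RMSE_origin is used instead.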
Figure 13 lists the top 15 variables that contribute to the models in each of the four study areas. Some common variables exist. One interesting finding is that the convection field from 3 hr earlier ranks high in all four regions, and the total column water vapor and the terrain height information also have a significant impact except in North Africa. Besides, when we change the range of the ocean area from 90°E-180°E to 130°E-180°E, which removes the land in East and Southeast Asia, the terrain height information ranks only 25th among all 33 variables (not shown) in the tropical ocean region. This indicates that the topography information is essential to the occurrence of deep convection. For example, the ratio of deep convective precipitation differs from that of other places along the Rocky Mountains and the Andes in Central America, and along the Western Ghats and near the Tibet Plateau in South and East Asia. However, in Africa near 15°N, where deep convection often occurs, the geography is not as complex as in the other three regions. As a result, the humidity- and dynamics-related variables rank higher than the geographic information in North Africa. Besides, according to Neelin et al. (2009) and Peters and Neelin (2006), convective precipitation is highly correlated with the total column water vapor.
Since in these studies the criterion for detecting deep convection is convective precipitation larger than 0.5 mm/hr, our model can detect the strong relationship between deep convection and water vapor at each grid in all four areas. Furthermore, the importance of the convection field from 3 hr earlier indicates the temporal dependence of the deep convection system in the trigger function. Considering previous convection is of great help when predicting deep convection at the current time step, which reflects the development of convection clusters over a period of time.
Besides these common variables, Figure 13 also demonstrates the inhomogeneity of the deep convection mechanisms in the four regions. For the Central America region, thermal-related variables such as surface temperature, surface latent/sensible heat flux, and surface net solar radiation contribute much to deep convection. In North Africa, humidity-related variables such as relative humidity and the moisture flux are the dominant factors. Also, the vertical wind shear in the u direction is an important indicator, since the zonal wind at 925, 700, and 300 hPa all rank high in North Africa. In South and East Asia, the thermal- and dynamics-related variables have significant impacts, while in the West Pacific Ocean the situation is complex. We further examined the feature importance of the unified models trained in Section 4.1.4, and the five most important features are the deep convection field from 3 hr earlier, the specific humidity at 700 and 500 hPa, the u component of wind at 700 hPa, and the terrain height information.

Discussion and Conclusion
This study presents a novel and efficient deep convection trigger function based on a convolutional neural network with three modifications to the classical U-net. After modification, training is around 6∼10 times faster, and the model can fuse features at different scales as well as alleviate the problem of an imbalanced training data set. For the occurrence case, the model has F1 scores of 58%, 53%, 60%, and 63% for the four study regions, respectively. The model predicts the non-occurrence case well, with an average value of 96% for the precision, recall, and F1 score. Considering the performance over the two classes, the model has macro F1 scores of 77%, 76%, 77%, and 79%, respectively. Unlike trigger functions that do not use machine learning methods, this model uses many environmental variable fields as learning features. By applying this model to different regions, it can easily learn the important features related to deep convection in each region.
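The per-class and macro scores quoted above can be computed as in the following sketch, assuming the standard definition of macro F1 as the unweighted mean of the per-class F1 scores. The function names are illustrative, not taken from the authors' code.

```python
import numpy as np

def class_scores(y_true, y_pred, cls):
    """Precision, recall, and F1 score when `cls` is treated as the positive class."""
    tp = np.sum((y_pred == cls) & (y_true == cls))
    fp = np.sum((y_pred == cls) & (y_true != cls))
    fn = np.sum((y_pred != cls) & (y_true == cls))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the F1 scores of class 0 (non-occurrence) and class 1 (occurrence)."""
    return 0.5 * (class_scores(y_true, y_pred, 0)[2] + class_scores(y_true, y_pred, 1)[2])
```

Under this definition, an occurrence F1 score of 58% combined with a non-occurrence F1 score near 96% gives a macro F1 score of about 77%, in line with the first region.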
The comparison against two single-column-oriented machine learning methods shows that taking the surrounding information into account and building the horizontal dependence into the model helps improve the performance. Besides, a deep-learning-based model can facilitate an automated parameter tuning process, thus enabling an easy switch to different regions and scenarios. The model structure does not need to be changed according to the size of the data set, and the hyperparameters of the loss function are defined by the ratio of positive and negative samples. In this way, the parameters of the model can be automatically adjusted according to the data. In contrast, the two single-column methods have more parameters to adjust: the number of boosting rounds, the maximum depth of each tree, and the maximum number of leaves for XGBoost; and the number of layers, the neurons in each layer, and the optimizer for the fully-connected neural network. Moreover, the parameters of these two methods must be manually adjusted according to the size of the data set, so finding a suitable set of parameters takes a long time when migrating the model to a new region. Finally, our unified model does not show an apparent decrease in performance compared with the region-specific models, which suggests that increasing the amount of training data can improve the robustness of a deep learning model.
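To illustrate how a loss hyperparameter can be set from the ratio of positive and negative samples, the following sketch uses a class-weighted binary cross-entropy. This is an assumed stand-in, not the dynamic weight used in ResU-Deep, whose exact form is not given in this section; the function names are hypothetical.

```python
import numpy as np

def positive_weight(labels):
    """Weight for the positive class: ratio of negative to positive samples."""
    n_pos = float(np.sum(labels))
    n_neg = float(labels.size) - n_pos
    return n_neg / n_pos

def weighted_bce(prob, labels, pos_weight, eps=1e-7):
    """Binary cross-entropy with the positive-class term scaled by pos_weight."""
    prob = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    loss = -(pos_weight * labels * np.log(prob) + (1.0 - labels) * np.log(1.0 - prob))
    return float(np.mean(loss))
```

Because the weight is derived from the label counts, it adapts to a new region without manual tuning, which is the property the paragraph above emphasizes.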
The simulation of the diurnal cycle and its peak is a challenge faced by many convection trigger functions. For the peak time simulation, ResU-Deep can capture the peaks over complex terrain such as the American Cordillera, the Arabian Peninsula, the Congo Basin, and the Indian Peninsula. The simulated diurnal cycle has small biases and follows the trend of the observation. Therefore, deep learning shows great potential in diurnal cycle simulation, especially in alleviating the overprediction of deep convection. We also compare our results with CAPE-based trigger functions in the SGP. ResU-Deep shows a promising modeling ability not only in tropical regions during summer seasons, but also in regions with different sizes and latitudes. In the future, ResU-Deep's ability to simulate the interannual character of deep convection can be investigated by extending the period covered by the training data set.
Due to the complexity of the internal structures of the deep-learning-based method, explaining its mechanism is difficult. Therefore, we use the relative RMSE to interpret the feature importance. The model indicates the importance of the water vapor, the terrain height information, and the convection field from 3 hr earlier. The importance of the deep convection field at previous time steps arises not only from the continuity of deep convection events but possibly also from its influence on the surrounding convection events at the current time step. Besides these common important variables, the different feature importance given by the ResU-Deep trigger reveals the inhomogeneity of deep convection mechanisms in different regions, underscoring the importance of location-specific modeling of deep convection. In Central America, variables related to heat account for a large proportion in the variable importance analysis, which agrees well with Magaña et al. (1999) and Maldonado et al. (2013). According to Laing et al. (2008) and Maurer et al. (2017), the humidity-related variables and the vertical wind shear greatly influence deep convection in Africa, and our model finds these regional features. In Yano et al. (2013), deep convection over the tropical ocean is controlled by the humidity due to weak horizontal temperature gradients. Our feature importance test shows that both thermal- and humidity-related variables affect the deep convection prediction.
There are some limitations in our work, which we could improve in the future. The performance for the occurrence case is not as satisfactory as for the non-occurrence case. One major problem is the imbalanced data set, although the weighted loss function has alleviated this problem to some extent. In the future, we will focus on improving the modeling accuracy of the occurrence case by adding related variables from climate models or by integrating the physical mechanisms of deep convection into the loss function or the model structure. However, there is always a trade-off between complex model design and accuracy: higher performance comes at the cost of slower training, larger model storage requirements, and the generalization issues of complex models. Therefore, achieving a balance between them needs further study.
Besides, we will improve the threshold values for labels. Using 0.5 mm/hr as the threshold may introduce some uncertainties. We have tested the sensitivity to different deep convective precipitation thresholds from 0.3 to 0.7 mm/hr, and the results are listed in Appendix C. In the future, more thresholds related to lightning or clouds can be incorporated to define a more robust indicator for deep convection. Also, data sets with higher temporal and spatial resolution, such as Global Precipitation Measurement (GPM-IMERG), can be used to improve the model's performance and decrease the uncertainties in labels (Huffman et al., 2020).
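The threshold-based labeling and its sensitivity test can be sketched as follows. This is a minimal illustration, assuming the input is already a deep convective precipitation rate in mm/hr and that the tested thresholds are spaced 0.1 mm/hr apart; both points, and the function name, are assumptions.

```python
import numpy as np

def label_deep_convection(conv_precip_mm_per_hr, threshold=0.5):
    """Label a grid point as deep convection (1) when the rate exceeds the threshold."""
    return (np.asarray(conv_precip_mm_per_hr) > threshold).astype(np.int8)

# Toy precipitation rates (mm/hr), used only to show how the label count changes.
rates = np.array([0.0, 0.3, 0.5, 0.6, 1.2])
counts = {t: int(label_deep_convection(rates, t).sum())
          for t in (0.3, 0.4, 0.5, 0.6, 0.7)}
```

Raising the threshold can only keep or reduce the number of positive labels, so the class imbalance becomes more severe at higher thresholds.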
At the current stage, it is challenging to integrate a convolutional neural network into the workflow of a GCM, which is column-oriented. There are some potential solutions. The first is to output the required environmental variables from the GCM at each time step and feed them into the ResU-Deep trigger function. Because inference with the ResU-Deep trigger function is fast, its output can be fed back to the GCM and used for the next time step. Another option is to move the trigger calculation from the physical process to the dynamical core, where data exchange between columns is allowed. However, this method is more complex and needs further development of both the dynamics and physics parts.

Table B2
The Hyperparameters of XGBoost Tree Method
Hyperparameter | Candidate values | Selected value
Maximum depth of each tree | 3, 5, 7, 9 | 7
Maximum number of leaves | / | No limit
L2 regularization | 0.0001, 0.001, 0.01 | 0.001
Positive weights | 1, 2, 3, 3.5, 4, 5 | 3.5

convection) increases, which means the model can perform slightly better in predicting the positive class and vice versa. Figure C2 lists the top 5 variables contributing to the model in the Central America region with different thresholds for deep convection. There is no significant difference between the thresholds, and "conv.t-1," "tcwv," and "geoheight" are essential in all cases. Other important variables such as "st," "sh_500," and "cape" also rank high in all five cases. The above-mentioned figures and table show little difference between thresholds from 0.3 to 0.7 mm/hr for defining deep convection.
column cloud ice water (kg/m²) | tciw
Total column cloud liquid water (kg/m²) | tclw
Total column water vapor (kg/m²) | tcwv
Instantaneous moisture flux (kg/m² s)^c | ie
Dynamics related
Vertical velocity at 500 hPa (Pa/s) | ver_500
Vertical velocity at 700 hPa (Pa/s) | ver_700
u component of wind at 300 hPa (m

Figure 1. The number of precipitation events (upper), the number of deep convective precipitation events (middle, with 0.5 mm/hr as the threshold), and the ratio between them (lower) at each grid in June, July, and August from 2015 to 2019. The four regions in blue boxes have a high ratio of deep convective precipitation. A: Central America, 116°W∼46°W; B: North Africa, 18°W∼52°E; C: South and East Asia, 60°E∼130°E; D: West Pacific Ocean, 110°E∼180°E. In the following text, we use A, B, C, and D to refer to these regions. All of them span latitudes 0°∼30°N.

Figure 2. The amount ratio of deep convective precipitation, the frequency ratio of deep convection events, and the occurrence ratio of deep convection in the four regions during 5 years in June, July, and August. The ratios of deep convection events averaged over the 5-year summer seasons are 8.48%, 4.18%, 12.19%, and 11.74% in the four regions, respectively.

Figure 3. Flow diagram of the ResU-Deep trigger. The final decision is made by the output of the decoder. The number on each block represents the number of channels. The width and height of each block can change with the input feature sizes.
Figure 4a shows the observed and modeled occurrence of deep convection at 0000 UTC on 15 August 2019, and the difference between them. The modeled results reflect most of the occurrence cases with similar locations and shapes. However, the blurring effects introduced by the CNN remain. For example, the ResU-Deep trigger cannot infer some of the isolated convective areas. Also, the margins of the detected convection clusters are smoother than in the real cases. Figure 4b depicts the total number of deep convection occurrences in June, July, and August 2019. The modeling result is similar to the observation with some overestimation, and shows that deep convection often occurs in Central America along the American Cordillera, the Guiana Highlands, and the Intertropical Convergence Zone over the East Pacific Ocean. The white areas in Figure 4c are the locations where the model fails to predict the occurrence. The major reason is the extremely low number of positive samples at these grids. The average occurrence among them is only 3.40 times during June, July, and August 2019, with a 3-hr time interval. This means the occurrence rate is only 0.46% under the threshold of 0.5 mm/hr. Therefore, it is difficult for our model to learn the characteristics of deep convection at these grids, and it tends to predict non-occurrence since there are not enough positive samples to describe them. Figure 4e shows the validation loss curves in the four regions during the training process. The training process ends when the loss values stop decreasing. The epochs shown on the x-axis describe the iteration steps during the training process. The ResU-Deep model achieves a 6∼10 times speedup compared with the traditional U-net while maintaining the same performance, showing the improved suitability of our proposed network for the specific convection trigger problem.
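The occurrence rate quoted above follows from simple arithmetic, sketched here under the assumption that every 3-hr time step in June, July, and August 2019 is available at these grids.

```python
# Number of 3-hourly time steps in June (30 days), July (31), and August (31).
DAYS_JJA = 30 + 31 + 31
STEPS_PER_DAY = 24 // 3
total_steps = DAYS_JJA * STEPS_PER_DAY  # 92 days x 8 steps per day = 736

# Average number of occurrences at the grids where the prediction fails.
mean_occurrences = 3.40
rate = mean_occurrences / total_steps
print(f"{rate:.2%}")  # prints 0.46%
```

At such grids, fewer than one time step in 200 is a positive sample, which is why the model tends toward the non-occurrence class there.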
In this part, we explore the time dependence of deep convection events by training a ResU-Deep model without deep convection information from previous time steps, while the other variables remain the same. The results are shown by the bars with an oblique line background in Figure 5. Removing the information from previous time steps results in a slight performance decrease in regions A, C, and D, but a clear performance drop in region B. This shows that the time dependence of deep convection is strong in tropical regions. Considering the structure of ResU-Deep, the importance of deep convection at previous time steps arises not only from the continuity of deep convection events. It may also come from its potential influence on the surrounding convection events at the current time step.

Figure 4. The comparison between the observation from the Tropical Rainfall Measuring Mission and the modeling results from the ResU-Deep trigger functions (a and b). (c and d) show the precision and recall for the occurrence and non-occurrence at each grid. (e) is the validation loss of ResU-Deep and U-net during the training process. In the legends, A, B, C, and D refer to Central America, North Africa, South and East Asia, and West Pacific Ocean, respectively. The gray dashed value is the final validation loss obtained with the ResU-Deep trigger function.

Figure 5. The performance of the ResU-Deep trigger function for the non-occurrence case (class 0) and the occurrence case (class 1). The bars with a color background show the results when the model is trained with previous time steps, and the bars with an oblique line background show the results when the model is trained without time dependence. In the legends, A, B, C, and D refer to Central America, North Africa, South and East Asia, and West Pacific Ocean, respectively.

Figure 6. The comparison between the prediction of both initiation and end of deep convection with (first row) and without (second row) time dependence.
The hyperparameters of NN and XGBoost, and the corresponding software packages, are listed in Appendix B. The results are shown in Figure 7. The PR curves show that the ResU-Deep model outperforms both NN and XGBoost in all four regions in predicting the occurrence, and the performance of XGBoost is slightly lower than that of both ResU-Deep and NN. In terms of the AUC values, the ResU-Deep model has the highest values in all four regions. Besides, we report the macro

Figure 7. The comparison between the fully-connected neural network, XGBoost Tree, and our proposed ResU-Deep model trained on separate regions and a unified ResU-Deep model trained on four regions. (a) shows the precision-recall curve and the area under the Receiver Operating Characteristic curve of the occurrence class. (b) is the macro evaluation metrics for different models in the four regions. Regions A, B, C, and D refer to Central America, North Africa, South and East Asia, and West Pacific Ocean, respectively.
Figure 8 compares the observation and modeling results of the peak time in the diurnal cycle with a 3-hr temporal resolution. The ResU-Deep trigger function can capture most diurnal cycle peaks. It also depicts the diurnal peak time of regions along the coastline well. However, there are still some biases. The biases in the four regions may have different causes.
Further interpretations of the feature importance in each region, including the terrain information, are given in Section 5. The root mean square error (RMSE) values for the time of maximum frequency of deep convection in the four regions are 4.8 hr (1.6 time steps), 3.9 hr (1.3 time steps), 4.5 hr (1.5 time steps), and 5.1 hr (1.7 time steps), respectively. The RMSE values can be reduced with an increased temporal resolution of labels. Adding features related to the ocean-atmosphere feedback, such as the cloud coverage, and near-surface environmental variables such as the lower-level wind field and upper ocean heat content can also help the model learn the interaction between sea and air, and may improve the prediction performance. Besides, we show the diurnal cycle of the modeling results from the ResU-Deep trigger function and the TRMM observations. Since the study regions span different time zones, we split the regions according to time zones and convert the time from UTC to local solar time. Figure 9 shows the diurnal cycles in the four regions. The y-axis depicts the number of deep convection occurrences normalized by the grid size in each time zone. Since the value is divided by the grid size, the diurnal cycle is flatter than the diurnal cycle on a single grid. However, it can still reflect the variation during the day. ResU-Deep gives similar orders of magnitude and variations to the TRMM observations. Therefore, ResU-Deep effectively captures the trends of the diurnal cycle in tropical areas.
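The time handling described above can be sketched as follows. This is a minimal illustration with several assumptions: local solar time is approximated as UTC plus 1 hr per 15° of east longitude, the peak time is the 3-hr step with the most occurrences, and peak-time differences are wrapped to ±12 hr before the RMSE is taken. The text does not state how the authors treat the wrap-around, and the function names are hypothetical.

```python
import numpy as np

STEP_HOURS = 3  # temporal resolution of the labels

def utc_to_local_solar_hour(utc_hour, lon_deg_east):
    """Approximate local solar time: 15 degrees of longitude per hour."""
    return (np.asarray(utc_hour) + np.asarray(lon_deg_east) / 15.0) % 24.0

def peak_hour(counts_by_step):
    """Hour of the 3-hr step with the most occurrences; last axis holds the 8 daily steps."""
    return np.argmax(counts_by_step, axis=-1) * STEP_HOURS

def peak_time_rmse(obs_peak_hour, mod_peak_hour):
    """RMSE of peak-time differences in hours, wrapped to the range [-12, 12)."""
    diff = np.asarray(mod_peak_hour) - np.asarray(obs_peak_hour)
    diff = (diff + 12.0) % 24.0 - 12.0
    return float(np.sqrt(np.mean(diff ** 2)))
```

Dividing an RMSE in hours by `STEP_HOURS` gives the value in time steps, for example 4.8 hr / 3 hr = 1.6 time steps.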

Figure 8. Comparison between the observation and modeling results on the peak time of the diurnal cycle at each grid in the four study areas. The time is interpreted in Coordinated Universal Time for better comparison. The colorbar represents the time in UTC format.
5°W. Ten years of JJA data from 1999 to 2008 of ERA5 and TRMM 3B42 are used. The first 6 years are used to train the model, data from 2005∼2006 are used to validate, and the performance is tested and compared in 2008 and 2009.

Figure 9. The diurnal cycle of deep convection simulated by the ResU-Deep trigger function, and the Tropical Rainfall Measuring Mission observation. The x-axis shows the local solar time, and the y-axis is the number of deep convection events normalized by the number of grids in each study area.

Figure 10. Comparison between the peak time of the diurnal cycle predicted with different trigger functions at each grid in the Southern Great Plains. The time is interpreted in Coordinated Universal Time with a 3-hr temporal resolution.

Figure 11. The diurnal cycle comparison of convective available potential energy (CAPE)-based trigger functions and model output in the Southern Great Plains.

Figure 12. Comparison between the observation and modeling results on the peak time of the diurnal cycle in MAO. The time is interpreted in Coordinated Universal Time with a 3-hr temporal resolution.

Figure 13. Feature importance of the top 15 variables in the four study areas. The importance is quantified by the relative root mean square error (RMSE) value.
and output layers.

Figure C1. Number of occurrences of deep convective precipitation under different thresholds.

Table 1
List of Predictors for the ResU-Deep Trigger Functions

Table C1
The Comparison of Model Performance Between Different Thresholds in Central America

Figure C2. The results of the feature importance test with different chosen thresholds in the Central America region.