Multi‐Task Learning for Tornado Identification Using Doppler Radar Data

Tornadoes, as highly destructive weather events, require accurate detection for effective decision-making. Traditional radar-based tornado detection algorithms (TDA) face challenges with limited tornado feature extraction capabilities, leading to high false alarm rates and low detection probabilities. This study introduces the Multi-Task Identification Network (MTI-Net), leveraging Doppler radar data to enhance tornado recognition. MTI-Net integrates tornado detection and estimation tasks to acquire comprehensive spatial and locational information. As part of MTI-Net, we introduce a novel backbone, the Multi-Head Convolutional Block (MHCB), which incorporates Spatial and Channel Attention Units (SAU and CAU). The SAU optimizes local tornado feature extraction, while the CAU reduces false alarms by enhancing dependencies among input variables. Experiments demonstrate the superiority of MTI-Net over the TDA, with a decrease in the false alarm rate from 0.94 to 0.46 and an increase in the hit rate from 0.23 to 0.81, highlighting the effectiveness of MTI-Net in handling small-scale tornado events.


Introduction
Tornadoes, as the most destructive intense convective weather events, can reach wind speeds of up to 130 m s⁻¹, thus presenting a significant threat to human life and property (Nouri et al., 2021). Therefore, achieving accurate and efficient tornado identification is crucial for mitigating the damages caused by tornadoes. This not only provides support for meteorological research but also holds paramount significance in the field of disaster reduction. However, tornadoes present a considerable challenge for real-time detection and identification due to their small spatial scale, short duration, and rapid movement (Lagerquist et al., 2020; McGovern et al., 2019). Doppler weather radar, with its high spatial and temporal resolution, has become an essential tool for tornado research (Segall et al., 2022). Many traditional tornado identification methods combine radar observations with empirical rules. For instance, Burgess et al. utilized Doppler radar to capture the radial velocity shear feature at the location of tornado occurrence, known as the Tornado Vortex Signature (TVS), which greatly contributed to the development of radar-based tornado detection algorithms (Burgess et al., 1975). Subsequently, the Mesocyclone Detection Algorithm (MDA) defined mesocyclones as rotating updrafts within a thunderstorm, which could be further compressed and intensified under suitable environmental conditions to eventually form a tornado (Stumpf et al., 1998; Zrnić et al., 1985). The Tornado Detection Algorithm (TDA) was built on the principle of mesocyclone detection, utilizing radial velocity information from radar Level-II data to identify vortex features (Mitchell et al., 1998). However, these traditional methods have limited performance; for instance, the identification accuracy of the MDA algorithm was found to be less than 50%, primarily due to its over-reliance on inherent physical assumptions (Wang et al., 2017).
With the development of artificial intelligence, advanced machine learning methods are now being applied to tornado identification tasks. Unlike traditional methods, these approaches can directly extract tornado features from radar data without heavily relying on specific physical assumptions. Taking the 23 variables in the MDA as direct inputs, Trafalis et al. used a Support Vector Machine (SVM) to predict the likelihood of tornado formation in the current environment (Trafalis et al., 2003). Furthermore, Zeng et al. introduced the TDA-RF algorithm based on the Random Forest (RF) classification algorithm (Zeng et al., 2022). The TDA-RF algorithm can identify tornadoes in real time by utilizing Level-II features of Chinese S-band Doppler weather radar, including radial velocity, reflectivity, and spectrum width.
Although significant progress has been made in applying machine learning methods to identify tornadoes, these methods require extensive preprocessing of radar data, potentially resulting in the loss of critical information inherent in radar data (Wimmers et al., 2019). In contrast, deep learning methods possess powerful end-to-end learning capabilities and can directly process raw data without manual feature extraction (Han et al., 2021). Consequently, some studies have attempted to integrate deep learning methods into tornado identification tasks. For example, Basalyga et al. developed a three-layer CNN network specifically tailored for identifying tornado patterns (Basalyga et al., 2021). To further enhance performance, they also integrated a real-time sample augmentation technique within the CNN framework to address the issue of imbalanced tornado samples. However, deep learning methods often rely on basic CNN architectures with relatively limited receptive fields, which may restrict their ability to comprehensively capture and analyze spatial information related to tornado events in the input data.
In this study, we introduce a novel multi-task learning based tornado identification network (MTI-Net). MTI-Net adopts a multi-task learning architecture, seamlessly integrating tornado/no-tornado detection and tornado number estimation tasks within a unified model. Through mutual learning and synergy between these tasks, the model comprehensively captures the spatial distribution and positional information of tornadoes from the input data. To more precisely capture the localized details of tornadoes, we introduce an innovative backbone, the Multi-Head Convolutional Block (MHCB), inspired by Transformer blocks. MHCB integrates both Spatial Attention Units (SAU) and Channel Attention Units (CAU). Specifically, the SAU leverages multi-head spatial attention to extract tornado-related details from multiple representation subspaces in parallel. Concurrently, the CAU incorporates multi-head channel attention to selectively enhance dependencies among the input variables, thereby further suppressing noise interference in the data.
The remainder of this article is structured as follows: Section 2 outlines the data utilized in this study. Section 3 provides an in-depth overview of the architecture of the proposed method. Section 4 presents the experimental results. Finally, Section 5 summarizes the study.

Data Set
The research domain for this study spans 18–41°N, 109–123°E. This region was chosen based on several key criteria that made it suitable for our study. Firstly, tornadoes are often observed in this area. Furthermore, the study area experiences a monsoon climate, characterized by distinct wet and dry seasons (Chen et al., 2018). Between April and October, the region is prone to severe convective weather events, which can trigger tornado formation. Additionally, our preliminary statistical analysis from 2017 to 2023 revealed that there were over 70 tornadoes of EF2 intensity or higher in the region, highlighting the severity of tornado activity in the area. Moreover, the presence of 109 S-band meteorological radars in the region enhanced the quality and availability of radar data for our study compared to other types of radar systems.
Utilizing Doppler radar data and tornado observation records from 2017 to 2023 provided by the National Meteorological Center (NMC) of China, we established the first tornado dataset based on fine radar structural features in China. Considering that most tornadoes occur in low-altitude regions, we utilized data from the lowest three elevation angles of the radar to construct the dataset, limiting it to a radius of 150 km centered on the radar station.
In this study, we focused on three key variables: reflectivity, radial velocity, and spectrum width, to gain insight into tornado characteristics.
Reflectivity: Reflectivity represents the atmospheric conditions in the tornado region, providing crucial insights into the precipitation, its intensity, and the types of particles present in the atmosphere. This information is essential for understanding the humidity and environmental factors contributing to tornado formation.

Radial velocity:
Radial velocity reveals the internal airflow dynamics of tornadoes, which helps to provide a deeper understanding of the rotational nature of tornadoes, and can also provide insight into the strength of the tornado.
Spectrum width: Spectrum width indicates the turbulence of the tornado, with a larger spectrum width typically indicating irregularities in air movement and higher turbulence.
The radar dataset, after manual selection and determination, spans from 2017 to 2023 and was divided into a training subset, a validation subset, and a test subset. Specifically, data from 2017 to 2022 were divided into training and validation subsets. In each training batch, 85 samples were selected as the validation subset, with the remaining 765 samples constituting the training subset. Additionally, to fully evaluate the model's performance, 367 samples from 2023 were designated as the independent test subset.
After determining the dataset, to facilitate the model learning process, we conducted necessary preprocessing. Firstly, we resized the data to 9 × 1,200 × 1,200, representing the number of channels, height, and width, respectively. Subsequently, we normalized each variable: reflectivity and spectrum width were normalized to the range [0, 1], while radial velocity was scaled to the range [−1, 1], aiming to enhance the convergence and effectiveness of the model.
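As a sketch of the preprocessing step above, the following normalizes each variable channel-wise. The channel layout (three elevation angles per variable) and the physical value ranges used for scaling are illustrative assumptions, not values stated in the paper:

```python
import numpy as np

def normalize_sample(sample):
    """Min-max normalize a radar sample of shape (9, H, W).

    Assumed channel layout (hypothetical): channels 0-2 reflectivity,
    3-5 radial velocity, 6-8 spectrum width (three elevation angles each).
    The physical value ranges below are illustrative assumptions.
    """
    out = sample.astype(np.float32).copy()
    # Reflectivity (dBZ) -> [0, 1]
    out[0:3] = np.clip(out[0:3] / 70.0, 0.0, 1.0)
    # Radial velocity (m/s) -> [-1, 1], preserving the sign of the flow
    out[3:6] = np.clip(out[3:6] / 30.0, -1.0, 1.0)
    # Spectrum width (m/s) -> [0, 1]
    out[6:9] = np.clip(out[6:9] / 10.0, 0.0, 1.0)
    return out
```

Keeping the sign of radial velocity (hence the [−1, 1] range) preserves the inbound/outbound couplet that the TVS feature relies on.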
Deep learning models often require abundant training data due to their complex architectures and numerous parameters. However, the available dataset is insufficient for comprehensive model training. To overcome this limitation, we implemented two data augmentation techniques during training: random rotation and random cropping. Random rotation enhances the dataset by randomly rotating the radar data around the radar station. Random cropping increases the number of samples by randomly cropping the radar data into patches with a window size of 400 × 400. To ensure that a cropped patch contains the appropriate tornado information, the center of the tornado must fall inside the cropped patch.
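The random-cropping constraint above (the tornado center must fall inside the 400 × 400 patch) can be sketched as follows; the function name and the (row, column) coordinate convention are ours:

```python
import numpy as np

def random_crop_keep_tornado(sample, tornado_yx, size=400, rng=None):
    """Randomly crop a (C, H, W) radar sample to (C, size, size) such that
    the tornado center stays inside the crop. The window size 400 follows
    the text; everything else is a sketch of the augmentation."""
    rng = rng or np.random.default_rng()
    _, h, w = sample.shape
    ty, tx = tornado_yx
    # The crop's top-left corner must satisfy top <= ty < top + size (and
    # likewise for the column), while staying inside the full image.
    top_lo, top_hi = max(0, ty - size + 1), min(h - size, ty)
    left_lo, left_hi = max(0, tx - size + 1), min(w - size, tx)
    top = rng.integers(top_lo, top_hi + 1)
    left = rng.integers(left_lo, left_hi + 1)
    patch = sample[:, top:top + size, left:left + size]
    return patch, (ty - top, tx - left)  # tornado coords within the patch
```

Because the corner is sampled uniformly from the feasible interval, the tornado can land anywhere inside the patch, which also varies its position relative to the patch borders.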

Ground Truth
In this study, we accurately identify tornado targets based on tornado observation records and radar data to provide accurate ground truth for subsequent model training. Specifically, to obtain accurate tornado locations, we use the radar data together with the collected tornado observation records and strictly follow the steps below: Selection of Radar Station: Based on the recorded time and location of each tornado occurrence, choose the radar station closest to the corresponding time and location.

Identification of TVS Regions:
Starting from the lowest elevation angle in the base data of the selected radar station, identify the area near the recorded latitude-longitude with evident Tornado Vortex Signature (TVS) features.

Measurement of TVS Features:
Within the identified TVS region, record the maximum positive velocity (V_max) and the minimum negative velocity (V_min), along with the distance from the TVS center to the radar station.

Calculation of Rotation Velocity:
The rotation velocity of the tornado, V_rot, is computed using the formula V_rot = (V_max − V_min)/2, providing an indication of the rotational speed. The type of cyclone is determined based on this velocity.

Selection of Tornado Targets:
Based on the cyclone type, choose tornado targets, including weak cyclones, moderate-strength cyclones, and strong cyclones.
This study adopts a multi-task learning framework, dividing the tornado identification task into detection and estimation tasks. Specific ground truth is crucial for training and evaluating the model's performance on these tasks. To achieve this, we design specialized ground truth based on the identified tornado targets.
Detection Task Ground Truth: A binary map is created according to the marked tornado locations. In this map, each tornado location and the 5 × 5 area around it are labeled as "1," while all other areas are labeled as "0," providing a clear representation of the tornado's location in the ground truth.
Estimation Task Ground Truth: A heat map is generated using a Gaussian kernel to blur the binary map from the detection task.This heat map serves as a ground truth for the estimation task, capturing the spatial distribution of tornado occurrences.
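The two ground-truth maps described above can be constructed as below; the Gaussian kernel width `sigma` is an assumed value, since the paper does not specify it here:

```python
import numpy as np

def make_ground_truth(shape, centers, k=5, sigma=2.0):
    """Build the two ground-truth maps: a binary detection map (a k x k block
    of ones around each tornado center, k=5 per the text) and a Gaussian-blurred
    heat map for the estimation task. sigma is an assumed kernel width."""
    h, w = shape
    binary = np.zeros((h, w), dtype=np.float32)
    r = k // 2
    for (cy, cx) in centers:
        binary[max(0, cy - r):cy + r + 1, max(0, cx - r):cx + r + 1] = 1.0
    # Separable Gaussian blur (no SciPy dependency); kernel sums to 1,
    # so the heat map's total mass matches the binary map away from edges.
    t = np.arange(-3 * int(sigma), 3 * int(sigma) + 1)
    g = np.exp(-t**2 / (2 * sigma**2))
    g /= g.sum()
    heat = np.apply_along_axis(lambda m: np.convolve(m, g, mode="same"), 0, binary)
    heat = np.apply_along_axis(lambda m: np.convolve(m, g, mode="same"), 1, heat)
    return binary, heat
```

Because the blur preserves total mass, summing the heat map over a region gives a soft count of tornado occurrences there, which is what the estimation head regresses.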

Method
In this study, the tornado identification task is defined as a multi-task learning problem at the grid-point level, comprising a detection task and an estimation task. For this purpose, the proposed methodology utilizes a multi-task learning architecture that follows the principle of hard parameter sharing. It consists of a shared layer and two task-specific layers, each designed to serve the different requirements of its respective task.

Model Architecture
Figure 1 illustrates the architecture of the proposed Multi-Task Identification Network (MTI-Net). The shared layer in the model adopts a 5-layer encoder-decoder architecture, comprising an encoder, a decoder, and skip connections. The encoder is responsible for extracting low-level features and down-sampling input features. Given the relatively small scale and weaker features of tornado events, we introduce a novel backbone into the encoder, the Multi-Head Convolutional Block (MHCB), to enhance the ability of the model to discriminate tornado-related features. The MHCB consists of spatial attention units and channel attention units, utilizing a multi-head attention strategy to capture the dependencies among radar data across different dimensions in parallel.
To implement an effective down-sampling strategy, we introduce a depth embedding operation within the encoder structure.Built upon the SPD-Conv structure (Sunkara & Luo, 2022), the depth embedding operation can minimize the loss of crucial details, thereby enhancing the model's ability to extract small-scale tornado features.Within the encoder design, the number of convolutional kernels is sequentially set to 64, 120, 256, 512, and 1,024, facilitating layered feature extraction and dimension transformation.
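The space-to-depth rearrangement underlying SPD-Conv-style down-sampling can be sketched as follows: a (C, H, W) map becomes (4C, H/2, W/2) with no pixels discarded, in contrast to strided convolution or pooling:

```python
import numpy as np

def space_to_depth(x, scale=2):
    """Space-to-depth rearrangement as used by SPD-Conv-style down-sampling:
    every scale x scale spatial block is moved into the channel dimension,
    so down-sampling loses no pixels (a strided conv or pooling would)."""
    c, h, w = x.shape
    assert h % scale == 0 and w % scale == 0
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    x = x.transpose(0, 2, 4, 1, 3)               # (C, s, s, H/s, W/s)
    return x.reshape(c * scale * scale, h // scale, w // scale)
```

A follow-up convolution then mixes the enlarged channel dimension; the lossless rearrangement is what helps preserve small-scale tornado signatures through the encoder.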
The primary role of the decoder is to aggregate and refine the multi-scale features output by the encoder, aiming to achieve optimal outputs that align with the target. The backbone of the decoder comprises multiple standard convolutional blocks. Each block is organized sequentially as BN-ReLU-Conv (3 × 3), encompassing a batch normalization layer, a ReLU activation function, followed by a 3 × 3 convolutional layer. In addition, the skip connections form a direct bridge between the encoding and decoding phases to preserve fine-grained details.
The output layer of the model consists of two task-specific layers: a tornado/no-tornado detection layer and an estimation layer. Both layers share a similar network structure, comprising a 1 × 1 convolutional layer followed by an activation function layer. The detection layer utilizes a Sigmoid function to produce a binary classification output of either "0" or "1," offering a definitive determination regarding the presence or absence of tornadoes. Simultaneously, the estimation layer employs the ReLU function to generate regression outputs of tornado numbers, which provides a more detailed understanding of the distribution of tornadoes at specific spatial scales.
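A minimal sketch of the two task heads: a 1 × 1 convolution with one output channel is just a per-pixel linear map over the channel dimension, followed by Sigmoid (detection) or ReLU (estimation). The weights here are toy stand-ins, not trained parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def task_heads(features, w_det, w_est):
    """Sketch of the two task-specific output layers of the paper's model.
    features: (C, H, W); w_det, w_est: (C,) 1x1-conv kernels (one output
    channel each, biases omitted for brevity)."""
    logits_det = np.tensordot(w_det, features, axes=([0], [0]))  # (H, W)
    logits_est = np.tensordot(w_est, features, axes=([0], [0]))
    detection = sigmoid(logits_det)         # per-grid-point tornado probability
    estimation = np.maximum(logits_est, 0)  # non-negative count regression
    return detection, estimation
```

Hard parameter sharing means both heads consume the same decoder features; only these last 1 × 1 maps are task-specific.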

Spatial Attention Unit
Benefiting from the advantages of multi-head self-attention collaboration, transformer blocks have demonstrated significant effectiveness in various visual tasks and some weather forecasting applications (Z. Liu et al., 2021). However, in scenarios that require accurate identification of small-scale targets, such as tornado identification, the transformer block faces challenges in capturing local and global contextual correlations. To alleviate these limitations, we introduce a Spatial Attention Unit (SAU). The SAU aims to combine the efficiency of Convolutional Neural Networks (CNNs) in capturing local relationships with the collaborative advantages of transformer blocks.
As shown on the right of Figure 1, the SAU consists of two primary components: Multi-Head Spatial Attention (MHSA) and a Multi-Layer Perceptron (MLP). MHSA is responsible for capturing the contextual correlations between different local regions to better integrate and understand the local information. Afterward, the MLP performs a nonlinear transformation of the feature representation through a series of convolutional layers and activation functions. The SAU can be represented as follows:

$$\hat{z}^{l+1} = \mathrm{MHSA}(z^{l}) + z^{l}, \qquad z^{l+1} = \mathrm{MLP}(\hat{z}^{l+1}) + \hat{z}^{l+1} \tag{1}$$

where $z^{l}$ denotes the input feature, and $\hat{z}^{l+1}$ and $z^{l+1}$ are the outputs of MHSA and the MLP layer, respectively.
MHSA, inspired by multi-head self-attention, combines multiple attention heads to attend, in parallel, to information from different representational subspaces at different locations. The proposed MHSA can be summarized as follows:

$$\mathrm{MHSA}(z) = w_{p}\,\big[\mathrm{SA}_{1}(z_{1});\ \mathrm{SA}_{2}(z_{2});\ \ldots;\ \mathrm{SA}_{h}(z_{h})\big] \tag{2}$$

Here, $z = [z_{1}, z_{2}, \ldots, z_{h}]$ signifies the division of the input feature $z$ into a multi-head form using grouped convolution, where $z_{i}$ denotes each sub-feature after the grouped convolution operation. To promote information interaction across the multiple heads, we also equip MHSA with a learnable linear transformation $w_{p}$. $\mathrm{SA}_{i}(z_{i})$ is the single-head spatial convolution attention, which captures information from $h$ parallel representation subspaces, and can be defined as:

$$\mathrm{SA}_{i}(z_{i}) = w\,\Big(\mathrm{softmax}\big(Q_{s} K_{s}^{\top}/\sqrt{d_{k}}\big)\,V_{s}\Big) \tag{3}$$

where $Q_{s}$, $K_{s}$, and $V_{s}$ respectively denote the context encodings obtained from the convolution operation, and $d_{k}$ denotes the scaling factor. In addition, $w$ is a trainable parameter, which is used to learn the dependencies between local regions.
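A single spatial-attention head of this form can be sketched in NumPy as follows; toy linear projections stand in for the convolutional context encodings, and the outer trainable map w is omitted:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(z, Wq, Wk, Wv):
    """One spatial-attention head: SA(z) = softmax(Q K^T / sqrt(d_k)) V.
    z: (N, d) -- N local regions with d channels each; Wq/Wk/Wv: (d, d_k)
    toy linear projections standing in for the convolutional encodings."""
    Q, K, V = z @ Wq, z @ Wk, z @ Wv
    d_k = Q.shape[-1]
    attn = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)  # (N, N) region affinities
    return attn @ V
```

Each output row is a convex combination of the value vectors, weighted by how strongly that region attends to every other region.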

Channel Attention Unit
In specific regions with high reflectivity and low radial velocity, the identification model may exhibit pronounced tornado false alarms, potentially attributed to an over-reliance on the reflectivity channel. To address this issue, adjusting the weights of the various channels is essential. To this end, we design a Channel Attention Unit (CAU) based on the SAU structure.
As shown on the right of Figure 1, the CAU has the same structure as the SAU, consisting of Multi-Head Channel Attention (MHCA) and a Multi-Layer Perceptron (MLP). Therefore, the CAU can be represented as:

$$\hat{z}^{l+1} = \mathrm{MHCA}(z^{l}) + z^{l}, \qquad z^{l+1} = \mathrm{MLP}(\hat{z}^{l+1}) + \hat{z}^{l+1} \tag{4}$$

where $z^{l}$ denotes the input feature, and $\hat{z}^{l+1}$ and $z^{l+1}$ are the outputs of MHCA and the MLP layer, respectively.
Given an input feature $z$, grouped convolution is used to divide it along the channel dimension into a multi-head form as $z = [z_{1}, z_{2}, \ldots, z_{h}]$. Subsequently, single-head channel attention $\mathrm{CA}_{i}(z_{i})$ is used to capture channel dependencies:

$$\mathrm{CA}_{i}(z_{i}) = \mathrm{softmax}\big(Q_{c} K_{c}^{\top}/\sqrt{d_{k}}\big)\,V_{c} \tag{5}$$

where $Q_{c}$, $K_{c}$, and $V_{c}$ respectively denote the context encoding operations.
Finally, all the channel attention outputs $\mathrm{CA}_{i}(z_{i})$ are combined to obtain the output of the multi-head channel attention:

$$\mathrm{MHCA}(z) = w_{p}\,\big[\mathrm{CA}_{1}(z_{1});\ \mathrm{CA}_{2}(z_{2});\ \ldots;\ \mathrm{CA}_{h}(z_{h})\big] \tag{6}$$
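For contrast with the spatial head, a minimal single-head channel attention can be sketched as below: the affinity matrix is C × C over channels rather than N × N over positions. Using the input itself as Q, K, and V is a simplifying assumption of this sketch:

```python
import numpy as np

def channel_attention(z):
    """One channel-attention head (a sketch): affinities are computed between
    channels, so the C x C attention map reweights each input variable by its
    dependencies on the others. Q = K = V = z is a simplification here."""
    scores = z @ z.T / np.sqrt(z.shape[1])   # (C, C) channel affinities
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn = e / e.sum(axis=-1, keepdims=True)
    return attn @ z                          # channels re-mixed by attention
```

With z of shape (C, N) (channels by flattened spatial positions), the output keeps the spatial layout but mixes channels, which is how the CAU can down-weight an over-dominant reflectivity channel.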

Loss Function
In the detection task, given that a tornado is a small-scale instance, only a 5 × 5 region within the 400 × 400 training sample contains positive instances, resulting in a significant imbalance of positive and negative samples.
To address this problem, this study introduces the weighted binary cross entropy (BCE) loss (Aurelio et al., 2019) and the Dice loss (Q. Liu et al., 2021). These losses increase the weight of the positive samples to further address the imbalance between positive and negative samples. The estimation task is a standard regression problem, and the loss used for this task is the MSE loss (X. Li et al., 2021). These loss functions can be calculated as:

$$\mathcal{L}_{\mathrm{BCE}} = -\frac{1}{n}\sum_{i=1}^{n}\big[w\,y_{i}\log\sigma(\bar{y}_{i}) + (1 - y_{i})\log\big(1 - \sigma(\bar{y}_{i})\big)\big] \tag{7}$$

$$\mathcal{L}_{\mathrm{Dice}} = 1 - \frac{2\sum_{i=1}^{n} y_{i}\,\sigma(\bar{y}_{i})}{\sum_{i=1}^{n} y_{i} + \sum_{i=1}^{n}\sigma(\bar{y}_{i})} \tag{8}$$

$$\mathcal{L}_{\mathrm{MSE}} = \frac{1}{n}\sum_{i=1}^{n}(y_{i} - \bar{y}_{i})^{2} \tag{9}$$

where $y$ represents the ground-truth label, $\bar{y}$ represents the output logits of the model, $n$ represents the number of samples, $w$ is the positive-class weight, and $\sigma$ represents the sigmoid function.
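Under the definitions above, the three losses can be sketched as follows; the positive-class weight `w_pos` is an assumed hyperparameter, and a small epsilon / smoothing term is added for numerical stability:

```python
import numpy as np

def weighted_bce(y, logits, w_pos=10.0):
    """Weighted binary cross entropy: positive pixels are up-weighted by
    w_pos (an assumed value) to counter the 5x5-in-400x400 imbalance."""
    p = 1.0 / (1.0 + np.exp(-logits))
    eps = 1e-7
    return -np.mean(w_pos * y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def dice_loss(y, logits, smooth=1.0):
    """Dice loss on sigmoid probabilities; smooth avoids division by zero."""
    p = 1.0 / (1.0 + np.exp(-logits))
    return 1.0 - (2.0 * (y * p).sum() + smooth) / (y.sum() + p.sum() + smooth)

def mse_loss(y, pred):
    """Standard MSE for the estimation (regression) task."""
    return np.mean((y - pred) ** 2)
```

The Dice term is driven by the overlap between prediction and ground truth, so unlike plain BCE it is not dominated by the vast negative background.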

During the training of the model, if the optimization is based only on a direct summation of the losses from the detection and estimation tasks, the model may exhibit significant biases toward specific tasks, thus potentially degrading the performance of the other tasks. To address this, three learnable parameters, namely $\sigma_{1}$, $\sigma_{2}$, and $\sigma_{3}$, are introduced into the overall loss function. These parameters ensure that the model appropriately considers both tasks during the training process (Yang et al., 2023). The final loss function of the model is defined as:

$$\mathcal{L} = \frac{1}{2\sigma_{1}^{2}}\,\mathrm{loss}_{1} + \frac{1}{2\sigma_{2}^{2}}\,\mathrm{loss}_{2} + \frac{1}{2\sigma_{3}^{2}}\,\mathrm{loss}_{3} + \log\sigma_{1} + \log\sigma_{2} + \log\sigma_{3} \tag{10}$$

where $\mathrm{loss}_{1}$ and $\mathrm{loss}_{2}$ are the detection losses and $\mathrm{loss}_{3}$ is the estimation loss.
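The learnable σ-weighting appears to follow the standard homoscedastic-uncertainty scheme for multi-task losses; a sketch under that assumption, parameterizing log σ_i for numerical stability:

```python
import numpy as np

def combined_loss(losses, log_sigmas):
    """Uncertainty-weighted multi-task loss (a sketch of the scheme the
    learnable sigma_1..sigma_3 in the text appear to follow): each task loss
    is scaled by 1/(2*sigma_i^2), and a log(sigma_i) regularizer keeps the
    sigmas from growing without bound (which would zero out every task)."""
    total = 0.0
    for loss_i, log_s in zip(losses, log_sigmas):
        s2 = np.exp(2.0 * log_s)              # sigma_i^2, always positive
        total += loss_i / (2.0 * s2) + log_s  # weighted task term + regularizer
    return total
```

In training, the log σ_i would be ordinary trainable parameters updated by the same optimizer as the network weights, so the balance between detection and estimation is learned rather than hand-tuned.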

Evaluation Metrics
In this study, an object-based approach is employed for evaluating the model performance (Lagerquist et al., 2021). Initially, within a predefined threshold of 5 km, the actual tornado location is compared with the model-predicted location to establish evaluation metrics. To quantify the model's performance, we introduce three key concepts:

True Positive (TP):
The result is considered a True Positive when the tornado location identified by the model and the actual tornado location are within the set threshold. This means that the model successfully and accurately identified the tornado location.

False Positive (FP):
The result is considered a False Positive when the tornado location identified by the model and the actual tornado location are outside the set threshold. This means that the model incorrectly indicates the presence of a tornado.

False Negative (FN):
The result is considered a False Negative when an actual tornado existed within the set threshold but the model failed to identify it. This means that the model fails to capture the real tornado.
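The object-based matching of predicted and actual tornado locations within the 5 km threshold can be sketched as follows; greedy nearest-neighbor matching and kilometre-valued planar coordinates are simplifying assumptions of this sketch:

```python
import numpy as np

def match_objects(pred_xy, true_xy, threshold_km=5.0):
    """Object-based TP/FP/FN counting: a predicted tornado within
    threshold_km of an unmatched actual tornado is a TP; leftover
    predictions are FPs and leftover actual tornadoes are FNs."""
    pred = [np.asarray(p, dtype=float) for p in pred_xy]
    true = [np.asarray(t, dtype=float) for t in true_xy]
    tp, used = 0, set()
    for p in pred:
        dists = [np.linalg.norm(p - t) if i not in used else np.inf
                 for i, t in enumerate(true)]
        if dists and min(dists) <= threshold_km:
            used.add(int(np.argmin(dists)))   # each actual tornado matches once
            tp += 1
    return tp, len(pred) - tp, len(true) - tp  # TP, FP, FN
```

Marking each actual tornado as used after a match prevents one real event from absolving several nearby false detections.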
This study employs four established evaluation metrics from the atmospheric sciences to assess the proposed model's capability in identifying tornado objects: Critical Success Index (CSI), Probability of Detection (POD), False Alarm Rate (FAR), and Bias Score (Bias) (W. Li et al., 2023). These metrics are selected based on their relevance and significance in evaluating tornado detection models. The formulations for these metrics are delineated below:
Probability of Detection (POD): POD = TP/(TP + FN) measures the ratio of correctly identified tornadoes to the total number of actual tornadoes. It is particularly valuable in understanding the model's ability to detect true positive cases.
False Alarm Rate (FAR): FAR = FP/(TP + FP) quantifies the ratio of falsely identified tornadoes to the total number of predicted tornadoes. This metric helps in evaluating the model's propensity for false alarms or false positives.
Critical Success Index (CSI): CSI = TP/(TP + FP + FN) evaluates the overall performance of the model by considering both detection and false alarm rates. It provides a comprehensive measure of the model's skill in identifying tornadoes while minimizing false alarms.
Bias Score (Bias): Bias = (TP + FP)/(TP + FN) assesses the model's tendency to overestimate or underestimate tornado occurrences.
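The four metrics can be computed directly from the matched TP/FP/FN counts, in their standard forms:

```python
def detection_metrics(tp, fp, fn):
    """Standard contingency-table metrics:
    POD = TP/(TP+FN), FAR = FP/(TP+FP),
    CSI = TP/(TP+FP+FN), Bias = (TP+FP)/(TP+FN)."""
    pod = tp / (tp + fn) if tp + fn else 0.0
    far = fp / (tp + fp) if tp + fp else 0.0
    csi = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    bias = (tp + fp) / (tp + fn) if tp + fn else 0.0
    return {"POD": pod, "FAR": far, "CSI": csi, "Bias": bias}
```

A Bias above 1 indicates over-forecasting (more predicted than actual tornadoes), below 1 under-forecasting; CSI penalizes both misses and false alarms at once.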

Threshold Tests
During the construction of the model, we initially set five different binary classification thresholds: 0.5, 0.6, 0.7, 0.8, and 0.9, meaning that when the probability output by the model exceeds a given threshold, the output is identified as a tornado. In this section, we present in detail the impact of each threshold on model performance.
As shown in Figure 2a, as the threshold increased, the POD gradually decreased from 0.82 to 0.77. Conversely, the FAR dropped from 0.51 to 0.39, indicating that higher thresholds could significantly reduce false alarms, thereby enhancing the reliability of the model. The CSI improved from 0.44 to 0.51, indicating that with an increased threshold there was enhanced consistency between the model's predictions and actual occurrences. Combining these observations, we found the best balance at a threshold of 0.7, which achieved a moderate value across multiple performance metrics, maintaining high detection accuracy while effectively reducing false alarms.

Loss Function Tests
We introduce various loss functions into our multi-task learning framework, including weighted BCELoss, DiceLoss, and MSELoss, to optimize tornado detection performance. These loss functions play different roles in guiding the training process and improving the model's effectiveness in capturing various aspects of tornado features. In this section, we systematically use multiple evaluation metrics to comprehensively assess the effectiveness of different loss functions, thus exploring the benefit of employing the multi-task learning framework in the tornado identification task.
Figure 2b illustrates the evaluation results for different loss combinations. Using a single weighted BCELoss for tornado identification results in a high POD of 0.71, which indicates that this loss function is better at identifying positive categories to some extent. However, this is accompanied by a higher FAR of 0.50 and a relatively low CSI of 0.40. The introduction of DiceLoss reduces the FAR to 0.46 and slightly increases the CSI to 0.45, which suggests that the combination of both improves the accuracy of detection and reduces the number of false positives.
In addition, adding MSELoss improves the POD of the model to 0.80, which emphasizes its critical role in tornado number estimation. Notably, the integration of weighted BCELoss, DiceLoss, and MSELoss produces the highest CSI of 0.47, highlighting the effectiveness of the multi-task loss strategy in overall model optimization.

Data Augmentation Tests
In this section, we systematically explore the impact of data augmentation methods on model performance and verify their effectiveness in small-scale tornado identification tasks, thus providing useful references and guidance for small-scale weather identification tasks. The specific experimental settings are as follows: DA 1: no data augmentation. DA 2: only the random rotation method. DA 3: only the random cropping method. DA 4: both random rotation and random cropping methods.
The results in Figure 2c clearly indicate significant differences in model performance across the four methods. Firstly, comparing DA 1 and DA 2, the introduction of random rotation leads to improvements in all evaluation metrics. Particularly noteworthy is the decrease in the FAR score from 0.48 to 0.27, indicating that random rotation effectively suppresses false alarms. Meanwhile, the utilization of random cropping in DA 3 significantly improves the model's hit rate, suggesting that random cropping further enhances the model's ability to capture tornado detail information. When both random rotation and random cropping are applied simultaneously, the model's overall performance is further enhanced, with POD and FAR reaching 0.81 and 0.46, respectively. These results clearly demonstrate the significant impact of data augmentation on model performance, particularly the combined usage of random rotation and random cropping.

Results and Analysis
In order to demonstrate the performance of the proposed MTI-Net in identifying tornadoes, we select several groups of representative methods for comparison experiments on the test set. The specific comparison results are shown in Table 1.

Ablation Experiment
This study systematically designs a series of ablation experiments, progressively introducing various components to validate their contributions and effectiveness within the model. Through these ablation experiments, we can precisely identify which pivotal components of the model play a decisive role in tornado identification, thereby deepening our understanding of the model's mechanisms.
Based on the model structure, we design the following experiments: Baseline: The baseline model is an encoder-decoder structure composed of five-layer standard convolutional blocks. This serves as the reference model for comparison purposes.
Baseline + SAU: In the baseline model, spatial attention units (SAU) in the multi-head convolutional block are used instead of the standard convolutional block in the encoder.
Baseline + CAU: In the baseline model, channel attention units (CAU) in the multi-head convolutional block are used to replace the standard convolutional block in the encoder.

MTI-Net:
The spatial attention unit (SAU) and channel attention unit (CAU) are simultaneously introduced to replace the standard convolutional block in the encoder.
Based on the results in Figure 2d, it can be seen that the introduction of SAU improves the POD score of the model from 0.76 to 0.80, which shows the positive effect of SAU in improving the sensitivity of tornado detection. Subsequently, the FAR scores of the model with CAU are significantly decreased compared to the baseline, indicating that CAU suppresses some interfering information by emphasizing the importance of different input variables, which reduces the false alarm rate of the model. Most importantly, the integration of SAU and CAU in MTI-Net significantly improves its overall performance. MTI-Net shows significant advantages in key metrics such as accuracy and false alarm rate, highlighting the synergistic benefits of SAU and CAU in the model. As a result, MTI-Net can effectively reduce false alarms and ensure more reliable identification of tornado events.

Conclusions
This study presents an innovative tornado identification algorithm, MTI-Net, based on multi-task learning, which effectively addresses the challenges posed by small-scale tornadoes. MTI-Net utilizes two task-specific layers to integrate the tornado/no-tornado detection task and the tornado number estimation task into a single model. By integrating the two tasks, MTI-Net optimally captures tornado feature information from the input data, ensuring robust performance on low-quality data. To overcome the limitations of transformer blocks in small-object recognition tasks, we introduce the Multi-Head Convolutional Block (MHCB). This design strategy seamlessly combines the advantages of Convolutional Neural Networks (CNNs) in capturing local features with the potential of transformer blocks in handling global information. Specifically, MHCB integrates spatial attention units and channel attention units, strengthening the model's ability to capture local tornado-related features while simultaneously capturing complex dependencies between different input variables. Importantly, the experimental results show that MTI-Net achieves higher hit rates in tornado identification compared with other methods. In addition, the algorithm effectively mitigates the false negative problem in tornado-frequent regions, providing strong support for the accurate identification and warning of tornado events.
The success of MTI-Net underscores the importance of integrating multi-task learning and innovative architectural designs. Moreover, this research highlights the critical importance of addressing the challenges associated with small-scale weather phenomena, which remains a vital area for further investigation. Future research may involve incorporating temporal information to augment the model's ability to capture the dynamic nature of tornadoes. Additionally, integrating other meteorological data sources, such as dual-polarization data and satellite data, could further enhance tornado detection methods.

Figure 1.
Figure 1. Architecture of the proposed Multi-Task Identification Network (MTI-Net). The left is the overall architecture of MTI-Net, and the right is the architecture of the Multi-Head Convolutional Block (MHCB) in MTI-Net, which consists of spatial attention units (SAU) and channel attention units (CAU).

Figure 2.
Figure 2. Results of ablation experiments. From left to right, top to bottom: the threshold evaluation results, the loss function combination evaluation results, the data augmentation method evaluation results, and the component evaluation results.

Figure 3
Figure 3 illustrates a set of case studies of tornado identification using various methods. It is clear that MTI-Net performs well in accurately locating tornadoes and skillfully reduces interference from various types of noise, thereby effectively minimizing false alarms.

Figure 3.
Figure 3. Visual results of various tornado identification models on radial velocity maps. Black solid circles indicate the actual tornado locations; black hollow circles represent the locations identified by the models. Subfigures (a-d) correspond to a tornado event detected at Huai'an station at 12:51:08 UTC on 19 September 2023, and subfigures (e-h) correspond to a tornado event detected at Lianyungang station at 11:59:28 UTC on 19 September 2023.
As shown in Table 1, the results of the traditional tornado identification method TDA are not satisfactory, with a low POD of 0.23 and a significantly high FAR of 0.94, resulting in a CSI of only 0.05. This stark contrast highlights the potential of deep learning methods for the tornado identification task. Although SwinUNet achieves a POD of 0.68, it also has a bias score of 2.34, which indicates that the method produces excessive false positives in tornado identification. Compared to SwinUNet, ACmix achieves a CSI of 0.41 in tornado recognition tasks, demonstrating superior performance by effectively integrating the global attention of the Transformer and the local feature extraction of CNNs. However, MTI-Net holds a distinct advantage over ACmix. Specifically, MTI-Net achieves a POD score of 0.81, significantly outperforming ACmix, indicating its enhanced capability in capturing specific tornado details. In addition, MTI-Net has a lower FAR than ACmix, which indicates that MTI-Net can more effectively avoid noise interference.

Table 1
Performance Comparison of Different Models