TL-GNN: Android Malware Detection Using Transfer Learning

Malware growth has accelerated due to the widespread use of Android applications, and attacks on Android smartphones have increased accordingly. While deep learning models offer high efficiency and accuracy, training them on large and complex datasets is computationally expensive. Hence, a method that effectively detects new malware variants at a low computational cost is required. A transfer learning method to detect Android malware is proposed in this research. By transferring known features from a trained source model to a target model, the transfer learning approach reduces the need for new training data and minimizes the need for huge amounts of computational power. We performed many experiments on 1.2 million Android application samples for performance evaluation. In addition, we evaluated how well our framework performed in comparison to traditional deep learning and standard machine learning models. In comparison to state-of-the-art Android malware detection methods, the proposed framework offers improved classification accuracy of 98.87%, precision of 99.55%, recall of 97.30%, an F1-measure of 99.42%, and a quicker detection rate of 5.14 ms by utilizing the transfer learning strategy.

Open-source, Android-based smartphones quickly gained popularity [1]. With an 84 percent share of the worldwide smartphone market, Android is the most widely used mobile operating system [2, 3]. Around 12 upgraded versions of Android were expected to be released in 2022.
Security attacks are becoming more common due to this level of adoption and the open-source nature of Android applications [4], which significantly compromise the integrity of such applications. According to statistics, there are more than 50 million instances of potentially unwanted applications (PUAs) and malware for Android, as shown in Figure 1.
There are already over three million apps available on Google Play. Unfortunately, these applications contain a significant amount of harmful malware [5, 6]. Attackers attempt to obtain people's money by stealing and monitoring their data and personal information [7, 8].
Due to the open-source nature of Android-based applications, hackers can easily upload malicious programs to Google Play, including Trojan horses, adware, file infectors, riskware, backdoors, spyware, and ransomware [9, 10]. Given the present spread and rising complexity of malware, it is crucial to develop efficient detection techniques to address this important issue [11].
Traditional malware detection techniques suffer from complexity and uncertainty [12]. The extensive use of deep learning and machine learning techniques in recent years [13] has significantly increased the accuracy of malware detection mechanisms, contributing to the development of Android malware detection based on these techniques [14, 15].
An extensive amount of research has been done on deep learning-based Android malware detection in response to the rapid growth of Android malware [16]. Using various types of deep and machine learning models, researchers have put forth a number of methodologies and produced a range of research results [17, 18].
Unfortunately, deep learning-based malware detection methods need a large number of labeled data points to accurately identify malicious threats [19, 20]. The datasets available for identifying new malware threats are typically small, and collecting new datasets takes time [21, 22]. Training deep learning models from scratch on a new dataset to identify a new malware threat takes considerable time and resources [23].
Using deep transfer learning is one efficient way to overcome the issues of model retraining and high computational complexity [24, 25]. The main approach adopted in our research to reduce computational complexity is to transfer well-known feature sets from a trained GNN model to a destination model with less training data [26, 27, 28].
In the modern world, antivirus software alone is no longer adequate; instead, the Google Play Store runs a security check to stop the upload of harmful applications [29, 30]. Yet, despite this check, the Play Store still hosts a large number of harmful applications [31, 32]. To counter these workarounds, numerous machine learning and deep learning techniques have been developed. Most of the proposed deep learning methods require a significant amount of training time [33]. The proposed approach reduces the amount of time needed for training.
• Malware developers use obfuscation techniques to make their malware more challenging to detect with conventional static and dynamic analysis approaches.
• Countless zero-day malware samples easily evade detection by signature-based systems.
• Malware analysts must evaluate a significant amount of data, which takes time and can cause analysis fatigue. Malware analysis also requires highly specialized knowledge.
• Research on classifying and identifying Android malware has not used transfer learning.
Protecting users from threats like ransomware, botnets, and spyware is the main objective, and deep transfer learning is used to differentiate between benign and malicious applications. The objectives and goals of the research are two-fold:
• To employ a range of techniques to effectively identify malware from samples of both benign and malicious applications.
• To evaluate the results of various approaches and algorithms and offer recommendations for the most effective malware detection approach.
The following are the main goals of using transfer learning at the initial stage:
Computation cost: Training deep learning models on complex and hybrid datasets requires a lot of computational power. The transfer learning approach can reduce this high computational cost.
The following are the significant contributions of our work: • We discuss the basic principles of transfer learning and deep learning that were used in developing the proposed approach.
• We propose TL-GNN, a new automatic malware detection approach for Android that precisely detects the malware and its type using a graph neural network.
• To avoid data bias, we evaluate TL-GNN on a wide range of public datasets. The experimental results show that TL-GNN has higher accuracy than other approaches.
• The training of the GNN model is accelerated through transfer learning, which transfers the trained model to the classification phase.

| LITERATURE REVIEW
In this section, we discuss earlier frameworks and techniques for detecting malware. We also discuss the gaps in the existing literature and how our malware detection model overcomes them.
Zaki et al. [34] used a hybrid approach to analyze the behavior of mobile malware.A broad model of mobile malware behavior that the authors provided can help identify the key components for detecting Android malware in an Android application.
Su et al. [35] proposed a deep learning-based malware detection approach for the Android platform. They utilized static analysis approaches and reported detection accuracy above 97%; however, continuously emerging attacks were not included in the analysis.
Wang et al. [36] examined adversarial attacks on Android malware classifiers; their findings are described below. In VisualDroid [43], security analysts must first obtain a sample of the malware, then create an appropriate signature (or ensure the sample can be correctly labeled based on existing signatures), and finally push the new signature to all anti-malware tools, typically via an online update mechanism. A malware sample cannot be automatically or instantly used to generate a signature.
In StormDroid [44], the proposed approach assesses static features while using feature sets that obfuscate information. The model attained 99.82% accuracy with no false positives and a 99.26% family-categorization accuracy.
Table 1 summarizes the studies examined, including each author's methodology, algorithmic choices, and results.

| PROPOSED METHODOLOGY
In this section, we discuss the dataset, implementation details, and workflow of our proposed model. In the proposed approach, Android applications were collected from various sources (MALNET [45], MALNET-Tiny [45], BIG [12], and Malicia [6]).
• Version: This provides information about the APK file's most recent version. Typically, the version changes when an application is upgraded.
• Dalvik code: Executable code compiled from Java bytecode. In more recent Android versions, the ART (Android Runtime) library has replaced Dalvik code. The ART library provides debugging tools to help identify application issues.
• Component: Android APKs are divided into components for better storage.The components store permissions, activities, intents, and resource files.
• System Call: To access the resources of the Android operating system, applications use system calls.They serve as system-level APIs that can communicate with system files.
• Runtime Libraries: Debugging and diagnostic tools are made available via the Android Runtime (ART).
Ranking was done after feature vectors had been generated from the feature sets. For instance, the parameter "permission" is more important than the parameter "version". In a similar way, the importance of each feature parameter is ranked, making it possible to filter out insignificant features.
Once the feature sets have been filtered, the significant feature sets are converted into binary graphs.
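For instance, the ranking-and-filtering step could be sketched as follows. The importance scores, feature names, and threshold here are hypothetical placeholders, not the paper's actual ranking:

```python
# Minimal sketch of feature ranking and filtering: each extracted feature
# parameter gets an importance score, and low-ranked parameters are
# filtered out before graph construction. Scores below are hypothetical.

FEATURE_IMPORTANCE = {
    "permission": 0.90,   # e.g. "permission" ranks above "version"
    "api_call": 0.85,
    "system_call": 0.70,
    "component": 0.55,
    "version": 0.10,
}

def rank_features(importance):
    """Return feature names sorted from most to least important."""
    return sorted(importance, key=importance.get, reverse=True)

def filter_features(importance, threshold=0.5):
    """Keep only features whose importance meets the threshold."""
    return {f: w for f, w in importance.items() if w >= threshold}

ranked = rank_features(FEATURE_IMPORTANCE)
significant = filter_features(FEATURE_IMPORTANCE)
```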

| Generating Graph from Android Applications
The research criteria state that static features of an APK file, like the string XML files, resource files, AndroidManifest.xml, and Dalvik files, can be used to efficiently visualize an APK graph. These files were used to extract the malware graph from the malicious APKs. By converting files into binary vectors, graphs can be produced.
The components required to generate graph datasets are extracted from the dataset APK archives. The APK data is kept in a binary array vector matrix and can be interpreted as a binary stream. The 8-bit binary files generated by disassembling the APK files are then mapped to the grayscale range of the graph. The vector array matrix generated from the binary streams is used to construct the graph, as shown in Figure 2.
The following are the steps to generate the graph:
Step-1: The files AndroidManifest.xml, Resources.arsc, Classes.dex, and .jar are extracted from datasets that contain APK archives.
Step-2: The generated files are disassembled into 8-bit binary files. When the data in the files is transformed into binary data, binary vector streams are generated.
Step-3: From the binary vector streams, an 8-bit array vector matrix is generated.
Step-4: The graph is generated using the array vector matrix, which is then stored in a graph data set.
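Under these steps, the conversion can be sketched as follows; the input bytes and matrix width are illustrative placeholders rather than the paper's exact parameters:

```python
# Sketch of Steps 1-4: disassembled file bytes -> 8-bit binary vector
# stream -> array vector matrix -> grayscale values for the graph.
# The input bytes and matrix width here are illustrative placeholders.

def bytes_to_vector(data: bytes):
    """Steps 2-3: interpret the file as a stream of 8-bit values (0-255)."""
    return list(data)

def vector_to_matrix(vector, width=4):
    """Step 3: pack the 8-bit vector into a fixed-width matrix,
    zero-padding the final row if needed."""
    padded = vector + [0] * (-len(vector) % width)
    return [padded[i:i + width] for i in range(0, len(padded), width)]

# Step 1 would extract AndroidManifest.xml / Classes.dex from the APK;
# here we use placeholder bytes instead of a real archive.
raw = b"\x00\x7fJG"          # hypothetical file content
matrix = vector_to_matrix(bytes_to_vector(raw))
# Step 4: `matrix` entries (0-255) map directly to grayscale intensities.
```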
The generated graph serves as input for both the transfer learning and conventional GNN models. The last few layers of the transfer learning model are updated, while the first few layers are left unchanged because they provide generalized feature sets. The following steps demonstrate the algorithm's workflow:
• Generating Dalvik bytecode files by decompiling the applications.
• Extracting all defined methods from each Dalvik bytecode file by scanning it, and finally constructing a node for every method.
• Building an edge between the caller and callee nodes based on the call relationship by iteratively traversing each method's call statements (such as "invoke-*") to identify the call interactions.
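The node-and-edge construction above can be sketched with a toy parser. The disassembly lines below are simplified stand-ins for real Dalvik output, not actual smali:

```python
# Sketch of call-graph construction: every method becomes a node, and
# each "invoke-*" statement inside a method adds an edge from the caller
# to the callee. The disassembly below is a simplified stand-in.

disassembly = {
    "Lcom/app/Main;->onCreate": [
        "invoke-virtual Lcom/app/Net;->send",
        "invoke-static Lcom/app/Util;->log",
    ],
    "Lcom/app/Net;->send": ["invoke-static Lcom/app/Util;->log"],
    "Lcom/app/Util;->log": [],
}

def build_call_graph(methods):
    """Return (nodes, edges): one node per method, one edge per call."""
    nodes = set(methods)
    edges = set()
    for caller, statements in methods.items():
        for stmt in statements:
            if stmt.startswith("invoke-"):
                callee = stmt.split()[-1]   # target of the invoke-* call
                edges.add((caller, callee))
    return nodes, edges

nodes, edges = build_call_graph(disassembly)
```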

| Dataset
In this section, we discuss well-known malware datasets for model training and classification. The datasets are listed in Table 2.
TABLE 2: Datasets list.
The datasets were tested using both conventional and transfer learning methods.

MALNET [45] contains 1.2 million function call graphs, with over 35k edges and 15k nodes on average, organized in a hierarchy of 47 types and 696 families, as shown in Table 4.
MALNET-Tiny [45] has 5,000 graphs across 5 types; to keep the dataset truly "tiny," each graph is limited to at most 5,000 nodes. Before graph-based analysis can be applied, the Malicia [6] samples must be converted from binaries into graphs. We found that 1,192 samples from the Malicia dataset did not have a family label, and 581 samples were not executable files.
After eliminating these samples, we were left with 9,895 binaries from 51 families in the Malicia dataset, listed in Table 5.
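This clean-up amounts to two filters, sketched below with toy manifest records standing in for the real Malicia metadata:

```python
# Sketch of the Malicia dataset clean-up: drop samples without a family
# label and samples that are not executable files. The records below
# are toy placeholders, not real Malicia entries.

samples = [
    {"id": "a1", "family": "zbot",      "is_executable": True},
    {"id": "b2", "family": None,        "is_executable": True},   # no label
    {"id": "c3", "family": "winwebsec", "is_executable": False},  # not a binary
    {"id": "d4", "family": "zbot",      "is_executable": True},
]

def clean_dataset(entries):
    """Keep only labeled, executable samples."""
    return [s for s in entries
            if s["family"] is not None and s["is_executable"]]

cleaned = clean_dataset(samples)
families = {s["family"] for s in cleaned}
```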
Algorithm 1 displays the pseudo-code for the GNN.
In every iteration of sequential learning, the classifiers are first trained with the relevant training data and weights. The data weights are then modified for the next iteration in accordance with the classifiers' training results. Both operations are repeated until all classifiers are trained.
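This iteration can be illustrated with a standard AdaBoost-style weight update on toy data. It is a generic sketch of sequential re-weighting, not the paper's exact Algorithm 1, and the threshold classifier is a trivial stand-in:

```python
import math

# AdaBoost-style sketch of the sequential learning loop: train a weak
# classifier on weighted data, compute its weighted error, derive the
# classifier weight, and re-weight the samples for the next iteration.

X = [0.1, 0.4, 0.6, 0.9]
y = [-1, -1, 1, 1]
weights = [0.25] * 4          # initial uniform sample weights

def stump(threshold):
    """Trivial threshold classifier standing in for one GNN in the chain."""
    return lambda x: 1 if x >= threshold else -1

def boost_round(clf, X, y, weights):
    # Weighted training error of this classifier.
    err = max(sum(w for x, t, w in zip(X, y, weights) if clf(x) != t), 1e-10)
    alpha = 0.5 * math.log((1 - err) / err)       # classifier weight
    # Re-weight: misclassified samples gain weight, correct ones lose it.
    new_w = [w * math.exp(-alpha * t * clf(x))
             for x, t, w in zip(X, y, weights)]
    total = sum(new_w)
    return alpha, [w / total for w in new_w]

# One round with a stump that misclassifies the third sample.
alpha, weights = boost_round(stump(0.7), X, y, weights)
```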

| RESULTS AND DISCUSSION
We performed a comparative analysis of the results of the several experiments described here; the results are shown in Table 6.

| Performance Measurement
To evaluate a classification algorithm, the confusion matrix must be visualized and specific performance metrics must be calculated. These make it easier to analyze the effectiveness of different methods and to compare their performance.

Algorithm 1 (excerpt):
Output: combination of classifiers E_N(e)
1: Procedure: GNN Algorithm
2: Initialize: transfer the (x-1)-th GNN's learning parameters to the x-th GNN classifier; normalize the sample weight S_(x+1) and modify the sample weight S_(x-1) in accordance with p_k^x(e)

FP -False Positive:
A benign application that was misclassified as malware.

| Evaluation Metrics
Here, we calculate the following evaluation metrics along with the confusion matrix and evaluate various models using these metrics to determine which model works best.

Accuracy:
The percentage of the dataset's instances for which the prediction was correct. The mathematical formula is shown in equation (1).
Accuracy = (TP + TN) / (TP + TN + FP + FN)  (1)

Precision: The fraction of relevant predictions among all predicted positives. The mathematical formula is shown in equation (2).
Precision = TP / (TP + FP)  (2)

Recall:
The fraction of actual positive instances that were correctly predicted. The mathematical formula is shown in equation (3).

Recall = TP / (TP + FN)  (3)
F-Measure: The F-measure combines two measurements, precision and recall.
The mathematical formula is shown in the equation 4.
F-Measure = 2 × (Precision × Recall) / (Precision + Recall)  (4)
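Equations (1)-(4) translate directly into code; the confusion-matrix counts below are hypothetical, chosen only to exercise the formulas:

```python
# Compute accuracy, precision, recall, and F-measure (equations 1-4)
# from the four confusion-matrix counts. The counts are hypothetical.

def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)                 # eq. (1)
    precision = tp / (tp + fp)                                 # eq. (2)
    recall = tp / (tp + fn)                                    # eq. (3)
    f_measure = 2 * precision * recall / (precision + recall)  # eq. (4)
    return accuracy, precision, recall, f_measure

acc, prec, rec, f1 = metrics(tp=90, tn=85, fp=5, fn=20)
```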

| Performance of Models
Compared to the conventional GNN model, the transfer learning approach performs better.

| CONCLUSION AND FUTURE WORK
This section concludes our work and provides suggestions for future research.

KEYWORDS: Android malware detection; Malware classifier; Deep learning; Transfer learning; Graph neural network

| INTRODUCTION
With the release of the first Android smartphone in September 2008, the new open-source operating system quickly gained popularity.

Model development time:
The time needed to develop and train a new model decreases significantly because only the last few dataset-specific layers need to be trained.
Knowledge utilization: Knowledge from the source model can be used to train a new target task, so there is no need to start from scratch when training the new target model.
Over-fitting problems: When conventional deep and machine learning approaches are trained on only a small dataset, over-fitting arises. Transfer learning addresses this problem by fine-tuning only some of the model's layers.
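The freeze-early-layers idea can be sketched without any framework. The toy Layer class below merely marks which layers would receive gradient updates during fine-tuning; in practice this corresponds to disabling updates on the early layers of the trained GNN:

```python
# Toy sketch of transfer-learning fine-tuning: the early layers of the
# source model are frozen (no further updates), and only the last few
# dataset-specific layers remain trainable on the target task.

class Layer:
    def __init__(self, name):
        self.name = name
        self.trainable = True

def freeze_early_layers(layers, n_trainable=2):
    """Freeze all layers except the last `n_trainable` ones."""
    for layer in layers[:-n_trainable]:
        layer.trainable = False
    return layers

source_model = [Layer(f"gnn_layer_{i}") for i in range(5)]
target_model = freeze_early_layers(source_model, n_trainable=2)
trainable = [l.name for l in target_model if l.trainable]
```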
Wang et al. [36] evaluated the transferability of adversarial examples produced on a structured, sparse dataset, as well as the resistance of malware detection classifiers trained with adversarial methods to adversarial examples. According to the authors, decision tree, random forest, SVM, CNN [37], and RNN classifiers can all be tricked by adversarial examples generated by a DNN. They also point out how adversarial training can increase a DNN's robustness against adversarial attacks. Singh et al. [38] proposed a machine learning-based system to analyze Android applications. The authors exploited the APK files to gather manual features and extracted grayscale images from a Drebin dataset file. Image-processing-based algorithms are used to extract the image files. The system can classify Android applications, and the authors achieved an accuracy of 93%, although overfitting issues can arise if an algorithm must be trained on a huge amount of data. Pektaş [26] uses the API call graph to represent all possible runtime execution paths used by malware. The API call graphs, converted into a low-dimensional numeric feature set, can then be incorporated into deep neural networks. The F-measure, accuracy, recall, and precision for the malware classification are 98.86%, 98.65%, 98.47%, and 98.80%, respectively. Zhihua et al. [1] put forward a novel method for classifying and identifying malware and its variants using CNN-based deep learning classifiers. The paper also introduced the BAT algorithm for balancing the dataset, and image data augmentation is used during training to improve the model's effectiveness and accuracy. The model classified 9,339 malware samples into 25 malware families with a 94.5% accuracy rate. Kumar et al. [39] proposed a CNN-based malware classification and detection approach, achieving 98% accuracy on the 9,339 samples representing 25 malware families. Kalash et al. [40] proposed a CNN-based deep learning approach for classifying malware dataset samples; the model was implemented on two datasets (MalImg and the Microsoft Malware Challenge) and attained 98.52% and 98.99% accuracy, respectively. Singh et al. [41] gathered and analyzed a dataset of 37,374 samples from 22 malware families and proposed a deep CNN-based classification method, achieving 98.98% accuracy. Gibert et al. [42] proposed classifying malware images into specific families using a CNN-based model with three convolutional layers and a fully connected layer. The model was trained and tested on two publicly available datasets, Microsoft Malware and MalImg, achieving accuracies of 98.48% and 97.49%, respectively.

Figure 3
Figure 3 shows the architecture of TL-GNN. APK graphs can be efficiently visualized using the static features of an APK file, such as the resource files, Dalvik files, AndroidManifest.xml, and string XML files. Graphs are obtained by transforming the files into binary vectors. The components required to create graph datasets are extracted from the dataset APK archives. The APK data is kept in a binary array vector matrix and interpreted as a binary stream. The 8-bit binary files generated by disassembling the APK files are then mapped to the grayscale range of the graph. The binary streams are converted into a vector array matrix and then used to construct a graph. In TL-GNN, the GNN model's first layers remain unchanged and its last layers are modified; this is performed by altering the features in the final layers while leaving the initial layers alone, as shown in Figure 3.

FIGURE 3: Proposed Work.
The method invocation relationship within the entire application is represented by the approximation graph, which contains all methods defined in the Dalvik bytecode.
Microsoft released the BIG dataset [12] as part of the Malware Classification Challenge (BIG 2015) competition at WWW 2015. The real dataset is easily accessible on the Kaggle platform. Microsoft provided a sizable malware dataset, almost 0.5 terabytes in size. The dataset includes more than 20,000 malware samples as .asm (disassembly code) and .bytes (byte code) files. Byte-code files can be transformed into graphs using conversion methods. The collection contains 10,868 byte-file samples from 9 families. The dataset's description is in Table 3.

Algorithm 1 (excerpt):
8: Train the x-th GNN classifier on the weighted sample set; obtain the predicted output p_k^x(e) for the P classes of the x-th GNN classifier, where k = 1, 2, ..., P
11: Calculate the x-th classifier's training error ε_x according to (8)
12: Based on ε_x, assign the classifier the weight α_x using (11)

The confusion matrix compares the actual class with the predicted class and displays the number of samples in each quadrant. It aids in evaluating the model's true positives, false positives, false negatives, and true negatives, making it easier to assess how effectively the model performed the classification. The prediction matrix for the proposed approach is displayed in Figure 4.
TP - True Positive: A correctly classified malware application.

Figure 6
Figure 6 shows the experimental results of RF and TL-GNN.

Figure 7
Figure 7 shows the experimental results of ResneXt and TL-GNN. TL-GNN achieved a higher accuracy of 98.87%, with precision of 99.55%, recall of 97.30%, and F-measure of 99.42%, compared to the ResneXt model's accuracy of 98.32%, precision of 97.64%, recall of 97.93%, and F-measure of 97.69%.

Figure 8
Figure 8 shows the experimental results of OEL-AMD and TL-GNN in terms of precision, accuracy, F-measure, and recall. TL-GNN achieved a higher accuracy of 98.87%, with precision of 99.55%, recall of 97.30%, and F-measure of 99.42%, than the OEL-AMD model.

Figure 9
Figure 9 compares the outcomes of our approach with those of four models: CNN, RF, ResneXt, and OEL-AMD. As can be seen, our approach had a quicker detection rate of 5.14 ms. Because the other methods use time-consuming, highly complex approaches, their performance was slightly lower. Our method reduces the requirement for huge amounts of computational power as well as the need for new training data.

FIGURE 9: Comparison between TL-GNN and other state-of-the-art models.

Mobile malware has existed since the arrival of smartphones. As Android grew in popularity, malware applications continued to evade security models. We addressed the classification and detection of Android malware using traditional GNN and transfer learning methods. A two-stage system that transforms Android applications into binary graphs was proposed; these graphs serve as the input for the conventional GNN model. We addressed the issues of complexity, overfitting, and computation cost by applying the transfer learning method to the trained model, freezing the starting layers of the pre-trained model. The evaluation results show that the transfer learning strategy offers enhanced accuracy of 98.87%, precision of 99.55%, recall of 97.30%, an F1-measure of 99.42%, and a quicker detection rate of 5.14 ms, with extremely few false positives, when compared to the conventional GNN model. We also compared the evaluation results with those of other approaches and showed that transfer learning outperforms conventional methods while lowering computation costs.
Future research should provide thorough, fine-grained feature sets for enhanced outcomes. We also tried to reduce the requirement for high RAM and GPU capacity, as well as the problem of overfitting on smaller datasets, while attempting to overcome the constraints of the proposed framework employed in our study. The most important reason for this is that our study considered both static and dynamic feature sets. Because static features lack attributes for runtime behavior, new malware strains dynamically change their behavior and form to avoid detection methods. The proposed approach succeeds in detecting existing malware, but to maintain detection, fresh feature sets and training layers must be chosen and transferred to the target model. The transfer learning approach has to be modified for new malware samples, in contrast to some of the earlier detectors described above, even though training time is reduced by the lower computation cost. Novel malware behavior, dynamic permissions, resource obfuscation, and system-call obfuscation are among the factors that drive model updates. The problem of retraining the target model can be mitigated by examining overall behavioral features instead of static features. Although our proposed approach offers good detection accuracy, we will address these limitations in future work to increase the detector's effectiveness. The sustainability and performance-deterioration issues of the transfer learning approach will be the subject of our next phase.
In StormDroid [44], dynamic and static analysis are merged. The .apk file is used to access static features, such as certain API calls, while its dynamic features are derived at runtime [47]. DroidEvolver [46] was presented to identify Android malware while updating automatically and without user intervention. The model is updated using online learning techniques, eliminating the need for retraining and lowering the high computational cost. The authors assessed 34,722 malicious and 33,294 benign applications over a six-year period and report that the model outperformed the state-of-the-art MamaDroid model in terms of the average F1-measure. Fu and Cai [47] investigated the deterioration concerns of Android malware detectors. The authors analyzed the performance of four state-of-the-art detectors and found that the available solutions degrade over time. They also proposed a new approach built on a long-term analysis of application characterization with a focus on runtime behaviors. To analyze the deterioration problem, a comparison between the proposed strategy and four state-of-the-art approaches was also made.
Static parameters include intents, versions, system services, manifest permissions, strings, and components; these parameters comprise metadata kept in the application. In contrast to static parameters, dynamic parameters such as logs, files, system calls, and network activities give information about an application's behavior and control flow. Once the binary vectors have been transformed into graphs, the source model uses them as input for classification and training. The static and dynamic features are discussed below:
• Manifest.Permissions: The manifest file, contained in every Android application, provides details about resources, services, packages, strings, etc. The manifest.permissions file is one of the package files inside the manifest file. It is used to grant applications authorization and is consulted to verify rights when the application is installed. Permissions to access SMS, location, and storage are crucial.
• API Call: Runtime function calls are made through application programming interfaces (APIs). Applications initiate them when they require interaction with system resources.
• Services: Android service components are responsible for the background tasks carried out while an application is open. Users do not need to interact with services.

TABLE 1: Comparison among the recent related work.
TABLE 3: Description of the Microsoft Malware Classification Challenge (BIG) dataset.
14: end for

TABLE 4: Statistical description of the MALNET dataset for the nine major graph types.

TABLE 5: Description of the Malicia dataset.