Network intrusion detection system: A systematic study of machine learning and deep learning approaches

Rapid advances in the internet and communication fields have resulted in a huge increase in network size and the corresponding data. As a result, many novel attacks are being generated, which makes it challenging for network security systems to detect intrusions accurately. Furthermore, the presence of intruders aiming to launch various attacks within the network cannot be ignored. An intrusion detection system (IDS) is a tool that protects the network from possible intrusions by inspecting the network traffic to ensure its confidentiality, integrity, and availability. Despite enormous efforts by researchers, IDS still faces challenges in improving detection accuracy while reducing false alarm rates and in detecting novel intrusions. Recently, machine learning (ML) and deep learning (DL)‐based IDS have been deployed as potential solutions to detect intrusions across the network efficiently. This article first clarifies the concept of IDS and then provides a taxonomy based on the notable ML and DL techniques adopted in designing network‐based IDS (NIDS). A comprehensive review of recent NIDS articles is provided by discussing the strengths and limitations of the proposed solutions. Then, recent trends and advancements in ML‐ and DL‐based NIDS are presented in terms of the proposed methodology, evaluation metrics, and dataset selection. Based on the shortcomings of the proposed methods, we highlight various research challenges and provide the future scope for research on improving ML‐ and DL‐based NIDS.

intrusion detection system (IDS) to ensure the security of the network and all its associated assets within a cyberspace. 1 Among these, the network-based intrusion detection system (NIDS) is the attack detection mechanism that provides the desired security by constantly monitoring the network traffic for malicious and suspicious behavior. 2, 3 The idea of IDS was first proposed by Jim Anderson in 1980. 4 Since then, many IDS products have been developed and have matured to satisfy the needs of network security. 5 However, the immense evolution of technology over the last decade has resulted in a large expansion in network size and in the number of applications handled by the network nodes. As a result, a huge amount of important data is being generated and shared across different network nodes. Securing these data and network nodes has become challenging because a large number of new attacks are being generated, either through the mutation of an old attack or as a novel attack. Almost every node within a network is vulnerable to security threats. For instance, a data node may be very important for an organization, and any compromise of the node's information may have a huge impact on that organization in terms of its market reputation and financial losses. Existing IDSs have proven inefficient in detecting various attacks, including zero-day attacks, and in reducing the false alarm rate (FAR). 6 This results in a demand for an efficient, accurate, and cost-effective NIDS to provide strong security to the network.
To fulfill the requirements of an effective IDS, researchers have explored the possibility of using machine learning (ML) and deep learning (DL) techniques. Both ML and DL come under the big umbrella of artificial intelligence (AI) and aim at learning useful information from big data. 7 These techniques have gained enormous popularity in the field of network security over the last decade due to the invention of very powerful graphics processing units (GPUs). 8 Both ML and DL are powerful tools for learning useful features from the network traffic and predicting normal and abnormal activities based on the learned patterns. ML-based IDS depends heavily on feature engineering to learn useful information from the network traffic. 9 In contrast, DL-based IDS does not rely on feature engineering and is good at automatically learning complex features from the raw data due to its deep structure. 10 Over the last decade, researchers have proposed various ML- and DL-based solutions to make NIDS efficient in detecting malicious attacks. However, the massive increase in network traffic and the resulting security threats has made it challenging for NIDS to detect malicious intrusions efficiently. Research on using DL methods for NIDS is currently at an early stage, and there is still enormous room to explore this technology within NIDS to efficiently detect intruders within the network. The purpose of this research paper is to provide a broad overview of the recent trends and advancements in ML- and DL-based solutions for NIDS. The key idea is to furnish up-to-date information on recent ML- and DL-based NIDS to provide a baseline for new researchers who want to start exploring this important domain. The main contributions of this article are threefold. (i) We conducted a systematic study to select recent journal articles focusing on various ML- and DL-based NIDS that were published during the last 3 years (2017-April 2020). (ii) We reviewed each article extensively and discussed its various features, such as its proposed methodology, strengths, weaknesses, evaluation metrics, and the datasets used. (iii) Based on these observations, we presented the recent trends in using AI methods for NIDS, highlighted various challenges in ML-/DL-based NIDS, and provided different future directions in this important domain.
There are many survey papers in the literature that provide some implementation details on IDS. Our article differs from the other review articles [11][12][13][14][15][16] in three respects: (i) We followed a systematic article selection process to obtain more focused articles on NIDS design considering AI tools, whereas the other studies reviewed general IDS without using a systematic approach. (ii) Our study reviewed the articles published between 2017 and April 2020, so it provides more up-to-date information and the recent trends followed in the design of AI-based NIDS. (iii) Our study provides an extensive review of recent NIDS based on ML and DL approaches, which are critically analyzed according to their methods, techniques, datasets, and evaluation metrics. The focus is to provide researchers with more up-to-date knowledge on AI-based NIDS in one place, where they can find the recent trends and potential research areas in the domain to start exploring it. A detailed comparison of this article with other review articles is provided in Table 1.
The rest of the paper is organized as follows: Section 2 describes the research methodology adopted in this study. Section 3 provides the basic IDS concept and classification methods. Section 4 elaborates on the ML and DL methodologies adopted. The details of the evaluation metrics and the benchmark public datasets are presented in Sections 5 and 6, respectively. Observations, recent trends in NIDS design, research challenges, and the future research scope are provided in Section 7. Finally, Section 8 concludes this review article. The abbreviations used in this article are summarized in Table 2.

METHODOLOGY
This study conducts a systematic literature review of the different ML- and DL-based NIDS and investigates the journal articles published between 2017 and the first quarter of 2020. A systematic literature review is a methodology followed to identify, examine, and extract the needed information from the literature related to a certain research topic. 17 We performed this systematic review in two phases. Phase-1 identifies the information resource (search engine) and keywords used to execute a query and obtain an initial list of articles. Phase-2 applies certain criteria to the initial list to select the most relevant and core articles and stores them in a final list, which is reviewed in this article. The main purpose of this review article is to answer the following questions: (i) What are the recent trends in the design of AI-based NIDS? (ii) What are the recent ML and DL methodologies adopted for NIDS design? (iii) What are the merits and demerits of each adopted methodology? (iv) Which datasets have recently been used for testing AI-based NIDS? (v) What are the most frequent performance metrics used for evaluation purposes? and (vi) What is the future scope of research in AI-powered NIDS?
In phase-1, search engines and keywords are first identified for the article search. Scopus document search 18 was chosen as the search engine due to its ability to search almost all the well-known databases. We executed a search query using the initial keyword "intrusion detection system" and adjusted the filter to show journal articles published between 2017 and 2020. The initial search query returned articles that proposed IDS using different approaches (AI-based, watchdog-based, trust-based, game theoretic-based, etc.) for different domain areas, including wireless sensor networks (WSN), the internet of things (IoT), and cloud computing. We then redefined our keywords as "intrusion detection system," "network anomaly detection," and "signature-based network intrusion detection" in combination with "machine learning" or "deep learning" to obtain more relevant articles. As a result of phase-1, relevant articles based on the keywords were selected and stored as an initial list. The detailed steps used in phase-1 to obtain the initial list are summarized in Figure 1.
In phase-2, we started with the initial list and defined certain criteria to obtain articles that are more focused for review. We selected those articles that were written in the English language and proposed a new AI-based idea. We did not consider review and survey articles. Based on these criteria, we identified the articles for this review, stored them in the final list, and then used them for analysis. Each selected article was analyzed based on the proposed ML- or DL-based methodology and the advantages and disadvantages of that methodology. We also analyzed the most frequently used datasets and evaluation metrics for testing and evaluation purposes. Finally, we identified the future scope of research and the challenges in the design of an efficient AI-based NIDS. The detailed process used in phase-2 to select articles for the final list is summarized in Figure 2.

IDS: CONCEPT AND CLASSIFICATION
This section first explains the concept of IDS and then provides the details about the classification of IDS based on its deployment and the detection methodology.
FIGURE 3 Passive deployment of network-based intrusion detection system

Concept
The term IDS combines two parts: "intrusion" and "detection system." Intrusion refers to unauthorized access to the information within a computer or network system that compromises its integrity, confidentiality, or availability. 19,20 A detection system is a security mechanism for the detection of such illegal activity. So, IDS is a security tool that constantly monitors the host and network traffic to detect any suspicious behavior that violates the security policy and compromises confidentiality, integrity, and availability. 11, 21 The IDS generates alerts about detected malicious behavior for the host or network administrators. Figure 3 depicts a passive deployment of NIDS, where it is connected to a network switch configured with port mirroring technology. The switch mirrors all the incoming and outgoing network traffic to the NIDS, which monitors the traffic to detect intrusions. NIDS can also be deployed between the firewall and the network switch so that all the traffic passes through the NIDS. 2

Classification of IDS
IDS can be classified from the perspective of its deployment or its detection method. A classification taxonomy is given in Figure 4.

Deployment method based IDS
From the deployment perspective, IDS is further subclassified as host-based IDS (HIDS) or NIDS. 19,22 HIDS is deployed on a single information host. Its task is to monitor all the activities on this single host and scan for security policy violations and suspicious activities. The main drawback is that it must be deployed on every host that requires intrusion protection, which results in extra processing overhead for each node and ultimately degrades the performance of the IDS. 23,24 In contrast, NIDS is deployed on the network with the aim of protecting all devices and the entire network from intrusions. The NIDS constantly monitors the network traffic and scans for potential security breaches and violations. This article focuses on the different methods used in NIDS.

Detection method based IDS
From the detection perspective, IDS is further subdivided into "signature-based intrusion detection (SIDS)" and "anomaly detection-based intrusion detection (AIDS)." SIDS, also known as "misuse intrusion detection" or "knowledge-based intrusion detection," is based on the idea of defining a signature for attack patterns. These signatures are stored in a signature database, and the data patterns are matched against the stored signatures for attack detection. 25, 26 Its advantage is the high detection efficiency for known attacks, for which signatures are available. On the other hand, this method cannot detect novel attacks due to the absence of signature patterns. 23 Also, a huge signature database must be maintained and compared with the data packets for possible intrusions, which makes it a resource-consuming approach. 27 AIDS, also called "behavior-based IDS," is based on the idea of clearly defining a profile for normal activity. Any deviation from this normal profile is considered an anomaly or abnormal behavior. 28, 29 The major advantages of AIDS are its ability to detect unknown and new attacks 30 and the customized nature of the normal activity profile for different networks and applications. 31 However, the main drawback is the high FAR, as it is difficult to find the boundary between the normal and abnormal profiles for intrusion detection. 32
The popularity of the IoT paradigm, driven by advances in network technologies, has resulted in exponential growth in the use of IoT devices. 33,34 One of the vital technologies used in the development of an IoT network is the WSN, which comprises a collection of sensor nodes for information collection. 35 A huge amount of critical information is collected by these IoT sensor devices and is shared over the internet. 36 This big data, along with the complex structure of a WSN comprising resource-limited sensor nodes, causes security challenges for the IoT network. 37,38 To this end, IDS is considered one of the effective mechanisms for the security of IoT and WSN. Many different IDS approaches have been proposed in the literature, which are based on the efficient use of watchdogs, trust models, and game-theoretic concepts.
Watchdogs are network nodes that are assigned the task of watching and monitoring the network traffic of neighboring nodes. A decision is then made regarding the misbehaving nodes by using a set of rules. Many solutions have been proposed for anomaly and intrusion detection using watchdogs in the domains of WSN, 39 ad hoc networks, 40 and IoT. 41 Trust models are another tool used to improve the performance of an IDS. An IDS based on a trust model evaluates the trustworthiness of its nodes to identify the malicious nodes by constantly monitoring the network traffic for abnormal behaviors. Different implementations of IDS using trust models are based on watchdogs, 42 the Bayesian trust model, 43 and the game theory-based trust model. 44,45 In the context of IoT, a trust management scheme can be used in a distributed manner to reduce the computational overhead of resource-constrained sensor nodes. 46,47 Similarly, game theory is widely used for the efficient design of IDS. It is an applied mathematical concept used to model the strategic interactions among players by describing a game. Each game includes a set of players, and each player has a set of strategies along with an action plan and a payoff for each action within the game. The solution of the game is an equilibrium state, which is based on each player's strategy to maximize the payoff. A game can be cooperative or noncooperative, depending on whether the entities interact cooperatively or competitively. From the perspective of IDS for IoT and WSN, a game is modeled between the attackers and the defenders either through their interaction or by using the prediction strategy of an attacker. 45,[48][49][50] In this article, we focus on reviewing AI-based NIDS, which can be deployed for the security of an IoT network by monitoring the network traffic entering through the edge router. The most common AI-based algorithms used in the design of an efficient NIDS over the past three years are briefly explained in Section 4.
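The contrast between the signature-based and anomaly-based detection methods described in this section can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two byte-pattern signatures, the packets-per-second samples, and the three-standard-deviation threshold are hypothetical and are not taken from any reviewed system.

```python
from statistics import mean, pstdev

# Hypothetical signature database: byte patterns of known attacks (illustrative only).
SIGNATURES = {
    "sql_injection": b"' OR 1=1",
    "path_traversal": b"../../",
}

def signature_match(payload):
    """SIDS: return the names of all stored signatures found in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]

def build_profile(normal_values):
    """AIDS: summarize normal activity by its mean and standard deviation."""
    return mean(normal_values), pstdev(normal_values)

def is_anomalous(value, profile, threshold=3.0):
    """AIDS: flag a value more than `threshold` standard deviations from the profile."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# A known attack string is caught; a URL-encoded variant of it has no signature.
known = signature_match(b"GET /item?id=1' OR 1=1 --")
novel = signature_match(b"GET /item?id=%27%20OR%201%3D1")

# Normal profile built from made-up packets-per-second samples.
profile = build_profile([98, 102, 101, 99, 100, 103, 97])
burst_flagged = is_anomalous(450, profile)
typical_flagged = is_anomalous(101, profile)
```

The signature matcher misses the encoded variant because no stored pattern covers it, while the profile-based check needs no attack knowledge but depends entirely on where the threshold is placed, which is the source of the false alarms discussed above.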

AI METHODS FOR NIDS
This section provides a general methodology of AI-based NIDS along with the details of the ML and DL algorithms most commonly used to design an efficient NIDS. Both ML and DL are broadly classified into supervised and unsupervised algorithms. 51 In supervised algorithms, the useful information is extracted from labeled data, whereas unsupervised algorithms rely on unlabeled data to extract useful features and information. 52

A general AI-based NIDS methodology
A NIDS developed using ML and DL methods usually involves the following three major steps, as depicted in Figure 5: (i) the data preprocessing phase, (ii) the training phase, and (iii) the testing phase. For all the proposed solutions, the dataset is first preprocessed to transform it into a format suitable for the algorithm. This stage typically involves encoding and normalization. Sometimes the dataset also requires cleaning, that is, removing entries with missing data and duplicate entries, which is also performed during this phase. The preprocessed data is then divided randomly into two portions, the training dataset and the testing dataset. Typically, the training dataset comprises almost 80% of the original dataset and the remaining 20% forms the testing dataset. 53, 54 The ML or DL algorithm is then trained using the training dataset in the training phase. The time taken by the algorithm to learn depends upon the size of the dataset and the complexity of the proposed model. Normally, DL models require more training time due to their deep and complex structure. Once the model is trained, it is tested using the testing dataset and evaluated based on the predictions it makes. In the case of NIDS models, each network traffic instance is predicted to belong to either the benign (normal) class or the attack class. In the following section, we provide an extensive overview of widely used ML and DL algorithms for NIDS. Further, Figure 6 highlights the taxonomy of recent ML- and DL-based techniques used for NIDS.
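The preprocessing and splitting steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the toy records, the field layout (protocol, duration, bytes sent, label), and the helper names `clean`, `preprocess`, and `train_test_split` are all hypothetical and do not come from any benchmark dataset or reviewed article.

```python
import random

# Toy records: (protocol, duration_s, bytes_sent, label). Values are made up.
RAW = [
    ("tcp", 1.2, 300.0, "normal"),
    ("udp", 0.4, 120.0, "normal"),
    ("tcp", 9.8, 9000.0, "attack"),
    ("tcp", 1.2, 300.0, "normal"),   # duplicate entry
    ("icmp", None, 64.0, "attack"),  # entry with a missing value
    ("udp", 0.6, 150.0, "normal"),
    ("icmp", 5.0, 4000.0, "attack"),
    ("tcp", 0.9, 250.0, "normal"),
]

def clean(rows):
    """Drop rows with missing fields and exact duplicates, keeping first occurrences."""
    seen, out = set(), []
    for row in rows:
        if None in row or row in seen:
            continue
        seen.add(row)
        out.append(row)
    return out

def preprocess(rows):
    """One-hot encode the protocol and min-max scale the numeric fields to [0, 1]."""
    protocols = sorted({r[0] for r in rows})
    numeric = [[r[1], r[2]] for r in rows]
    lows = [min(col) for col in zip(*numeric)]
    highs = [max(col) for col in zip(*numeric)]
    features, labels = [], []
    for row, nums in zip(rows, numeric):
        one_hot = [1.0 if row[0] == p else 0.0 for p in protocols]
        scaled = [(v - lo) / (hi - lo) if hi > lo else 0.0
                  for v, lo, hi in zip(nums, lows, highs)]
        features.append(one_hot + scaled)
        labels.append(1 if row[3] == "attack" else 0)
    return features, labels

def train_test_split(features, labels, train_fraction=0.8, seed=0):
    """Shuffle the indices and split them into training and testing portions."""
    idx = list(range(len(features)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_fraction * len(idx)))
    train, test = idx[:cut], idx[cut:]
    return ([features[i] for i in train], [labels[i] for i in train],
            [features[i] for i in test], [labels[i] for i in test])

rows = clean(RAW)
X, y = preprocess(rows)
X_train, y_train, X_test, y_test = train_test_split(X, y)
```

The sketch follows the order given in the text (preprocess, then split). In practice, the scaling statistics are often computed on the training portion only, so that no information from the testing portion influences the model.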

ML algorithms
ML is a subset of AI that includes all the methods and algorithms which enable the machines to learn automatically using mathematical models in order to extract useful information from the large datasets. 13, 55 The most common ML

Decision tree
The decision tree (DT) is one of the basic supervised ML algorithms and is used for both classification and regression of a given dataset by applying a series of decisions (rules). The model has a conventional tree structure with nodes, branches, and leaves. 56 Each node represents an attribute or a feature. Each branch represents a decision or a rule, while each leaf represents a possible outcome or class label. 57 The DT algorithm automatically selects the best features for building a tree and then performs a pruning operation to remove irrelevant branches from the tree to avoid over-fitting. The most common DT models are CART, C4.5, and ID3. 58 Many advanced learning algorithms, such as Random Forest (RF) 59 and XGBoost, 60 are built from multiple decision trees.
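How a tree node selects its feature and rule can be sketched with a single split chosen by Gini impurity, the criterion used by CART (ID3 and C4.5 rely on entropy-based criteria instead). The sketch below is a simplified illustration, not a full tree builder: it has no recursion and no pruning, and the toy flows and the function names are assumptions.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = (None, None, gini(labels))
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best

# Toy flows: [duration_s, failed_logins]; labels are made up for illustration.
flows = [[1.0, 0], [2.0, 0], [1.5, 1], [8.0, 5], [9.0, 7], [7.5, 6]]
labels = ["normal", "normal", "normal", "attack", "attack", "attack"]
feature, threshold, impurity = best_split(flows, labels)
```

Here the rule "duration_s <= 2.0" separates the two classes completely, so the weighted impurity of the two children drops from 0.5 at the root to 0.0. A full tree repeats this search recursively on each child.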

K-Nearest Neighbor
K-nearest neighbor (KNN) is one of the simplest supervised ML algorithms and utilizes the idea of "feature similarity" to predict the class of a given data sample. It classifies a sample based on its neighbors by calculating its distance from them. In the KNN algorithm, the parameter k affects the performance of the model. If the value of k is very small, the model may be susceptible to over-fitting, while a very large value of k may result in misclassification of the sample instance. 61,62 Karatas et al 63 compared the performance of different ML algorithms using an up-to-date benchmark dataset, CSE-CIC-IDS2018. They addressed the dataset imbalance problem by reducing the imbalance ratio using the Synthetic Minority Oversampling Technique (SMOTE), 64 which improved the detection rate for minority class attacks.
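The effect of k can be seen in a short sketch of the KNN prediction rule. The training points, including one deliberately noisy sample, are made up for illustration, and the Euclidean distance is an assumption, since other distance measures can also be used.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict the majority label among the k training samples closest to `query`."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Two well-separated groups plus one noisy "attack" sample inside the normal group.
train_X = [[0.10, 0.10], [0.20, 0.10], [0.15, 0.20],
           [0.90, 0.90], [0.80, 0.95], [0.85, 0.80],
           [0.25, 0.20]]
train_y = ["normal", "normal", "normal",
           "attack", "attack", "attack",
           "attack"]

query = [0.24, 0.20]
pred_k1 = knn_predict(train_X, train_y, query, k=1)
pred_k3 = knn_predict(train_X, train_y, query, k=3)
```

With k=1 the prediction follows the single noisy neighbor, which is the over-fitting behavior mentioned above, whereas with k=3 the two surrounding normal samples outvote it.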

Support vector machine
The support vector machine (SVM) is a supervised ML algorithm based on the idea of a maximum-margin separating hyperplane in an n-dimensional feature space. It is used to solve both linear and nonlinear problems. For nonlinear problems, kernel functions are used. The idea is to first map a low-dimensional input vector into a high-dimensional feature space using the kernel function. Next, an optimal maximum-margin hyperplane is obtained, which works as a decision boundary defined by the support vectors. 65,66 For NIDS, the SVM algorithm can be used to enhance efficiency and accuracy by correctly predicting the normal and malicious classes. 67,68
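The mapping from a low-dimensional input space into a higher-dimensional feature space can be made concrete with a small worked example. The sketch below uses the degree-2 homogeneous polynomial kernel, chosen here only because its feature map is easy to write out, and a hand-picked hyperplane rather than a trained maximum-margin one. All names and values are illustrative assumptions.

```python
import math

def poly_kernel(x, z):
    """Degree-2 homogeneous polynomial kernel K(x, z) = (x . z)^2 for 2-D inputs."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def feature_map(x):
    """Explicit map phi: R^2 -> R^3 whose inner product equals poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def linear_decision(x, w=(1.0, 0.0, 1.0), b=-1.0):
    """A hand-picked hyperplane in feature space; in input space it is the unit circle."""
    return dot(w, feature_map(x)) + b

x, z = (1.0, 2.0), (3.0, -1.0)
k_value = poly_kernel(x, z)                      # computed in the 2-D input space
phi_value = dot(feature_map(x), feature_map(z))  # computed in the 3-D feature space

inside = linear_decision((0.5, 0.5))   # point inside the unit circle
outside = linear_decision((1.5, 0.0))  # point outside the unit circle
```

Both computations give the same value, which is why the kernel lets the algorithm work in the feature space without constructing it explicitly. The second part shows that a boundary that is circular, and therefore nonlinear, in the input space becomes a plane in the feature space.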

K-means clustering
Clustering is the idea of dividing data into meaningful clusters (or groups) by putting highly similar data into the same cluster. K-means clustering is a popular centroid-based iterative ML algorithm that learns in an unsupervised manner. K represents the number of centroids (cluster centers) within a dataset. To assign a data point to a cluster, a distance is normally calculated between the point and each centroid. The main idea is to minimize the sum of the squared distances between the data points and their respective centroids within each cluster. [69][70][71] Yao et al 72 proposed a multilevel intrusion detection framework named multilevel semi-supervised ML (MSML), in which the clustering concept is used along with the RF model. The proposed solution is composed of four modules: pure cluster extraction, pattern discovery, fine-grained classification, and model updating. The idea is that if an attack is not labeled in one module, it is forwarded to the next one for detection. The proposed methodology was tested using the KDD Cup'99 dataset. Experimental results showed the superiority of the model in detecting attacks even when they have few instances in the dataset.
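The iterative assignment and centroid update behind K-means can be sketched in a few lines (this is the standard Lloyd's algorithm, not the MSML framework). The points are made up, and the initial centroids are fixed by hand to keep the run deterministic, whereas they are usually chosen at random.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [list(c) for c in centroids]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update step: each centroid moves to the mean of its assigned points.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

points = [[1.0, 1.0], [1.5, 2.0], [1.0, 1.5], [8.0, 8.0], [9.0, 9.0], [8.5, 9.5]]
centroids, assignment = kmeans(points, centroids=[[1.0, 1.0], [8.0, 8.0]])
```

Each update step moves a centroid to the mean of its members, which is the position that minimizes the sum of squared distances within that cluster.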

Artificial neural network
The artificial neural network (ANN) is also a supervised ML algorithm and is inspired by the working of the nervous system of the human brain. It is made up of processing elements called neurons (nodes) and the connections between them. These nodes are organized into an input layer, many hidden layers, and an output layer. The backpropagation algorithm is used as the learning technique for the ANN. The main advantage of the ANN technique is its ability to perform nonlinear modeling by learning from larger datasets. 73 However, the main issue with training the ANN model is its high time consumption due to its complex nature, 74 which slows down the learning process and may lead to a suboptimal solution. 75 To overcome the limitations of ANN, Huang et al 76 proposed a new ANN called the extreme learning machine (ELM). The ELM is a single-hidden-layer feed-forward neural network, which assigns the input weights and hidden layer biases randomly without tuning and determines the output weights analytically. 77 Based on the idea of ELM, Li et al 78 proposed a Fast Learning Network (FLN). FLN is based on connecting a multilayer feed-forward neural network and a single-layer feed-forward neural network in parallel. FLN showed reasonable performance and stability while using a smaller number of hidden nodes and less time.
Ali et al 79 addressed the IDS problem by proposing a model based on FLN and particle swarm optimization 80 (PSO-FLN) and tested the model using the KDD Cup'99 dataset. The model was evaluated by comparing FLN combined with different optimization algorithms. Results showed that PSO-FLN outperforms the FLN models that use other optimization algorithms, such as the Genetic Algorithm, Harmony Search optimization, and Ameliorated Teaching Learning-based optimization. They also demonstrated that increasing the number of neurons in the hidden layer increases accuracy. The main drawback is the lower detection accuracy for the attack classes with fewer instances.

Ensemble methods
The key idea behind ensemble methods is to benefit from different classifiers by learning in an ensemble way. Each classifier has some strengths and weaknesses: some may perform well in detecting a specific type of attack but show poor performance on other types. The ensemble approach trains multiple weak classifiers and then combines them into a stronger classifier by using a voting algorithm. Shen et al 81 proposed an IDS using an ensemble method with ELM as the base classifier. To optimize the proposed methodology, a BAT optimization algorithm is used during the ensemble pruning phase. The model was tested using the KDD Cup'99, NSL-KDD, and Kyoto datasets. Experimental results showed that many ELMs combined in an ensemble manner outperform an individual ELM.
Gao et al 82 proposed an adaptive ensemble model that uses several base classifiers, such as DT, RF, KNN, and Deep Neural Network (DNN), and chooses the best using an adaptive voting algorithm. The proposed methodology was verified by performing experiments using the NSL-KDD dataset. Experimental results demonstrated its performance efficiency in comparison with other models. However, the proposed methodology did not achieve satisfactory results for the weaker attack classes.
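The voting step that combines base classifiers can be sketched with plain majority voting. This is a simplification and not the adaptive voting algorithm of Gao et al or the pruned ELM ensemble of Shen et al. The three rule-based classifiers and their thresholds are hypothetical stand-ins for trained models.

```python
from collections import Counter

# Hypothetical weak base classifiers: each is a simple rule on one feature of a flow.
def by_duration(flow):
    return "attack" if flow["duration_s"] > 5.0 else "normal"

def by_failed_logins(flow):
    return "attack" if flow["failed_logins"] >= 3 else "normal"

def by_bytes(flow):
    return "attack" if flow["bytes_sent"] > 5000 else "normal"

def majority_vote(classifiers, flow):
    """Return the label predicted by most base classifiers."""
    votes = Counter(clf(flow) for clf in classifiers)
    return votes.most_common(1)[0][0]

ensemble = [by_duration, by_failed_logins, by_bytes]

# The duration rule alone would miss this flow; the other two rules outvote it.
short_attack = {"duration_s": 2.0, "failed_logins": 4, "bytes_sent": 8000}
benign = {"duration_s": 1.0, "failed_logins": 0, "bytes_sent": 300}

prediction_attack = majority_vote(ensemble, short_attack)
prediction_benign = majority_vote(ensemble, benign)
```

An odd number of voters avoids ties in the binary case. Weighted or adaptive schemes replace the equal votes with weights learned from each classifier's performance.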

Deep learning algorithms
DL is a subset of ML that uses many hidden layers to obtain the characteristics of a deep network. These techniques are often more effective than conventional ML due to their deep structure and their ability to learn the important features from the dataset on their own and generate an output. This section presents the DL approaches adopted to propose DL-based NIDS solutions in the reviewed articles.

Recurrent neural networks
The recurrent neural network (RNN) extends the capabilities of the traditional feed-forward neural network and is designed to model sequence data. An RNN is made of input, hidden, and output units, where the hidden units are considered to be the memory elements. To make a decision, each RNN unit relies on its current input and the output of the previous step. RNN is widely used in different fields such as speech processing, human activity recognition, handwriting prediction, and semantic understanding, to name a few. [83][84][85][86] For an IDS, RNN can be used for supervised classification and feature extraction. RNN can normally handle sequences of limited length and suffers from short-term memory if the sequence is long. 87 Different RNN variants, such as long short-term memory (LSTM) 88 and the gated recurrent unit (GRU), 89,90 have been proposed to solve these issues. An RNN-based IDS was proposed by Yin et al 91 in the context of binary and multiclass classification of the NSL-KDD dataset. The model was tested using different numbers of hidden nodes and learning rates. Results showed that the learning rate and the number of hidden nodes affect the accuracy of the model. The best accuracy was obtained using 80 hidden nodes and learning rates of 0.1 and 0.5 for the binary and multiclass scenarios, respectively. The proposed model performed well compared to ML algorithms and the reduced-size RNN model proposed in Reference 92. The main shortcomings of this work are the increase in computational processing, which results in high model training time, and the lower detection rate for the R2L and U2R classes. The article also lacks a performance comparison of the proposed model with other DL methodologies.
In Reference 93, Xu et al proposed an IDS based on RNN using GRU as the main memory together with a multilayer perceptron and a softmax classifier. The proposed methodology was tested using the KDD Cup'99 and NSL-KDD datasets. Experimental results showed good detection rates compared with other methodologies. The major drawback of their model is the lower detection rates for minority attack classes such as U2R and R2L.
Naseer et al 94 performed a comparative analysis of IDS based on different DL and ML algorithms, implemented on a GPU-based testbed. NSL-KDD was used as the benchmark dataset, and the experimental results showed that LSTM and Deep CNN achieved higher accuracy than the other models.
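The way a recurrent unit combines its current input with the output of the previous step can be sketched with the basic (Elman) recurrence h_t = tanh(W_x x_t + W_h h_{t-1} + b). The sketch shows only the forward pass of a plain RNN, not an LSTM or GRU and not any of the reviewed models, and the weights are hand-picked assumptions rather than trained values.

```python
import math

def rnn_step(x, h_prev, W_x, W_h, b):
    """One Elman RNN step: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_x[i][j] * x[j] for j in range(len(x)))
        total += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h.append(math.tanh(total))
    return h

def rnn_forward(sequence, W_x, W_h, b):
    """Run the RNN over a sequence, returning the hidden state after every step."""
    h = [0.0] * len(b)
    states = []
    for x in sequence:
        h = rnn_step(x, h, W_x, W_h, b)
        states.append(h)
    return states

# Hand-picked weights: 1 input feature, 2 hidden units (illustrative values only).
W_x = [[0.5], [-0.3]]
W_h = [[0.8, 0.0], [0.1, 0.5]]
b = [0.0, 0.0]

# A single nonzero input followed by zeros: later states still carry its trace.
states = rnn_forward([[1.0], [0.0], [0.0]], W_x, W_h, b)
```

Although the second and third inputs are zero, the hidden states remain nonzero because they depend on the previous state. The trace also shrinks at every step, which hints at why long sequences are hard for a plain RNN.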

AutoEncoder
AutoEncoder (AE) is a popular DL technique that belongs to the family of unsupervised neural networks. 95 It works on the idea of matching the output as closely as possible to the input by learning the best features. It contains input and output layers of the same dimension, while the dimensions of the hidden layers are normally smaller than that of the input layer. AE is symmetric and works in an encoder-decoder fashion. Different variants of AE are Stacked AE, Sparse AE, and Variational AE. 96 Shone et al 97 proposed an IDS based on deep AE and the ML technique RF. To make the model efficient in terms of computation and time, only the encoder part of the AE is utilized, so that it works in a nonsymmetric fashion. Two nonsymmetric deep AEs, with three hidden layers each, are arranged in a stacked manner. RF was used for classification. Experiments were performed for multiclass classification scenarios using the KDD Cup'99 and NSL-KDD datasets. The proposed method showed its efficiency compared to the Deep Belief Network (DBN) used in Reference 98 in terms of detection accuracy and reduced training time. However, the model was inefficient at detecting R2L and U2R attacks due to the lack of data for training the model.
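The encoder-decoder idea can be sketched with the smallest possible case: a linear AE with a 2-dimensional input, a 1-dimensional hidden code, and the same weights used for encoding and decoding. This is a deliberately simplified illustration and not the deep, nonsymmetric, or sparse models of the reviewed articles. The data, the starting weights, and the learning rate are assumptions.

```python
def reconstruct(x, w):
    """Encode x to a 1-D code h = w . x, then decode it back with the same weights."""
    h = sum(wi * xi for wi, xi in zip(w, x))
    return [wi * h for wi in w]

def reconstruction_error(data, w):
    """Sum of squared differences between the inputs and their reconstructions."""
    return sum(
        (xi - ri) ** 2
        for x in data
        for xi, ri in zip(x, reconstruct(x, w))
    )

def train(data, w, lr=0.01, epochs=300):
    """Batch gradient descent on the reconstruction error."""
    w = list(w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x in data:
            h = sum(wi * xi for wi, xi in zip(w, x))
            r = [xi - wi * h for xi, wi in zip(x, w)]  # residual x - x_hat
            rw = sum(ri * wi for ri, wi in zip(r, w))
            for i in range(len(w)):
                grad[i] += -2.0 * (h * r[i] + rw * x[i])
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Made-up "normal" samples whose two features always move together.
normal = [[-1.0, -1.0], [-0.5, -0.5], [0.5, 0.5], [1.0, 1.0]]
w0 = [0.5, 0.1]
w = train(normal, w0)

error_before = reconstruction_error(normal, w0)
error_after = reconstruction_error(normal, w)
score_normal = reconstruction_error([[0.8, 0.8]], w)   # follows the learned pattern
score_odd = reconstruction_error([[1.0, -1.0]], w)     # violates the learned pattern
```

After training, samples that follow the learned pattern are reconstructed almost exactly, while a sample that violates it is not. Using this reconstruction error as an anomaly score is one common use of an AE. The reviewed works instead mostly feed the learned hidden code to a separate classifier.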
Yan et al 67 proposed an IDS using a stacked sparse autoencoder (SSAE) and SVM. The SSAE was used as the feature extraction method and SVM as the classifier. Both binary and multiclass classification problems were considered in the experiments. The results showed the superior performance of the proposed model compared with different feature selection, ML, and DL methods on the NSL-KDD dataset. Although the model achieves reasonable detection rates for U2R and R2L attacks, they are still lower than those for the other classes of the dataset.
Al-Qatf et al 99 also proposed a similar idea of self-taught learning based on sparse AE and SVM. To validate the performance of the proposed model, they performed experiments using the NSL-KDD dataset. The results showed improved overall performance compared with other DL and ML models. However, the performance of the proposed methodology on the R2L and U2R classes is not discussed.
Papamartzivanos et al 100 proposed an autonomous misuse detection system by combining the advantages of self-taught learning 101 and the MAPE-K framework. 102 They used sparse AE as the unsupervised learning algorithm to learn useful features while performing the Plan activity within the MAPE-K framework. Experiments were performed using the KDD Cup'99 and NSL-KDD datasets. The main drawback is the low detection accuracy for the U2R and R2L attack classes.
Khan et al 103 proposed an efficient two-stage model based on deep stacked AE. The initial stage classifies the dataset into attack and normal classes with probability values. These probability scores are then used as an additional feature and are input to the final decision stage for normal and multiclass attack classification. The performance of the proposed model was tested using the KDD Cup'99 and UNSW-NB15 datasets. To reduce the problems caused by the class imbalance of the datasets, a different approach was adopted for each dataset. For KDD Cup'99, downsampling was performed to remove repeated records, whereas for UNSW-NB15, upsampling was performed using SMOTE to balance the distribution of records. This preprocessing of the datasets dramatically improves the detection rate (DR) for attack classes with fewer training instances.
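The upsampling idea behind SMOTE can be sketched as follows: each synthetic record is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbors. This is a minimal illustration of the basic interpolation step only. The minority samples, the neighbor count k, and the seed are assumptions, and the sketch is not the preprocessing code of any reviewed article.

```python
import math
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating towards nearest neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen base sample.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic

# Made-up feature vectors of a rare attack class.
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3], [1.1, 1.1]]
new_samples = smote(minority, n_synthetic=6)
```

Because every synthetic record is a convex combination of two real minority records, it stays inside the region already occupied by the minority class instead of merely duplicating existing records.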
Malaiya et al 104 proposed different IDS models based on fully connected networks, Variational AE, and Sequence-to-Sequence (Seq2Seq) structures, respectively. These models were examined on the NSL-KDD, Kyoto Honeypot, UNSW-NB15, IDS2017, and MAWILab traces datasets. 105 Results showed that the Seq2Seq model constructed using two RNNs performed best among the models in terms of detection accuracy across all the datasets.
Yang et al 106 proposed a model for ID based on the supervised adversarial variational AE with regularization and DNN (SAVAER-DNN). The performance of the model was tested using the benchmark datasets NSL-KDD and UNSW-NB15. Experimental results confirm the model's effectiveness in detecting low-frequency and new attacks.
Andresini et al 107 used AEs to propose a multistage model involving a 1D convolution layer and two stacked fully connected layers. In the initial unsupervised stage, two AEs are trained separately on Normal and Attack flows to reconstruct the samples. In the supervised stage, these reconstructed samples are used to build a new augmented dataset that is input to a 1D-CNN. The output of this convolution layer is then flattened and fed to the fully connected layers, and lastly, a softmax layer performs the classification. Experiments were performed on the KDD Cup'99, UNSW-NB15, and CICIDS2017 datasets, and the proposed methodology achieves superior performance compared with different DL models. However, the authors have not shown how the minority classes perform under this methodology. A second drawback is that the model does not provide any information on the characteristics of the attack.

Deep neural network
DNN is a basic DL structure that allows the model to learn in multiple layers. It is composed of an input layer, an output layer, and many hidden layers. DNN is used to model complex nonlinear functions. An increased number of hidden layers enhances the abstraction level of the model and hence its capability. 108
Jia et al 109 proposed a network IDS based on DNN with four hidden layers to classify the KDD Cup'99 and NSL-KDD datasets. The output layer included one fully connected layer and a softmax classifier for classification purposes. For the hidden layers, a rectified linear unit was used as the activation function. 110 Results showed the robustness of the proposed model, as it achieved higher detection rates for almost all the attack classes except U2R, which has very few records. According to the authors, increasing the number of nodes and layers leads to a complex structure that increases the computing time and consumes more resources. They point to optimization algorithms and automatic tuning as the solution to these issues.
Wang et al 111 studied DNN-based IDS under adversarial attack and evaluated it using the NSL-KDD dataset. They comprehensively studied the roles of individual features in generating adversarial examples. The adversarial samples were produced by the FGSM, 112 JSMA, 113 DeepFool, 114 and CW attacks. 115 Results showed that the attributes most commonly used across these attacks contribute most to the vulnerability of DL-based IDS and require more attention to safeguard the network from attacks.
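To make the notion of an adversarial sample concrete, the following minimal Python sketch applies the FGSM idea, which perturbs an input by a small step in the direction of the sign of the loss gradient. It is only an illustration under simplifying assumptions: a logistic model with invented weights stands in for the DNN, and the feature values and the step size `eps` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Probability that record x is an attack under a logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, y, w, b, eps):
    """Return x + eps * sign(dLoss/dx) for the cross-entropy loss.

    For a logistic model the gradient of the loss w.r.t. x is (p - y) * w.
    """
    p = predict(x, w, b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

w, b = [3.0, -2.0, 1.0], -0.5   # assumed model parameters
x, y = [0.8, 0.1, 0.6], 1       # a toy record labeled as attack
x_adv = fgsm(x, y, w, b, eps=0.3)

# The attack probability assigned by the model drops after the perturbation.
print(predict(x, w, b), predict(x_adv, w, b))
```

Each feature moves by at most `eps`, yet the model's confidence that the record is an attack decreases, which is the effect an adversary exploits.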

Deep belief network
DBN is a DL model constructed by stacking many Restricted Boltzmann Machines (RBM) in layers followed by a softmax classification layer. 118 An RBM is a two-layer (input and hidden layer) model with the data flow in both directions. In DBN, each node within a layer is connected to every node in the previous and next layers, but nodes within the same layer are not connected. DBN is pretrained using the greedy layer-wise learning approach in an unsupervised fashion, followed by a supervised fine-tuning methodology for learning useful features. 119 For IDS, DBN is used for feature extraction and classification tasks. Marir et al 120 proposed a distributed model based on DBN and multilayer ensemble SVM for large-scale network ID based on Apache Spark. DBN was used for extracting features, which were then forwarded to the ensemble SVM, and the final output was predicted using a voting mechanism. The efficiency of the proposed method was tested on the KDD Cup'99, NSL-KDD, UNSW-NB15, and CICIDS2017 datasets. The proposed system showed high performance in detecting abnormal behavior in a distributed way.
To improve the detection accuracy of IDS, Wei et al 121 proposed a DBN-based DL model that is optimized by combining the particle swarm, fish swarm, and genetic algorithms. The model was tested using the NSL-KDD dataset. Results showed a large improvement in the detection rate of the U2R and R2L classes. The main drawback of the proposed model is the increase in training time due to its complex structure.
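The final voting step of such an ensemble can be sketched in a few lines of Python. The source does not specify the voting rule used by Marir et al, so simple majority voting is assumed here, and the labels and the helper name `majority_vote` are illustrative.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by most ensemble members.

    Ties are broken in favor of the label that appears first in `predictions`.
    """
    return Counter(predictions).most_common(1)[0][0]

# Each inner list holds the labels produced by three ensemble members for one record.
votes_per_record = [
    ["DoS", "DoS", "Normal"],
    ["Normal", "Probe", "Normal"],
]
final_labels = [majority_vote(votes) for votes in votes_per_record]
print(final_labels)  # ['DoS', 'Normal']
```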

Convolutional neural network
Convolutional neural network (CNN) is a DL structure more suitable for data stored in arrays. It consists of an input layer, a stack of convolutional and pooling layers for feature extraction, and finally a fully connected layer and a softmax classifier in the classification layer. CNN is widely successful in the computer vision field. 122 For IDS, CNNs are used for supervised feature extraction and classification purposes. Xiao et al 123 proposed an efficient IDS based on CNN. The main idea is first to perform feature extraction using Principal Component Analysis and AE, and then to transform the one-dimensional vector (feature set) into a two-dimensional matrix that is input to the CNN. Experiments were performed on the KDD Cup'99 dataset and show the effectiveness of the model in terms of the time taken in the training and test phases. The main drawback is the lower detection rates for the U2R and R2L classes compared with the other attack classes.
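The vector-to-matrix transformation mentioned above can be sketched as follows. This is a minimal illustration rather than the procedure of Xiao et al: the row-major layout, the zero padding, and the helper name `to_square_matrix` are assumptions, and a 41-value vector is used purely as an example input.

```python
import math

def to_square_matrix(features, pad_value=0.0):
    """Reshape a 1-D feature vector into the smallest square matrix that holds it.

    Values are placed row by row and the unused tail is filled with pad_value.
    """
    n = len(features)
    side = math.isqrt(n)
    if side * side < n:
        side += 1
    padded = list(features) + [pad_value] * (side * side - n)
    return [padded[r * side:(r + 1) * side] for r in range(side)]

record = [float(i) for i in range(41)]  # example vector with 41 values
matrix = to_square_matrix(record)
print(len(matrix), len(matrix[0]))  # 7 7
```

The resulting matrix can be treated like a single-channel image by a convolutional layer; other layouts (for example, a different ordering of the features) are equally possible.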
Zhang et al 124 proposed a complex multilayer IDS model based on CNN and gcForest. They also proposed a novel P-Zigzag algorithm for converting the raw data into two-dimensional greyscale images. They used an improved CNN model (GoogLeNetNP) in a coarse-grained layer for initial detection. Then, in the fine-grained layer, gcForest (caXGBoost) is used to further classify the abnormal classes into N-1 subclasses. They used a dataset built by combining the UNSW-NB15 and CIC-IDS2017 datasets. The experimental results show that the proposed model significantly improves the accuracy and detection rate compared with the single algorithms while reducing the FAR.
Jiang et al 125 proposed an efficient IDS by combining CNN and bidirectional long short-term memory (BiLSTM) in a deep hierarchy. The class imbalance problem is addressed by using SMOTE to increase the minority samples, which helps the model to fully learn the features. The CNN was used to extract spatial features, while the BiLSTM was used to extract temporal features. Experiments were performed using the NSL-KDD and UNSW-NB15 datasets. The proposed methodology achieves higher performance in terms of accuracy and detection rate. The detection rate of the minority classes improved slightly, but it is still very low compared with the other attack classes. Due to the complex structure, the training time is higher.
Yu et al 126 proposed an IDS model based on the novel DL idea of Few-shot Learning (FSL). 127 The idea is to train using a small amount of balanced labeled data from the dataset. DNN and CNN are adopted as embedding functions in the model for extracting the essential features and reducing the dimension. Experiments performed on the NSL-KDD and UNSW-NB15 datasets showed that the model obtains reasonable detection rates for minority attack classes. The proposed model used less than 2% of the data for training to achieve this performance on the considered datasets.
In this section, we identified various ML and DL techniques (as summarized in Table 3) that researchers have recently developed to detect intruders in the network. The strengths and weaknesses of each methodology are highlighted in Table 4. However, these techniques need to be evaluated based on certain metrics. In the next section, we will discuss the evaluation metrics that were used in the reviewed articles to evaluate the efficiency of the proposed solutions.

Weaknesses: Use of an older dataset NSL-KDD for evaluation.

Study: Xu et al 93
Methodology: Use of GRU as the main memory for RNN together with the multilayer perceptron and a softmax classifier to propose a NIDS solution.
Weaknesses: The model is evaluated using old datasets KDD Cup'99 and NSL-KDD. Also, lower detection rates for R2L and U2R attack classes are recorded.

Study: Al-Qatf et al 99
Methodology: The proposal of a NIDS by an efficient idea of self-taught learning based on Sparse AE and SVM.
Weaknesses: The model is tested using an older dataset NSL-KDD. Also, no results are provided for the performance of the model against minority attack classes (e.g. R2L and U2R).

Study: Marir et al 120
Methodology: Use of DBN and ensemble SVM to detect abnormal behavior in a distributed way. DBN is used for feature extraction, which is then forwarded to the ensemble SVM and finally, the voting mechanism is used for prediction.
Weaknesses: The model is complex and the training time increases slightly for deeper layers.

Study: Papamartzivanos et al 100
Methodology: Propose an autonomous misuse detection system by combining the self-taught learning and MAPE-K frameworks. For learning useful features, a sparse AE is used.

Study: Karatas et al 63
Methodology: The analysis of six ML-based NIDSs is presented that addresses dataset imbalance by reducing the imbalance ratio using SMOTE. This eventually improves the detection rate for the minority attack class. Also, an up-to-date dataset CSE-CIC-IDS2018 is used for evaluation.
Weaknesses: Adaboost algorithm is shown to achieve higher detection accuracy, but it is done at the cost of higher execution time.

Study: Jiang et al 125
Methodology: An IDS is proposed using CNN and bi-directional long short-term memory (BiLSTM) in a deep hierarchy. The class imbalance problem is addressed by using SMOTE to increase the minority samples. Evaluation is performed using both older (NSL-KDD) and newer (UNSW-NB15) datasets.
Weaknesses: The proposed model is complex, and the detection rate of minority data classes improved slightly but is still very low compared with the other attack classes.

Study: Yang et al 106
Methodology: A NIDS is proposed based on the supervised adversarial variational AE with regularization and DNN. Evaluation is performed using both older (NSL-KDD) and newer (UNSW-NB15) datasets.
Weaknesses: For the NSL-KDD dataset, the model achieves reasonable detection rates for U2R and R2L attacks, but they are still lower than those of the other attack classes of the dataset. For the UNSW-NB15 dataset, the model performance for the detection of minority class attacks is not shown.

Study: Yu et al 126
Methodology: An efficient NIDS model is proposed based on the novel DL idea of Few-shot Learning (FSL). DL algorithms such as DNN and CNN are adopted as embedding functions in the model for extracting the essential features and reducing the dimension. The model is evaluated using NSL-KDD and UNSW-NB15 datasets.
Weaknesses: For the NSL-KDD dataset, the model achieves reasonable detection rates for U2R and R2L attacks, but they are still lower than those of the other attack classes of the dataset. Also, FSL requires labeled data for learning, which limits its application in scenarios where the model is trained frequently using unlabeled network traffic to help it learn more patterns for effective detection.

Study: Andresini et al 107
Methodology: Use of AE to propose a multistage model involving a 1D convolution layer and two stacked fully connected layers. Also, the model's performance is tested using both new and older datasets.
Weaknesses: The model's effectiveness in detecting the minority attack class is not given. Also, the model is unable to provide details about the structure and characteristics of the attack.

EVALUATION METRICS
This section explains the most commonly used evaluation metrics for measuring the performance of ML and DL methods for IDS. All the evaluation metrics are based on the attributes of the confusion matrix, which is a two-dimensional matrix providing information about the Actual and Predicted classes. 128 The diagonal of the confusion matrix denotes the correct predictions, while the nondiagonal elements are the wrong predictions of a classifier. Table 5 depicts these attributes of the confusion matrix.

                          Predicted class
                          Attack                 Normal
Actual class   Attack     True Positive (TP)     False Negative (FN)
               Normal     False Positive (FP)    True Negative (TN)

Further, the different evaluation metrics used in the recent studies are as follows, where TP, FN, FP, and TN denote the numbers of True Positive, False Negative, False Positive, and True Negative samples:
• Precision: It is the ratio of correctly predicted Attacks to all the samples predicted as Attacks.
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (1)$$
• Recall: It is the ratio of all samples correctly classified as Attacks to all the samples that are actually Attacks. It is also called the Detection Rate.
$$\text{Recall} = \frac{TP}{TP + FN} \qquad (2)$$
• False alarm rate: It is also called the false positive rate and is defined as the ratio of wrongly predicted Attack samples to all the samples that are Normal.
$$\text{FAR} = \frac{FP}{FP + TN} \qquad (3)$$
• True negative rate: It is defined as the ratio of the number of correctly classified Normal samples to all the samples that are Normal.
$$\text{TNR} = \frac{TN}{TN + FP} \qquad (4)$$
• Accuracy: It is the ratio of correctly classified instances to the total number of instances. It is also called Detection Accuracy and is a useful performance measure only when a dataset is balanced.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (5)$$
• F-Measure: It is defined as the harmonic mean of the Precision and Recall. In other words, it is a statistical technique for examining the accuracy of a system by considering both the precision and recall of the system.
$$\text{F-Measure} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (6)$$
These evaluation metrics are calculated by testing the proposed methodologies using benchmark datasets. The next section discusses the popular public datasets used for testing NIDS.
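As a worked illustration of the metrics defined in this section, the following Python sketch computes them from the four confusion-matrix counts. The counts are invented for a hypothetical, highly imbalanced test set (10 Attack and 990 Normal records) and do not come from any reviewed study; the function name `ids_metrics` is likewise an assumption.

```python
def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def ids_metrics(tp, fn, fp, tn):
    """Compute the IDS evaluation metrics from confusion-matrix counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)            # detection rate
    return {
        "precision": precision,
        "recall": recall,
        "far": safe_div(fp, fp + tn),         # false alarm rate
        "tnr": safe_div(tn, tn + fp),         # true negative rate
        "accuracy": safe_div(tp + tn, tp + fn + fp + tn),
        "f_measure": safe_div(2 * precision * recall, precision + recall),
    }

# Hypothetical counts: only 2 of 10 attacks are detected, with 5 false alarms.
m = ids_metrics(tp=2, fn=8, fp=5, tn=985)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these counts the accuracy is about 0.987 while the recall is only 0.2, which illustrates why accuracy alone is a useful measure only when the dataset is balanced.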

BENCHMARK DATASETS
This section provides details about the popular datasets used by researchers for testing the performance of their proposed methodologies. A detailed summary of the datasets and the attack classes within each is given in Table 6.
• KDD Cup'99: It is one of the most popular and widely used datasets for IDS. It contains approximately five million records for training and two million records for testing. Each record contains 41 different features or attributes and is labeled as either normal or attack. The attacks are classified into four different types: Denial of Service (DoS), Probe, Remote to Local (R2L), and User to Root (U2R). 129

• Kyoto 2006+: This dataset is created from the network traffic records obtained by deploying honeypots, darknet sensors, email servers, web crawlers, and other network security measures by Kyoto University. 134

... 133 It contains the normal flows and updated real-world attacks. The network traffic is analyzed by CICFlowMeter using the information based on timestamps, source and destination IP addresses, protocols, and attacks. 136

• CSE-CIC-IDS2018: This dataset was jointly created by the Communications Security Establishment (CSE) and CIC in 2018. 63 User profiles containing abstract representations of the different events are created. For the generation of the ...

Table 7 provides an overview and summary of various studies that utilize the datasets discussed in Section 6 and the evaluation metrics (discussed in Section 5) to evaluate the efficiency of their proposed techniques.

OBSERVATIONS, CHALLENGES, AND FUTURE TRENDS
This section first discusses recent trends in IDS based on proposed methodology, performance criteria, and the dataset adopted. It also highlights the possible research gap and challenges and finally presents the future trends for the researchers to come up with an efficient, robust, and accurate IDS.

Recent trends and observations
The efficiency of an AI-powered NIDS depends highly on training with a suitable dataset. ML algorithms can be trained using a small dataset to achieve good results, but for larger datasets ML is not suitable unless the dataset is labeled. Since labeling is an expensive and time-consuming process, DL methods are preferred for larger datasets, as they learn and extract useful patterns from raw datasets. To make a NIDS efficient in detecting zero-day attacks, it needs to be trained regularly with the new data obtained by monitoring the network traffic. The large datasets and the deep nature of DL algorithms make the learning process more demanding in terms of computational resources and time. The more the NIDS model is trained, the more efficiently it will detect intrusions.
Table 4 highlights the strengths and weaknesses of the reviewed articles. It is observed that DL-based NIDS methodologies are now preferred over ML methodologies due to their efficiency in learning from large datasets in raw form. Since DL algorithms require extensive computational resources, the advent of GPUs and cloud-based platforms has eased the implementation of DL-based methods. We observed that in most of the proposed solutions, models are tested using older datasets such as KDD Cup'99 and NSL-KDD. We also observed that in some proposed solutions, a model showing excellent results on older datasets performs worse on newer, recently proposed datasets, for instance, in Khan et al. 103 Another major drawback exhibited by most of the methodologies is their inefficiency in detecting attacks that have few samples in the training dataset. This class imbalance problem affects the detection rate and accuracy for these minority attack classes and needs further attention.
Also, we observed that some of the methodologies are quite complex, which ultimately requires more model training time. We observe that model complexity grows with the depth of DL methods: the deeper the algorithm is, the more complex the model will be, and hence the more time and computing resources it will consume. The intelligent selection of useful features for model training can therefore mitigate this drawback.
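The growth of model size with depth can be made concrete with a small arithmetic sketch. The layer widths below are hypothetical, and the number of trainable parameters is used only as a rough proxy for complexity; actual training time also depends on the data volume, hardware, and architecture.

```python
def dense_parameter_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the number of units per layer, input first, output last.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical networks with 41 inputs and 5 output classes.
shallow = dense_parameter_count([41, 64, 5])              # one hidden layer
deep = dense_parameter_count([41, 64, 64, 64, 64, 5])     # four hidden layers
print(shallow, deep)  # 3013 15493
```

In this example, reducing the number of input features also shrinks the first layer, which is one way in which feature selection lowers the cost of the model.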
Based on the reviewed articles, it is observed that during the past three years, researchers focused on DL tools for designing IDS, as shown in Figure 7. It is noticed that 60% of the proposed methods are purely based on DL approaches, 20% of the solutions use a hybrid approach combining ML- and DL-based algorithms, and only 20% of the proposed solutions are based on ML methods. As discussed earlier, DL models are complex in nature and require extensive computational resources. The advent of the GPU has made this feasible and has contributed to a vast increase in the use of DL-based algorithms in the design of IDS.
Moreover, Figure 8 shows the number of times ML- or DL-based algorithms are adopted by researchers in the context of designing an efficient IDS solution. It is observed that the four most frequently used algorithms are AE, DNN, CNN, and RNN, respectively, which are all DL in nature. ML-based approaches like RF and SVM come next in the list and are mostly used in hybrid designs to assist and improve DL-based algorithms. ML-based algorithms like DT, KNN, and FLN were adopted less frequently during the reviewed period.
From the reviewed articles, it is observed that AE and its variations are among the most utilized algorithms for proposing NIDS solutions. Mostly, AE is used specifically for feature extraction and reduction, followed by an ML-based classifier for classification purposes. This feature reduction approach reduces the overall ...
It is also observed that a few proposed methodologies combined two or more DL algorithms in a model. Although these methodologies improved the detection accuracy, this comes at the cost of more complexity, which demands high computing resources. Similarly, the data sometimes need to be transformed into the format required by a specific algorithm. For instance, to use CNN for NIDS, the one-dimensional data vector (feature set) must be transformed into a two-dimensional matrix.
The analysis of the performance metrics used by the researchers for the evaluation of the methodology is shown in Figure 9. It is noted that the most widely used performance metrics are Detection Accuracy and Recall (Detection rate). For efficient network security, an IDS demands high Accuracy and a high Detection rate, so these two performance metrics should be considered when investigating the efficiency and effectiveness of a proposed methodology. For a typical IDS designed using ML/DL tools, Accuracy, Recall, and F-measure should be compulsory performance metrics, besides others, to show its ability in detecting intrusions.
Benchmark datasets are an important ingredient for testing the performance of a proposed methodology. The analysis of the use of public datasets is shown in Figure 10. It shows that NSL-KDD and KDD Cup'99 were used for testing and validation 60% of the time. Both are quite old datasets but remain very popular among researchers due to the availability of extensive results in the literature. Modern network architecture is quite different from that of 20 years ago. In this era of IoT, the security and privacy of big data and sensor nodes are a prime concern due to the rapid generation of novel attacks. If a new IDS methodology proposed for modern networks is tested using an old dataset, there is a greater chance that it will not perform well when deployed in a real-world environment. A model trained and verified using the latest dataset can be expected to perform comparatively better in the real world than a model trained and verified using an old dataset.

Research challenges
This subsection highlights the research challenges in the field of IDS.
1. Unavailability of a systematic dataset: The current study highlighted the unavailability of an up-to-date dataset that reflects the new attacks on modern networks. Most of the proposed methodologies were not able to detect zero-day attacks because the models were not trained with enough attack types and patterns. An efficient IDS model needs to be tested and verified using a dataset containing both older and newer attacks. Including the maximum number of attack definitions in a dataset will enable the ML/DL model to learn more patterns and will eventually provide protection against the widest range of intrusions. But dataset construction is an expensive process that demands a lot of resources and a high level of expert knowledge. Hence, one of the research challenges for IDS is the systematic construction of an up-to-date dataset with enough instances of almost all the attack types. The dataset should be updated frequently to include the latest intrusion instances and should be made public to help the research community.
2. Lower detection accuracy due to imbalanced dataset: It is also noted from the current study that most of the proposed IDS methodologies exhibit lower detection accuracy for certain attack types than the overall detection accuracy of the model. This problem is caused by the imbalanced nature of the dataset. The detection accuracy for low-frequency attack classes is lower than that for attacks with more instances. There can be two solutions to this problem. The first is to come up with an up-to-date and balanced dataset. The second is to come up with efficient techniques that can increase the number of minority attack instances to balance the dataset. Recently, researchers used techniques such as SMOTE, RandomOverSampler, and the adaptive synthetic sampling approach (ADASYN) to reduce the dataset imbalance ratio for improved performance. But there is still room for improvement, and more research is needed in this direction.
3. Low performance in real-world environment: Another research challenge for IDS is their performance in the real-world environment. Most of the proposed methodologies are tested and verified within a lab using public datasets, and none of them is tested in a real-world environment. So, it is still not clear how they will perform in real-world scenarios. As stated, most of them still rely on testing using old datasets. The biggest challenge for a proposed methodology is therefore to be as efficient in deployment as demonstrated in the lab tests. A method tested in the lab should also be tested in a real-time environment to verify its effectiveness for modern networks.
4. Resources consumed by complex models: Most IDS methodologies proposed by researchers are based on very complex models that require a lot of processing time and computing resources (almost 80% are DL-based or hybrid DL-ML methods). This may result in extra overhead for the processing unit and ultimately affect the performance of the IDS. The use of a multi-core high-performance GPU can speed up the computation and reduce time, but it is costly. So, to reduce the computational and processing overhead, an efficient feature selection algorithm is needed that intelligently selects the most important features for faster processing. Although many optimization algorithms are being explored by researchers for feature selection, there is still scope for improvement, and research can be carried out in this direction to come up with an efficient feature selection optimization algorithm.
5. Lightweight IDS for IoT: An IDS can also be used to provide security to the IoT network and its associated sensor nodes. In an IoT environment, sensor nodes collect a huge amount of critical data that is shared through the internet. Sensor nodes are resource-constrained, with limited computational power, storage capacity, and battery life. An IDS can either be deployed at the points where network traffic enters the IoT network from the internet, or it can be deployed in a distributed manner over the sensor nodes. In the first scenario, the NIDS needs to be efficient in detecting malicious attacks and faces the same challenges as discussed earlier. In the second scenario, for the resource-limited sensor nodes, a lightweight IDS model is needed. So the design of a lightweight IDS model that is efficient in terms of computational power and training time and has a high intrusion detection rate is one of the biggest challenges.
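The oversampling idea mentioned in the second challenge can be sketched as follows. This is a simplified illustration of the interpolation principle behind SMOTE, not the reference algorithm or any library implementation: the function name `smote_like`, the neighbor count `k`, and the toy minority records are assumptions introduced for the example.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen minority sample and one of its k nearest minority neighbors."""
    if len(minority) < 2:
        raise ValueError("at least two minority samples are required")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Sort the remaining minority samples by squared Euclidean distance to base.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic

# Toy minority-class records with two normalized features.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_like(minority, n_new=6)
print(len(new_samples), new_samples[0])
```

Each synthetic record lies on the line segment between two existing minority records, so the minority class is enlarged without simply duplicating records. In practice, a maintained library implementation would normally be preferred over such a sketch.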

Future trends
This subsection highlights the future scope in ML-/DL-based IDS research.
1. Efficient NIDS framework: NIDS is one of the important defense mechanisms of a network against intrusions. Recent studies show its limitation in detecting zero-day attacks, together with a high false alarm rate. To this end, the performance of IDS can be improved by using an up-to-date, systematic, and balanced dataset. There are very few attempts by researchers to propose an efficient and complete NIDS framework for a network (more specifically, for modern networks like an IoT 138 ). Research can be carried out in this direction to propose an efficient NIDS framework that can provide complete security against intrusions. The IDS framework should include a mechanism to frequently update the attack definitions in a dataset and keep training the model with the updated definitions so that the model learns new features. This will eventually improve the IDS model in detecting zero-day attacks and decrease false alarms. The training phase of an ML- or DL-based IDS model normally takes a long time and can be performed offline. The key to the attack detection and accuracy of a model lies in the constant process of dataset updating and training for AI-based IDS.
2. Solution to complex models: According to recent studies, DL-based IDSs have gained enormous popularity because their deep feature learning ability produces excellent results in detecting malicious attacks. Models based on DL algorithms are quite complex and require high resources in terms of computational power, storage capacity, and time. These complex structures pose extra challenges for implementing IDS in real-time environments. To address these problems, one solution is to use high-performance GPUs to process big datasets quickly and efficiently. But these GPUs are normally expensive, so there is a tradeoff between performance and cost. To reduce cost, cloud-based GPU platforms or services can be explored for model training purposes. Another solution is to reduce the complexity of DL algorithms through efficient and intelligent feature engineering. Selecting only the key features can result in almost the same detection accuracy as obtained using the complete set of features. This will eventually decrease the complexity of the model, which will then utilize fewer computing resources in a real-time environment.
3. Use of DL algorithms: Recent studies suggested the use of DL-based algorithms for IDS design. Many DL algorithms (as discussed in Section 4) are explored and used efficiently in proposing effective solutions. But there are still some DL algorithms that require more attention, like deep reinforcement learning, Hidden Markov Models, etc., for proposing IDS solutions for IoT networks. The research on using DL for IDS is still in the early stages. Researchers can also explore the hybrid idea of using DL for feature extraction and ML for classification. This will reduce the complexity of the proposed model.
4. Efficient NIDS for cyber-physical systems: Recently, massive interest has been shown in Cyber-Physical Systems such as Supervisory Control and Data Acquisition (SCADA) networks and Unmanned Aerial Vehicle (UAV)-enabled networks. SCADA networks have various applications, such as smart grids 139 and manufacturing industries. However, SCADA networks are becoming more and more complex as state-of-the-art information and communication technologies (ICT) are embedded within the network, thus providing an opportunity for attackers to become part of the network. NIDS plays an important role in such networks, where it can detect intruders by inspecting the network traffic. Further, the use of ML and DL will increase the efficiency of NIDS, as it can provide an extra dimension to detect cyber attacks within SCADA networks. However, the research in this domain is still at an early stage, and more investigation is required to identify and design efficient DL-based NIDS for SCADA networks. Similarly, UAV-enabled networks also offer a wide range of applications, including traffic monitoring, asset inspection, and securing critical infrastructure. 140 Because communication takes place over the wireless channel, these networks are accessible not only to legitimate users but also to network intruders, who can not only monitor the communication but also launch various attacks. Therefore, an efficient and intelligent NIDS is required that can detect intruders within UAV-enabled networks. Furthermore, the use of AI within the NIDS for UAV-enabled networks can be an interesting research direction, which requires more exploration and investigation. 141,142

CONCLUSIONS
This paper provides an extensive review of network intrusion detection mechanisms based on ML and DL methods to provide new researchers with updated knowledge, recent trends, and the progress of the field. A systematic approach is adopted for the selection of the relevant articles in the field of AI-based NIDS. Firstly, the concept of IDS and its different classification schemes are elaborated extensively based on the reviewed articles. Then the methodology of each article is discussed, and the strengths and weaknesses of each are highlighted in terms of the intrusion detection capability and complexity of the model. Based on this study, the recent trend reveals the usage of DL-based methodologies to improve the performance and effectiveness of NIDS in terms of detection accuracy and reduction in FAR. About 80% of the proposed solutions were based on DL approaches, with AE and DNN being the most frequently used algorithms. Although DL schemes outperform ML-based methods in terms of their ability to learn features by themselves and their stronger model-fitting abilities, these schemes are quite complex and require extensive computing resources in terms of processing power and storage capabilities. These challenges need to be addressed to fulfill the real-time requirements of NIDS and hence improve NIDS performance.
The study also shows that 60% of the proposed methodologies were tested using the KDD Cup'99 and NSL-KDD datasets, mainly because of the availability of extensive results using these datasets. But these datasets are too old to represent modern network attacks, which limits the performance of the proposed methodologies in real-time environments. For AI-based NIDS methods, the model should be tested with the latest updated datasets, like CSE-CIC-IDS2018, for better performance in terms of detection accuracy for intrusions. This article also highlights the research gaps in improving model performance for low-frequency attacks in a real-world environment and in finding efficient solutions to reduce the complexity of the proposed models. Proposing an efficient NIDS framework that uses less complex DL algorithms and has an effective detection mechanism is a potential future scope of research in this area. For future research, we will use this knowledge to design a novel, lightweight, and efficient DL-based NIDS that will effectively detect intruders within the network.

ACKNOWLEDGMENTS
This research is fully funded by Research, Innovation and Enterprise Centre (RIEC), Universiti Malaysia Sarawak under the grant number F08/PGRG/1908/2019.

DATA AVAILABILITY
Analyzed data for this survey article is available upon request from the corresponding author.