Blockchain and explainable AI for enhanced decision making in cyber threat detection

Artificial Intelligence (AI)-based cyber threat detection tools are widely used to process and analyze large amounts of data for improved intrusion detection performance. However, cybersecurity experts often regard these models as black boxes because the reasoning behind their decisions cannot be readily comprehended or interpreted. Moreover, AI-based threat hunting is data-driven and is usually modeled using data provided by multiple cloud vendors. This poses another critical challenge, as a malicious cloud can provide false information (i.e., an insider attack) and degrade the threat-hunting capability. In this paper, we present a blockchain-enabled eXplainable AI (XAI) framework for enhancing the decision-making capability of cyber threat detection in the context of Smart Healthcare Systems (SHS). Specifically, first, we use blockchain to validate and store data exchanged between multiple cloud vendors by implementing a Clique Proof-of-Authority (C-PoA) consensus. Second, a novel deep learning-based threat-hunting model is built by combining Parallel Stacked Long Short Term Memory (PSLSTM) networks with a multi-head attention mechanism for improved attack detection. Extensive experiments confirm its potential to be used as an enhanced decision support system by cybersecurity analysts.


INTRODUCTION
With the wide use of the Internet and recent developments in information technology, the Internet of Things (IoT) has evolved remarkably over the last decade. IoT devices exchange data over the Internet, enabling them to interact and communicate with each other without human intervention [1-4]. The Smart Healthcare System (SHS) has become an important application of the IoT and refers to the integration of smart technology into healthcare to monitor patients' health remotely. Furthermore, SHS data is used to promote evidence-based interventions for diseases, trauma, and rehabilitation. In such an environment, healthcare data is gathered from patients using various smart sensors and wearable devices for enhanced decision-making. For example, electrocardiogram (ECG) signals can be used for monitoring and classifying heart diseases such as arrhythmia [5]. The volume of data and the complex infrastructure of SHS raise various security and privacy challenges. For example, sensitive data is transmitted between smart healthcare monitoring devices, edge servers (i.e., primary or urban hospitals), and cloud storage for processing and long-term storage over an open channel, the Internet [6]. As a result, attackers can launch various attacks and intrusions, including Distributed Denial of Service (DDoS), fingerprinting, insider threats, spoofing, and replay attacks, to exploit the network and its data. To tackle such attacks and secure the entire SHS environment, Deep Learning (DL)-based cyber threat detection tools are commonly used to examine intrusion patterns. DL-based threat hunting helps prevent repeated intrusions and assists cybersecurity experts in becoming more proactive in preventing and responding to the aforementioned attacks [7,8]. However, such models are prone to overfitting and are often considered "black boxes" because they lack interpretability [9].
In the last few years, blockchain has been widely utilized in numerous fields to promote data privacy and trust, allowing users to communicate and share information while preserving trust, integrity, and greater transparency. It has the potential to enhance the security of the system by providing data integrity, authentication, and Threat Intelligence (TI) [10]. Blockchain and cyber threat detection can work together to promote transparency and integrity and to safeguard SHS against cyber threats. Additionally, trust, accountability, and enhanced transparency can aid AI-based models in decision-making by preventing errors, distortion, and misrepresentation of information [11]. Explainable Artificial Intelligence (XAI) provides concise justifications or explanations for an ML/DL-based threat-hunting decision [12,13]. XAI has proven significant in decision-making since it improves the transparency of AI-based models by disclosing their internal workings and decision-making variables. This transparency enables users to understand the rationale behind model outcomes, thus increasing the overall credibility, reliability, and acceptance of decisions made by AI-based models [14]. Moreover, transparent cyber threat detection lets users understand threat-hunting decisions, including the reasons considered, the data sources, and the algorithms employed [15]. In particular, it encourages accountability and helps users verify threat-hunting decisions, resulting in more informed judgments. Hence, in this research work, we investigate the following Research Questions (RQs):
(RQ1): How can we design and implement a framework to ensure data integrity and authenticity for cyber threat detection modeling? AI-based threat hunting is data-driven and is usually modeled using data provided by multiple cloud vendors. Consequently, a malicious cloud can launch an insider attack, in which the attacker intentionally misuses their privileges and provides false or malicious data to degrade the cyber threat detection capability.

(RQ2): How can we improve the performance of cyber threat detection during the testing phase using healthcare datasets? Data-driven strategies, particularly Recurrent Neural Network (RNN)-based threat hunting, are most frequently used to learn the temporal and spatial representation of healthcare data. However, further growth of the network may result in overfitting and convergence issues. Also, these models do not explicitly capture fine-grained interactions between events in the long-term dependencies of a sequence.

(RQ3): How can we provide transparency of the cyber threat detection model so that its decisions can be comprehended and interpreted? DL-based threat hunting is considered a black box due to its lack of interpretability. It works by learning complex, non-linear data representations, which makes it difficult to understand the rationale behind its decisions. For security analysts and users, knowing the reasoning behind a cyber threat detection decision contributes to trust-building and increases the credibility of threat hunting.
To the best of our knowledge, no prior research has addressed the questions raised above concerning data integrity and authenticity for cyber threat detection modeling together with cyber threat detection transparency. To answer these questions, our contributions are as follows:

1. To address RQ1, we design a blockchain-enabled peer-to-peer cloud server to enable immutable data exchange between multiple cloud vendors by implementing a Clique Proof-of-Authority (C-PoA) consensus. To prevent insider attacks, attribute-based authentication is performed for each entity participating in the SHS ecosystem. The proposed approach protects the SHS ecosystem from insider attacks and provides authenticated data for cyber threat detection modeling.

2. To address RQ2, we design a novel deep learning-based cyber threat detection model by combining Parallel Stacked Long Short Term Memory (PSLSTM) networks with a Multi-head Self-Attention (MhSA) mechanism for improved attack detection in the context of SHS. The PSLSTM improves performance by shortening the training time, and the MhSA mechanism is applied to obtain weighted representations of each feature so that the proposed cyber threat detection can focus more accurately on the relevant features present in the dataset. Extensive experiments are conducted using two publicly available datasets, namely ToN-IoT [16] and the IoT Healthcare Security Dataset [17]. Additionally, a number of evaluation metrics, such as accuracy, precision, recall, F1 score, confusion matrix, and ROC curve, are used to fully assess the effectiveness of the proposed cyber threat detection.

3. To address RQ3, we utilize a model-agnostic XAI technique based on the SHapley Additive exPlanations (SHAP) mechanism for the local and global interpretation of data features and cyber threat detection output to improve the decision-making capability of cybersecurity experts.
The rest of the paper is organized as follows: In Section 2, we discuss the related work focused on blockchain and XAI. Section 3 introduces the proposed blockchain-enabled explainable AI framework. Section 4 introduces the experimental setup, evaluation metrics, dataset description, and result analysis. Section 5 discusses the theoretical and practical contributions. Section 6 concludes the paper.

RELATED WORK
The existing literature is summarized in Table 1. The authors of Reference 18 proposed an anomaly detection model for Internet of Medical Things (IoMT)-based healthcare environments. They used Recursive Feature Elimination (RFE) to investigate the adversaries present in the healthcare data flowing through the system. The optimal features are selected by employing Logistic Regression (LR) and Extreme Gradient Boosting Regression (XGBR). A Multilayer Perceptron (MLP) is adopted to verify the accuracy of the feature selection mechanism. Further, the proposed model is evaluated in terms of attack detection accuracy, precision, recall, and F1-score. The system outperformed the existing intrusion detection schemes targeting the same healthcare area. Efficient investigation of cyber attacks is the main strength of the designed model; however, significant computation overhead is also observed.

The authors in Reference 19 proposed a solution to secure Clustered Edge Intelligence (CEI) by using blockchain. They underline how important it is to preserve and track event history in order to determine the causes of failures. According to the authors, blockchain's immutability property can stop unwanted changes. In addition, the authors acknowledge upcoming obstacles and plan to integrate cutting-edge blockchain technology with edge infrastructure to create a more transparent and efficient CEI. Similarly, in Reference 20, the authors introduce a decentralized security mechanism for the IoT in fog computing and mobile edge by combining blockchain technology with Software-Defined Networking (SDN). Blockchain is used to address failure issues in existing models, while SDN analyzes system traffic to construct an attack identification model. The authors utilized the NSL-KDD dataset to train and evaluate their DL-based model. Upon evaluation, the decentralized approach outperformed the centralized and distributed approaches used by the authors for comparison. Consequently, the overall security of the edge nodes is improved by the decentralized attack detection technique, which makes use of blockchain technology to identify and reduce fog attacks.

In Reference 21, the authors present a peripheral Federated Learning (FL)-based model that simultaneously improves data privacy and the ability to detect intrusions in the healthcare data originating from IoMT applications. The researchers developed a Dew-Cloud-based model that employs LSTM and Hierarchical LSTM (HLSTM) to build an efficient Intrusion Detection System (IDS). The designed scheme is trained on the NSL-KDD and ToN-IoT datasets. The proposed model for smart healthcare networks is compared with existing cyber threat detection models on a diverse set of performance metrics. Although the model is efficient in detecting frequently occurring cyber threats in healthcare data, it also shows a high false positive rate. Utilizing a Gated Recurrent Unit (GRU) model, the authors in Reference 22 present an FL-based approach to design a network intrusion detection system. The CICIDS2017 dataset is employed for the experiments. The suggested method retains data locally and allows many ISPs to collaborate on DL training. This method improves detection accuracy while protecting network traffic privacy. Moving forward, the authors in Reference 23 choose the best parameters for attack ...

Researchers in Reference 24 presented an LSTM-based technique for malware detection that can detect and identify new and unknown malware families in addition to differentiating between malicious and benign samples. The authors evaluated their technique using the CICAndMal2017 dataset. The case studies demonstrate that the model can detect new malware families with an accuracy of more than 90%; however, this claim cannot be extended to other malware samples, as the results are limited to the validation of the malware families in this dataset. The authors in Reference 25 designed an intelligent edge-centric framework for efficiently analyzing healthcare networks. In addition, a cyber threat detection model is designed by employing the Swarm Neural Network (SNN) technique to detect attacks during data communication in the edge-centric IoMT framework while enhancing prediction accuracy. Traditional ML-based classification models, such as SVM, LR, DT, and KNN, are used to build the designed model. The proposed model is evaluated on the ToN-IoT dataset, a real-time security dataset that includes practical IoT framework attacks. The system achieved notable attack detection accuracy; however, considerable communication overhead is also observed.
The authors in Reference 26 designed an ensemble learning and fog-cloud model for threat investigation in IoMT systems. To identify adversaries, NB, SVM, K-Nearest Neighbour (k-NN), and XGBoost algorithms are used with the ToN-IoT dataset. The performance evaluation shows that the designed model attained a detection rate of 99.98% and an accuracy of 96.35%, with a low false alarm rate of 5.59%. The researchers in Reference 27 designed a Multi-View attention-based threat identification method to investigate common adversaries in smart healthcare networks. The designed threat identification model is built on LSTM, DNN, and CNN. The system is trained on Windows- and Android-based malware datasets. The simulation results show that the designed model achieves an accuracy of 98% on the Windows-based dataset and 97% on the Android-based dataset. These results make the proposed model well suited to Windows- and Android-based smart healthcare monitoring systems.
The authors in Reference 28 designed a threat investigation model to increase the security of a smart healthcare environment. The authors use an ML-based support system that combines a Random Forest (RF) with a genetic algorithm to design threat investigation models with a high identification rate and a low false alarm rate. The NSL-KDD dataset is used to evaluate the RF, Naive Bayes (NB), and logistic regression algorithms. The performance evaluation results demonstrate that the designed model shows an average precision of 5.65% and an 8.2% average F1 score. The authors in Reference 29 proposed an XAI-based threat investigation mechanism for IoMT-based healthcare systems. The designed scheme is equipped with a Simple Recurrent Units (SRU) model for the identification of common adversaries. Furthermore, the bidirectional nature of the SRU provides additional strength to the proposed model. The framework is trained on the ToN-IoT dataset, and its performance is assessed in terms of threat detection accuracy, precision, recall, and F1-score. The performance of the proposed model is then compared with state-of-the-art classifiers, such as LSTM and Gated Recurrent Unit (GRU). In the evaluation, attacks are effectively identified; however, significant communication delays are also noticed.
The authors in Reference 30 designed another intrusion detection framework for IoT networks that is empowered by XAI. This framework is based on the LSTM classifier; in addition, a novel technique, SPIP, is used to train the conventional LSTM classifier. The researchers validated their model using three well-known datasets, that is, NSL-KDD, ToN-IoT, and UNSW-NB15. The model efficiently safeguards IoT networks against Denial of Service (DoS) attacks, User-to-Root (U-to-R) attacks, Remote-to-Local (R-to-L) attacks, and probe attacks. The performance of the designed scheme is evaluated in terms of attack detection accuracy and processing speed. The system has shown promising results in terms of efficient attack detection with high processing speed; however, it also shows high false positive rates.
The authors in Reference 31 used XAI in another anomaly detection scheme, designed for IoT-based smart networks. The model is based on several ML classifiers, such as Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM). The authors employed the CICIDS2017 dataset to train their model. The model achieves a threat detection accuracy of 96.25%. The authors further aim to extend the applications of their proposed model by integrating it with DeepLIFT, AIX360, and SHapley Additive exPlanations (SHAP). Researchers in Reference 12 proposed an Explainable Deep Learning (XDL)-based mechanism to expose cyber adversaries in IoT-based smart industrial networks. The designed model is driven by CNN and Convolutional LSTM algorithms and is trained on a real-world Gas Pipelines (GSP) dataset. The designed framework is evaluated using standard evaluation metrics and is further compared with CNN, LSTM, and Autoencoder LSTM. The authors claim that their proposed model outperforms the benchmarked schemes with high accuracy in cyber threat detection. However, the model is complex and requires higher computational resources.
Researchers in Reference 32 developed another attack-resilient model to combat frequently occurring anomalies in the Internet of Vehicles (IoV). The model is driven by a CNN classifier and is trained on the ToN-IoT dataset. An application of SHAP is also presented to explain the output of the threat detection scheme. The model is evaluated and compared with benchmarked intrusion detection models. The proposed scheme outperforms the competitor schemes with 99.15% accuracy, 99.10% precision, 99.15% recall, and 99.83% F1-score. Finally, the authors in Reference 33 proposed a novel DL-based intrusion detection mechanism for IoT-based smart communications. The system employs the SHAP mechanism to interpret the decisions made by DL-based cyber threat detection. Moreover, the authors explore the interpretation differences between one-vs-all and multiclass classifiers. The designed scheme is trained on the NSL-KDD dataset and shows promising results, detecting attacks reliably with high precision and recall.

PROPOSED BLOCKCHAIN-ENABLED EXPLAINABLE AI FRAMEWORK
In this section, we discuss the overall systematic architecture and the three key elements of the proposed framework in the context of the Smart Healthcare System. Figure 1 shows the network model of the proposed blockchain-enabled eXplainable AI framework for enhanced decision-making in SHS. The following subsections provide more information on each of these elements.

Overall systematic architecture
1. User data layer: This layer consists of various IoT-based smart devices (e.g., MRI scanners, glucose monitoring sensors, and temperature sensors) that collect patient health data and transmit it in real time to the edge servers-hospitals for processing and storage.

2. Edge servers-hospitals: This layer includes powerful edge servers located at primary or urban hospitals across geo-distributed locations. The sensor data (transactions) are stored on the edge servers, where they are processed and either sent back to the user data layer or aggregated and transmitted to the cloud for additional processing and/or long-term storage. To monitor the data transfer from the smart devices to the edge servers and from the edge servers to the cloud servers, the proposed cyber threat detection is deployed at the edge servers. As a result, it can also deal with the problems in SHS related to low latency, mobility support, location awareness, and geo-distribution.

3. Blockchain-enabled cloud servers: This layer consists of various cloud data centers forming a peer-to-peer (P2P) network. In the proposed framework, blockchain is integrated with the cloud P2P network. The data or transactions forwarded from the edge servers are first used to execute the consensus mechanism, based on which a block is created and added to the network. As a result, every cloud transaction is stored on the distributed digital ledger, which makes the data practically impossible to change or modify.

4. Enhanced decision support system: The stored, unaltered data is used to train the proposed cyber threat detection in the cloud environment. Furthermore, to support the decision-making of the proposed cyber threat detection, we have implemented the SHAP mechanism for the local and global interpretation of data features. The obtained SHAP plots are then used to determine which features affect the prediction model, their relative significance, and the justification of specific decisions.

FIGURE 1 Network model of the proposed blockchain-enabled eXplainable AI for enhanced decision making in smart healthcare system.

Blockchain mechanism
This phase discusses the secure sharing of data between the entities, covering entity registration and authentication. To secure communication between the entities, attribute-based authentication is performed. Three attributes are checked during communication, namely the third-party secret key, the entity identity, and the entity secret key. These three attributes are generated during the registration of each entity and are verified during authentication. The notation used in this phase is listed in Table 2. The details of the registration and authentication processes are given below.

Entities registration
This phase presents the registration process of the User Device Layer ($U_{DL}$), Edge Server Hospital ($E_{SH}$), and Blockchain-enabled Cloud Server ($B_{ECH}$). Registration is performed by a trusted authority ($T_Z$). $T_Z$ sets the parameters used to register each working entity in the proposed framework, such as a cyclic group $C_{GR}$ of prime order $P_O$ and a generator $GR \in C_{GR}$. The registration process of each entity is shown in Table 3.

TABLE 2 The notation used in the blockchain mechanism.
Notation      Description
$U_{DL}$      User device layer
$E_{SH}$      Edge server hospitals

TABLE 3 Registration process of each entity. For each entity (e.g., $E_{SH_j}$), $T_Z$ computes three authentication parameters and finally shares these parameters for further communication.

Authentication
This phase presents the authentication scheme between $U_{DL}$ and $E_{SH}$ and between $E_{SH}$ and $B_{ECH}$. The detailed authentication scheme is illustrated in Tables 4 and 5.

Consensus mechanism for block verification and block creation
This phase presents the Clique Proof-of-Authority (C-PoA) consensus mechanism for block verification and block creation in the proposed framework. Verification and block creation are performed by the mining node $B_{ECH}$. The proposed algorithm consists of two functions, namely blocksignature() and generateblock(). The blocksignature() function takes two parameters, that is, a block index and the corresponding previous block index. These indices are verified by all the miners: if any block index is found to be invalid, the function returns false; otherwise, the validated block is assigned a new block index and the function returns true. The generateblock() function stores the previous block index in Z. Next, it waits for the signature verification and for the verification of the current and previous block indices within a certain time delay. If everything is verified successfully, the block weight is set to 2; otherwise, it is set to 1. Finally, a new block is created with all the desired outputs, such as the block index, previous block, timestamp, and signatures of the miners, as shown in Algorithm 1.
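The two functions described above can be sketched in Python as follows. This is a minimal illustration, not the paper's Algorithm 1: the signer names, the SHA-256 stand-in for a digital signature, the `delay`/`max_delay` parameters that model the "certain time delay", and the choice to reject a block with an invalid index are all assumptions made for the example.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import List, Optional

AUTHORITIES = ["cloud-A", "cloud-B", "cloud-C"]  # hypothetical authorised miners


@dataclass
class Block:
    index: int
    prev_index: int
    timestamp: float
    signatures: List[str]
    weight: int


def sign(signer: str, index: int, prev_index: int) -> str:
    """Stand-in for a miner signature (a hash, NOT a real digital signature)."""
    payload = f"{signer}:{index}:{prev_index}".encode()
    return hashlib.sha256(payload).hexdigest()


def block_signature(index: int, prev_index: int, chain: List[Block]) -> bool:
    """Return True only if the new index directly extends the last block."""
    last = chain[-1]
    return prev_index == last.index and index == last.index + 1


def generate_block(chain: List[Block], index: int, delay: float,
                   max_delay: float = 5.0) -> Optional[Block]:
    """Create and append a block; weight 2 if verified in time, else 1."""
    z = chain[-1].index  # previous block index, stored in Z as in the text
    if not block_signature(index, z, chain):
        return None  # assumption: a block with an invalid index is rejected
    weight = 2 if delay <= max_delay else 1
    signatures = [sign(a, index, z) for a in AUTHORITIES]
    block = Block(index, z, time.time(), signatures, weight)
    chain.append(block)
    return block


chain = [Block(0, -1, time.time(), [], 2)]            # genesis block
first = generate_block(chain, index=1, delay=1.0)     # verified in time
late = generate_block(chain, index=2, delay=9.0)      # verified too late
bad = generate_block(chain, index=7, delay=1.0)       # index does not follow
```

In this sketch the block weight only records whether verification finished within the allowed delay; how the weight is later used to choose between competing chains is left out.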

Deep learning-based intrusion detection system
In this subsection, the proposed deep learning-based cyber threat detection model, based on stacked LSTM (SLSTM) and a multi-head attention mechanism, is described. The pipeline primarily involves data pre-processing, a stacked LSTM model for capturing long-term dependencies between intrusion data vectors, a dropout layer for preventing overfitting, a multi-head attention mechanism for assigning higher weights to the important features, and a feed-forward layer that uses a sigmoid activation function for intrusion detection. Each stage of the pipeline is detailed below.

Dataset pre-processing
TABLE 4 Authentication process between $U_{DL}$ and $E_{SH}$.
Step   Entity      Action
1      $U_{DL}$    Selects a random number $T1 \in Z_P$ and computes the important parameters.
2      $E_{SH}$    Receives MSG1 and computes its parameters.
3      $U_{DL}$    Successfully receives MSG2 and computes $T2 \leftarrow V1 \oplus T3$.

In the pre-processing stage, we use one-hot encoding and min-max normalization. The former converts categorical column values into numerical ones. The latter applies a linear transformation to the numerical data such that the greatest value becomes 1, the smallest becomes 0, and every other value becomes a decimal between 0 and 1. This is crucial because, during training, DL models give each input feature a weight to indicate its relevance, but features with varying scales might introduce bias and cause some weights to update faster than others.
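The two pre-processing steps can be illustrated with a small, dependency-free sketch. The column names and sample values are made up for the example, and the handling of a constant column (mapped to 0.0) is an assumption rather than something the paper specifies.

```python
from typing import List, Sequence


def one_hot(values: Sequence[str]) -> List[List[int]]:
    """Encode each categorical value as a 0/1 vector, one column per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def min_max(values: Sequence[float]) -> List[float]:
    """Linearly rescale values so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


protocols = ["tcp", "udp", "tcp", "icmp"]   # hypothetical categorical feature
durations = [2.0, 4.0, 10.0, 6.0]           # hypothetical numerical feature

encoded = one_hot(protocols)   # columns ordered alphabetically: icmp, tcp, udp
scaled = min_max(durations)
```

In practice, the minimum and maximum would be computed on the training split only and reused on the test split, so that no information leaks from test data into the scaling.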

Parallel stacked long short term memory
TABLE 5 Authentication process between $E_{SH}$ and $B_{ECH}$.
Step   Entity       Action
1      $E_{SH}$     Selects a random number $T1 \in Z_P$ and computes the important parameters.
2      $B_{ECH}$    Receives MSG1 and computes its parameters.
3      $E_{SH}$     Successfully receives MSG2 and computes $T2 \leftarrow V1 \oplus T3$.

Algorithm 1. Block verification and block creation using C-PoA consensus mechanism.

The LSTM network can keep data in both long-term and short-term memory while forgetting irrelevant or insignificant details. Moreover, when compared with classical Recurrent Neural Networks (RNNs), the LSTM network demands fewer computational resources and largely avoids the vanishing gradient problem. The LSTM is built around a self-connected memory cell and three gates, namely the input gate, the forget gate, and the output gate [34]. Initially, the forget gate ($f_t$) decides how much historical information should be forgotten using Equation (1), where $X_t$ denotes the input, $B$ is the bias, $W$ is the weight, and $H_{t-1}$ is the hidden state at time $t-1$. Next, the LSTM decides how much information to use as input through the input gate ($i_t$) and the candidate gate ($c_t$). Finally, the output gate ($o_t$) decides how much information is output. Each gate value lies between 0 and 1, where 0 means forget and 1 means retain. Equations (4) and (5) describe the corresponding computations, where $\odot$ denotes the element-wise product. Finally, several LSTM units are used in parallel and stacked together. The complete architecture of the proposed DL-based cyber threat detection is shown in Figure 2, which includes the input layer, the parallel LSTM layer, the merging LSTM layer, the multi-head attention layer, the fully connected layers, and the sigmoid layer for binary classification.
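For reference, the gate computations described above correspond, in the standard LSTM formulation, to the expressions below. The per-gate subscripts on $W$ and $B$, the symbol $\tilde{c}_t$ for the candidate state, and the concatenation $[H_{t-1}, X_t]$ are assumed notation, and the ordering may differ from the paper's own numbered equations.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[H_{t-1}, X_t] + B_f\right) && \text{(forget gate)}\\
i_t &= \sigma\left(W_i\,[H_{t-1}, X_t] + B_i\right) && \text{(input gate)}\\
\tilde{c}_t &= \tanh\left(W_c\,[H_{t-1}, X_t] + B_c\right) && \text{(candidate state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)}\\
o_t &= \sigma\left(W_o\,[H_{t-1}, X_t] + B_o\right) && \text{(output gate)}\\
H_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

Because $\sigma$ maps into $(0, 1)$, a gate value near 0 suppresses the corresponding term in the element-wise products, while a value near 1 lets it pass through.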

Dropout
FIGURE 2 Process of the proposed framework for enhanced decision making in smart healthcare system.

A dropout layer is introduced in the SLSTM layer as a regularization method to prevent overfitting. This method drops only non-recurrent (feed-forward, input-to-output) connections, rather than the recurrent connections between hidden states at different timestamps. More specifically, the dropout operator $D$, which sets a random subset of its input to zero, only affects the output $H^{l-1}_{t}$ of the previous layer at the same timestamp and does not affect the output $H^{l}_{t-1}$ of this layer at the previous timestamp. This operator is applied as given in Equation (7) [35].
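The distinction between the two inputs of a layer can be made concrete with a short sketch. It assumes "inverted" dropout (surviving activations are rescaled by 1/keep so their expected value is unchanged), which is a common implementation choice and not something the text states; the function names and sample vectors are illustrative only. The rate of 0.3 matches the dropout probability reported in the experimental setup.

```python
import random
from typing import List, Tuple


def dropout(vec: List[float], rate: float, rng: random.Random) -> List[float]:
    """Operator D: zero a random subset of the input, rescale the survivors."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vec]


def layer_inputs(h_below_t: List[float], h_same_prev: List[float],
                 rate: float, rng: random.Random) -> Tuple[List[float], List[float]]:
    """Inputs of layer l at time t: dropout hits only the non-recurrent one.

    h_below_t   -- output of layer l-1 at the same timestamp (dropped)
    h_same_prev -- output of layer l at the previous timestamp (left intact)
    """
    return dropout(h_below_t, rate, rng), list(h_same_prev)


rng = random.Random(0)
h_below = [0.5, -1.0, 2.0, 0.25]
h_prev = [0.1, 0.2, 0.3, 0.4]
dropped, recurrent = layer_inputs(h_below, h_prev, rate=0.3, rng=rng)
```

Leaving the recurrent path untouched means the information carried across timestamps is never randomly erased, so long-term dependencies are preserved while the layer is still regularized.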

Multi-head self-attention mechanism
The Multi-head Self-Attention (MhSA) method separates important data from the input traffic by utilizing the attention mechanism. Specifically, MhSA is used to explore the discriminative information in the features obtained by the PSLSTM blocks. Multi-head attention involves several "heads", each with an attention function that analyzes a different portion of the input. Three inputs, query ($Q$), key ($K$), and value ($V$), are needed for the multi-head attention computation. The output of self-attention can be computed as [36]:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \quad (8)$$

where $d_k$ is the dimension of the keys. Using the dot product, the similarity between each query and each key is determined. The softmax function is then used to obtain the weights on the values, and the output is the weighted sum of the corresponding values. Instead of applying a single attention function to the $d$-dimensional queries, keys, and values, the MhSA module linearly projects $Q$, $K$, and $V$ $H$ times with different learned linear projections to $d_k$, $d_k$, and $d_v$ dimensions, respectively. The attention function is then executed in parallel on each of these projected versions of the queries, keys, and values to produce $d_v$-dimensional output values. The final values are produced by concatenating them and then projecting them once more:

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_H)\,W^{O} \quad (9)$$

where

$$\mathrm{head}_I = \mathrm{Attention}(QW^{Q}_{I}, KW^{K}_{I}, VW^{V}_{I}) \quad (10)$$

where $W^{Q}_{I}$, $W^{K}_{I}$, and $W^{V}_{I}$ denote the trainable weight matrices of the queries, keys, and values, respectively, $W^{O}$ is the output projection matrix, and $H$ is the number of attention heads ($H = 8$ in this paper).
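The attention computation described above can be sketched in plain Python with nested lists. This is a toy illustration of scaled dot-product and multi-head self-attention, assuming two heads and hand-picked projection matrices instead of the eight learned heads used in the paper; all helper names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """softmax(Q K^T / sqrt(d_k)) V"""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)


def multi_head(x: Matrix, heads: List[Tuple[Matrix, Matrix, Matrix]],
               w_o: Matrix) -> Matrix:
    """Self-attention: Q, K and V are all projections of the same input x."""
    outs = [attention(matmul(x, wq), matmul(x, wk), matmul(x, wv))
            for wq, wk, wv in heads]
    # Concatenate the per-head outputs row by row, then project once more.
    concat = [sum((o[i] for o in outs), []) for i in range(len(x))]
    return matmul(concat, w_o)


# Two identical keys receive equal weights, so the output is the mean of V.
mixed = attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[2.0], [4.0]])

x = [[1.0, 0.0], [0.0, 1.0]]                   # two tokens, two features
head_a = ([[1.0], [0.0]],) * 3                 # (W_Q, W_K, W_V) for head 1
head_b = ([[0.0], [1.0]],) * 3                 # (W_Q, W_K, W_V) for head 2
w_o = [[1.0, 0.0], [0.0, 1.0]]                 # identity output projection
out = multi_head(x, [head_a, head_b], w_o)
```

Each head here projects the input to a single dimension, so concatenating two heads restores the original width before the final projection.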

Fully connected layer
Next, the Fully Connected Layer (FCL) unit receives the output of the MhSA module. The FCL unit flattens the feature map into a vector, multiplies it by its weight matrix, and reduces the vector's dimension. The final classification outcome is then obtained by feeding the vector into the activation layer. The sigmoid activation function is used in this paper to perform the binary attack detection task.

Model training
Throughout the training phase, the proposed DL-based cyber threat detection model repeats the same mathematical operations (described in Equations 1-10) to identify the relevant attacks present in the datasets. This is done by minimizing the loss function in Equation (11), computed between the actual output $y$ and the predicted output $\hat{y}$ over $n$ observations (i.e., the batch size).
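Since the experimental setup names binary cross-entropy as the loss, the quantity being minimized can be sketched as follows, assuming the usual definition $L = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right]$. The clipping constant `eps` is an implementation assumption that avoids taking the logarithm of zero.

```python
import math
from typing import Sequence


def binary_cross_entropy(y_true: Sequence[int], y_pred: Sequence[float],
                         eps: float = 1e-7) -> float:
    """Mean binary cross-entropy over one batch of n observations."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


labels = [1, 0, 1, 0]                       # 1 = attack, 0 = normal
uncertain = binary_cross_entropy(labels, [0.5, 0.5, 0.5, 0.5])
confident = binary_cross_entropy(labels, [0.9, 0.1, 0.8, 0.2])
```

A model that outputs 0.5 for every sample scores $\ln 2$, and the loss falls toward 0 as the sigmoid outputs move toward the correct labels.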

Interpretability mechanism-from learned features to decision making operations
The proposed DL-based cyber threat detection follows a "black-box" approach, in which the justification for the predictions is not made clear. As a result, the decision-makers, or cybersecurity professionals, remain largely unaware of the underlying predictive mechanisms. The problem can be formulated as follows: given input features $\vec{X} = (x_1, \ldots, x_N)$, the goal is to introduce a set of permutations in the feature set and then analyze the relative contribution of each feature value to the specific outcome.
To achieve this, the proposed framework employs the SHapley Additive exPlanations (SHAP) method for the interpretation of the decision model. SHAP employs a cooperative game theory technique to calculate the contribution of each feature to the prediction in order to present a clear decision model. According to the additive feature attribution approach used by SHAP, the output of the model is expressed as a linear sum of the contributions of the input variables. The solid theoretical grounding of SHAP is useful in supervised settings. It explains a specific prediction using Shapley values, assigning each feature a SHAP value that satisfies the following properties: (1) Local accuracy: the explanation model must at least match the output of the original model; (2) Missingness: features that are missing from the original input must receive no attribution; (3) Consistency: if a model is changed so that it depends more on a variable, the importance attributed to that variable should not decrease, regardless of how important the other variables are. The explanation model function is defined following Reference 37, where the explanation model approximates the black-box model $B(\vec{X})$, $N$ is the number of features, and $\phi_i$ is the decomposition factor of feature $i$. To approximate SHAP values, this paper uses the Deep SHAP approach.
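To make the additive attribution concrete, the sketch below computes exact Shapley values by brute force for a tiny toy model. It is not the Deep SHAP approximation used in the paper: the toy model, the all-zero baseline that stands in for "absent" features, and the function names are assumptions for illustration, and the exhaustive enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(model: Callable[[List[float]], float],
                   x: Sequence[float], baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values; absent features are replaced by the baseline."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def toy_model(features: List[float]) -> float:
    """Hypothetical black box with one interaction term (a * c)."""
    a, b, c = features
    return 2.0 * a + 1.0 * b + a * c


x = [1.0, 3.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
phi_0 = toy_model(list(baseline))               # output with no feature present
reconstructed = phi_0 + sum(phi)                # local accuracy: equals model(x)
```

The interaction term `a * c` contributes 2 to the output and is split evenly between the two features involved, while the sum of all attributions plus the baseline output recovers the model's prediction, which is the local accuracy property.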

EXPERIMENTAL EVALUATION
In this section, we discuss the experimental setup, evaluation metrics, datasets used for the experiment, experimental analysis of the blockchain technique, and experimental analysis of the DL-based cyber threat detection, including XAI interpretation and quantitative results. Each of them is explained below.

Experimental setup
The experiment was carried out on a PowerEdge R940xa rack server with two Intel Xeon Gold 6240 processors clocked at 2.6 GHz, 256 GB of RAM, and eight NVIDIA Ampere A100 80 GB passive GPUs. The blockchain experiment was carried out in the Hardhat development environment on the Ethereum Goerli test network. Solidity version 0.8.20 was used to create the smart contract, and the MetaMask wallet (browser extension) was used to create the entities' addresses. Etherscan desktop version 9.8 was used for the storage of transactions. The proposed DL-based cyber threat detection was implemented in Python 3.7 with the TensorFlow deep learning library version 2.5, and SHAP version 0.39.0 was used to interpret its decisions. The DL-based model is trained with 2 layers of 32 and 18 neurons. The "ReLU" activation function is used for the fully connected layers, together with the "Adam" optimizer at a learning rate of 0.001, 20 epochs, a dropout probability of 0.3, 128 samples per batch, 64 batches per epoch, and the "binary cross-entropy" loss.

Evaluation metrics
We evaluate the performance of the proposed scheme using the following evaluation metrics: Accuracy (AC), Recall (RL), Precision (PR), and F1-score (F1). The equations below compute these metrics.

Accuracy measures the overall correctness of the model by calculating the ratio of correctly predicted instances (both positive and negative) to the total number of instances. A higher accuracy indicates a better overall performance of the model.

$$AC = \frac{Tr_{Pos} + Tr_{Neg}}{Tr_{Pos} + Tr_{Neg} + Fal_{Pos} + Fal_{Neg}} \tag{13}$$

Recall assesses the model's ability to correctly identify positive instances among all actual positive instances. A higher recall value implies that the model captures a larger proportion of actual positive instances.

$$RL = \frac{Tr_{Pos}}{Tr_{Pos} + Fal_{Neg}} \tag{14}$$

Precision measures the accuracy of positive predictions, calculating the ratio of correctly predicted positive instances to the total predicted positive instances. A higher precision indicates that the model has fewer false positives, meaning that when it predicts a positive instance, it is more likely to be correct.

$$PR = \frac{Tr_{Pos}}{Tr_{Pos} + Fal_{Pos}} \tag{15}$$

F1-score is the harmonic mean of precision and recall, providing a balanced measure that considers both false positives and false negatives. The F1-score is particularly useful when there is an uneven class distribution, as it balances the trade-off between precision and recall.

$$F1 = \frac{2 \times PR \times RL}{PR + RL} \tag{16}$$

where $Tr_{Pos}$ is the number of true positives and $Tr_{Neg}$ is the number of true negatives, while $Fal_{Pos}$ and $Fal_{Neg}$ denote the number of false positives and false negatives, respectively.
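The four metrics can be computed directly from the confusion-matrix counts. The sketch below is a minimal illustration; the function name and the example counts are hypothetical and are not taken from the paper's results.

```python
def classification_metrics(tr_pos, tr_neg, fal_pos, fal_neg):
    """Accuracy, recall, precision, and F1-score from confusion-matrix counts.

    Assumption: every denominator is non-zero, i.e. there is at least
    one actual positive and at least one predicted positive.
    """
    total = tr_pos + tr_neg + fal_pos + fal_neg
    ac = (tr_pos + tr_neg) / total
    rl = tr_pos / (tr_pos + fal_neg)
    pr = tr_pos / (tr_pos + fal_pos)
    f1 = 2 * pr * rl / (pr + rl)
    return {"AC": ac, "RL": rl, "PR": pr, "F1": f1}


# Hypothetical counts, used only to exercise the formulas.
metrics = classification_metrics(tr_pos=90, tr_neg=80, fal_pos=10, fal_neg=20)
```

With these counts, precision (0.90) exceeds recall (about 0.82), and the F1-score falls between the two, closer to the lower value, as expected of a harmonic mean.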

Dataset description
In this work, we employed two publicly available datasets, ToN-IoT 16 and IoT Healthcare Security 17. The ToN-IoT dataset includes 43 predictor features and 1 target feature. Similarly, the IoT Healthcare Security dataset includes 51 predictor features and 1 target feature. In both datasets, the target variable comprises benign and attack classes. Furthermore, we divide these datasets into training and testing sets using the traditional 70-30 ratio. Finally, the preprocessing and normalization of these datasets follow Reference 38. The distribution of the benign and attack classes across the training and testing sets is given in Table 6.
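A 70-30 split can be sketched as follows. The paper does not state whether the split is stratified, so preserving the benign/attack proportions in both sets is an assumption of this example, as are the function name, the seed, and the toy labels.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=42):
    """Return (train_idx, test_idx) index lists with class proportions preserved.

    Assumption: stratification by class label; the source only states
    that a 70-30 ratio is used.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed for a reproducible split
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        cut = round(train_fraction * len(idxs))
        train_idx.extend(idxs[:cut])
        test_idx.extend(idxs[cut:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 60 benign (0) and 40 attack (1) records.
labels = [0] * 60 + [1] * 40
train_idx, test_idx = stratified_split(labels)
```

Splitting per class keeps the attack ratio of the test set equal to that of the full dataset, which matters when the classes are imbalanced.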

Result analysis for blockchain technique
This section presents the block mining and block creation analysis. Both are computed based on the C-PoA consensus mechanism. Figure 3 illustrates the block mining time against the number of cloud servers ($B_{ECH}$) in the P2P blockchain-enabled cloud and the number of shared transactions. For transaction (Tx) batch sizes of 250, 500, and 750 Tx, and for sets of 7, 14, 21, 28, and 35 $B_{ECH}$, the mining time increases linearly. However, for other batches of transactions, execution time depends on the number of peers in the healthcare network. Figure 4 illustrates the block creation time for different batches of Tx. Here, execution time depends on the number of $B_{ECH}$ sharing the Tx over the network.
FIGURE 3 Blockchain simulation result for block mining time.
FIGURE 4 Blockchain simulation result for block creation time.

Result analysis for DL-based cyber threat detection
In this subsection, we discuss the interpretation and performance of the proposed DL-based cyber threat detection in enhanced decision-making.

XAI interpretation
The SHAP mechanism of eXplainable AI illustrates the decision-making process of a model to enhance its transparency by using several plots, such as the Decision Plot (DP), Waterfall Plot (WP), Summary Plot (SP), and Force Plot (FP). In the DP, the y-axis represents the features of the dataset, while the x-axis shows the expected value of the model. It demonstrates the key features involved in a model's output. The DP can be more useful than the FP when a large number of important features are involved: unlike the FP, the vertical structure of the DP can show the impact of any number of features. Figure 5A depicts the DP for the ToN-IoT dataset, while the DP for the IoT Healthcare Security dataset is shown in Figure 5B. The WP, on the other hand, presents an explanation for a particular prediction and therefore requires a single row of an explanation object as input. Figure 6A,B depict the WP for the ToN-IoT and IoT Healthcare Security datasets, respectively. The red bars show the positive contributions of the features to the model prediction, while the blue bars show the negative contributions. Moreover, we employ the SP to visualize the features in the ToN-IoT and IoT Healthcare Security datasets that are most significant for the proposed cyber threat detection when predicting attacks. Figure 7A depicts the SP of the ToN-IoT dataset, showing the top 20 features that contribute the most to the attack prediction of the proposed cyber threat detection. Further, the SP of the IoT Healthcare Security dataset is presented in Figure 7B, where the top 10 features that contribute to the prediction are visualized.
Finally, the FP is useful for visualizing feature contributions for specific instances, where each feature value acts as a force that supports or opposes a prediction. It enables us to see how particular features influenced the model's forecast for a single observation. We employed the FP in this work to visualize the features of the ToN-IoT and IoT Healthcare Security datasets that support or oppose the output of the proposed scheme. Figure 8A

ROC curve analysis
Receiver Operating Characteristic (ROC) analysis is a statistical method used to evaluate the performance of a binary classification model. It plots the true positive rate (sensitivity) against the false positive rate (1-specificity) for different threshold settings, illustrating the trade-off between correctly identifying positive instances and incorrectly classifying negative instances as positive. The area under the ROC curve (AUC) quantifies the model's ability to distinguish between the two classes, with a higher AUC indicating better overall performance. We use ROC analysis to assess the discrimination capability of the proposed scheme. Figure 10A,B present the ROC curves for the ToN-IoT dataset and the IoT Healthcare Security dataset, respectively. Each point on the curve corresponds to a specific classification threshold, and the shape of the curve shows how the model balances sensitivity and specificity. The high AUC values reported (0.994 for ToN-IoT and 0.988 for IoT Healthcare Security) indicate that the proposed scheme separates the positive and negative classes well across a range of classification thresholds on both datasets.
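The construction of the ROC curve and its AUC can be sketched in a few lines. The example sweeps the threshold over the distinct predicted scores and integrates the curve with the trapezoidal rule. The labels and scores are hypothetical, the function names are illustrative, and labels are assumed to be 0 (negative) or 1 (positive).

```python
def roc_points(y_true, scores):
    """(FPR, TPR) pairs obtained by sweeping the threshold over distinct scores.

    Assumption: labels are 0 or 1, and an instance is predicted positive
    when its score is greater than or equal to the threshold.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels and predicted attack probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(y_true, scores)
auc = auc_trapezoid(curve)
```

An AUC of 1.0 corresponds to a perfect ranking of positives above negatives, while 0.5 corresponds to a random ranking.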

Overall performance analysis
Figure 11A presents the performance metrics of the proposed scheme on the ToN-IoT dataset. The model attains an AC of 97.27%, the proportion of correctly classified instances. The PR, the accuracy of positive predictions, is 94%. The RL is 98.50%, so the model captures nearly all actual positive instances. The F1, which balances precision and recall, is 96.20%. Figure 11B shows the overall performance on the IoT Healthcare Security dataset, where the model reaches an accuracy of 89.62%. Precision is 99.13%, so positive predictions are almost always correct. However, recall is notably lower at 76.18%, meaning that about 24% of actual positive instances are missed. The F1 score is 86.15%, reflecting this gap between precision and recall. Taken together, these results show that the proposed scheme delivers accurate predictions on both datasets, with a better precision-recall balance on ToN-IoT than on IoT Healthcare Security.
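As a quick consistency check, the reported F1 values can be recomputed from the reported PR and RL values using the harmonic mean. The dictionary layout and function name below are illustrative; the numbers are the ones quoted in the paragraph above.

```python
def f1_from(pr, rl):
    """Harmonic mean of precision and recall (both in the same unit)."""
    return 2 * pr * rl / (pr + rl)


# Values as reported in the text, in percent.
reported = {
    "ToN-IoT": {"PR": 94.00, "RL": 98.50, "F1": 96.20},
    "IoT Healthcare Security": {"PR": 99.13, "RL": 76.18, "F1": 86.15},
}

recomputed = {name: f1_from(m["PR"], m["RL"]) for name, m in reported.items()}
for name, value in recomputed.items():
    print(f"{name}: recomputed F1 = {value:.2f}%, reported F1 = {reported[name]['F1']:.2f}%")
```

For both datasets, the recomputed value agrees with the reported F1 to two decimal places.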

Theoretical contributions
The proposed framework makes several theoretical contributions. First, attribute-based authentication using blockchain ensures immutable, transparent, and trustworthy data exchange between the participating entities. This is important because, in the SHS environment, data is transmitted and exchanged among various smart devices, access points such as edge servers, and cloud servers over open, insecure channels, and is therefore prone to various attacks. The work in Reference 25 designed an intelligent edge-centric framework for healthcare but lacks a suitable blockchain-based authentication scheme. In contrast, blockchain is used in this work to guarantee persistence and auditability of the stored data, which later gives confidence in developing an efficient decision support system for attack detection. Moreover, the underlying mechanism is robust against various security attacks, including insider attacks, replay attacks, and Man-in-the-Middle (MitM) attacks. For example, to execute an insider attack, an attacker has to create an analogous message MSG3 ← ($U_{DL}^{SK}$ ⊕ T2) using the authentic session key $U_{DL}^{SK}$ and, at the same time, compute T2 within the permitted time delay. It is practically infeasible for an attacker to obtain the session key and compute T2 within that delay. Thus, the process prevents insider attacks. Similarly, in a replay attack, the $U_{DL}$ generates a message MSG3 ← ($U_{DL}^{SK}$ ⊕ T2). The $E_{SH}$ verifies this message by computing T2. This check prevents an unauthorized $U_{DL}$ from broadcasting the message MSG3 and thereby prevents replay attacks. Finally, for a MitM attack, the message must be generated by a valid $U_{DL}$, that is, MSG3 ← ($U_{DL}^{SK}$ ⊕ T2), including the authorized session key. To generate the authorized session key, three attributes are checked during authentication, namely the third-party secret key, the entity identity, and the entity secret key. Only after this check succeeds is the message MSG3 sent, which prevents the MitM attack.
Second, we contribute to the field of cybersecurity by proposing a novel explainable deep learning-based intrusion detection system for attack detection in smart healthcare systems. The proposed end-to-end deep learning approach combines Parallel Stacked Long Short-Term Memory (PSLSTM) networks with a multi-head self-attention (MhSA) mechanism, which can automatically extract weighted spatial-temporal features and reduce manual involvement. Additionally, the retrieved temporal and spatial features are accurate representations of normal and attack vectors, which can aid in the design of domain-specific intrusion detection tasks. Compared with the network models presented in References 12, 21, 27 and 30, the proposed deep learning-based cyber threat detection based on the PSLSTM and MhSA mechanism can carry out parallel computation and can therefore decrease the training time to some extent. This gives the decision-maker more options in real-world decision support, where the computational burden must be minimized in order to obtain actionable results on time (i.e., when implementing crucial security measures in healthcare applications).

Practical contributions
Our work makes important contributions to practice. First, we have proposed an attribute-based authentication scheme for each entity participating in the communication. These attributes are created during the registration of the entity and are verified whenever entities communicate with each other. Once the verification succeeds, the session key is shared between the entities for further communication. Only authenticated users are allowed to communicate in the network. The framework includes blockchain-based secure communication to ensure immutability, transparency, and integrity of the information. The blockchain layer provides a distributed infrastructure where each authenticated entity can share transactions, which are then recorded in the blockchain ledger. We use the widely accepted Hardhat tool, the MetaMask wallet, the Goerli test network, and the Etherscan distributed ledger to perform legitimate communication in the blockchain network. The Clique Proof-of-Authority (C-PoA) consensus mechanism is used to create and verify blocks. This is a voting-based consensus that commits a block after (n/2 + 1) votes, where n denotes the number of mining nodes in the network. Our scheme is also secure against various security attacks, namely insider attacks, MitM attacks, and impersonation attacks. In the future, we would like to validate our scheme on a real blockchain network and analyze block mining time, block creation time, communication overhead, and computation overhead. Second, the practical implementation of the proposed deep learning-based cyber threat detection using unaltered data from a blockchain-enabled cloud server can protect the data-driven approach from a malicious cloud (i.e., insider attacks and data poisoning attacks), and as a result, it will not subsequently lead to "incorrect attack detection on datasets."[30][31][32][33] Our method addresses this issue by integrating blockchain with cloud servers and thereby assures the trustworthiness of data, as the transactions are verified in the distributed ledger using the C-PoA consensus mechanism and collected using the attribute-based authentication method in the SHS environment.
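The (n/2 + 1) voting rule of the C-PoA consensus can be sketched as a simple majority check. The function names are illustrative, and reading n/2 as integer (floor) division is an assumption of this example. The node counts are the $B_{ECH}$ set sizes used in the blockchain experiments.

```python
def votes_required(n_miners):
    """Smallest number of votes that satisfies the (n/2 + 1) rule.

    Assumption: n/2 is integer (floor) division, which yields a strict
    majority of the mining nodes.
    """
    if n_miners < 1:
        raise ValueError("at least one mining node is required")
    return n_miners // 2 + 1


def can_commit(votes, n_miners):
    """True when enough mining nodes have voted to commit the block."""
    return votes >= votes_required(n_miners)


# Thresholds for the node counts used in the experiments.
thresholds = {n: votes_required(n) for n in (7, 14, 21, 28, 35)}
```

For example, with 7 mining nodes a block is committed once 4 of them have voted for it, and with 35 nodes once 18 have voted.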
Third, the interpretation of the proposed cyber threat detection can enhance the decision-making capability of cybersecurity analysts. However, existing work 18,21,[25][26][27] does not provide an interpretation of the improved intrusion detection performance.
To this end, in the proposed framework, the interpretability of the data features and of the cyber threat detection output is examined using a model-agnostic XAI technique, namely the SHAP mechanism. SHAP enables cybersecurity professionals to examine several plots to better understand the effect of each feature on attack detection. For example, ts, dst_port, proto, dns_AA, service, and dst_ip are the 6 most important features out of 44 in the ToN-IoT dataset, with the strongest contribution to attack detection. Similarly, in the IoT Healthcare Security dataset, the features tcp.time_delta, tcp.flags.push, mqtt.msgtype, mqtt.qos, mqtt.hdrflags, and tcp.flags.ack have the strongest contribution to attack detection. Therefore, the development of an effective cyber threat detection, along with its interpretation based on various SHAP plots, will help cybersecurity decision-makers better understand the model's outputs and identify which features contribute most to the attack detection model. As a result, it will assist them in designing effective cyber threat detection that will ultimately lower the level of risk and vulnerabilities in the SHS environment.

CONCLUSION
In this article, we integrated blockchain and explainable AI for enhanced decision-making in smart healthcare systems. First, we used a blockchain mechanism to ensure secure communication between the participating entities of the smart healthcare system. In particular, attribute-based authentication was carried out to enable immutable data exchange and guarantee the reliability of the data. Clique Proof-of-Authority consensus was put into place between various cloud vendors to mine blocks for verification and addition to the distributed digital ledger. Second, the tamper-proof transactional data stored in the ledger was used to train the proposed deep learning-based intrusion detection system. Specifically, the proposed method applies parallel stacked long short-term memory networks to separately extract features from different intrusion datasets, and the multi-head self-attention mechanism to discover the discriminative information in the extracted features. In addition, the proposed model was trained using the sigmoid activation function to perform binary attack detection on two publicly available datasets, namely ToN-IoT and IoT Healthcare Security. Finally, an interpretation model called SHAP was applied to analyze feature importance and interpret the results of the proposed model designed for attack detection in smart healthcare systems. To better understand why the proposed data-driven cyber threat detection produces a given decision for specific input data, SHAP highlights the features that most influence the prediction by assigning a SHAP value to each feature that contributes to the model's output. Future work will focus on adding a transparency component to the proposed deep learning cyber threat detection, which could significantly enhance the decision-making process for cybersecurity specialists in smart healthcare systems.
$BC_i$ ← {$bck_i$ ∪ {bck}, $P_i$ ∪ {bck.parent}}
27: Disseminate block $BC_i$
28: end function
the output of the cell.
depicts the FP for the ToN-IoT dataset, where the prediction probability of the proposed model is shown in bold as 0.99. The FP for the IoT Healthcare Security dataset is provided in Figure 8B.

Confusion matrix analysis
A Confusion Matrix (CMax) summarizes the number of correctly and incorrectly identified records. Each column in the CMax represents the total instances in the predicted class, while each row represents the instances in the given class. Figure 9A depicts the CMax of the proposed scheme under the ToN-IoT dataset, while Figure 9B depicts the CMax under the IoT Healthcare Security dataset. These figures show that the proposed scheme identified both the normal and attack classes correctly.
FIGURE 5 SHAP values using decision plot. (A) Under ToN-IoT dataset. (B) Under IoT healthcare security dataset.
FIGURE 6 SHAP values using waterfall plot. (A) Under ToN-IoT dataset. (B) Under IoT healthcare security dataset.
FIGURE 7 SHAP values using summary plot. The higher SHAP value of a feature corresponds to the higher prediction. (A) Under ToN-IoT dataset. (B) Under IoT healthcare security dataset.

FIGURE 8 SHAP values using force plot. (A) Under ToN-IoT dataset. (B) Under IoT healthcare security dataset.
FIGURE 9 Confusion matrix analysis. (A) Under ToN-IoT dataset. (B) Under IoT healthcare security dataset.

FIGURE 10 The resulting ROC curve analysis on test datasets. (A) Under ToN-IoT dataset. (B) Under IoT healthcare security dataset.
FIGURE 11 Quantitative results of the proposed framework on various evaluation metrics. (A) Under ToN-IoT dataset. (B) Under IoT healthcare security dataset.

TABLE 1 Literature overview. Columns: Ref, Model, Dataset, Advantages, Usage of blockchain, Usage of XAI, Limitation.
Registration process of $U_{DL}$, $E_{SH}$, and $B_{ECH}$: $T_Z$ computes the secret key ($S_{SK}$).
SK: Blockchain-enabled cloud server session key.
TABLE 3
1: Input: Group of miner nodes $bck_{ECH}$, bck ← Block, parentnode ← previous block, $bck_{ECH}$ ID ← Id of block miners, BIndex ← Block number, W ← Weight of Block, bcklockPeriod ← Time of Block commitment, Vote ← ($B_{ECH}$/2) + 1.
2: Output: Current Block Commitment
3: function BlockSignature($BC_i$, Z):
Statistics of total instances for training and testing.