Artificial intelligence of medical things for disease detection using ensemble deep learning and attention mechanism

In this paper, we present a novel paradigm for disease detection. We build an artificial intelligence based system where various biomedical data are retrieved from distributed and homogeneous sensors. We use different deep learning architectures (VGG16, RESNET, and DenseNet) with ensemble learning and attention mechanisms to study the interactions between different biomedical data to detect and diagnose diseases. We conduct extensive testing on biomedical data. The results show the benefits of using deep learning technologies in the field of artificial intelligence of medical things to diagnose diseases in the healthcare decision‐making process. For example, the disease detection rate using the proposed methodology achieves 92%, which is greatly improved compared to the higher‐level disease detection models.

Krishnaswamy Rangarajan & Purushothaman, 2020; Roy & Bhaduri, 2022). These deep learning models are capable of extracting visual features directly from massive amounts of data, not just for learning. The deep learning community is interested in analysing biomedical health data, especially for disease detection (Khan et al., 2022; Wang et al., 2022; Wang, Jin, et al., 2021; Wang, Kang, et al., 2021). COVID-19 samples were used, for example, to create an intelligent model to determine infection rates (Wang, Jin, et al., 2021). That study used both supervised and unsupervised learning approaches, resulting in a 40% increase in detection speed. Pathogen frames were analysed using transfer learning, and COVID-19 examples were validated using common virus-based pneumonia (Wang, Kang, et al., 2021). The result highlights the importance of using intelligent methods to diagnose COVID-19. A 2D-CNN network and a new biomarker are proposed for automatic detection of major depressive disorder (Khan et al., 2022). The proposed biomarker is created by using different signal processing techniques that revolve around estimating the wavelet coherence between the default mode network of the brain. This solution achieves an accuracy of 98.1%, a sensitivity of 98.0%, and a specificity of 98.2% on a small dataset of 60 patients. Another study (Wang et al., 2022) presents a microfluidic imaging device with an embedded computer that is self-contained and portable. The fully self-contained bio-sample detection system includes a portable microfluidic image acquisition module, a light source module, an embedded computer and control module, a micropump module, a touch control panel module, and a power supply module.
In the emerging field of distributed deep learning, different types of deep learning models are also being actively explored for disease detection based on biomedical data (Balachandar et al., 2020; Dwivedi et al., 2021; Ku et al., 2022; Roy et al., 2020; Xu et al., 2022).
Data variability in terms of training sample size and label distribution across institutions is well known to significantly reduce the performance of distributed learning models for medical imaging. In this context, Balachandar et al. (2020)
The main goal of these technologies, especially distributed technologies, is to detect diseases and help biomedical personnel make fair and acceptable biomedical decisions. Disease detection is complicated by a variety of factors, the most important of which is data complexity. Diseases can take many different forms and be located in different regions, making them difficult to diagnose. To overcome these drawbacks, we explore a complete AIoMT framework based on the incorporation of deep learning, ensemble learning, and attention mechanisms.

| Contribution
To our knowledge, this is the first study to investigate a detailed combination of deep learning and attention mechanism in AIoMT for disease detection. Here is an overview of the main contributions:
2. We show how ensemble and attention learning can be combined to process complex biomedical health data. In addition, several improvements are proposed, such as improved batch normalization, which make deep learning more mature for processing biomedical healthcare data.
3. The usefulness of NUMERATE is demonstrated by extensive experimental results. When trained on large biomedical healthcare data, NUMERATE outperformed other known disease detection algorithms in nearly all cases in terms of the quality of the returned results and computational time.

| Outline
The remainder of the paper is organized as follows. Section 2 reviews related disease detection studies in more detail. Section 3 provides a thorough overview of the proposed methodology. Section 4 provides a performance review of the proposed framework. The important implications of this research study for biomedical health data are discussed in Section 5, as are future prospects for the research. Finally, Section 6 concludes the paper.
ResNet18, ResNet50, ResNet101, and SqueezeNet). The models are pre-trained on a large number of photographs from different areas. The COVID-19 instances were learned from medical data using the transfer learning mechanism. Hirano et al. (2021) used a deep learning model to classify different diseases. Three types of biomedical healthcare images were used to create the classification models: photographs, chest X-ray images, and retinopathy images. Then, three applications were studied: skin cancer, diabetic referral, and pneumonia. The adversarial neural network was also used to implement transfer learning. The adversarial network can deal with both non-targeted and targeted attacks and can also detect fake medical images. This is due to the mechanism of transfer learning, which can be used to train the model using various biomedical healthcare sources. Zhang et al. (2022) propose the use of a generative adversarial network for brain diseases. A novel hybrid loss function is implemented to monitor the training process on brain data. This algorithm can be used as a data completion method for multimodal Alzheimer's disease diagnosis. Gupta et al. (2022) proposed an ensemble method for the detection and classification of brain tumours and their phases. A modified Inception model and a pre-trained ResNet model are used for tumour detection. After tumour diagnosis, the cancer stage is determined using a combination of the pre-trained deep learning models and a random forest covering glioma, meningioma, and pituitary cancer. Singh et al. (2021) worked on constructing a hybrid disease detection model based on both decomposition and deep learning. Using the k-means algorithm on medical data, a set of segments is created. These segments are then fed into a convolutional neural network to predict disease from biomedical images. Sedik et al. (2021) demonstrated the efficiency of using both the convolutional neural network and long short-term memory for COVID-19 related problems. The authors acquired biomedical data for the study from many sources, including tomographies and X-ray scans, which demonstrated the usefulness of multimodal data.
Krishnaswamy Rangarajan and Purushothaman (2020) identified four primary diseases caused by pests and pathogens. Under ideal conditions, these pathogens caused significant damage to the selected crops. Images of isolated leaf samples were captured using various smartphone cameras under laboratory conditions to create a dataset for the detection of these diseases. For training, VGG16 is used with data augmentation techniques such as rotation and translation. Although this solution improved the state of the art, the detected leaves are in a darkened area, resulting in poor discrimination between Epilachna beetles and two-spotted spider mites. Mansour et al. (2021) developed a deep learning-based disease detection model for heart disease and diabetes using artificial intelligence and the IoT. IoT devices, such as wearables and sensors, enable real-time data collection, while deep learning algorithms use the data to detect diseases. The proposed disease detection technique uses a cascaded long short-term memory model based on a crow-search optimization algorithm. The crow-search optimization technique is used to fine-tune the 'weights' and 'bias' parameters of the cascaded long short-term memory model to achieve improved categorization of biological health data. Dhere and Sivaswamy (2022) proposed the use of hierarchical classification for COVID disease detection from biomedical healthcare data. They trained the model with residual learning at many scales and a novel loss function based on conicity. Their approach first distinguishes pneumonia cases from normal instances using a DenseNet-derived model and then uses the multiple-attention residual learning architecture to distinguish COVID from non-COVID pneumonia cases.
As the above brief literature review shows, many research papers have addressed the use of deep learning to identify diseases using biomedical healthcare data. To address the lack of biomedical healthcare data, these models combined transfer learning and data augmentation with pre-trained models. Adversarial neural networks were also used to secure the training process and handle sensitive biomedical healthcare data. This is being investigated especially for distributed platforms. These strategies still have a long way to go to become acceptable in the field of biomedical healthcare, especially in AIoMT situations. This research explores an intelligent framework based on deep learning, ensemble learning, and attention mechanisms to provide a mature solution for disease detection in AIoMT.

| Principle
We start with the main functions of NUMERATE (detectioN Using enseMble dEep leaRning And aTtention mEchanism). As shown in Figure 1, NUMERATE is a combination of several innovative solutions to solve the challenge of disease detection. Three different deep learning architectures with attention mechanism are used in the training process to identify the diseases from the biomedical healthcare data. Ensemble learning is then performed to refine the trained models and deliver the best results to the biomedical teams. The components of NUMERATE are explained in the following sections.

| RESNET
The accuracy of a neural network should theoretically improve as more layers are added. In practice, this expectation does not hold. As the depth of the network is increased, the accuracy tends to saturate and then rapidly decrease. This is called the degradation problem. Overfitting is not the cause of the degradation problem. It is caused by the vanishing/exploding gradients in deep neural networks. In the vanishing gradient problem, repeated multiplication during the backpropagation phase makes gradients infinitesimally small, resulting in negligible parameter updates. Exploding gradients occur when gradients build up and lead to unusually large parameter updates during training. Before the introduction of residual networks, this problem was addressed by using normalized initialization and intermediate normalization layers. The residual network (RESNET for short) is a CNN design with a residual block as the main building block. To mitigate the problem of degradation, a residual block uses skip connections, which can be defined as connections that skip one or more layers. If the coefficient of the regular connection converges to zero during the training phase, the residual shortcut ensures the integrity of the network. Alternative connections strengthen the network by allowing the user to select these shortcuts as needed. Our RESNET is a model pre-trained on the ImageNet data collection. Using a pre-trained model allows for higher accuracy while saving time by using a minimal amount of data. By using skip connections, our RESNET is able to handle the degradation problem and achieve higher accuracy, and it enjoys the benefits of this pre-trained model.
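The role of the shortcut can be illustrated with a small framework-free sketch. It is not the pre-trained RESNET used in this work: the names `linear`, `relu`, and `residual_block` are hypothetical, and plain matrix products stand in for convolutional layers. The sketch only shows that when the weights of the regular path are zero, the block still passes its input through unchanged.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(weights: Matrix, x: Vector) -> Vector:
    """Matrix-vector product, used here as a stand-in for a convolutional layer."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x: Vector) -> Vector:
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in x]


def residual_block(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """Compute relu(F(x) + x), where F is two linear layers with a ReLU in between."""
    fx = linear(w2, relu(linear(w1, x)))
    # Skip connection: the input is added back to the output of the regular path.
    return relu([f + v for f, v in zip(fx, x)])


# Hypothetical toy input and all-zero weights for the regular path.
zeros = [[0.0, 0.0, 0.0] for _ in range(3)]
x = [1.0, 2.0, 3.0]
identity_out = residual_block(x, zeros, zeros)
```

With the regular path contributing nothing, `identity_out` equals the (non-negative) input, which is the sense in which the shortcut preserves the signal through the block.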

| DenseNet
Rather than relying on very deep or broad architectures for representational performance, DenseNets exploit the potential of the network via feature reuse, resulting in condensed models that are easy to train and parametrically efficient. Concatenating features learned in different layers increases efficiency and diversity of input to subsequent layers. This difference between DenseNets and ResNets is critical. DenseNets are more straightforward and efficient than inception nets, which combine features from multiple layers. Each layer in our DenseNet undergoes non-linear modification, which can be the result of a combination of operations such as batch normalization, rectified linear units (ReLU), pooling, and convolution. We propose direct connections between each layer and all subsequent layers to further improve the information flow between them. Our DenseNet design divides the network into several dense blocks that are tightly connected to support downsampling. Layers between the blocks, such as a batch normalization layer, a 1 × 1 convolutional layer, and a 2 × 2 average pooling layer, are referred to as transition layers.
The set of input images is first trained using channel attention and the various deep learning architectures (RESNET, DenseNet, and VGG16). Ensemble learning with a voting strategy is then used to identify whether the given image represents a disease or not.
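Feature reuse by concatenation can be sketched without any deep learning framework. The functions `dense_block` and `make_toy_layer` below are hypothetical names, and the toy layer (mean followed by ReLU) is an assumption that merely stands in for the normalization, ReLU, and convolution operations; the point is only that every layer receives the concatenation of all earlier outputs.

```python
from typing import Callable, List, Sequence

Features = List[float]
Layer = Callable[[Features], Features]


def dense_block(x: Features, layers: Sequence[Layer]) -> Features:
    """Feed each layer the concatenation of the input and all previous layer outputs."""
    features = [x]
    for layer in layers:
        concatenated = [v for f in features for v in f]
        features.append(layer(concatenated))
    # The block output is the concatenation of everything produced so far.
    return [v for f in features for v in f]


def make_toy_layer(growth_rate: int) -> Layer:
    """Toy layer (assumption): emits `growth_rate` new features from the mean of its inputs."""
    def layer(inputs: Features) -> Features:
        mean = sum(inputs) / len(inputs)
        return [max(0.0, mean)] * growth_rate
    return layer


x = [1.0, 2.0, 3.0, 4.0]
out = dense_block(x, [make_toy_layer(2) for _ in range(3)])
```

Starting from 4 input features and adding 2 per layer over 3 layers, the block output carries 10 features, and the original input is still present at the front of the output.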

| VGG16
VGG16 is a CNN architecture developed by the Visual Geometry Group at Oxford for the ImageNet competition. Briefly, the design supports input images with a resolution of 224 × 224 pixels. The filters are 3 × 3 and padding is used to maintain the resolution of the intermediate results. There are a total of 13 convolution layers and three dense layers. The activation function on all layers is ReLU. Each of the first two dense layers has 4096 hidden nodes, while the last layer contains 1000 output nodes, which is the number of ImageNet classes. We load VGG16 with pre-trained weights and construct an N-dimensional output layer corresponding to the N diseases to be identified in this study. The convolutional layer, pooling layer, fully connected layer, and output layer are the four main components of the CNN in this study. A pooling layer is used to reduce the size of the feature map after each convolutional layer or block of convolutional layers. Redundant details are omitted while the important information is preserved. Pooling the features of the input images reduces the model's sensitivity to distortion and displacement. The set of feature maps is obtained by successive convolutional and pooling layers. Each neuron in the fully connected layer performs an input-output operation. The output layer applies the softmax activation function to the results of each node.
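The layer counts and sizes quoted above can be checked with a short arithmetic sketch. The configuration list follows the standard published VGG16 layout; the function names and the choice of three output classes in the usage line are illustrative assumptions, not part of the implementation used in this study.

```python
from typing import List, Tuple

# Standard VGG16 layout: numbers are output channels of 3x3 convolutions, "M" is 2x2 max pooling.
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]


def vgg16_feature_shape(input_size: int = 224) -> Tuple[int, int, int]:
    """Return (channels, spatial size, number of conv layers) after the convolutional part."""
    channels, size, conv_layers = 3, input_size, 0
    for item in VGG16_CONFIG:
        if item == "M":
            size //= 2           # pooling halves height and width
        else:
            channels = item      # padded 3x3 convolution keeps the resolution
            conv_layers += 1
    return channels, size, conv_layers


def classifier_sizes(num_diseases: int, input_size: int = 224) -> List[int]:
    """Sizes of the three dense layers when the 1000-way output is replaced by N diseases."""
    channels, size, _ = vgg16_feature_shape(input_size)
    return [channels * size * size, 4096, 4096, num_diseases]


shape = vgg16_feature_shape()      # (512, 7, 13)
head = classifier_sizes(3)         # [25088, 4096, 4096, 3]
```

For a 224 × 224 input, the five pooling stages reduce the resolution to 7 × 7 with 512 channels, so the first dense layer receives 25,088 values, and only the final layer changes when the number of diseases N changes.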

| Attention mechanism
In computer vision, the attention mechanism is used to treat information differently in different regions of the input image. When images of diseases are input to a conventional convolutional neural network, the feature map generated by the sequence of convolutional operations of the network model often contains a considerable amount of channel information, which can lead to duplication of information across channels. Moreover, in the conventional convolutional neural network technique, each point of the feature map is considered equally. However, the importance of the features of different channels and locations is different, and the features of certain locations and channels are important for assessing the severity of a disease. Therefore, we propose a novel method that weights channels. It highlights the most important and representative elements of an image. The feature map of each layer of a convolutional neural network has three dimensions: channels, height, and width. The proposed method compresses the spatial information contained in the feature map to create an attention mechanism specific to the channel dimension. The compression technique pools the global information for each channel of the feature map using a global average pooling layer. Disease patterns have a variety of shapes and relatively low severity variations, which requires the channel attention module to capture additional nonlinear information to extract such differentiated features. Two convolutional layers of size 1 × 1 are added to improve the channel attention process: the input features are first downsampled and then upsampled to ensure that the feature map obtained after channel attention has the same size as the feature map obtained before channel attention.
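A minimal sketch of channel attention along these lines is given below. It follows the squeeze-and-excitation pattern (global average pooling, two 1 × 1 convolutions, channel-wise rescaling); the ReLU and sigmoid activations, the reduction from 4 to 2 channels, and all function names are assumptions made for illustration, since the text does not specify them.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # channels x height x width
Matrix = List[List[float]]


def global_average_pool(fmap: FeatureMap) -> List[float]:
    """Compress each channel to a single value by averaging over height and width."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def conv1x1(weights: Matrix, vec: List[float]) -> List[float]:
    """On a 1x1 spatial map, a 1x1 convolution is a matrix-vector product over channels."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def channel_attention(fmap: FeatureMap, w_down: Matrix, w_up: Matrix) -> FeatureMap:
    """Rescale every channel by a learned score; the output has the same shape as the input."""
    squeezed = global_average_pool(fmap)
    hidden = [max(0.0, h) for h in conv1x1(w_down, squeezed)]               # reduce channels (assumed ReLU)
    scores = [1.0 / (1.0 + math.exp(-s)) for s in conv1x1(w_up, hidden)]    # restore channels (assumed sigmoid)
    return [[[v * s for v in row] for row in ch] for ch, s in zip(fmap, scores)]


# Hypothetical toy feature map: 4 channels of size 2x2, with untrained (zero) weights.
fmap = [[[2.0, 4.0], [6.0, 8.0]] for _ in range(4)]
w_down = [[0.0] * 4 for _ in range(2)]   # 4 -> 2 channels
w_up = [[0.0] * 2 for _ in range(4)]     # 2 -> 4 channels
out = channel_attention(fmap, w_down, w_up)
```

With zero weights every channel score is 0.5, so each channel is simply halved; trained weights would instead assign higher scores to the channels that matter most for the disease pattern.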

| Ensemble learning
Ensemble learning is a popular and effective strategy for increasing overall performance by combining multiple learning algorithms. We used a tuning strategy that increases the performance of disease detection. The results of each disease output are determined by majority voting. To obtain an optimal disease detection result, our strategy starts by comparing a disease detection result in each detection model with the actual label (disease or not). This results in a redefined label that affects the majority vote for each image and updates a new result. We consider the new result as corrected data from our ensemble learning model if it matches the actual result. On the other hand, if the new result differs from the actual result, it indicates that the model contains inaccurate data.
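The majority-voting step can be sketched as follows. This covers only the plain vote over the three models; the label-refinement step described above is not modelled, and the function name and the example predictions are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Fuse per-model predictions (one list of labels per model) into one label per image."""
    fused = []
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest count; on a tie,
        # the label that was encountered first wins.
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


# Hypothetical predictions of the three models on three images.
resnet_pred = ["disease", "healthy", "disease"]
densenet_pred = ["disease", "disease", "healthy"]
vgg16_pred = ["healthy", "healthy", "disease"]
fused = majority_vote([resnet_pred, densenet_pred, vgg16_pred])
```

With three voters and a binary decision (disease or not) a tie cannot occur, which is one practical reason to combine an odd number of models.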

| Designed approach
Algorithm 1 shows the pseudocode of the NUMERATE algorithm. It takes as input the set of images I, the number of images n, the split ratio sr, and a Boolean variable data_augmentation that indicates whether or not data augmentation techniques are needed. The process starts by dividing the set of input images I into two groups (I_train for training and I_test for testing). This step takes two parameters, the number of images n and the split ratio sr ∈ [0, 1] (from line 7 to line 8). For small datasets, data augmentation techniques such as rotation, flipping, translation, and saturation are applied to create more images (from line 9 to line 14). Then two main steps are performed:
1. Training: The attention mechanism is created first (line 16). Then, the RESNET, DenseNet, and VGG16 models are created and trained on the training data I_train using the attention mechanism (from line 17 to line 19). Finally, the tuning strategy is used to create the combined model, which is stored for testing and deployment (line 20).
2. Testing: The prediction phase is started using the trained combined model to detect the diseases of each image in the testing data (from line 22 to line 25).
The results of the algorithm are the trained combined model and the detected diseases for the testing images, which are returned in line 26.
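The data preparation steps described above (splitting and optional augmentation) can be sketched in plain Python. This is not Algorithm 1 itself: the function names are hypothetical, sr is assumed to be the fraction of images assigned to training, and only flips and a 90-degree rotation are shown, with images represented as nested lists of pixel values.

```python
import random
from typing import List, Sequence, Tuple

Image = List[List[int]]


def split_dataset(images: Sequence, sr: float, seed: int = 0) -> Tuple[list, list]:
    """Shuffle the images and split them into (train, test); sr is the assumed training fraction."""
    if not 0.0 <= sr <= 1.0:
        raise ValueError("sr must lie in [0, 1]")
    shuffled = list(images)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * sr)
    return shuffled[:cut], shuffled[cut:]


def augment(image: Image) -> List[Image]:
    """Return the original image plus a horizontal flip, a vertical flip, and a 90-degree rotation."""
    horizontal = [row[::-1] for row in image]
    vertical = image[::-1]
    rotated = [list(r) for r in zip(*image[::-1])]   # clockwise rotation
    return [image, horizontal, vertical, rotated]


train, test = split_dataset(range(10), sr=0.8)
variants = augment([[1, 2], [3, 4]])
```

In a full pipeline, each image in the training split would be passed through `augment` only when data_augmentation is true, and the three models would then be trained on the enlarged training set.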

| PERFORMANCE EVALUATION
To verify the proposed NUMERATE methodology, thorough testing was performed on well-known biomedical healthcare datasets developed for disease detection applications. The studies were conducted on a desktop computer equipped with an Intel i7 CPU and 16 GB of main memory.
All algorithms were implemented using the PyTorch framework in Python. We retrieved data from two biomedical health databases:
2. Plant disease recognition data: It contains disease data for plants that will be useful for smart agriculture applications. This dataset contains three plant-related labels: 'Healthy', 'Powdery', and 'Rust'. In total, there are 1,382 images divided into three sets: Train, Test, and Validation.
The total size of the plant disease detection data is 1.35 GB.
The NUMERATE performance is calculated using the accuracy and the F1 formulas, which are defined as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

and,

$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

such as,

$$\text{Precision} = \frac{TP}{TP + FP}$$

and,

$$\text{Recall} = \frac{TP}{TP + FN}$$

where,
1. True positive (TP) is determined by counting the number of correct positive observations. An observation is called correct and positive if it is a disease and the running model considers it as a disease.
2. True negative (TN) is determined by counting the number of correct negative observations. An observation is called correct and negative if it is not a disease and the running model considers it as non disease.
3. False positive (FP) is determined by counting the number of wrongly positive observations. An observation is called wrong and positive if it is not a disease and the running model considers it as a disease.
4. False negative (FN) is determined by counting the number of wrongly negative observations. An observation is called wrong and negative if it is a disease and the running model considers it as non disease.
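As a worked illustration, the sketch below computes accuracy and F1 from the TP, TN, FP, and FN counts using their standard definitions (a false positive is a non-disease image flagged as disease, a false negative is a missed disease). The function names, the label strings, and the eight example observations are hypothetical.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[str], y_pred: Sequence[str],
                     positive: str = "disease") -> Tuple[int, int, int, int]:
    """Count (TP, TN, FP, FN), treating `positive` as the disease label."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written directly in terms of the counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


y_true = ["disease", "disease", "disease", "healthy", "healthy", "healthy", "healthy", "disease"]
y_pred = ["disease", "disease", "healthy", "healthy", "healthy", "disease", "healthy", "disease"]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, tn, fp, fn)
f1 = f1_score(tp, fp, fn)
```

In this toy example there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, giving an accuracy of 0.75 and an F1 of 0.75.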
For the Kvasir dataset, we varied the number of images used for training from 2000 to 8000 and for the Plant dataset from 2000 to 5130. The quality of the results is then calculated using the F1 and accuracy formulas. The results show that NUMERATE outperforms the baseline solutions in all cases. For example, when processing the entire set of images from the plant dataset, NUMERATE achieves 92% accuracy, while the three baseline solutions achieve less than 75% accuracy when trained on the same data. This improvement can be attributed to the efficient components of NUMERATE, which include ensemble learning of the most promising architectures (VGG16, DenseNet, and RESNET) and the attention mechanism.

| NUMERATE for AIoMT settings
The purpose of the next experiment is to evaluate the scalability of NUMERATE compared to baseline methods in managing large amounts of data.Xception (Jain et al., 2021) and SqueezeNet (Ahuja et al., 2021)

| Examples of detected scenarios
This last part of the experiments shows some real cases discovered by NUMERATE. Figure 2 shows some of the diseases correctly detected by NUMERATE. The first three images are considered esophagitis disease.
The purpose of these final tests is to demonstrate some real-life examples that NUMERATE has detected. Figure 2 shows some of the diseases that NUMERATE correctly detected. The first three images are indicative of healthy plants. The second three images are of powdery plants. Powdery is one of the most common types of plant diseases. It is a fungal infection that attacks the leaves and stems of plants, coating them with a white or grey powdery substance. In extreme cases, it can even spread to plant buds, flowers, and fruits. The last three images show rust plants. Rusts are a group of fungi that attack the above-ground parts of plants. Rust most commonly occurs on leaves, but can also occur on stems, flowers, and fruits. These images show how difficult it is to detect disease, as it can take a variety of shapes and sizes. Compared to other algorithms, NUMERATE is more effective at detecting certain diseases. These promising results confirm the practicality of NUMERATE.

| DISCUSSIONS AND FUTURE DIRECTIONS
In this section, we outline the main advantages of using the proposed NUMERATE framework to analyse disease detection data. We also make some suggestions on how to make the NUMERATE framework even better.
1. Deep learning, ensemble learning, and attention mechanisms are an effective mix of intelligent technologies that create a high level of precision. Runtime efficiency and visualization are still critical issues when managing biomedical healthcare data and detecting diseases in real time. To improve the performance of NUMERATE, creating hybrid systems that combine augmented reality, evolutionary, and deep learning approaches could be a promising avenue (Djenouri, Belhadi, et al., 2021; Djenouri & Comuzzi, 2017; Khamparia & Singh, 2019; Liu et al., 2022).
2. The proposed method detected diseases effectively and outperformed previous disease detection methods in terms of accuracy. It would be interesting to investigate the results of NUMERATE for other smart healthcare applications, such as skin disease detection (Hossen et al., 2022), chronic glomerular disease detection (Zhou et al., 2021), and vascular aging assessment (Shin, 2022).
3. In NUMERATE, interpreting the result is in itself a difficult task. Indeed, it is based on black-box models that do not explicitly describe the process of inferring the result. In order to trust a particular result, biomedical scientists need to understand how it was obtained. This problem is being addressed by the XAI (eXplainable Artificial Intelligence) discipline, which offers a variety of ways to provide some level of explanation to deep learning AI solutions (Chen et al., 2022; Muddamsetty et al., 2022; Yang et al., 2022). NUMERATE is being updated to include XAI methods. This will allow for more accurate evaluation of results from NUMERATE.
4. Security and privacy are two important factors in AIoMT-based applications. They can be addressed by advanced blockchain technologies (Belhadi et al., 2021; Djenouri, Srivastava, et al., 2021) and by the identification of hidden sensitive patterns in biomedical healthcare data (Haque et al., 2021; Lin et al., 2020; Lin, Wu, et al., 2019; Lin, Zhang, et al., 2019). NUMERATE can be extended by considering these two approaches in both the training and deployment phases.
presented modifications to mitigate performance degradation caused by introducing variability in training sample sizes and label distributions across institutional training splits, and they test their effectiveness on simulated distributed tasks for detecting and classifying abnormal chest radiographs. An interesting study reported in Dwivedi et al. (2021) proposes a method for training a distributed COVID-19 detection model on biomedical images using edge-cloud collaboration. A distributed lightweight model-based training algorithm is developed through edge computing and cloud computing collaboration to improve training efficiency and ensure model accuracy. A resource allocation algorithm is also developed during training to jointly minimize the time cost and energy consumption.

1. NUMERATE (detectioN Using enseMble dEep leaRning And aTtention mEchanism) presents a new paradigm that uses deep learning and attention mechanisms to identify diseases. Different deep learning architectures (VGG16, RESNET, and DenseNet) are used to learn from biomedical health training data and different viral infections.
Nawaz et al. (2021) use pattern mining in biomedical disease analytics. Each patient is represented by a transaction, each piece of COVID-related information associated with the patient is represented by an item, and the set of COVID patient data is converted into a set of transactions. The set of transactions is then subjected to a pattern mining technique to extract important patterns. This has been used to identify diseases based on correlated aspects of biomedical data. Wang, Jin, et al. (2021) explored deep learning-based architectures for segmentation and classification and automated the image evaluation process. This enabled a decent evaluation of the COVID-19 infection rate. Ahuja et al. (2021) collected COVID-19 medical lung CT-scan data and used four deep learning architectures (

TABLE 2 NUMERATE versus advanced disease detection solutions with different number of error loss values (0.10, 0.08, 0.05, 0.02, and 0.01)
versus disease detection solutions are used for comparison. These algorithms have proven effective in training on large datasets. Large datasets are obtained by duplicating the Kvasir and Plant datasets 1000, 10,000, and 100,000 times. Table 2 shows the runtime of NUMERATE, Xception, and SqueezeNet while varying the error loss to be optimized from 0.10 to 0.01. Based on these results, we can conclude that NUMERATE outperforms the other two methods in terms of training time. This performance can be explained by the fact that NUMERATE employs both ensemble and attention mechanisms, converging quickly to ground truth.