Breast Cancer Detection Based on Simplified Deep Learning Technique With Histopathological Image Using BreaKHis Database

Presented here are the results of an investigation into the effectiveness of deep learning (DL)-based systems that exploit transfer learning for detecting breast cancer in histopathological images. It is shown that DL models not specifically developed for breast cancer detection can be adapted through transfer learning to detect breast cancer effectively in histopathological images. The analysis enables selection of the best DL architecture for detecting cancer with high accuracy, which should help pathologists make early diagnoses of breast cancer and administer appropriate treatment. The experimental work used the BreaKHis database, consisting of 7,909 histopathological images from 82 clinical breast cancer patients. The proposed DL training strategy applies various image processing techniques to extract feature patterns, followed by transfer learning in deep convolutional networks such as ResNet, ResNeXt, SENet, Dual Path Net, DenseNet, NASNet, and Wide ResNet. Comparison with recent literature shows that ResNeXt-50, ResNeXt-101, DPN-131, DenseNet-169, and NASNet-A provide accuracies of 99.8%, 99.5%, 99.675%, 99.725%, and 99.4%, respectively, outperforming previous studies.

TOMA ET AL. 10.1029/2023RS007761

Digital histopathological images can be collected from microscopic examination using a camera. This allows the development of automatic computer-aided diagnosis (CAD) of breast cancer using these images. CAD can play a significant role in assisting pathologists by improving the accuracy of cancer diagnosis. Many researchers are exploring opportunities to improve the performance of CAD systems (Aswathy & Jagannath, 2017; Dromain et al., 2013; Spanhol et al., 2016; Veta et al., 2014). Due to the large amount of inter- and intra-class variation in histopathological images and the limitations of feature extraction methods, conventional computerized diagnostic approaches cannot produce very reliable classification results.
Several studies aiming to resolve the above-mentioned issues are reported in LeCun et al. (2015). These studies use advanced computational models based on multiple layers of neural networks, that is, deep learning (DL). DL is inspired by the structure and function of the human brain's neural networks; while it is not an exact mimicry of the brain, it draws on the brain's information processing capabilities. It is a sub-field of machine learning that utilizes multi-layer neural network architectures to perform a variety of imaging tasks. DL models (Dimitriou et al., 2019; Litjens et al., 2017) have a high success rate in computer vision, object detection, classification in medical image processing, and related areas, due to their ability to learn complicated tasks and automatically extract different features from images.
DL models have the advantage of not requiring prior knowledge about the data. DL merely requires the input data to be in a proper format and the network parameters relevant to the problem at hand to be specified. Because DL-based cutting-edge technology provides excellent performance, many researchers are focusing their attention on using DL in the area of breast cancer detection. Many studies have been published in which different algorithms were proposed to detect and classify breast cancer. However, the performance reported using these algorithms is limited (Gurcan et al., 2009; Spanhol et al., 2016; Veta et al., 2014). Recently, research has addressed the classification problem of identifying breast cancer using machine learning and DL techniques (Aswathy & Jagannath, 2017). This work shows there is still considerable room to improve the quality of automatic detection by DL methods.
In this paper a strategy is proposed for accurately detecting breast cancer from histopathological images. This was achieved by: (a) exploiting the advantages offered by transfer learning with convolutional neural network (CNN) models, that is, using knowledge learned from one task to adapt to a different but related task, and leveraging the generalization power of pre-trained models by fine-tuning them on a specific data set; (b) performing the investigation using the BreaKHis data set, which consists of 2,480 benign images and 5,429 malignant images at four magnification factors (40×, 100×, 200×, and 400×); and (c) preprocessing the images prior to applying them to the models. Using the proposed strategy, the DL training models are shown to outperform the existing breast cancer detection algorithms reported in the literature regardless of the image magnification factor.
The rest of the paper is organized as follows: Section 2 describes the related work. Section 3, on data and methods, describes several DL-based approaches for automatically identifying breast cancer using histopathological images. The experimental results are presented in Section 4. Insights from this work are discussed in Section 5, followed by the conclusion and future work in Section 6.

Related Work
Over the last decade, a lot of research has been done to detect breast cancer from histological images. Much of this work has focused on utilizing CAD to identify the two basic forms of breast cancer (benign and malignant). Various machine learning techniques have been used to diagnose breast cancer. Several research papers outlined below on the detection of breast cancer use DL methods and CNNs for classification.
The categorization of carcinoma and non-carcinoma histopathological images of breast tissue by Hameed et al. (2020) used four distinct DL models: fully trained VGG-16, fully trained VGG-19, fine-tuned VGG-16, and fine-tuned VGG-19. Among these, an ensemble of the fine-tuned VGG-16 and VGG-19 models provided an average accuracy of 95.29% and an F1 score of 95.29%. The ensemble of fine-tuned models showed an overall sensitivity of 97.73% and an F1 score of 95.29% for the carcinoma class. In K. Gupta and Chawla (2020), machine learning classifiers, namely support vector machine and logistic regression, are used for the classification. They achieved a maximum accuracy of 92.5% with the ResNet-50 model. Their approach performed better at 100× and 40× magnification than at the other magnifications.
Likewise, in a study by Lim et al. (2018), two CNN models, VGG-16 and Inception-V3, utilize a transfer learning technique. They applied several data augmentation techniques, such as rotation, flipping, shifting, and zooming, to the BreaKHis database. The two types of learning used by each model are custom-layer and whole-layer learning. With the Inception-V3 model, 98% accuracy was achieved, and with the VGG-16 model, 97% accuracy was achieved. It was also shown that whole-layer learning gives better performance than custom-layer learning. Jiang et al. (2019) proposed a new neural network model (BHCNet), which contains a small SE-ResNet module requiring fewer parameters. They designed a Gauss error scheduler that resolves the problem of fine-tuning the learning rate parameter of the stochastic gradient descent (SGD) algorithm. The model was trained from scratch and provided satisfactory results. For binary classification, the model had an average accuracy between 98.87% and 99.34%; for multi-class classification, the performance was between 90.66% and 93.81%.
In Haija and Adebanjo (2020), the authors take advantage of the power of transfer learning by using a CNN model (ResNet-50) that was pre-trained on ImageNet. The BreaKHis data set was used in this study, with 75% of the data used for training and 25% for testing. Their model outperforms others trained on the same data sets, with an accuracy rate of 99%.
Multi-class breast cancer classification using a DL-based CNN model was conducted in Nawaz et al. (2018). In this experiment, the BreaKHis database was used, and the classification result was reported both patient-wise and image-wise. By fine-tuning the DenseNet CNN model, the accuracy of image-wise classification was 95.4%, and the accuracy of patient-wise classification was 96.48%.
A combined network called the improved inception-residual CNN (IRRCNN) was suggested by Alom et al. (2019) for classifying breast cancer histopathological images. The IRRCNN is a combination of powerful deep CNN (DCNN) networks, namely the residual network (ResNet), the inception network (Inception-V4), and the recurrent convolutional neural network. Compared to any of these individual models, the combined model is shown to provide superior results. The model was tested on two separate public data sets, the BreaKHis data set and the Breast Cancer Classification Challenge 2015. The reported testing accuracies of the model for binary and multi-class classification are 99.05% and 98.59%, respectively.
In Kassani et al. (2019), the authors proposed a DL-based algorithm in which a DCNN descriptor and a pooling operation are used. In their experiment, they processed the data using the popular stain normalization techniques proposed by Macenko et al. (2009) and Reinhard et al. (2001). They trained five pre-trained DL networks: Inception-V3, InceptionResNetV2, Xception, VGG-16, and VGG-19. The best performance was achieved by Xception, with an average classification accuracy of 92.50%.
Several fine-tuned pre-trained deep neural networks were used by Motlagh et al. (2018) on two public databases, namely the BreaKHis and Tissue Micro Array (TMA) databases. Pre-processing techniques, such as normalization and color distortion, were employed. Different pre-trained models, such as Inception (V1, V2, V3, and V4) and ResNet (V1 50, V1 101, and V1 152), were investigated. For binary and multi-class classification, the ResNet V1 152 model achieved accuracies of 98.7% and 96.4%, respectively. Among the ResNet models, ResNet-152 provides the highest accuracy, which indicates that a deep-layer network is important for classification.
Deep convolutional neural networks, namely Inception V3 and Inception ResNet V2, trained with transfer learning techniques were used by the authors in J. Xie et al. (2019) as a method for examining histological images of breast cancer. To perform a clustering analysis on the images, they developed a new autoencoder network that converted the features acquired by Inception ResNet V2 to a low-dimensional space. It was found that the clustering resulting from the autoencoder network outperforms the Inception ResNet V2 network.
The studies described above make it clear that DL models are very useful in the field of image-based detection applications. In certain cases, they have become an integral part of the routine for pathologists and doctors in clinical practice. Despite its significant success in medical imaging, automated breast cancer diagnosis based on histological images is still a long way from being a practical application, because a massive amount of labeled data is required, which is not yet available in this application domain; the annotation of a data set is time-consuming and costly. In the present paper, we compare several advanced DL methods utilizing the power of transfer learning for detecting breast cancer in histopathological images.

Data Sets
The work presented in this paper was conducted on a pathology data set, the breast cancer histopathological image classification (BreaKHis) data set (Spanhol et al., 2016). This data set was gathered by the P&D Lab, Brazil, between January 2014 and December 2014. It consists of a total of 7,909 histopathology images from 82 clinical breast cancer patients: 2,480 benign images and 5,429 malignant images at different magnification factors (40×, 100×, 200×, and 400×). The benign images comprise four subsets: adenosis (A), fibroadenoma (F), phyllodes tumor, and tubular adenoma. The malignant images comprise four subsets: ductal carcinoma, lobular carcinoma, mucinous carcinoma, and papillary carcinoma. All images are in RGB mode with a resolution of 700 × 460 pixels. Table 1 shows the distribution of the images in the BreaKHis data set (Spanhol et al., 2016).

Data Pre-Processing
The standardization of the images in the data set for DL requires the data set to be noiseless, clean, and understandable. Hence it was necessary to pre-process the breast cancer histopathology images before training, to improve the overall data quality and remove redundant attributes from the images. The pre-processing stages used were:
• Image rotation, where the images were randomly rotated between −45° and +45°. A uniform distribution was used to generate random angles within this range with equal probability, producing a single rotated image for each original image. Photographs of biological tissue are taken from a microscope, so histopathological images are not always perfectly aligned and may be skewed at an angle; the range of −45° to +45° was chosen to represent the extremes of image misalignment. This choice of angles helped the model become more robust to changes in the orientation of objects.
• Random cropping of a portion of the image to a specified size, done using the uniform distribution of PyTorch's "RandomResizedCrop" transform. A single cropped image is generated for each input image. The images were resized to 224 × 224 pixels for every model in our experiment except InceptionV3, which requires a fixed input size of 299 × 299 pixels. This ensures a meaningful object remains in the image even when oriented in a different position.
• Flipping the images vertically and horizontally to make the model more resilient. This step produces a single flipped image for each input image.
• Conversion of all images to a tensor data structure, with parallel processing applied in the training and inference phases.
• Resizing the images before using the data set in a model, to align the image size with the model's requirements. During resizing, all images were converted to the required fixed size and then applied to the model for training. In this transformation the total size of the data set was unchanged, to enhance the model's ability to learn meaningful features from the available data and improve its performance on unseen samples. The consequence was significantly reduced complexity and therefore less training time. The data set was transformed using the PyTorch library (Paszke et al., 2019), specifically version 2.0.1, which is compatible with Compute Unified Device Architecture (CUDA) 11.8.
• Normalization and standardization, applied to scale the input values in the data set to a nominal range between zero and one. This is important because the images may have features with different ranges that can affect the results of the model (Yuan & Suh, 2018). The input images were normalized separately to scale the features between 0 and 1.
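The per-image min-max normalization described in the last bullet can be sketched in a few lines. This is a minimal NumPy illustration of the idea (the study itself used PyTorch transforms, so the function name and example values here are ours):

```python
import numpy as np

def min_max_normalize(image):
    """Scale pixel values into the nominal [0, 1] range, per image."""
    image = image.astype(np.float64)
    lo, hi = image.min(), image.max()
    # Assumes hi > lo, i.e. a non-constant image.
    return (image - lo) / (hi - lo)

# An 8-bit image is mapped onto [0, 1], preserving relative intensities.
pixels = np.array([[0, 64], [128, 255]])
scaled = min_max_normalize(pixels)
```

Because each image is normalized separately, differences in staining intensity between slides do not shift the input range seen by the model.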
The convolution layer is a building block of a CNN that performs several linear and nonlinear operations. Convolution is a linear operation in which a kernel (filter) is applied to the input tensor for feature extraction. The kernel is a set of weights in a matrix used to extract a particular feature from an image. A direct element-wise product is calculated between the kernel matrix elements and the overlapping input tensor elements and then summed to obtain a feature map. Multiple kernels are applied repeatedly to extract different types of features from the input tensor. For example, if the input image is denoted by I and the kernel by K, the convolution operation is given by

S(i, j) = (I ∗ K)(i, j) = Σ_m Σ_n I(i + m, j + n) K(m, n),

where i denotes the row index and j the column index of the result matrix.
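As a concrete illustration of this operation, a minimal NumPy sketch of the valid (stride-1, no-padding) case might look as follows; the function name is ours and real frameworks use far faster implementations:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Compute S(i, j) = sum_m sum_n I(i+m, j+n) K(m, n) (valid, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel with the patch it covers, summed.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 input with a 2x2 all-ones kernel yields a 2x2 feature map.
feature_map = conv2d_valid(np.arange(9, dtype=float).reshape(3, 3),
                           np.ones((2, 2)))
```

Applying several different kernels in this way produces several feature maps, one per extracted feature type.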
When implementing the convolution operation, the output dimensions are not the same as the input dimensions, which results in the loss of some border data. Making the output fit the input image entails padding, that is, adding zeros around the edges of the image. Considering the padding and stride values, the output dimension can be calculated as

n_out = ⌊(n_in + 2p − f) / s⌋ + 1,

where n_in denotes the input dimension, the padding is denoted by p, and the stride and the kernel size are denoted by s and f, respectively.
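This size formula is easy to check with a one-line helper. For instance, a 7 × 7 kernel with padding 3 and stride 2 (illustrative parameters of our choosing) halves a 224-pixel side to 112:

```python
def conv_output_size(n_in, f, p=0, s=1):
    """n_out = floor((n_in + 2p - f) / s) + 1 for one spatial dimension."""
    return (n_in + 2 * p - f) // s + 1

# 224-pixel input, 7x7 kernel, padding 3, stride 2 -> 112 pixels.
halved = conv_output_size(224, f=7, p=3, s=2)
# 5-pixel input, 3x3 kernel, no padding, stride 1 -> 3 pixels (border loss).
valid = conv_output_size(5, f=3)
```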
A pooling layer is another important block of a CNN. After the convolution operation, the size of each image is decreased, while the number of feature maps increases. The main idea of the pooling operation is to reduce the dimension of the feature maps and down-sample the volume of the network. Furthermore, stride and padding also affect the network because they allow the model to change or preserve the size of the image.
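A common form of this down-sampling is 2 × 2 max pooling with stride 2, which halves each spatial dimension of a feature map. The following NumPy sketch is our own illustration, not the paper's code:

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    """Down-sample a 2D feature map by taking the maximum over each window."""
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out

# A 4x4 feature map is reduced to 2x2, keeping the strongest activations.
pooled = max_pool2d(np.arange(16, dtype=float).reshape(4, 4))
```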
After successfully performing the convolution and pooling operations, the model will have learnt several important features from the input. The output is then flattened and fed to a fully connected layer for final classification.
After completing the forward pass, the loss function is calculated from the predicted output and the actual output. Using the gradients, back-propagation is applied to update the weights. The generic formula for updating a parameter is

X_(k+1) = X_k − ϵ ∇L(X_k),

where X_(k+1) denotes the new parameter, X_k the old parameter, ϵ the learning rate, and ∇L the gradient of the loss function L. Over a series of iterations, the model reduces its error rate and gradually increases its performance. In this way, the model can be trained to perform classification more robustly.
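The update rule can be demonstrated on a toy one-dimensional loss, here L(x) = x² with gradient 2x (a loss chosen by us purely for illustration):

```python
def gd_step(x, grad, lr):
    """One gradient-descent update: X_{k+1} = X_k - lr * gradient."""
    return x - lr * grad

# Minimize L(x) = x^2 starting from x = 10. Each step multiplies the
# iterate by (1 - 2 * lr), so it shrinks steadily toward the minimum at 0.
x = 10.0
for _ in range(100):
    x = gd_step(x, 2 * x, lr=0.1)
```

In a real network the scalar x is replaced by millions of weights and the gradient is computed by back-propagation, but the update has exactly this form.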
Several techniques exist in practice for dealing with an imbalanced data set. The simplest class of techniques is sampling: changing the data presented to the model by under-sampling common classes, over-sampling (duplicating) rare classes, or both. Under-sampling shrinks the data size, so less time is needed for learning, but it is likely to lose useful information. Over-sampling, on the other hand, adds minority examples to the data set to achieve balance by replicating existing minority examples.
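Random over-sampling by replication can be sketched as follows; this is a generic stdlib illustration of the idea, not the sampling code used in the study:

```python
import random
from collections import Counter

def oversample(samples, labels, seed=0):
    """Balance classes by duplicating randomly chosen minority examples."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_s, out_l = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - n):
            out_s.append(rng.choice(pool))  # replicate an existing example
            out_l.append(cls)
    return out_s, out_l

# Two benign vs. five malignant examples become five of each.
balanced_s, balanced_l = oversample(list(range(7)),
                                    ['b', 'b', 'm', 'm', 'm', 'm', 'm'])
```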
The experimental work conducted here compares various pre-trained deep neural networks. These networks were found to be capable of automatically learning discriminative features from histopathological images. From this investigation we were able to identify the model that provides the most satisfactory results in the detection of breast cancer.
In the current investigation, we used the ResNet-18, ResNet-34, ResNet-50, ResNet-101, and ResNet-152 models. The models were pre-trained on the 1.2 million images of the ImageNet data set. To enhance the accuracy of these models, the transfer learning technique in Hosna et al. (2022) was utilized, and the breast cancer classifier was adjusted to fit these networks. The ResNet base model had 1,000 outputs in the last layer, but for this work the output was changed to two classes, benign and malignant. Adam was used as the optimizer (Kingma & Ba, 2014), and the learning rate was set to 0.00001 because numerous experimental trials showed this learning rate to be the best.
In deep neural networks, it is observed that increasing the number of layers can also increase the error rate. The ResNet model, proposed by He et al. (2016), resolves this problem by introducing a new residual block that adds depth without degrading accuracy. Residual blocks have a skip connection that helps the model learn the identity function. There are many variants of the ResNet model.
In 2017, Xie et al. (2017) introduced another new model named ResNeXt. The idea behind this model was a "split-transform-merge" strategy that exposed a new dimension, called "cardinality." The pre-trained models ResNeXt-50 (32 × 4d), ResNeXt-101 (32 × 4d), and ResNeXt-101 (64 × 4d) were investigated in this work. The input images were resized to 224 × 224. Using transfer learning, only the classifier layer was trained while all other layers were kept frozen. The fully connected layers of these models were fine-tuned with 2,048 perceptrons. The optimizer and learning rate were the same as for the ResNet architecture.
In this investigation, we used the pre-trained SENet-154 network, a squeeze-and-excitation network first introduced by Hu et al. (2018). This network mainly consists of three parts: (a) a squeeze module, (b) an excitation module, and (c) a scale module. The pre-trained model (Alex, 2019) was imported by first installing the cnn_finetune package using the pip manager in Python; the make_model function could then be imported from cnn_finetune. Using fine-tuning, this model was retrained for all layers with an input size of 224 × 224 × 3 and a batch size of 16.
The dense convolution network (DenseNet) was designed by Huang et al. (2017). This model is similar to the ResNet model, except that ResNet adds (+) the output of the previous layer (identity) to the next layer, whereas DenseNet uses concatenation instead of summation. In this work we investigated different variants of pre-trained DenseNet networks: DenseNet-121, DenseNet-161, DenseNet-169, and DenseNet-201. The classifier was modified, with a last linear layer of 1,920 perceptrons.
The dual path network (DPN) architecture was proposed by Chen et al. (2017). It uses both ResNet and DenseNet blocks to create a simple but highly efficient system. The combination of these two strong modules results in a new macro-block, as residual blocks allow feature reuse and DenseNet blocks allow exploration of new features. DPN-131 is a convolutional neural network pre-trained on ImageNet, with 79.25 M parameters for 1,000 classes. Applying fine-tuning with two classes in the last linear layer reduces the number of parameters to 76.57 M. The learning rate was set to 0.00001, and the Adam optimizer was used for retraining.
NASNet by Zoph and Le (2016) was inspired by the automated neural architecture search method. The idea behind this model is to use a recurrent neural network (RNN) to generate a network and train it using reinforcement learning. The model utilizes a new ScheduledDropPath regularization method, which allows the maximum expected performance to be obtained. Moreover, by taking advantage of a controller RNN together with a CNN, this model can search a variable-length architecture space. In this study, a pre-trained NASNet model was fine-tuned and retrained with a last linear layer of 4,032 perceptrons. The Adam optimizer was utilized, and a dropout of 0.5 was applied. Like the SENet model, this model was downloaded from the cnn_finetune package. The input size of the images for this model was set to 331 × 331 × 3.
Despite the cutting-edge performance of deep residual networks with thousands of layers, many layers must be added to improve accuracy by a fraction of a percent, which reduces training efficiency. To resolve this, a novel architecture was designed by Zagoruyko and Komodakis (2016) in which the depth of residual networks is decreased and their width is increased. This model is built from ResNet by adding more feature planes, widening the convolution layers and increasing the size of their filters. The cutting-edge Wide ResNet-50-2, which provides outstanding performance in medical imaging, was also investigated. In the present study, this model was fine-tuned with fully connected layers of 2,048 perceptrons.

Transfer Learning
Transfer learning is a technique in DL where previously acquired knowledge is used to solve related problems.
Deep CNN models that have been trained on ImageNet were found to be more capable of achieving satisfactory results. Nowadays, transfer learning is widely used and provides satisfactory results in computer vision tasks. It speeds up model training and enhances model performance even on limited data. Moreover, transfer learning can help solve complex real-world problems and build more robust models that can perform a variety of tasks. The steps followed in our experiment are outlined in Figure 3.

Environmental Setup
As the DL models needed to handle a huge amount of data, it was necessary to use a Graphics Processing Unit (GPU).
In our experiment, we used the Colaboratory (Colab) cloud service, which provides free access to GPU and Tensor Processing Unit (TPU) computing. Google Colab provides the same runtime as Jupyter Notebooks, configured mainly for DL tasks. The PyTorch library, an open-source framework based on the Torch library (Paszke et al., 2019), was used to facilitate training and testing of the DL models.

Model Training
Before training, the selected models were prepared according to the above-mentioned requirements. All models were trained for 100 epochs with batches of different sizes, using 70% of the data for training; of the remainder, 20% of the data was used for validation and 10% for testing. Data pre-processing techniques were used with every model to improve performance, and the total number of images was unchanged after pre-processing. Adam, a variant of stochastic gradient descent, was chosen as the optimizer for the back-propagation of these models. The learning rate was generally set to 0.00001, and the dropout method was used for regularization. Using Google Colab, models like ResNeXt-50 (32 × 4d), DenseNet, and Wide ResNet-50-2 took about 6 hr to train.
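The 70/20/10 split described above can be sketched with a stdlib shuffle (our own illustrative helper; the study performed the split within its PyTorch pipeline):

```python
import random

def split_indices(n, train=0.7, val=0.2, seed=42):
    """Shuffle n sample indices and split them into train/validation/test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed for reproducibility
    n_train = int(n * train)
    n_val = int(n * val)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

# Applied to the 7,909 BreaKHis images.
train_idx, val_idx, test_idx = split_indices(7909)
```

Shuffling before slicing ensures that the three subsets are disjoint and drawn independently of the order in which images were collected.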
Due to the number of parameters and the input image size, we faced some difficulty retraining SENet-154, DPN-131, ResNeXt-50 (32 × 4d), and ResNeXt-101 (64 × 4d) with a batch size of 32. The batch size was therefore decreased from 32 to 16 to avoid GPU memory errors during training. For the same reason, the batch size was set to eight for the NASNet-A (large) model. Finally, during the experiments, the parameters were sometimes modified to avoid computational errors.

Model Testing
After completing the training phase, each model was saved for future inference. Inference is very significant in DL because it helps bridge the gap between the data the models were trained on and ambiguous information in the real world. Here, a trained model was used to predict the testing samples with a forward pass. It should be noted that image transformation was not applied in the inference phase; only the image size was changed according to each model's required input dimensions.

Experimental Results
In this paper, we provide the findings from tests using several deep neural networks to detect breast cancer in histological images. All these pre-trained networks were retrained and tested on the publicly available BreaKHis data set, which contains a total of 7,909 images divided into two classes: 2,480 benign and 5,429 malignant images. The data set was randomly split into three parts: 70% for training, 20% for validation, and 10% for testing. We followed the same pipeline to train and test every model. To facilitate the comparison of different models, Table 2 presents their accuracies at the different magnification factors of the BreaKHis database.
In our experiment, a pre-trained ResNeXt-50 (32 × 4d) network achieved the highest average accuracy score among all examined models. This model was retrained with 70% training data for 100 epochs, and for a magnification factor of 400× it obtained an impressive accuracy score of 100% in the inference phase. The accuracy and loss curves of the ResNeXt-50 (32 × 4d) model for all magnification factors are shown in Figures 5 and 6. Similarly, the accuracy and loss curves of the NASNet-A (large) model, which outperformed all previously reported investigations, are shown in Figures 7 and 8 for all magnification factors.
The second highest accuracy of 99.75% was obtained with DenseNet-161. The ResNet-34, DenseNet-169, and SENet-201 models achieved an average accuracy of 99.725%. These results indicate cutting-edge performance in breast cancer histopathological image classification. Another high average accuracy, 99.675%, was achieved by DPN-131 and Wide ResNet-50-2: DPN-131 achieved a 100% accuracy score at a magnification factor of 400×, and Wide ResNet-50-2 achieved 99.8% accuracy at 100× and 200×. For ResNeXt-101 (32 × 4d), the test loss was 0.0009 at epoch 33 for a magnification factor of 40×. In our comparative investigation, the NASNet-A (large) model was also very accurate, with an average accuracy of 99.375%, 0.135% higher than the two-class result reported by Shahidi et al. (2020). That model achieved a 99.7% accuracy score at a magnification factor of 40×.
The overall performance of these experimental models was measured using the confusion matrix, also referred to as the error matrix. The confusion matrix contains four elements: true positive (TP), true negative (TN), false positive (FP), and false negative (FN). In our work, TP refers to images correctly predicted as cancerous, and TN refers to non-cancerous images correctly predicted as such. FP means the model mistakenly classified non-cancerous images as cancerous, whereas FN means the model mistakenly classified cancerous images as non-cancerous. From these four basic counts, other metrics can be determined by which the model's performance can be evaluated more robustly. The three other metrics presented here are precision, recall, and F1-score, which, together with the confusion matrix of each model, are reported in the results.
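The derived metrics follow directly from the confusion-matrix counts: precision = TP/(TP+FP), recall = TP/(TP+FN), and F1 is their harmonic mean. A minimal sketch (with example counts of our choosing):

```python
def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)          # of predicted cancerous, how many were
    recall = tp / (tp + fn)             # of truly cancerous, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Example: 90 cancerous images found, 10 false alarms, 10 missed cases.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=10)
```

In a screening setting, recall (missed cancers) is usually the more critical of the two, which is why accuracy alone is not reported.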

Discussion
The goal of this research study was to establish the ability of DL models to extract discriminative features for breast cancer detection based on histopathological images. The study focused on transfer learning, and the results demonstrate the effectiveness of the models examined. However, statistically there is no significant difference with respect to other convolutional networks for detecting breast cancer. The remaining models investigated provide good results compared to those reported in the literature. Figure 11 shows the performance statistics for the various models in terms of precision, recall, and F1 score.
In this work we have introduced several strategies for training the models, such as data pre-processing, transformation, and parameter tuning. These strategies allow the models to deal with different challenges. In summary, the advanced DL models selected here have exhibited the capability to extract multi-level features from histopathological images, which can be applied in the automatic detection of breast cancer.
To ensure that our results are not solely a product of bias in the model or data, we took the following steps to validate our approach. We randomly shuffled the data and carefully split the data set into training and testing sets; this guarantees that the training and testing data are unique and independent of each other, reducing the chances of data leakage or overfitting. We also used cross-validation to evaluate the performance of our models: training and testing on different folds of the data helps reduce the variance of the results. These steps demonstrate the generalization ability of our transfer learning models and confirm that their performance is indeed a result of applying transfer learning to our specific task.
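The fold construction behind such cross-validation can be sketched with the stdlib (our own illustrative helper; the number of folds used in the study is not specified here):

```python
import random

def kfold_indices(n, k, seed=0):
    """Partition n shuffled sample indices into k disjoint folds and return
    (train, test) index pairs, one per fold."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    # Each round holds one fold out for testing and trains on the rest.
    return [(sorted(set(idx) - set(fold)), fold) for fold in folds]

# 5-fold cross-validation over 10 samples: five rounds of 8 train / 2 test.
splits = kfold_indices(10, 5)
```

Averaging a metric over the k held-out folds gives a lower-variance estimate of generalization than a single train/test split.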

Conclusion and Future Work
In medical applications, interest in DL models is growing fast as they outperform traditional learning models in terms of speed and reliability. In our investigation, a series of experiments was conducted utilizing recent, highly accurate CNN models that were pre-trained on the ImageNet data set. We fine-tuned these models and trained them using the BreaKHis database. Because of the lack of training images and the complexity of medical images, the transfer learning technique was applied in this experiment, and it proved a very effective solution to these challenges. Experimental results of the models investigated indicate exceptional performance compared to traditional machine learning models for detecting breast cancer in histopathological images. It is shown that the models produce different results at different image resolutions; DL models find it challenging to deal with low-resolution and noisy images. Our work was conducted on 2D histopathological images, but the amount of information obtained from 2D images is limited. A 3D approach, by contrast, provides enriched images and enables the visualization of differences in the size and volume of the mammary glands. Although it is a challenging task, efforts are currently being made to reconstruct 3D histology images by stacking 2D images (Mertzanidou et al., 2017). Future work will focus on experimenting with various DL models applied to 3D histopathological images. Additionally, new strategies will be devised to improve the resolution of histopathological images and thereby the performance of DL models.

3.3. Deep Convolutional Neural Network
A convolutional neural network (CNN) is a type of DL architecture that has become very popular in various computer vision tasks such as image classification, face recognition, and action recognition. It is designed to extract the most significant features from visual imagery. CNN has three main characteristics that give it an advantage over other learning algorithms: (a) sparse interactions, (b) parameter sharing, and (c) equivariant representations. In medical science, most researchers use DL-based CNN models because of their outstanding performance in image classification. Generally, a CNN architecture is composed of three types of layers: (a) convolution layers, (b) pooling layers, and (c) fully connected layers. The convolution and pooling layers are essential for feature extraction, while the fully connected layers perform the final classification. A standard CNN architecture consists mainly of alternating convolution and pooling layers followed by one or more fully connected layers. The basic architecture of a CNN is depicted in Figure 2.
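As a rough illustration of the three layer types, a toy single-channel forward pass might look like the following. Names and shapes are illustrative only; real CNNs stack many such layers with learned multi-channel kernels:

```python
import numpy as np

def conv2d(x, kernel):
    """Valid 2-D convolution: one small kernel slides over the image,
    which realizes both sparse interactions and parameter sharing."""
    H, W = x.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling over size x size blocks."""
    H2, W2 = x.shape[0] // size, x.shape[1] // size
    return x[:H2 * size, :W2 * size].reshape(H2, size, W2, size).max(axis=(1, 3))

def tiny_cnn_forward(image, kernel, fc_weights):
    """Convolution -> ReLU -> pooling -> fully connected class scores."""
    feat = np.maximum(conv2d(image, kernel), 0.0)  # convolution layer + ReLU
    pooled = max_pool(feat)                        # pooling layer
    return pooled.ravel() @ fc_weights             # fully connected layer
```

An 8x8 input with a 3x3 kernel yields a 6x6 feature map, pooled to 3x3, whose nine values feed the fully connected classifier.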

Figure 1. Original images and the processed images.

Figure 2. The basic architecture of the convolutional neural network model.

Figure 3. Block diagram of the proposed methodology.
to the other models. The graphical plot of the performance-metric distribution for all the proposed models is shown in Figure 9. It can be observed in Figures 5 and 7 that our selected models performed very well on the test data set. As the number of epochs increases, the accuracy rises sharply and then saturates, which indicates that training of the selected network is complete. Moreover, an important observation from the training phase is that the network exhibited neither under-fitting nor over-fitting: the validation accuracy curves and training accuracy curves are very similar.

Figure 10. Confusion matrix of the NASNet-N (large) model for all magnification factors.

Table 1. Distribution of Images in the BreaKHis Data Set

In the present study, the mean values were [0.485, 0.456, 0.406] and the standard deviation values were [0.229, 0.224, 0.225]. Normalization helps the network keep the weights close to zero, which makes the network more stable for backpropagation. Examples of the original and the processed images are shown in Figure 1.
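Using the statistics quoted above, the per-channel normalization could be written, for example, as:

```python
import numpy as np

# Per-channel statistics quoted in the text (the standard ImageNet values).
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])

def normalize(image):
    """image: H x W x 3 array with values scaled to [0, 1];
    returns the channel-wise standardized image (image - mean) / std."""
    return (image - MEAN) / STD
```

An image whose pixels all equal the channel means maps to zeros, so the normalized inputs are centered for backpropagation.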

Table 2
and NASNet-N (large) for the different magnification factors of 40×, 100×, 200×, and 400× are shown in Figures 5-8. The findings of this experimental investigation suggest that the ResNext-50 and DenseNet-169 convolutional networks can be used for breast cancer detection because these recent models provide very good accuracy.
The accuracy scores obtained are comparable with other state-of-the-art studies of breast cancer histopathological image classification. The outcomes of the study are compared in Table 2 with previous studies reported in the literature. It should be noted that even if the improvements achieved by the proposed models appear marginal, they are significant in DL terms, because even a slight increase in accuracy has a considerable impact on overall performance in the medical imaging field.