A novel diagnostic map for computer-aided diagnosis of skin cancer

In teledermoscopy, images are transmitted through a communication channel to the medical facility for medical consultation. This leads to bandwidth congestion and large storage consumption, which impair the transmission of high-resolution dermoscopy images. This study proposes a novel technique that generates a diagnostic feature map, named the diagonal compressive sensing (CS) features map. The proposed map is generated from the significant diagnostic features by aligning the feature vector in a diagonal map, which is then compressed using the CS technique. The feature map recovered at the receiver side is applied to the classification system for decision-making. In addition, physicians at the receiver side who were trained to read the feature maps can verify the classification decision and then provide feedback to the patient. The results demonstrated that the proposed diagnostic map achieved a shorter transmission time owing to the small size of the feature map along with the compression process. Furthermore, the feature map improved the classification performance metrics, including the accuracy, which increased, for example, from 88.6% to 98% at 80% compression ratio compared to the traditional method applied to the compressed whole image.


INTRODUCTION
Telemedicine systems have introduced a new era of medical care services for remote diagnosis and treatment especially in the rural areas that suffer from the shortage in dermatology/medical staff and long waiting queues [1][2][3]. One of the most common applications of telemedicine is teledermatology that supports long-distance consultations and diagnosis of skin conditions based on the telemedicine network infrastructure [4,5].
Store-and-forward (SAF) teledermoscopy relies on the time-independent exchange of digital images and laboratory results between both parties, which is more flexible and time-saving. However, SAF teledermoscopy systems require large transmission bandwidth and relatively long transmission time. This creates a need for compression methods that save bandwidth and speed up transmission, in order to prevent connection interruptions, transmission delays and deterioration in the quality of service, and to reduce the network cost by minimising the size of the required storage systems. Compressed sensing (CS) achieves both signal sampling and compression at the same time [6]. In this study, skin cancer is considered as a case study of telemedicine applications. Skin lesions can be generally divided into benign and malignant lesions. Early skin cancer diagnosis increases the survival rate. For early diagnosis of suspicious lesions, several automated skin lesion classification systems that apply artificial intelligence and image processing techniques have been developed [7][8][9][10][11][12]. Such classification systems require different feature sets, such as shape descriptors, morphological features, colour features, texture features based on the grey-level co-occurrence matrix, and high-level dermoscopic features. Doukas et al. [12] proposed a smartphone-based self-assessment system that distinguishes benign from malignant lesions with 85%-90% accuracy, without applying any compression technique before transmitting the feature vector (FV) to the receiving side.
In addition, the proposed system is motivated by the large number of dermoscopic images that are transmitted for many patients at the same time over the same communication channel, which makes a compression technique essential. Furthermore, the proposed work mainly aims to achieve a new representation of telemedicine dermoscopic images that is diagnostic, can be used directly as a primary aid for the physician, and is difficult to interpret for an attacker who is unfamiliar with the representation. Consequently, the present study proposes a fast SAF teledermoscopy system, which can be considered a computer-aided diagnostic system, for guiding the physician at the receiver end to an accurate diagnosis. This is achieved by transmitting a new compressed feature map called diagonal CS features (D-CSF), which is generated from the traditional FV and compressed using the CS technique, instead of sending the compressed whole raw images over the teledermoscopy channel. Accordingly, the extracted FV is transmitted as a diagnostic representation of the original dermoscopic image instead of the image itself. To guarantee the efficient transmission of this very small FV, it is transmitted in the form of a two-dimensional (2D) matrix. This 2D matrix is a new representation of the FV in the form of an image, proposed to limit the effect of channel noise: noise distorts a small-sized FV more than it distorts the larger diagnostic image that represents the FV. Accordingly, the proposed system achieves size reduction, bandwidth saving, faster transmission and reduced storage requirements at the medical facility's end.
Thus, this proposed new representation reduces the overall system cost compared to the traditional system with a secured transmission and is only recognisable by pre-trained physicians.
The phases of the proposed system include the transformation of the traditional FV into the new proposed feature map (mapping process) to generate the D-CSF. Afterwards, the CS technique is applied for compression. The compressed D-CSF is then transmitted through the telemedicine network to the medical facility (i.e. receiver side) at which the D-CSF are recovered producing the recovered FV using convex optimisation reconstruction. Finally, the classification system will use the recovered features to identify the lesion type into benign nevus or malignant melanoma to provide distant consultation and therapy plan. The physician at the receiver side was also trained to observe the feature maps (D-CSF) and verify the classification results.
The remaining sections are structured as follows. Section 2 includes the proposed approach. Section 3 presents the results and the evaluated performance metrics of the proposed system besides a comparative study between the proposed system and the traditional teledermoscopy SAF system. Finally, Section 4 concludes the present study.

METHODOLOGY
In the proposed study, the images at the user's end undergo preprocessing, including hair removal using the well-known DullRazor software [13]. Afterwards, lesion segmentation is applied using the Otsu threshold [14], as shown in Figure 1. The Otsu threshold is averaged over the four components of a single-level wavelet decomposition to reduce the image noise, followed by morphological postprocessing operations. In this study, the benign nevus and malignant melanoma classes are considered. Feature extraction techniques are employed to generate the traditional FV from the segmented image. The cumulative level-difference mean (CLDM) based on the grey-level difference method (GLDM) [14] is used to measure the texture coarseness over a given image. Then, the extracted FV is filtered to exclude the irrelevant features using the eigenvector centrality feature selection (ECFS) ranking approach. The selected FV is then mapped and aligned in a corresponding feature map, which is further reduced in size by compression using the CS technique. The compressed D-CSF is transmitted through the telemedicine network instead of the compressed original image to save time and improve security. At the receiving medical facility, the compressed mapped features are decompressed using the convex optimisation reconstruction technique before being applied to the classifier, which determines the lesion's type.
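The Otsu thresholding step mentioned above can be illustrated with a minimal, dependency-free sketch. It shows only the plain Otsu criterion (maximising the between-class variance) on a flat list of 8-bit grey levels; the wavelet-based averaging and the morphological postprocessing are not reproduced, and the function name `otsu_threshold` is a hypothetical helper rather than part of the authors' implementation.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    `pixels` is a flat list of integers in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_low = 0      # number of pixels with value <= t
    sum_low = 0         # sum of grey levels of those pixels
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue                      # lower class still empty
        weight_high = total - weight_low
        if weight_high == 0:
            break                         # upper class empty from here on
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        between_var = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


# Toy example: a dark region (level 10) and a bright region (level 200).
toy_pixels = [10] * 50 + [200] * 50
threshold = otsu_threshold(toy_pixels)
mask = [p > threshold for p in toy_pixels]   # True marks the brighter class
```

Whether the lesion corresponds to the darker or the brighter class depends on the image, so the polarity of `mask` is an assumption of this toy example.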

Features extraction
In this work, an FV of the CLDM-based GLDM texture features is extracted and combined with the composite ABCD features that are mainly used for distinguishing malignant melanoma from benign nevus. Let the separation vector be $S = (\Delta u, \Delta v)$ for any image $I(u, v)$; the absolute grey-level difference between a pixel pair is given by $|I(u, v) - I(u + \Delta u, v + \Delta v)|$. The CLDM describes the lesion's texture by considering the number of counts $D_0$, which represents the texture match at the given distance $d$ and direction [15].
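As an illustration of the grey-level difference idea described above, the following sketch counts the absolute grey-level differences of all pixel pairs separated by (Δu, Δv). It is a simplified reading, not the authors' CLDM code: treating u as the row index and v as the column index, and reading the zero-difference count as the "texture match" count, are assumptions, and the helper names are hypothetical.

```python
from collections import Counter


def gldm_histogram(image, du, dv):
    """Count absolute grey-level differences for the separation (du, dv).

    `image` is a list of equal-length rows of integer grey levels.
    Returns a Counter mapping each difference value to its number of pairs.
    """
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for u in range(rows):
        for v in range(cols):
            u2, v2 = u + du, v + dv
            if 0 <= u2 < rows and 0 <= v2 < cols:   # skip pairs leaving the image
                counts[abs(image[u][v] - image[u2][v2])] += 1
    return counts


def gldm_mean(counts):
    """Mean absolute difference, a simple coarseness indicator."""
    total = sum(counts.values())
    return sum(diff * n for diff, n in counts.items()) / total


# Toy 3x3 image, horizontal neighbours at distance d = 1.
toy = [[0, 0, 1],
       [0, 2, 1],
       [3, 3, 3]]
hist = gldm_histogram(toy, du=0, dv=1)
zero_difference_pairs = hist[0]   # pairs whose grey levels match exactly
mean_difference = gldm_mean(hist)
```

The four directions used in the paper would correspond to four different (du, dv) pairs at the chosen distance.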
Additionally, in the ABCD features, each of the border (B) subfeatures is considered individually, namely the edge abruptness, fractal index, pigment transition and compact index, besides the A, C and D features. This gives the classifier a deeper perception of the differences between the classes based on the border features. The asymmetry index (AI) is calculated on the binary lesion mask from $A_r$, the total lesion area, $\Delta A_1$, the area difference between the upper and lower halves, and $\Delta A_2$, the area difference between the right and left halves. In the present study, all of the aforementioned ABCD features were calculated.

Feature ranking and selection
Feature ranking is applied to exclude irrelevant features and reduce the dimensionality of the feature space, which reduces the classifier's complexity. The present study applied the ECFS feature ranking because it does not require thresholding of similarity values and is computationally efficient. The contribution of the present study is a two-stage data reduction and compression procedure for efficient SAF teledermoscopy: feature ranking and selection reduce the FV to the significant features, and the mapped features, rather than the original image, are then compressed using the CS technique and transmitted.

Proposed feature compression using CS
Generally, the CS technique comprises three main parts, namely, the measurement matrix, the sparse description of the signal, and the reconstruction algorithm. In the proposed feature map image, the features are aligned on the diagonal with zero-padding to achieve a uniform imaging structure with a reduced risk of cropping attacks. Moreover, lower-dimensional representations, such as the FV on its own, are more prone to channel noise. In the compression process at the transmitter, each column of the diagonal feature map represents the signal $x$ of dimension $N \times 1$, which is the data to be compressed. If the transmitted data is sparse with $K$ non-zero entries, then it can be expressed as $x = \Psi\theta$, where $\Psi$ represents the $N \times N$ basis matrix of $x$ and $\theta$ is the $N \times 1$ vector of the corresponding coefficients. The CS theory states that the transformation coefficient $\theta$ of a compressible signal $x$ in an orthogonal basis $\Psi$ is given by $\theta = \Psi^{T} x$. However, since the feature map image is already sparse in the time domain (that is, without any transform), the basis matrix $\Psi$ is replaced with the identity matrix of the same size [16].
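The diagonal alignment described above can be sketched in a few lines. This is a minimal illustration under the assumption that the map is an N×N matrix with the N selected features on the main diagonal and zeros elsewhere; the function names are hypothetical.

```python
def diagonal_feature_map(features):
    """Place a feature vector on the main diagonal of a zero matrix."""
    n = len(features)
    return [[features[i] if i == j else 0.0 for j in range(n)]
            for i in range(n)]


def diagonal_of(matrix):
    """Read the feature vector back from the main diagonal."""
    return [matrix[i][i] for i in range(len(matrix))]


def column(matrix, j):
    """Column j of the map, i.e. one N x 1 signal x to be compressed."""
    return [row[j] for row in matrix]


fv = [0.2, 0.5, 0.9]                  # toy feature vector (N = 3)
feature_map = diagonal_feature_map(fv)
x = column(feature_map, 1)            # contains a single non-zero entry
```

Each column holds at most one non-zero entry, which is why the map is sparse without any transform and the basis can be taken as the identity.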
To calculate the measurement matrix, the observation basis matrix $\phi$ is obtained using the discrete cosine transform (DCT) matrix after fulfilling the incoherence condition. Therefore, the measurement matrix $A$ of dimension $M \times N$ is the product of the observation matrix $\phi$ (DCT basis) and the signal basis $\Psi$, that is, $A = \phi\Psi$ with $M \ll N$. Eventually, the compressed signal can be expressed as $y = \phi x = \phi\Psi\theta = A\theta$, where $x \in \mathbb{R}^{N}$ and $y \in \mathbb{R}^{M}$; because $\Psi$ is the identity matrix here, this reduces to $y = Ax$.

Measurement matrix
The appropriate recovery of $x$ from $y$ depends on a well-designed measurement matrix $A$, such that the dimensionality reduction does not damage the relevant information in the compressible signal $x$. Additionally, the incoherence condition states that the rows of $\phi$ cannot sparsely represent the columns of $\Psi$ and vice versa. Hence, the measurement matrix (observation matrix) is an essential part of the compression process and its design should meet some criteria [17,18]. In this study, the DCT basis was used to obtain the measurement matrix of size $M \times N$, such that $M \ll N$ for proper compression.
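To make the role of the DCT-based measurement matrix concrete, the sketch below builds an orthonormal DCT-II matrix, keeps M of its N rows, and compresses one signal as the matrix-vector product y = A x. Two points are assumptions for illustration only, since the text does not specify them: M is taken as round(CR × N), and the first M rows are the ones kept. The function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows are basis vectors)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def measurement_matrix(n, cr):
    """Keep M = round(cr * n) rows of the DCT matrix (assumed selection rule)."""
    m = max(1, round(cr * n))
    return dct_matrix(n)[:m]


def compress(a, x):
    """Compute y = A x for an M x N matrix `a` and a length-N signal `x`."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


n = 30                                  # number of selected features
A = measurement_matrix(n, cr=0.8)       # 24 x 30 under the assumption above
x = [0.0] * n
x[3] = 0.5                              # one column of a diagonal feature map
y = compress(A, x)                      # length-24 compressed signal
```

Because the signal has a single non-zero entry, each measurement is simply that entry scaled by one DCT coefficient, which is the kind of structure the recovery step exploits.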

Sparse representation
Sparse representation means that the vast majority of the coefficients are zero and only a few elements of the signal are non-zero. For a sparse signal with transform coefficient vector $\theta$, $0 < p < 2$ and $0 < k \ll N$, the following condition is satisfied: $\|\theta\|_{p} = \left(\sum_{i} |\theta_{i}|^{p}\right)^{1/p} \le k$, where $p$ is the norm type and $k$ is the sparsity level of the signal. Then, the compressed mapped feature $y_{cs}$ is transmitted through the telecommunication channel to the receiver side, which carries out the feature recovery process to obtain $\hat{x}$ (the estimated mapped features).

Reconstruction algorithm
The signal reconstruction procedure takes the measurements in the compressed signal $y$, the measurement matrix, and the basis $\Psi$ to reconstruct the signal $x$ (each $N \times 1$ column of the $N \times N$ feature map) or, equivalently, its sparse transformation coefficient vector $\theta$. Therefore, in the proposed method, the DCT is applied to obtain the basis coefficients at the transmitter for obtaining the measurement matrix $A$ and for further use at the receiver during the recovery process [19]. The simplest way to find $\hat{x}$ is to solve the $\ell_0$-norm optimisation problem $\hat{x} = \arg\min \|x\|_{0}$ s.t. $y_{cs} = Ax$. However, this is considered an impractical solution because it is a non-convex, non-deterministic polynomial-time hard (NP-hard) problem. Consequently, in the current study, the combinatorial problem is replaced with a convex optimisation problem, which is solved using the Matlab-based modelling system for convex optimisation (CVX). The convex form of the equality-constrained problem is $\hat{x} = \arg\min \|x\|_{1}$ s.t. $y_{cs} = Ax$. At the receiver's end, the mapped features are recovered by solving this problem with CVX. The superiority of the diagonal form representation is also validated in the results section. Finally, the recovered FV is applied to the classifier to determine the corresponding lesion's class. Generally, the new compressed and recovered features differ from the features extracted using the traditional method because of the mapping and compression processes.
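For readers less familiar with why the $\ell_1$ form is tractable, one standard reformulation (not necessarily the one CVX applies internally) splits the unknown into non-negative parts and yields a linear programme. Here $A$ denotes the measurement matrix, $y_{cs}$ the received measurements, and $u$, $v$ are auxiliary variables introduced only for this illustration.

```latex
\begin{aligned}
\hat{x} &= \arg\min_{x}\ \|x\|_{1}
  \quad \text{s.t.}\quad y_{cs} = A x, \\[4pt]
\text{write } x &= u - v,\qquad u \ge 0,\ v \ge 0, \\[4pt]
(\hat{u}, \hat{v}) &= \arg\min_{u,\,v}\ \mathbf{1}^{T}(u + v)
  \quad \text{s.t.}\quad A(u - v) = y_{cs},\ \ u \ge 0,\ v \ge 0, \\[4pt]
\hat{x} &= \hat{u} - \hat{v}.
\end{aligned}
```

At an optimum, $u_i$ and $v_i$ are never both positive for the same index, so $\mathbf{1}^{T}(u + v)$ equals $\|x\|_{1}$ and the two problems have the same solution. The objective and the constraints are linear, which is what makes the problem solvable in polynomial time, in contrast to the combinatorial $\ell_0$ search.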

Classification at receiver side using support vector machine (SVM)
In the current study, the classification process is performed at both the transmitter and the receiver ends to evaluate the accuracy of using the proposed feature map. The classification accuracy is measured when using the mapped diagonal features after compression and compared with the classification accuracy using the recovered diagonal features at the receiver end.

The proposed system
The proposed SAF teledermoscopy system for automated lesion classification and improved medical decisions, based on the received compressed mapped diagonal feature matrix, is outlined in Figure 2.
In Figure 2, the traditional FV extraction and selection methods are applied. These features are then mapped to obtain the diagonal new D-CSF features for further compression prior to its transmission through the channel for saving the channel bandwidth, boosting the telemedicine service processing time and extending the storage capability of the resources at the receiver's side, which has been realised through applying the CS technique.
The transmission process is assumed to take place over an internet channel of at least 4 Mb/s for the transmission time calculations. The transmission time of the compressed feature map is calculated by dividing the compressed feature map size (bits) by the assumed channel speed (bits/s). At the receiver, the compressed D-CSF features are first recovered using the convex optimisation technique, CVX. Then, the diagonal vector of the D-CSF features is used in the classification process for the final decision-making. The decision of the classifier is fed back to the user's side, in addition to the recommended medical advice and therapy plan as assigned by the specialist. Figure 3 describes the details of the CS-based compression and reconstruction processes. It illustrates the transformation of the extracted traditional FV into the feature map form $x$, in which the FV is aligned diagonally in a matrix of zeros. Then, the dimensions of the measurement matrix $\phi$ are defined according to the required compression ratio (CR), which is the ratio of the size of the compressed image to the size of the uncompressed image in bytes. Subsequently, the measurement matrix values are assigned using the DCT basis in the CS. The compressed signal $y$, which is obtained by multiplying the measurement matrix $\phi$ by the feature map $x$, is then transmitted through the channel to the receiver's side. At the receiver's end, the reconstruction algorithm solves the convex $\ell_1$-norm optimisation problem based on the signal basis $\Psi$ and the observation basis $\phi$ to obtain the estimated feature map $\hat{x}$ and eventually recover the traditional FV. In the present study, classification using the SVM is used to evaluate the compression accuracy at different compression ratios and to validate the proposed mapped features against the traditional FV as well as the original dermoscopic images.

EXPERIMENTAL RESULTS AND DISCUSSION
Skin cancer dermoscopy images are presented as a case study of the proposed concept, which uses the new feature map-based CS technique for medical data transmission through telemedicine networks. A dataset of 60 dermoscopy images from the publicly available International Skin Imaging Collaboration archive [20] is used, divided into 30 melanoma images and 30 nevus images. The results of the proposed modified SAF teledermoscopy system on both the transmitter and the receiver sides are reported, and the system performance is validated on this dataset. All results were obtained with the system implemented in MATLAB R2017a on a PC with an Intel Core i5-2410M 2.3 GHz processor and 8 GB of RAM, running the Microsoft Windows 7 operating system. All of the classification performance metrics in this study were evaluated using five-fold cross-validation.
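The five-fold cross-validation protocol can be sketched as follows for a balanced dataset of this size. This is only an illustration of how the 60 images could be partitioned; whether the original experiments used stratified or shuffled folds is not stated, so the stratified, unshuffled split and the helper name `stratified_folds` are assumptions.

```python
def stratified_folds(labels, k=5):
    """Split sample indices into k folds with equal class proportions.

    Indices of each class are dealt round-robin into the folds, so the
    result is deterministic (no shuffling is performed).
    """
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


labels = ["melanoma"] * 30 + ["nevus"] * 30
folds = stratified_folds(labels, k=5)

for held_out in range(5):
    test_idx = folds[held_out]
    train_idx = [i for f, fold in enumerate(folds) if f != held_out
                 for i in fold]
    # A classifier would be trained on train_idx and evaluated on test_idx;
    # the reported metric is then averaged over the five held-out folds.
```

With 30 images per class, every fold contains 12 images (6 per class), and each image is used for testing exactly once.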

Experiments on the transmitter side
The obtained results for the processes at the transmitter's side include extracting the traditional FV from the pre-processed image, then selecting the highest ranked features using ECFS algorithm. The ranked FV, named as the traditional FV extracted from each image, will be transformed into the corresponding feature map to obtain the D-CSF features after being compressed using the CS techniques.

Traditional FV extraction
In this study, 30 ranked features are selected, including the diameter, fractal dimension, and CLDM features in the four directions at inter-sample distance d = 1. In addition, the GLDM mean, entropy and contrast at two inter-sample distances (d = 1 and d = 3) in the four directions are also selected. Table 1 lists samples of the extracted features from the traditional FV averaged for each of the two classes under study, which shows the discrepancy in the feature values between both classes. The extracted FV is then converted into the diagonal form. Figure 4 shows samples of the proposed feature map images generated from the corresponding dataset lesion images and illustrates the difference between the two classes in the feature map images. To validate the importance of this step, a comparison is conducted between the classification performances of two cases, namely, (i) compressing the traditional extracted FV directly using the CS technique, and (ii) compressing the proposed feature map to produce the new D-CSF features (as displayed in Figure 4). The recovered FV (at the receiver side) for both cases is classified using the SVM. Figure 5 illustrates the compressed images at the transmitter side when transmitting the compressed version of the original image versus the compressed version of the proposed feature map corresponding to the same original image, for different compression ratios. Figure 6 shows the difference in classification accuracy between the two cases and establishes that the accuracy increases when the proposed feature map is used instead of the traditional FV.
[Figure caption (partial): the proposed feature map image corresponding to the image in (a1); (b1, b2): images after 60% compression for (a1, a2), respectively; (c1, c2): images after 20% compression for (a1, a2), respectively.]
These results established the necessity of our contribution by converting (mapping) the FV to the proposed feature map prior to the compression using the CS and transmission. Hence, the proposed feature map image carries the needed information to perform the intended classification task while optimising the bandwidth and reducing the transmission time which would be used to transfer the original image.

Experiments on receiver side
The received D-CSF vector is reconstructed using CVX to obtain the recovered skin lesion features. Then, the diagonal of the recovered mapped features is used as the modified FV for further classification process to determine the lesion's class. The proposed SAF teledermoscopy diagnostic system was first validated in terms of the classification performance metrics using the recovered D-CSF features reconstructed at the receiver side.
The main goal of the classifier is to decide whether the recovered D-CSF features can be used to correctly distinguish the benign nevus from the malignant melanoma. The overall accuracy of the SVM is estimated. Table 2 shows the classification accuracy of the recovered FV using several SVM kernel types: the recovered D-CSF features achieved 93.2% accuracy at 20% compression using medium Gaussian, coarse Gaussian and linear SVMs, and 90.9% accuracy at 20% compression using quadratic and cubic SVMs. In the remaining classification results, the linear SVM is applied due to its simplicity and acceptable performance. Figure 7 shows the performance metrics of the classification process using the linear SVM, namely the accuracy, specificity, sensitivity and F-measure at the receiver side at 80%, 60%, 40% and 20% CRs using the proposed system. Figure 8 illustrates the receiver-operating characteristic (ROC) curves for the proposed system using the linear SVM at both 80% and 20% compression ratios. Figure 8 supports the results in Figure 7: at 80% CR, the ROC curve is near-optimal with an area under the curve equal to 1, whereas at 20% CR the area under the curve drops to 0.97.
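The four reported metrics follow directly from the confusion-matrix counts. The sketch below shows the usual definitions, assuming malignant melanoma is treated as the positive class; the counts in the example are hypothetical and are not taken from Table 2 or Figure 7.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and F-measure from raw counts.

    tp/fn: melanoma cases classified correctly/incorrectly (assumed positive
    class); tn/fp: nevus cases classified correctly/incorrectly.
    Raises ZeroDivisionError if a denominator is zero.
    """
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


# Hypothetical counts for a 30 + 30 image dataset.
metrics = classification_metrics(tp=27, fn=3, tn=29, fp=1)
```

Under cross-validation, these counts would typically be accumulated over the held-out folds before the metrics are computed.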

Comparative study and discussion
In this section, the proposed system's performance is validated numerically against the traditional image transmission system. The traditional system transmits the original images instead of the mapped features (D-CSF features) of the images. In the traditional system, the image is compressed before being transmitted to the receiver, where the original image is recovered for any further processing. The proposed system outperformed the traditional image transmission system in terms of classification accuracy, transmission time, and compression and recovery times. Figure 9 shows the results of the CS when applied to the original images from both classes of the dataset after recovery at 60% and 20% CRs. Figure 9 reveals that increasing the compression decreases the system's classification capability due to the deterioration of the image quality. The present study applies the same compression method to the two types of systems to show the improvement in classification accuracy and time metrics. Figures 10 and 11 compare the proposed SAF teledermoscopy image classification system and the traditional system in terms of classification performance metrics for 80% and 20% CRs, respectively.
From Figures 10 and 11, it can be seen that the proposed teledermoscopy image classification system outperforms the traditional system in terms of classification metrics at different CRs. Figure 12 compares the proposed system and the traditional system in terms of the required data transmission time in the telemedicine network. In the traditional system, the transferred data is the lesion's image, whereas in the proposed system the compressed D-CSF features are the only transmitted data. The transmission time is calculated by dividing the size of the transmitted data (the image in the traditional system or the D-CSF in the proposed system) by the assumed average channel speed of 4 Mb/s. The proposed system has far shorter transmission times than the traditional system; they are estimated to be in the order of milliseconds, as shown in Table 3.
The values in Table 3 reflect the reduction in the image size, as they are obtained by dividing the aforementioned sizes by the assumed internet channel speed of 4 Mb/s. In addition, the transmission size of the proposed feature map (in bytes) at any CR is approximately 0.057% of the transmitted image's size at the same CR. For example, at 80% compression, an image transmitted using the traditional system occupied 417,792 bytes, whereas the corresponding compressed feature map in the proposed system at the same compression ratio occupied only 240 bytes. This reduction in the transmission size extends the utilisation of the storage devices required at the receiver's side. Figure 13 shows the difference in the compression time consumed prior to data transfer between the traditional system, in which the compression is applied to the lesion image, and the proposed system, in which only the lesion's feature map is compressed. In addition, a time reduction is also achieved in the recovery time consumed at the receiver prior to the classification process, as shown in Figure 14, which is also essential for reducing the overall processing time. Figure 13 shows that the compression time, along with the transmission time, is reduced drastically by the change in the form of the processed data. Figures 12 through 14 confirm the advantage of the proposed system in terms of overall process time, including the data transmission, compression and recovery times, which supports a fast and reliable telemedicine service, as applied here to the SAF teledermoscopy system. The proposed system saves system resources through efficient utilisation of the channel bandwidth and of the storage capacity, thus reducing the overall system cost.
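The transmission-time arithmetic can be checked with a short worked example using the two sizes quoted above for 80% CR. It assumes that 4 Mb/s means 4,000,000 bits per second and that no protocol overhead is counted; the resulting times are therefore illustrative and are not the entries of Table 3.

```python
def transmission_time_s(size_bytes, channel_bps=4_000_000):
    """Transmission time in seconds: size in bits divided by channel speed."""
    return size_bytes * 8 / channel_bps


image_bytes = 417_792      # compressed image, traditional system, 80% CR
map_bytes = 240            # compressed feature map, proposed system, 80% CR

image_time = transmission_time_s(image_bytes)   # about 0.84 s
map_time = transmission_time_s(map_bytes)       # about 0.48 ms
size_ratio_percent = 100 * map_bytes / image_bytes   # about 0.057 %
```

Under these assumptions, the feature map needs roughly half a millisecond against roughly 0.84 s for the image, and the size ratio reproduces the quoted figure of approximately 0.057%.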
The experimental results proved that using the D-CSF features is efficient in the classification process even on the compressed images. In the current study, the classification process is performed at both sides (transmitter and receiver) to ensure the robustness of the classification process using the new features. From the preceding results, the newly proposed diagnostic map system proved its superiority as it requires less transmission time, less required bandwidth, and high classification performance. Also, by taking the opinion of dermatology experts in the hospital, who appreciate the newly generated diagnostic map, it is recommended to evaluate the proposed system in a real medical environment.

CONCLUSION
Telemedicine systems provide remote healthcare services and teleconsultations for distant patients over telecommunications channel. The SAF systems allow asynchronous communication where the patients can send their medical laboratory results to the medical experts in addition to exchanging of digital images, such as dermoscopy images along with the medical symptoms. The proposed study provided a fast, reliable, and accurate SAF teledermoscopy system with reduced overall processing time, besides better channel bandwidth utilisation and reduced storage requirement at the medical facility side along with higher classification accuracy. The main novelty of the proposed SAF teledermoscopy system is the transmission of the compressed new D-CSF features map instead of transmitting the compressed dermoscopy images. The results proved that using the D-CSF features is superior to using and transmitting the traditional FV without being aligned in the proposed features map. The compression performance is validated by measuring the classification accuracy at both the transmitter and the receiver sides after recovery. Thus, in the proposed system, these advancements are realised by introducing a new technique in data transmission based on the transmission of the compressed extracted features which is compressed using the CS technique instead of the transmission of the whole high-resolution image. Then, the received FV is recovered using CVX, which reconstructs the estimated features. Finally, the restored FV is applied to the classification system to determine the skin lesion class so that the medical consultation decision along with the therapy plan can be fed back to the patient immediately. The system performance was validated using a skin lesion dataset. Also, a comparative study was conducted between the proposed system and the traditional system in which the whole lesion image is transmitted. 
From that comparison, the classification accuracy, sensitivity and specificity of the proposed system reached 98%, 95% and 100%, respectively, compared to 88.6%, 86.3% and 90.9% for the traditional system at 80% CR using linear SVM. At 20% CR, the accuracy, sensitivity and specificity of the proposed system reached 93%, 91% and 95%, respectively, compared to 88.6%, 81.8% and 95.3% for the traditional system. Besides, in terms of data transmission size, the compressed feature map size at any CR is approximately 0.057% of the compressed transmitted image at the same CR, which reduces the overall processing time along with the required storage size. In future work, we aim to extend the diagnostic feature map concept to other imaging modalities, such as confocal laser scanning microscopy, which allows for a 3D reconstruction of the scanned skin lesion, magnetic resonance imaging and computed tomography. These techniques inherently produce large images, and thus the proposed concept is expected to reduce their transmission size efficiently and improve the classification performance metrics. Moreover, the newly generated dermoscopic features map is hard to interpret for an attacker who is unfamiliar with this new representation, which adds a degree of security to the proposed system. The compression step further obscures the transmitted map, which complicates its recovery even by an attacker who has learned the map. Accordingly, it is recommended as a future study to investigate different feature distributions and compare them to the diagonal distribution in terms of their immunity to different security attacks and their classification performance.