Funding sources: This project was supported by the Technology Strategy Board under grant number TP/6/ICT/6/S/K1524H.
Clinical and Laboratory Investigations
Incorporating clinical metadata with digital image features for automated identification of cutaneous melanoma
Article first published online: 31 OCT 2013
© 2013 British Association of Dermatologists
British Journal of Dermatology
Volume 169, Issue 5, pages 1034–1040, November 2013
How to Cite
Liu, Z., Sun, J., Smith, M., Smith, L. and Warr, R. (2013), Incorporating clinical metadata with digital image features for automated identification of cutaneous melanoma. British Journal of Dermatology, 169: 1034–1040. doi: 10.1111/bjd.12550
Conflicts of interest: None declared.
- Issue published online: 31 OCT 2013
- Accepted manuscript online: 31 JUL 2013 10:52PM EST
- Manuscript Accepted: 24 JUN 2013
Computer-assisted diagnosis (CAD) of malignant melanoma (MM) has been advocated to help clinicians achieve a more objective and reliable assessment. However, conventional CAD systems examine only the features extracted from digital photographs of lesions. Their failure to incorporate patients' personal information constrains their applicability in clinical settings.
To develop a new CAD system to improve the performance of automatic diagnosis of melanoma, which, for the first time, incorporates digital features of lesions with important patient metadata into a learning process.
Thirty-two features were extracted from digital photographs to characterize skin lesions. Patients' personal information, such as age, gender and lesion site, and their combinations, was quantified as metadata. The integration of digital features and metadata was realized through an extended Laplacian eigenmap, a dimensionality-reduction method that groups lesions with similar digital features and metadata into the same classes.
The diagnosis reached 82·1% sensitivity and 86·1% specificity when only multidimensional digital features were used, but improved to 95·2% sensitivity and 91·0% specificity after metadata were incorporated appropriately. The proposed system achieves a level of sensitivity comparable with experienced dermatologists aided by conventional dermoscopes. This demonstrates the potential of our method for assisting clinicians in diagnosing melanoma, and the benefit it could provide to patients and hospitals by greatly reducing unnecessary excisions of benign naevi.
This paper proposes an enhanced CAD system incorporating clinical metadata into the learning process for automatic classification of melanoma. Results demonstrate that the additional metadata and the mechanism to incorporate them are useful for improving CAD of melanoma.
Malignant melanoma (MM) accounts for 75% of the mortality caused by skin cancers, and has been increasing rapidly among the white population during the last few decades. According to the U.K. National Health Service (NHS), registered cases of melanoma have increased approximately fivefold from around 3·2 per 100 000 inhabitants in 1975 to 17·2 per 100 000 in 2010 (Fig. 1·3 in Cancer Research U.K.), and this U.K. statistic to some extent reflects the incidence trend of MM in other Western countries.
Although MM is the most fatal form of skin cancer, it is known that the 5-year survival rate of patients with melanoma can be > 95% if the lesion is detected and removed before metastasis. Therefore, early detection and surgical removal of thin lesions is vital for successful treatment of melanoma. Currently, human visual examination still plays a dominant role in the clinical diagnosis of MM. However, human assessment is rarely repeatable and varies greatly due to the experience and expertise of individual physicians. Monitoring the progress of melanoma is equally subjective. It has been reported that diagnostic accuracy can vary from 56% for an untrained general practitioner to a little over 80% for an experienced plastic surgeon. Therefore, many dermatologists have advocated advanced computing and imaging technologies to develop some form of computer-assisted diagnosis (CAD) for an early and objective detection of melanoma.
A standard CAD system proposed for melanoma normally consists of the following components: data acquisition, data preprocessing (artefact removal, image enhancement, lesion segmentation etc.), feature extraction and a classification strategy. In particular, feature extraction and classification are the two most investigated and important steps that greatly influence the performance of a computer-based diagnostic system. This leads to two active research directions in this field: that of feature-orientated CAD systems and classifier-orientated CAD systems.
Feature-orientated CAD systems focus on extracting discriminative lesion features such as asymmetry, border irregularity, colour variegation and large dimension (ABCD). These are mainly two-dimensional (2D) geometric and colour features, which are frequently used in clinics to describe differences between MM and benign naevi.[5-7] Recently, with the development of imaging techniques, feature extraction has been extended to three-dimensional (3D) space, to provide extra information for a more comprehensive description of cutaneous lesions.[8, 9] On the other hand, classifier-orientated CAD systems are normally developed to combine multiple digital features efficiently, while eliminating redundant and unreliable information, based on machine learning and data mining algorithms.
Although all previous CAD systems have attempted to use advanced mathematical algorithms to extract and combine informative digital features for distinguishing MM from benign naevi, they surprisingly have not considered the incorporation of any relevant patient information – such as age, gender and disease history – into the diagnostic systems. However, experienced dermatologists always take all of this personal information into account when making a primary assessment. For example, it is generally believed that the risk of progression to melanoma increases substantially if there is a personal or family history of skin cancer. Furthermore, lentigo maligna melanomas occur usually in older people. The absence of patient-related information (referred to as ‘metadata’ hereafter) makes existing CAD systems less trusted than experienced dermatologists.
This paper proposes a new CAD system, which, for the first time, integrates both traditional digital features and clinically important metadata into a completely automatic classification process. The experimental results demonstrate the diagnostic significance of the metadata for achieving a highly trustworthy approach to aid melanoma recognition. Therefore, this work brings a new direction in this field and provides significant improvement to existing feature-orientated and classifier-orientated CAD systems.
Materials and methods
Data collection and experimental database
The 305 cutaneous lesions used in this study were captured using a Skin Analyser device from 265 patients during routine clinical examination at the Pigmented Lesion Clinic, Frenchay Hospital, North Bristol NHS Trust, U.K., between January 2009 and July 2012.
The Skin Analyser device, developed at the Machine Vision Laboratory, University of the West of England, Bristol, applies a six-light-source photometric stereo technique for acquisition of both 2D reflectance images and 3D geometric information (surface orientation and depth) of the skin lesion. This handheld device is designed with an effective field of view of 32·5 × 26 mm and a depth of field of 8 mm, which enables it to cover most skin lesions seen in clinics. The images output from the device offer a pixel resolution of 1280 × 1024.
All of the 305 skin lesions included in this study had been excised, and diagnosis was established by histopathology, identifying 79 MMs and 226 atypical melanocytic naevi, as shown in Table 1. Table 2 summarizes the stages of the 79 MMs based on the widely accepted American Joint Committee on Cancer guidelines. It is worth noting that early melanomas (MM in situ and MM with Breslow thickness < 1 mm) account for 76% of the entire MM data. This illustrates that the present study concentrates primarily on the differentiation between early melanomas and excised atypical melanocytic naevi, which normally pose great difficulty in clinical diagnosis.
|Benign melanocytic naevi||226|
|MM Stage||Number||Breslow thickness, mm (mean ± SD)|
|Tis (MM in situ)||23||–|
|T1 (< 1 mm)||37||0·54 ± 0·22|
|T2 (1–2 mm)||15||1·37 ± 0·27|
|T3 (2–4 mm)||4||2·69 ± 0·41|
|T4 (> 4 mm)||0||–|
Cutaneous lesions in this study were examined by five board-certified dermatologists, all with more than 10 years' clinical experience in dermatology.
During clinical assessment, cutaneous lesions were first evaluated directly by dermatologists on the basis of macroscopic visual inspection. Patients were then sent for dermoscopy examination, under which specific morphological characteristics of malignancy (e.g. blue–white veil) can be detected. By combining the macroscopic evaluation with dermoscopic criteria,[12-15] cutaneous lesions were classified before excisional biopsy into three groups: malignant, suspicious and benign.
All of the cutaneous lesions in this study were excised and diagnosed by histology as well. Considering the practical management of clinical settings, and variation in the approach to interpretation among expert pathologists, our experiment was limited to analysis from one pathologist, whose evaluation is taken as a diagnostic ground truth.
The physician's diagnosis before biopsy was deemed to be correct if the category of primary assessment agreed with that of the pathological report, for example when a lesion was first identified as a benign Spitz naevus and pathology later showed it to be a dysplastic naevus, also benign.
Digital features and clinical metadata
Before digital feature extraction, several preprocessing steps were applied to remove skin hairs and imaging noise,[16, 17] and to isolate cutaneous lesions automatically from the surrounding normal skin. Digital features were then extracted for the entire set of skin lesion data using existing computer-based analytical algorithms.[19-21] As listed in Table 3, they include four groups of morphological features described by the ABCD rules, texture information (skin line patterns) and 3D features (3D curvature). All digital features were normalized using a z-score transformation so that 99% of the values of each feature fell in the range 0–1. This prevents features with large ranges from dominating the calculation, and makes all digital features contribute comparably in the classification step.
|Digital feature||Description||Number of features|
|Asymmetry||Asymmetric degree along a pair of reflective symmetry axes, calculated from a global point signature-based descriptor. Features are extracted from MI × 2, EI × 2||4|
|Border irregularity||Four border irregularity features based on a centroid distance curve, including the relative ratio of the area under the curve, difference between the maximum and minimum distance, SD of the distance curve, and the maximum magnitude corresponding to the nonzero frequency element in the curve after discrete Fourier transformation||4|
|Colour variegation||Absolute and relative colour descriptors calculated from the second moments to characterize the colour variegation in a greyscale image. Features are extracted from MI × 2, EI × 2, R × 2, B × 2, G × 2||10|
|Dimensions of lesion||Diameter and area of lesion||2|
|Skin line patterns||Local line directions and local line variations across isolated skin lesions in a greyscale lesion image. Features are extracted from R × 2, G × 2, B × 2, bumpmap × 2||8|
|Three-dimensional curvature||Three-dimensional differential forms derived from the second fundamental matrix to characterize the topography of cutaneous lesions. This method generates four features, including mean and SD values of the principal curvatures, and the first two eigenvalues of the fundamental matrix||4|
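The normalization step described above can be illustrated with a minimal sketch. The text states only that a z-score transformation places 99% of values in the range 0–1; the exact rescaling is not given, so the sketch assumes the common choice of mapping the band of ± 3 SD onto [0, 1]. The function name `normalize_feature` and the optional clipping are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev


def normalize_feature(values, clip=False):
    """Z-score one feature column, then map the +/-3 SD band onto [0, 1].

    Assumption: the 3-SD rescaling is one common way to obtain
    "about 99% of values in 0-1"; the paper does not specify it.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; place it mid-range.
        return [0.5 for _ in values]
    scaled = [((v - mu) / (3.0 * sigma) + 1.0) / 2.0 for v in values]
    if clip:
        # Optionally force the few remaining outliers into [0, 1].
        scaled = [min(1.0, max(0.0, s)) for s in scaled]
    return scaled
```

Applied column by column to the 32 features, this keeps a feature measured in pixels (e.g. lesion area) from outweighing a feature that is already a ratio.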
Three types of clinical metadata were considered in this study, namely the patient's age, gender and lesion site. According to the clinical nature of skin lesions, metadata can be described as either continuous or discrete information. As demonstrated in Table 4, age (continuous metadata) is quantized into six intervals (< 31, 31–40, 41–50, 51–60, 61–70 and > 70 years). This quantization is defined based on the incidence statistics provided by the NHS: few cases of melanoma are diagnosed in patients < 31 years old, but the incidence rate increases steadily with age and reaches a peak at the age of 60–70 years (Fig. 1·2 in Cancer Research U.K.). On the other hand, gender and lesion site (discrete metadata) are classified into two categories (male, female) and four categories (head and neck, front body, back body, extremities), respectively.
|Metadata||Malignant melanomas, n||Melanocytic benign naevi, n|
|Head and neck||9||32|
|Front of the body||7||30|
|Back of the body||16||69|
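The quantization of metadata described above can be sketched as follows. The interval boundaries and category names come from the text; the function names, the use of whole-year ages and the row-major indexing of combined classes (e.g. age + lesion site, 6 × 4) are illustrative assumptions.

```python
from bisect import bisect_left

# Upper bounds (inclusive, whole years) of the first five age intervals.
AGE_UPPER_BOUNDS = [30, 40, 50, 60, 70]
AGE_LABELS = ["<31", "31-40", "41-50", "51-60", "61-70", ">70"]
GENDERS = ["male", "female"]
SITES = ["head and neck", "front body", "back body", "extremities"]


def age_interval(age_years):
    """Return the index (0-5) of the age interval for an age in whole years."""
    return bisect_left(AGE_UPPER_BOUNDS, age_years)


def combined_index(first_index, second_index, n_second):
    """Index of a combined metadata class, e.g. age + lesion site (6 x 4).

    Assumption: classes are enumerated row by row; any one-to-one
    numbering would serve equally well.
    """
    return first_index * n_second + second_index
```

For example, a 65-year-old patient with a lesion on the back falls in age interval `61-70` (index 4) and site index 2, giving combined class 18 out of 24.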
In addition to the individual metadata described above, combined metadata are also considered in this study to investigate the joint effect of the metadata in melanoma progression. The different combinations of metadata are listed in Table 5.
|Metadata||Number of metadata intervals (continuous) or categories (discrete)|
|M1: age||6|
|M2: gender||2|
|M3: lesion site||4|
|M4: age + gender||6 × 2|
|M5: age + lesion site||6 × 4|
|M6: gender + lesion site||2 × 4|
Integrating digital features and metadata
As stated earlier, we intended to incorporate clinically important metadata and digital features of cutaneous lesions into a completely automatic CAD system. This is based on the hypothesis that MM and benign lesions can be differentiated more accurately and reliably when fusing clinical metadata into the learning process.
In the present study, this new CAD system is achieved by using an extended Laplacian eigenmap, which is a dimensionality-reduction algorithm capable of grouping skin lesion data with similar characteristics into compact clusters. The configuration of the proposed CAD system is illustrated in Figure 1. Each skin lesion image is connected by a weighted edge to neighbouring lesion images that have similar digital features. The similarity of the digital features is quantified through the heat kernel (a similarity measurement), and this value is assigned as the weight of the edge between the two connected lesion images. As the heat kernel is a monotonically decreasing function of the distance between feature vectors, images with high similarity receive large edge weights, while less similar images receive smaller ones. Two disconnected images are assigned zero weight.
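The image-to-image weights described above can be sketched as a nearest-neighbour graph with heat-kernel weights, exp(−‖x_i − x_j‖²/t). The neighbourhood size `k` and the kernel width `t` are not reported in the text; the values below are placeholders chosen only for illustration.

```python
import math


def heat_kernel_weights(features, k=5, t=1.0):
    """Symmetric weight matrix of a k-nearest-neighbour graph.

    features: list of equal-length feature vectors (one per lesion image).
    k, t:     hypothetical neighbourhood size and kernel width.
    """
    n = len(features)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: sqdist(features[i], features[j]))
        for j in others[:k]:
            w = math.exp(-sqdist(features[i], features[j]) / t)
            # An edge exists if either image lists the other as a neighbour.
            weights[i][j] = w
            weights[j][i] = w
    return weights
```

Pairs that are not neighbours keep a weight of zero, which corresponds to the disconnected images mentioned in the text.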
Metadata variables are represented by the support nodes in Figure 1. The number of support nodes in the graph depends on the number of intervals or categories of the metadata in question. Each lesion image is then connected to these support nodes by an extra weighted edge that specifies its metadata class. For discrete metadata, the edge weight between a lesion image and a support node is binary: 1 (connected) or 0 (disconnected). For continuous metadata, the edge weight is determined by the difference between the metadata value of the lesion image and the metadata value represented by the support node.
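A minimal sketch of the lesion-to-support-node weights follows. The binary rule for discrete metadata is taken from the text. For continuous metadata the text says only that the weight depends on the difference between the lesion's value and the value represented by the node; the heat-kernel form, the width `t` and the representative ages in the usage note are assumptions made for illustration.

```python
import math


def discrete_weights(category_index, n_categories):
    """Binary edge weights: 1 for the lesion's own category, 0 otherwise."""
    return [1.0 if c == category_index else 0.0 for c in range(n_categories)]


def continuous_weights(value, node_values, t=100.0):
    """Edge weights that decay with the difference to each support node.

    Assumption: a heat kernel on the difference, with hypothetical width t.
    """
    return [math.exp(-((value - v) ** 2) / t) for v in node_values]
```

For instance, with hypothetical representative ages `[25, 35.5, 45.5, 55.5, 65.5, 75]` for the six age intervals, a 36-year-old patient receives the largest weight on the second support node and progressively smaller weights on the others.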
Finally, image-based weights and metadata-based weights are combined to construct an extended Laplacian matrix. The first few eigenvectors with nonzero eigenvalues of the matrix project digital features and metadata into a lower dimensional space, while preserving the local connections of the graph, as indicated in Figure 1. This configuration associates digital features with clinically important metadata, and attempts to place lesions with similar digital features and the same metadata class close together in the low-dimensional embeddings. As a result, lesions with analogous information (both digital features and metadata) have a large probability of being automatically classified into the same cluster, as MM or benign cutaneous lesions.
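The construction of the extended Laplacian matrix and its embedding can be sketched with NumPy as below. The block layout of the extended adjacency matrix (lesion images first, support nodes last) and the use of the generalized eigenproblem L y = λ D y, solved through the symmetric normalized Laplacian, follow the standard Laplacian eigenmap formulation and are assumptions; the paper's own implementation may differ in detail.

```python
import numpy as np


def extended_laplacian_embedding(W, M, n_components=8, tol=1e-9):
    """Embed lesions using image weights W (n x n) and metadata weights M (n x s).

    Returns the lesion coordinates (n x n_components) and the
    corresponding nonzero eigenvalues, in ascending order.
    """
    W = np.asarray(W, dtype=float)
    M = np.asarray(M, dtype=float)
    n, s = M.shape

    # Extended adjacency: lesion-lesion block plus lesion-support edges.
    A = np.zeros((n + s, n + s))
    A[:n, :n] = W
    A[:n, n:] = M
    A[n:, :n] = M.T

    deg = A.sum(axis=1)
    safe = np.where(deg > 0, deg, 1.0)
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(safe), 0.0)

    # Symmetric normalized Laplacian; its eigenvectors v give the
    # generalized eigenvectors y = D^(-1/2) v of L y = lambda D y.
    L_sym = np.eye(n + s) - inv_sqrt[:, None] * A * inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(L_sym)

    keep = evals > tol  # discard the trivial zero eigenvalue(s)
    Y = inv_sqrt[:, None] * evecs[:, keep][:, :n_components]
    return Y[:n], evals[keep][:n_components]
```

The rows of the returned array are the low-dimensional descriptors of the lesions; support nodes take part in the eigenproblem but are dropped from the output. A lesion with missing metadata simply has an all-zero row in `M`.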
In the experiment, the first eight eigenvectors of the extended Laplacian matrix with nonzero eigenvalues were chosen as the diagnostic descriptors for classification. A support vector machine was utilized as the classifier, and a tenfold cross validation was employed as the training–testing strategy. The whole program was executed 50 times and the average statistics recorded as final results.
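The training and testing protocol (tenfold cross-validation, repeated 50 times and averaged) can be sketched in plain Python. The classifier is passed in as a callable, so a support vector machine from any library could be plugged in; the stratified fold assignment and the function names are assumptions, as the text does not say how the folds were drawn.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Split indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for lab in sorted(by_class):
        members = by_class[lab]
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[(pos + offset) % k].append(idx)
        offset += len(members)
    return folds


def repeated_cross_validation(fit_predict, X, y, k=10, repeats=50):
    """Mean accuracy over `repeats` runs of k-fold cross-validation.

    fit_predict(train_X, train_y, test_X) must return predicted labels.
    """
    accuracies = []
    for run in range(repeats):
        correct = 0
        for fold in stratified_folds(y, k, seed=run):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            predictions = fit_predict(
                [X[i] for i in train], [y[i] for i in train], [X[i] for i in fold]
            )
            correct += sum(p == y[i] for p, i in zip(predictions, fold))
        accuracies.append(correct / len(y))
    return sum(accuracies) / len(accuracies)
```

With 79 MMs and 226 naevi, each fold holds 30 or 31 lesions, and every lesion is tested exactly once per run.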
Receiver operating characteristic (ROC) analysis was applied to investigate the sensitivity and specificity of the classification performance for differentiating MMs from benign naevi. The area under the ROC curve was also calculated, with a confidence interval of 95%.
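As a reminder of what the area under the ROC curve measures, the sketch below computes it directly as the probability that a randomly chosen MM receives a higher classifier score than a randomly chosen naevus (ties count one half). This pairwise formulation is a standard equivalent of integrating the ROC curve; it is not taken from the paper, and the confidence interval is not computed here.

```python
def area_under_roc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    y_true: 1 for melanoma, 0 for benign naevus.
    scores: classifier outputs, larger meaning more likely melanoma.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute the AUC")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```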
To assess the usefulness of the metadata further, an unpaired t-test was performed between the distribution of results output by classification with the digital features alone and that obtained by classification incorporating each form of metadata. The P-value was used to estimate the difference between the distributions, with P < 0·001 considered statistically significant.
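The test statistic of an unpaired t-test can be sketched as follows. The text does not say whether equal variances were assumed, so the sketch uses Welch's form, which does not require them; this choice is an assumption. Converting the statistic to a P-value needs the cumulative t distribution, which is left to a statistics library.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's unpaired t statistic and its approximate degrees of freedom."""
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    dof = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof
```

Here the two samples would be, for example, the 50 accuracy values obtained with digital features alone and the 50 values obtained with one form of metadata added.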
Diagnostic accuracy with and without clinical metadata
As shown in Table 6, the diagnostic accuracy significantly increases when the appropriate metadata [e.g. patient's age (M1), lesion site (M3), patient's age + lesion site (M5)] are imported as additional information into the learning process, as opposed to the classification using digital features only (sensitivity 82·1%, specificity 86·1%). Considering that the most important objective of melanoma diagnosis is to maximize the identification capability of malignant lesions, incorporating the patient's age (M1) as metadata was deemed the best feature combination, giving 95·2% sensitivity, 91·0% specificity and 92·1% overall diagnostic accuracy.
|Metadata||SE (%)||SP (%)||Acc. (%)||AUC||P-value|
Figure 2 shows how the patient's age can be useful in the classification process. For visualization purposes, only the first two eigenvectors with nonzero eigenvalues in the Laplacian matrix were used to construct a 2D embedding of cutaneous lesion features. An improved separation between MM and benign naevi can be observed in Figure 2b, compared with that in Figure 2a. This improvement is especially obvious near the hyperplane, where the scattered misclassified cutaneous lesions in Figure 2a become compact and close to the hyperplane, after age was incorporated as metadata. There are 43 elements misclassified if only digital features are used, while this decreases to 32 when digital features are combined with the patient's age as metadata.
Assessment of clinical diagnostic accuracy
Table 7 presents the comparison between clinical evaluation on the basis of macroscopic and dermoscopic criteria by experienced dermatologists, and histological examination after excision biopsy. If those lesions reported as ‘suspicious’ are considered to be a misdiagnosis of melanoma, the dermatologists differentiating MM from benign naevi have 70% (55/79) sensitivity and 86·7% [(54 + 142)/226] specificity. On the other hand, if those lesions reported as ‘suspicious’ are regarded as having a correct diagnosis of melanoma, the sensitivity rises to 90% [(55 + 16)/79] and the specificity drops to 62·8% (142/226).
|Malignant melanomas (n = 79)||Benign melanocytic naevi (n = 226)|
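The two scenarios above can be checked with a few lines of arithmetic. The counts are those quoted in the text (55 MMs reported as malignant, 16 MMs reported as suspicious, 142 naevi reported as benign, 54 naevi reported as suspicious); the function name is illustrative.

```python
MM_TOTAL, BENIGN_TOTAL = 79, 226
MM_CALLED_MALIGNANT, MM_CALLED_SUSPICIOUS = 55, 16
BENIGN_CALLED_BENIGN, BENIGN_CALLED_SUSPICIOUS = 142, 54


def clinical_rates(suspicious_counts_as_melanoma):
    """Return (sensitivity %, specificity %) of the clinical assessment."""
    if suspicious_counts_as_melanoma:
        true_positives = MM_CALLED_MALIGNANT + MM_CALLED_SUSPICIOUS
        true_negatives = BENIGN_CALLED_BENIGN
    else:
        true_positives = MM_CALLED_MALIGNANT
        true_negatives = BENIGN_CALLED_BENIGN + BENIGN_CALLED_SUSPICIOUS
    return (
        100.0 * true_positives / MM_TOTAL,
        100.0 * true_negatives / BENIGN_TOTAL,
    )
```

Treating 'suspicious' as a missed melanoma gives 69·6% sensitivity and 86·7% specificity; treating it as a correct call gives 89·9% and 62·8%, which shows the trade-off between the two readings.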
Differentiation of malignant melanoma from other melanocytic naevi
The above experiment demonstrated that the incorporation of age (M1) could generally improve the sensitivity and specificity of classification. Therefore, we further investigated which subtypes of skin lesions led to this improvement.
The results in Table 8 demonstrate that halo naevi, blue naevi and T2 MMs (1–2 mm thick) have high diagnostic accuracy (> 90%), both with and without metadata. The classification rates are greatly enhanced, by 4–25%, and reach 83·0–100% for early melanoma (MM in situ, MM < 1 mm), T3 MMs (2–4 mm), Spitz naevi, congenital naevi, acquired naevi and dysplastic naevi, when the patient's age is applied as metadata in the classification.
|Lesion type||Without metadata (%)||With age (M1) as metadata (%)||P-value|
|Tis (MM in situ)||73·9||87·0||< 0·001|
|T1 (< 1 mm)||83·8||97·3||< 0·001|
|T2 (1–2 mm)||93·3||100||< 0·001|
|T3 (2–4 mm)||75||100||0·08|
|Melanocytic naevi||86·1||91·0||< 0·001|
|Acquired naevi||88·1||91·7||< 0·001|
|Dysplastic naevi||75·5||83·0||< 0·001|
|Spitz naevi||87·0||100||< 0·001|
|Congenital naevi||87·5||100||< 0·001|
Our experimental results indicate that CAD of melanoma can be enhanced by incorporating appropriate patient metadata along with conventional digital lesion features. After incorporating the patient's age (M1) as metadata, a marked improvement in terms of the correct diagnostic rate was observed (Table 8), especially for early melanomas (MM in situ, MM < 1 mm thick), and congenital naevi and Spitz naevi in the benign melanocytic category. This illustrates that although early melanomas may have similar digital features to some types of atypical melanocytic naevi, the nature of the lesions can be entirely different and distinguishable. This property can be utilized for increasing automatic melanoma diagnostic accuracy by applying the CAD system proposed here.
It should be noted that while some metadata improve the automatic recognition rate of melanoma, other forms of metadata do not show such diagnostic effectiveness. The usefulness of different forms of metadata is inherently associated with the complex pathological causes of skin lesions. As a result, melanoma can be recognized with high confidence if highly correlated metadata (e.g. the patient's age) are considered in the classification. However, less well-correlated metadata (e.g. the patient's gender) do not contribute, and may even reduce diagnostic accuracy, if they are incorporated on their own with digital features. Even so, the sensitivity derived from the combination of gender and lesion site (M6) is 2–6% higher than the results obtained by incorporating these two types of metadata individually. Therefore, combining two forms of metadata may have a joint effect in characterizing cutaneous lesions. This is consistent with previous findings that the distribution of MM over the body varies by gender. For example, it was reported that > 40% of melanomas in men arise on the trunk, while the most common site of melanoma in women is the legs (Fig. 1·8 in Cancer Research U.K.). Hence, a two-layered support node model combining two forms of metadata can strengthen the CAD of melanoma.
In clinical diagnosis, dermatologists tend to overestimate the likely malignancy of skin lesions in order to maximize their chances of identifying MMs and to avoid missing cases. This carries an attendant risk of misdiagnosis: in this study many benign naevi were misclassified as MMs or suspicious lesions (Table 7). If lesions reported as suspicious are counted as a correct diagnosis of melanoma, clinical assessment and computer-based analysis achieve a comparable range of sensitivity for excised cutaneous lesions (89·9% vs. 81·9–95·2%, Table 6), whereas examination by dermatologists shows much lower specificity than CAD (62·8% vs. 85·9–91·0%, Table 6).
Furthermore, it is worth noting that the absence of a few metadata, due to the nature of clinical data collection, results only in disconnections between particular data points and their corresponding support nodes, and does not change the global structure of the proposed CAD system. Therefore, this new CAD system can deal with incomplete metadata, making it especially meaningful for clinical applications.
In conclusion, this paper proposes an innovative CAD system to enhance the diagnostic accuracy of melanoma by incorporating clinical metadata with digital image features. Following the present study, we plan to extend the two-layered support node model to a multilayered support node model (Fig. 1), which will provide a joint effect for multiple metadata (more than two) for automatic diagnosis of melanoma. In addition, the benefits of introducing clinical metadata into a CAD system of melanoma will be further tested on a larger number of experimental data captured by different data acquisition settings.
- 2. Cancer Research U.K. Skin cancer incidence statistics. Available at http://info.cancerresearchuk.org/cancerstats/types/skin/incidence (last accessed 2 September 2013).
- 7. Analysis of pigmented skin lesion border irregularity using the harmonic wavelet transform. In: 13th International Machine Vision and Image Processing Conference. Dublin: Institute of Electrical and Electronics Engineers, 2009; 18–23.
- 9. 3D visualisation of skin lesions in photogrammetry. J Invest Dermatol 2007; 127 (Suppl. 2):S19.
- 12. ABCD rule of dermoscopy: a new practical method for early recognition of melanoma. Eur J Dermatol 1994; 4:521–7.
- 17. Bilateral filtering for gray and color images. In: Proceedings of the 1998 IEEE International Conference on Computer Vision. Bombay: Institute of Electrical and Electronics Engineers, 1998; 839–46.
- 20. Computer-aided diagnosis of melanoma – a photometric stereo based approach. PhD thesis, University of the West of England, Bristol, 2010.
- 25. Innovative lesion modelling for computer-assisted diagnosis of melanoma. PhD thesis, University of the West of England, Bristol, 2012.