Background and Objectives
A robust, probability-based diagnostic algorithm is essential for the successful clinical use of optical spectroscopy in cancer diagnosis. This study reports the use of the relevance vector machine (RVM), a recent Bayesian machine-learning framework for statistical pattern recognition, to develop a fully probabilistic algorithm for autofluorescence diagnosis of early-stage cancer of the human oral cavity. It also presents a comparative evaluation of the diagnostic efficacy of the RVM algorithm against that of an algorithm based on the support vector machine (SVM), which has recently received considerable attention for this purpose.
Study Design/Materials and Methods
The diagnostic algorithms were developed using in vivo autofluorescence spectral data acquired from the human oral cavity with an N2 laser-based portable fluorimeter. Spectral data from both patients and normal volunteers, enrolled at the Outpatient Department of the Govt. Cancer Hospital, Indore, for screening of the oral cavity, were used for this purpose. The patients selected had no prior confirmed malignancy and were diagnosed with squamous cell carcinoma (SCC), Grade-I, on the basis of histopathology of a biopsy taken from the abnormal site after the spectra were acquired. Autofluorescence spectra were recorded from a total of 171 tissue sites from 16 patients and 154 healthy squamous tissue sites from 13 normal volunteers. Of the 171 tissue sites from patients, 83 were SCC and the rest were contralateral uninvolved squamous tissue. Each site was treated separately and classified by the diagnostic algorithm developed. The data from normal volunteers, rather than the spectral data from the uninvolved sites of patients, were used as the normal database for developing the diagnostic algorithms.
Results
The diagnostic algorithms based on RVM were found to provide classification performance comparable to that of state-of-the-art SVMs, while also explicitly predicting the probability of class membership. The sensitivity and specificity towards cancer were up to 88% and 95% for the training set data, based on leave-one-out cross-validation, and up to 91% and 96% for the validation set data. When applied to the spectral data of the uninvolved oral cavity sites from the patients, the algorithm yielded a specificity of up to 91%.
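The sensitivity and specificity figures quoted above follow the usual definitions: the fraction of cancer sites correctly flagged, and the fraction of normal sites correctly cleared. A minimal sketch of that bookkeeping is given below. The function name, the label coding (1 for SCC, 0 for normal) and the example labels are illustrative assumptions, not the study's data or code.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Assumed coding (hypothetical, not from the study): `positive` marks an
    SCC site and any other label marks a normal site.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # cancer site correctly called cancer
            else:
                fn += 1  # cancer site missed
        else:
            if pred == positive:
                fp += 1  # normal site wrongly called cancer
            else:
                tn += 1  # normal site correctly called normal
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one site of each class")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels for illustration only: 4 cancer sites, 5 normal sites.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In a leave-one-out setting, each entry of `y_pred` would be the prediction for a site made by a model trained on all the other sites; the counting itself is unchanged.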
Conclusions
The Bayesian framework of the RVM formulation makes it possible to predict the posterior probability of class membership when discriminating early SCC from normal squamous tissue sites of the oral cavity, in contrast to the dichotomous classification provided by the non-Bayesian SVM. Such probabilistic classification helps in handling asymmetric misclassification costs, for example by weighting a false-negative result for cancer differently from a false-positive one. The results further demonstrate that, for comparable diagnostic performance, the RVM-based algorithms use significantly fewer kernel functions and do not need to estimate any ad hoc parameters associated with the learning or optimization technique used. This implies a considerable saving in memory and computation in a practical implementation.
Lasers Surg. Med. 36:323–333, 2005. © 2005 Wiley-Liss, Inc.
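One way to see why a posterior probability helps with asymmetric misclassification costs is the standard Bayes decision rule for two classes: a site is called cancer when the expected cost of calling it normal exceeds the expected cost of calling it cancer. The sketch below illustrates that rule only. The function names and the cost values are hypothetical, and this is not presented as the procedure used in the study.

```python
def decision_threshold(cost_false_negative, cost_false_positive):
    """Posterior probability above which calling 'cancer' has lower expected cost.

    Calling a site normal costs p * cost_false_negative on average, and
    calling it cancer costs (1 - p) * cost_false_positive, where p is the
    posterior probability of cancer. Equating the two gives the threshold.
    """
    if cost_false_negative <= 0 or cost_false_positive <= 0:
        raise ValueError("costs must be positive")
    return cost_false_positive / (cost_false_negative + cost_false_positive)


def classify(posterior_cancer, cost_false_negative=1.0, cost_false_positive=1.0):
    """Label a site from its posterior probability of cancer and the two costs."""
    if not 0.0 <= posterior_cancer <= 1.0:
        raise ValueError("posterior must lie in [0, 1]")
    threshold = decision_threshold(cost_false_negative, cost_false_positive)
    return "cancer" if posterior_cancer > threshold else "normal"


# Hypothetical posterior of 0.3 for one tissue site.
equal_costs = classify(0.3)                            # threshold 0.5
costly_miss = classify(0.3, cost_false_negative=5.0)   # threshold 1/6
print(equal_costs, costly_miss)
```

With equal costs the threshold is 0.5, so a site with posterior 0.3 is labelled normal; if a missed cancer is taken to be five times as costly as a false alarm, the threshold drops to 1/6 and the same site is flagged. A classifier that outputs only a hard label offers no such adjustment after training.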