Ultrasound methods to distinguish between malignant and benign adnexal masses in the hands of examiners with different levels of experience
Article first published online: 8 SEP 2009
Copyright © 2009 ISUOG. Published by John Wiley & Sons, Ltd.
Ultrasound in Obstetrics & Gynecology
Volume 34, Issue 4, pages 454–461, October 2009
How to Cite
Van Holsbeke, C., Daemen, A., Yazbek, J., Holland, T. K., Bourne, T., Mesens, T., Lannoo, L., De Moor, B., De Jonge, E., Testa, A. C., Valentin, L., Jurkovic, D. and Timmerman, D. (2009), Ultrasound methods to distinguish between malignant and benign adnexal masses in the hands of examiners with different levels of experience. Ultrasound Obstet Gynecol, 34: 454–461. doi: 10.1002/uog.6443
- Issue published online: 29 SEP 2009
- Manuscript Accepted: 19 FEB 2009
- Swedish Medical Research Council. Grant Number: K2006-73X-11605-11-3
- ovarian neoplasms
- pattern recognition
- risk of malignancy
- statistical models
- subjective impression
To determine the effect of an ultrasound training course on the performance of pattern recognition when used by less experienced examiners, and to compare the performance of pattern recognition, a logistic regression model and a scoring system for estimating the risk of malignancy when used by examiners with different levels of experience.
Using ultrasound images of selected adnexal masses, two trainees classified the masses as benign or malignant by using pattern recognition both before and after they had attended a theoretical gynecological ultrasound course. They also classified the masses by using a logistic regression model and a scoring system, but only after they had attended the course. The performance of the three methods in the hands of the trainees was then compared with their performance in the hands of experts.
One hundred and sixty-five adnexal masses were included, of which 42% were malignant (21% invasive tumors and 21% borderline tumors). The area under the receiver–operating characteristics curve of pattern recognition when used by the trainees was similar before and after they had attended the course. Training decreased sensitivity (84% vs. 70% for Trainee 1, P = 0.004; 70% vs. 61% for Trainee 2, P = 0.058) and increased specificity (77% vs. 92% for Trainee 1, P = 0.001; 89% vs. 95% for Trainee 2, P = 0.058). The performance of pattern recognition was poorer in the hands of the trainees than in the hands of the experts. The sensitivities of the logistic regression model were 70% and 54% for the trainees vs. 83% for an expert (P = 0.020 and < 0.001, respectively) and the specificities were 84% and 94% vs. 89% (P = 0.25 and 0.59, respectively). The sensitivities of the scoring system were 59% and 54% for the trainees vs. 75% for the expert (P = 0.002 and < 0.001, respectively), and the specificities were 90% and 93% vs. 85% (P = 0.103 and 0.008, respectively).
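The sensitivity and specificity figures above are proportions taken from a 2 × 2 table of predicted against true diagnosis, and the P values compare two sets of classifications made on the same masses. The following is a minimal Python sketch of that arithmetic. It assumes an exact McNemar test for the paired comparison, which is a common choice but is not named in the abstract; the function names and all counts are hypothetical and are not data from the study.

```python
from math import comb


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # malignant masses correctly called malignant
    specificity = tn / (tn + fp)  # benign masses correctly called benign
    return sensitivity, specificity


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar P value from the discordant pair counts.

    b: masses classified correctly under condition A only
    c: masses classified correctly under condition B only
    (Assumed test; the abstract does not state which test was used.)
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) lower tail up to k, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts for illustration only (not taken from the study).
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=45, fp=5)
p_value = mcnemar_exact_p(b=2, c=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, P={p_value:.3f}")
```

Only the discordant pairs enter the paired test, which is why two examiners with similar overall accuracy can still differ significantly in sensitivity or specificity.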
Theoretical ultrasound teaching did not seem to improve the performance of pattern recognition in the hands of trainees. A logistic regression model and a scoring system to classify adnexal masses as benign or malignant perform less well when used by inexperienced examiners than when used by an expert. Before using a model or a scoring system, experience and/or proper training are likely to be of paramount importance if diagnostic performance is to be optimized.