An adnexal tumor may be found in women presenting with gynecological complaints, or it may be an incidental finding. The finding of an adnexal mass usually raises anxiety because of the possibility of malignancy. Imaging methods, particularly ultrasound imaging, are almost always used to determine the nature of a mass [1]. In the hands of an experienced ultrasound examiner, subjective evaluation of ultrasound findings, i.e. pattern recognition, is an excellent method for discriminating between benign and malignant masses [2, 3], and it may also be used to make a specific diagnosis, for example dermoid cyst, endometrioma or hydrosalpinx [4]. In the original publication in which the term pattern recognition was first mentioned, it was used to describe subjective evaluation of gray-scale ultrasound findings [4]. However, the term has now been extended to include subjective evaluation of Doppler ultrasound findings [3], even though it has been shown that subjective evaluation of Doppler ultrasound findings adds very little to subjective evaluation of gray-scale ultrasound findings [4].
A correct indication of the nature of an adnexal mass is important for choosing appropriate treatment. Some masses are probably best treated expectantly if they do not cause any symptoms (e.g. functional cysts) and are not associated with reproductive dysfunction (e.g. hydrosalpinx and uterine myoma); some are possibly best treated expectantly if they do not cause symptoms (e.g. small dermoid cysts [5]); others might be treated by cyst puncture (e.g. peritoneal cysts [6]); and others by surgery (e.g. borderline tumors and invasive malignancies).
The aim of this study was to determine the sensitivity and specificity of subjective evaluation of gray-scale and Doppler ultrasound findings (here called pattern recognition) when used by experienced ultrasound examiners with regard to making a specific diagnosis in adnexal masses, and to determine which histological diagnoses are most likely to be confused with each other at ultrasound examination.
We used the information in the database of the International Ovarian Tumor Analysis (IOTA) study Phase 1. Phase 1 of the IOTA study is a multicenter study including nine European ultrasound centers. It was approved by the local ethics committees and has been described in detail in another publication [7]. All participants gave written informed consent to participate. The design of the study is outlined only briefly below.
Women with at least one adnexal mass were recruited into the study from 1 June 1999 to 30 June 2002. They were examined with gray-scale and color Doppler ultrasonography by experienced ultrasound examiners who used high-end ultrasound systems equipped with high-frequency vaginal probes, a standardized examination technique, and standardized terms and definitions [8]. About 80% of the examinations were carried out by examiners defined as Level III examiners using the terminology of the European Federation of Ultrasound in Medicine and Biology, i.e. the examiners worked in tertiary referral centers, had an academic record, and had a high level of experience and expertise [9]. Moreover, all examiners had a special interest in ultrasound diagnosis of adnexal masses, and some had performed up to 16 000 gynecological ultrasound examinations by the start of the study.
All women underwent a transvaginal scan; transabdominal sonography was added when large masses could not be fully visualized via the transvaginal route. On the basis of subjective evaluation of gray-scale and color Doppler findings (pattern recognition [4, 10]), the ultrasound examiner classified each mass as: certainly benign, probably benign, difficult to classify as benign or malignant (complete uncertainty), probably malignant or certainly malignant. Even when the examiner found the mass difficult to classify, he/she was obliged to state whether the mass was more likely to be benign or malignant. On the basis of published descriptions of typical gray-scale and color Doppler ultrasound findings of various specific diagnoses [4, 10], the examiner also suggested a specific histological diagnosis (e.g. endometrioma, dermoid cyst or hydrosalpinx) whenever this was possible. The ultrasound examiner had no knowledge of the patient's serum CA 125 values when suggesting a diagnosis but was aware of the patient history.
The reference standard was the histology of the surgically removed adnexal tumors, the tumors being classified and malignant tumors staged according to the criteria recommended by the International Federation of Gynecology and Obstetrics [11]. Both the ultrasound diagnoses and the histological diagnoses were grouped retrospectively into 22 categories (Table 1). For subjective assessment of the mass on the basis of ultrasound findings, in addition to these 22 categories, a 23rd category was added, i.e. ‘don't know’. Only patients operated on ≤ 120 days after the ultrasound examination were included and, in the case of bilateral or multiple masses in the same patient, only the most complex mass (or, if all masses had similar ultrasound morphology, the largest one or the one most easily accessible by ultrasound examination) was used in our analysis.
Table 1. Histopathological diagnoses (n = 1066)
| Diagnosis | n (%) |
|---|---|
| Endometrioma | 199 (18.7) |
| Serous cyst/cystadenoma | 149 (14.0) |
| Teratoma | 116 (10.9) |
| Mucinous cyst/cystadenoma | 86 (8.1) |
| Adenofibroma | 39 (3.7) |
| Two or more diagnoses* | 35 (3.3) |
| Functional cyst | 24 (2.2) |
| Hydrosalpinx, hematosalpinx | 21 (2.0) |
| Paraovarian/parasalpingeal cyst | 21 (2.0) |
| Inflammatory process† | 20 (1.9) |
| Fibroma/fibrothecoma | 19 (1.8) |
| Simple cyst | 18 (1.7) |
| Other‡ | 13 (1.2) |
| Leiomyoma | 12 (1.1) |
| Rare benign tumor (excluding struma ovarii)§ | 9 (0.8) |
| Torsion of lesion | 9 (0.8) |
| Peritoneal pseudocyst | 5 (0.5) |
| Struma ovarii | 5 (0.5) |
| Primary invasive tumor | 144 (13.5) |
| Stage 1 | 42 |
| Stage 2, 3 or 4 | 102 |
| Borderline tumor | 55 (5.1) |
| Metastatic tumor | 42 (3.9) |
| Rare malignant tumor¶ | 25 (2.3) |
In this analysis, a woman was considered to be postmenopausal if she reported no menstruation for at least 1 year after the age of 40 years, provided that the amenorrhea was not explained by pregnancy, medication or disease. Women who were 50 years or older and had undergone a hysterectomy, so that the time of menopause could not be determined, were also defined as postmenopausal.
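The definition of menopausal status above can be written as a small decision rule. The following is only a sketch of one possible reading: the function name and its parameters are hypothetical, "at least 1 year after the age of 40 years" is interpreted as at least one year of amenorrhea accrued after the 40th birthday, and the case of a hysterectomy before the age of 50, which the text does not address, is returned as undetermined.

```python
from typing import Optional


def is_postmenopausal(
    age: float,
    amenorrhea_years: Optional[float],
    hysterectomy: bool = False,
    explained_by_other_cause: bool = False,
) -> Optional[bool]:
    """Hypothetical encoding of the menopausal-status rule described in the text.

    Returns True/False, or None when the text gives no rule for the case.
    """
    if hysterectomy:
        # Time of menopause cannot be determined; the text classifies by age alone.
        # Assumption: under 50 years the status is left undetermined (None).
        return True if age >= 50 else None
    if amenorrhea_years is None or explained_by_other_cause:
        # No amenorrhea reported, or amenorrhea explained by pregnancy,
        # medication or disease: not counted as postmenopausal.
        return False
    # Assumption: the year of amenorrhea must fall after the 40th birthday.
    return min(amenorrhea_years, age - 40) >= 1


print(is_postmenopausal(55, 3))                        # natural menopause
print(is_postmenopausal(38, 2))                        # amenorrhea before age 40
print(is_postmenopausal(52, None, hysterectomy=True))  # hysterectomy, aged 50 or over
```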
Statistical analyses were carried out using the Statview 4.5™ statistical program (Abacus Concepts, Inc., Berkeley, CA, USA). The diagnostic performance of pattern recognition was expressed as the accuracy, sensitivity, specificity, and positive and negative likelihood ratios. The 95% CIs of sensitivity, specificity, and likelihood ratios were calculated using the Evidence-Based Medicine (EBM) Calculator Version 1.2 (http://www.cebm.utoronto.ca/palm/ebmcalc/).
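For readers who wish to reproduce measures of this kind from a 2 × 2 table, a minimal sketch follows. It is not the software used in the study: the Wilson score interval for sensitivity and specificity and the log method for likelihood ratios are assumptions on our part (the methods implemented in the EBM Calculator are not stated here), the function names are illustrative, and the counts in the example are hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wilson_ci(successes, total, z=Z95):
    """Wilson score interval for a proportion (assumed method, see text above)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def ratio_ci(a, n1, b, n2, z=Z95):
    """Log-method interval for the ratio (a/n1) / (b/n2); requires a > 0 and b > 0."""
    ratio = (a / n1) / (b / n2)
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)
    return ratio * math.exp(-z * se), ratio * math.exp(z * se)


def diagnostic_performance(tp, fn, fp, tn):
    """Accuracy, plus (estimate, 95% CI) pairs for the other measures."""
    with_dx, without_dx = tp + fn, fp + tn
    return {
        "accuracy": (tp + tn) / (with_dx + without_dx),
        "sensitivity": (tp / with_dx, wilson_ci(tp, with_dx)),
        "specificity": (tn / without_dx, wilson_ci(tn, without_dx)),
        "lr_positive": ((tp / with_dx) / (fp / without_dx),
                        ratio_ci(tp, with_dx, fp, without_dx)),
        "lr_negative": ((fn / with_dx) / (tn / without_dx),
                        ratio_ci(fn, with_dx, tn, without_dx)),
    }


# Hypothetical counts for one specific diagnosis; not taken from the study.
result = diagnostic_performance(tp=90, fn=10, fp=20, tn=880)
for name, value in result.items():
    print(name, value)
```

The likelihood-ratio intervals are undefined when a cell in the relevant ratio is zero, which is a practical concern for the rarer diagnoses.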
To the best of our knowledge this is the first multicenter study evaluating the ability of ultrasound examiners using subjective evaluation of gray-scale and Doppler ultrasound findings (pattern recognition) to make a specific diagnosis of an adnexal mass. It is a strength that our study is a multicenter study involving many examiners and many tumors, because this makes our results likely to be more generalizable than small single-center studies. On the other hand, despite our study being large, the number of some specific diagnoses (e.g. peritoneal pseudocyst, rare benign tumor, struma ovarii, fibroma/fibrothecoma, leiomyoma and simple cyst) is too small for a precise estimate of sensitivity and specificity with regard to these diagnoses to be possible (see CIs for sensitivity, specificity, and positive and negative likelihood ratios).
Our results agree well with those from single centers in that the sensitivity and specificity of pattern recognition (i.e. subjective evaluation of gray-scale ultrasound findings with or without subjective evaluation of Doppler ultrasound findings) with regard to benign teratoma/dermoid cyst, endometrioma and hydrosalpinx were high [4, 12–25].
Some studies have evaluated the sensitivity and specificity of pattern recognition with regard to paraovarian cysts [4, 26] and mucinous and serous cystadenomas [15, 27], and all included rather few cases (three and 17 cases of paraovarian cyst, 18 and 38 cases of serous cystadenoma, and 21 cases of mucinous cystadenoma). The sensitivity with regard to these three diagnoses was substantially higher in the studies cited than in ours, whereas the specificity was similar. The low sensitivity of pattern recognition with regard to serous and mucinous cystadenomas and adenofibromas in our study is explained either by an incorrect specific diagnosis having been assigned to the tumor or by no specific diagnosis (‘don't know’) having been suggested by the ultrasound examiner. As illustrated in Figure 1, the ultrasound morphology of serous and mucinous cystadenomas may overlap extensively. The ultrasound morphology of serous and mucinous cystadenomas also overlaps to some extent with that of adenofibromas (Figure 2). Indeed, serous cyst/cystadenoma was the most frequent incorrect ultrasound diagnosis in our series. While endometriomas and dermoid cysts were confused, albeit very rarely, with a variety of other conditions (with no particular pathology being over-represented among the misdiagnoses), serous cysts, adenofibromas, simple cysts, hydrosalpinx, functional cysts and paraovarian/parasalpingeal cysts were often confused with each other. This illustrates that many of the latter pathologies do not have a pathognomonic appearance at ultrasound examination.
We know of only one study reporting on the sensitivity and specificity of pattern recognition with regard to peritoneal cysts and fibromas/fibrothecomas. In that study there were only three cases of peritoneal cyst and nine cases of fibroma/fibrothecoma [4]. The reported sensitivity and specificity with regard to peritoneal cyst in the study cited were 100% and 99% vs. 80% and 99% in our study, and the sensitivity and specificity with regard to fibroma/fibrothecoma were 56% and 100% vs. 42% and 99% in our study. We have found no study reporting on the sensitivity and specificity of pattern recognition with regard to ovarian adenofibroma.
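How little such small series constrain sensitivity can be illustrated with exact binomial intervals. The sketch below uses the Clopper-Pearson method, which is our choice for illustration and not necessarily the method behind the CIs reported in this article. The numerators are back-calculated by us from the percentages and case numbers quoted above (3/3 and 5/9 in the study cited, 4/5 and 8/19 in ours) and should be read as assumptions.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact (Clopper-Pearson) interval for x successes out of n, by bisection."""

    def solve(k, target):
        # binom_cdf(k, n, p) decreases as p increases; find p where it equals target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(x - 1, 1 - alpha / 2)
    upper = 1.0 if x == n else solve(x, alpha / 2)
    return lower, upper


# Counts inferred from the reported percentages; assumptions, not published data.
series = {
    "peritoneal cyst, study cited": (3, 3),
    "peritoneal cyst, this study": (4, 5),
    "fibroma/fibrothecoma, study cited": (5, 9),
    "fibroma/fibrothecoma, this study": (8, 19),
}
for label, (x, n) in series.items():
    low, high = clopper_pearson(x, n)
    print(f"{label}: {x}/{n} = {x / n:.0%} (95% CI {low:.0%} to {high:.0%})")
```

With only three cases, even a sensitivity of 100% is compatible with a true sensitivity well below 50%, so the apparent differences between the two studies for these diagnoses should be interpreted with caution.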
Typical ultrasound findings in different types of ovarian malignancies and of metastases in the ovaries of tumors of different primary origin have been described [28, 29] but, to the best of our knowledge, there are no published reports on the sensitivity and specificity of pattern recognition with regard to a diagnosis of primary invasive ovarian cancer, rare malignant ovarian tumors or metastatic tumors in the ovary. However, one study reported the sensitivity and specificity of pattern recognition with regard to ovarian borderline tumors [30]. That study comprised 35 borderline tumors, 99 benign tumors and 32 invasive ovarian malignancies. The reported sensitivity with regard to borderline tumor was 69% and the specificity 94%. Our sensitivity with regard to borderline tumor was much lower (29%) and our specificity was only marginally higher (98%), with 9% of our borderline tumors not having been assigned a specific diagnosis, 38% having been classified as a benign cyst (vs. 29% in the study cited) and 24% as invasive malignancies (vs. 3% in the study cited). Differences in tumor population are likely to contribute to the discrepant results, even though the proportion of serous and mucinous borderline tumors was similar in the two studies.
It is important to be aware that the apparent sensitivity and specificity of pattern recognition with regard to specific diagnoses are affected not only by the skill of the ultrasound examiner and the quality of the ultrasound equipment used but also, very importantly, by the mix of tumor types. Our study population comprises only adnexal tumors that were removed surgically (a prerequisite for obtaining the true diagnosis). This means that masses with atypical or complex ultrasound morphology are likely to be more common in our study population than in a total population of adnexal masses, because an unequivocal ultrasound diagnosis of, for example, uterine leiomyoma, peritoneal cyst, paraovarian cyst, hydrosalpinx, simple cyst or functional cyst is less likely to result in the mass being removed surgically than an equivocal diagnosis. For example, in our series most functional cysts had been misdiagnosed as non-functional cysts, e.g. as endometriomas or cystadenomas, whereas in reality most functional cysts are recognized as such and are not removed surgically. Moreover, two of the centers contributing cases to this study were tertiary referral centers affiliated to cancer centers. For this reason, too, tumors with equivocal, unusual or complicated ultrasound morphology are likely to be over-represented in our study. All of the above means that both the sensitivity and specificity of pattern recognition with regard to specific diagnoses may have been underestimated in our study. The much higher sensitivity with regard to serous and mucinous cystadenoma reported in other studies (sensitivity with regard to serous cystadenoma 70–78% vs. 54% in our study; sensitivity 95% with regard to mucinous cystadenoma vs. 36% in our study) may be explained by a rather ‘artificial’ mix of tumors in the other studies [15, 27].
The apparent sensitivity and specificity are also affected by the willingness of the ultrasound examiner to suggest a specific diagnosis if he/she is not completely certain about the diagnosis. As clearly illustrated by our results, this willingness is likely to differ between examiners (a single examiner who contributed 30% of the cases to our study was responsible for 63% of the ‘don't know’ diagnoses). Some of the ultrasound examiners might have guessed a diagnosis and entered their guess into the study protocol, whereas they might not have been confident enough to suggest the same diagnosis in a clinical report. Moreover, there is almost certainly some interobserver disagreement between pathologists when they assign a diagnosis to an adnexal mass. All the above complicates not only the estimation of sensitivity and specificity but also the determination of which diagnoses are likely to be confused with each other at a scan.
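The way in which ‘don't know’ answers and misclassifications enter the calculation can be made explicit with a one-vs-rest tabulation, sketched below under stated assumptions. The function names and the toy cases are hypothetical and are not study data. In this sketch a ‘don't know’ answer counts as a negative test result for every specific diagnosis, so it lowers sensitivity without affecting specificity, and it is left out of the tally of confused diagnoses.

```python
from collections import Counter

DONT_KNOW = "don't know"


def one_vs_rest(pairs, diagnosis):
    """Return (tp, fn, fp, tn) for one diagnosis from (ultrasound, histology) pairs."""
    tp = fn = fp = tn = 0
    for ultrasound, histology in pairs:
        suggested = ultrasound == diagnosis  # "don't know" never matches a diagnosis
        present = histology == diagnosis
        if present and suggested:
            tp += 1
        elif present:
            fn += 1
        elif suggested:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def confused_pairs(pairs):
    """Count (histology, incorrect ultrasound diagnosis) pairs, ignoring 'don't know'."""
    return Counter(
        (histology, ultrasound)
        for ultrasound, histology in pairs
        if ultrasound != histology and ultrasound != DONT_KNOW
    )


# Toy (ultrasound diagnosis, histological diagnosis) pairs; purely illustrative.
cases = [
    ("endometrioma", "endometrioma"),
    ("endometrioma", "endometrioma"),
    ("serous cystadenoma", "mucinous cystadenoma"),
    ("serous cystadenoma", "serous cystadenoma"),
    (DONT_KNOW, "serous cystadenoma"),
    (DONT_KNOW, "mucinous cystadenoma"),
    ("serous cystadenoma", "adenofibroma"),
    ("mucinous cystadenoma", "mucinous cystadenoma"),
]

tp, fn, fp, tn = one_vs_rest(cases, "serous cystadenoma")
print("sensitivity:", tp / (tp + fn), "specificity:", tn / (tn + fp))
print(confused_pairs(cases).most_common())
```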
From a clinical point of view, it is most important to be able to distinguish benign from malignant adnexal tumors [31], and there are numerous studies showing that pattern recognition can do this [2, 3, 32–35]. It is also of value to be able to reliably discriminate between primary ovarian malignancies, borderline tumors and metastatic tumors in the ovaries, because the management of these diagnoses differs. Our study suggests that this was not possible with the knowledge available at the time of our data collection. However, studies describing the typical ultrasound morphology of rare adnexal tumors have been published recently [36–38], and it is reasonable to believe that the skill in discriminating between different types of ovarian pathology will increase if more work describing the typical ultrasound morphology of specific types of adnexal tumor is published. It is definitely clinically useful to be able to make a specific diagnosis of dermoid cyst, hydrosalpinx, endometrioma, peritoneal cyst and hemorrhagic corpus luteum cyst, because it seems reasonable to treat these conditions expectantly if they do not cause any symptoms and are not associated with subfertility. Pattern recognition is likely to be able to reliably distinguish not only dermoid cysts, endometriomas and hydrosalpinges (as shown in this study) but also peritoneal cysts and hemorrhagic corpus luteum cysts from other adnexal masses, but there are too few peritoneal cysts and corpus luteum cysts in our study for us to be able to state this with any certainty. The most likely explanation for the low number of these pathologies is that they were treated expectantly if the diagnosis was certain.
If screening for ovarian cancer is to become a reality [39], expert ultrasound imaging with pattern recognition will probably be used as a secondary test to evaluate the need for surgery, and our results support the contention that pattern recognition is likely to be able to provide an exact diagnosis in many screen-positive cases.
To sum up, using subjective evaluation of gray-scale and Doppler ultrasound findings, it is possible to make an almost conclusive diagnosis of dermoid cyst, endometrioma and hydrosalpinx (and possibly peritoneal pseudocysts), but currently it does not seem to be possible to make any other histological diagnosis conclusively. Even though it is possible to recognize many other adnexal pathologies on the basis of their ultrasound characteristics, they cannot be confidently confirmed or excluded. Serous cysts, cystadenofibromas, simple cysts, hydrosalpinx, functional cysts and paraovarian/parasalpingeal cysts are diagnoses that are often confused with each other. Our results are generalizable only to tumor populations, ultrasound examiners (level of experience) and ultrasound equipment similar to those in this study.
SUPPORTING INFORMATION ON THE INTERNET
The following supporting information may be found in the online version of this article:
Table S1 Sensitivity, specificity and positive and negative likelihood ratios for each ultrasound center with regard to those specific diagnoses for which at least three centers contributed at least 10 cases each