Keywords:

  • CA 125 antigen;
  • ovarian neoplasms;
  • ultrasonography

Abstract

Objectives

To determine whether CA 125 measurement is superior to ultrasound imaging performed by an experienced examiner for discriminating between benign and malignant adnexal lesions, and to determine whether adding CA 125 to ultrasound examination improves diagnostic performance.

Methods

This was a prospective multicenter study, the International Ovarian Tumor Analysis (IOTA) study, conducted in nine European ultrasound centers in university hospitals. Of 1149 patients with an adnexal mass examined in the IOTA study, 83 were excluded. Of the remaining 1066 patients, 809 had CA 125 results available and were included. The patients underwent preoperative serum CA 125 measurement and transvaginal ultrasound examination by an experienced ultrasound examiner blinded to CA 125 values. The examiner classified each mass as certainly or probably benign, difficult to classify, or probably or certainly malignant. The outcome measures were the sensitivity and specificity with regard to malignancy of CA 125, ultrasound imaging and their combined use, the ‘gold standard’ being the histological diagnosis of the adnexal mass removed surgically within 120 days after the ultrasound examination.

Results

There were 242 (30%) malignancies. For 534 tumors judged to be certainly benign or certainly malignant by the ultrasound examiner the sensitivity and specificity of ultrasound examination and CA 125 (≥35 U/mL indicating malignancy) were 97% vs. 86% (95% CI of difference, 4.7–17.2) and 99% vs. 79% (95% CI of difference, 15.7–24.2); for 209 tumors judged probably benign or probably malignant, sensitivity and specificity were 81% vs. 57% (95% CI of difference, 12.3–36.0) and 91% vs. 74% (95% CI of difference, 8.5–25.7); for 66 tumors that were difficult to classify, sensitivity and specificity were 57% vs. 39% (95% CI of difference, −9.7 to 41.1) and 74% vs. 67% (95% CI of difference, −14.6 to 27.7). Diagnostic performance deteriorated when CA 125 was used as a second-stage test after ultrasound examination.

Conclusions

Specialist ultrasound examination is superior to CA 125 for preoperative discrimination between benign and malignant adnexal masses, irrespective of the diagnostic confidence of the ultrasound examiner; adding CA 125 to ultrasound does not improve diagnostic performance. Our results indicate that greater investment in education and training in gynecological ultrasound imaging would be of value. Copyright © 2009 ISUOG. Published by John Wiley & Sons, Ltd.


Introduction

CA 125 is a glycoprotein, defined by the antibody OC 125, that may be raised in patients with ovarian malignancy. Values ≥30 U/mL or ≥35 U/mL are often taken to indicate malignancy, but some suggest a higher cut-off (for example ≥65 U/mL) to indicate malignancy in premenopausal women1–6. The risk of malignancy in an adnexal mass can also be estimated on the basis of the results of a transvaginal ultrasound examination. Subjective evaluation of ultrasound findings (pattern recognition) by an experienced operator is highly accurate for the prediction of malignancy7–10, and a correct specific diagnosis can be made in many benign tumors, e.g. endometriomas or dermoid cysts7, 10, 11. Although ultrasound imaging is an excellent method for classifying adnexal masses, serum CA 125 is often measured as a second-stage test to estimate the likelihood of malignancy in adnexal lesions detected by ultrasound examination. CA 125 may be used alone or incorporated in the risk of malignancy index (RMI)12. The RMI is calculated as the product of the serum CA 125 level (U/mL), the ultrasound scan result (expressed as a score of 0, 1 or 3) and the menopausal status (1 if premenopausal and 3 if postmenopausal); an RMI > 200 is often used to indicate malignancy12. We have questioned the value of using CA 125 for estimating the risk of malignancy in adnexal tumors when the results of ultrasound examinations performed by experienced examiners are available13, 14. However, when characterizing an adnexal mass, ultrasound examiners may have a variable degree of confidence in their assessment9. It is possible that CA 125 could be superior to pattern recognition, or could improve diagnostic performance if added to pattern recognition, at least when the ultrasound examiner is uncertain.
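The RMI product described above can be written out as a short Python sketch. The function and parameter names are illustrative and not taken from the study; the scoring (ultrasound score of 0, 1 or 3, menopausal score of 1 or 3) and the > 200 threshold follow the description in the preceding paragraph.

```python
def risk_of_malignancy_index(ca125_u_per_ml: float,
                             ultrasound_score: int,
                             postmenopausal: bool) -> float:
    """Return the RMI as the product of its three components.

    Names are hypothetical; the scoring follows the text:
    ultrasound score 0, 1 or 3; menopausal score 1 (pre) or 3 (post).
    """
    if ultrasound_score not in (0, 1, 3):
        raise ValueError("ultrasound score must be 0, 1 or 3")
    if ca125_u_per_ml < 0:
        raise ValueError("CA 125 level cannot be negative")
    menopausal_score = 3 if postmenopausal else 1
    return ca125_u_per_ml * ultrasound_score * menopausal_score


def rmi_suggests_malignancy(rmi: float, threshold: float = 200.0) -> bool:
    """Apply the commonly used rule that an RMI above the threshold indicates malignancy."""
    return rmi > threshold


# Example: postmenopausal woman, CA 125 of 50 U/mL, ultrasound score 3.
example_rmi = risk_of_malignancy_index(50.0, 3, postmenopausal=True)
print(example_rmi, rmi_suggests_malignancy(example_rmi))  # 450.0 True
```

Because the index is a product, an ultrasound score of 0 gives an RMI of 0 whatever the CA 125 level.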

Our aim was to determine whether CA 125 measurement is superior to ultrasound imaging performed by an experienced examiner for the preoperative discrimination between benign and malignant adnexal lesions, in particular for masses thought difficult to characterize as benign or malignant on the basis of ultrasound findings, and to determine whether adding information on CA 125 to ultrasound findings as a second-stage test improves diagnostic performance. Because CA 125 is an important variable in the RMI we also wanted to compare the diagnostic performance of pattern recognition with that of RMI.

Methods

We used the prospectively collected information in the International Ovarian Tumor Analysis (IOTA) database. The IOTA study is a prospective multicenter study comprising nine European ultrasound centers in university hospitals. It was approved by the local ethics committees and has been described in detail in a previous publication15, which records the ultrasound centers involved, the number of patients, and the number of benign and malignant tumors that each center contributed to the study. The procedures followed were in accordance with the Helsinki Declaration of 1975, as revised in 1983. Informed consent was obtained from each participant in the study. The design of the IOTA study is briefly outlined below.

Consecutive patients referred to the participating ultrasound centers because of at least one adnexal mass were considered for inclusion in the study provided that the clinical history and ultrasound findings did not suggest that the mass was a functional cyst. The patients considered for inclusion underwent gray-scale and color Doppler ultrasonography by an experienced ultrasound examiner using high-quality ultrasound equipment, a standardized examination technique, and standardized terms and definitions16. A transvaginal scan was performed in all cases. Transabdominal sonography was added to examine large masses that could not be seen in their entirety using a transvaginal probe. On the basis of subjective evaluation of gray-scale and color Doppler findings (pattern recognition), the ultrasound examiner classified each mass as being certainly benign, probably benign, difficult to classify as benign or malignant (complete uncertainty), probably malignant or certainly malignant. Even when the examiner found the mass difficult to classify, he/she was obliged to state whether the mass was more likely to be benign or malignant. Whenever possible the examiner also suggested a specific histological diagnosis (e.g. endometrioma, dermoid cyst or hydrosalpinx). The ultrasound examiner had no knowledge of the patient's serum CA 125 value when suggesting a diagnosis. Only patients with a histological diagnosis of the mass obtained by surgery within 120 days after the ultrasound examination were included.

In this analysis, a woman was considered to be postmenopausal if she reported absence of menstruation for at least 1 year after the age of 40 years provided that the amenorrhea was not explained by pregnancy, medication or disease. Women aged ≥50 years who had undergone a hysterectomy, for whom the time of menopause could not be determined, were also defined as postmenopausal. Women with menstrual periods during the year before the examination and women younger than 50 years who had undergone a hysterectomy without bilateral oophorectomy were defined as premenopausal.
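The menopausal-status rules above can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name, its parameters and the "undetermined" (`None`) result are hypothetical, and "absence of menstruation for at least 1 year after the age of 40 years" is interpreted here as at least one full year of amenorrhea falling after the 40th birthday.

```python
from typing import Optional


def menopausal_status(age_years: float,
                      amenorrhea_years: float,
                      amenorrhea_explained: bool,
                      hysterectomy: bool,
                      bilateral_oophorectomy: bool) -> Optional[str]:
    """Classify a woman as 'premenopausal' or 'postmenopausal'.

    Returns None for combinations the text does not cover.
    amenorrhea_explained: amenorrhea due to pregnancy, medication or disease.
    """
    if hysterectomy:
        # Time of menopause cannot be determined from menstruation.
        if age_years >= 50:
            return "postmenopausal"
        if not bilateral_oophorectomy:
            return "premenopausal"
        return None  # younger than 50 with bilateral oophorectomy: not stated

    # Assumption: count only the amenorrhea that falls after age 40.
    amenorrhea_after_40 = min(amenorrhea_years, age_years - 40)
    if amenorrhea_after_40 >= 1 and not amenorrhea_explained:
        return "postmenopausal"
    if amenorrhea_years < 1:
        return "premenopausal"  # menstrual periods during the preceding year
    return None


print(menopausal_status(55, 3, False, False, False))  # postmenopausal
print(menopausal_status(45, 0, False, True, False))   # premenopausal
```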

The participating centers were encouraged to measure the level of serum CA 125 in peripheral blood from all patients, but the availability of CA 125 results was not a requirement for inclusion in the IOTA study. Second-generation immunoradiometric assay kits for CA 125 (CA 125 II)17 from five companies were used (Centocor, Malvern, PA, USA; Cis-Bio, Gif-sur-Yvette, France; Abbott Axsym system, REF 3B41-22, Abbott Laboratories Diagnostic Division, Abbott Park, IL, USA; Immuno-l-analyser, Bayer Diagnostics, Tarrytown, NY, USA; or Vidas, bioMérieux, Marcy l'Etoile, France). All kits used the OC 125 antibody. CA 125 results are expressed in U/mL.

The reference standard was the histology of the surgically removed adnexal mass. Ultrasound imaging and CA 125 results were not concealed from the pathologists making the histopathological diagnosis. Tumors were classified according to the criteria recommended by the International Federation of Gynecology and Obstetrics18.

Statistical analysis

All statistical analyses were performed using SAS version 9.1 (SAS Institute, Cary, NC, USA). In the statistical analyses, borderline tumors were classified as malignant. The diagnostic performance in terms of accuracy, sensitivity, specificity, and positive and negative likelihood ratios with regard to malignancy of the following five diagnostic methods was determined: (1) subjective evaluation by the ultrasound examiner, i.e. pattern recognition; (2) serum CA 125; (3) a policy whereby both pattern recognition and CA 125 must suggest a benign diagnosis for a benign diagnosis to be made (Figure 1), a strategy that would increase sensitivity at the expense of reduced specificity; (4) a policy whereby both pattern recognition and CA 125 must suggest a malignant diagnosis for a malignant diagnosis to be made (Figure 2), a strategy that would increase specificity at the expense of reduced sensitivity; and (5) RMI using the algorithm of Jacobs et al.12. The CIs for differences in sensitivity and specificity between pattern recognition and analysis of CA 125, and between pattern recognition and RMI, were calculated using a score interval method, i.e. method 10 in Newcombe19. The CIs for likelihood ratios were calculated using the Cox–Hinkley–Miettinen–Nurminen method20.
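The point estimates named above (accuracy, sensitivity, specificity and the two likelihood ratios) follow directly from a 2 × 2 table of test result against histology. The Python sketch below is illustrative only: the study used SAS, the function name is hypothetical, and confidence intervals (Newcombe method 10, Cox–Hinkley–Miettinen–Nurminen) are not reproduced. The example counts are those reported in Table 3 for tumors judged certainly benign or certainly malignant.

```python
import math


def diagnostic_performance(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Point estimates from a 2x2 table (malignancy is the positive class).

    tp/fn: malignant tumors called malignant/benign by the test.
    tn/fp: benign tumors called benign/malignant by the test.
    """
    n_malignant = tp + fn
    n_benign = tn + fp
    sensitivity = tp / n_malignant
    specificity = tn / n_benign
    accuracy = (tp + tn) / (n_malignant + n_benign)
    # Likelihood ratios; a zero denominator is reported as infinity.
    lr_positive = sensitivity / (fp / n_benign) if fp > 0 else math.inf
    lr_negative = (fn / n_malignant) / specificity if tn > 0 else math.inf
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Pattern recognition, certainly benign or certainly malignant (n = 534):
# 145/150 malignant and 379/384 benign tumors correctly classified.
pattern = diagnostic_performance(tp=145, fn=5, tn=379, fp=5)
print(round(pattern["lr_positive"], 1))  # 74.2
print(round(pattern["lr_negative"], 3))  # 0.034
```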

Figure 1. Decision tree illustrating the use of serum CA 125 as a second-stage test in cases where subjective assessment of ultrasound findings (pattern recognition) by an ultrasound examiner predicts a benign tumor. This strategy will increase sensitivity at the expense of reduced specificity.

Figure 2. Decision tree illustrating the use of serum CA 125 as a second-stage test in cases where subjective assessment of ultrasound findings (pattern recognition) by an ultrasound examiner predicts a malignant tumor. This strategy will increase specificity at the expense of reduced sensitivity.
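The two second-stage strategies in Figures 1 and 2 reduce to a logical OR and a logical AND of the two test results. The Python sketch below is a minimal illustration; the function names are hypothetical, and the default cut-off of 35 U/mL is just one of the cut-offs examined in the study.

```python
def final_malignant_fig1(ultrasound_malignant: bool,
                         ca125_u_per_ml: float,
                         cutoff: float = 35.0) -> bool:
    """Figure 1 strategy: a benign diagnosis requires BOTH tests to be benign.

    Equivalent to calling the mass malignant if EITHER test suggests
    malignancy, which can only raise sensitivity and lower specificity.
    """
    return ultrasound_malignant or ca125_u_per_ml >= cutoff


def final_malignant_fig2(ultrasound_malignant: bool,
                         ca125_u_per_ml: float,
                         cutoff: float = 35.0) -> bool:
    """Figure 2 strategy: a malignant diagnosis requires BOTH tests to be malignant.

    This can only raise specificity and lower sensitivity.
    """
    return ultrasound_malignant and ca125_u_per_ml >= cutoff


# A mass judged benign on ultrasound with CA 125 of 80 U/mL:
print(final_malignant_fig1(False, 80.0))  # True  (reclassified as malignant)
print(final_malignant_fig2(False, 80.0))  # False (stays benign)
```

In both strategies CA 125 is consulted only when it can change the outcome: after a benign ultrasound result in Figure 1, and after a malignant ultrasound result in Figure 2.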

Results

Recruitment of patients to the IOTA study started in June 1999 and ended in June 2001. Of the 1149 patients with an adnexal mass examined in the IOTA study, 83 were excluded15. Of the remaining 1066 patients, 809 (76%) had available CA 125 results and were included in the present analysis. These 809 patients have also been used in two other publications regarding the usefulness of CA 125 when assessing adnexal tumors13, 14. Table 1 shows demographic background data, histological diagnoses and results of subjective estimation of risk of malignancy by the ultrasound examiner both for the 809 patients included and the 257 patients excluded from the present analysis because of missing CA 125 results. The patients excluded were younger, and more of them had benign tumors, in particular endometriomas, but the diagnostic performance of the ultrasound examiner was similar in the patients included and those excluded (sensitivity 88% in patients included vs. 83% in the patients excluded; specificity 95% vs. 96%).

Table 1. Demographic background data, histological diagnoses, estimation of risk of malignancy by experienced ultrasound examiners who used pattern recognition, and rate of correct diagnoses with regard to malignancy when using pattern recognition in women included in and excluded from this analysis

| Variable | Excluded women (n = 257) | Included women: all (n = 809) | Included women: premenopausal (n = 445) | Included women: postmenopausal (n = 364) |
|---|---|---|---|---|
| Mean age (years) | 41.9 ± 14.5 | 48.8 ± 15.6 | 37.4 ± 9.1 | 62.8 ± 9.4 |
| Postmenopausal | 68 (26.5) | 364 (45.0) | | |
| Histological diagnosis | | | | |
|  All benign tumors | 233 (90.7) | 567 (70.1) | 359 (80.7) | 208 (57.1) |
|   Endometrioma | 84 (32.7) | 128 (15.8) | 123 (27.6) | 5 (1.4) |
|   Dermoid/teratoma | 44 (17.1) | 83 (10.3) | 68 (15.3) | 15 (4.1) |
|   Simple cyst | 15 (5.8) | 84 (10.4) | 39 (8.8) | 45 (12.4) |
|   Functional cyst | 13 (5.1) | 15 (1.9) | 12 (2.7) | 3 (0.8) |
|   Hydrosalpinx | 9 (3.5) | 15 (1.9) | 12 (2.7) | 3 (0.8) |
|   Peritoneal pseudocyst | 4 (1.6) | 4 (0.5) | 3 (0.7) | 1 (0.3) |
|   Abscess | 6 (2.3) | 19 (2.3) | 13 (2.9) | 6 (1.6) |
|   Fibroma | 8 (3.1) | 29 (3.6) | 9 (2.0) | 20 (5.5) |
|   Cystadenoma | 34 (13.2) | 102 (12.6) | 38 (8.5) | 64 (17.6) |
|   Mucinous cystadenoma | 14 (5.4) | 80 (9.9) | 40 (9.0) | 40 (11.0) |
|   Rare benign tumor | 2 (0.8) | 8 (1.0) | 2 (0.4) | 6 (1.6) |
|  All malignant tumors | 24 (9.3) | 242 (29.9) | 86 (19.3) | 156 (42.9) |
|   Primary invasive | 17 (6.6) | 127 (15.7) | 32 (7.2) | 95 (26.1) |
|    Stage I | 9 (3.5) | 33 (4.1) | 10 (2.2) | 23 (6.3) |
|    Stage II | 2 (0.8) | 10 (1.2) | 2 (0.4) | 8 (2.2) |
|    Stage III | 4 (1.6) | 69 (8.5) | 17 (3.8) | 52 (14.3) |
|    Stage IV | 2 (0.8) | 15 (1.9) | 3 (0.7) | 12 (3.3) |
|   Borderline | 3 (1.2) | 52 (6.4) | 27 (6.1) | 25 (6.9) |
|   Metastatic | 4 (1.6) | 38 (4.7) | 13 (2.9) | 25 (6.9) |
|   Rare primary invasive | 0 (0) | 25 (3.1) | 14 (3.1) | 11 (3.0) |
| Risk estimation by ultrasound examiner | | | | |
|  Certainly benign | 168 (65.4) | 384 (47.5) | 276 (62.0) | 108 (29.7) |
|  Probably benign | 41 (16.0) | 140 (17.3) | 75 (16.9) | 65 (17.9) |
|  Unclassifiable | 24 (9.3) | 66 (8.2) | 26 (5.8) | 40 (11.0) |
|  Probably malignant | 14 (5.4) | 69 (8.5) | 26 (5.8) | 43 (11.8) |
|  Certainly malignant | 10 (3.9) | 150 (18.5) | 42 (9.4) | 108 (29.7) |
| Correctly classified with regard to malignancy by ultrasound examiner | 244 (94.9) | 752 (93.0) | 421 (94.6) | 331 (90.9) |
|  Benign masses* | 224 (96.1) | 538 (94.9) | 351 (97.8) | 187 (89.9) |
|  Malignant masses† | 20 (83.3) | 214 (88.4) | 70 (81.4) | 144 (92.3) |

  • Women included in this study had CA 125 serum levels available. Values are mean ± SD or n (%).

  • * Numbers in parentheses are percentages of all benign masses.

  • † Numbers in parentheses are percentages of all malignant masses.

  • Reproduced from J Natl Cancer Inst 2007; 99: 1706–1714. Published with permission from Oxford University Press.

The histopathological diagnoses according to the diagnostic confidence of the ultrasound examiner when using pattern recognition are shown in Table 2. The ultrasound examiner classified 534 (66%) tumors as certainly benign or certainly malignant, 209 (26%) tumors as probably benign or probably malignant and 66 (8%) tumors as impossible to classify as benign or malignant (‘uncertain’). Endometriomas were more common among tumors that the ultrasound examiner was completely confident were benign or malignant, whereas borderline tumors, rare benign tumors, cystadenomas and fibromas were more common in the group of tumors that the ultrasound examiner was less confident or completely uncertain about.

Table 2. Histological diagnoses with regard to the diagnostic confidence of the ultrasound examiner when he/she suggested a diagnosis of benignity or malignancy

| Histological diagnosis | Certainly benign or certainly malignant (n = 534), n (%) | Probably benign or probably malignant (n = 209), n (%) | Uncertain (n = 66), n (%) | Total (n = 809), n |
|---|---|---|---|---|
| All benign tumors | 384 (71.9) | 140 (67.0) | 43 (65.2) | 567 |
|  Endometrioma | 105 (19.7) | 19 (9.1) | 4 (6.1) | 128 |
|  Dermoid/teratoma | 68 (12.7) | 9 (4.3) | 6 (9.1) | 83 |
|  Simple or functional cyst | 70 (13.1) | 23 (11.0) | 6 (9.1) | 99 |
|  Extraovarian mass | 13 (2.4) | 6 (2.9) | 0 (0) | 19 |
|  Abscess/PID | 11 (2.1) | 6 (2.9) | 2 (3.0) | 19 |
|  Fibroma | 11 (2.1) | 13 (6.2) | 5 (7.6) | 29 |
|  Cystadenoma | 104 (19.5) | 59 (28.2) | 19 (28.8) | 182 |
|  Rare benign tumor | 2 (0.4) | 5 (2.4) | 1 (1.5) | 8 |
| All malignant tumors | 150 (28.1) | 69 (33.0) | 23 (34.8) | 242 |
|  Primary invasive | 94 (17.6) | 28 (13.4) | 5 (7.6) | 127 |
|   Stage I | 19 (3.6) | 12 (5.7) | 2 (3.0) | 33 |
|    Serous | 8 | 3 | 0 | 11 |
|    Mucinous | 2 | 3 | 0 | 5 |
|    Endometrioid | 5 | 3 | 0 | 8 |
|    Other | 4 | 3 | 2 | 9 |
|   Stage II–IV | 75 (14.0) | 16 (7.7) | 3 (4.5) | 94 |
|    Serous | 47 | 10 | 2 | 59 |
|    Mucinous | 3 | 1 | 0 | 4 |
|    Endometrioid | 12 | 1 | 0 | 13 |
|    Other | 13 | 4 | 1 | 18 |
|  Borderline | 17 (3.2) | 23 (11.0) | 12 (18.2) | 52 |
|   Serous | 9 | 9 | 6 | 24 |
|   Mucinous | 6 | 11 | 5 | 22 |
|   Other | 2 | 3 | 1 | 6 |
|  Rare primary invasive | 14 (2.6) | 9 (4.3) | 2 (3.0) | 25 |
|  Metastatic | 25 (4.7) | 9 (4.3) | 4 (6.1) | 38 |

  • PID, pelvic inflammatory disease.

Table 3 shows the diagnostic performance of pattern recognition and serum CA 125 depending on the confidence of the ultrasound examiner when CA 125 values ≥35 U/mL were used to indicate malignancy. Pattern recognition was superior to CA 125 irrespective of the confidence with which the ultrasound examiner suggested whether a tumor was benign or malignant. None of the tested cut-off values for CA 125 (30 U/mL, 35 U/mL, 65 U/mL, 100 U/mL, 200 U/mL, 400 U/mL, 1000 U/mL) was superior to pattern recognition in any of the three confidence groups (certainly benign or certainly malignant, probably benign or probably malignant, completely uncertain), and this was true of both premenopausal and postmenopausal patients (Tables S1 and S2 online). When the ultrasound examiner was uncertain whether the tumor was benign or malignant both pattern recognition and CA 125 were poor diagnostic tests.

Table 3. Diagnostic performance of pattern recognition and serum CA 125 depending on the confidence of the ultrasound examiner

| Confidence of ultrasound examiner | Method | Accuracy (% (n)) | Sensitivity (% (n)) | Specificity (% (n)) | LR+ (95% CI) | LR− (95% CI) |
|---|---|---|---|---|---|---|
| Certainly benign or certainly malignant (n = 534) | Pattern recognition | 98 (524/534) | 97 (145/150) | 99 (379/384) | 74.2 (32.1 to > 100) | 0.034 (0.02–0.08) |
| | CA 125 | 81 (432/534) | 86 (129/150) | 79 (303/384) | 4.08 (3.3–5.0) | 0.18 (0.12–0.26) |
| | Difference (95% CI) | | 10.7 (4.7–17.2) | 19.8 (15.7–24.2) | | |
| Probably benign or probably malignant (n = 209) | Pattern recognition | 88 (183/209) | 81 (56/69) | 91 (127/140) | 8.7 (5.2–14.9) | 0.208 (0.13–0.33) |
| | CA 125 | 68 (142/209) | 57 (39/69) | 74 (103/140) | 2.14 (1.5–3.0) | 0.59 (0.43–0.77) |
| | Difference (95% CI) | | 24.6 (12.3–36.0) | 17.1 (8.5–25.7) | | |
| Completely uncertain (difficult tumor) (n = 66) | Pattern recognition | 68 (45/66) | 57 (13/23) | 74 (32/43) | 2.2 (1.2–4.1) | 0.584 (0.34–0.91) |
| | CA 125 | 58 (38/66) | 39 (9/23) | 67 (29/43) | 1.20 (0.6–2.3) | 0.90 (0.58–1.30) |
| | Difference (95% CI) | | 17.4 (−9.7 to 41.1) | 7.0 (−14.6 to 27.7) | | |

  • CA 125 values ≥35 U/mL indicated malignancy. LR+, positive likelihood ratio; LR−, negative likelihood ratio.

Table 4 shows the diagnostic performance of pattern recognition and RMI depending on the confidence of the ultrasound examiner when RMI > 200 was used to indicate malignancy. Pattern recognition was superior to RMI irrespective of the diagnostic confidence of the ultrasound examiner, but when the ultrasound examiner was uncertain about the character of the mass both pattern recognition and RMI were poor diagnostic methods. Similar results were obtained when RMI > 100 was used to indicate malignancy.

Table 4. Diagnostic performance of pattern recognition and risk of malignancy index (RMI) depending on the confidence of the ultrasound examiner

| Confidence of ultrasound examiner | Method | Accuracy (% (n)) | Sensitivity (% (n)) | Specificity (% (n)) | LR+ (95% CI) | LR− (95% CI) |
|---|---|---|---|---|---|---|
| Certainly benign or certainly malignant (n = 534) | Pattern recognition | 98 (524/534) | 97 (145/150) | 99 (379/384) | 74.2 (32.1 to > 100) | 0.034 (0.02–0.08) |
| | RMI | 92 (491/534) | 83 (125/150) | 95 (366/384) | 17.8 (11.4–28.1) | 0.175 (0.12–0.25) |
| | Difference (95% CI) | | 13.3 (7.6–19.9) | 3.4 (1.2–6.0) | | |
| Probably benign or probably malignant (n = 209) | Pattern recognition | 88 (183/209) | 81 (56/69) | 91 (127/140) | 8.74 (5.2–14.9) | 0.208 (0.13–0.33) |
| | RMI | 76 (158/209) | 52 (36/69) | 87 (122/140) | 4.06 (2.5–6.6) | 0.549 (0.42–0.69) |
| | Difference (95% CI) | | 29.0 (15.4–41.2) | 3.6 (−3.4 to 10.7) | | |
| Completely uncertain (difficult tumor) (n = 66) | Pattern recognition | 68 (45/66) | 57 (13/23) | 74 (32/43) | 2.21 (1.2–4.1) | 0.584 (0.34–0.91) |
| | RMI | 58 (38/66) | 26 (6/23) | 74 (32/43) | 1.02 (0.4–2.3) | 0.993 (0.7–1.3) |
| | Difference (95% CI) | | 30.4 (2.4–52.6) | 0.0 (−20.9 to 20.9) | | |

  • RMI > 200 indicated malignancy. LR+, positive likelihood ratio; LR−, negative likelihood ratio.

The effects of using CA 125 as a second-stage test after the ultrasound examiner had suggested a diagnosis are shown in Tables 5 and 6. The outcome of a strategy in which both pattern recognition and CA 125 must suggest a benign diagnosis for a benign diagnosis to be made is shown in Table 5, and the outcome of a strategy in which both pattern recognition and CA 125 must suggest a malignancy for a malignant diagnosis to be made is shown in Table 6. The strategy requiring pattern recognition and CA 125 to be concordant resulted in more tumors being misclassified and in diagnostic performance deteriorating. This was true of both premenopausal and postmenopausal patients and irrespective of the diagnostic confidence of the ultrasound examiner (Tables S3–S8 online show these results in detail).

Table 5. Diagnostic performance when using CA 125 as a second-stage test after ultrasound examination (requiring the results of both pattern recognition and CA 125 to indicate a benign diagnosis for a benign diagnosis to be made)

| Population | Strategy (cut-off*) | Accuracy (%) | Sensitivity (%) | Specificity (%) | LR+ | LR− |
|---|---|---|---|---|---|---|
| All (n = 809) | Subj | 93.0 | 88.4 | 94.9 | 17.31 | 0.12 |
| | Subj −/CA 125 (30) | 75.3 | 92.6 | 67.9 | 2.88 | 0.11 |
| | Subj −/CA 125 (35) | 78.6 | 91.7 | 73.0 | 3.40 | 0.11 |
| | Subj −/CA 125 (65) | 86.8 | 90.9 | 85.0 | 6.06 | 0.11 |
| | Subj −/CA 125 (100) | 89.7 | 89.7 | 89.8 | 8.77 | 0.12 |
| | Subj −/CA 125 (200) | 91.7 | 89.3 | 92.8 | 12.35 | 0.12 |
| | Subj −/CA 125 (400) | 92.6 | 88.4 | 94.4 | 15.68 | 0.12 |
| | Subj −/CA 125 (1000) | 92.8 | 88.4 | 94.7 | 16.72 | 0.12 |
| Premenopause (n = 445) | Subj | 94.6 | 81.4 | 97.8 | 36.50 | 0.19 |
| | Subj −/CA 125 (30) | 69.9 | 89.5 | 65.2 | 2.57 | 0.16 |
| | Subj −/CA 125 (35) | 73.7 | 87.2 | 70.5 | 2.95 | 0.18 |
| | Subj −/CA 125 (65) | 85.6 | 86.0 | 85.5 | 5.94 | 0.16 |
| | Subj −/CA 125 (100) | 90.3 | 83.7 | 91.9 | 10.36 | 0.18 |
| | Subj −/CA 125 (200) | 93.0 | 82.6 | 95.5 | 18.51 | 0.18 |
| | Subj −/CA 125 (400) | 94.6 | 81.4 | 97.8 | 36.50 | 0.19 |
| | Subj −/CA 125 (1000) | 94.6 | 81.4 | 97.8 | 36.50 | 0.19 |
| Postmenopause (n = 364) | Subj | 90.9 | 92.3 | 89.9 | 9.14 | 0.09 |
| | Subj −/CA 125 (30) | 81.9 | 94.2 | 72.6 | 3.44 | 0.08 |
| | Subj −/CA 125 (35) | 84.6 | 94.2 | 77.4 | 4.17 | 0.07 |
| | Subj −/CA 125 (65) | 88.2 | 93.6 | 84.1 | 5.90 | 0.08 |
| | Subj −/CA 125 (100) | 89.0 | 92.9 | 86.1 | 6.67 | 0.08 |
| | Subj −/CA 125 (200) | 90.1 | 92.9 | 88.0 | 7.73 | 0.08 |
| | Subj −/CA 125 (400) | 90.1 | 92.3 | 88.5 | 8.00 | 0.09 |
| | Subj −/CA 125 (1000) | 90.7 | 92.3 | 89.4 | 8.72 | 0.09 |

  • LR+, positive likelihood ratio; LR−, negative likelihood ratio; Subj, subjective evaluation of ultrasound findings; Subj −/CA 125, requiring the results of both pattern recognition and CA 125 to indicate a benign diagnosis for a benign diagnosis to be made.

  • * CA 125 cut-off in U/mL to indicate malignancy.

Table 6. Diagnostic performance when using CA 125 as a second-stage test after ultrasound examination (requiring the results of both pattern recognition and CA 125 to indicate malignancy for a malignant diagnosis to be made)

| Population | Strategy (cut-off*) | Accuracy (%) | Sensitivity (%) | Specificity (%) | LR+ | LR− |
|---|---|---|---|---|---|---|
| All (n = 809) | Subj | 93.0 | 88.4 | 94.9 | 17.31 | 0.12 |
| | Subj +/CA 125 (30) | 90.1 | 71.5 | 98.1 | 36.85 | 0.29 |
| | Subj +/CA 125 (35) | 90.0 | 69.8 | 98.6 | 49.5 | 0.31 |
| | Subj +/CA 125 (65) | 87.9 | 61.6 | 99.1 | 69.97 | 0.39 |
| | Subj +/CA 125 (100) | 86.3 | 55.8 | 99.3 | 78.58 | 0.45 |
| | Subj +/CA 125 (200) | 83.2 | 44.6 | 99.6 | 127.5 | 0.56 |
| | Subj +/CA 125 (400) | 79.7 | 32.6 | 99.8 | 181.3 | 0.67 |
| | Subj +/CA 125 (1000) | 75.2 | 17.4 | 99.8 | 96.4 | 0.83 |
| Premenopause (n = 445) | Subj | 94.6 | 81.4 | 97.8 | 36.50 | 0.19 |
| | Subj +/CA 125 (30) | 91.2 | 58.1 | 99.2 | 69.21 | 0.42 |
| | Subj +/CA 125 (35) | 91.2 | 57.0 | 99.4 | 102 | 0.43 |
| | Subj +/CA 125 (65) | 89.2 | 45.3 | 99.7 | 162.0 | 0.55 |
| | Subj +/CA 125 (100) | 88.1 | 39.5 | 99.7 | 141.2 | 0.61 |
| | Subj +/CA 125 (200) | 86.1 | 27.9 | 100.0 | Inf | 0.72 |
| | Subj +/CA 125 (400) | 84.0 | 17.4 | 100.0 | Inf | 0.83 |
| | Subj +/CA 125 (1000) | 82.7 | 10.5 | 100.0 | Inf | 0.90 |
| Postmenopause (n = 364) | Subj | 90.9 | 92.3 | 89.9 | 9.14 | 0.09 |
| | Subj +/CA 125 (30) | 88.7 | 78.8 | 96.2 | 20.48 | 0.22 |
| | Subj +/CA 125 (35) | 88.5 | 76.9 | 97.1 | 26.7 | 0.24 |
| | Subj +/CA 125 (65) | 86.3 | 70.5 | 98.1 | 36.72 | 0.30 |
| | Subj +/CA 125 (100) | 84.1 | 64.7 | 98.6 | 44.96 | 0.36 |
| | Subj +/CA 125 (200) | 79.7 | 53.8 | 99.0 | 56.09 | 0.47 |
| | Subj +/CA 125 (400) | 74.5 | 41.0 | 99.5 | 85.48 | 0.59 |
| | Subj +/CA 125 (1000) | 65.9 | 21.2 | 99.5 | 44.06 | 0.79 |

  • Inf, infinity; LR+, positive likelihood ratio; LR−, negative likelihood ratio; Subj, subjective evaluation of ultrasound findings; Subj +/CA 125, requiring the results of both pattern recognition and CA 125 to indicate malignancy for a malignant diagnosis to be made.

  • * CA 125 cut-off in U/mL to indicate malignancy.

Discussion

In this study we have shown that an ultrasound examination performed and interpreted by an experienced operator is superior to the analysis of serum CA 125 irrespective of the diagnostic confidence of the ultrasound examiner when he/she suggests that a lesion is benign or malignant, and that this is true in both premenopausal and postmenopausal patients. We were disappointed to find that measurements of serum CA 125 were not helpful in the characterization of tumors considered difficult to classify by ultrasound imaging. This was the case however the CA 125 values were used and whatever cut-off value was tested. Pattern recognition was also superior to RMI. However, all three methods performed poorly in the group of ‘difficult tumors’. The particular mix of tumors in this group may explain this. For example, borderline tumors were clearly over-represented among the difficult tumors, and borderline tumors are often misclassified both by pattern recognition21 and by CA 12514. Because CA 125 is an important variable in the RMI, RMI is also likely to misclassify borderline tumors.

It is important to emphasize that our results are representative of ultrasound examinations carried out and interpreted by very experienced examiners using high-quality ultrasound equipment. Almost 80% of the examinations in the study had been carried out by Level III examiners defined using the terminology of the European Federation of Ultrasound in Medicine and Biology (EFSUMB), i.e. the examiners worked in tertiary referral centers, had an academic record, and a high level of experience and expertise22. Some of them had performed up to 16 000 gynecological ultrasound examinations by the start of the study. Moreover, they had a special interest in the use of ultrasound imaging to characterize adnexal tumors. It cannot be excluded that CA 125 would improve the diagnostic performance if it were added to an ultrasound examination carried out and interpreted by a less experienced examiner, or that CA 125 or RMI would be superior to such an ultrasound examination. On the other hand, even examiners with very limited experience of only 200–300 gynecological ultrasound examinations performed under supervision were able to correctly classify most adnexal tumors as benign or malignant when presented with representative ultrasound images8.

In previous publications using the same patients as in this study we have shown that adding information on CA 125 to clinical and ultrasound information does not improve the diagnostic performance of mathematical models constructed to calculate the risk of malignancy in adnexal masses13, and that ultrasound examination performed by an experienced examiner is superior to the analysis of serum CA 125 for distinguishing benign from malignant adnexal masses in each of 15 specific histological subgroups of adnexal tumors14. However, we want to emphasize that we have examined only the diagnostic performance of a single measurement of CA 125. It is possible that serial measurements of CA 125 would have better diagnostic performance than a single measurement. Some might argue that borderline tumors should be included in the group of benign tumors when comparing the ability of pattern recognition and CA 125 to discriminate between benign and malignant tumors, because some oncologists regard borderline tumors as disease with a good prognosis. However, even when including the borderline tumors in the benign group, pattern recognition was superior to CA 125. Using a CA 125 cut-off value of 30 U/mL, the sensitivity of pattern recognition and CA 125 was 96% (182/190) vs. 85% (161/190) and the specificity 90% (558/619) vs. 70% (433/619), and pattern recognition remained superior to CA 125 even when using higher CA 125 cut-off values to indicate malignancy.
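The denominators in this sensitivity analysis can be checked against Table 1: moving the 52 borderline tumors from the malignant to the benign group leaves 190 malignant and 619 benign tumors. The Python sketch below only reproduces that arithmetic and the rounding of the quoted fractions; the variable names are illustrative.

```python
# Counts for the 809 included women (Table 1).
all_malignant = 242   # includes borderline tumors
all_benign = 567
borderline = 52

# Reclassify borderline tumors as benign.
malignant_without_borderline = all_malignant - borderline   # 190
benign_with_borderline = all_benign + borderline            # 619


def percent(numerator: int, denominator: int) -> int:
    """Proportion as a whole-number percentage."""
    return round(100 * numerator / denominator)


# Fractions quoted in the paragraph above.
sensitivities = [percent(182, malignant_without_borderline),
                 percent(161, malignant_without_borderline)]
specificities = [percent(558, benign_with_borderline),
                 percent(433, benign_with_borderline)]
print(malignant_without_borderline, benign_with_borderline)  # 190 619
print(sensitivities, specificities)                          # [96, 85] [90, 70]
```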

We acknowledge the limitations of our study. Table 1 reveals a bias whereby serum CA 125 is more likely to have been measured in women with lesions that were suspected of being malignant by the ultrasound examiner. We do not believe that this invalidates our conclusions, because in all likelihood serum CA 125 would have performed more poorly in the patients excluded than in the patients included, given the large proportion of endometriomas in the group excluded. An experienced ultrasound examiner almost always classifies endometriomas correctly, whereas CA 125 tends to misclassify endometriomas as malignancies14. We cannot rule out, of course, that CA 125 in addition to ultrasound pattern recognition in patients without endometriomas may be helpful. Another limitation of our study is that five CA 125 kits were used to assess the level of serum CA 125. However, this reflects clinical reality, and there is some evidence that the variation in CA 125 resulting from use of different kits is not large23, 24.

The preoperative assessment of adnexal tumors remains a challenge. Advances in surgery have provided new treatment options for women with ovarian tumors, but these new methods are useful only if the preoperative diagnosis is correct. Rupture of a Stage I ovarian cancer during an operation may worsen the prognosis25, and incorrect preoperative classification of a tumor as benign may increase the risk of this happening. Currently, ultrasound examination by an experienced operator using pattern recognition seems to be the best method for discriminating between benign and malignant adnexal tumors before surgery9, 10, 14. The ability to discriminate between benign and malignant masses using pattern recognition increases with the experience of the ultrasound examiner8. We believe that time and money could be saved both for patients and health services if there were consensus that patients with adnexal masses should undergo Level II or Level III ultrasound imaging before a decision on management is made, that is, before the patient is referred to a gynecological oncology center, and that greater investment in education and training in gynecological ultrasound examination would be of value. This is also supported by the results of a randomized controlled trial showing that improved quality of ultrasonography had a measurable effect on the management of patients with suspected ovarian cancer in a tertiary gynecology cancer center, and resulted in a significant decrease in the number of major staging procedures and a shorter inpatient hospital stay26. The EFSUMB has published guidelines on how much training and education in gynecological ultrasound imaging is needed to obtain competence at different levels22, but the amount of training and experience needed to become good at pattern recognition is likely to vary between individuals.

Unfortunately, even when performed by an experienced examiner, pattern recognition is not a good diagnostic method for ‘difficult tumors’, i.e. when the examiner is uncertain about whether the mass is benign or malignant, nor do logistic regression models to calculate the risk of malignancy seem to help in these21. Such difficult masses comprise 7–10% of tumors currently considered appropriate to remove surgically21. From a clinical viewpoint some might be happy to include these masses in the ‘probably malignant’ group. However, it is possible that other diagnostic methods added to conventional gray-scale and Doppler ultrasound imaging as second-stage tests would be helpful in assessing these difficult tumors; examples are evaluation of the vascular tree of tumors using three-dimensional power Doppler ultrasound examination27, or semiquantification of tumor perfusion using ultrasound contrast. Qualitative evaluation of contrast-enhanced ultrasound examination does not seem to improve diagnostic performance in tumors with papillary projections28, which constitute a subgroup of difficult ovarian tumors21.

Supporting Information on the Internet

The following supporting information may be found in the online version of this article:

Table S1 Diagnostic performance of pattern recognition and serum CA 125 depending on the confidence of the ultrasound examiner in premenopausal patients

Table S2 Diagnostic performance of pattern recognition and serum CA 125 depending on the confidence of the ultrasound examiner in postmenopausal patients

Table S3 Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate a benign diagnosis for a benign diagnosis to be made, in tumors considered to be certainly benign or certainly malignant by the ultrasound examiner

Table S4 Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate a benign diagnosis for a benign diagnosis to be made, in tumors considered to be probably benign or probably malignant by the ultrasound examiner

Table S5 Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate a benign diagnosis for a benign diagnosis to be made, in tumors where the ultrasound examiner was completely uncertain whether the tumor was benign or malignant

Table S6 Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate malignancy for a malignant diagnosis to be made, in tumors considered certainly benign or certainly malignant by the ultrasound examiner

Table S7 Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate malignancy for a malignant diagnosis to be made, in tumors considered probably benign or probably malignant by the ultrasound examiner

Table S8 Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate malignancy for a malignant diagnosis to be made, in tumors where the ultrasound examiner was completely uncertain whether the tumor was benign or malignant
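
The two second-stage rules described for Tables S3–S8 can be illustrated with a minimal sketch. It assumes the CA 125 cut-off of ≥35 U/mL stated in the abstract. The eight cases, the class name Case and all function names are hypothetical and serve only to show how each rule trades sensitivity against specificity; they are not study data.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

CA125_CUTOFF = 35.0  # U/mL; values >= cut-off are read as indicating malignancy


@dataclass
class Case:
    ultrasound_malignant: bool  # examiner's classification (hypothetical)
    ca125: float                # serum CA 125 in U/mL (hypothetical)
    histology_malignant: bool   # gold standard (hypothetical)


def ultrasound_only(case: Case) -> bool:
    return case.ultrasound_malignant


def ca125_malignant(case: Case) -> bool:
    return case.ca125 >= CA125_CUTOFF


def benign_requires_both(case: Case) -> bool:
    """Rule of Tables S3-S5: benign only if both tests say benign,
    so the mass is called malignant if either test indicates malignancy."""
    return case.ultrasound_malignant or ca125_malignant(case)


def malignant_requires_both(case: Case) -> bool:
    """Rule of Tables S6-S8: malignant only if both tests indicate malignancy."""
    return case.ultrasound_malignant and ca125_malignant(case)


def sensitivity_specificity(
    cases: List[Case], rule: Callable[[Case], bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a rule against histology.
    Assumes at least one malignant and one benign case."""
    tp = sum(1 for c in cases if c.histology_malignant and rule(c))
    fn = sum(1 for c in cases if c.histology_malignant and not rule(c))
    tn = sum(1 for c in cases if not c.histology_malignant and not rule(c))
    fp = sum(1 for c in cases if not c.histology_malignant and rule(c))
    return tp / (tp + fn), tn / (tn + fp)


# Made-up illustrative cases: four malignant, four benign.
CASES = [
    Case(True, 120.0, True),
    Case(True, 20.0, True),
    Case(False, 80.0, True),
    Case(False, 10.0, True),
    Case(False, 12.0, False),
    Case(False, 60.0, False),
    Case(True, 15.0, False),
    Case(False, 8.0, False),
]

for name, rule in [
    ("ultrasound alone", ultrasound_only),
    ("benign requires both", benign_requires_both),
    ("malignant requires both", malignant_requires_both),
]:
    sens, spec = sensitivity_specificity(CASES, rule)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these made-up cases, requiring both tests to be benign raises sensitivity at the cost of specificity relative to ultrasound alone, whereas requiring both tests to indicate malignancy does the opposite.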

Acknowledgements

This work was supported by the research council of the Katholieke Universiteit Leuven, Belgium (GOA-AMBioRICS, CoE EF/05/006 Optimization in Engineering OPTEC); the Belgian Federal Science Policy Office IUAP P6/04 (DYSCO, ‘Dynamical systems, control and optimization’, 2007-2011); the EU: BIOPATTERN (FP6-2002-IST 508803); ETUMOUR (FP6-2002-LIFESCIHEALTH 503094); Healthagents (IST–2004–27214); the Swedish Medical Research Council (grant numbers K2001-72X-11605-06A, K2002-72X-11605-07B, K2004-73X-11605-09A and K2006-73X-11605-11-3); funds administered by Malmö University Hospital; Allmänna Sjukhusets i Malmö Stiftelse för bekämpande av cancer (the Malmö General Hospital Foundation for fighting against cancer); and ALF-medel (a Swedish governmental grant).

Appendix

IOTA Steering Committee

Dirk Timmerman, Lil Valentin, Tom Bourne, Antonia C. Testa, Sabine Van Huffel, Ignace Vergote

IOTA principal investigators (alphabetical order)

Jean-Pierre Bernard, Maurepas, France

Enrico Ferrazzi, Milan, Italy

Davor Jurkovic, London, UK

Andrea Lissoni, Monza, Italy

Ulrike Metzger, Paris, France

Dario Paladini, Naples, Italy

Antonia Testa, Rome, Italy

Dirk Timmerman, Leuven, Belgium

Lil Valentin, Malmö, Sweden

Other IOTA contributors

Fabrice Lécuru, Paris, France

Francesco Leone, Milan, Italy

Ben Van Calster, Leuven, Belgium

Caroline Van Holsbeke, Leuven, Belgium

Sabine Van Huffel, Leuven, Belgium

Dominique Van Schoubroeck, Leuven, Belgium

Gerardo Zanetta (deceased), Monza, Italy

REFERENCES

  1. Jacobs IJ, Skates S, Davies AP, Woolas RP, Jeyerajah A, Weidemann P, Sibley K, Oram DH. Risk of diagnosis of ovarian cancer after raised serum CA 125 concentration: a prospective cohort study. BMJ 1996; 313: 1355–1358.
  2. Paramasivam S, Tripcony L, Crandon A, Quinn M, Hammond I, Marsden D, Proietto A, Davy M, Carter J, Nicklin J, Perrin L, Obermair A. Prognostic importance of preoperative CA-125 in International Federation of Gynaecology and Obstetrics stage I epithelial ovarian cancer: an Australian multicenter study. J Clin Oncol 2005; 23: 5938–5942.
  3. Bast RC Jr, Klug TL, St John E, Jenison E, Niloff JM, Lazarus H, Berkowitz RS, Leavitt T, Griffiths CT, Parker L, Zurawski VR Jr, Knapp RC. A radioimmunoassay using a monoclonal antibody to monitor the course of epithelial ovarian cancer. N Engl J Med 1983; 309: 883–887.
  4. Bon GG, Kenemans P, Verstraeten R, van Kamp GJ, Hilgers J. Serum tumor marker immunoassays in gynecologic oncology: establishment of reference values. Am J Obstet Gynecol 1996; 174: 107–114.
  5. Gadducci A, Baicchi U, Marrai R, Ferdeghini M, Bianchi R, Facchini V. Preoperative evaluation of D-dimer and CA 125 levels in differentiating benign from malignant ovarian masses. Gynecol Oncol 1996; 60: 197–202.
  6. Predanic M, Vlahos N, Pennisi JA, Moukhtar M, Aleem FA. Color and pulsed Doppler sonography, gray-scale imaging, and serum CA 125 in the assessment of adnexal disease. Obstet Gynecol 1996; 88: 283–288.
  7. Valentin L. Use of morphology to characterize and manage common adnexal masses. Best Pract Res Clin Obstet Gynaecol 2004; 18: 71–89.
  8. Timmerman D, Schwarzler P, Collins WP, Claerhout F, Coenen M, Amant F, Vergote I, Bourne TH. Subjective assessment of adnexal masses with the use of ultrasonography: an analysis of interobserver variability and experience. Ultrasound Obstet Gynecol 1999; 13: 11–16.
  9. Valentin L. Prospective cross-validation of Doppler ultrasound examination and gray-scale ultrasound imaging for discrimination of benign and malignant pelvic masses. Ultrasound Obstet Gynecol 1999; 14: 273–283.
  10. Valentin L, Hagen B, Tingulstad S, Eik-Nes S. Comparison of ‘pattern recognition’ and logistic regression models for discrimination between benign and malignant pelvic masses: a prospective cross validation. Ultrasound Obstet Gynecol 2001; 18: 357–365.
  11. Valentin L. Pattern recognition of pelvic masses by gray-scale ultrasound imaging: the contribution of Doppler ultrasound. Ultrasound Obstet Gynecol 1999; 14: 338–347.
  12. Jacobs I, Oram D, Fairbanks J, Turner J, Frost C, Grudzinskas JG. A risk of malignancy index incorporating CA 125, ultrasound and menopausal status for the accurate preoperative diagnosis of ovarian cancer. Br J Obstet Gynaecol 1990; 97: 922–929.
  13. Timmerman D, Van Calster B, Jurkovic D, Valentin L, Testa A, Bernard J, Van Holsbeke C, Van Huffel S, Vergote I, Bourne T. Inclusion of CA-125 does not improve mathematical models developed to distinguish between benign and malignant adnexal tumors. J Clin Oncol 2007; 25: 4194–4200.
  14. Van Calster B, Timmerman D, Bourne T, Testa A, Van Holsbeke C, Domali E, Jurkovic D, Neven P, Van Huffel S, Valentin L. Discrimination between benign and malignant adnexal masses by specialist ultrasound examination versus serum CA-125. J Natl Cancer Inst 2007; 99: 1706–1714.
  15. Timmerman D, Testa AC, Bourne T, Ferrazzi E, Ameye L, Konstantinovic ML, Van Calster B, Collins WP, Vergote I, Van Huffel S, Valentin L. Logistic regression model to distinguish between the benign and malignant adnexal mass before surgery: a multicenter study by the International Ovarian Tumor Analysis Group. J Clin Oncol 2005; 23: 8794–8801.
  16. Timmerman D, Valentin L, Bourne TH, Collins WP, Verrelst H, Vergote I. Terms, definitions and measurements to describe the sonographic features of adnexal tumors: a consensus opinion from the International Ovarian Tumor Analysis (IOTA) Group. Ultrasound Obstet Gynecol 2000; 16: 500–505.
  17. Kenemans P, van Kamp GJ, Oehr P, Verstraeten RA. Heterologous double-determinant immunoradiometric assay CA 125 II: reliable second-generation immunoassay for determining CA 125 in serum. Clin Chem 1993; 39: 2509–2513.
  18. Heintz AP, Odicino F, Maisonneuve P, Beller U, Benedet JL, Creasman WT, Ngan HY, Pecorelli S. Carcinoma of the ovary. Int J Gynaecol Obstet 2003; 83: 135–166.
  19. Newcombe RG. Improved confidence intervals for the difference between binomial proportions based on paired data. Stat Med 1998; 17: 2635–2650.
  20. Miettinen OS, Nurminen M. Comparative analysis of two rates. Stat Med 1985; 4: 213–226.
  21. Valentin L, Ameye L, Jurkovic D, Metzger U, Lecuru F, Van Huffel S, Timmerman D. Which extrauterine pelvic masses are difficult to correctly classify as benign or malignant on the basis of ultrasound findings and is there a way of making a correct diagnosis? Ultrasound Obstet Gynecol 2006; 27: 438–444.
  22. European Federation of Societies in Ultrasound in Medicine and Biology (EFSUMB). Minimum training recommendations for the practice of medical ultrasound. Ultraschall Med 2005; 26: 79–105.
  23. Bonfrer J, Baan A, Jansen E, Lentfer D, Kenemans P. Technical evaluation of three second generation CA 125 assays. Eur J Clin Chem Clin Biochem 1994; 32: 201–207.
  24. Davelaar E, van Kamp G, Verstraeten R, Kenemans P. Comparison of seven immunoassays for the quantification of CA 125 antigen in serum. Clin Chem 1998; 44: 1417–1422.
  25. Vergote I, De Brabanter J, Fyles A, Bertelsen K, Einhorn N, Sevelda P, Gore ME, Kaern J, Verrelst H, Sjövall K, Timmerman D, Vandewalle J, Van Gramberen M, Tropé CG. Prognostic importance of degree of differentiation and cyst rupture in stage I invasive epithelial ovarian carcinoma. Lancet 2001; 357: 176–182.
  26. Yazbek J, Raju SK, Ben-Nagi J, Holland TK, Hillaby K, Jurkovic D. Effect of quality of gynaecological ultrasonography on management of patients with suspected ovarian cancer: a randomised controlled trial. Lancet Oncol 2008; 9: 124–131.
  27. Sladkevicius P, Jokubkiene L, Valentin L. Contribution of the morphological assessment of the vessel tree by three-dimensional ultrasound to a correct diagnosis of malignancy in adnexal masses. Ultrasound Obstet Gynecol 2007; 30: 874–882.
  28. Testa AC, Timmerman D, Exacoustos C, Fruscella E, Van Holsbeke C, Bokor D, Arduini D, Scambia G, Ferrandina G. The role of CnTI-SonoVue in the diagnosis of ovarian masses with papillary projections: a preliminary study. Ultrasound Obstet Gynecol 2007; 29: 512–516.

Supporting Information

Filename | Size | Description
TableS1.doc | 86K | Table S1: Diagnostic performance of pattern recognition and serum CA 125 depending on the confidence of the ultrasound examiner in premenopausal patients.
TableS2.doc | 87K | Table S2: Diagnostic performance of pattern recognition and serum CA 125 depending on the confidence of the ultrasound examiner in postmenopausal patients.
TableS3.doc | 46K | Table S3: Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate a benign diagnosis for a benign diagnosis to be made, in tumors considered to be certainly benign or certainly malignant by the ultrasound examiner.
TableS4.doc | 46K | Table S4: Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate a benign diagnosis for a benign diagnosis to be made, in tumors considered to be probably benign or probably malignant by the ultrasound examiner.
TableS5.doc | 46K | Table S5: Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate a benign diagnosis for a benign diagnosis to be made, in tumors where the ultrasound examiner was completely uncertain whether the tumor was benign or malignant.
TableS6.doc | 45K | Table S6: Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate malignancy for a malignant diagnosis to be made, in tumors considered certainly benign or certainly malignant by the ultrasound examiner.
TableS7.doc | 44K | Table S7: Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate malignancy for a malignant diagnosis to be made, in tumors considered probably benign or probably malignant by the ultrasound examiner.
TableS8.doc | 47K | Table S8: Diagnostic performance when CA 125 is used as a second-stage test after ultrasound, i.e. requiring the results of both pattern recognition and CA 125 to indicate malignancy for a malignant diagnosis to be made, in tumors where the ultrasound examiner was completely uncertain whether the tumor was benign or malignant.

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.