Consensus Statement on preoperative diagnosis of ovarian tumors

The European Society of Gynaecological Oncology (ESGO), the International Society of Ultrasound in Obstetrics and Gynecology (ISUOG), the International Ovarian Tumour Analysis (IOTA) group and the European Society for Gynaecological Endoscopy (ESGE) jointly developed clinically relevant and evidence-based statements on the preoperative diagnosis of ovarian tumors, including imaging techniques, biomarkers and prediction models.

KEYWORDS: assessment of different neoplasias in the adnexa, assessment of different neoplasias in the adnexa masses, assessment of different neoplasias in the adnexa model, benign ovarian masses, benign ovarian tumours, beta-human chorionic gonadotropin, biomarker, borderline tumours, carbohydrate antigen 19.9, carbohydrate antigen 125, carcinoembryonic antigen, cell-free deoxyribonucleic acid, circulating cancer cells, circulating cell-free deoxyribonucleic acid, circulating free deoxyribonucleic acid, circulating tumour cells, circulating tumour deoxyribonucleic acid, clinical routine, computed tomography, consensus statement, daily practice, diagnosis, diagnostic performance, diagnostic models, diffusion-weighted imaging, diffusion-weighted magnetic resonance imaging, dynamic contrast-enhanced magnetic resonance imaging, expert ultrasound examiners, first line test, functional sequences, gynecology imaging reporting and data system, human epididymis protein, imaging, imaging methods, immunohistochemical diagnosis, inhibin, international ovarian tumor analysis, international ovarian tumor analysis methods, international ovarian tumor analysis rules, intraoperative ultrasound, investigations, logistic regression 1 test, logistic regression 2 test, magnetic resonance imaging, malignant ovarian masses, malignant ovarian tumours, marker, maximum standardized uptake value, molecular biology, molecular marker, morphological scoring system, multivariate analysis, ovarian cancer, ovarian masses, ovarian tumours, ovary, positron emission tomography, positron emission tomography-computed tomography, pre-operative characterization, pre-operative diagnosis, prognostic factor, prognostic value, protein biomarker, risk factors, risk of malignancy score, risk of malignancy index, risk of ovarian malignancy algorithm, scoring system, screening test, secondary metastatic tumours, second line test, simple rules, simple rules risk, simple rules risk model, single protein biomarker, standardized uptake value, suspected malignancy, suspected metastatic tumour, test performances, threshold risk, transabdominal ultrasound, transvaginal ultrasound, tumour markers, ultrasonography


INTRODUCTION
The accurate characterization of newly diagnosed adnexal lesions is of paramount importance to define appropriate treatment pathways. Patients with masses that are suspicious for malignancy should be referred to a gynecological oncology center, in order to receive specialist care, as per the definitions of the European Society of Gynaecological Oncology (ESGO) 1 and national and international recommendations and guidelines. For a non-gynecological primary tumor, patients need to be referred to an appropriate specialist, while patients with benign lesions may be followed up and treated conservatively or may be suitable for less radical surgical treatment, depending on the clinical context [2][3][4][5][6][7] . Treatment decision-making processes should be based on a combination of the patient's overall clinical picture, symptoms, preferences, previous medical and surgical history, tumor markers and clinical and radiological findings. A single diagnostic modality alone should not determine the patient's journey.
The ESGO, the International Society of Ultrasound in Obstetrics and Gynecology (ISUOG), the International Ovarian Tumour Analysis (IOTA) group and the European Society for Gynaecological Endoscopy (ESGE) have, jointly, developed clinically relevant and evidence-based statements on the preoperative diagnosis of ovarian tumors and assessment of disease spread, including imaging techniques, biomarkers and predictive models. Neither screening and follow-up modalities, nor economic analysis of the imaging techniques, biomarkers and prediction models addressed herein, are included within the remit of this Consensus Statement.

approaches for the preoperative diagnosis of ovarian tumors and assessment of disease spread, based on the available literature and evidence. Any clinician applying or consulting these statements is expected to use independent medical judgment in the context of individual clinical circumstances to determine all patients' care and treatment. These statements are presented without any warranty regarding their content, use or application and the authors disclaim any responsibility for their application or use in any way.

METHODS
This Consensus Statement on the preoperative diagnosis of ovarian tumors and assessment of disease spread was developed using an eight-step process, chaired by Professors Christina Fotopoulou and Dirk Timmerman (Figure 1). Aiming to assemble a multidisciplinary international group, ESGO/ISUOG/IOTA/ESGE nominated 19 practising clinicians and researchers who have demonstrated leadership and expertise in the preoperative diagnosis of ovarian tumors and clinical management of ovarian cancer patients through research, administrative responsibilities, and/or committee membership (including eight members of ESGO, five members of ISUOG, four members of IOTA and two members of ESGE). These experts included seven gynecologists with special interest in ultrasonography, two radiologists and 10 gynecological oncologists. They did not represent the societies from which they were selected, and were asked to base their decisions on their own experience and expertise. Also included in the group was a patient representative, who is Chair of the Clinical Trial Project of the European Network of Gynaecological Cancer Advocacy Groups (ENGAGe). An initial conference call, including the whole group, was held to facilitate introductions, as well as to review the purpose and scope of this Consensus Statement.
To ensure that the statements were evidence-based, the current literature was reviewed and critically appraised. Thus, a systematic literature review of relevant studies published between 1 May 2015 and 1 May 2020 was carried out using the MEDLINE database (Appendix 1). The literature search was limited to publications in the English language. Priority was given to high-quality systematic reviews, meta-analyses and validating cohort studies, although studies with lower levels of evidence were also evaluated. The search strategy excluded editorials, letters and case reports. The reference list of each identified article was reviewed for other potentially relevant articles. Final results of the literature search were distributed to the whole group, including electronic full-text versions of each article. F. Planchamp provided the methodology and medical writing support for the entire process, and did not participate in voting for statements.
The chairs were responsible for drafting preliminary statements based on the review of the relevant literature. These were then sent to the multidisciplinary international group prior to a second conference call. During this conference call, the whole group discussed each preliminary statement and a first round of binary voting (agree/disagree) was carried out for each potential statement. All 20 participants took part in each vote, but they were permitted to abstain from voting if they felt they had insufficient expertise to agree/disagree with the statement or if they had a conflict of interest that could be considered to influence their vote. Statements were removed when a consensus among group members was not obtained. The voters had the opportunity to provide comments/suggestions with their votes. The chairs then discussed the results of this first round of voting and revised the statements if necessary. The voting results and the revised version of the statements were again sent to the whole group and another round of binary voting was organized, according to the same rules, to allow the whole group to evaluate the revised version of the statements. The statements were finalized based on the results of this second round of voting. The group achieved consensus on 18 statements. In this Consensus Statement, we present a summary of the supporting evidence, the finalized series of statements, and their levels of evidence and grades.
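As a purely illustrative sketch, and not part of the consensus methodology itself, the tallying of one round of voting can be expressed as follows. The function name `tally_votes` and the vote labels are assumptions made for this example; percentages are computed over all participants, including abstentions, which is the form in which voting results are reported with the statements.

```python
from collections import Counter

VALID_VOTES = ("yes", "no", "abstain")  # labels assumed for illustration


def tally_votes(votes):
    """Return {option: (count, percentage of all participants)} for one round."""
    if not votes:
        raise ValueError("at least one vote is required")
    unknown = set(votes) - set(VALID_VOTES)
    if unknown:
        raise ValueError(f"unexpected vote labels: {sorted(unknown)}")
    counts = Counter(votes)
    total = len(votes)
    # Abstentions stay in the denominator, so the three percentages sum to ~100.
    return {opt: (counts[opt], round(100 * counts[opt] / total)) for opt in VALID_VOTES}
```

For a round with 12 'yes' votes, two 'no' votes and six abstentions among 20 participants, this gives 60%, 10% and 30%, respectively.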

assessment studies addressing all types of adnexal tumor, as this is a better reflection of clinical reality.

Ultrasonography
A transvaginal ultrasound examination is often regarded in clinical practice as the standard first-line imaging investigation for the assessment of adnexal pathology [8][9][10][11] . The diagnostic accuracy of ultrasonography in differentiating between benign and malignant adnexal masses has been shown to relate to the expertise of the operator [12][13][14] . The European Federation of Societies for Ultrasound in Medicine and Biology has published minimum training requirements for gynecological ultrasound practice in Europe, including standards for theoretical knowledge and practical skills 15 . These identify three levels of training and expertise. Thus, Level III (expert) can be attributed to a practitioner who is likely to spend the majority of their time undertaking gynecological ultrasound and/or teaching, research and development in the field. A Level-II practitioner should have undertaken at least 2000 gynecological ultrasound examinations. The training required to attain this level of practice would usually be gained during a period of expert ultrasound training, which may be within, or after completion of, a specialist training program. To maintain competence at Level II, practitioners should perform at least 500 examinations each year. A Level-I practitioner should have performed a minimum of 300 examinations under the supervision of a Level-II practitioner or an experienced Level-I practitioner with at least 2 years' regular practical experience. To maintain Level-I status, the practitioner should perform at least 300 examinations each year. 
A prospective randomized controlled trial to assess the effect of the quality of gynecological ultrasonography on the management of patients with suspected ovarian cancer has demonstrated that women with a Level-III (expert) ultrasound examination undergo significantly fewer unnecessary major procedures and have a shorter inpatient hospital stay compared with those having a Level-II (routine) examination by a sonographer 14 .
Subjective assessment by expert ultrasound examiners has excellent performance to distinguish between benign and malignant ovarian tumors [9][10][11][12][13][14] . In many cases, expert examiners should be able to narrow the diagnosis down further, to a specific histological subtype. The typical pathognomonic ultrasound features of some key histological types have been published in the series, 'Imaging in gynecological disease', in Ultrasound in Obstetrics and Gynecology (https://obgyn.onlinelibrary.wiley.com/doi/toc/10.1002/(ISSN)1469-0705.IMAGINGINGYNECOLOGICALDISEASE). The most common and typical findings for each pathology are summarized in Table 1.

Risk of malignancy index (RMI) and risk of ovarian malignancy algorithm (ROMA)
Several attempts have been made to develop more objective ultrasound-based approaches for discriminating between benign and malignant adnexal tumors. These include the risk of malignancy index (RMI), a scoring system based on menopausal status, a transvaginal ultrasound score and serum cancer antigen 125 (CA 125) level 16 . Many studies have demonstrated the diagnostic performance of the RMI in classifying adnexal masses 11,[17][18][19][20][21][22][23][24][25][26][27][28][29] . Three variants of the RMI (RMI-II, RMI-III, RMI-IV) have been developed, but these offer no significant additional diagnostic advantage compared with the original version (RMI-I) 11,22,27,28 . Moore et al. 30 developed an algorithm, the risk of ovarian malignancy algorithm (ROMA), based on both CA 125 and human epididymis protein 4 (HE4). Westwood et al. 18 pooled data comparing the ROMA with the RMI-I to guide referral decisions for women with suspected ovarian cancer and found similar performance if women with borderline tumors and non-epithelial cancers were excluded from the analyses. More recently, another meta-analysis showed a higher specificity of the RMI-I than the ROMA in premenopausal women but a similar performance for detecting ovarian cancer in postmenopausal women presenting with an adnexal mass 17 . Limitations of the RMI are the absence of an estimated risk of malignancy, and its considerable dependence on serum CA 125, the latter resulting in a relatively low sensitivity for early-stage invasive and borderline disease, especially in premenopausal women 31,32 (see Tumor markers).
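The structure of the RMI can be illustrated with a short sketch. The text above names the three components but does not give the formula; the weights below follow the commonly cited original definition of the RMI-I (ultrasound score U of 0, 1 or 3 for zero, one, or two or more suspicious features; menopausal score M of 1 or 3; product with serum CA 125 in U/mL). They are stated here as an assumption to be checked against the primary reference 16, and the function name `rmi_1` is hypothetical.

```python
def rmi_1(n_ultrasound_features: int, postmenopausal: bool, ca125_u_per_ml: float) -> float:
    """Sketch of the RMI-I as commonly cited: U x M x CA 125 (assumed weights)."""
    if not 0 <= n_ultrasound_features <= 5:
        raise ValueError("the ultrasound score counts between 0 and 5 features")
    if ca125_u_per_ml < 0:
        raise ValueError("CA 125 cannot be negative")
    # U: 0 features -> 0, 1 feature -> 1, 2 or more features -> 3
    u = 0 if n_ultrasound_features == 0 else 1 if n_ultrasound_features == 1 else 3
    # M: premenopausal -> 1, postmenopausal -> 3
    m = 3 if postmenopausal else 1
    return u * m * ca125_u_per_ml
```

The output is a score, not a probability, and it scales linearly with CA 125, which illustrates both limitations noted above: a mass with no scored ultrasound feature yields an index of 0 whatever the CA 125 level, and a normal CA 125 level keeps the index low whatever the ultrasound findings.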

IOTA methods
To homogenize and standardize the quality, description and evaluation of ultrasonography across different centers, and thereby increase diagnostic accuracy, the IOTA group first published a consensus paper on terms and definitions to describe adnexal lesions in 2000 33 . Using this standardized methodology, the IOTA group has developed different prediction models based on logistic regression analysis [34][35][36] . In a large-scale external validation study, Van Holsbeke et al. 37 showed that the IOTA logistic regression models 1 (LR1, with 12 variables) and 2 (LR2, with six variables) outperformed 12 other models, including the RMI. The LR2 model was easier to use than the LR1 model. Demonstrating the standardization and reproducibility of the IOTA models, Sayasneh et al. 38 showed that even less-experienced sonographers are able to differentiate accurately between benign and malignant ovarian masses using the IOTA LR1 model. The IOTA group also developed 'Simple Rules' that may be applied to a mass based on the presence or absence of five benign and five malignant ultrasound features. These rules can be applied to about 80% of adnexal masses, with the rest being classed as inconclusive. They have now been broadly accepted and are widely used in clinical practice [38][39][40][41][42][43][44][45][46] . More recently, a logistic regression model based on the ultrasound features of the original Simple Rules was developed, i.e. the Simple Rules risk model. This model is able to provide an individual estimated risk of malignancy for any type of lesion 35 . A summary of the main models and scoring systems for the preoperative diagnosis of ovarian tumors is presented in Table 2.
As many ovarian masses can be recognized relatively easily, the IOTA group also proposed four 'Simple Descriptors' of the features typical of common benign lesions and two suggestive of malignancy, which can give an 'instant diagnosis' and reflect the pattern recognition that is a key part of ultrasonography. These are applicable to about 43% of adnexal masses 47 . A three-step strategy, consisting of the sequential use of Simple Descriptors, Simple Rules and subjective assessment by an expert, had high accuracy for discriminating between benign and malignant adnexal lesions 47 . A systematic review and meta-analysis reported better performance of the IOTA Simple Rules and the IOTA LR2 model compared with all other scoring systems, including the RMI 48 . Besides confirming these findings, another meta-analysis highlighted that a two-step approach, with the IOTA Simple Rules as the first step and subjective assessment by an expert for inconclusive tumors as the second step, matched the test performance of expert ultrasound examiners 11 . The IOTA Simple Rules have been integrated into several national clinical guidelines for the evaluation and management of adnexal masses 49,50 and they were considered the main diagnostic strategy 51 as part of a first international consensus report for the assessment of adnexal masses.
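The two-step approach described above can be sketched in a few lines. The decision logic shown is the usual formulation of the Simple Rules (malignant if at least one malignant feature and no benign feature is present, benign in the converse situation, inconclusive otherwise); it is stated here as an assumption, since the text does not spell it out. The function names and the `expert_assessment` callback are hypothetical.

```python
from typing import Callable


def simple_rules(n_benign_features: int, n_malignant_features: int) -> str:
    """Classify a mass from counts of benign (B) and malignant (M) features."""
    for n in (n_benign_features, n_malignant_features):
        if not 0 <= n <= 5:
            raise ValueError("each feature count must lie between 0 and 5")
    if n_malignant_features > 0 and n_benign_features == 0:
        return "malignant"
    if n_benign_features > 0 and n_malignant_features == 0:
        return "benign"
    # Both kinds of feature present, or none at all.
    return "inconclusive"


def two_step(n_benign_features: int, n_malignant_features: int,
             expert_assessment: Callable[[], str]) -> str:
    """Step 1: Simple Rules. Step 2: expert opinion for inconclusive masses only."""
    result = simple_rules(n_benign_features, n_malignant_features)
    return expert_assessment() if result == "inconclusive" else result
```

Passing the expert opinion as a callback reflects the design of the strategy: subjective assessment by an expert is requested only for the masses that the rules cannot classify.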

A randomized controlled trial assessing surgical intervention rates and the oncologic safety of decision-making processes using an RMI-based protocol developed by the British Royal College of Obstetricians and Gynaecologists (RCOG) vs triage using the IOTA Simple Rules 52 showed that the IOTA protocol resulted in lower surgical intervention rates compared with the RMI-based RCOG protocol. The IOTA Simple Rules did not result in more cases in which a diagnosis of cancer was delayed. It was found that the addition of biomarkers such as serum CA 125 and HE4 when using the IOTA Simple Rules, with or without subjective assessment by an expert sonographer, offered no additional diagnostic advantage for the characterization of ovarian masses, but was more costly than a three-step strategy based on the sequential use of the IOTA Simple Descriptors, Simple Rules and expert evaluation 53,54 .
The IOTA group have also developed the Assessment of Different NEoplasias in the adneXa (ADNEX) model. This multiclass prediction model is the first risk model to differentiate between benign and malignant tumors, whilst also offering subclassification of any malignancy into borderline tumors, Stage-I and Stage-II-IV primary cancers and secondary metastatic tumors. The IOTA ADNEX model was developed and validated using parameters collected by experienced ultrasound examiners 36 . Several

GI-RADS
The Gynecologic Imaging Reporting and Data System (GI-RADS) was first introduced by Amor et al. 64 in 2009 and was validated prospectively by the same team in a multicenter study 2 years later 65 . This reporting system quantifies the risk of malignancy into five categories: GI-RADS 1, definitively benign (estimated probability of malignancy (EPM) = 0%); GI-RADS 2, very probably benign (EPM < 1%); GI-RADS 3, probably benign (EPM = 1-4%); GI-RADS 4, probably malignant (EPM = 5-20%); and GI-RADS 5, very probably malignant (EPM > 20%). More recently, several studies have demonstrated the value of the GI-RADS system for the assessment of malignant adnexal masses in women who are candidates for surgical intervention. Furthermore, the addition of GI-RADS to CA 125 improves the identification of adnexal masses at high risk of malignancy compared with using CA 125 alone [66][67][68][69][70][71] .
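For reference, the five GI-RADS categories listed above can be held in a small lookup table. This is only a restatement of the categories and EPM ranges given in the text; the names `GiRadsCategory`, `GI_RADS` and `describe_gi_rads` are hypothetical.

```python
from typing import NamedTuple


class GiRadsCategory(NamedTuple):
    label: str
    epm: str  # estimated probability of malignancy, as quoted in the text


GI_RADS = {
    1: GiRadsCategory("definitively benign", "0%"),
    2: GiRadsCategory("very probably benign", "< 1%"),
    3: GiRadsCategory("probably benign", "1-4%"),
    4: GiRadsCategory("probably malignant", "5-20%"),
    5: GiRadsCategory("very probably malignant", "> 20%"),
}


def describe_gi_rads(category: int) -> str:
    """Return a one-line description of a GI-RADS category (1 to 5)."""
    if category not in GI_RADS:
        raise ValueError("GI-RADS category must be an integer from 1 to 5")
    entry = GI_RADS[category]
    return f"GI-RADS {category}: {entry.label} (EPM {entry.epm})"
```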

O-RADS
The Ovarian-Adnexal Reporting and Data System (O-RADS) lexicon for ultrasound was published in 2018, providing a standardized glossary that includes all appropriate descriptors and definitions of the characteristic ultrasound appearance of normal ovaries and various adnexal lesions 72,73 . The O-RADS ultrasound working group developed an adnexal-mass triage system based either on the O-RADS descriptors or on the risk of malignancy assigned to the mass using the IOTA ADNEX model to classify ovarian tumors into different risk categories 74 . However, to date, neither the triage system nor the O-RADS descriptors have been externally validated. Basha et al. 75 determined the malignancy rates, validity and reliability of the O-RADS approach when applied to a database of 647 adnexal masses collected before the development of the O-RADS system. In this retrospective study, the O-RADS system had significantly higher sensitivity than did the GI-RADS system and the IOTA Simple Rules, with a non-significant slightly lower specificity compared with both GI-RADS and IOTA Simple Rules, and with similar reliability.

Statements on ultrasonography (Statements 1-6)
1. Subjective assessment by expert (Level-III) ultrasound examiners has the best performance to distinguish between benign and malignant ovarian tumors.

Tumor markers
According to a systematic quantitative review assessing the accuracy of CA 125 level in the diagnosis of benign, borderline and malignant ovarian tumors, CA 125 is the best available single-protein biomarker identified to date 76 . Although it lacks sensitivity and specificity for early stages of the disease and has a relatively low specificity overall, it can help direct treatment options in patients with suspicious ovarian masses. Pooled analyses have highlighted that a high body mass index and ethnicity might influence CA 125 levels, representing an additional diagnostic challenge 77 . Other factors that influence CA 125 levels are the age of the patient, pregnancy, inflammatory processes and the presence of fibroids or endometriosis [77][78][79][80] .

Statements on tumor markers (Statements 7-12)
7. CA 125 is the best single-protein biomarker for the preoperative characterization of ovarian tumors. However, it is not useful as a screening test for ovarian cancer.
- Level of evidence: 2b
- Grade of statement: B
- Consensus: yes, 60% (n = 12); no, 10% (n = 2); abstain, 30% (n = 6)

10. CA 125 is helpful as a biomarker in cases of suspected malignancy and it helps to distinguish between subtypes of malignant tumors, such as borderline and early- and advanced-stage primary ovarian cancers and secondary metastatic tumors.

Magnetic resonance imaging
Several reports have found that magnetic resonance imaging (MRI), alone or in combination with computed tomography (CT), predicts accurately the presence of peritoneal carcinomatosis in patients undergoing preoperative evaluation for cytoreductive surgery, particularly when the assessment is carried out by an experienced radiologist [114][115][116][117] . Recently, a prospective study reported higher specificity of the IOTA LR2 model compared with subjective interpretation of MRI findings by an experienced radiologist, as well as similar sensitivities for both imaging modalities for discriminating between benign and malignant tumors 118 . The addition of diffusion-weighted techniques to conventional imaging modalities has been shown in multiple pooled studies to increase diagnostic accuracy in discriminating between benign tumors and ovarian cancer, especially in the Caucasian population, with data even suggesting a value in predicting resectability [119][120][121][122][123] . However, the true extent of such a benefit needs to be validated further in multicenter, large-scale prospective randomized studies, which are currently being designed or underway 121 . The addition of quantitative dynamic contrast-enhanced MRI to diffusion-weighted imaging and anatomical MRI sequences and the development of a 5-point scoring system (O-RADS MRI score) is another modern diagnostic development with promising potential for the differentiation between benign and malignant adnexal masses in cases in which ultrasound is unable to arrive at a clear diagnosis (i.e. indeterminate masses). When this technique is enhanced with volume quantification, it can help to discriminate between Type-I and Type-II epithelial ovarian cancers [124][125][126][127][128][129][130] . However, there are only limited data available on the impact of these modern MRI techniques on clinical decision-making and further studies are needed, with larger sample populations 131 .

Computed tomography
Dedicated multidetector CT protocols with standardized peritoneal carcinomatosis index forms are the most common diagnostic tool used in routine clinical practice to assess the extent of tumor dissemination and the presence of peritoneal carcinomatosis [132][133][134][135][136] . A radiological peritoneal carcinomatosis index applied at preoperative CT within an expert setting has been shown to have low performance scores as a triage test to identify patients who are likely to have complete cytoreduction to no macroscopic residual disease 137 . On retrospective analysis, preoperative CT imaging showed high specificity but rather low sensitivity in detecting tumor involvement at key sites in ovarian cancer surgery 136 . Multiple studies that have attempted to cross-validate the accuracy of CT scans in predicting unresectable disease and incomplete cytoreduction have shown a substantial drop in accuracy rates when attempts have been made to validate them in other cohorts [138][139][140][141][142][143][144][145] . Thus, CT should not be used as the sole tool to predict the resectability of peritoneal carcinomatosis and exclude patients from surgery; rather, the full clinical context should be taken into account. Its widespread availability makes CT useful as a first-line diagnostic tool to identify patients who should not be selected for cytoreductive surgery, such as those with large/multifocal intraparenchymatous distant metastases, acute thromboembolic events or secondary metastatic tumors that limit the prognosis. The role of radiomics as an additional quantitative mathematical segmentation of conventional preoperative CT images has shown some promising results in preliminary studies; however, larger studies are necessary for validation before this technique is implemented in clinical practice 146 .

Positron emission tomography-computed tomography
Positron emission tomography-computed tomography (PET-CT) may be useful in differentiating malignant from borderline or benign ovarian tumors, with the limitation that its diagnostic performance can be impacted negatively by certain tumor histological subtypes, due to the lower fluorodeoxyglucose uptake in clear-cell and mucinous invasive subtypes [147][148][149][150][151][152] . PET-CT can also play a role as an additional technique in the diagnosis of lymph-node metastases, especially outside the abdominal cavity, or in characterizing unclear lesions in key areas that would alter clinical management, for example chest lesions [153][154][155] . However, PET-CT does not seem to be a relevant additional diagnostic modality for the true extent of peritoneal spread of ovarian cancer, specifically bowel and mesenteric serosa, and therefore fails to predict resectability in those key sites, especially in the presence of low-volume disease 156 . Furthermore, PET-CT has been shown to have a low diagnostic value in differentiating borderline from benign tumors and should therefore not be used in clinical decision-making processes in that context, especially when considering fertility-sparing procedures 147,148,152 .

Statements on MRI, CT and PET-CT (Statements 13-17)
13. MRI with the inclusion of the functional sequences, dynamic contrast-enhanced and diffusion-weighted MRI, is not a first-line tool but may be used as a second-line tool after ultrasonography to further differentiate between benign, malignant and borderline masses.

Circulating cell-free DNA and circulating tumor cells
Circulating cell-free DNA and circulating tumor cells as non-invasive cancer biomarkers and in non-invasive biopsy (sometimes called 'liquid biopsy') have been investigated in multiple studies [157][158][159][160][161][162][163][164][165][166][167][168][169][170] . DNA methylation patterns in cell-free DNA show potential to detect a proportion of ovarian cancers up to 2 years in advance of diagnosis. They may potentially guide personalized treatment, even though validation studies are lacking. The prospective use of novel collection vials, which stabilize blood cells and reduce background DNA contamination in serum/plasma samples, will facilitate the clinical implementation of liquid biopsy analyses 160 .
A prospective evaluation of the potential of cell-free DNA for the diagnosis of primary ovarian cancer using chromosomal instability as a read-out suggested that this might be a promising method to increase the specificity of the presurgical prediction of malignancy in patients with adnexal masses 168 . However, even though these circulating biomarkers play a key role in understanding metastasis and tumorigenesis and provide comprehensive insight into tumor evolution and dynamics during treatment and disease progression, they still have not been established as part of routine clinical practice [157][158][159] .
One meta-analysis suggested that quantitative analysis of cell-free DNA has unsatisfactory sensitivity but acceptable specificity for the diagnosis of ovarian cancer 170 . In a more recent meta-analysis, cell-free DNA appeared to be slightly better than CA 125 and similar to HE4 with respect to its diagnostic ability to discriminate individuals with from those without ovarian cancer 163 . Nevertheless, the diagnostic value of cell-free DNA in ovarian cancer patients remains unclear and the data should be interpreted with caution. Further large-scale prospective studies are strongly recommended to validate the potential applicability of using circulating cell-free DNA, alone or in combination with conventional markers, as a diagnostic biomarker for ovarian cancer, and to explore potential factors that may influence the accuracy of ovarian cancer diagnosis 170 .

18. Circulating cell-free DNA and circulating tumor cells should not yet be used in routine clinical practice to differentiate between benign and malignant ovarian masses.

OVERVIEW OF CONSENSUS
The experts also reached a consensus on a flowchart describing steps recommended to distinguish between benign and malignant tumors (Figure 2) and to direct patients towards appropriate treatment pathways. Ultrasonography is recommended as a first step to stratify patients with symptoms suggestive of an adnexal mass, and in those with an incidental finding of an adnexal mass on imaging. If the scan rules out normal ovaries and physiological changes (i.e. rules out O-RADS 1), the IOTA ADNEX model could be applied as a next step in order to determine the risk of malignancy. Any ultrasonographic examination in the case of a suspected ovarian mass should be performed by an expert sonographer. The resulting classification of the lesion into one of the O-RADS categories (2-5) can further guide the management and selection of patients for referral to a dedicated gynecological oncology center. A consensus was also reached on further steps necessary to differentiate between subgroups of malignancy and extent of disease within gynecological oncology centers (Figure 3). Ultrasound assessment by an expert or application of the IOTA ADNEX model in combination with the tumor marker profile (CA 125 and CEA, complemented with other markers in specific cases) can often indicate the specific subtype of malignancy. If available, diagnosis of the primary lesion can be confirmed with diffusion- and perfusion-weighted MRI, especially in cases in which fertility-sparing surgery is considered. A CT scan of chest, abdomen and pelvis is mandatory before planned surgery for presumed malignancy, in order to exclude secondary cancers, thromboembolic events, and multifocal intraparenchymal distant metastases that would preclude operability.
The final management and treatment journey of the patient should be determined within an expert multidisciplinary setting, taking into account both the diagnostic findings and the overall patient profile, including symptoms, patient preferences and prior surgical, medical and reproductive history, with the ultimate aim of defining an individualized approach for every patient.
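The flowchart step in which an estimated risk of malignancy is translated into an O-RADS category (2-5) can be sketched as below. The risk thresholds (< 1%, 1% to < 10%, 10% to < 50%, ≥ 50%) are not stated in this text; they are assumed from the O-RADS ultrasound risk stratification system (reference 74) and should be verified against that publication before any use. The function name is hypothetical, and the risk value is taken to come from a model such as IOTA ADNEX after normal ovaries and physiological changes (O-RADS 1) have been ruled out.

```python
def orads_category_from_risk(risk_of_malignancy: float) -> int:
    """Map an estimated risk of malignancy (0.0 to 1.0) to O-RADS 2-5.

    Thresholds are an assumption taken from the O-RADS ultrasound
    publication, not from this Consensus Statement.
    """
    if not 0.0 <= risk_of_malignancy <= 1.0:
        raise ValueError("risk must be expressed as a fraction between 0 and 1")
    if risk_of_malignancy < 0.01:
        return 2  # almost certainly benign
    if risk_of_malignancy < 0.10:
        return 3  # low risk
    if risk_of_malignancy < 0.50:
        return 4  # intermediate risk
    return 5      # high risk
```

Such a mapping only summarizes the imaging step; as stated above, the final management is determined within an expert multidisciplinary setting and not by a single diagnostic modality.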

ACKNOWLEDGMENTS
The authors thank ESGO, ISUOG, IOTA and ESGE for their support. We wish also to express sincere gratitude to Maciej Malecki (University Hospital Leuven, Leuven, Belgium) for providing technical support during the conference call.

Funding
All costs relating to the development process were covered from ESGO, ISUOG, IOTA and ESGE funds. There was no external funding of the development process or manuscript production.

Levels of evidence
1a: Systematic review (with homogeneity) of Level-1 diagnostic studies; or clinical decision rule with Level-1b studies from different clinical centers
1b: Validating cohort study with good reference standards; or clinical decision rule tested within one clinical center
1c: Absolute SpPins and SnNouts*
2a: Systematic review (with homogeneity) of Level > 2 diagnostic studies
2b: Exploratory cohort study with good reference standards; or clinical decision rule after derivation, or validated only on split-sample or databases
3a: Systematic review (with homogeneity) of studies Level ≥ 3b
3b: Non-consecutive study; or without consistently applied reference standards
4: Case-control study, poor or non-independent reference standard
5: Expert opinion without explicit critical appraisal, or based on physiology, bench research or 'first principles'
*'Absolute SpPin' is a diagnostic finding whose specificity is so high that a positive result rules in the diagnosis; 'Absolute SnNout' is a diagnostic finding whose sensitivity is so high that a negative result rules out the diagnosis.

Code (quality of evidence): definition
A (High): Further research is very unlikely to change our confidence in the estimate of effect.
• Several high-quality studies with consistent results
• In special cases: one large, high-quality multicenter trial
B (Moderate): Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
• One high-quality study
• Several studies with some limitations
C (Low): Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
• One or more studies with severe limitations
D (Very low): Any estimate of effect is very uncertain.
• Expert opinion
• No direct research evidence
• One or more studies with very severe limitations
Note: A minus sign '-' may be added to denote evidence that fails to provide a conclusive answer because it is either (a) a single result with a wide confidence interval; or (b) a systematic review with considerable heterogeneity. Such evidence is inconclusive, and therefore can only generate Grade D recommendations.