Diagnostic performance of the EU TI‐RADS and ACR TI‐RADS scoring systems in predicting thyroid malignancy

Abstract
Introduction: Several ultrasound scoring systems have been developed to stratify the risk of malignancy of thyroid nodules, including ACR (American College of Radiology) and EU (European) TI‐RADS. This study aimed to assess the diagnostic performance of these two classifications using histology as a reference standard.
Methods: This was a single‐centre, retrospective study including 156 patients who underwent thyroidectomy. Ultrasound data of 198 nodules (99 malignant nodules and 99 benign nodules) were analysed. Both classifications were applied to all nodules.
Results: Ultrasound criteria associated with malignancy were solid composition (OR=7.81; p < 10−3), hypoechoic character (OR=16.42; p < 10−3), irregular contours (OR=7.47; p < 10−3), taller‐than‐wide shape (OR=3.58; p = .02), microcalcifications (OR=3.02; p = .006) and the presence of cervical adenopathy (OR=3.89; p = .006). The prevalence of malignancy was 15.5%, 69% and 76.9% for EU TI‐RADS categories 3, 4 and 5, respectively. It was 33.3%, 57% and 91.1% for ACR TI‐RADS categories 3, 4 and 5, respectively. For category 5, EU TI‐RADS and ACR TI‐RADS had sensitivities of 60% and 41% and specificities of 82% and 96%, respectively. For categories 4 and 5 combined, the diagnostic performance of the two classification systems became comparable, with a sensitivity of 89% and 86% for EU TI‐RADS and ACR TI‐RADS, respectively. The area under the ROC curve was 0.81 for the EU TI‐RADS classification and 0.82 for the ACR TI‐RADS classification.
Conclusions: EU TI‐RADS and ACR TI‐RADS scoring systems seem to be comparable in predicting malignancy in thyroid nodules.


| INTRODUCTION
A thyroid nodule is defined as a lesion within the thyroid gland that is radiologically distinct from the surrounding thyroid parenchyma. 1 Since the widespread use of cervical ultrasound, thyroid nodules have become a common entity with a prevalence of 60%. 2,3 However, thyroid cancer remains a relatively rare entity accounting for less than 10% of thyroid nodules. 4 Thyroid ultrasound and fine needle aspiration cytology represent the standard of care for evaluating thyroid nodules. The thyroid ultrasound is widely used and recognized as the first tool to characterize thyroid nodules. Several ultrasound scoring systems have been developed to estimate the risk of malignancy and to identify nodules deserving fine needle aspiration cytology.
Among the most recent and recognized Thyroid Imaging Reporting and Data System (TI-RADS) classifications, the American College of Radiology (ACR) proposed an approach with an appropriate lexicon to be used for the ultrasound report, revised and published in 2017. 5 Similarly, the European Thyroid Association (ETA) developed a risk stratification system for thyroid nodules with a practical image guide, published in September 2017. 6 These ultrasound-based risk stratification systems aim to select nodules that warrant cytological diagnosis and to reduce unnecessary surgery, which is not risk-free.
However, there is no consensus on which TI-RADS scoring system performs best. Many studies have demonstrated that both guidelines provide effective stratification of malignancy risk for the diagnosis of thyroid nodules using cytology as a reference. [7][8][9] Few studies, however, have used histology as the reference standard to evaluate the performance of these ultrasound scoring systems. 10,11 This study aimed to evaluate and compare the accuracy and reliability of the EU TI-RADS 2017 and ACR TI-RADS 2017 scoring systems in predicting thyroid malignancy using histology as a reference standard.

| METHODS
This was a retrospective study including 99 benign and 99 malignant nodules in 156 patients who had undergone surgery for thyroid nodules in the Otorhinolaryngology department of La Rabta University Hospital, Tunis, Tunisia, between 2016 and 2020. Patients were included in this study if they had a complete medical file with a physical examination, a thyroid-stimulating hormone (TSH) measurement, a detailed cervical ultrasound report with the morphological criteria necessary to classify nodules according to the EU TI-RADS and ACR TI-RADS scoring systems, the surgical report, and the final histology result. Patients aged less than 18 years were not included.
Age, gender, indication for surgery, TSH level, ultrasound data and histopathology report were collected.
Ultrasound examinations were performed in the department of radiology of La Rabta hospital and nodules were initially classified according to the ACR TI-RADS scoring system. We then applied the EU TI-RADS system. The latest version of EU TI-RADS consists of five categories, each defined by features from the ultrasound examination: EU TI-RADS 1 refers to the absence of a thyroid nodule, and EU TI-RADS 5 comprises nodules presenting at least one of the following high-risk signs of malignancy: nonoval shape, irregular margins, microcalcifications and marked hypoechogenicity. The other three categories correspond to an increasing risk of malignancy. 6 The ACR TI-RADS system is based on the assessment of different ultrasound features of thyroid nodules: composition, echogenicity, shape, margin and echogenic foci. Each of these features is associated with a score ranging from 0 to 3 points. The sum of the assigned points defines the risk of malignancy according to five grades, corresponding to benign, not suspicious, mildly suspicious, moderately suspicious and highly suspicious for malignancy. 5
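The point-summing logic of the ACR TI-RADS system described above can be illustrated with a minimal Python sketch. The function name `acr_tirads_level` is hypothetical, and the point cut-offs between grades are not stated in this article: they are an assumption based on the commonly cited ACR 2017 chart and should be checked against reference 5 before any use.

```python
def acr_tirads_level(composition, echogenicity, shape, margin, echogenic_foci):
    """Map the summed ACR TI-RADS feature points to a grade from 1 to 5."""
    points = (composition, echogenicity, shape, margin, echogenic_foci)
    if any(p < 0 for p in points):
        raise ValueError("feature points must be non-negative")
    total = sum(points)
    # Assumed cut-offs (not given in this article); verify against reference 5.
    if total >= 7:
        return 5  # highly suspicious
    if total >= 4:
        return 4  # moderately suspicious
    if total == 3:
        return 3  # mildly suspicious
    if total == 2:
        return 2  # not suspicious
    return 1      # benign; a total of 1 point is treated as grade 1 by assumption


# Hypothetical nodule with assumed feature points: composition 2,
# echogenicity 2, shape 0, margin 0, echogenic foci 0 (total 4 points).
level = acr_tirads_level(2, 2, 0, 0, 0)
```

In this sketch the hypothetical nodule totals 4 points and is therefore assigned grade 4 (moderately suspicious).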

| Statistical analysis
Statistical analysis was performed using the SPSS software package version 22. Continuous data were expressed as mean ± standard deviation or as ranges. Categorical data were expressed using frequencies and percentages. Student's t-test was used to compare continuous variables and the chi-square test or the Fisher exact test to compare categorical variables. Ultrasound criteria predictive of malignancy were determined by calculating odds ratios (OR).
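As an illustration of the odds ratio calculation, the following Python sketch computes an OR from a 2×2 table. The analysis in this study was run in SPSS; the function name, the counts and the Woolf (log-based) 95% confidence interval are assumptions introduced for the example and are not taken from the study.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log) confidence interval from a 2x2 table.

    a: malignant nodules with the sign     b: benign nodules with the sign
    c: malignant nodules without the sign  d: benign nodules without the sign
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this formula")
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts, not study data: the sign is present in 30 of 99
# malignant nodules and in 10 of 99 benign nodules.
or_value, ci_low, ci_high = odds_ratio(30, 10, 69, 89)
```

An OR above 1 whose interval excludes 1 would suggest that the sign is more frequent in malignant nodules.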
To assess the diagnostic performance of both EU TI-RADS and ACR-TIRADS, receiver operating characteristic (ROC) curves were constructed and the area under the curve (AUC) was calculated.
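For an ordinal score such as a TI-RADS category, the AUC can be understood as the probability that a randomly chosen malignant nodule receives a higher category than a randomly chosen benign nodule, with ties counted as one half. The Python sketch below shows this pairwise (Mann-Whitney) formulation, which gives the same value as the trapezoidal area under the empirical ROC curve. The function name and the category lists are hypothetical and do not reproduce the SPSS procedure or the data of this study.

```python
def auc_from_categories(malignant, benign):
    """AUC as the share of malignant/benign pairs ranked correctly (ties = 0.5)."""
    if not malignant or not benign:
        raise ValueError("both groups must contain at least one nodule")
    score = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                score += 1.0
            elif m == b:
                score += 0.5
    return score / (len(malignant) * len(benign))


# Hypothetical TI-RADS categories for four malignant and four benign nodules.
auc = auc_from_categories([5, 5, 4, 3], [3, 3, 4, 2])
```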
The corresponding sensitivity, specificity, positive predictive value and negative predictive value were calculated. Agreement between the two systems was measured by Cohen's kappa coefficient. p values < 0.05 were considered to indicate statistical significance.
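The four diagnostic indices follow directly from a 2×2 table of test result against histology. The Python sketch below is illustrative only: the function name is hypothetical, and the counts were chosen to be roughly consistent with the category 5 figures reported for EU TI-RADS; they are not extracted from the study tables.

```python
def diagnostic_indices(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from the cells of a 2x2 table."""
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("every row and column total must be non-zero")
    return {
        "sensitivity": tp / (tp + fn),  # positives among malignant nodules
        "specificity": tn / (tn + fp),  # negatives among benign nodules
        "ppv": tp / (tp + fp),          # malignant among positive results
        "npv": tn / (tn + fn),          # benign among negative results
    }


# Illustrative counts (assumption): 99 malignant and 99 benign nodules,
# with "positive" meaning the nodule was classified in category 5.
indices = diagnostic_indices(tp=60, fp=18, fn=39, tn=81)
```

With equal numbers of malignant and benign nodules, as in this series, the predictive values reflect a 50% prevalence and would differ in an unselected population.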

| RESULTS
A total of 198 nodules in 156 patients (127 women and 29 men) were included in this study. All of the patients underwent surgery in the ORL department of our hospital. The youngest patient was 18 years old and the oldest was 80, with a mean age of 47.9 ± 13.9 years.
Preoperative TSH was normal in most patients and suppressed in 13 patients. There was one nodule in 121 patients, two nodules in 27 patients and three or more nodules in 8 patients. The indications for thyroid surgery were compression symptoms (7%), an ultrasound high-risk lesion (64%) or fine needle aspiration cytology class 5 or 6 according to the Bethesda classification (29%). Initial surgery was a total thyroidectomy in 76.3% of cases and a thyroid lobectomy in 23.7% of cases. Of the 156 patients, 89 had at least one malignant nodule and 67 had exclusively benign nodules. Adenomatous hyperplasia was noted in all benign nodules and was associated with lesions of thyroiditis in 30% of cases. Among the malignant nodules, there were 86 papillary carcinomas, five follicular carcinomas, four medullary carcinomas, two oncocytic carcinomas and two undifferentiated carcinomas. Malignant nodules were significantly smaller than benign nodules (23.7 ± 17.2 mm vs. 29.8 ± 13.1 mm; p = .005). Compared with benign nodules, malignant nodules were significantly more likely to be solid or mostly solid, to be hypoechoic or markedly hypoechoic, to have lobulated or irregular margins, to be taller-than-wide, to have microcalcifications and to be associated with suspicious lymph nodes (Table 1). However, on multivariate logistic regression analysis, none of these ultrasound signs was independently associated with the risk of malignancy.
The diagnostic performance of ultrasound signs in distinguishing benign from malignant nodules is detailed in Table 2. While solid composition had the best sensitivity, marked hypoechogenicity and taller-than-wide shape were the most specific ultrasound signs for thyroid malignancy. Table 3 compares the malignancy risk in each category of both TI-RADS classifications, using histology as the reference standard. Each system proposes an estimated risk of malignancy for each category; most of the observed rates fell outside the corresponding theoretical range. The malignancy rate tended to increase with higher risk categories. Table 4 presents the sensitivity, specificity, positive predictive value and negative predictive value of the EU TI-RADS and ACR TI-RADS classifications. In the assessment of concordance between EU TI-RADS and ACR TI-RADS (Table 5), 40.8% of EU TI-RADS category 3 nodules, 90.5% of category 4 nodules and 57.9% of category 5 nodules corresponded to ACR TI-RADS categories 3, 4 and 5, respectively. The mean concordance rate across all categories was 59%. The mean Cohen's kappa coefficient was 0.46, indicating moderate agreement. The highest kappa value was found for categories 4 and 5 of the two classification systems (Cohen's kappa = 0.87).
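Cohen's kappa corrects the raw concordance rate for the agreement expected by chance, which is why a concordance of 59% can correspond to only moderate agreement. A minimal Python sketch of the calculation is given below; the function name and the paired category lists are hypothetical and are not the study data.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two category assignments of the same nodules."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both lists must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: product of the marginal proportions, summed over categories.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical categories assigned to eight nodules by the two systems.
eu_categories = [3, 3, 4, 4, 5, 5, 5, 4]
acr_categories = [3, 4, 4, 4, 5, 5, 4, 4]
kappa = cohens_kappa(eu_categories, acr_categories)
```

In this hypothetical example the raw agreement is 75% (6 of 8 nodules), whereas kappa is lower because part of that agreement is expected by chance.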

| DISCUSSION
Our results affirmed the good performance of both EU TI-RADS and ACR TI-RADS 2017 ultrasound classifications in detecting malignant thyroid nodules.
In addition to clinical assessment, ultrasound examination of the thyroid is a useful tool to characterize nodules and to indicate fine needle aspiration (FNA) cytology, and hence to make the right decision between follow-up and surgical treatment. 12 Reliable and reproducible ultrasound criteria have been defined in recent years to distinguish suspicious nodules from benign ones. 13 The ultrasound features associated with a high risk of malignancy are of variable importance and none of them is sensitive enough to guide clinical decisions when considered separately. However, their combination is more specific. [9][10][11] Our study showed that exclusively solid composition, hypoechogenicity, taller-than-wide shape, irregular margins and microcalcifications were significantly associated with malignancy. These findings were in line with those of many previous studies. 8,[14][15][16][17] The diagnostic performance of each ultrasound criterion was
TABLE 1 Ultrasound features in malignant and benign nodules (univariate analysis).
Similarly, for the ACR TI-RADS, although the risks of malignancy found were significantly higher than reported values, they are close to the results of several other series. 7,9,21 Since this is a retrospective analysis, patients in this study were all selected from a surgery department, which may have led to nodules being classified at a higher risk of malignancy.

In this study, we evaluated the diagnostic performance of these two international TI-RADS for detecting thyroid cancer. When we considered category 5 as the cut-off value, the EU TI-RADS classification showed a significantly higher sensitivity and the ACR TI-RADS classification had a higher specificity. However, no significant difference was noted with either classification when both categories 4 and 5 were compared to the other categories.
Our study showed that the category-based diagnostic performances of both classifications were closely comparable. However, the sensitivity and specificity of each system varied depending on which category was considered as the positive result (category 5 alone or categories 4 and 5). These results were consistent with those of two recent meta-analyses. 22,23 If only category 5 is considered, the EU TI-RADS system would have a better diagnostic performance than the ACR TI-RADS system, but without a significant difference. This could be explained by the fact that the presence of a single ultrasound sign highly predictive of malignancy (nonoval shape, irregular contours, marked hypoechogenicity, microcalcifications) is sufficient to classify the nodule in EU TI-RADS category 5, whereas in the ACR TI-RADS system these ultrasound signs do not all have the same value. If categories 4 and 5 are considered together, the diagnostic performance of both ultrasound classification systems overlaps, with a sensitivity exceeding 85% but a lower specificity.
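The dependence of sensitivity and specificity on the category chosen as the positive threshold can be illustrated with a short Python sketch. The function name and the per-category counts are hypothetical and serve only to show the trade-off; they are not the counts of this study.

```python
def performance_at_cutoff(counts, cutoff):
    """Sensitivity and specificity when categories >= cutoff count as positive.

    counts maps each TI-RADS category to a (malignant, benign) pair.
    """
    tp = sum(m for cat, (m, b) in counts.items() if cat >= cutoff)
    fp = sum(b for cat, (m, b) in counts.items() if cat >= cutoff)
    fn = sum(m for cat, (m, b) in counts.items() if cat < cutoff)
    tn = sum(b for cat, (m, b) in counts.items() if cat < cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical distribution of 99 malignant and 99 benign nodules.
hypothetical_counts = {2: (1, 9), 3: (10, 50), 4: (29, 22), 5: (59, 18)}
sens_cat5, spec_cat5 = performance_at_cutoff(hypothetical_counts, 5)
sens_cat45, spec_cat45 = performance_at_cutoff(hypothetical_counts, 4)
```

Lowering the threshold from category 5 to category 4 raises sensitivity (here from 59/99 to 88/99) at the cost of specificity (from 81/99 to 59/99), which is the pattern described above.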
Receiver operating characteristic (ROC) curves were used for the assessment of the performance of the TI-RADS classifications.
The area under the ROC curve was 0.81 for the EU TI-RADS classification and 0.82 for the ACR TI-RADS classification.
Prospective studies with ultrasound performed by the same qualified radiologist may overcome these limitations.

| CONCLUSION
Both ACR TI-RADS and EU TI-RADS scoring systems provided effective stratification of malignancy risk for the diagnosis of thyroid nodules.

AUTHOR CONTRIBUTIONS
Hiba-Allah Chatti: Conceptualization (supporting); data curation

FUNDING INFORMATION
This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.

CONFLICT OF INTEREST STATEMENT
The authors declare that they have no competing interests.

DATA AVAILABILITY STATEMENT
The data used and analysed during the current study are available from the corresponding author on reasonable request.

ETHICS STATEMENT
The study was carried out in accordance with the ethical standards of the institutional and national research committees and with the 1964 Helsinki Declaration.