The machine learning‐based model for lateral lymph node metastasis of thyroid medullary carcinoma improved the prediction ability of occult metastasis

Abstract
Background: For medullary thyroid carcinoma (MTC) with no positive findings in the lateral neck before surgery, whether prophylactic lateral neck dissection (LND) is needed remains controversial. A better way to predict occult metastasis in the lateral neck is needed.
Methods: From January 2010 to January 2022, patients who were diagnosed with MTC and underwent primary surgery at our hospital were retrospectively reviewed. We collected the patients' baseline characteristics and surgical procedures, and rescored the ultrasound images of the primary lesions using the American College of Radiology (ACR) Thyroid Imaging, Reporting and Data System (TI-RADS). Regularized logistic regression, 5-fold cross-validation, and decision curve analysis were applied to develop and validate a lateral lymph node metastasis (LLNM) model. We then tested the predictive ability of the LLNM model for occult LLNM in cN0−1a patients.
Results: A total of 218 patients were enrolled. Five baseline characteristics and two TI-RADS features were identified as high-risk factors for LLNM: gender, baseline calcitonin (Ctn), tumor size, multifocality, and central lymph node (CLN) status, as well as TI-RADS margin and level. An LLNM model was developed and showed good discrimination, with a 5-fold cross-validation mean area under the curve (AUC) of 0.92 ± 0.03 in the test dataset. Among cN0−1a patients, our LLNM model achieved an AUC of 0.91 (95% CI, 0.88–0.94) for predicting occult LLNM, which was significantly higher than the AUCs of baseline Ctn (0.83) and CLN status (0.64).
Conclusions: We developed an LLNM prediction model for MTC using machine learning based on clinical baseline characteristics and TI-RADS. Our model predicts occult LLNM in cN0−1a patients more accurately than baseline Ctn or CLN status and can thereby support the decision on prophylactic LND.


| INTRODUCTION
Medullary thyroid carcinoma (MTC) is a relatively rare neuroendocrine tumor (NET) that originates from the parafollicular C cells. It has an incidence of less than 1 per 100,000 population and accounts for only 1%-2% of all thyroid malignancies. 1,2 The standard treatment for MTC is total thyroidectomy and central neck dissection (ND). 4 For MTC patients without evidence of lateral lymph node metastasis (LLNM), whether prophylactic lateral neck dissection (LND) should be performed is still controversial.
Some studies suggest that central lymph node (CLN) metastasis 5 and high preoperative basal calcitonin (Ctn) 6 are high-risk factors for LLNM, and recommend basing the decision on prophylactic lateral ND on these factors. However, there is no widespread consensus in clinical practice, nor is this approach recommended by most guidelines. Because MTC is more malignant than differentiated thyroid carcinoma (DTC), patients with MTC without preoperative evidence of LLNM can face either undertreatment or overtreatment with LND. Therefore, accurately predicting the presence of occult LLNM in cN0−1a patients before surgery is an urgent problem.
In the past decade, machine learning has been widely applied in various fields of medical research, and the advent of GPT-4 has further fueled global interest in artificial intelligence. Our study is based on a single center and incorporates TI-RADS scores along with clinical baseline data to achieve better prediction. Moreover, we evaluated the performance of our model in detecting occult LLNM by testing it among cN0−1a patients, and compared it with basal Ctn and CLN status to demonstrate its clinical applicability.

| Patient enrollment and data collection
From January 2010 to January 2022, 324 patients who underwent initial surgery for pathologically confirmed MTC at our hospital were reviewed. The exclusion criteria were: (1) previous clinical records and ultrasound images were unavailable or of poor quality; (2) pre- and postsurgery basal Ctn tests were unavailable; (3) distant metastasis was detected before surgery; and (4) the patient was biochemically uncured and lost to follow-up, so that recurrence in the lateral neck could not be determined. A total of 218 cases were included in our research. The flow chart of patient enrollment is shown in Figure 1.
This study was registered with and approved by the Ethics Committee of the Cancer Hospital, Chinese Academy of Medical Sciences (approval number: NCC2021C-431). Because this was a retrospective study, the requirement for informed consent was waived.
We reviewed the medical records and extracted the patients' clinical baseline characteristics, namely age, gender, family history (sporadic or inherited MTC, based on RET germline gene screening), preoperative basal Ctn, tumor size (tumor length; if there were multiple lesions, the largest was regarded as the main lesion), multifocality, and CLN status. Radiographically positive lymph nodes were identified by enlarged size, cystic changes, rounded shape, eccentric cortical widening, decreased echogenicity, loss of the fatty hilum, calcifications, and increased intranodal vascularity. 17,18 Based on the American College of Radiology (ACR) Thyroid Imaging, Reporting and Data System (TI-RADS), the ultrasound features were recorded for nodule composition, echogenicity, shape, size, margins, and echogenic foci. The features for each patient were evaluated and recorded independently and in a blinded manner by two radiologists (Qianqian Guo and Lijuan Niu), with 5 and 30 years of experience in thyroid nodule diagnosis, respectively. When the two reviewers disagreed during evaluation, a joint review was performed, and the consensus data were used for further statistical analysis.

| Surgery and outcome assessment
According to the preoperative ultrasound and neck CT findings, 126 patients were diagnosed as cN0, 15 as cN1a, and 77 as cN1b. Except for one patient with inherited MTC who had prophylactic total thyroidectomy, all patients underwent total thyroidectomy and central compartment dissection. Additionally, 38 patients in the cN0 group and 9 in the cN1a group underwent prophylactic LND. The specific surgical procedures and treatment outcomes are shown in Table 1.
According to the postoperative pathological results, 23, 9, and 71 patients in the cN0, cN1a, and cN1b groups, respectively, had metastasis in the lateral neck, totaling 103 cases. The biochemical cure rates of the three groups were 79.4%, 46.7%, and 18.2%, respectively. Patients who underwent LND were defined as positive if LNM was pathologically confirmed and as negative otherwise. Patients who did not undergo LND were defined as negative if the postoperative Ctn was lower than normal (biochemical cure). If biochemical cure was not achieved, the patients were followed up through outpatient review records and telephone calls. A total of five patients were found to have lateral lymph node recurrence during follow-up.

| Feature selection
MTCs with and without LLNM were compared based on patient clinical baseline characteristics and TI-RADS score. The features used in this study included numeric and categorical features. The numeric features were age, tumor size, and basal Ctn. The categorical features were gender, multifocality, family history (inherited or sporadic MTC), CLN status, TI-RADS items (composition, echogenicity, shape, margin, echogenic foci), and TI-RADS level. The Chi-squared test and Student's t-test were used to test for statistical differences in the categorical and numeric features, respectively. The features with p-value <0.05 (gender, multifocality, CLN status, TI-RADS margin, TI-RADS level, tumor size, and basal Ctn) were selected for further study. The statistical analysis was performed using Python (version 3.8).
b Data in parentheses are percentages.
c Preoperative ultrasound considering central lymph node metastasis or not.
d The number on the right represents the score of each item in ACR TI-RADS.
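As a rough illustration of the univariate screening described in this subsection, the sketch below computes a Pearson chi-squared test for one binary feature against LLNM status using only the Python standard library. The function name `chi2_2x2` and the counts are hypothetical and are not the study's data. The text does not state whether a continuity correction was applied, so none is used here.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, the chi-squared survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts, for illustration only (not the study's data):
# rows = feature present / absent, columns = LLNM positive / negative.
stat, p = chi2_2x2(20, 5, 5, 20)
print(f"chi2 = {stat:.2f}, p = {p:.2g}, selected = {p < 0.05}")
```

A feature would pass the screen when `p < 0.05`, matching the threshold stated in the text.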

| Model development
In this study, we developed a regularized logistic regression model for LLNM prediction. In preprocessing, the numeric features were standard scaled and the categorical features were one-hot encoded via 'OneHotEncoder'. The hyperparameters of the LLNM prediction model were then optimized by a cross-validated exhaustive grid search over the search space, and the hyperparameters that gave the most accurate prediction in the training set were selected. For more details of the model development, please refer to Note S1. We used 5-fold cross-validation to split our study population into training and testing datasets to reduce bias and avoid overfitting. The 218 screened MTC patients were randomly divided into five approximately equal parts, each called a fold. The LLNM prediction model was trained on four folds at a time and tested on the remaining fold. This process was repeated five times so that each fold served once as the independent test dataset.
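The workflow in this subsection can be sketched roughly as follows, assuming scikit-learn (which provides the `OneHotEncoder` named in the text). The synthetic data, the column layout, and the grid of `C` values are assumptions made for illustration only; the actual search space is described in Note S1 and is not reproduced here.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (NOT the study's data): two numeric and two categorical columns.
rng = np.random.default_rng(0)
n = 200
size = rng.normal(2.0, 1.0, n)         # hypothetical tumor size
log_ctn = rng.normal(5.0, 1.5, n)      # hypothetical log basal Ctn
male = rng.integers(0, 2, n)           # hypothetical gender code
tirads_level = rng.integers(3, 6, n)   # hypothetical TI-RADS level (3 to 5)
X = np.column_stack([size, log_ctn, male, tirads_level])
logit = 1.2 * (size - 2.0) + 0.8 * (log_ctn - 5.0) + 0.5 * male - 0.25
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# Standard-scale the numeric columns and one-hot encode the categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), [0, 1]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), [2, 3]),
])
pipe = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),  # L2-regularized by default
])

# Inner loop: exhaustive grid search on the training folds, selected by accuracy.
grid = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, scoring="accuracy", cv=5)

# Outer loop: 5-fold cross-validation, each fold used once as the test set.
outer = KFold(n_splits=5, shuffle=True, random_state=0)
fold_auc = cross_val_score(grid, X, y, cv=outer, scoring="roc_auc")
print(f"AUC = {fold_auc.mean():.2f} ± {fold_auc.std():.2f}")
```

Keeping the scaler and encoder inside the pipeline means they are refit on the training folds only, so no information from a test fold leaks into preprocessing.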

| Model evaluation
The performance of the LLNM prediction model was evaluated by quantitative indexes including the area under the receiver operating characteristic (ROC) curve (AUC), accuracy (ACC), sensitivity/recall, specificity, positive predictive value/precision, negative predictive value, Matthews correlation coefficient, and F1 score (F1), which are described in Note S2. Decision curve analysis was used to test the clinical usefulness of the regularized logistic regression model in LLNM prediction; details are given in Note S3.
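For reference, the threshold-dependent indexes listed above follow directly from the four confusion-matrix counts. The sketch below uses the standard textbook definitions, which are assumed to match Note S2; the function name `binary_metrics` and the example counts are hypothetical.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return common binary-classification indexes from confusion-matrix counts."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),            # recall
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),                    # precision
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        # MCC is set to 0 when any marginal total is zero (a common convention).
        "mcc": (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0,
    }


# Hypothetical counts, for illustration only (not the study's data).
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```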
To test the clinical utility of the LLNM model, we evaluated its ability to predict occult LLNM in cN0−1a patients (141 cases in total). We also examined the predictive ability of basal Ctn and CLN status and compared them with the LLNM model. The evaluation method was the same as described above.

| RESULTS

| Patient characteristics and selection of significant features
In this study, 107 cases were diagnosed with LLNM and 111 without. Five clinical baseline characteristics and two TI-RADS items were associated with LLNM; details are shown in Table 2. The baseline risk factors for LLNM were male gender, high basal Ctn, large tumor size, multifocal lesions, and CLN metastasis. TI-RADS margin and overall TI-RADS level also differed significantly between the groups.

| Prediction performance on cross-validation cohort
Because an external validation set was lacking, we used 5-fold cross-validation to split our study population into training and testing datasets. The averaged quantitative indexes of LLNM prediction for each clinical feature type on the testing datasets are summarized in Table 3. Figure 3 shows the nomogram of Fold 4 among the five models, which roughly displays the weight of each clinical feature. The total score of the seven features corresponds to the risk of LLNM. According to the experimental results (Table 4), the AUCs of the LLNM prediction model on the five cross-validation testing sets were 0.92, 0.88, 0.92, 0.96, and 0.94, respectively, giving a mean of 0.92 ± 0.03 (one standard deviation). Figure 2A illustrates the ROC curves of the LLNM prediction results on the cross-validation cohort. The corresponding quantitative indexes of the LLNM prediction model are summarized in Table S1. Figure 2B illustrates the averaged decision curve of the LLNM prediction model on the 5-fold cross-validation testing datasets. The decision curve shows that, for threshold probabilities between 12% and 99%, using the LLNM prediction model yields more net benefit than treating all patients or treating none.
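To make the decision curve comparison concrete, the sketch below uses the commonly cited net benefit formula, TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. This is assumed to correspond to the method in Note S3, which is not reproduced here. The function names and the toy data are hypothetical.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every patient; treating none always gives 0."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data, for illustration only (not the study's data).
labels = [1, 1, 0, 0]
probs = [0.9, 0.6, 0.4, 0.1]
for pt in (0.2, 0.5, 0.8):
    model_nb = net_benefit(labels, probs, pt)
    all_nb = net_benefit_treat_all(labels, pt)
    print(f"pt={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}  treat-none=0.000")
```

The model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero.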

| Clinical utility
Our model demonstrated good predictive ability for occult LLNM in the 141 cN0−1a patients, achieving an AUC of 0.91 (95% CI, 0.83-0.98), which was superior to that of basal Ctn (0.83; 95% CI, 0.76-0.91) and CLN status (0.64; 95% CI, 0.52-0.75) (see details in Table 5). Figure 4 shows the ROC curves. The DeLong test showed significant differences among the three models (Table 6). Our model also had a higher ACC (0.9) than basal Ctn (0.79) and CLN status (0.83). The specific parameter indexes are shown in Table S2. This indicates that our model can better inform the decision on prophylactic LND for patients with negative findings in the lateral neck before surgery.
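The text does not state how the 95% CIs of the AUCs were obtained. One common option, shown here purely as an assumption and not as the authors' method (and not a DeLong test), is a percentile bootstrap over patients. The function names and the toy labels and scores below are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for the AUC to be defined
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Toy labels and scores, for illustration only (not the study's data).
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.40, 0.55, 0.70, 0.80, 0.90, 0.95]
point = auc(labels, scores)
lo, hi = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

With small samples the bootstrap interval can be wide, which is consistent with the broader intervals reported for the 141-patient subgroup.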

| DISCUSSION
Compared to differentiated thyroid carcinoma, MTC is more aggressive and has a poorer prognosis, especially when lymph node metastasis occurs. 19 Thus, the treatment strategy for MTC should focus on assessing the cervical lymph nodes. The standard surgery is total thyroidectomy plus CLN dissection. Controversy exists regarding prophylactic lateral neck dissection for patients without preoperative evidence of lateral neck involvement. We developed a machine learning model for LLNM prediction using clinical and ultrasound data of 218 patients. The model demonstrated robust performance, with a 5-fold cross-validation AUC of 0.92 ± 0.03. Furthermore, the model showed better discrimination for occult LLNM than basal Ctn and CLN status (AUC 0.91 vs. 0.83 vs. 0.64). By providing a more precise risk assessment, the model can help avoid both overtreatment and undertreatment and tailor the treatment strategy to each patient.
Zhou et al. 15 developed a nomogram based on five clinical characteristics of 35 patients from a single center, and their prediction model had a C-index of 0.825. We contend that a sample of 35 cases, combined with reliance on clinical features alone, raises reliability concerns. Similar to our article, Luo et al. 16 developed a prediction model for cervical lymph node metastasis (CLNM) using clinical and ultrasonographic data of 74 patients with MTC. Their model had high discrimination, with an AUC of 0.919. However, it could not tell whether CLNM was located in the central or the lateral neck, so it offers limited help with the current dilemma of whether to perform prophylactic LND for MTC.
Some studies have suggested that prophylactic LND for MTC patients with no evidence of LLNM does not improve prognosis. The study from MD Anderson 20 analyzed the data of 66 sporadic MTC patients and found no significant difference in biochemical cure rate, local recurrence rate, and other outcomes between the elective lateral neck dissection group and the observation group. However, in that study the prophylactic LND group had significantly higher Ctn levels and a relatively higher T-stage (although not statistically significant) than the non-LND group. Because the outcomes were nonetheless comparable, this suggests that prophylactic LND may have improved the prognosis of patients with more severe disease. In another study from MSKCC, 21 among patients with Ctn >200 pg/mL, the 10-year recurrence rate was lower in the prophylactic LND group than in the non-LND group (21% versus 30%), although the difference was not statistically significant because of the limited number of cases. Taken together, these two studies suggest that prophylactic LND may bring benefits, although not statistically significant ones. A better method for predicting lateral neck metastasis could therefore make the value of prophylactic LND more evident.
We chose the ACR TI-RADS mainly because it is a comprehensive and objective system for describing the ultrasound features of the primary lesion and is widely used worldwide. We therefore believe that it can reflect the primary lesion comprehensively and facilitate the dissemination of our model to other medical centers. The ACR TI-RADS is a system for scoring thyroid nodules based on their ultrasound features, proposed by the American College of Radiology in 2017. 22 The scale assigns points to five categories of ultrasound features and derives a TI-RADS level that indicates the risk of malignancy. In this study, we found that the TI-RADS margin category and the total TI-RADS level were also significantly associated with LLNM. When considering the application of the model in different hospitals, the advantage of TI-RADS lies in its widespread clinical adoption, which removes the need for additional training for radiologists. In contrast, the predictive model of Luo et al. 16 relies on eight thyroid ultrasound image features, which may pose challenges for dissemination and adoption across diverse medical institutions.
The current exploration of risk factors for LLNM in MTC is mainly based on Machens' study. He found that basal Ctn levels beyond thresholds of 20, 50, 200, and 500 pg/mL indicated LNM in the ipsilateral central and lateral neck, contralateral central neck, contralateral lateral neck, and upper mediastinum, respectively. 6 As the number of central lymph node metastases increases, the probability and extent of lateral cervical lymph node metastases also increase. 5 We tested basal Ctn, CLN status, and our model in cN0−1a patients. The results showed that our model detected occult LLNM better than the other two, with p < 0.05. Therefore, our model can be used in clinical practice to help decide whether to perform prophylactic lateral neck dissection.
Our choice of machine learning algorithm balances interpretability, stability, and efficiency. Logistic regression is suitable for medical applications and provides interpretability in clinical settings. Given the rarity of MTC cases, regularization is used to prevent overfitting and enhance generalization. Additionally, a grid search is employed to explore the hyperparameter space comprehensively and identify the optimal model. Among other commonly used modeling techniques, decision trees tend to overfit and are not suitable for high-dimensional data, while random forest models are complex and challenging to interpret. Five-fold cross-validation makes the most of the limited MTC data by ensuring that each case is used for both training and testing, thus enhancing the model's robustness.
However, our study also has some shortcomings. First, it included only basal Ctn. Other tumor markers such as CEA and procalcitonin, which are regarded as potential tumor markers, 23 were not included. In our experience, however, these markers are affected by many external factors and are not superior to Ctn for diagnosis and prognosis evaluation. In addition, since pentagastrin is not approved for use in mainland China, stimulated Ctn could not be measured. Second, the biggest limitation of this study is the lack of an external validation dataset. For such a rare disease, collecting enough data may require cooperation among multiple hospitals, which is our goal for future research.
In conclusion, by using a machine learning algorithm, we developed a predictive model for LLNM of MTC based on clinical characteristics and ultrasound images. Our model has good discrimination and can significantly improve the prediction of occult LLNM in practical applications. This will further provide decision support for prophylactic LND in patients with clinically negative findings in the lateral neck.

TABLE 1 The specific surgical procedures and treatment outcomes.

TABLE 4 Quantitative indexes of LLNM prediction model on 5-fold cross-validation testing dataset.

FIGURE 2 Receiver operating characteristic (ROC) curve and decision analysis curve of the lateral lymph node metastasis (LLNM) prediction model on 5-fold cross-validation testing datasets. (A) ROC curves of the LLNM prediction results on testing datasets; the mean area under the curve of the LLNM prediction model is 0.92 ± 0.03 (one standard deviation). (B) Decision analysis curve of the LLNM prediction model, which shows that, for threshold probabilities between 12% and 99%, using the LLNM prediction model yields more net benefit than treating all patients or treating none.

FIGURE 3 The nomogram of Fold 4 among the five models. Points were assigned for gender, multifocality, central lymph node status, Thyroid Imaging, Reporting and Data System (TI-RADS) margin, TI-RADS level, tumor size, and basal calcitonin by drawing a line upward from the corresponding values to the "Points" line. The sum of these seven points, plotted on the "overall points" line, corresponds to predictions of LLNM risk, with the threshold = 0.5.

FIGURE 4 Receiver operating characteristic curves for occult lateral lymph node metastasis (LLNM) prediction by our LLNM model, basal calcitonin, and central lymph node status in cN0−1a patients.

TABLE 2 Baseline characteristics and TI-RADS scores of the 218 patients with medullary thyroid carcinoma with and without lateral cervical lymph node metastases.
Abbreviations: ACR, American College of Radiology; CLN, central lymph node; Ctn, calcitonin; LLNM, lateral lymph node metastasis; MTC, medullary thyroid cancer; TI-RADS, Thyroid Imaging, Reporting and Data System.
a Data are presented as medians with ranges in parentheses.

TABLE 1 abbreviations: CND, central neck dissection; LND, lateral neck dissection; TD, thyroidectomy.

TABLE 5 and TABLE 6 abbreviations: CLN, central lymph node; Ctn, calcitonin; LLNM, lateral lymph node metastasis.