Development and validation of a nomogram based on stromal score to predict progression‐free survival of patients with papillary thyroid carcinoma

Abstract
Background: Growing evidence indicates that stromal cells, a critical component of the tumor microenvironment (TME), are closely associated with tumor progression. However, no model based on the stromal score has been developed to predict progression‐free survival (PFS) in papillary thyroid carcinoma (PTC). This study aimed to explore the relationship between stromal score and prognosis and to establish a nomogram for predicting PFS in patients with PTC.
Methods: We obtained the stromal scores and clinicopathological characteristics of PTC patients from The Cancer Genome Atlas (TCGA) database. Cox regression analysis was used to select prognosis‐related factors. A stromal score‐based nomogram was built in the training cohort and verified in the validation cohort. The performance of the nomogram was assessed with the calibration curve, concordance index (C‐index), decision curve analysis (DCA), and receiver operating characteristic (ROC) curve.
Results: We randomly divided 381 PTC patients into a training cohort (n = 269) and a validation cohort (n = 112). Compared with patients who had a low stromal score, patients with a high stromal score had significantly better PFS [hazard ratio (HR): 0.294; 95% confidence interval (CI): 0.130–0.664]. The C‐index of the PFS nomogram was 0.764 (0.662–0.866) in the training cohort and 0.717 (0.603–0.831) in the validation cohort. The calibration curves for PFS prediction were consistent with the actual observations. DCA indicated that the nomogram outperformed the American Joint Committee on Cancer (AJCC) Tumor Node Metastasis (TNM) staging system in predicting PFS. The ROC curves showed favorable sensitivity and specificity for the nomogram.
Conclusion: A high stromal score was significantly associated with improved PFS in patients with PTC. The nomogram based on the stromal score and clinicopathological characteristics showed reliable performance in predicting the prognosis of PTC.


| INTRODUCTION
Thyroid carcinoma is a common disease worldwide, and its incidence has continued to rise over the past decades. 1 Papillary thyroid carcinoma (PTC) is a representative subtype of thyroid carcinoma. Although PTC patients generally have a favorable prognosis, some present with aggressive progression, such as recurrence and metastasis, which results in poor prognosis. The American Joint Committee on Cancer (AJCC) Tumor Node Metastasis (TNM) staging system is a standard approach for predicting the prognosis of PTC patients. 2 However, it primarily focuses on PTC-related death and is limited in accurately evaluating the risk of progression at an early stage. 3 Therefore, it is important to explore a novel model for predicting the risk of relapse in patients with PTC.
In recent years, increasing evidence has shown that the tumor microenvironment (TME) is closely associated with the prognosis of various cancers, including PTC. 4,5 The TME, which surrounds tumor cells, consists of infiltrating immune cells, stromal cells, and other kinds of normal epithelial cells. Numerous studies have shown that stromal cells, as a critical component of the TME, greatly affect the progression of PTC. [6][7][8] The stromal score, which can be calculated from gene expression data, is used to estimate the infiltration of stromal cells in tumor tissue. 9 A growing number of studies have attempted to establish predictive models based on the stromal score to evaluate the prognosis of tumors such as breast cancer, 10 gastric cancer, 11 and clear cell renal cell carcinoma. 12 However, no report has focused on the association between stromal score and prognosis in PTC, and no stromal score-based model has been developed to evaluate the prognosis of PTC patients.
In this study, we explored the correlation between stromal score and progression-free survival (PFS) in PTC and integrated the stromal score with clinicopathological characteristics to build a prognostic nomogram for predicting the PFS of PTC patients.

| Patient enrollment
The data in this study were downloaded from The Cancer Genome Atlas (TCGA) database. The following clinicopathological data of PTC patients were retrieved: diagnostic age, gender, race, radiation therapy status, histological subtype, AJCC TNM stage (defined according to the AJCC 7th edition), and PFS. Detailed information is available on the following website: http://www.cbioportal.org/. The Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data (ESTIMATE) algorithm was used to infer tumor cellularity and the infiltration of different normal cells in the TME. Single-sample gene set enrichment analysis (ssGSEA) was used to calculate the stromal score, which reflects the level of infiltrating stromal cells. 9 The stromal score of each PTC patient in our study was downloaded from the following website: http://bioinformatics.mdanderson.org/estimate/.

| Data processing
All records from the two datasets were matched by patient ID. In total, 498 cases were available for screening, and cases with missing information were excluded.

| Development and validation of the nomogram
Poor prognosis in PTC patients is mainly caused by recurrence and metastasis, so PFS was taken as the study endpoint. Using the createDataPartition() function from the caret R package, we divided patients into training and validation cohorts in a 7:3 ratio (seed: 20201124). Univariate and multivariate Cox regression models were used to identify independent predictors of PFS in PTC, and we estimated the adjusted hazard ratio (HR) and 95% confidence interval (CI). A nomogram was formulated in the training cohort based on the results of the Cox regression analyses and was then validated in the validation cohort. The performance of the nomogram was assessed by the concordance index (C-index) and by calibration, which compares the survival probability predicted by the nomogram with the value observed by Kaplan-Meier analysis.
| 5491 TANG ET AL.
In addition, decision curve analysis (DCA) was used to compare the nomogram with the AJCC TNM stage across the range of threshold probabilities. The specificity and sensitivity of the nomogram were assessed with the receiver operating characteristic (ROC) curve.
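As an illustration of the discrimination measure described above, the following minimal Python sketch computes a concordance index from follow-up times, event indicators, and predicted risks. It is not the authors' R code. The paper does not state which C-index estimator was used, so Harrell's version is an assumption here, and the function name `harrell_c_index` and the toy data are hypothetical.

```python
def harrell_c_index(times, events, risks):
    """Harrell's concordance index for right-censored data (illustrative).

    times  : follow-up time per patient
    events : 1 if progression was observed, 0 if censored
    risks  : model output where a larger value means higher predicted risk
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier progression had the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs: at least one observed event is needed")
    return concordant / usable


# Hypothetical toy data: the predicted risk decreases as PFS time increases.
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 0, 1]
toy_risks = [4.0, 3.0, 2.0, 1.0]
c_index = harrell_c_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported C-index values should be read.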

| Statistical analysis
Fisher's exact test or the chi-square test was used to analyze categorical data, and the Kruskal-Wallis H test was used to analyze continuous variables. The optimal cutoff points were determined with X-tile 3.6.1 software (Yale University School of Medicine). 13 Survival curves were constructed with the Kaplan-Meier method and compared with the log-rank test. All statistical analyses were conducted in R 3.6.3 (http://www.r-project.org). All statistical tests were two-sided, and p values less than 0.05 were considered statistically significant.
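To make the Kaplan-Meier method mentioned above concrete, here is a minimal Python sketch of the product-limit estimator. It is an illustration under stated assumptions rather than the R routine used in the study; the function name `kaplan_meier` and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function (illustrative).

    Returns a list of (time, survival probability) pairs, one per distinct
    time at which at least one event was observed.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving  # events and censored cases both leave the risk set
    return curve


# Hypothetical toy data: 1 = progression observed, 0 = censored.
km_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 0, 1])
```

Censored patients lower the number at risk without producing a step in the curve, which is why the estimate differs from a simple proportion of progression-free patients.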

| The cutoff points of age and stromal score for PFS prediction
A total of 381 PTC patients with available data in the TCGA-THCA dataset were analyzed. An age of 57 years was selected as the best cutoff point for predicting PFS in PTC patients, according to the X-tile plots in Figure 2A. Figure 2B shows the PFS curves for the younger group (diagnostic age <57) and the older group (diagnostic age ≥57). Patients younger than 57 had significantly longer PFS. Similarly, a stromal score of −677.0 was identified as the best cutoff value according to the X-tile plots in Figure 2C. Figure 2D shows the PFS curves for the group with a low stromal score (≤ −677.0) and the group with a high stromal score (> −677.0).
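The idea of choosing a cutoff that best separates two survival curves can be sketched as a scan over candidate cutoffs that keeps the one with the largest two-group log-rank statistic. This Python sketch is only an approximation of the concept: it does not reproduce the X-tile software, and the function names `logrank_chi2` and `best_cutoff` and the toy data are hypothetical.

```python
def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (groups coded 0 or 1)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed = expected = variance = 0.0
    for t in event_times:
        n = sum(1 for ti in times if ti >= t)                       # at risk, both groups
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)  # events at t
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("groups cannot be compared")
    return (observed - expected) ** 2 / variance


def best_cutoff(scores, times, events):
    """Scan observed scores and return the cutoff with the largest statistic.

    Patients with score > cutoff form the high group, matching the
    low (<= cutoff) versus high (> cutoff) definition used in the text.
    """
    candidates = sorted(set(scores))[:-1]  # drop the maximum so both groups are non-empty
    best, best_stat = None, -1.0
    for c in candidates:
        groups = [1 if s > c else 0 for s in scores]
        stat = logrank_chi2(times, events, groups)
        if stat > best_stat:
            best, best_stat = c, stat
    return best, best_stat


# Hypothetical toy data: lower scores progress earlier.
toy_scores = [-900, -800, -700, -600, -500, -400]
toy_times = [1, 2, 3, 4, 5, 6]
toy_events = [1, 1, 1, 1, 1, 1]
cutoff, chi2 = best_cutoff(toy_scores, toy_times, toy_events)
```

Because the cutoff is chosen to maximize the statistic, the p value attached to the selected split is optimistic unless it is corrected or confirmed in separate data, which is one reason a validation cohort matters.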

| Association of stromal score with clinicopathological characteristics and PFS in PTC patients
To explore how the stromal score related to other clinicopathological features, we divided the patients in the training cohort (n = 269) into two groups: a high stromal score group (> −677.0) and a low stromal score group (≤ −677.0). Stromal scores differed significantly among patients with different T stages (Kruskal-Wallis H test, p = 0.048) (Figure 3A). As shown in Figure 3B, patients with lymph node metastasis (N1) had notably lower stromal scores than those without lymph node metastasis (N0). Figure 3C displays the association between stromal score and PFS. Patients with a low stromal score had significantly shorter PFS than patients with a high stromal score (log-rank test, p < 0.001).

| The results of univariate and multivariate Cox regression analyses
The results of the univariate Cox regression analysis are shown in Table 2. PFS differed significantly between patients with low and high stromal scores (p = 0.002), and patients with a higher stromal score had longer PFS. The results of the multivariate Cox proportional hazards regression analysis are listed in Table 3. Patients with a higher stromal score had significantly improved PFS (HR and 95% CI: 0.294, 0.130-0.664, p = 0.003). Patients aged 57 years or older had significantly poorer PFS (HR and 95% CI: 5.898, 1.694-20.534, p = 0.005). Compared with patients in T1 status, patients in T3 status had significantly poorer PFS (HR and 95% CI: 6.296, 1.217-32.555, p = 0.028). Compared with patients in M0 status, patients in M1 status had shorter PFS (HR and 95% CI: 12.743, 1.901-85.437, p = 0.009). These results indicated that stromal score, age, T status, and M status were independent predictors of PFS in PTC patients. No significant associations were found for the remaining clinical characteristics.
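The relation between a hazard ratio, its confidence interval, and its p value can be checked with a short calculation. The Python sketch below assumes that the reported intervals are Wald intervals, symmetric on the log scale (exp(β ± z·SE)); this is common for Cox models but is not stated in the paper. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_and_se(hr, lower, upper, z=Z_95):
    """Recover the Cox coefficient and its standard error from HR and CI.

    Assumes a Wald interval on the log scale (an assumption, see lead-in).
    """
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


def wald_ci(beta, se, z=Z_95):
    """Rebuild the confidence interval for the hazard ratio."""
    return math.exp(beta - z * se), math.exp(beta + z * se)


def wald_p_value(beta, se):
    """Two-sided p value of the Wald test for beta = 0."""
    z_stat = abs(beta / se)
    return math.erfc(z_stat / math.sqrt(2))


# Values reported in the text for high versus low stromal score.
beta, se = log_hr_and_se(0.294, 0.130, 0.664)
ci_low, ci_high = wald_ci(beta, se)
p_value = wald_p_value(beta, se)
```

Under this assumption, the rebuilt interval stays close to the reported 0.130-0.664 and the Wald p value comes out near the reported 0.003, so the three reported numbers are mutually consistent.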

| Construction and validation of the novel prognostic nomogram
Based on the Cox regression analyses, a nomogram was constructed to predict the PFS of PTC patients. Age, stromal score, T status, and M status were included in the nomogram (Figure 4). In the training cohort, the C-index of the nomogram for PFS prediction was 0.764 (95% CI, 0.662-0.866). The model was then verified in the validation cohort, where the C-index was 0.717 (95% CI, 0.603-0.831). As shown in the calibration plots in Figure 5, the predicted 1-, 2-, and 3-year PFS in both cohorts were similar to the actual observations. The DCA results indicated that the PFS nomogram performed better than the AJCC TNM stage (Figure 6). As shown in Figure 7, the high areas under the ROC curve (AUC) indicated favorable sensitivity and specificity of the nomogram in both the training cohort (0.807, 0.770, and 0.799 for 1-, 2-, and 3-year PFS, respectively) and the validation cohort (0.736, 0.695, and 0.700 for 1-, 2-, and 3-year PFS, respectively). These results indicated that the nomogram performed reliably and showed superior predictive value relative to the traditional AJCC TNM staging system.
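The quantity compared in a decision curve is the net benefit at each threshold probability. The Python sketch below shows the standard formula, net benefit = TP/n − FP/n × pt/(1 − pt), for a binary outcome at a fixed time horizon. It assumes no censoring before that horizon, which is a simplification of the survival setting analyzed in the paper, and the function name `net_benefit` and the toy data are hypothetical.

```python
def net_benefit(outcomes, predicted_probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    outcomes        : 1 if progression occurred by the horizon, else 0
    predicted_probs : model-predicted probability of progression
    threshold       : threshold probability pt, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    true_pos = sum(1 for y, p in zip(outcomes, predicted_probs)
                   if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(outcomes, predicted_probs)
                    if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


# Hypothetical toy data.
toy_outcomes = [1, 0, 1, 0]
toy_probs = [0.9, 0.2, 0.6, 0.7]
nb_model = net_benefit(toy_outcomes, toy_probs, 0.5)
# "Treat all" reference strategy: every patient is classified as high risk.
nb_treat_all = net_benefit(toy_outcomes, [1.0] * 4, 0.5)
```

A decision curve plots this value over a range of thresholds for each model alongside the "treat all" and "treat none" (net benefit 0) strategies, and the model with the higher curve is preferred over that range.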

| DISCUSSION
Despite a relatively good prognosis, PTC patients still have a risk of advanced disease. Recognizing high-risk patients at an early stage is critical for practitioners when selecting more aggressive treatment. The AJCC staging system is considered the standard approach for predicting the prognosis of PTC patients, and many studies have indicated its applicability in clinical practice. [14][15][16] However, it is limited in identifying patients with progression at an early stage, especially among the low-risk majority. 3,17 The present study revealed a close correlation between stromal score and PFS in PTC patients. Stromal score was also found to correlate with tumor status and lymph node metastasis in PTC. Based on the stromal score and other prognosis-related characteristics, we further constructed a nomogram to estimate the PFS of PTC patients, and it outperformed the AJCC staging system.
Increasing research has confirmed the indispensable role of the tumor microenvironment (TME) in tumor growth and progression. 18 As one of the most important components of the TME, stromal cells are thought to have a critical impact on the progression of various tumors. 19,20 However, the role of stromal cells in PTC has not been fully explored. The ESTIMATE algorithm provides an easy way to estimate immune cell and stromal cell infiltration in the TME. In our study, patients with higher stromal scores had longer PFS, which indicates a potential role of stromal cells in preventing the progression of PTC. One previous study showed that stromal cells can inhibit thyroid cancer cell migration by secreting extracellular superoxide dismutase. 8 Another report stated that the development of stroma was associated with the progression of carcinogenesis, such as lymph node metastasis, signifying that stroma responds to the microenvironmental needs of tumor cells. 7 In addition, Liu et al. held that some chemotactic factors derived from stromal cells, such as SDF-1, remarkably affect the invasion and metastasis of PTC tumor cells. 6 Overall, the role of stromal cells in PTC remains unclear. Our preliminary observation could provide a perspective for exploring this issue, and further research is needed.
In our study, clinicopathological characteristics were found to be correlated with PFS in PTC. We defined 57 years as the age cutoff, which is broadly consistent with previous reports, and patients aged 57 years or older showed significantly poorer PFS. Many studies have recognized 55 years as the best single cutoff for prognostic models. [21][22][23] One multicenter study demonstrated that using 55 years as the cutoff to predict survival in PTC could help avoid overtreatment by nearly 12%. 24 We also found that patients with distant metastasis (M1 status) had poorer survival, which was consistent with previous findings. 25 Regarding T status, patients in T3 showed significantly poorer PFS than patients in T1. The notably decreased PFS of patients in T3 status probably results from the extent of extrathyroidal extension (ETE). 16 Previous reports indicated that patients with microscopic ETE were more likely to have lymph node metastases and carried a significantly higher risk of recurrence than patients without ETE. 15,26 In terms of lymph node metastasis, we found that N status was not an independent predictor of PFS in PTC. Some previous studies held the similar view that nodal metastasis was correlated with increased recurrence risk but only slightly affected patients' survival. 15 Conversely, other sources, such as the American Thyroid Association (ATA) management guidelines, maintain that nodal metastasis has prognostic significance, which can be classified according to the number, size, and extranodal invasion of the metastatic lymph nodes. 3,27 Insufficiently detailed information about lymph nodes in the TCGA database might explain why no notable association between N status and PFS was observed in our study.
In recent years, nomograms have been widely applied to estimate clinical prognosis because they integrate multiple prognostic parameters into an intuitive figure that is easy for patients to understand. A growing number of studies have established nomograms that incorporate immune and stromal scores, treating TME-related cells as important factors in evaluating patients' prognosis. 12,28 Nomograms built to improve the prediction of prognosis in PTC patients have also emerged, 17,25,29,30 but few of them took stromal scores into account. To our knowledge, this study is the first to construct a prognostic model that comprehensively combines stromal scores and clinicopathological characteristics. The new information included in our nomogram can provide novel insights into the prognosis of PTC and help physicians make more effective clinical decisions.
Our study had three major limitations. First, the clinicopathological information was obtained mainly from the TCGA database, in which most patients came from North America. Therefore, caution is needed when applying the results of this study to patients in other regions. Second, some critical prognostic factors, such as surgical treatment, multifocality, radioactive iodine, BRAF mutation, and TERT mutation, were unavailable in the TCGA database. Third, this study included a relatively small sample (n = 381). More data need to be analyzed to improve the accuracy of the model performance assessments, and external validation of the prognostic model is necessary in future studies.

| CONCLUSIONS
In our study, we found that a high stromal score was closely associated with improved PFS in PTC patients. We established a prognostic nomogram that combines the stromal score with prognosis-related clinicopathological parameters to predict the PFS of PTC patients. The novel nomogram showed reliable