Construction and validation of a prognostic nomogram for anal squamous cell carcinoma

Abstract Background: Anal squamous cell carcinoma (ASCC) is the main subtype of anal cancer and its prognosis is highly heterogeneous. We aimed to construct a nomogram for predicting the 1-, 3-, and 5-year overall survival (OS) rates of patients with ASCC. Methods: Patients with ASCC enrolled between January 1, 2010 and December 31, 2017 were identified from the SEER database and divided into a training group and a validation group in a ratio of 7:3. Univariate and multivariate Cox analyses were used to identify the prognostic factors for OS. A prognostic nomogram was then established and validated using Harrell's concordance index (C-index), the area under the curve (AUC) of the receiver operating characteristic (ROC) curves, calibration plots, and decision curve analysis (DCA). Results: We identified 761 patients in the training group and 326 patients in the validation group. Four prognostic factors (age, sex, AJCC stage, and radiotherapy) were identified and integrated to construct the nomogram. The C-index and AUC values supported the model's discrimination, and the calibration plots showed good agreement between predicted and observed survival. Furthermore, in comparison with the AJCC stage, the C-index, AUC, and DCA showed the nomogram to have good predictive value. Finally, we constructed a risk stratification model that divided patients into low-risk, medium-risk, and high-risk groups with clearly different OS. Conclusions: This is the first prognostic nomogram established for predicting the survival probability of ASCC patients, and it may help clinicians improve risk management.


| Data source and patient selection
The SEER Program contains information on cancer statistics of approximately 28% of the US population. Patient data were downloaded from SEER*Stat version 8. Patients enrolled in our study fulfilled the following inclusion criteria: (a) patients with ASCC were enrolled from January 1, 2010 to December 31, 2017, (b) ASCC was their only primary malignancy, (c) patients were diagnosed with histological methods, and (d) patients' follow-up data including survival time and vital status were complete. Patients were excluded under the following conditions: (a) age under 18 years, (b) incomplete demographic or clinicopathological information (age, sex, race, tumor grade, tumor size, AJCC stage, T stage, N stage, or M stage) or uncertain treatment information (whether they underwent surgery, radiotherapy, or chemotherapy, regardless of the order of treatment), and (c) diagnosis by death certificate or autopsy. In addition, the AJCC stage was redefined according to the 2017 version. The detailed screening procedure is shown in Figure 1.
The following factors were selected and sorted: age at diagnosis, gender, marital status, race, tumor size, tumor differentiation, AJCC stage, N stage, M stage, therapy, survival months, and survival status. For tumor size, the optimal cutoff point was calculated using the X-tile software (Figure S1). OS was the main endpoint and was defined as the time from diagnosis to death or the last follow-up.

| Construction and validation of the nomogram
In this study, 1087 patients with ASCC were included. Using the "caret" package, we randomly divided patients into two cohorts in a ratio of 7:3. The training cohort (n = 761) was used to develop and internally validate the nomogram, while the validation cohort (n = 326) was used for external validation.
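The 7:3 partition described above can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not the "caret" routine used in the study; the function name `split_cohort`, the seed value, and the use of integer patient IDs are assumptions made for illustration.

```python
import random

def split_cohort(patient_ids, train_fraction=0.7, seed=2024):
    """Randomly partition patient IDs into a training and a validation list.

    Hypothetical helper: a plain random shuffle, not a stratified partition.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]

# 1087 patients, identified here simply by the integers 0..1086
train_ids, valid_ids = split_cohort(range(1087))
print(len(train_ids), len(valid_ids))
```

With 1087 patients and a training fraction of 0.7, this yields cohorts of 761 and 326 patients, matching the group sizes reported in the text.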
In the training cohort, univariable and multivariable Cox proportional hazard regression models were used to identify significant predictors of OS. The nomogram was then constructed with the "rms" package to estimate the 1-, 3-, and 5-year survival rates of ASCC patients by proportionally converting each regression coefficient into a 0-100 point scale. Furthermore, we assessed the reliability of the model through internal and external validation. The C-index and the area under the curve (AUC) of the receiver operating characteristic (ROC) curves were employed to evaluate the discrimination of the model. The C-index lies between 0 and 1.0, where 1.0 represents perfect prediction and 0.5 corresponds to a completely random predictor. The AUC value ranges from 0.5 to 1, and higher values indicate better predictive ability. Calibration plots were employed to compare the predicted and observed survival rates. Decision curve analysis (DCA) was conducted to assess the clinical applicability of the model. Based on each patient's total score from the prognostic nomogram, patients were classified into low-, medium-, and high-risk groups using X-tile software.
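To make the C-index concrete, the following is a minimal Python sketch of Harrell's concordance index for right-censored data. It is an illustration only and not the R routine used in the study; the function name, argument names, and toy data are assumptions. Pairs with tied follow-up times are treated as not comparable in this simplified version.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index: a higher risk score should go with shorter survival.

    times:       follow-up time for each patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output, e.g. nomogram total points
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died strictly before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # model ranks the pair correctly
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (hypothetical): the third patient is censored at month 15.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
perfect = harrell_c_index(times, events, [4, 3, 2, 1])
uninformative = harrell_c_index(times, events, [1, 1, 1, 1])
```

In the toy data a perfectly ordered score gives 1.0 and a constant score gives 0.5, which matches the interpretation of the two reference values given above.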

| Statistical analyses
Summary statistics were used to describe the basic characteristics of the included population. The Chi-square test was used to compare categorical variables between groups. Univariate and multivariate Cox regression analyses were used to analyze the prognostic factors associated with OS. Kaplan-Meier curves and log-rank tests were used to assess survival differences. Moreover, a prognostic nomogram model was developed to predict the survival probabilities of patients with ASCC at 1, 3, and 5 years. The C-index and AUC values were used to evaluate the model's discrimination. Calibration plots were applied to assess the difference between the observed and predicted probabilities of survival, and the DCA was used to assess its clinical utility. Analysis was performed using X-tile software, SPSS (version 24.0; SPSS, Inc.), GraphPad Prism (version 8.0; GraphPad), and packages (rms, survival, survivalROC, Hmisc, rmda, etc.) in R software version 3.6.6 (http://www.r-project.org). Statistical tests were two-sided and significance was set at p < 0.05.
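As an illustration of the Kaplan-Meier method mentioned above, the product-limit estimator can be sketched in a few lines of Python. This is a teaching sketch under simple assumptions (no confidence intervals, no log-rank test), not the software used in the study; the function name and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    Returns a list of (time, survival probability) at each distinct death time.
    events: 1 if death was observed, 0 if the patient was censored.
    """
    survival = 1.0
    curve = []
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)  # still under follow-up at t
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Toy data (hypothetical): five patients, two of them censored.
km_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Each step multiplies the running survival probability by the fraction of at-risk patients who survive that death time, so censored patients contribute to the risk set until they leave follow-up.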

| Patient characteristics
We identified 1087 ASCC cases from the SEER database, including 761 patients in the training group and 326 patients in the validation group (Figure 1). The patients' demographic and clinical features are shown in Table 1. No significant differences existed in demographic and clinicopathological distribution between the training and validation groups. For all the patients with ASCC, the mean and median ages were 60.9 and 60 years, respectively. Nearly half of the patients were between 60 and 65 years of age. A total of 685 (63%) patients were female and 654 (60.2%) were single.

| Independent predictors in the training group
In the training group, the prognostic factors for OS were identified using univariate and multivariate Cox proportional hazard regression analyses ( I as a reference), and radiotherapy (radiotherapy: HR 0.559, 95% CI 0.365-0.857; no radiotherapy as a reference) were all correlated with OS (p < 0.05). Survival analysis showed that, compared with patients aged ≤50 years, patients aged 50-65 years had a worse prognosis, followed by those aged >65 years, and the differences were statistically significant (Figure 2A). The prognosis of male patients with ASCC was worse than that of female patients (Figure 2B). In addition, the more advanced the AJCC stage, the worse the prognosis (Figure 2C). Patients who received radiotherapy had a better prognosis (Figure 2D).

| Construction and validation of the nomogram
On the basis of the prognostic factors for OS derived from the Cox proportional hazard regression analyses in the training group, a nomogram was developed for predicting the survival probability of ASCC patients at 1, 3, and 5 years (Figure 3). As shown in the nomogram, AJCC stage had the greatest influence on patients' survival outcome, followed by age, radiotherapy, and sex. To obtain a patient's survival probability from the nomogram, we identify the score for each of the four factors (Table 3), sum the scores, and read off the 1-, 3-, and 5-year survival probabilities that correspond to the total score.
Furthermore, the C-index, AUC, and calibration curves were employed to test our prognostic model. When prediction performance was verified in the internal and external cohorts, the C-index of the nomogram was 0.684 (95% CI, 0.643-0.725) and 0.730 (95% CI, 0.677-0.783) in the training and validation groups, respectively, indicating a good capability of the model to discriminate the survival probability of ASCC patients. The 1-, 3-, and 5-year AUC values were 0.706, 0.699, and 0.687 in the training cohort (Figure 4A) and 0.743, 0.716, and 0.704 in the validation cohort, respectively (Figure 4B). In addition, the calibration plots displayed favorable consistency between the observed 1-, 3-, and 5-year survival rates and the predicted rate, regardless of the training group ( Figure 5A

| Comparison of the nomogram and AJCC staging system
To further verify the superiority of our model, we compared our prognostic nomogram with the AJCC staging system. The C-index of the AJCC stage for OS was 0.610 (95% CI: 0.569-0.651) and 0.659 (95% CI: 0.598-0.720) in the training and validation groups, respectively, lower than the corresponding values for the nomogram. We further calculated the 1-, 3-, and 5-year AUC values to evaluate the model's discrimination. The results showed that the nomogram was superior to the AJCC stage in the training and validation groups (Figure 4). The DCA also showed that the net benefit of our prognostic model exceeded that of the AJCC stage over a wide range of threshold probabilities (Figure 6). These results suggest that our nomogram can better predict the survival probability of ASCC patients.
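The net benefit that a decision curve plots can be sketched as follows. This simplified Python example assumes a binary outcome at a fixed time horizon with no censoring, which is a simplification of survival DCA; it is not the "rmda" implementation, and the function names and toy data are hypothetical. Net benefit at threshold probability p_t is TP/n - (FP/n) * p_t / (1 - p_t).

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    predicted_risk: predicted probability of the event (e.g. death by 3 years)
    outcome:        1 if the event occurred, 0 otherwise (no censoring assumed)
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcome)
    pairs = list(zip(predicted_risk, outcome))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1 - threshold)

def net_benefit_treat_all(outcome, threshold):
    """Reference strategy: every patient is classified as high risk."""
    return net_benefit([1.0] * len(outcome), outcome, threshold)

# Toy data (hypothetical): four patients, two events.
risk = [0.9, 0.8, 0.3, 0.1]
died = [1, 0, 1, 0]
nb_model = net_benefit(risk, died, 0.2)
nb_all = net_benefit_treat_all(died, 0.2)
```

A decision curve repeats this calculation across a grid of thresholds for each model and for the treat-all and treat-none (net benefit 0) strategies; a model is considered more useful where its curve lies above the others.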

| Risk stratification via the nomogram
Furthermore, to improve the management of patients with ASCC, a risk stratification model was developed to divide patients into three groups: low-risk (total score <73), medium-risk (73 ≤ total score <121), and high-risk (total score ≥121). The optimal cutoff values were determined with X-tile software from the total score of every patient (Figure S2). Kaplan-Meier analysis was conducted in the training cohort, the validation cohort, and all patients, and OS differed clearly between the risk groups (Figure 7), indicating the practical utility of our prognostic nomogram in risk stratification for patients with ASCC.
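The stratification rule can be written as a short Python sketch. The cutoffs 73 and 121 are those reported above; the function name and the example scores are hypothetical, and the total score is assumed to have been read from the nomogram beforehand.

```python
LOW_CUTOFF = 73     # total score below this value: low risk
HIGH_CUTOFF = 121   # total score at or above this value: high risk

def risk_group(total_score):
    """Map a nomogram total score to the three risk groups described above."""
    if total_score < LOW_CUTOFF:
        return "low-risk"
    if total_score < HIGH_CUTOFF:
        return "medium-risk"
    return "high-risk"

# Hypothetical total scores for three patients
groups = [risk_group(score) for score in (40, 73, 150)]
```

Note that the boundaries are half-open: a total score of exactly 73 falls in the medium-risk group and a total score of exactly 121 falls in the high-risk group.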

| DISCUSSION
In this study, 1087 ASCC patients were enrolled from the SEER database, and four prognostic factors were identified from 13 variables to develop a nomogram that better predicts patient survival probability. The C-indexes, ROC curves, and calibration curves confirmed the reliability and discrimination of the model, and the DCA supported its clinical utility. Furthermore, a risk stratification model was developed to classify patients into low-risk, medium-risk, and high-risk groups, which may help clinicians to improve patient management. It is the first large retrospective study to establish a nomogram to predict the prognosis of ASCC patients, and these readily accessible factors can be used to better predict the survival outcome of each patient and aid clinicians in making appropriate therapy decisions.

FIGURE 3 Nomogram for predicting 1-, 3-, and 5-year survival rate of ASCC patients

In our study, age was identified as a significant independent risk factor associated with OS in patients with ASCC. The incidence and mortality of ASCC are increasing, with an approximate 5% increase in ASCC-related mortality in patients aged 60-69 years. 2 Compared to younger patients, elderly patients are less likely to receive standard-of-care chemoradiotherapy. In addition, because of concerns about adverse effects, elderly patients who did receive chemoradiotherapy were more likely to receive concurrent single-agent rather than multiagent chemotherapy, which resulted in a worse survival outcome. 10,20,21 In addition, compared to female patients, male patients had a worse survival outcome, which is consistent with the results of previous studies. 11,14,22,23 Some researchers have argued that women are more active in their health care, receive more cancer screening, and are more likely to be diagnosed at earlier stages of the disease. HPV and HIV infection are well-known risk factors that promote the development of ASCC. 5,24,25 Shiels et al. 26 reported that HIV infection was the main reason for the increased incidence of ASCC in male patients. Men who engaged in anal intercourse had an increased risk of ASCC. 27-33 In addition, single men had a higher incidence of anal cancer than married men. 34,35 In the entire cohort, there were 402 male patients, among whom 272 were single and 130 were married; of the 685 female ASCC patients, 382 were single and 303 were married. Over 60% of male patients were single, which may contribute to a higher risk for ASCC. For male patients with ASCC, a

The literature has reported inconsistent results on marital and racial differences in the survival of ASCC patients. Wu et al. 36 analyzed 2111 localized anal carcinoma patients who received definitive chemoradiation between 2004 and 2012 and found that being married and being white positively impacted survival. Marriage may be a proxy for social support, which is essential among cancer survivors because of the great psychosocial burden. 37 In addition, it is controversial whether black race is independently associated with worse survival. 38,39 In our study, univariate analysis showed that being married and being white were associated with a lower risk of death, but further multivariate analysis showed that they were not independent prognostic factors. These differences could be owing to an interplay of structural, cultural, and social issues associated with healthcare.
According to the prognostic nomogram, AJCC stage had the largest influence on the survival prediction of ASCC patients. The comparative analysis of the prognostic model and the AJCC staging system showed that the AJCC stage alone plays a certain role in predicting the survival of ASCC patients. Our predictive model can therefore be regarded as a further improvement of the AJCC staging system that incorporates patients' demographic information and clinicopathological features.
In terms of treatment, the current standard-of-care for localized ASCC is concurrent chemoradiotherapy (CRT). In the 1970s, Norman Nigro promoted organ conservation therapy and found that the clinical complete response rate to CRT was 86%. 40,41 Subsequently, CRT was identified as the standard-of-care for localized ASCC through two key trials: the Anal Cancer Trial (ACT I) of the UK Coordinating Council for Cancer Research (UKCCCR) and the trial of the European Organization for Research and Treatment of Cancer (EORTC). 6,22,42 However, the optimal management of patients with stage I ASCC is controversial. Some studies have indicated that the combination with chemotherapy does not correlate with inferior OS. 19,43-45 In addition, for patients with well-differentiated T1, N0, M0, or smaller T2 perianal tumors, local surgery with margins of 1 cm may be an effective treatment strategy. 46 However, CRT cannot be ignored in patients who have received local resection. ASCC often metastasizes to the lung, liver, and extrapelvic lymph nodes, although patients are rarely diagnosed with stage IV disease. Chemotherapy is advocated, 5 but radiation for metastatic sites can also be regarded as part of combination therapy. 46 In addition, immunotherapy may also provide survival benefits to patients who fail chemotherapy. 47,48 In our study, most patients were diagnosed with stage I-III disease, which gave them the opportunity to receive radiotherapy and chemotherapy and offered a relatively better survival outcome. Moreover, the majority of patients undergoing surgery also received radiochemotherapy. Univariate analysis showed that surgery and radiotherapy can reduce the patients' death risk, and chemotherapy also seemed to improve patients' survival outcomes, but the difference was not statistically significant. Further multivariate analysis showed that radiotherapy was an independent prognostic factor. Chemotherapy is an important treatment option for ASCC, but its actual efficacy may be underestimated in our study. In the SEER database, chemotherapy is coded as "yes" or "no/unknown," but the specific protocol and number of cycles cannot be obtained. Therefore, our results should be interpreted cautiously.

FIGURE 7 Kaplan-Meier curves of the low-, medium-, and high-risk groups in the training cohort (A), the validation group (B), and all patients (C)
The current trend is to treat rectal squamous cell carcinoma (RSCC) by analogy with ASCC. 49 Because the two diseases share the same histotype and are closely located, we wondered whether their prognostic factors differ. Diao et al. 50 enrolled 806 RSCC patients and constructed a nomogram to predict 3- and 5-year OS. However, compared with our study, different prognostic factors among demographic characteristics, AJCC stage, and treatment options were identified, which may be caused by the differences in pathogenesis, 51 staging, 52,53 and treatment options 49,54 between ASCC and RSCC. Due to the rarity of RSCC, much is unknown about this cancer, such as the role of HPV and HIV in RSCC, 55,56 the molecular profile, and the most effective treatments. More studies are needed to address these questions.
There are some limitations in our study. First, this is a retrospective study, and data on virus infection status, related molecular factors, and detailed treatment were not available; including them might improve the accuracy of the model. Second, there may be selection bias due to the exclusion of patients with missing data. Finally, because of the rarity of ASCC, we could not externally validate the nomogram in patients from our hospital, and the prognostic model may be more applicable to patients in the United States.

| CONCLUSION
In conclusion, a nomogram for patients with ASCC was constructed on the basis of four significant prognostic factors determined by univariate and multivariate Cox analysis. It enables clinicians to better anticipate the 1-, 3-, and 5-year survival probabilities, and plays a role in risk stratification and treatment decision making for patients with ASCC.