Construction and validation of a nomogram to predict overall survival in patients with inflammatory breast cancer

Abstract In the present study, we examined the factors affecting survival of women with inflammatory breast cancer (IBC) and constructed and validated a nomogram to predict overall survival (OS) in these patients. The cohort was selected from the Surveillance, Epidemiology, and End Results (SEER) program between 1 January 2004 and 31 December 2013. Univariate and multivariate Cox proportional hazards regression models were constructed. A nomogram was developed based on significant prognostic indicators of OS. The discriminatory and predictive capacities of the nomogram were assessed using Harrell's concordance index (C-index) and calibration plots. A total of 1651 eligible patients were identified, with a median survival time of 31 months (range 0-131 months), and the 3- and 5-year OS rates were 52.8% and 39.5%, respectively. Multivariate analysis revealed that race (P < .001), marital status (P = .011), N stage (P = .002), M stage (P < .001), hormone receptor (P < .001), human epidermal growth factor receptor-2 (HER2) (P = .001), surgery (P < .001), chemotherapy (P < .001), and radiotherapy (P = .010) were independent prognostic indicators of IBC. These nine variables were incorporated to construct a nomogram. The C-indexes of the nomogram were 0.738 (95% confidence interval [CI]: 0.717, 0.759) and 0.741 (95% CI: 0.717, 0.765) for the internal and external validations, respectively. The nomogram had a better discriminatory capacity for predicting OS than did the SEER summary stage (P < .001) or the American Joint Committee on Cancer tumor-node-metastasis staging system (8th edition; P < .001). The calibration plots revealed satisfactory agreement between the observed and predicted outcomes in both the internal and external validations. The nomogram-based 3- and 5-year OS predictions for patients with IBC exhibited superior accuracy over the existing models.


| INTRODUCTION
Inflammatory breast cancer (IBC) is a rare and aggressive clinicopathological entity of breast cancer (BC), accounting for 1%-6% of all BCs. 1 IBC is classified as T4d in the tumor-node-metastasis (TNM) BC staging classification and is characterized clinically by diffuse induration of the skin with an erysipeloid edge, usually without an underlying mass. 2 Because IBC is rare, data on IBC are mainly acquired from small, single-center, retrospective studies or are extrapolated from randomized prospective studies or clinical experience with non-IBC patients. 3 TNM staging is a common tool for predicting the outcomes of cancer patients by evaluating the tumor size and location (T), regional lymph node involvement (N), and distant metastasis (M). 4 However, TNM classification alone is insufficient to capture cancer biology or predict the outcomes of all BC cases, especially IBC. 5 Furthermore, other clinicopathological factors, such as race, grade, adjuvant treatments, and molecular characteristics, can influence the prognoses of IBC patients. 6,7 The nomogram, a simple, user-friendly method of statistical prediction, compares favorably with the traditional TNM staging system in multiple cancers. 8-12 However, no nomogram has yet been constructed for IBC. In this study, we used data from the Surveillance, Epidemiology, and End Results (SEER) database to identify patient and tumor characteristics that affect the survival outcomes of women with IBC and subsequently constructed a nomogram.

| Ethics statement
The National Cancer Institute's SEER program, initiated in 1973 and updated annually, uses population-based data to develop comprehensive data sources, 13 covering approximately 30% of the US population in several geographic regions. 14 For this study, we signed the SEER research data agreement to access SEER information, using reference number 16462-Nov2016. Data were obtained following the approved guidelines. The Office for Human Research Protection considered this research to be on nonhuman subjects because the data had been collected by the United States Department of Health and Human Services, were publicly accessible, and were de-identified. Thus, no institutional review board approval was required.

| Study population
Patient data were acquired from the SEER database (November 2016 submission). The SEER*Stat v8.3.5 tool, released on 6 March 2018, was used to identify and select eligible patients. The study period ran from 1 January 2004 to 31 December 2013. The inclusion criteria were as follows: (a) age at diagnosis ≥ 20 years; (b) women with primary IBC; and (c) an IBC diagnosis consistent with the International Classification of Diseases for Oncology, third edition (coded as 8530/3). The exclusion criteria were as follows: (a) age under 20 years; (b) more than one primary malignancy; (c) incomplete or unavailable survival data; (d) a clinical diagnosis only; (e) unavailable critical clinicopathological data, including marital status, race, American Joint Committee on Cancer (AJCC) 8th edition tumor stage, and surgical information; (f) death within 3 months after surgery; and (g) no prognostic data. Eligible patients were enrolled as the SEER primary cohort. Patients from five randomly selected registries (Alaska Natives, Atlanta, California, Detroit, and Greater Georgia) were assigned to the validation cohort, while the remaining patients were assigned to the training cohort.
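The registry-based assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the record layout (a dict with a "registry" key), the function name, and the toy records are assumptions; only the five registry names come from the text.

```python
# Registries the text names for the validation cohort.
VALIDATION_REGISTRIES = {
    "Alaska Natives", "Atlanta", "California", "Detroit", "Greater Georgia",
}

def split_by_registry(patients, validation_registries=VALIDATION_REGISTRIES):
    """Assign each patient record to the training or validation cohort.

    `patients` is assumed to be an iterable of dicts with a "registry" key;
    that layout is hypothetical and is not the SEER*Stat export format.
    """
    training, validation = [], []
    for patient in patients:
        if patient["registry"] in validation_registries:
            validation.append(patient)
        else:
            training.append(patient)
    return training, validation

# Toy records for illustration only (not SEER data).
toy = [
    {"id": 1, "registry": "Detroit"},
    {"id": 2, "registry": "Iowa"},
    {"id": 3, "registry": "Atlanta"},
    {"id": 4, "registry": "Utah"},
]
train, valid = split_by_registry(toy)
```

Splitting by registry rather than by individual patient keeps the validation cohort geographically distinct from the training cohort, which is presumably why it is treated as external validation here.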

| Covariates and endpoint
The following 12 clinicopathological variables were analyzed: age (<40, 40-49, 50-59, 60-75, or >75 years), race (white, black, or other), marital status (married or unmarried), grade (grade I/II, grade III/IV, or unknown), N stage (N0, N1, N2, or N3), M stage (M0 or M1), tumor extension (<50%, >50%, or unknown), hormone receptor (HoR; negative, positive, or unknown), HER2 (negative, positive, or unknown), surgery (no surgery, partial mastectomy, simple mastectomy, or radical mastectomy), chemotherapy (no/unknown or yes), and radiotherapy (no/unknown or yes). Widowed, separated, divorced, or single (having a domestic partner or never married) patients were classified as unmarried. Age was converted into a categorical variable according to recognized cutoff values. All eligible cases were regrouped according to the 8th edition of the AJCC TNM staging system. Tumor extension was defined as the percentage of tumor area in the affected unilateral breast. Tumor HoR status was classified as follows: HoR-positive (at least one positive result for estrogen receptor [ER] or progesterone receptor [PR]) or HoR-negative (negative results for both ER and PR). ER/PR-positive disease was defined as positive staining in 1% or more of the cells. 15 The primary endpoint in this study was overall survival (OS). OS was defined as the duration from diagnosis to the most recent follow-up date or date of death. The predetermined cutoff date was 31 December 2014, because the SEER 2016 submission database contains death information until 2014.
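The age grouping and the HoR rule above translate directly into code. The sketch below is illustrative only: the function names and the string labels are assumptions, and the handling of partially unknown ER/PR results is a guess flagged in the comments, since the text does not describe it.

```python
def age_group(age):
    """Map age at diagnosis (years) to the categories listed in the text."""
    if age < 40:
        return "<40"
    if age < 50:
        return "40-49"
    if age < 60:
        return "50-59"
    if age <= 75:
        return "60-75"
    return ">75"

def hormone_receptor_status(er, pr):
    """Combine ER and PR results ("positive", "negative", or "unknown").

    HoR-positive if at least one receptor is positive; HoR-negative only if
    both are negative. Treating every other combination as "unknown" is an
    assumption, not something the text specifies.
    """
    if er == "positive" or pr == "positive":
        return "positive"
    if er == "negative" and pr == "negative":
        return "negative"
    return "unknown"

# Illustrative patient: 56 years old, ER-negative, PR-positive.
example = (age_group(56), hormone_receptor_status("negative", "positive"))
```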

| Nomogram construction
Categorical variables were presented as frequencies and proportions and were compared using the chi-squared test. Univariate prognostic analysis was conducted using the Kaplan-Meier method and log-rank test. Significant prognostic factors identified in the univariate analysis were entered into a multivariate Cox proportional hazards model, and the corresponding 95% confidence interval (CI) was estimated for each potential risk factor. Afterward, a nomogram was constructed from the training cohort to predict 3- and 5-year OS by including all independent prognostic factors, using the rms package in R software, version 3.5.1.
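For readers unfamiliar with the Kaplan-Meier method used in the univariate step, a minimal pure-Python sketch of the product-limit estimator follows. It is not the authors' SPSS or R code; the function names and the toy follow-up data are invented for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient (e.g. months)
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival_probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve

def survival_at(curve, t):
    """Read the step function: estimated survival probability at time t."""
    prob = 1.0
    for time, s in curve:
        if time > t:
            break
        prob = s
    return prob

# Toy follow-up data in months (illustrative only, not SEER data).
toy_times = [6, 12, 12, 20, 36, 40, 60, 72]
toy_events = [1, 1, 0, 1, 0, 1, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
os_3_year = survival_at(curve, 36)
os_5_year = survival_at(curve, 60)
```

Reading the curve at 36 and 60 months gives the kind of 3- and 5-year OS estimates reported in this study; the log-rank test then compares such curves between groups.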

| Nomogram validation
The nomogram was validated internally (training cohort) and externally (validation cohort) using measures of discrimination and calibration. The concordance index (C-index) was used to evaluate the discriminative capacity of the nomogram; it measures how well the ordering of predicted outcomes agrees with that of actual outcomes. 16 A higher C-index indicates a superior discriminative capacity for survival outcomes. The rcorrp.cens function of the Hmisc package in R was used to compare the C-index of the nomogram with that of the SEER summary stage and the TNM 8th edition staging classification. Calibration plots compared marginal estimates with model predictions, indicating the agreement between nomogram-predicted and actual survival. A calibration curve lying along the 45-degree line indicates a perfect model, with complete consistency between the predicted and actual outcomes. SPSS software, version 19.0 (SPSS Inc, Chicago, IL, USA) and R software, version 3.5.1 (www.r-project.org) were used for statistical analysis, and P < .05 was considered statistically significant.
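To make the C-index concrete, the following is a minimal pure-Python sketch of Harrell's concordance index for right-censored data. It is an illustration rather than the Hmisc implementation: the function name is hypothetical, pairs with tied follow-up times are simply skipped, and the toy data are invented.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair is usable when the patient with the shorter follow-up died.
    It is concordant when that patient also has the higher predicted risk;
    tied risk scores count as half. Pairs with tied times are skipped,
    which is a simplification.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Patient i needs the strictly shorter time and an observed death.
            if times[i] < times[j] and events[i] == 1:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable

# Toy data (illustrative only): months of follow-up, death indicator,
# and a predicted risk score where higher means worse prognosis.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
c_index = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why a higher C-index indicates better discrimination.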

| Patient screening process
In total, 1651 eligible women diagnosed with IBC from January 2004 to December 2013 were included in the study. Figure 1 shows the specific screening process. The median follow-up was 31 months, ranging from 0 to 131 months. The median age at diagnosis was 56 years (range, 22-98 years). Among all patients, 983 and 668 subjects were assigned to the training and validation cohorts, respectively. Table 1 lists the demographic and clinicopathological features, and no statistically significant differences were found between the two groups.

| Nomogram construction
The 3- and 5-year OS rates were 52.8% and 39.5%, respectively. Figure 2 shows the OS curves for localized, regional, and distant disease. Figure 3 shows the OS curves for AJCC stage IIIA, IIIB, and IV disease. Table 2 lists the independent factors that significantly influenced OS in the multivariate analysis. Nine factors remained independent prognostic factors after adjustment for other risk factors: race (P < .001), marital status (P = .011), N stage (P = .002), M stage (P < .001), HoR (P < .001), human epidermal growth factor receptor-2 (HER2) (P = .001), surgery (P < .001), chemotherapy (P < .001), and radiotherapy (P = .010). These nine independent factors from the training cohort were incorporated into a nomogram for predicting 3- and 5-year OS rates (Figure 4). The nomogram showed that HER2 status and M stage contributed the most to the prognosis, followed by chemotherapy, HoR, surgery, N stage, radiotherapy, and marital status. The survival probability of each patient can be calculated easily by adding the scores for every variable.
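The scoring step can be illustrated with a short sketch of how a nomogram typically turns Cox regression coefficients into points: the variable with the widest effect range spans 0 to 100 points, and every other variable is scaled relative to it. The coefficients below are invented for illustration and are NOT the Table 2 estimates; the function names are hypothetical, and the scaling shown is the usual convention rather than a confirmed description of Figure 4.

```python
# Hypothetical log hazard ratios relative to each reference level.
# These numbers are invented and are NOT the estimates from Table 2.
COEFFICIENTS = {
    "M stage": {"M0": 0.0, "M1": 0.8},
    "chemotherapy": {"yes": 0.0, "no/unknown": 0.6},
    "marital status": {"married": 0.0, "unmarried": 0.2},
}

def build_point_table(coefficients, max_points=100.0):
    """Rescale coefficients so the widest-ranging variable spans max_points."""
    ranges = {
        var: max(levels.values()) - min(levels.values())
        for var, levels in coefficients.items()
    }
    widest = max(ranges.values())
    table = {}
    for var, levels in coefficients.items():
        lowest = min(levels.values())
        table[var] = {
            level: (beta - lowest) / widest * max_points
            for level, beta in levels.items()
        }
    return table

def total_points(point_table, patient):
    """Sum the points for one patient's variable levels."""
    return sum(point_table[var][level] for var, level in patient.items())

POINTS = build_point_table(COEFFICIENTS)
example_patient = {
    "M stage": "M1",
    "chemotherapy": "yes",
    "marital status": "unmarried",
}
score = total_points(POINTS, example_patient)
```

In a full nomogram the total is then read against a survival scale derived from the baseline survival function; this sketch stops at the point total.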

| Nomogram validation
The nomogram was internally and externally validated. The C-indexes for OS prediction by the nomogram were 0.738 (95% CI: 0.717, 0.759) and 0.741 (95% CI: 0.717, 0.765) for the training (internal validation) and validation (external validation) cohorts, respectively. Moreover, the discriminative capacity of the nomogram was compared with that of the SEER summary stage and the TNM 8th edition staging classification; the nomogram was significantly superior to both in the training and validation sets (P < .001; Table 3). Finally, the internal and external calibration plots of the nomogram showed good agreement between the nomogram-based predictions and actual outcomes (Figure 5).
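The idea behind a calibration plot can be sketched as follows: patients are grouped by predicted survival, and the mean prediction in each group is compared with the observed survival fraction. This is a simplified illustration, not the procedure behind Figure 5; it assumes every patient was followed to the time horizon (no censoring), and the function name and toy data are invented.

```python
def calibration_table(predicted, survived, n_groups=3):
    """Compare mean predicted survival with the observed fraction per group.

    predicted : model-predicted probability of surviving to the time horizon
    survived  : 1 if the patient was alive at the horizon, else 0
    Assumes complete follow-up to the horizon, which is a simplification;
    with censoring, the observed value would instead come from a
    Kaplan-Meier estimate within each group.
    """
    if len(predicted) < n_groups:
        raise ValueError("need at least one patient per group")
    order = sorted(range(len(predicted)), key=lambda k: predicted[k])
    size = len(order) // n_groups
    rows = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder.
        end = (g + 1) * size if g < n_groups - 1 else len(order)
        members = order[start:end]
        mean_pred = sum(predicted[k] for k in members) / len(members)
        observed = sum(survived[k] for k in members) / len(members)
        rows.append((mean_pred, observed))
    return rows

# Toy predictions and outcomes (illustrative only).
toy_pred = [0.1, 0.2, 0.3, 0.5, 0.5, 0.6, 0.8, 0.9, 0.9]
toy_alive = [0, 0, 1, 0, 1, 1, 1, 1, 1]
rows = calibration_table(toy_pred, toy_alive)
```

Plotting each (mean predicted, observed) pair gives the calibration curve; points close to the 45-degree line indicate good agreement.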

| DISCUSSION
In total, 1651 IBC patients from the SEER database were analyzed. The constructed nomogram successfully predicted the 3- and 5-year OS of IBC patients, demonstrating favorable discrimination and calibration in both internal and external validation. Additionally, the nomogram demonstrated better predictive capacity than the SEER summary stage or the TNM 8th edition staging classification. Currently, IBC has no established risk factors. However, many epidemiological studies have clarified the characteristics of IBC. 3 Of these, the most important suspected risk factors associated with IBC include race, body mass index, and age. 7 Wu et al found that BC subtype is clinically useful for predicting survival in IBC. Patients with the HoR-/HER2- subtype had significantly poorer OS than did those with the other three subtypes. 17 Positive node involvement is also an adverse prognostic factor. In addition, ER/PR positivity and therapeutic approaches, including surgical resection and radiotherapy in node-positive patients, have been reported to improve outcomes. 18 In our study, we found nine independent prognostic factors of OS: race, marital status, N stage, M stage, HoR, HER2, surgery, chemotherapy, and radiotherapy.
IBC was historically treated with surgery and/or radiotherapy, with a 5-year OS under 5%. 18 Before 1950, the median survival for patients treated by mastectomy was 19 months, and none of these patients survived to 5 years. 6 Definitive radiotherapy without surgery yielded a 5-year recurrence-free survival rate of 17% and an OS rate of 28%. Combining surgery and radiotherapy improves OS. 19 Moreover, the introduction of systemic chemotherapy showed an additional survival benefit. 20 Thus, trimodal therapy, including chemotherapy, surgery, and radiotherapy, has gradually become the standard of care for IBC. This approach was established for the management of IBC at the First International Conference on IBC in December 2008. 21 Our study also found that surgery, chemotherapy, and radiotherapy significantly prolonged patient survival, confirming the effectiveness of trimodal therapy.
Nomograms are user-friendly statistical tools that estimate survival or another specific outcome through a simple graphical presentation. 22 Moreover, nomograms can predict outcomes better than conventional AJCC TNM staging for some malignant tumors and are recognized as an alternative or novel standard. 23,24 Additionally, nomograms can facilitate decision-making under complicated clinical conditions without the need for standard guidelines. 25,26
This study has several strengths. The clinicopathological data on IBC patients collected from the SEER dataset were detailed, thus helping to ensure the accuracy of our constructed nomogram. Moreover, our nomogram demonstrated superior discriminative power for predicting OS over the SEER summary stage and the TNM 8th edition staging classification. Calibration was used to confirm the validity and presentation of the nomogram. The nomogram uses easily accessible clinicopathological factors, which makes it convenient for clinical application.
Our study had some limitations. First, the nomogram was established retrospectively using the SEER database, which may have introduced selection bias. Second, some prognosis-related clinicopathological factors were unavailable in the SEER database, including vascular invasion and the specific radiotherapy and chemotherapy regimens; these will be a main focus of future studies. Finally, to keep the nomogram user-friendly for decision-making, some prognostic variables were not included; thus, the nomogram may not always yield accurate prognoses in clinical practice.

| CONCLUSION
In summary, our study was the first to construct a well-validated nomogram for women with IBC. This nomogram may help clinicians identify patients at a high risk of overall mortality within 3-5 years. However, as yet unknown prognostic factors must be further explored to optimize the nomogram, and more external validation is required.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.