Nonmetastatic breast cancer patients subsequently developing second primary malignancy: A population‐based study

Abstract Background With life span extending, breast cancer (BC) survivors may face the possibility of developing second primary cancer (SPC) and considerably shorten survivorship. However, little is known about multiple primary cancer (MPC) patients with nonmetastatic breast cancer as a first primary malignancy (BCFPM). Methods Here, we retrospectively analyzed data on cancer survivors with BCFPM diagnosed between 2010 and 2015 from the Surveillance, Epidemiology, and End Results (SEER) database. The prognostic factors for breast cancer‐specific survival (BCSS) were ascertained by the stepwise regression analysis and a competing risk model, and were integrated to the establishment of prognostic nomogram, of which the accuracy was measured by the calibration curve and the concordance index (C‐index). Results In total, 8616 patients were identified with 4.6% of 3‐year breast cancer‐ specific death (BCSD) and 8.6% of 5‐year BCSD. The most common SPC among BCFPM patients were female BC and lung cancer. Besides, the median latency time between BC and SPC was 22 months. At a ratio of 7:3, all patients were randomly categorized into a training cohort (n = 6032) and a validation cohort (n = 2584). By a proportional subdistribution hazards regression analysis, the following factors were considered to own independent prognostic abilities of BCSS: subtypes, grade, T classification, N classification, radiation, and sites of SPC. The nomogram could accurately predict 3‐year and 5‐year breast cancer‐associated survival of BCFPM patients with high internal and external validated C‐index, 0.715 (95% CI, 0.691–0.739), and 0.683 (95% CI, 0.642–0.724), respectively. Conclusions BC survivors remained a high risk of developing SPC and considerably shortened survival time. In this study, a favorable nomogram was constructed to as a prediction model for 3‐year and 5‐year BCSS of BCFPM patients, largely intending to prolong the life of these patients by assisting clinicians to make individualized follow‐up plans.


| INTRODUCTION
Breast cancer (BC) is widely recognized as one of the most prevalent malignancies with highest morbidity for women worldwide. The encouraging fact comes that the prognosis of BC patients have greatly improved over the past decades, largely due to improvement of early detection and systematic treatment. [1][2][3] Statistics from the American Cancer Society during 2009 through 2015 showed 5-year survival rate for early-stage BC survivors is over 90%, while approximately 30% for metastatic BC. 4 Therefore, majority of BC survivors possessed a prolonged life expectancy, especially those without metastasis at the initial diagnosis, which made it possible to develop multiple primary cancers (MPC). In the United States, up to 7.9% of cancer survivors were diagnosed with more than one primary cancer, emphasizing the role of these patients with MPC cannot be ignored. 5 Nonetheless, there are few researches on MPC, especially for MPC involving BC. In addition, it was evidence from large-scale data that the clinical prognosis of BC survivors with MPC is not optimistic. 6,7 At present, TNM staging system, the most extensively used approach to evaluate survival outcomes, is not suitable for these patients who are related to special biological characteristics and complicated prognosis judgement. Thus, a dedicated clinical prognostic model for patients with MPC involving BC is urgently needed. Nomogram is obviously a better choice for this demand. It is a concrete graphical prediction tool that visually exhibits multiple independent prognostic variables to predict the clinical outcome we set. 8,9 To our knowledge, no published research has performed a nomogram for specific prediction on prognosis of BC survivors with MPC.
In this study, we focused on BC patients subsequently developing second primary cancer (SPC), because the vast majority of MPCs was dual primary cancer (DPC). We collected the patients' available data in the Surveillance, Epidemiology, and End Results (SEER) database to make a deep understanding for this special group, including clinical information, the prevalent occurrence sites of the SPC and time duration between BC and SPC. In order to guide clinical strategies for patients, prognostic factors based on competing risk model were utilized to construct a systematic nomogram for accurate prediction in shortterm survival rates of MPC survivors with breast cancer as a first primary malignancy (BCFPM).

| Data source and study cohorts
A population-based research was retrospectively operated with data from the SEER database, which incorporates national information on tumor samples from 18 largescale cancer registries and is open to public for cancer studies. From 2010 to 2015, a total of 8616 MPC sufferers with BCFPM were enrolled in this study, definitely including patients with two or more primary malignancies. Besides, we mainly focus on patients with nonmetastatic BC at the time of diagnosis, with consideration of low long-term survival and complicated prognostic factors of metastatic (stage IV) BC. The particular clinical exclusion criteria were as follows: (1) male patients; (2) patients whose first primary malignancies were confirmed by autopsy or a death certificate; (3) patients with stage IV; (4) all required Information remain unknown. The elaborate patient screening process is exhibited in Figure 1.

| Definition of SPC
A SPC is defined as independent cancer arising at least 2 months after the first primary malignancy, excluding recurrence, and metastasis. Clinically, the diagnosis of a SPC mainly depended on differential pathology findings from the primary and secondary lesions. In addition, the 2007 Multiple Primary and Histology Coding Rules from SEER were developed to assist in the classification of multiple primary malignancies, which were determined by pathology/cytology reports, sites of primary cancer, the interval between diagnoses, histology, and tumor behavior. 10 It is of worth to note two key variables of identifying MPC in SEER, the "sequence number" of the MPC and "total number of in situ/malignant tumors for patient."

| Statistical analysis
Among patients with early diagnosed nonmetastatic BC, we should not overlook the significant impact of causes of death other than BC on this population. 11 Thus, we selected breast cancer-specific survival (BCSS) as our primary endpoint, whose definition turned to be the date from the diagnosis of the first primary malignancy (BC) until death due to BC. Moreover, competitive risk events were derived from non-breast cancer-specific causes of death. Simple random sampling was performed by the R software version 4.0.3, which contributed to the classification of a validation cohort and a training cohort with a ratio of 3 to 7. To achieve a more accurate estimated BCSS than Cox proportional hazard model, a proportional subdistribution hazards model, which could eventually calculate the survival rates of a particular outcome by considering competitive risk events, was utilized to ascertain prognostic factors associated with BCSS in the training cohort. 12 Eight variables screened by stepwise method were considered in univariate analysis, and statistically significant covariates (p < 0.05 in the univariable analysis) were calculated on multivariate analysis. Then, the factors with statistical significance in multivariate analysis (p < 0.05) were incorporated into the final prognostic nomogram for predicting 3-year and 5-year BCSS using the R package "rms." Both internal (derived from the training cohort) and external (derived from the validation cohort) validations were responsible for assessing the predictive precision of the models. The concordance index (C-index) would reflect the predictive ability of the models and demonstrate its higher accuracy as the value lowered within a range of 0.5 to 1.0. 13 The calibration curves were generated based on the comparison between the observed breast cancer-related survival rates and the expected BCSS probabilities predicted by the nomogram. Statistical analyses and modeling were performed using R version 4.0.3 and its appropriate packages. Two sided p < 0.05 was considered significant. Exclude male breast cancer (n=197) Exclude the patients whose first primary malignancies were confirmed by autopsy or a death certificate (n=781) Exclude the patients with stage IV and all required Information remain unknown (n=2126) 3 | RESULTS

| Baseline clinical characteristics
By primary screening, a total of 11720 patients with MPC involving BC diagnosed between 2010 and 2015 were abstracted from the SEER database. Of them, only 8616 MPC survivors with a prior nonmetastatic BC had information without missing so as to further analyze clinical and pathological characteristics ( Figure 1). The cumulative incidence of the 3-year, and 5-year breast cancerspecific death for BCFPM patients were 4.6% and 8.6%, respectively, as displayed in Figure 2. According to a ratio of 7:3, those cases were categorized into two groups at random, a training cohort (n = 6032) and a validation cohort (n = 2584). The median follow-up time of training cohort was 59 months. Their baseline demographic and clinicopathological characteristics are illustrated in Table 1. Besides, there were three groups in the entire study population classified by their age at the FPC (BC) diagnosis: young (aged ≤39 years), intermediate (aged 40-64 years), and elderly (aged ≥65 years). We set the cutoff points referred to previous published studies. 14,15 As shown in the table, the majority of patients in our analysis tended to be elderly, white, and married. In terms of molecular subtypes, 6606 (76.7%) patients presented with luminal A, followed by triple negative breast cancer (TNBC) patients with proportion of 11.3%. The other subtypes of luminal B and human epidermal growth factor receptor 2 (HER2) enriched accounted for 8.7%, 3.4%, respectively. Besides, a similar proportion of patients with either left breast tumor or right breast tumor were included in entire cohorts, whereas most of them had a tumor in the outer upper quadrant, at 34.1%. In terms of histologic type, ductal carcinoma is the largest group in our study. The patients diagnosed with T1, N0, or stage I, all over 50% of patients, had a higher frequency in both training and validation cohorts. Additionally, all patients underwent surgery, 60.6% of them suffer with lumpectomy, and 39.4% with mastectomy. Among them, patients receiving chemotherapy or radiation made up for 35.1% and 54.2%, respectively. In brief, above baseline variables were evenly distributed in both two cohorts.

| The occurrence sites of SPC and the time interval between two primary cancers
Among all the cases, we mainly focused on MPC with BCFPM, so patients were included if involved with two or more primary malignancies. In total, 80 sites of SPC were observed between the 8616 BCFPM patients. As we can see from the Table 2, female breast turned out to be the most common sites of SPC, which reached up to 30%, followed by lung and bronchus (13.4%), corpus uteri (5.8%), thyroid (5.6%), respectively. The median interval time of the dual primary cancers mentioned above was 17, 25, 27, and 13 months. Besides, in the whole study cohorts, the median latency time of SPC diagnosis after BC was 22 months, and up to 50.1% of these observations was shown to be over 2 years. Breast cancer survivors had the shortest median interval of 13 months for undergoing another primary malignancy in the thyroid or the kidney, while the longest was 30 months for pancreatic cancer.

| Independent prognostic factors for breast cancer-specific survival
A univariate analysis based on competing risk model was performed on data of 6032 patients in the training cohort, with the purpose of exploring prognostic factors for BCSS of BCFPM patients. It was showed that marital status, subtypes, tumor grade, T classification, N classification, F I G U R E 2 Estimates of overall cumulative incidence of breast cancerspecific death, taking the other causes of death as a competing event

| Development and validation of the prognosis nomogram
On the basis of all the factors significantly associated with BCSS in the competing risk regression analysis, the estimated 3-year, and 5-year BCSS for individuals were predicted by a nomogram (Figure 3). It can see from the Figure 3 that T classification was the strongest contributor to prognosis, followed by N classification, tumor grade, sites of SPC, subtypes, and radiation. Different states of each clinical variable corresponded to different scores on the point scale. By summing the scores of each item, we could easily acquire the estimated probability of the 3-year and 5-year BCSS for BCFPM patients. The C-index for the nomogram in the internal validation was 0.715 (95% CI, 0.691-0.739), and that of external validation was and 0.683 (95% CI, 0.642-0.724). As presented in Figure 4, the corresponding calibration curves of 3-year, and 5-year BCSS was drawn in each cohort, indicating a strong agreement between the nomogram-predicted probabilities and realistic observation.

| DISCUSSION
According to Global cancer statistics, BC has regarded as a main public health threat worldwide, with about 2.3 million new cases in 2020. 16  of early screening and the application of comprehensive treatment, the survival of BC patients has significantly improved, which naturally faces the possibility that these patients may develop the MPC. In recent years, patients with MPC have attracted researchers' attention for its highly 18% incidence in cancer survivors based on SEER data. 17 Previous studies indicated that MPC patients involving BC were related to worse outcomes than those with BC only. 7,18 Therefore, it is necessary for us to conduct in-depth study on survival prognosis of this population. However, as far as we have concerned, there were no published literature to identity prognostic factors and construct a nomogram of nonmetastatic BC patients subsequently developing SPC. In our study, a retrospective analysis was operated to analyze 8616 MPC survivors with BCFPM diagnosed between 2010 and 2015 from the SEER database, aiming to evaluate prognosis of these patients and provide valuable recommendations for clinical treatment.
Notably, no clear standardized follow-up plan is developed for MPC, especially for dual primary cancer survivors with prior BC. In general, the sites where the primary tumor tend to metastasize are of greater concern to clinicians during the follow-up period, leading to inevitable neglect for other sites of developing secondary primary cancer. However, BC patients with SPC greatly shortened its survival time. Thus, with the purpose of designing effective follow-up strategies, summarizing the common sites of SPC for patients with a history of BC is warranted. In this study, we clarified a total of 80 types of tumors as SPC among the 8616 BCFPM patients, and the most frequent sites of SPC are female breast, lung, corpus uteri, and thyroid, respectively. As similar to above results, a large-scale data analysis in Korea showed that thyroid cancer finally proved to be the most common MPC for patients with prior BC, accounting for 48.5%, followed by gynecologic cancer, upper gastrointestinal cancer, colorectal cancer, and lung cancer. 6 Those findings might remind clinicians that surveillance for female breast, lung, thyroid, and other common sites would also be preferable after the treatment of initial BC. In addition, we further explored the time interval between two primary cancers and revealed that median interval was 22 months, which partly address a matter of how often to follow-up after BC diagnosis. On the other hand, it is interesting to note that a few studies suggested treatment of primary cancers may increase the risk of developing SPC. 19,20 As observed in several studies, Hodgkin's lymphoma survivors were more likely to subsequently suffer BC, explained by the result of radiation therapy to the chest. 21,22 According to competing risk regression model, univariate and multivariate analyses were utilized to found that subtypes, tumor grade, T classification, N classification, radiation, and location of SPC appeared to be independent prognostic factors for BC survivors with SPC. Specifically, we demonstrated the relationship between dissatisfied survival outcomes and patients with MPC who were single at the time of BC diagnosis using SEER data. Besides, the result of our study showed that patients with TNBC in this group, compared to other molecular subtypes of BC, tended to be have a shorter survival time and similar finding could be observed in the general BC patients. 23 This may be affected by a high degree of malignant behavior of patients with TNBC in biological characteristics and its resistance to most conventional treatments except chemotherapy. 24,25 In addition, we also found the unfavorable prognostic role of an advanced TN stage among the BCFPM patients, which was consistent with conventional wisdom that poor clinical features of BC were associated with poor clinical outcomes. Another finding was that patients with BCFPM receiving radiation had a higher propensity for longer survival time than those without treatment. That suggested that patients with BCFPM might benefit from the intervention of comprehensive and effective regimens for cancer survivors.  However, to date, less is known that the relationship between the prognosis of patients with BCFPM and SPC. Several studies have declared that the definite diagnosis of SPC after BC was regarded as an independent risk factor and led to significantly shorten the life span of patients. 26,27 In our study, we not only confirmed that subsequent primary cancer indeed worsened survivorship in BC survivors using SEER data, but also further revealed that the survival of patients with BCFPM varied depending on occurrence sites of SPC. Among several groups of different locations of SPC, the adverse impact on survival of patients with second primary lung cancer was the most obvious, while influence on the patients who developed dual primary BC was relatively minimal. Despite of this, there were a few controversial findings from other studies. A retrospective analysis in Korea reported that a relatively good prognosis occurred in patients with thyroid cancer as SPC, but the prognosis of prior BC had nothing to do with it. 6 Conversely, another research found a poorer prognosis of patients who developed thyroid cancer after BC, compared to those with BC alone. 7 In this regard, it may be caused by differences in data sources and statistical analysis methods. Interestingly, in the case of second primary lung cancer, previous studies indicated that a BC history may result in better prognosis for non-small cell lung cancer (NSCLC) patients, which we think may be due to the effective and varied treatment of BC. 18 In a word, more further researches and statistical evidence are required to explore the influence of subsequent primary cancer on survival rate of patients with BCFPM. Our research has the following advantages. First of all, large-scale information from the SEER database is one of the highlights of our study, including 8616 BCFPM patients among 11720 patients with MPC involving BC. Moreover, the SEER database has high data accuracy, and the default histological criteria for defining MPC make the impact of misclassification errors in the study minimal. 10,18 Furthermore, to our knowledge, it is the first attempt to use nomogram to construct a clinical prediction model for survival time of BCFPM patients by calculating the 3-year or 5-year BCSS as the high C-index showing good consistency and ideal predictive value. Instead of Cox proportional hazard model, the utilization of the competing risk model strengthens the credibility and rationality of our research. Therefore, we had an unbiased estimate of BCSS to reflect a more realistic survivorship of nonmetastatic BC patients in the presence of competing risk. Finally, no matter the site distribution of SPC and the interval time between two primary cancers, or the nomogram we established for evaluation of patient survival prognosis, all these initial findings were inclined to facilitate clinicians to deepen the understanding of MPC patients with a prior BC for better clinical decision-making.
Certainly, some shortcomings do exist in the present study. The first disadvantage is the limitation of the collected data in the SEER database. In addition to the lack of detailed chemotherapy regimens, there is no complete information on family history, lifestyle characteristics, Body Mass Index (BMI), tumor markers, and other potential risk factors for BCFPM patients. Second, we have to admit the selection bias of the retrospective study despite the valuable ability of proportional subdistribution hazards model to adjust confounding factors. Third, as researches on prognosis of MPC patients involving BC is still not abundant, the published findings only serve as recommendations for clinicians instead of consensuses. Therefore, there is imperious demands to develop further prospective studies to validate our findings.

| CONCLUSION
In conclusion, we indicated that the most common site of SPC after BC was female breast, followed by lung, corpus uteri, thyroid, and so on. Additionally, the median interval time between two primary cancers in our study was 22 months, and second primary BC was the shortest of 13 months. Furthermore, our study revealed the prognosis of nonmetastatic BC patients subsequently developing SPC determined by subtypes, tumor grade, T classification, N classification, radiation, and location of SPC. A clinically useful competing risk nomogram was constructed to predict the 3-year and 5-year BCSS for BCFPM patients. Our findings could be helpful for clinicians to make individualized follow-up plans for these patients, which is expected to prolong the life of patients by early detection and treatment. However, more effort on studies about screening strategies for SPC patients after BC are needed.