Comparison of model fit and discriminatory ability of M category as defined by the 7th and 8th editions of the tumor‐node‐metastasis classification of colorectal cancer and the 9th edition of the Japanese classification

ABSTRACT Background In transitioning from the 7th edition of the tumor‐node‐metastasis classification (TNM‐7) to the 8th edition (TNM‐8), colorectal cancer with peritoneal metastasis was newly categorized as M1c. In the 9th edition of the Japanese Classification of colorectal, appendiceal, and anal carcinoma (JPC‐9), M1c is further subdivided into M1c1 (without other organ involvement) and M1c2 (with other organ involvement). This study aimed to compare the model fit and discriminatory ability of the M category of these three classification systems, as no study to date has made this comparison. Methods The study population consisted of stage IV colorectal cancer patients who were referred to the National Cancer Center Hospital from 2000 to 2017. The Akaike information criterion (AIC), Harrell's concordance index (C‐index), and time‐dependent receiver operating characteristic (ROC) curves were used to compare the three classification systems. Subgroup analyses, stratified by initial treatment year, were also performed. Results According to TNM‐8, 670 (55%) patients had M1a, 273 (22%) had M1b, and 279 (23%) had M1c (87 M1c1 and 192 M1c2 using JPC‐9) tumors. Among the three classification systems, JPC‐9 had the lowest AIC value (JPC‐9: 10546.3; TNM‐7: 10555.9; TNM‐8: 10585.5), highest C‐index (JPC‐9: 0.608; TNM‐7: 0.598; TNM‐8: 0.599), and superior time‐dependent ROC curves throughout the observation period. Subgroup analyses were consistent with these results. Conclusions While the revised M category definition did not improve model fit and discriminatory ability from TNM‐7 to TNM‐8, further subdivision of M1c in JPC‐9 improved these parameters. These results support further revisions to M1 subcategories in future editions of the TNM classification system.


| INTRODUCTION
Since the number of metastatic sites involved is an important prognostic factor for colorectal cancer, 1 the 7th edition of the tumor-node-metastasis classification (TNM-7) of malignant tumors (published in 2009) 2 subdivides M1 into M1a and M1b: metastasis confined to one organ (liver, lung, ovary, or non-regional lymph node(s)) is classified as M1a, and metastasis to more than one organ or the peritoneum as M1b. In the 8th edition of TNM (TNM-8; published in 2017), 3 colorectal cancer with peritoneal metastasis, which is found in approximately one fourth of patients presenting with M1 disease, 4,5 was categorized as M1c regardless of other organ involvement because its prognosis is worse than that for visceral metastases to one or more solid organs. 6,7 Notably, the two studies 6,7 cited in the American Joint Committee on Cancer (AJCC) cancer staging manual (8th edition) 8 did not include all stage IV colorectal cancer patients, as they excluded patients with peritoneal metastasis who underwent curative resection.
Japan has its own classification system for colorectal cancer. The 9th edition of the Japanese Classification of colorectal, appendiceal, and anal carcinoma (JPC-9) was published in 2018 by the Japanese Society for Cancer of the Colon and Rectum. 9 In JPC-9, M1c is further subdivided into M1c1 (metastasis to the peritoneum without other organ involvement) and M1c2 (metastasis to the peritoneum with other organ involvement). 10 To our knowledge, no study to date has compared TNM-7, TNM-8, and JPC-9 in detail. The present study therefore aimed to compare the model fit and discriminatory ability of the M category in these three classification systems. In addition, a classification system used to evaluate long-term outcomes should hold up when applied to both past and present cohorts, 2,3 particularly in view of advances in diagnostic and treatment modalities that have improved the overall survival (OS) of stage IV colorectal cancer patients. This aspect was examined by performing subgroup analyses, in which patients were divided into two groups by initial treatment year.
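The M1 definitions described above can be summarized as a small decision rule. The following Python sketch is illustrative only: the `Metastasis` record and the function names are hypothetical, and the TNM-8 and JPC-9 rules for M1a and M1b are assumed to match the organ-count rule given for TNM-7 once peritoneal cases are set aside.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metastasis:
    """Hypothetical record of distant spread for one stage IV patient."""
    n_organs: int     # distant organs involved, not counting the peritoneum
    peritoneum: bool  # True if peritoneal metastasis is present


def _check(m: Metastasis) -> None:
    # M1 disease requires at least one metastatic site.
    if m.n_organs < 0 or (m.n_organs == 0 and not m.peritoneum):
        raise ValueError("M1 requires at least one metastatic site")


def m_tnm7(m: Metastasis) -> str:
    """TNM-7: M1a = one organ; M1b = more than one organ or the peritoneum."""
    _check(m)
    return "M1b" if (m.peritoneum or m.n_organs > 1) else "M1a"


def m_tnm8(m: Metastasis) -> str:
    """TNM-8: peritoneal metastasis is M1c regardless of other organs."""
    _check(m)
    if m.peritoneum:
        return "M1c"
    return "M1a" if m.n_organs == 1 else "M1b"


def m_jpc9(m: Metastasis) -> str:
    """JPC-9: M1c is split by the presence of other organ involvement."""
    _check(m)
    if m.peritoneum:
        return "M1c2" if m.n_organs >= 1 else "M1c1"
    return "M1a" if m.n_organs == 1 else "M1b"


# Peritoneum only: M1b under TNM-7, M1c under TNM-8, M1c1 under JPC-9.
case = Metastasis(n_organs=0, peritoneum=True)
print(m_tnm7(case), m_tnm8(case), m_jpc9(case))
```

Under these rules, a patient with peritoneal metastasis alone moves from M1b (TNM-7) to M1c (TNM-8) to M1c1 (JPC-9), which is the group the three systems treat most differently.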

| Study population
Subjects were stage IV colorectal cancer patients who were referred to the Department of Colorectal Surgery or the Gastrointestinal Medical Oncology Division of the National Cancer Center Hospital from January 2000 to December 2017. Patients with appendiceal cancer or anal cancer, those with a histologic diagnosis other than adenocarcinoma (e.g., neuroendocrine carcinoma), and those with other concomitant advanced disease were excluded. The initial treatment strategy, such as curative resection including metastasectomy, palliative resection, and perioperative and palliative chemotherapy, was routinely decided during multidisciplinary team meetings attended by colorectal surgeons, medical oncologists, hepatobiliary surgeons, thoracic surgeons, and radiologists, taking into consideration disease severity, comorbidities, and patient condition.
This retrospective study was approved by the Institutional Review Board (IRB) of the National Cancer Center Hospital (IRB code: 2015-320).

| Data collection
The following parameters were retrospectively assessed using medical records: treatment year, gender, age, Eastern Cooperative Oncology Group (ECOG) performance status (PS) at initial treatment, primary tumor site (right-sided: cecum, ascending colon, hepatic flexure, and transverse colon; left-sided: splenic flexure, descending colon, sigmoid colon, rectosigmoid junction, and rectum 11 ), histological differentiation, type of systemic chemotherapy regimen (cytotoxic agent therapy without molecular targeted agents, such as fluoropyrimidine monotherapy, fluoropyrimidine plus oxaliplatin, and fluoropyrimidine plus irinotecan), use of at least one molecular targeted agent throughout the treatment course (i.e., bevacizumab, cetuximab, or panitumumab), and type of surgery (i.e., curative resection achieving R0 such as primary tumor resection and metastasectomy, including dissection of peritoneal metastasis; palliative resection such as primary tumor resection without metastasectomy; and unresected cases, including surgical procedures such as diverting stoma construction, bypass surgery, or probe laparotomy).

| Statistical analysis
Pearson's chi-square test for categorical variables and the Wilcoxon rank-sum test for continuous variables were performed to compare various patient background factors between the two subgroups (2000-2007 and 2008-2017). OS was defined as the interval between the date of stage IV colorectal cancer diagnosis and the date of death from all causes. Patients alive at the end of follow-up (March 31, 2020) were censored. Kaplan-Meier plots were used to estimate OS. Differences in survival were assessed with the log-rank test. The Akaike information criterion (AIC) is an information-based criterion that assesses model fit and can be used to compare various models with the same data set. AIC was calculated as follows: AIC = −2 × log(maximum likelihood) + 2 × (number of parameters in the model). The model having the smallest value is the preferred model. 12 AIC was applied to the Cox proportional hazards regression model to correct for potential bias in comparing prognostic systems with different numbers of parameters. Time-dependent receiver operating characteristic (ROC) curves and estimated area under the curve (AUC) were used to compare prognostic abilities of the three classification systems. Time-dependent ROC analysis is an extension of the ROC curve analysis and evaluates the power of discrimination of continuous indices for prognoses of time-dependent disease. 13 A predictive variable with a higher AUC indicates better discriminatory ability or prognostic accuracy. In addition, the discriminatory performance of the three classification systems was evaluated using Harrell's concordance index (C-index). 14 Harrell's C-index is an extension of the AUC analysis to censored survival data. 14 A larger C-index value indicates a better ability to predict outcomes.
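As a minimal sketch of the AIC formula given above, the following Python snippet computes AIC from a log-likelihood and a parameter count. The log-likelihood value used in the example is made up for illustration and is not taken from this study. The helper that derives the parameter count assumes the M category enters the Cox model as dummy variables with one reference subcategory, which is a common coding but is not stated in the text.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """AIC = -2 * log(maximum likelihood) + 2 * (number of parameters)."""
    return -2.0 * log_likelihood + 2.0 * n_params


def n_params_for_categories(n_categories: int) -> int:
    """Parameters needed for a categorical predictor.

    Assumption: dummy coding with one reference level, so a staging
    system with k subcategories contributes k - 1 coefficients.
    """
    if n_categories < 2:
        raise ValueError("need at least two categories")
    return n_categories - 1


# Hypothetical maximized log (partial) likelihood, for illustration only.
example_loglik = -5000.0
for name, n_cat in [("TNM-7", 2), ("TNM-8", 3), ("JPC-9", 4)]:
    k = n_params_for_categories(n_cat)
    print(name, k, aic(example_loglik, k))
```

With an identical fit, each additional subcategory would raise the AIC by 2, so a system with more subcategories attains a lower AIC only if its likelihood improves by more than the added penalty.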
Data are presented as numbers of patients, proportions (%), median and interquartile range (IQR), or median and 95% confidence interval (CI), as indicated. p < 0.05 was considered statistically significant. All statistical analyses were conducted using the JMP 14 software program (SAS Institute Japan Ltd.) and R version 3.5.3 (R Foundation for Statistical Computing). The R packages "stats," "timeROC," and "survival" were used for AIC analyses, time-dependent ROC analyses, and C-index analyses, respectively.

| Characteristics of the study cohort
The CONSORT diagram for this study is shown in Figure S1. Between January 2000 and December 2017, 1245 patients with stage IV colorectal, appendiceal, and anal carcinoma were referred to our hospital. After excluding 11 patients with appendiceal cancer, six with anal cancer, three with neuroendocrine cell carcinoma, and three receiving chemotherapy for other concomitant advanced cancer, the final study population consisted of 1222 patients. The median follow-up period for survivors was 38.4 months.
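The cohort flow above can be checked with simple arithmetic. The short Python sketch below uses only the counts stated in this paragraph; the dictionary keys are descriptive labels chosen here for readability.

```python
referred = 1245  # stage IV patients referred, January 2000 to December 2017

# Exclusion counts as stated in the text.
exclusions = {
    "appendiceal cancer": 11,
    "anal cancer": 6,
    "neuroendocrine cell carcinoma": 3,
    "chemotherapy for other concomitant advanced cancer": 3,
}

final_n = referred - sum(exclusions.values())
print(final_n)  # 1222
```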

| Subgroup analyses of long-term outcomes
Subgroup analyses of OS stratified by initial treatment year revealed that OS of patients in each M subcategory for all

| AIC values of the three classification systems
AIC values of each classification system are shown in Table 2. Analyses with the entire cohort revealed that the AIC value was lowest for JPC-9 (JPC-9: 10546.3; TNM-7: 10555.9; TNM-8: 10585.5; Table 2). In both subgroups, the AIC value was lower for JPC-9 than for TNM-7 and TNM-8.
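To make the comparison concrete, the following Python sketch ranks the three systems by the AIC values reported in the abstract and computes each system's difference from the best model. The variable names are illustrative. The sketch only restates the reported numbers and does not refit any model.

```python
# AIC values as reported in the abstract.
aic_values = {"JPC-9": 10546.3, "TNM-7": 10555.9, "TNM-8": 10585.5}

# The preferred model is the one with the smallest AIC.
best = min(aic_values, key=aic_values.get)

# Difference from the best model, rounded to the reported precision.
delta = {name: round(value - aic_values[best], 1)
         for name, value in aic_values.items()}

for name in sorted(delta, key=delta.get):
    print(f"{name}: AIC = {aic_values[name]}, delta = {delta[name]}")
```

On these figures, TNM-7 and TNM-8 sit 9.6 and 39.2 AIC units above JPC-9, respectively.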

| Time-dependent ROC analyses of OS
Time-dependent ROC curves were generated to compare sequential trends in discriminatory ability of the three classification systems for OS (Figure 5). The time-dependent ROC curve for JPC-9 was consistently superior to curves for TNM-7 and TNM-8 for all observation periods. Furthermore, subgroup analyses of time-dependent ROC curves by treatment year revealed that the time-dependent ROC curve for JPC-9 was again superior to curves for TNM-7 and TNM-8 for all observation periods in both subgroups.

Table 2 shows Harrell's C-index of the three classification systems. Analyses with the entire cohort revealed that Harrell's C-index was higher for JPC-9 compared to TNM-7 and TNM-8 (Table 2). Furthermore, subgroup analyses by treatment year revealed that, in both subgroups, Harrell's C-index was again higher for JPC-9 compared to TNM-7 and TNM-8 (Table 2).
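For readers unfamiliar with Harrell's C-index, the following pure-Python sketch shows one simple way to compute it for right-censored data. It is not the implementation used in this study, which relied on the R "survival" package. The function name and the toy data are hypothetical, and pairs with tied survival times are skipped as a simplification.

```python
def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index for right-censored survival data.

    A pair (i, j) is usable when patient i had an observed event and a
    shorter time than patient j. The pair is concordant when patient i
    also has the higher risk score; ties in score count as 0.5.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (hypothetical): months of follow-up, event flag, risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
scores = [3, 2, 2, 1]
print(harrell_c_index(times, events, scores))
```

Because a staging category takes only a few distinct values, many usable pairs tie on the score and contribute 0.5 under this definition, which pulls the index toward 0.5 for any coarse categorical predictor.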

| DISCUSSION
When considering a classification system to stage cancer patients, care must be taken in interpreting the results. Whereas two previous studies have shown that the prognosis of patients with peritoneal metastases is worse than that of patients with visceral metastases to one or more solid organs, 6,7 the patient populations of these studies were not uniformly stage IV colorectal cancer patients, but rather patients with unresectable metastatic colorectal cancer. Similarly, while a Japanese multi-institutional retrospective study reported that the OS of patients with M1c1 tumors was significantly longer than that of patients with M1c2 or M1b tumors, 96% of patients subjected to analysis in that study had undergone resection of the primary tumor, 15 implying that unresected cases comprised only 4% of the patient population and that not all stage IV colorectal cancer patients were included. In the present study, patients undergoing resection comprised 57% of the 2008-2017 subgroup. This is consistent with the annual rate of primary tumor resection in stage IV colorectal cancer patients of 57.4% in 2010, as reported using data from the National Cancer Institute's Surveillance, Epidemiology, and End Results CRC registry in the United States. 16 Given that our data accurately represented real world stage IV colorectal cancer patient populations, we considered it feasible to compare the three classification systems for stage IV colorectal cancer in general. As a result, we found that JPC-9, which subdivides peritoneal metastasis (M1c) based on the absence or presence of other organ involvement (M1c1 and M1c2, respectively), is superior to TNM-7 and TNM-8 for predicting OS in stage IV colorectal cancer patients. The number of M1 subcategories varies across the three classification systems, with TNM-7 having two subcategories (M1a and M1b), TNM-8 having three subcategories (M1a, M1b, and M1c), and JPC-9 having four subcategories (M1a, M1b, M1c1, and M1c2). 
The number of subcategories in the model enters the AIC calculation through the penalty term, so a model with more subcategories has a higher AIC unless the added subcategories improve the fit sufficiently. Nonetheless, despite having the highest number of subcategories, JPC-9 had the lowest AIC of the three classification systems. These results suggest that dividing M1c into M1c1 and M1c2 contributes to improvements in model fit.
We also found that patients with M1c1 tumors had better OS than those with M1b and M1c2 tumors, regardless of treatment year. According to previous studies, when R0 resection of peritoneal metastasis was achieved, 5-year OS rates ranged from 28.7%-34.1% among patients with M1c tumors. 17,18 Another study reported no significant difference in survival outcomes for patients with M1c tumors and patients with liver metastases who could achieve curative resection (5-year OS rates: 32.1% and 33.3%, respectively), 19  has a better model fit and discriminatory ability compared to TNM-7 and TNM-8. Since the classification system used must in principle be reliable regardless of the type of treatment or when the treatment was performed, 2,3 we also conducted subgroup analyses by dividing patients into two groups based on the period before and after introduction of targeted therapy in Japan. In this analysis, JPC-9 was reliable both before and after the introduction of targeted therapy. Thus, JPC-9 is valid also for patients with stage IV colorectal cancer who did not receive targeted therapy. On the other hand, TNM-8 did not show a good model fit and its discriminatory ability improved only marginally relative to TNM-7, and only in the 2008-2017 subgroup. This suggests that the update to the M category in TNM-8 relative to TNM-7 failed to consistently improve model fit and discriminatory ability over time.
This study has some limitations. First, because the study was retrospective in design, bias may exist. Second, although consecutive patients were enrolled, the study period was from 2000 to 2017. During this long period, treatment strategies, including intensive chemotherapeutic regimens, have changed significantly, as has perioperative awareness of peritoneal metastasis. Thus, our study may not be fully reflective of current medical practice using newly developed treatment and diagnostic modalities. Third, a strategy unique to Japan (i.e., R0 resection of peritoneal metastasis from colorectal cancer 17,18,20 ) was performed in some patients with limited peritoneal metastases, whereas in Western countries, cytoreductive surgery (peritoneal stripping surgery) with hyperthermic intraperitoneal chemotherapy (HIPEC) is more commonly performed. Since this strategy could affect the prognosis of patients with M1c tumors, our findings might not be generalizable to Western patient populations. That said, discussions on the extent of peritoneal resection for peritoneal metastases, such as 'The extent of peritonectomy should vary according to the primary site. For colorectal peritoneal metastases, less extensive resection may be sufficient.', have begun in Western countries, 21,22 which might support a strategy unique to Japan. Fourth, since our data are based on treatments in Japan, there could be bias in evaluating the validity of JPC-9 using our data. However, as described above, the difference in treatment strategy between Japan and Western countries is not major. Moreover, both the 7th and 8th editions of the TNM classification 2,3 mention that 'a system of classification is needed that is applicable to all sites regardless of treatment'. Thus, although our data were from Japan, it is reasonable to use them to compare the three classification systems for stage IV colorectal cancer.
Nonetheless, our findings warrant further consideration of M1c subcategorization and validation in a larger stage IV colorectal cancer patient population.

| CONCLUSION
Updates to the M category from TNM-7 to TNM-8 failed to improve model fit and discriminatory ability. On the other hand, JPC-9, which further divides M1c based on the presence or absence of other organ involvement, was superior to TNM-7 and TNM-8 for predicting OS in stage IV colorectal cancer patients. Our findings highlight the importance of updating staging classification systems regularly, particularly because new and important evidence accumulates even within the span of a few years. We anticipate that our results will serve as a reference for M1 category revision during the next update to the TNM classification system for malignant tumors.

ETHICAL APPROVAL STATEMENT
This retrospective study was approved by the Institutional Review Board (IRB) of the National Cancer Center Hospital (IRB code: 2015-320).