Validation of a nomogram for predicting disease-specific survival after an R0 resection for gastric carcinoma

Abstract

BACKGROUND

A statistical model for predicting disease-specific survival in patients with gastric carcinoma, based on a single U.S. institution experience, was tested for validity in a sample of patients treated at different institutions.

METHODS

The authors analyzed 459 patients from the Dutch Gastric Cancer trial, which compared limited (D1) with extended (D2) lymph node dissection.

RESULTS

The discrimination ability of the nomogram with respect to 5- and 9-year disease-specific survival probabilities was superior to that of the American Joint Committee on Cancer (AJCC) staging system. There was considerable heterogeneity of risk within many of the AJCC stages. Calibration plots suggested that predicted probabilities from the nomogram corresponded closely to actual disease-specific survival. The gastric carcinoma nomogram performed well when applied to patients treated in a large number of institutions.

CONCLUSIONS

The nomogram provided predictions that discriminated better than the AJCC staging system, regardless of the extent of lymph node dissection. Patient counseling and adjuvant therapy decision-making should benefit from use of the nomogram. Cancer 2005. © 2005 American Cancer Society.

Although the incidence of gastric carcinoma is declining in Western Europe,1 the disease remains the second most common cause of cancer death worldwide.2 Surgery is the only curative treatment. The influence of the extent of gastric and lymph node resection is still being debated.3–5 Adjuvant chemoradiotherapy has also been proposed and tested in an attempt to improve local control and survival rates. The U.S. Intergroup study by the Southwest Oncology Group showed a significant overall survival benefit after postoperative chemoradiotherapy (i.e., a median overall survival period of 36 months vs. 27 months in the surgery-alone group), which led to standardization of this regimen in the United States.6 The trial was criticized, however, for the suboptimal surgery employed and the level of unresected lymph node disease. Surgical undertreatment, as observed in that trial, clearly undermined survival.7

Although the treatment delivered determines a patient's prognosis to a large extent, other factors, such as patient age and gender, the stage of disease at presentation, and tumor location and morphology, play a substantial role. Current staging modalities, which focus solely on the depth of tumor invasion and the presence of lymph node disease, do not take these factors into account. Nomograms have been developed to address this problem. They are predictive tools for the individual patient based on known prognostic variables, including the extent of surgical treatment. Nomograms help with patient counseling, follow-up scheduling, and clinical trial determination and have been developed for use in soft tissue sarcoma,8 prostate carcinoma,9–12 renal cell carcinoma,13 pancreatic carcinoma,14 and breast carcinoma.15 The statistical model developed for gastric carcinoma (Fig. 1) was able to predict an individual patient's probability of 5- and 9-year disease-specific survival (DSS) after an R0 resection for gastric carcinoma in a single-institution U.S. patient population involving 1039 patients treated from 1985 to 2002.16

Figure 1.

Nomogram for disease-specific survival. A/P: antrum/pyloric; B/M: body/middle one-third; GEJ: gastroesophageal junction; P/U: proximal/upper one-third; Int: intestinal; mix: mixed; Dif: diffuse; MM: mucosa; MP: propria muscularis; S1: suspected serosal invasion; S2: definite serosal invasion; S3: adjacent organ involvement; SM: submucosa; SS: subserosa. Reprinted with permission.

The purpose of the current study was to assess the validity of this prediction tool when applied to patients with different stages of disease at presentation who received different (surgical) treatments at different institutions. We also compared the discriminating value of the nomogram to the American Joint Committee on Cancer (AJCC) staging system.

MATERIALS AND METHODS

Patients were enrolled in the Dutch Gastric Cancer trial. This trial was undertaken between August 1989 and July 1993 and randomized patients with gastric carcinoma from 80 Dutch hospitals to receive either a limited (D1) or an extended (D2) lymph node dissection as recommended by the Japanese Research Society for the Study of Gastric Cancer (JRSGC).17, 18 The results of this trial have been published.19–21 For the current analysis, patients were eligible if they had received an R0 resection, i.e., a resection with negative margins without any evidence of disease (n = 633). In agreement with our previous report, the following prognostic variables were assembled for use in validating the nomogram: age, gender, primary site (distal one-third, middle one-third, proximal one-third, and gastroesophageal junction), Lauren histotype (diffuse, intestinal, mixed), number of positive lymph nodes resected, number of negative lymph nodes resected, and depth of invasion as defined by the standard nomenclature.22 The nomogram distinguishes between suspected and definite serosal invasion. However, pathologic analysis from the Dutch trial did not distinguish between these depths. For purposes of nomogram validation, we calculated the nomogram prediction assuming a point halfway between these two points on the nomogram. Patients with ≥ 1 missing value were excluded (Lauren histotype, n = 126; size, n = 19; primary site, n = 41), leaving 459 patients who had values for all nomogram predictor variables, AJCC stage information, and complete follow-up. For each of these patients, the nomogram 5- and 9-year DSS probabilities were computed and compared with the AJCC staging system on the basis of discrimination ability, as measured by the concordance index. DSS was estimated using the Kaplan–Meier method.

Nomogram validation comprised two activities. First, discrimination was quantified with the concordance index.23 Similar to the area under the receiver operating characteristic curve, but appropriate for censored data, the concordance index provides the probability that, in a randomly selected pair of patients in which one patient dies before the other, the patient who died first had the worse predicted outcome from the nomogram.
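The pairwise definition of the concordance index given above can be sketched in a few lines of Python. This is an illustration only, not the authors' code (the analysis was run in S-plus with the Design and Hmisc libraries); the function name, argument names, and the choice to skip pairs with tied follow-up times are assumptions made for this sketch.

```python
from itertools import combinations


def concordance_index(times, events, predicted_survival):
    """Concordance index for right-censored data (simplified sketch).

    times              -- follow-up time for each patient
    events             -- 1 if the patient died of disease, 0 if censored
    predicted_survival -- predicted survival probability (higher = better)
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied follow-up times are skipped
        # Let a be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored: unknown who died first
        usable += 1
        if predicted_survival[a] < predicted_survival[b]:
            concordant += 1.0  # patient who died first had the worse prediction
        elif predicted_survival[a] == predicted_survival[b]:
            concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable patient pairs")
    return concordant / usable
```

A value of 0.5 corresponds to chance ranking and 1.0 to perfect ranking, so a difference such as 0.77 vs. 0.75 refers to the proportion of usable patient pairs that are ordered correctly.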

Second, calibration was assessed. This was done by grouping patients with respect to their nomogram-predicted probabilities and then comparing the mean of the group with the observed Kaplan–Meier estimate of DSS. All analyses were performed using S-plus 2000 professional software (Statistical Sciences, Seattle, WA) with the Design and Hmisc libraries added.24
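The calibration procedure described above (group patients by predicted probability, then compare the group mean with the Kaplan–Meier estimate) might look roughly as follows. This is a minimal sketch under stated assumptions, not the authors' S-plus implementation: the function names, the equal-size grouping rule, and the default of four groups are hypothetical choices for illustration.

```python
def kaplan_meier_at(times, events, horizon):
    """Kaplan-Meier estimate of survival at `horizon` (e.g., 5 or 9 years)."""
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - deaths / at_risk
    return survival


def calibration_table(times, events, predicted, horizon, n_groups=4):
    """Return (mean predicted, observed Kaplan-Meier) pairs per risk group.

    Patients are sorted by predicted probability and split into `n_groups`
    groups of (nearly) equal size; this grouping rule is an assumption.
    """
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    rows = []
    for g in range(n_groups):
        start = g * len(order) // n_groups
        stop = (g + 1) * len(order) // n_groups
        idx = order[start:stop]
        if not idx:
            continue  # fewer patients than groups
        mean_predicted = sum(predicted[i] for i in idx) / len(idx)
        observed = kaplan_meier_at(
            [times[i] for i in idx], [events[i] for i in idx], horizon
        )
        rows.append((mean_predicted, observed))
    return rows
```

Plotting the mean predicted probability against the observed estimate for each group gives a calibration curve; points close to the diagonal indicate that predicted and observed survival agree.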

RESULTS

Table 1 lists the patient and tumor characteristics of the 459 eligible patients with all the information available for the nomogram calculation. With a median follow-up of 10 years, 194 of the 459 patients had died of disease. DSS by AJCC stage grouping is shown in Figure 2, suggesting a reasonable number of patients alive at both 5 and 9 years for nomogram validation. The concordance index for the nomogram was 0.77. Calibration of the nomogram (Fig. 3) appeared to be accurate for both the 5- and 9-year predictions.

Table 1. Patient and Tumor Characteristics of All Patients with Available Information on Nomogram Predictor Variables

Characteristics                           No. of patients (%)
Gender
  Male                                    270 (59)
  Female                                  189 (41)
Primary site
  A/P                                     199 (43)
  B/M                                     191 (42)
  GEJ                                     69 (15)
Lauren
  Mixed                                   17 (4)
  Intestinal                              337 (73)
  Diffuse                                 105 (23)
Stage
  IA                                      102 (22)
  IB                                      115 (25)
  II                                      117 (26)
  IIIA                                    69 (15)
  IIIB                                    24 (5)
  IV                                      32 (7)
Depth
  Mucosa                                  81 (13)
  Submucosa                               100 (16)
  Propria muscularis                      93 (15)
  Subserosa                               215 (34)
  Suspected/definite serosal invasion     132 (21)
  Adjacent organ involvement              12 (2)
No. of negative lymph nodes
  Minimum                                 0
  1st quartile                            13
  Median                                  21
  Mean                                    24
  3rd quartile                            32
  Maximum                                 105
No. of positive lymph nodes
  Minimum                                 0
  1st quartile                            0
  Median                                  1
  Mean                                    3.5
  3rd quartile                            5
  Maximum                                 28
Size (cm)
  Minimum                                 0
  1st quartile                            3
  Median                                  4
  Mean                                    5
  3rd quartile                            6
  Maximum                                 24
Age (yrs)
  Minimum                                 31
  1st quartile                            57
  Median                                  66
  Mean                                    64
  3rd quartile                            73
  Maximum                                 84

A/P: antrum/pyloric; B/M: body/middle one-third; GEJ: gastroesophageal junction.

Figure 2.

Disease-specific survival by American Joint Committee on Cancer stage grouping.

Figure 3.

Calibration curves for the nomogram. X-axis is nomogram predicted probability. Patients were grouped by quartiles of predicted risk. Y-axis is actual disease-specific survival as estimated by the Kaplan–Meier method. Solid line: 5-year prediction; dotted line: 9-year prediction. Vertical bars represent 95% confidence intervals (95% CI). For each quartile of both nomogram predictions, the 95% CIs overlap the diagonal “ideal” line, where predicted survival would exactly match actual disease-specific survival.

We compared predictions from the nomogram with those obtained by using the AJCC stage groupings. Individual AJCC stage groups and nomogram predictions were compared for their ability to rank the patients (i.e., the concordance index). Nomogram discrimination was superior to that of AJCC stage grouping (concordance index 0.77 vs. 0.75; P < 0.001, Z test). This difference is difficult to appreciate clinically. Figure 4 illustrates the discrepancies between the two prediction methods. Within each AJCC stage grouping is a histogram of nomogram-predicted probabilities, illustrating heterogeneity within many of the stages.

Figure 4.

Nomogram-predicted probabilities within each of the American Joint Committee on Cancer stages. The numbers in parentheses for each stage indicate the number of patients within that stage. Note the large variation in nomogram-predicted probability present within many of the stages.

DISCUSSION

Currently, patient prognosis is estimated on the basis of the AJCC staging system, and not on other factors like age, gender, or morphology, which may have an impact on DSS. Integrating these variables in a nomogram has yielded a model that is a more accurate predictor for DSS than is AJCC stage. Our study validates the predictive value of the nomogram, previously tested in a single U.S. institution.16 The difference in concordance index between the nomogram and the AJCC staging system is not great, and may therefore appear clinically irrelevant. However, the nomogram does discriminate more accurately than does the AJCC staging system (Fig. 4 illustrates the discrepancies in prognosis). Although neither system is a gold standard, the nomogram discriminates better, and for some patients the change in prognosis will be clinically meaningful. Accurate prediction can aid in individual patient counseling and in follow-up scheduling. It also may play a role in designing future trials and identifying subsets of patients within known AJCC stages who have different prognoses, and who may have different responses to novel adjuvant treatment regimens.

It is important that this model, shown to be valuable in a single-institution U.S. patient population, is valid in a multicenter European population of patients with gastric carcinoma. The type of gastric carcinoma management depends largely on where the patient is being treated. For example, in the United States, many patients with gastric carcinoma receive postoperative chemoradiotherapy,6 whereas adjuvant treatment is not the norm for patients in Europe. In the current patient population as well as in the original group of patients used to develop the nomogram, no adjuvant treatment was given, and the surgical treatment consisted of D1 or D2 lymph node dissection in all validation patients. This is more extensive surgery than undertaken in the general U.S. patient population. The American College of Surgeons evaluated surgical treatment for > 18,000 patients with gastric carcinoma between 1982 and 1987 and concluded that dissection of the celiac lymph nodes occurred in only 14% of patients.25 Of the 3804 patients who received a curative resection, only 695 (18%) had dissection of the lymph nodes along the celiac axis, hepatic artery, or splenic artery (N2 lymph nodes).26 Disease stage differs between the current patient population and the U.S. patients who were analyzed in our previous report. More patients in the current study had less advanced disease stage because we included only patients who underwent an R0 resection. Despite these major discrepancies between the series, the nomogram was a more accurate predictor than AJCC stage for determining DSS in a patient population treated in as many as 80 hospitals. This setting reflects common surgical practice in The Netherlands.

Patients in the current analysis were derived from the Dutch Gastric Cancer trial, which compared D1 with D2 dissection. The nomogram predicted well in this series even though the type of dissection was not a variable, per se, in the nomogram. The likely reason for this favorable outcome is that the numbers of positive and negative lymph nodes are predictor variables in the model. Thus far, there is still no overall difference in survival rates between the arms of the Dutch trial.21 Consequently, considering the type of resection as an input variable for nomogram construction does not seem to have additional value. Defining the extent of lymph node dissection (i.e., D1 or D2) requires intraoperative identification of all 16 lymph node stations as defined by the JRSGC.17, 18 Identification and subsequent resection of these separate stations may contribute to an improvement in clinical outcome, even in Western patients, considering recent publications that focus on adequate lymph node removal without critical organ resection, thus minimizing postoperative morbidity and mortality.27–29 Notwithstanding the efforts to improve locoregional control through extended lymph node dissection, meticulous dissection is not performed routinely in Western patients with gastric carcinoma, especially not outside the framework of clinical trials. Including the type of resection as a mandatory input variable in the predictive nomogram would make the nomogram less applicable in daily practice. However, the basis of the initial nomogram was an institution where extended lymph node dissection is performed in the majority, but not all, of patients. By requiring only the numbers of negative and positive lymph nodes resected for the nomogram computation without specifying their location, we believe that the extent of lymph node dissection is addressed sufficiently.

In conclusion, the gastric carcinoma nomogram performed well when applied to a validation dataset of patients with different stages of disease (from a large number of institutions) who were treated with a focus on lymph node clearance. The nomogram provided predictions that discriminated better than the AJCC staging system, regardless of the extent of lymph node dissection, and illustrated the heterogeneity of risk within many stages. With the availability of this external validation, individual patient counseling and tailored adjuvant therapy decision-making should be encouraged using the nomogram, which is freely available in software [available from URL: www.nomograms.org].

Acknowledgements

The authors thank Manish Shah for his contribution in an early analysis.
