Keywords: neoplasm; survival; period analysis; modeling


Abstract

Model-based projections have been shown to be useful for deriving the most up-to-date population-based cancer survival estimates. However, the performance of these projections, which can be derived by various approaches, has only been evaluated in very few cancer patient populations. Using incidence and follow-up data for 22 common cancers from 9 long-standing population-based cancer registries from diverse parts of Europe, we compared the performance of model-based period and cohort analysis for predicting 5-year relative survival of patients diagnosed in 1996–2000 against standard survival analysis approaches (cohort, complete and period analysis). Overall, model-based predictions provided a best estimate of the later observed actual survival on 135 of 198 occasions, compared to 25, 18 and 33 occasions for cohort, complete and period analysis, respectively. Projections based on cohort and period type modeling performed essentially equally well on average, and their performance was better for more common cancers, in registries with larger population bases, and for cancers subject to continuous clinical progress and/or ongoing screening efforts. Projections from model-based analysis may contribute to improved timeliness of monitoring of concurrent trends in population-based cancer survival in cancer registries operating in different populations and socioeconomic environments. © 2009 UICC

Timeliness is an important factor in the provision of population-based cancer survival estimates. However, survival estimates calculated by traditional methods, such as cohort and complete analyses, reflect the survival experience of patients diagnosed many years ago, and are therefore not suitable for monitoring recent trends in survival. The introduction and application of the period analysis methodology,1–3 and particularly the introduction of model-based period analysis,4, 5 have opened new opportunities for increasing the timeliness of population-based cancer survival estimates. In particular, it has been shown in a previous evaluation5 that modeling can be used to project survival estimates into the calendar years immediately following those for which data are already available, allowing for the compensation of the delay arising from the time necessary for achieving completeness in cancer registration, the work-up of cancer registry records, statistical analysis and reporting of results.

Only very few previous studies have examined the performance of model-based period analysis for projecting survival estimates,5, 6 and no broad-based evaluation involving various projection approaches for deriving extrapolated estimates by model-based analysis is currently available. In this study, we evaluated the performance of projections from model-based period and model-based cohort analysis alongside the standard methods of cohort, complete and period analysis in predicting the survival of cancer patients actually observed later, based on data from 9 European registries participating in the EUNICE (European Network for Indicators on Cancer) Survival Collaboration.

Data sources and statistical analysis


Data sources

The EUNICE survival database included patient data from 9 major geographical areas of Europe based on cancer registry incidence and follow-up data from at least 1980 onwards. The creation of the database, inclusion criteria and data preparation for survival analysis have been described in detail elsewhere.7 In brief, patients with 1 of the 22 most common malignant tumours diagnosed between 1980 and 2004 were selected, along with corresponding age, sex and calendar period-specific life tables to enable the calculation of relative survival estimates.

Statistical analysis

In our analysis, we first calculated the 5-year relative survival of patients diagnosed in 1996–2000, the most recent cohort of patients for which 5-year relative survival actually observed could be known based on data available in our data set. Next, we considered 5-year relative survival estimates that could have been derived from a data set including incidence and follow-up data from 1980 up to the end of the calendar year 1995 (i.e. the data set that could have been used during the years of diagnosis of the 1996–2000 cohort). Based on these data, we first derived the most up-to-date 5-year relative survival estimates by cohort, complete and period analysis approaches.

As illustrated in Figure 1, cohort estimates here pertain to patients diagnosed during the calendar years 1986–1990 and followed up through 1995. Complete estimates pertain to all patients diagnosed in 1986–1995, of whom some could not have completed 5 years of follow-up by the end of 1995 and had to be censored at that date unless they had died before. Period estimates are likewise based on patients diagnosed in 1986–1995, but only survival experience in 1991–1995 is used in the analysis.

To derive projected relative survival estimates, a Poisson regression model was used to project the survival expectations of patients diagnosed in 1996–2000 by extrapolating the trend in the preceding time periods, based on analysis techniques described in detail elsewhere,5 using a cohort and a period type model as illustrated in Figure 1. Briefly, the logarithm of the excess number of deaths was modeled as a function of follow-up year (categorical variable) and either 5-year cohorts of diagnosis (cohort type modeling) or calendar periods (period type modeling), using the logarithm of the person-time at risk as the offset. In both cases, the time variable was entered as a numerical variable coded 1981–1985 = 1, 1986–1990 = 2 and 1991–1995 = 3. Based on these models, projected excess numbers of deaths and conditional survival probabilities by year of follow-up for the calendar period 1996–2000 were obtained, assuming that the trends seen in the preceding cohorts/periods would continue into the next 5-year diagnostic period. Model-based estimates of 5-year relative survival for 1996–2000 were obtained as the product of these conditional survival probabilities.
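The projection step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SAS macros used in the study: the coefficients `alpha` and `beta` are hypothetical values standing in for a fitted model, the function name is invented for illustration, and the conversion from excess hazard to a conditional survival probability assumes a constant excess hazard within each 1-year follow-up interval, which the text does not specify.

```python
import math

def project_relative_survival(alpha, beta, period_code):
    """Return (conditional survival probabilities, cumulative relative survival).

    alpha: log excess hazard for each follow-up year at period code 0
    beta: linear trend per unit of the numerical period (or cohort) variable
    period_code: 1, 2, 3 for the observed periods; 4 for the projected 1996-2000
    """
    conditional = []
    cumulative = 1.0
    for a in alpha:
        # Linear predictor without the offset: log of excess deaths per person-year.
        excess_hazard = math.exp(a + beta * period_code)
        # Assumption: excess hazard is constant within the 1-year interval.
        p = math.exp(-excess_hazard)
        conditional.append(p)
        cumulative *= p  # 5-year estimate is the product of the conditional probabilities
    return conditional, cumulative

# Hypothetical coefficients, NOT estimates from the study.
alpha = [math.log(h) for h in (0.20, 0.12, 0.08, 0.06, 0.05)]
beta = -0.10  # negative trend: excess mortality declining over time

_, rs_last_observed = project_relative_survival(alpha, beta, 3)
conditional_projected, rs_projected = project_relative_survival(alpha, beta, 4)
print(f"period code 3: {rs_last_observed:.3f}, projected code 4: {rs_projected:.3f}")
```

With a negative trend coefficient, the projected value for period code 4 is higher than the value for the last observed period, which is the mechanism by which projection can compensate for the underestimation seen with the standard approaches.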


Figure 1. Incidence and follow-up years included in the analyses to predict 5-year relative survival of patients diagnosed in 1996–2000 using incidence and follow-up data up to the end of 1995.
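The data windows used by the three standard approaches can be made concrete with a short Python sketch. It works at whole-calendar-year resolution, which is a simplification (an actual analysis uses exact dates with left truncation and right censoring), and the function name `contributes` is hypothetical.

```python
def contributes(method, diag_year, followup_index, data_end=1995):
    """Return True if a follow-up year enters the given estimate.

    followup_index counts years since diagnosis (0 = first year, 4 = fifth).
    A follow-up year that starts in calendar year diag_year + followup_index
    can extend into the next calendar year, so it touches two calendar years.
    """
    if not 0 <= followup_index <= 4:
        return False
    first_year = diag_year + followup_index  # calendar year in which it starts
    last_year = first_year + 1               # calendar year in which it can end
    if first_year > data_end:
        return False                         # entirely after the data cut-off
    if method == "cohort":
        return 1986 <= diag_year <= 1990
    if method == "complete":
        return 1986 <= diag_year <= 1995
    if method == "period":
        # Only survival experience falling into 1991-1995 is used.
        return 1986 <= diag_year <= 1995 and last_year >= 1991
    raise ValueError(f"unknown method: {method}")

# Print one row per year of diagnosis; 'x' marks a contributing follow-up year.
for method in ("cohort", "complete", "period"):
    print(method)
    for diag_year in range(1986, 1996):
        row = "".join("x" if contributes(method, diag_year, k) else "." for k in range(5))
        print(f"  {diag_year}  {row}")
```

The printed grids show the rectangle (cohort), trapezoid (complete) and diagonal band (period) of years of diagnosis by follow-up year that each approach uses.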


All analyses were performed using the SAS statistical software system,8 and an adapted version of previously-described macros for relative survival analysis,9 including the modeling approach. In all analyses, expected survival estimates were calculated from population life tables according to the Ederer II method.10
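As a rough illustration of how relative survival is formed with Ederer II expected survival, the following Python sketch divides cumulative observed survival by cumulative expected survival, where the expected survival in each interval is averaged over the patients still at risk at the start of that interval. The input layout and the function name are assumptions made for this example, and the numbers are hypothetical.

```python
def ederer2_relative_survival(intervals):
    """Cumulative relative survival at the end of each follow-up interval.

    intervals: list of (observed, expected_probs) tuples, one per interval, where
      observed       = observed conditional survival probability in the interval
      expected_probs = life-table survival probability for the interval of each
                       patient still at risk at the start of the interval
    """
    cumulative_observed = 1.0
    cumulative_expected = 1.0
    result = []
    for observed, expected_probs in intervals:
        # Ederer II: average the expected probabilities over those still at risk.
        expected = sum(expected_probs) / len(expected_probs)
        cumulative_observed *= observed
        cumulative_expected *= expected
        result.append(cumulative_observed / cumulative_expected)
    return result

# Hypothetical two-interval example with very few patients.
example = [(0.80, [0.95, 0.97]), (0.90, [0.96])]
relative_survival = ederer2_relative_survival(example)
print([round(r, 4) for r in relative_survival])
```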

As summary indicators of performance, we calculated the number of best predictions by analysis approach, defined as the estimate(s) closest to the survival actually observed later. This calculation was repeated after successively excluding the worst performing method, i.e. for the 4, 3 and 2 best performing approaches. Furthermore, we calculated the average absolute deviation and the average deviation from the later observed survival, as well as the number of occasions on which the survival estimates derived by each analytic approach were below, within and above the range defined by the 95% confidence interval (CI) of the 5-year relative survival estimate actually observed for patients diagnosed in 1996–2000.
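The first two indicators can be sketched in Python as below. The function names and all numbers are hypothetical and serve only to show the definitions; ties are returned for every tied approach, so a tie is counted once for each approach involved.

```python
def best_methods(estimates, observed, tol=1e-9):
    """Return the approach(es) whose estimate is closest to the observed value."""
    errors = {method: abs(value - observed) for method, value in estimates.items()}
    smallest = min(errors.values())
    return sorted(m for m, err in errors.items() if err - smallest <= tol)

def deviation_summary(pairs):
    """pairs: list of (estimate, observed) in percent units.

    Returns (average absolute deviation, average deviation); a negative average
    deviation means the later observed survival was underestimated on average.
    """
    diffs = [estimate - observed for estimate, observed in pairs]
    n = len(diffs)
    return sum(abs(d) for d in diffs) / n, sum(diffs) / n

# Hypothetical estimates for one registry and cancer site (percent units).
estimates = {"C": 55.0, "CMP": 56.5, "P": 58.0, "MP": 60.0, "MC": 64.0}
print(best_methods(estimates, observed=62.0))   # MP and MC are tied here

# Hypothetical (estimate, observed) pairs for one approach across two sites.
print(deviation_summary([(60.0, 62.0), (65.0, 64.0)]))
```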


Results

Table 1 presents, for each cancer registry and cancer site analysed, the number of patients included for the calculation of the actually-observed survival estimates for the calendar period of 1996–2000, as well as the estimate and standard error of the 5-year relative survival observed for this diagnosis period. The total numbers of patients by cancer registry varied from 8,859 in Geneva to 117,251 in Scotland. Total numbers of cases by cancer site varied from 3,764 for testicular cancer to 69,033 for breast cancer.

Table I. Number of Cases Included in the Calculation, Point Estimate, and Standard Error of the 5-Year Relative Survival Estimate, 1996–2000, by Cancer Sites and Registry

Cancer site                 Total number of cases
Oral cavity                 12,320
Oesophagus                   7,625
Stomach                     19,935
Colorectal                  64,427
Liver                        4,815
Pancreas                    13,268
Larynx                       4,824
Lung                        64,195
Skin melanoma               15,876
Breast                      69,033
Cervix                       7,262
Corpus                      12,747
Ovary                       11,443
Prostate                    51,133
Testis                       3,764
Kidney                      13,107
Bladder                     19,120
Brain and nervous system     7,176
Thyroid                      4,673
NHL                         15,240
Multiple myeloma             5,973
Leukaemia                   10,666

Table 2 presents, by cancer registry and cancer site, the difference between estimates obtained by the different analytic approaches and the actual 5-year relative survival observed later. Overall, the difference was negative for 732 out of 990 (73.9%) comparisons, indicating that the large majority of estimates derived were lower than the actual later observed survival. Among the 5 analytic approaches, the proportions underestimated were 84, 83 and 81% and 62 and 61% for cohort, complete, period, and model-based period and cohort analysis, respectively. For many of the most common cancers, such as breast, colorectal, prostate, as well as for ovarian, kidney and thyroid cancer, non-Hodgkin lymphoma and leukaemia, model-based analysis provided the best estimates in all but 1 or 2 registries. Model-based predictions were less reliable for cancer sites associated with modest or no survival improvements with time (e.g., cancers of the liver, pancreas, larynx, brain and nervous system and multiple myeloma), or those for which high survival has been achieved in earlier periods, and the potential for further progress was limited (i.e. testicular cancer).

Table II. Differences Between 5-Year Relative Survival Actually Observed for Patients Diagnosed in 1996–1999 and the Most Up-To-Date Estimates Available from Cancer Registry Data Up to 1995 by Cohort (C), Complete (CMP), and Period (P) Analysis, as Well as Model-Based Period (MP) and Cohort (MC) Analysis
For each registry and cancer site, the best and worst performing survival estimates are indicated in bold and underlined, respectively.

As shown in Table 3, among the 198 combinations of cancer site and registry, and considering all 5 approaches first, projections from model-based analysis came closest to the 5-year relative survival later observed for patients diagnosed in 1996–2000 most often: on 69 and 66 occasions for model-based cohort and period analysis, respectively, whereas cohort, complete and period analysis provided a best estimate on 25, 18 and 33 occasions, respectively. The worst estimates (i.e. those furthest away from the survival later observed) were obtained 112 times by cohort analysis, compared with 14 and 25 times by complete and period analysis and 26 and 31 times by period and cohort based projections, respectively. When cohort analysis (which performed worst among the 5 initial approaches) was excluded from the comparison, the best predictions were still provided predominantly by one of the model-based approaches (71 and 72 of 198 occasions for period and cohort based modeling, respectively), while most of the worst predictions were provided by complete analysis (106 of 198 occasions). In the comparison of standard period and model-based estimates, the latter still provided better predictions and performed markedly better in avoiding a worst estimate. Finally, when the 2 model-based approaches were compared with each other, a best estimate was found nearly equally often, 107 and 101 times for period and cohort based modeling, respectively (equally good predictions were counted for both approaches).

Table III. Number of Best and Worst Predictions, by Analysis Approach and Registry
Column groups: all five approaches; four best approaches; three best approaches; two best approaches. Row groups: best predictions; worst predictions.
C, cohort analysis; CMP, complete analysis; P, period analysis; MP, model-based period analysis; MC, model-based cohort analysis.

Compared to estimates by traditional approaches, projections derived by either model-based approach had a lower average absolute difference from the later observed actual survival in all registries except Geneva, where model-based period analysis and period analysis performed best (see Table 4). The average deviation was always closest to zero for the cohort or period based projections, and its magnitude was very similar for the 2 model-based approaches. It was negative in all cases apart from 2 occasions for each of the model-based approaches, indicating that the later observed actual survival was on average underestimated.

Table IV. Average Absolute and Average Deviation from Later Observed Survival, by Analysis Approach and Registry, in Percent Units
C, cohort analysis; CMP, complete analysis; P, period analysis; MP, model-based period analysis; MC, model-based cohort analysis.


Table 5 shows the number of occasions on which the estimates derived by the different analytic approaches were below, within and above the 95% CI of the survival estimates actually observed. Of the 198 estimates derived by cohort analysis, 61% were below, 35% within and 4% above this interval. Proportions were slightly more favorable for complete and period analysis. With model-based projections, estimates fell within the 95% CI of the later observed survival estimate in 57 and 54% of cases for model-based period and cohort analysis, respectively, and severe underestimates (estimates below the 95% CI) occurred less frequently, in less than 40% of cases for both types of model-based analysis.
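The below/within/above classification can be sketched in Python as follows. The symmetric interval of the observed estimate plus or minus 1.96 standard errors is an assumption made for this example, since the text does not state how the 95% CI was constructed; the function name and the numbers are hypothetical.

```python
from collections import Counter

def classify_against_ci(estimate, observed, observed_se, z=1.96):
    """Classify an estimate relative to the 95% CI of the later observed value."""
    # Assumption: symmetric normal-approximation interval on the survival scale.
    lower = observed - z * observed_se
    upper = observed + z * observed_se
    if estimate < lower:
        return "below"
    if estimate > upper:
        return "above"
    return "within"

# Hypothetical (estimate, observed, standard error) triples in percent units.
examples = [(55.0, 62.0, 1.5), (61.0, 62.0, 1.5), (48.0, 45.0, 1.0)]
counts = Counter(classify_against_ci(e, o, se) for e, o, se in examples)
print(dict(counts))
```

Because the interval width depends on the standard error of the observed estimate, the same absolute deviation is more likely to fall outside the interval for common cancers in large registries than for rare cancers in small ones.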

Table V. Number of Occasions When Estimates Derived by Different Analysis Approaches Were Below (B), Within (W), and Above (A) the 95% Confidence Interval of the Later Observed Survival Estimate
C, cohort analysis; CMP, complete analysis; P, period analysis; MP, model-based period analysis; MC, model-based cohort analysis.


Standard errors of the model-based cohort estimates were on average 13% higher than those of the model-based period estimates, while the standard errors of the latter were on average 50% higher than the standard errors of the observed survival estimates for 1996–2000.


Discussion

This evaluation of survival projections by model-based analysis in the context of an international comparative cancer survival study indicates that, on average, model-based projections provide better survival estimates for very recently diagnosed patients than any other survival analysis approach, including the standard period approach. Estimates calculated by either type of model-based analysis had the lowest average difference from later observed actual survival, and were able to substantially reduce the underestimation of survival compared to other analytic approaches.

Our results are consistent with findings from a previous evaluation of the model-based projection approach based on a single cancer registry (the cancer registry of Finland).5 The previous evaluation had included a broader time frame, making use of the long time series of data available in the Finnish Cancer Registry. For this analysis, a single time window was used, but the evaluation was performed for many registries of very different sizes and from populations showing a large variation in levels of, and trends in cancer survival. The consistent findings in different settings suggest that projections from model-based analysis may be useful for further improving the timeliness of survival estimates in international cancer survival monitoring.

For the calculation of projected survival estimates, model-based analysis relies on the incorporation of survival trends from preceding calendar periods. The accuracy of the projected survival estimates therefore depends on whether the trends seen in the preceding calendar periods continue. The precision with which the trend can be assessed depends largely on the underlying sample size, as the role of random variation in trends diminishes with increasing sample size.

The empirical results of this analysis indicate that both cohort and complete analysis are inappropriate approaches for deriving up-to-date survival estimates, as they overwhelmingly provide the least up-to-date estimates. For most cancers with continuous clinical progress and/or ongoing screening efforts, period analysis also provides suboptimal estimates. Model-based projections were on average much closer to the later actually-observed survival for cancer registries with the largest catchment populations (Finland, Norway and Scotland), and provided overwhelmingly better predictions for more common cancers than estimates derived by other analytic approaches. On the other hand, model-based estimates performed comparatively less well for cancer sites with a lack of improvement in prognosis and with lower numbers of cases, in which case model-based projection is more affected by random variation, particularly in smaller cancer registries. In practical applications, projections focusing on more common cancers, and on cancer sites for which clinical progress or cancer control measures give reason to expect changes in population-based survival, may be particularly well suited for exploiting the benefits of model-based projections. Combining registries for analyses at the country or regional level rather than the single registry level may be a further approach to minimize the effects of random variation.

Several limitations of this work require careful consideration. This analysis examined only 2 modeling strategies from the many theoretically possible options.11–13 Modeling was used to derive purely projected survival estimates, exclusively on the basis of model-based extrapolation. Furthermore, owing to the limited time series of data available in the present analysis, the period estimates for 1981–1985 used in the modeling were “incomplete” because follow-up data from patients diagnosed before 1980 were lacking, while the cohort estimates for the calendar years 1991–1995 were based on less than 5 years of follow-up time for the more recent years. The latter modification appears necessary, however. A “pure” cohort approach, requiring 5 years of completed follow-up for all calendar years, would have made the cohort of 1986–1990 the most recent one that could be included in the analysis. This would have required a projection over a very long time (on average 10 years) and the unnecessary exclusion of more recent survival information. The modification explains the somewhat higher standard errors obtained in model-based cohort analysis compared with model-based period analysis.

We did not adjust for age in the current analysis; in theory, a sudden change in the age distribution of patients in consecutive periods may decrease the performance of the projections without age adjustment. However, overall, very little change occurred in the age distribution of patients irrespective of cancer site during the observation period, and for the few where more substantial changes were seen, increases or decreases in the age-specific proportions were quite linear.

In conclusion, this evaluation involving follow-up data from 9 cancer registries from diverse areas of Europe and 22 common cancers showed that model-based projections can be efficiently used to provide more up-to-date population-based survival estimates than other analytic approaches in common use, including the standard period analysis. The projection approach, either based on a cohort or a period type approach, appears to provide a valuable tool for estimating survival expectations of currently-diagnosed patients, particularly in situations where there is ongoing therapeutic and clinical progress.

Members of the EUNICE survival working group

Tiiu Aareleid (National Institute for Health Development, Estonia), Freddie Bray (Cancer Registry of Norway), Hermann Brenner (German Cancer Research Center, Germany), David Brewster (Scottish Cancer Registry, Scotland, UK), Jan Willem Coebergh (Eindhoven Cancer Registry, Netherlands), Emanuele Crocetti (Florence Cancer Registry, Italy), Adam Gondos (German Cancer Research Center, Germany), Timo Hakulinen (Finnish Cancer Registry, Finland), Bernd Holleczek (Saarland Cancer Registry, Germany), Maryska Janssen-Heijnen (Eindhoven Cancer Registry, Netherlands), Giedre Smailyte (Cancer Registry of Lithuania), Margit Mägi (Cancer Registry of Estonia), Jadwiga Rachtan (Cracow Cancer Registry, Poland), Stefano Rosso (Piedmont (Torino) Cancer Registry, Italy), Massimo Usel (Geneva Cancer Registry, Switzerland), Maja Primic Žakelj (Cancer Registry of Slovenia).

