Chordomas are rare bone tumors arising from remnants of the embryonic notochord.
Data for this study were obtained from the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute (1973-2009) to calculate the incidence, relative survival (RS), and standardized mortality ratio (SMR) of patients diagnosed with intracranial and extracranial chordomas and to assess the effects of age and sex on this disease.
The overall incidence of extracranial and intracranial chordomas was 8.4 per 10 million population. The median overall survival of patients with chordoma was 7.7 years. The median survival was 7.7 years for male patients and 7.8 years for female patients. Younger patients (aged <40 years) survived longer compared with older patients (10-year RS, 68% vs 43%). The estimated age-standardized 5-year, 10-year, and 20-year RS rates were 72%, 48%, and 31%, respectively. The SMR was 4.6 (95% confidence interval, 4.22-5.0) in the overall cohort, 21.0 (95% confidence interval, 16.6-27.2) in young adult patients, and 3.0 (95% confidence interval, 2.6-3.4) in elderly patients.
The elderly had a more aggressive form of this disease; and, other than the incidence, sex did not influence outcome in this disease. The study of chordomas presents a good case for the contribution that the SMR can have on measuring the impact of a disease on a population of patients. Although the younger population has better survival rates, the impact (SMR) in the younger age groups is much higher than in older populations. Cancer 2013;119:2029–2037. © 2013 American Cancer Society.
Chordomas are rare bone tumors believed to arise from vestigial or ectopic notochordal remnants alongside the neuraxis at developmentally active sites, such as the ends of the neuraxis and the vertebral bodies. They arise at cranial, spinal, and sacral sites in 32%, 33%, and 29% of cases, respectively. Skull base location is more frequent in females and in African American patients. Although most chordomas are sporadic, familial clusters have been reported, suggesting inherited susceptibility in a minority of patients. It is noteworthy that these tumors also can occur in association with the tuberous sclerosis complex.[8, 9] Although distant metastasis may occur, chordomas are generally low-grade neoplasms that display a local malignant behavior, characterized by locally aggressive growth patterns and high local recurrence rates, which warrant aggressive local treatment to achieve disease control.
The standard treatment for this tumor is surgery, but the ability to obtain a complete surgical resection remains elusive for many patients, and more than half of chordoma patients have some degree of residual tumor on postoperative imaging. Moreover, chordomas grow in a lobulated fashion, and microscopic tumor digitations can be observed at a distance from the main tumor core in macroscopically normally appearing bone and surrounding soft tissues. Therefore, a radical total resection is rarely feasible in the skull base, where surrounding, normally appearing bone cannot be safely resected; and, in the spine, true total resection requires total, en-bloc vertebrectomy.[12, 13] A dose-response has been demonstrated for this tumor, which is why radiation therapy is usually administered postoperatively. More specifically, highly conformal, particle-beam, high-dose radiotherapy is recommended, because the critical anatomic structures often limit the administration of high doses of radiation. Various prognostic factors, including but not limited to sex, age,[15, 16] and tumor volume,[16, 22, 24] have been identified for this neoplasm, but the clinical significance remains questionable, because these findings are not consistent. For instance, female sex has been suggested as a negative prognostic feature in some but not all studies.[2, 19, 25, 27, 28]
The epidemiologic patterns of chordomas have been published previously. In this study, we update an analysis of the Surveillance, Epidemiology, and End Results (SEER) data and provide true cause-specific, age-standardized survival estimates. In addition, we model the differences in survival across age groups in the face of differing expected mortality rates by using relative survival (RS) methods to assess the effect of age and sex on prognosis. Lastly, using chordomas as an example, we demonstrate how the standardized mortality ratio (SMR) can be used as a metric to describe the “impact” of a disease on a patient population.
The US National Cancer Institute's 18 registries within the SEER Program were used to identify patients with intracranial and extracranial chordomas (International Classification of Diseases for Oncology, third edition code 9370/3). The SEER database is a large, nationwide database that currently covers an estimated 25% of the US population. It has a reported 97.7% case ascertainment rate, indicating that only an estimated 2.3% of cancers have been missed. In addition, its ascertainment of radiation oncology records is approximately 99%. The Adolescent and Young Adult Oncology Progress Review Group's age criteria were used to define the following age groups: “children” (ages 0-15 years), “young adults” (ages 16-39 years), “adults” (ages 40-64 years), and “elderly” (aged ≥65 years). Incidence data were extracted and age standardized with SEER*Stat software (version 7.1.0; National Cancer Institute, SEER Program, Bethesda, Md) using the SEER 18 registries, which contain data up to 2009. To generate cause-specific survival measures, individual patient survival data were obtained from the SEER 17 registries and were matched to SEER expected mortality tables by year of birth, age, sex, and race using the Ederer II technique to create RS data for estimation and modeling.[32, 33] Because the data are limited to the most recent release of expected mortality tables, our analysis and follow-up periods of patients were limited to 2007.
The use of RS methods to quantify differences in survival between age groups is of paramount importance because of the well known differences in the underlying general population mortality estimates, and RS is modeled using the concept of excess hazards. For example, a white man aged 22 years in 2006 had a 0.0014% chance of dying of any cause that year, whereas a white man aged 70 years had a 2.5004% chance of dying of any cause in the same year (expected mortality or hazards). RS is an extension of the Kaplan-Meier survival estimation method but takes into account expected mortality estimates by using 2 data sets: standard SEER individual patient data and the US expected mortality database. It is calculated by dividing the observed survival rate of patients at a particular time (t) by the background (expected) survival rate for members of the general population matched for age, sex, and race. Therefore, if the mortality rate in a cancer population is higher than that of the general population, then the rate of death above that observed in the general population (the excess hazard) is attributed to the cancer.
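As a rough illustration of the ratio described above, the following sketch divides an observed survival proportion by the expected survival of a matched general-population group. The function name and the input values are hypothetical and are not taken from the SEER data; the sketch does not reproduce the Ederer II matching itself.

```python
def relative_survival(observed_survival: float, expected_survival: float) -> float:
    """Relative survival at time t: observed survival divided by expected survival.

    Both inputs are proportions; expected_survival comes from general-population
    life tables matched on age, sex, and race (supplied by the caller here).
    """
    if not 0.0 <= observed_survival <= 1.0:
        raise ValueError("observed_survival must be in [0, 1]")
    if not 0.0 < expected_survival <= 1.0:
        raise ValueError("expected_survival must be in (0, 1]")
    return observed_survival / expected_survival


# Hypothetical 10-year values for two age groups (illustrative only).
young_rs = relative_survival(0.65, 0.98)    # low expected mortality
elderly_rs = relative_survival(0.25, 0.60)  # high expected mortality

print(f"young adults: RS = {young_rs:.2f}")
print(f"elderly:      RS = {elderly_rs:.2f}")
```

Because the denominator shrinks as expected mortality rises, the same observed survival translates into a higher RS in an older group than in a younger one.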
The use of RS estimation over traditional cause-specific survival methods (in which patients who die are censored if their cancer is not considered to be the cause of death) is because there are significant inconsistencies across registries when coding deaths. For example, deaths among patients who commit suicide after receiving a diagnosis of cancer or patients who die because of radiation necrosis are difficult to code as cancer-specific or noncancer-related.
To describe the patient cohort, simple proportions were used along with logistic regression to measure the magnitude of the differences as appropriate. The finding of an age-by-follow-up interaction necessitates the use of age-standardized, cumulative RS to report the survival experience of the entire group. If excess hazard ratios (eHRs)—the summary statistic comparing hazard rates across age groups—vary across age groups during follow-up, then the groups that have higher eHRs/excess hazard rates earlier will include proportionately fewer patients representing their group later during follow-up. Therefore, direct standardization, as recommended by Hakulinen et al as the gold standard of RS estimation, is used to estimate overall group RS.
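The direct standardization mentioned above is not spelled out in the text. A minimal sketch, assuming the usual approach of a weighted average of age-specific RS with fixed standard weights, could look like the following; the RS values and weights are hypothetical placeholders, not study results.

```python
def age_standardized_rs(rs_by_age: dict, weights: dict) -> float:
    """Directly standardized RS: weighted mean of age-specific RS values.

    rs_by_age and weights must share the same age-group keys; weights are
    normalized internally, so they need not sum to 1.
    """
    if set(rs_by_age) != set(weights):
        raise ValueError("age groups in rs_by_age and weights must match")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(rs_by_age[g] * weights[g] for g in rs_by_age) / total_weight


# Hypothetical age-specific 10-year RS and standard-population weights.
rs = {"children": 0.70, "young adults": 0.68, "adults": 0.50, "elderly": 0.30}
w = {"children": 0.05, "young adults": 0.20, "adults": 0.40, "elderly": 0.35}

print(f"age-standardized RS = {age_standardized_rs(rs, w):.3f}")
```

Fixing the weights in advance keeps the summary from drifting toward the groups that survive longest as follow-up lengthens, which is the concern raised in the paragraph above.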
Regression modeling of survival was performed using the age categories described above and 2-year periods of follow-up. The dependent (outcome) variable was excess hazard, and 2 independent variables were used: age group and period of follow-up. The reference category was “young adults.” The model used was the piecewise constant hazards RS model described by Dickman et al. The measure presented by this model is the eHR, which is interpreted similarly to the traditional hazard ratio used when measuring differences in overall survival (OS) in models like the Cox proportional hazards model, except that it models excess hazard rates. An assumption made in this study was that of constant hazards; that is, we assume that hazards remain constant throughout each of the periods (year 1, year 2, etc).
The Dickman piecewise constant hazards model does not suffer from the issues associated with the traditional hazard ratio observed in proportional hazards models. First, it provides estimates of survival differences in a cause-specific manner because of the use of RS data. Second, this model is flexible enough to handle nonproportional hazards by using an interaction term. Through the use of interaction terms, it can detect “fork” or “reverse fork” type interactions, also known as covariate-by-follow-up time interactions.[37, 38]
Models with and without interaction terms are tested using the likelihood ratio test, and P values around .05 are considered to warrant closer attention. For this study, we chose not to set a particular significance level to warrant rejection of the null hypothesis. If the likelihood ratio test P value approximates .05 and there is visual evidence of poor fit of the proportional hazards model, then an interaction is considered to be present in the data, and the interaction model is deemed to be the model of best fit. This is because the nonproportional hazards model (interaction model) is a more detailed model, without proportional hazards as an assumption. In this case, a vector is estimated for each combination of year and age category. Subsequent models are always compared with a nonproportional model to evaluate model behaviors and to determine whether the subsequent estimates are appropriate. The fit is confirmed visually using an estimated versus observed plot and by plotting and testing deviance residuals for normality.
The standardized mortality ratio (SMR) is an epidemiologic measure that describes the impact a tumor has on the likelihood of death compared with the general population. Its concepts go back to the late 17th century when Mr. Daly sought to prove that the London Societies (early forms of life insurance companies) had annuity plans that were incorrect. The SMR is commonly used in cohort studies to compare the expected mortality in the control population (E) with the observed mortality in the study population (D). Here, the expected mortality estimate (E) is generated by matching each patient with the US expected mortality tables for the general population. Then, the observed deaths (D) are divided by the expected mortality (death count) estimate, as shown in Equation (1). Equation (2) demonstrates the calculation of the expected death count from grouped data, with nj representing the number at risk in interval j and wj representing the total censored patients during interval j. The total censored is divided by 2, because it is assumed that censoring occurs at random throughout the interval; therefore, the censored patients will contribute approximately half of their potential person-time for the interval j. Pj* represents the average expected survival (hence the use of the asterisk) for all patients included in interval j, and 1 − Pj* represents the average expected hazard rate. Overall, Equation (2) represents the sum of the interval-specific (j1, j2, j3 … jt) death counts, each obtained by multiplying the approximate total person-time at risk (nj − wj/2) by the expected hazard rate for that interval, which comes from matching patients included in the study to the expected mortality tables.
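The grouped-data calculation described for Equation (2) can be sketched as follows. This is an illustration under the definitions given in the text (number at risk, number censored, and average expected survival per interval); the function name and the interval values are hypothetical.

```python
def expected_deaths(intervals) -> float:
    """Expected death count from grouped data.

    intervals: iterable of (n_j, w_j, p_star_j) tuples, where
      n_j      = number at risk at the start of interval j
      w_j      = number censored during interval j
      p_star_j = average expected survival over interval j
    Each interval contributes (n_j - w_j / 2) * (1 - p_star_j).
    """
    total = 0.0
    for n_j, w_j, p_star_j in intervals:
        effective_at_risk = n_j - w_j / 2.0   # censored patients count for half
        expected_hazard = 1.0 - p_star_j      # average expected hazard rate
        total += effective_at_risk * expected_hazard
    return total


# Hypothetical follow-up intervals (illustrative values only).
example = [(100, 10, 0.98), (80, 20, 0.97)]
e = expected_deaths(example)
print(f"expected deaths E = {e:.2f}")
```

With these made-up inputs, the first interval contributes 95 × 0.02 and the second 70 × 0.03.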
Equation (3) demonstrates by integration the calculation of the expected death count for i = 1 to i = n individuals until time t, with the expected hazard rate (from expected mortality tables) as a known function: λi*(s) matched to each patient i = 1 to i = n.
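Written out from the definitions in the text, the 3 equations would take the following form. This is a reconstruction under that assumption rather than the authors' typeset version; k, the number of follow-up intervals, is a symbol introduced here for the summation limit.

```latex
\begin{align}
\mathrm{SMR} &= \frac{D}{E} \tag{1} \\
E &= \sum_{j=1}^{k} \left( n_j - \frac{w_j}{2} \right) \left( 1 - P_j^{*} \right) \tag{2} \\
E(t) &= \sum_{i=1}^{n} \int_{0}^{t} \lambda_i^{*}(s)\, ds \tag{3}
\end{align}
```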
For example, if 80 patients with tumor “X” die during the study period but only 10 patients would have been expected to die during that period, then the SMR would be 8. The SMR is interpreted in a manner similar to its interpretation in a cohort study; and, using the example above, a patient diagnosed with tumor “X” is 8 times more likely to die compared with a similar individual in the general population.
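The worked example above reduces to a single division, sketched below. The text does not state how the confidence intervals were computed, so the interval function uses Byar's approximation to the Poisson limits as a stand-in; it is an assumption of this sketch, not the authors' method, and the function names are hypothetical.

```python
import math


def smr(observed_deaths: float, expected_deaths: float) -> float:
    """Standardized mortality ratio: observed deaths divided by expected deaths."""
    if expected_deaths <= 0:
        raise ValueError("expected_deaths must be positive")
    return observed_deaths / expected_deaths


def smr_ci_byar(observed_deaths: int, expected_deaths: float, z: float = 1.96):
    """Approximate CI for the SMR using Byar's approximation (an assumption here).

    Treats the observed death count as Poisson and the expected count as fixed.
    """
    if observed_deaths <= 0 or expected_deaths <= 0:
        raise ValueError("counts must be positive")
    d = observed_deaths
    lower = d * (1 - 1 / (9 * d) - z / (3 * math.sqrt(d))) ** 3
    d1 = d + 1
    upper = d1 * (1 - 1 / (9 * d1) + z / (3 * math.sqrt(d1))) ** 3
    return lower / expected_deaths, upper / expected_deaths


# Example from the text: 80 observed deaths, 10 expected.
print(smr(80, 10))  # 8.0

# Overall cohort counts reported in the study: 531 observed, 116 expected.
low, high = smr_ci_byar(531, 116)
print(f"SMR = {smr(531, 116):.1f}, approximate 95% CI {low:.2f}-{high:.2f}")
```

The approximate limits need not match the published interval exactly, because the expected count is rounded in the text and the original interval method is unknown.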
The overall incidence of chordomas was 8.4 per 10 million population. Children were affected at a rate of 1.4 per 10 million population, and the rate was 4.3 per 10 million population for young adults, 10.8 per 10 million population for adults, and 26.2 per 10 million population for the elderly. Males had a rate of 10.6 per 10 million population, and females had a rate of 6.6 per 10 million population, effectively making the rate ratio 1.6 males affected for every female, and this appeared to be consistent throughout the age groups. Figure 1 demonstrates a unimodal age distribution of incidence rates, with gradually increasing incidence rates as the population ages (see Fig. 2).
The median age at diagnosis was 58 years (interquartile range, 29 years) and was the same for males and females. Approximately 65% of patients were aged <65 years at diagnosis. The variables presented in Table 1 were analyzed for imbalances, and, except for surgery, none appeared to be associated with the age categories specified here (all P values > .20). Adults and young adults were 2.2 times and 2.7 times as likely to undergo surgery as elderly patients, respectively.
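|Variable||Children, No.||Children, %||Young Adults, No.||Young Adults, %||Adults, No.||Adults, %||Elderly, No.||Elderly, %||Total, No.||Total, %|
|None or unknown||23||59||129||63||234||55||228||58||614||58|
|No surgery or unknown||5||13||48||23||85||20||156||39||294||28|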
This patient cohort had a total of 6434 years of follow-up with a median follow-up of 4.7 years. The longest patient follow-up was 33 years. Overall, the median survival was 7.7 years, and young patients fared the best. Figures 2 and 3 demonstrate that children and young adults had similar mortality profiles, whereas adults and the elderly had steeper survival curves, indicating greater lethality. The median survival was 7.7 years for males and 7.8 years for females. Table 2 indicates that the RS estimates were very similar for males and females, with considerable overlap of the 95% confidence intervals (CIs). Although no interaction was observed, we still presented overall RS estimates using age-standardization. The overall group had 72% age-standardized 5-year RS, 48% age-standardized 10-year RS, and 31% age-standardized 20-year RS.
|Period, y||RS Rate, %||95% CI, %||RS Rate, %||95% CI, %||RS Rate, %||95% CI, %||RS Rate, %||95% CI, %||RS Rate, %||95% CI, %||RS Rate, %||95% CI, %||RS Rate, %||95% CI, %|
The model was tested for age categories with proportional and with nonproportional parameterization (without and with an interaction term, respectively), and the likelihood ratio test result for differences was P = .06, which indicates that the proportional hazards model may fit worse than the nonproportional model. Once the parameter estimates and the estimated-versus-observed plots were examined and compared with the full model (nonproportional hazards model), we decided that the full model did not provide any new or better information, because its parameter estimates did not differ meaningfully from those of the proportional hazards model. We tested the inclusion of radiation and surgery into the model and observed that the point estimates of the eHRs for the age categories remained unchanged, and the likelihood ratio tests indicated that the simpler models were not significantly different (P > .25 and P > .71, respectively); therefore, the final model did not include radiation or surgery as confounders.
Table 3 presents the eHRs for each year of follow-up compared with the first 2 years postdiagnosis. The mortality rates were generally quite stable throughout follow-up, and only a slight increase in the mortality rates (viewed as ratios) was observed during years 6 through 8, also known as the period of peak incidence density of mortality (PPID). However, the size of the PPID is rather small at an eHR of 1.45. The model fit was deemed appropriate (see Fig. 3) and age groups fit the proportional hazards assumption. Therefore it is reasonable to use a proportional hazards model in future studies. Elderly patients were 3 times more likely to die throughout follow-up than young adult patients in this population.
|Age Category||eHR||P||95% CI|
|Young adults: First 2 y||1.00||—||—|
Because expected mortality was taken into account, adults were not significantly more likely to die than young adults and, thus, had a cause-specific survival profile similar to that of young adults. A survival model (Cox proportional hazards model) also was fit to these data without adjustment for expected survival; without that adjustment, the model yields HRs that overestimate the effect of age on outcomes (adults: HR, 1.72; 95% CI, 1.29-2.30; elderly: HR, 4.48; 95% CI, 3.38-5.95). Young adults were the baseline category in both models.
Overall, there were 531 observed deaths, but only 116 were expected. Thus, the SMR was 4.6 (95% CI, 4.22-5.0) for the overall group, 80 (95% CI, 43-137) for children, 21 (95% CI, 16.6-27.2) for young adults, 5.9 (95% CI, 5.0-6.8) for adults, and 3.0 (95% CI, 2.6-3.4) for the elderly.
This study, which includes an analysis of greater than 1000 patients with chordoma, indicates that the age-standardized 5-year and 20-year RS rates of this disease are 72% and 31%, respectively. Cause-specific mortality (RS) is higher in the elderly population, because the elderly are 3 times more likely to die than young adults. These worse survival outcomes are observed after adjustment for differences in expected mortality across the age categories. The cause for this is speculative, but it is possible that this difference is caused either by a more aggressive form of the disease or by a diminished physiological reserve in association with increased comorbidity loading in this population group.
The comparison of RS rates with SMRs demonstrates well the properties of the SMR and the distinction between a tumor's lethality and its impact on a population. The RS rates and SMRs appear to present conflicting results. The 10-year RS rate was 69% in young adults compared with 28% in the older population. In contrast, the SMR was 21.2 in the young adult population, whereas elderly patients had an SMR of 2.9. Thus, on first appearance, the elderly population fares worse according to the well known RS metric, yet the young adult population fares worse according to the SMR. This is a good example of differences in the epidemiological concepts that each metric measures: lethality versus impact.
The SMR is a measure used to describe the impact of a tumor on a population. An impact can be considered an object coming forcibly into contact with another; and, to be measured properly, the properties of both the object (tumor) and the “object” it comes into contact with (population) should be described and compared. The SMR compares the mortality caused by a tumor on a population with the mortality expected for that population. In situations in which there are already high rates of mortality (elderly populations), the impact would be “felt” less by that population.
To describe this more visually, Addison's “Vision of Mirza” is particularly useful. Addison described a bridge into the middle of a lake, with the general population walking across this bridge.[42, 43] The beginning of the bridge represents birth, and the end of the bridge, which ends in the middle of the water, represents the end of life. The general population is viewed as a very large group of individuals walking across this bridge, and every individual either will fall through a trap door on the way (representing mortality throughout life) or will fall off the end of the bridge, which represents the certainty of the end of life. The farther along the bridge the population walks, the greater the density of trap doors that are encountered and opened, representing the increased density of mortality or increased expected mortality rates.
If the same bridge is imagined but only for patients diagnosed with chordomas, then the observer would note the same proportion of trap doors plus those associated with the disease. These extra trap doors (deaths because of disease) represent excess hazard rates. Differences in the SMR or in the impact of the disease between young adults and elderly can be observed by the obviousness of these new trap doors on this bridge. In the young population at the beginning of the bridge, the added trap doors can be observed clearly, because there are so few trap doors opening (low expected mortality). In the older populations on the chordoma bridge toward the end, there are already so many trap doors opening (high expected mortality) that the added trap doors are not readily visualized by the observer. This is why the young adult population had better RS yet an SMR of 21.2, whereas the elderly patients had worse RS, worse eHR, yet an SMR of 2.9. This demonstrates how the tumor has a smaller impact on the elderly population but is more lethal in this population.
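The contrast between lethality and impact can be made concrete with a small numerical sketch. Assuming constant hazards over a short period, the SMR is approximately the ratio of the total hazard (expected plus excess) to the expected hazard. The hazard values below are hypothetical and were chosen only to mimic the pattern described above; they are not estimates from the study.

```python
def approximate_smr(expected_hazard: float, excess_hazard: float) -> float:
    """Approximate SMR under constant hazards: (expected + excess) / expected."""
    if expected_hazard <= 0:
        raise ValueError("expected_hazard must be positive")
    if excess_hazard < 0:
        raise ValueError("excess_hazard must be non-negative")
    return (expected_hazard + excess_hazard) / expected_hazard


# Hypothetical annual hazards (illustrative only).
# Young adults: few "trap doors" to begin with, moderate excess hazard.
young_expected, young_excess = 0.001, 0.020
# Elderly: many "trap doors" already, and a larger excess hazard (more lethal).
elderly_expected, elderly_excess = 0.030, 0.060

young_smr = approximate_smr(young_expected, young_excess)
elderly_smr = approximate_smr(elderly_expected, elderly_excess)

print(f"young adults: excess hazard {young_excess:.3f}, SMR about {young_smr:.0f}")
print(f"elderly:      excess hazard {elderly_excess:.3f}, SMR about {elderly_smr:.0f}")
```

In this made-up scenario the excess hazard is 3 times higher in the elderly, yet the SMR is far higher in young adults, because their expected hazard in the denominator is so small.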
This concept also can be observed when comparing 2 different types of brain tumors. Glioblastoma multiforme, a tumor with a poor prognosis, has a 2% 10-year RS rate and an SMR of 61 (95% CI, 60-62), whereas medulloblastomas have a 10-year survival rate of 52% with a slightly lower SMR of 50 (95% CI, 47-56).[37, 41] These drastic differences in RS yet similar SMRs are present because the impact of the disease is different. Medulloblastomas, although less lethal, occur in a population (young children with a median age of 9 years) that has very little expected mortality (few trapdoors on their “bridge”), which is why, although only a few deaths are observed, the low expected mortality increases the SMR considerably. Conversely, in patients with glioblastoma multiforme, the median age at diagnosis is 64 years (expected mortality rates much higher); and, in the face of much more lethality (10-year RS rates), the SMR remains similar to the SMR of patients with medulloblastoma. Thus, the SMR provides an excellent measure of the impact a cancer may have on a particular population.
The usefulness of this metric in the clinical setting is similar to that of the Kaplan-Meier survival function. It can be monitored over time to measure and quantify the effect of interventions, such as radiotherapy or improvements in surgical techniques, but it measures these outcomes on a different dimension, namely, that of impact rather than lethality. In addition, this metric is particularly useful for the appropriation of resources. Although we know that tumors may be more lethal in a particular population, it is more beneficial to appropriate the limited health resources to tumors and/or diseases that have a greater impact on the population. Combined with incidence rates, the SMR can provide a very useful metric of the impact that a disease has on the general population, and it can help in the rational appropriation of limited resources.
Because we used RS methodologies, which adjusted our figures for differences in the expected mortality of the general population, the cause-specific survival results presented here are precise. The elderly group had significantly different survival rates compared with young adults when we used OS methods, but the effect was attenuated (Cox HR: 4.48; 95% CI, 3.38-5.95; Dickman eHR: 2.9; 95% CI, 2.10-4.14) in the elderly when the data were adjusted for expected survival. In addition, using OS measures, we observed that adults would be more likely to die than young adults (HR, 1.72; 95% CI, 1.29-2.30). When modeled using RS methods, this finding was diminished to an eHR of 1.38 (95% CI, 0.98-1.94). Of course, the methods used were different, because the Dickman et al piecewise constant hazards model is parametric, capable of handling nonproportional hazards, and incorporates expected survival in its estimation. The Dickman relative survival piecewise constant hazards model is the regression model of choice when it comes to the estimation of hazard rates across differing age categories.
The impact of sex on survivorship is somewhat controversial. The Boston group reported better local tumor control and survival for male patients with noncervical chordomas who underwent surgery and received proton beam therapy. That group delivers radiation dose as a function of sex, and their female patients receive the highest doses of radiotherapy. However, smaller proton beam therapy series have not reported this significant difference in outcome related to sex. Yasuda et al reported 40 chordoma patients who underwent surgery with or without postoperative proton radiation. The estimated OS rate was 89.5% for males versus 66.7% for females (P = .5). Likewise, the Heidelberg group did not observe a significant OS difference in 44 patients with chordoma who received carbon ion radiation therapy. Those results are in keeping with other published series.[16, 46, 47] Thus, it appears that the risk of local recurrence and subsequent death is more neutral with regard to sex than initially assumed. In the previous SEER analysis, the impact on outcomes from various clinical and therapeutic characteristics was assessed. Surgery, age, size, and location were potentially important prognostic factors, but the impact of sex was not assessed in that study. It is noteworthy that we did not observe a sex-survivorship difference in the current large, updated cohort, which was adjusted for differences in mortality between males and females in the general population (Table 2). Although it could be argued that the small sample size of most series previously discussed has limited the statistical power to detect an association between sex and outcome, this argument cannot be made based on our current results. Moreover, it is highly unlikely that sex has been miscoded in the database, because this variable is the easiest to code in clinical research.
This study has the same limitations as all SEER rare tumor studies. There was no central pathology review, and there were no data on performance status at diagnosis or socioeconomic status. In addition, chemotherapy is not a variable captured by the SEER database, although it has been demonstrated that chordomas are insensitive to most chemotherapeutic agents.
Nonetheless, we also should not forget that the SEER database is a sample of the US population and can have issues with generalizability. When comparing the SEER data with the overall US population, it is apparent that those residing within the SEER areas are more affluent, have lower unemployment, and are substantially more urban than the rest of the US population. These problems of generalizability can affect the validity of the confidence intervals presented here.
In conclusion, the piecewise constant hazards model described by Dickman et al is a superior model to the Cox proportional hazards model in situations in which the association of age with survival is being modeled. The elderly have worse survival outcomes after adjustment for differences in expected mortality across the age groups; and, other than incidence, sex does not influence this disease. The study of chordomas presents a good case of the contribution that the SMR can have on measuring the impact of a disease on a population of patients. Although the younger population has better survival rates, the impact (SMR) in the younger age groups is much greater than in older populations.
No specific funding was disclosed.
CONFLICT OF INTEREST DISCLOSURES
The authors made no disclosures.