Prognostic factor analysis of the survival of elderly patients with AML in the MRC AML11 and LRF AML14 trials

Authors


Professor Keith Wheatley, Birmingham Clinical Trials Unit, School of Cancer Sciences, Robert Aitken Institute, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK.
E-mail: k.wheatley@bham.ac.uk

Summary

This analysis, of 2483 patients with acute myeloid leukaemia (AML) aged 60+ years entered into two UK trials, was performed to determine the baseline parameters related to survival and to develop a risk index. The Medical Research Council (MRC) AML11 trial (n = 1071) was used to develop the index; this was validated using data from the Leukaemia Research Fund (LRF) AML14 trial on 1137 intensively (AML14I) and 275 non-intensively (AML14NI) treated patients. In AML11, cytogenetic group, age, white blood count, performance status and type of AML (de novo, secondary) were all highly significantly related to prognosis in multivariate analysis. The regression coefficients were used to define good, standard and poor risk groups, with 1-year survival of 53%, 43% and 16% respectively (P < 0·0001). The risk index showed very good discrimination in both AML14I and AML14NI (both P < 0·0001), thereby providing validation, although survival in all groups was very poor in AML14NI. The risk factors for survival in older AML patients were similar to those in younger ones and discrimination of patient groups with relatively good to very poor prognosis was possible. These risk groups apply to both intensively and non-intensively treated patients. Randomized trials of intensive versus non-intensive therapy are needed to determine which types of patient should be given which type of treatment.

Acute myeloid leukaemia (AML) in older patients has a much poorer prognosis than in younger patients, which is partly related to increased frequencies of secondary AML, adverse cytogenetic features (Grimwade et al, 2001) and overexpression of multidrug resistance (MDR) phenotypes (Leith et al, 1997). Age is an important independent prognostic factor in AML, with 5-year survival rates ranging from over 50% in children (Gibson et al, 2005) down to <10% in patients aged over 70 years in whom intensive therapy with curative intent is attempted (Goldstone et al, 2001). Older patients are also much less likely to be treated intensively than younger ones, and it is estimated that <10% of patients aged 60–69 years, and <5% of those aged over 70 years, were entered into the United Kingdom Medical Research Council (MRC) and Leukaemia Research Fund (LRF) trials, compared to well over 50% of younger adults. The primary reasons for this are probably that AML in older patients is seen as a disease that is less likely to respond to attempts at cure and that older patients are also more susceptible to the toxic effects of intensive treatment. AML is a heterogeneous disease and a number of important prognostic factors have been described (Leith et al, 1997; Wheatley et al, 1999; Grimwade et al, 2001; Kottaridis et al, 2001). This report presents the results of a prognostic factor analysis of the patients aged 60 years or over who were entered into the MRC AML11 trial, with validation of the model using data from the subsequent LRF AML14 trial in which there were both intensive and non-intensive treatment options.

Methods

Patients

Between November 1990 and June 1998, 1311 patients were entered at diagnosis into the MRC AML11 trial. Between December 1998 and March 2006, 1593 patients were entered at diagnosis into the LRF AML14 trial. Both trials were primarily designed for older patients with AML. Initially, when AML11 was running in conjunction with the AML10 trial for younger patients, the trial was mainly for patients aged 56 years or over. When AML10 was succeeded by the AML12 trial at the end of 1994, the usual age of patients in AML11, and subsequently in AML14, became 60 years or over. Younger patients could be entered into both AML11 and AML14 if they were not considered suitable for the more intensive therapy employed in the AML10 and AML12 trials. Since a cut off at age 60 years is frequently, although arbitrarily, used to distinguish between older and younger patients with AML, this report considers only patients aged 60 years or over, with 207 younger patients being excluded (169 AML11, 38 AML14). Patients with any form of de novo or secondary AML were eligible for AML11 and AML14, except for patients with acute promyelocytic leukaemia (APL) in the latter. Since validation of a model derived from AML11 that included APL patients would not be possible in AML14, and because APL is now widely regarded as a distinct disease entity (Mistry et al, 2003), APL patients (n = 48) were excluded from the analysis. Secondary AML could be either following prior cytotoxic chemotherapy or radiotherapy for other cancers (s-AML) or subsequent to a preceding haematological disorder (mds-AML). Eligibility for AML11 required 30% or more blasts in the bone marrow – i.e. AML, not myelodysplastic syndrome (MDS) – but high-risk MDS patients were eligible for AML14. Hence, to obtain more comparable populations, MDS patients (n = 108) were excluded, along with a small number rediagnosed as not having AML (n = 15). 
For the final models, a further 43 patients (20 AML11, 23 AML14) were excluded due to missing data on white blood cell counts (WBC), which was one of the key prognostic variables. There were an additional 534 (229 AML11, 305 AML14) patients with missing cytogenetic data. As this was such a large number it was felt better to include these patients with the use of a dummy variable rather than exclude them altogether. Thus, the outcome of 1071 patients is analysed in AML11, and 1412 in AML14. Both trials were approved by the appropriate Ethics Committees and all patients gave informed consent.
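The patient flow described above can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in this section; the dictionary keys are illustrative labels, not terms from the trial protocols.

```python
# Patients entered at diagnosis, as quoted in the Patients section.
entered = {"AML11": 1311, "AML14": 1593}

# Exclusions reported for the two trials combined.
exclusions = {
    "aged under 60 years": 207,
    "acute promyelocytic leukaemia": 48,
    "myelodysplastic syndrome": 108,
    "rediagnosed as not AML": 15,
    "missing white blood cell count": 43,
}

total_entered = sum(entered.values())
total_excluded = sum(exclusions.values())
analysed = total_entered - total_excluded

print(f"entered={total_entered}, excluded={total_excluded}, analysed={analysed}")
```

The result agrees with the 1071 AML11 and 1412 AML14 patients analysed. Patients with missing cytogenetics (n = 534) are not subtracted, because they were retained through a dummy variable.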

Treatment

The treatment schedule in AML11 followed a standard pattern, with two induction courses containing cytarabine and an anthracycline or anthracenedione, with or without thioguanine or etoposide. Patients in complete remission (CR) received either one or three consolidation courses, with or without interferon maintenance for 1 year. There were three randomized comparisons within the trial.

The AML14 trial consisted of two separate parts: an intensive treatment schedule similar to that in AML11 (AML14I); and a non-intensive schedule (AML14NI). The intensive schedule compared doses of daunorubicin (50 mg/m² vs. 35 mg/m²) and cytarabine (100 mg/m² bd vs. 200 mg/m² bd), with a randomization to the p-glycoprotein inhibitor PSC-833 in conjunction with the 35 mg/m² daunorubicin arm. Two courses of allocated induction therapy were scheduled. Patients in CR received a course of consolidation and could then be randomized to a further course versus not. The non-intensive part of AML14 initially compared hydroxycarbamide with low-dose cytarabine, with or without all-trans retinoic acid (Burnett et al, 2007). This was succeeded by a randomization to low-dose cytarabine versus low-dose cytarabine plus gemtuzumab ozogamicin (Mylotarg). Since survival was significantly better with low-dose cytarabine than with hydroxycarbamide (P = 0·0009), patients receiving low-dose cytarabine, with or without Mylotarg, were also considered as a separate cohort (AML14NIA). The treatment regimens in AML11 and AML14 have been described elsewhere (Goldstone et al, 2001; Burnett et al, 2007, 2009).

Definition of endpoint

As remission induction was not a primary goal of the non-intensive part of AML14, the only endpoint considered here is overall survival, defined as the time from trial entry until death.

Statistical methods

For univariate analyses, Kaplan–Meier life-tables were constructed for survival data and were compared by means of the log-rank test, with surviving patients being censored at 1 January 2008, when follow-up was complete for all but 37 patients (the small number of patients lost to follow-up are censored at the date they were last known to be alive). Multivariate analysis was performed using Cox regression. Because of the danger of obtaining false positive results with multiple testing, the P-value for inclusion in the models was set at P < 0·01. As Cox regression requires complete data for a patient to be able to contribute to the model, patients with missing values for parameters included in the models that had only a small number of missing values were excluded from the entire study in order to give comparable numbers in the univariate and multivariate analyses; this only applied to WBC. However, since there were much larger numbers of patients with missing cytogenetic results and this parameter turned out to be the most important prognostic factor, missing results of cytogenetics were included in the analyses as a dummy variable. Unless otherwise stated, percentage values quoted in the text for survival are at 1 year. All P-values are two-tailed.
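As an illustration of the Kaplan–Meier life-table method mentioned above, the following is a minimal sketch in plain Python. It is not the software used for the trial analyses (which is not stated here); the function name `kaplan_meier` and the toy data are hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) pairs at each distinct death time.

    times  -- follow-up time for each patient (e.g. years from trial entry)
    events -- 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Hypothetical toy data: six patients, two of them censored.
toy_times = [0.2, 0.5, 0.5, 0.9, 1.3, 2.0]
toy_events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"t={t:.1f}  S(t)={s:.3f}")
```

Censored patients contribute to the number at risk up to their censoring time but never trigger a drop in the curve, which is how patients alive at 1 January 2008 are handled in the analyses described above.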

Results

Patient characteristics

The presenting features of the patient population are shown in Table I. Cytogenetic groups were defined as follows: favourable – t(8;21), inv(16), irrespective of the presence of other abnormalities; adverse – monosomy 5, monosomy 7, del(5q), abnormal 3q, complex (5 or more chromosomal abnormalities); intermediate – all other abnormal karyotypes, normal karyotype. Because of the small number of patients with favourable cytogenetics, these patients were included with patients with intermediate cytogenetics in the multivariate models. Other diagnostic parameters that were investigated and not found to be related to outcome on univariate analysis, and are therefore not considered further, included gender, haemoglobin, platelets, bone marrow blast percentage, and FAB (French-American-British classification) type.

Table I.   Characteristics of the patients in AML11, AML14I and AML14NI. Values are number (percentage) of patients.

| Parameter | AML11 | AML14I | AML14NI |
|---|---|---|---|
| Number of patients | 1071 | 1137 | 275 |
| Age group (years) | | | |
| 60–64 | 347 (32) | 328 (29) | 17 (6) |
| 65–69 | 387 (36) | 461 (41) | 34 (12) |
| 70–74 | 261 (24) | 252 (22) | 76 (28) |
| 75+ | 76 (7) | 96 (8) | 148 (54) |
| Type of AML | | | |
| De novo | 835 (78) | 907 (80) | 197 (72) |
| Secondary | 236 (22) | 230 (20) | 78 (28) |
| WBC (×10⁹/l) | | | |
| 0–9·9 | 533 (50) | 597 (53) | 144 (52) |
| 10–49·9 | 280 (26) | 290 (26) | 75 (27) |
| 50–99·9 | 134 (13) | 129 (11) | 37 (13) |
| 100+ | 124 (12) | 121 (11) | 19 (7) |
| Performance status | | | |
| 0 | 502 (47) | 651 (57) | 82 (30) |
| 1 | 394 (37) | 374 (33) | 132 (48) |
| 2 | 86 (8) | 66 (6) | 34 (12) |
| 3 | 75 (7) | 36 (3) | 23 (8) |
| 4 | 14 (1) | 10 (1) | 4 (1) |
| Cytogenetic group | | | |
| Favourable | 30 (3) | 33 (3) | 3 (1) |
| Intermediate | 637 (59) | 672 (59) | 143 (52) |
| Adverse | 175 (16) | 191 (17) | 65 (24) |
| Unknown | 229 (21) | 241 (21) | 64 (23) |

Percentages may not add to 100 due to rounding.

Survival

Survival of the patient population at 1 year and 3 years in AML11 was 37% and 15% respectively. In AML14I and AML14NI it was 47% and 19%, and 19% and 2% respectively. Univariate analysis of survival by individual parameters is shown in Table II for all three trials.

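Table II.   Survival at 1 year (%) by baseline parameters – univariate analysis.

| Parameter | Value | AML11 | AML14ALL | AML14I | AML14NI |
|---|---|---|---|---|---|
| All patients | | 37 | 42 | 47 | 19 |
| Age group (years) | 60–64 | 44 | 45 | 47 | 6 |
| | 65–69 | 36 | 48 | 50 | 21 |
| | 70–74 | 33 | 38 | 42 | 22 |
| | 75+ | 29 | 30 | 49 | 18 |
| | P-value | <0·0001 | <0·0001 | 0·04 | 0·6 |
| Type of disease | De novo AML | 40 | 44 | 50 | 20 |
| | Secondary AML | 28 | 32 | 37 | 17 |
| | P-value | <0·0001 | <0·0001 | <0·0001 | 0·2 |
| WBC (×10⁹/l) | 0–9·9 | 42 | 47 | 52 | 24 |
| | 10–49·9 | 39 | 40 | 45 | 19 |
| | 50–99·9 | 31 | 30 | 36 | 8 |
| | 100+ | 21 | 31 | 36 | 0 |
| | P-value | <0·0001 | <0·0001 | 0·0001 | <0·0001 |
| Performance status | 0 | 45 | 48 | 51 | 23 |
| | 1 | 33 | 39 | 46 | 20 |
| | 2 | 28 | 27 | 33 | 15 |
| | 3 | 27 | 19 | 28 | 4 |
| | 4 | 14 | 29 | 40 | − |
| | P-value | <0·0001 | <0·0001 | 0·001 | <0·0001 |
| Cytogenetic group | Favourable | 47 | 56 | 61 | − |
| | Intermediate | 43 | 47 | 52 | 25 |
| | Adverse | 15 | 18 | 23 | 2 |
| | Missing | 38 | 44 | 50 | 23 |
| | P-value | <0·0001 | <0·0001 | <0·0001 | <0·0001 |

− = too few patients for analysis (<5).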

Multivariate analysis

Multivariate analysis of AML11 showed that all five parameters listed in Table II remained significant at P < 0·0001, with order of importance (chi-square value): cytogenetics (57), WBC (29), age (23), type of AML (21), and PS (17). The regression coefficients from the Cox regression were used to create an index for survival. The resulting equation was:

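[Equation image not reproduced: the index is a weighted sum of the coded cytogenetics, WBC, PS, age and AML type values, with weights taken from the Cox regression coefficients.]

where cytogenetics: 1 = intermediate/favourable, 2 = adverse, 1·174 = unknown*; WBC (×10⁹/l): 1 = <10·0, 2 = 10·0–49·9, 3 = 50–99·9, 4 = 100+; PS: World Health Organization performance status 0–4; Age (years): 1 = 60–64, 2 = 65–69, 3 = 70–74, 4 = 75+; AML type: 1 = de novo, 2 = secondary.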

Values of the index ranged from 1·35 to 3·32. A clinically relevant index needs a small number of categories, so three equally sized risk groups (good, standard, poor) were constructed (Fig 1A), with cut-off points on the index of 1·82 and 2·14. A simpler scoring system may be easier to use in the clinical setting, so we constructed an index based on summation, as indicated in Table III. The values chosen for the summation were based on the approximate ratio of the coefficients in the above equation (i.e. 4:1:1:1:2). Table IV shows a breakdown comparing the ‘simple’ risk group assignment against the original risk group assignment. There was strong agreement between the two, with 90% of cases being assigned to the same category and, as expected, a survival plot for the simplified scoring system also showed wide survival differences between groups (Fig 1B).
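The division of the continuous index into three risk groups can be sketched as follows. The cut-off points 1·82 and 2·14 are those quoted above; how a value falling exactly on a cut-off is assigned is not stated in the text, so the boundary handling here is an assumption, and the function name is hypothetical.

```python
GOOD_CUTOFF = 1.82   # lower cut-off point quoted in the text
POOR_CUTOFF = 2.14   # upper cut-off point quoted in the text


def risk_group_from_index(index: float) -> str:
    """Map a regression-based index value to a risk group.

    Assumption: a value exactly equal to a cut-off is placed in the
    higher-risk group; the source does not specify this.
    """
    if index < GOOD_CUTOFF:
        return "good"
    if index < POOR_CUTOFF:
        return "standard"
    return "poor"


# The extremes of the observed index range (1.35 to 3.32) and a mid value.
for value in (1.35, 2.00, 3.32):
    print(value, risk_group_from_index(value))
```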

Figure 1.   Survival in AML11 by prognostic index. Under number of events, Obs. is the number observed in each group, Exp. is the number expected (from log-rank analysis).

Table III.   Simplified risk score.

| Parameter | Score |
|---|---|
| Cytogenetic group | 1 = favourable/intermediate, 5 = adverse, 2 = unknown |
| WBC group | 1 = <10·0, 2 = 10·0–49·9, 3 = 50–99·9, 4 = 100+ (×10⁹/l) |
| Performance status | Performance status score: 0, 1, 2, 3, 4 |
| Age group | 1 = 60–64, 2 = 65–69, 3 = 70–74, 4 = 75+ (years) |
| AML type | 1 = de novo, 3 = secondary |
| Total | Score (Cytogenetic group) + Score (WBC group) + Score (Performance status) + Score (Age group) + Score (AML type) |
| Risk Group | 4–6 = Good, 7–8 = Standard, 9+ = Poor |
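The summation in Table III can be written as a short function. This is a sketch of the scoring rules exactly as tabulated; the function and argument names are illustrative, and input validation beyond the stated ranges is an assumption.

```python
CYTOGENETIC_SCORE = {"favourable": 1, "intermediate": 1, "adverse": 5, "unknown": 2}


def wbc_group(wbc: float) -> int:
    """Score the white blood count (x10^9/l) as in Table III."""
    if wbc < 10.0:
        return 1
    if wbc < 50.0:
        return 2
    if wbc < 100.0:
        return 3
    return 4


def age_group(age: int) -> int:
    """Score age in years as in Table III (the index covers ages 60+ only)."""
    if age < 60:
        raise ValueError("the index was developed for patients aged 60 years or over")
    if age <= 64:
        return 1
    if age <= 69:
        return 2
    if age <= 74:
        return 3
    return 4


def simplified_risk(cytogenetics: str, wbc: float, performance_status: int,
                    age: int, secondary_aml: bool):
    """Return (total score, risk group) for one patient."""
    if performance_status not in range(5):
        raise ValueError("performance status must be 0-4")
    total = (CYTOGENETIC_SCORE[cytogenetics]
             + wbc_group(wbc)
             + performance_status
             + age_group(age)
             + (3 if secondary_aml else 1))
    if total <= 6:
        group = "good"
    elif total <= 8:
        group = "standard"
    else:
        group = "poor"
    return total, group


# Hypothetical patient: intermediate cytogenetics, WBC 20, PS 1, age 66, de novo AML.
print(simplified_risk("intermediate", 20.0, 1, 66, False))
```

The lowest attainable total is 4 (all factors at their most favourable level), which is why the good risk band starts at 4.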
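
Table IV.   AML11 – multivariate score compared to simplified score. Rows give the simplified score group; columns give the regression equation score group.

| Simplified score | Good | Standard | Poor |
|---|---|---|---|
| Good | 301 | 0 | 0 |
| Standard | 58 | 308 | 0 |
| Poor | 0 | 46 | 358 |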
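The 90% agreement quoted in the text follows from the diagonal of Table IV. The sketch below reads the table with rows as simplified score groups and columns as regression equation groups; the variable names are illustrative.

```python
# Table IV counts: outer key = simplified score group,
# inner key = regression equation score group.
table_iv = {
    "good": {"good": 301, "standard": 0, "poor": 0},
    "standard": {"good": 58, "standard": 308, "poor": 0},
    "poor": {"good": 0, "standard": 46, "poor": 358},
}

total = sum(sum(row.values()) for row in table_iv.values())
concordant = sum(table_iv[group][group] for group in table_iv)
agreement = concordant / total

print(f"{concordant}/{total} = {agreement:.1%} assigned to the same category")
```

All discordant cases differ by one category only: no patient is classified as good by one system and poor by the other.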

Validation of the model

The risk group definitions obtained above from the multivariate analysis for the good, standard and poor groups were applied to both the intensive and non-intensive parts of AML14 (Table V). For AML14I, the distribution of the patients across the three risk groups was approximately equal, but with a higher proportion of patients falling in the good risk group. For AML14NI, the majority of patients were in the poor risk group, with only 7% classified as good risk. The AML11 risk groups gave very good prognostic discrimination for AML14I, with survival curves similar to those for AML11 (Fig 2A). For AML14NI, survival for all three groups was poor so differences were less clear cut than for AML14I, with no noticeable difference between good and standard risk patients (Fig 2B). However, survival was still markedly worse in poor risk patients, leading to a highly statistically significant difference in outcome (P < 0·0001). A similar situation, but with slightly better outcome in each risk group, was observed if the analysis was restricted to patients who received low-dose cytarabine (Fig 2C). Comparisons of survival at one year in each of the three studies, plus the low-dose cytarabine-treated patients in AML14NI (AML14NIA), are shown in Table VI. Patients in each risk group in AML14NI had substantially worse survival than those in the same risk groups in AML14I, although those in AML14NIA did somewhat better.

Table V.   Distribution of AML11 risk groups in AML14. Values are number (%) of patients.

| Risk group | AML14I | AML14NI | AML14ALL |
|---|---|---|---|
| Good | 437 (38) | 20 (7) | 457 (32) |
| Standard | 346 (30) | 93 (34) | 439 (31) |
| Poor | 354 (31) | 162 (59) | 516 (37) |

Figure 2.   Survival in AML14 by AML11 prognostic index.

Table VI.   Survival at 1 year (%) by AML11 risk group.

| Risk group | AML11 | AML14I | AML14NI | AML14NIA | AML14ALL |
|---|---|---|---|---|---|
| Good | 53 | 60 | 25 | 36 | 59 |
| Standard | 43 | 48 | 33 | 42 | 45 |
| Poor | 16 | 30 | 10 | 14 | 24 |

Discussion

The large number of patients (n = 2483) available makes this the biggest and, therefore, one of the most reliable analyses of prognostic factors for survival in older AML patients ever performed. It confirms the importance, on both univariate and multivariate analysis, of age, WBC, performance status, type of disease and, particularly, cytogenetics. These factors are similar to those that are important in younger patients (Grimwade et al, 1998; Wheatley et al, 1999). This indicates that, while AML is a heterogeneous disease, there are no clear cut distinctions between younger (age <60 years) and older (age 60+ years) patients, though clearly the pattern of the disease changes with increasing age, e.g. secondary AML and adverse cytogenetic abnormalities become more frequent. The only other parameter that has been reliably shown, based on large series, to be a major prognostic factor in elderly AML is multidrug resistance protein expression (Leith et al, 1999). The AML11 trial commenced in 1990 before the importance of MDR was recognised, so no data on this were collected. Some data are available from AML14 but on a limited subset of patients. It will be important for new proposed prognostic factors to be evaluated as they emerge in the context of known factors and in large series.

Indices based on regression equations produce a continuum of risk, from the poorest to the best outlook. For practical purposes, this continuum needs to be divided into a number of discrete groups, the cut-off points for which are inevitably to some extent arbitrary. Although it only takes a few seconds to enter a patient’s parameters into a computer and obtain a risk classification based on the regression equation, for ease of use in the clinical setting, some clinicians may prefer to use a simple scoring system and we have shown that such a system provides good discrimination consistent with that from the multivariate index.

A problem with multivariate models is that they can suffer from ‘regression-dilution bias’ (Hughes, 1993). It is, therefore, important that such models be validated on independent data sets. In this case, we have used both an intensively treated cohort of patients from AML14I, and a non-intensively treated cohort from AML14NI (as well as the overall AML14 population). In both cases, the AML11 risk groups provided strong prognostic discrimination. In AML14I, it was as good as in AML11, so it is well validated in the intensive treatment setting. It also provided good discrimination in AML14NI between standard and poor risk groups, although the small number of good risk patients did not do better than standard risk patients. The latter finding may be due to the play of chance with small numbers, although insufficient chemotherapy in a group of patients whose good prognosis is related to the chemosensitivity of their disease could be a factor.

It may be that a different model will provide optimum discrimination in the somewhat different clinical setting of therapy without curative intent. We have performed a preliminary investigation of this using the AML14NI data set, but the model obtained will require validation (using data from the current AML16 trial) before it can be reported. In a smaller cohort of 416 intensively treated patients, Malfuson et al (2008) recently proposed a model based largely on cytogenetics, with age, performance status and WBC also being taken into account, thereby providing further confirmation of the relevance of these parameters to the prognosis of AML in the elderly.

We do not, however, recommend that decisions on how to treat a patient should be made on the basis of entering the individual’s value for each prognostic factor into a regression equation. A number of other factors need to be taken into account when determining how to treat a patient, including the patient’s wishes and the general clinical impression (which may not relate to quantifiable parameters). It is not easy to identify general classes of patients who are likely to do well with intensive therapy. Patients with a combination of younger age, de novo disease, good performance status, low WBC and favourable cytogenetics would be expected to have the best outcome. Even then, it may not be clear whether this is related to the intensive therapy received or to the more favourable prognosis of the disease.

This is perhaps the most pressing current issue in therapy for AML in older patients. The mean age at diagnosis of AML is about 65 years (Cartwright et al, 1990), yet the majority of patients entered into clinical trials are younger than this. Many older patients are not offered, or choose not to accept, intensive induction therapy. There is very little reliable evidence available as to whether this is the correct decision. In this report, we have shown that patient groups defined by their outcome in AML11 experienced large differences in outcome between the intensive and non-intensive parts of AML14, with survival at 1 year on intensive treatment being much better in each risk group: 35%, 15% and 20% better for the good, standard and poor risk groups respectively. Just considering the AML14NIA patients who received more effective therapy with low-dose cytarabine, the differences were still mainly large, at 24%, 6% and 16% respectively (the apparently more similar outcome at 1 year in the standard risk AML14NIA group compared with AML14I standard risk patients, 42% vs. 48%, was not maintained out to 3 years, 5% vs. 17% (Fig 2A and C), by which time almost all AML14NIA patients had died). The important question is whether this is because intensive therapy actually is much better for all, or the great majority of, patients, or whether the unknown and/or unquantifiable selection factors, such as clinical impression of ‘fitness’, are large enough to explain the differences, or perhaps a combination of these considerations applies. We cannot quantify any such selection factors but, were they to be insufficient to explain the observed differences in outcome, one possible interpretation of the data would be that intensive therapy may be better for the majority of patients.
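The differences quoted above are simple subtractions of the 1-year survival percentages in Table VI. The sketch below reproduces them; these are unadjusted, non-randomized comparisons and are subject to the selection effects discussed in the text.

```python
# 1-year survival (%) by AML11 risk group, taken from Table VI.
one_year_survival = {
    "good": {"AML14I": 60, "AML14NI": 25, "AML14NIA": 36},
    "standard": {"AML14I": 48, "AML14NI": 33, "AML14NIA": 42},
    "poor": {"AML14I": 30, "AML14NI": 10, "AML14NIA": 14},
}

# Absolute difference (percentage points) between intensive and non-intensive cohorts.
diff_vs_ni = {g: v["AML14I"] - v["AML14NI"] for g, v in one_year_survival.items()}
diff_vs_nia = {g: v["AML14I"] - v["AML14NIA"] for g, v in one_year_survival.items()}

print("AML14I minus AML14NI: ", diff_vs_ni)
print("AML14I minus AML14NIA:", diff_vs_nia)
```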

In relation to the practical application of this index, it should be noted that cytogenetic results are not available at diagnosis and may take several days to obtain. This would be a potential problem if the index were to be used to determine therapy and if therapy needed to be initiated rapidly following diagnosis. With regard to the latter consideration, Malfuson et al (2008) recently recommended waiting until the cytogenetic result is available before starting therapy, but many clinicians considering the use of intensive induction therapy may not be happy to delay treatment. Alternatively, could an index be developed that did not include cytogenetics which would enable treatment decisions to be made at diagnosis? We investigated this by analysis of the data excluding cytogenetics but, since cytogenetics is by far the most powerful predictor of outcome, the resulting model provided much poorer discrimination.

However, as discussed above, the indices developed in this report simply tell us that elderly AML patients with differing prognoses can be identified, both amongst those treated intensively and those treated non-intensively; they do not tell us which patients should be treated intensively or not.

It is not possible to be certain that non-randomized methods of comparison (e.g. multivariate modelling, matched pair analyses, propensity scores) are not subject to selection biases, so the only way to reliably resolve the issue of which patients should be treated intensively, and which not, is through randomized trials. Very few randomized trials have addressed the issue of whether elderly patients should be treated intensively or not. One trial showed a survival benefit for the intensive arm (Lowenberg et al, 1989), while another did not (Tilly et al, 1990). Both were small studies, possibly with selected populations, and the wider relevance of these results is unclear. In order to reliably determine which patients do benefit from intensive therapy, and which do not, one needs evidence from further, large randomized trials, in which appropriately heterogeneous populations are randomized between intensive and non-intensive therapy, with stratification by risk group. Whether such trials can actually be performed is uncertain. The AML14 trial did attempt to address this question with a randomization available between entry to AML14I or AML14NI, but the vast majority of patients either chose or were electively assigned by their clinicians to the intensive arm or the non-intensive arm, and very few patients were randomized between the two treatment pathways (n = 8). In the absence of such trials, many important decisions on whether to treat older AML patients intensively or not will remain ad hoc and not be evidence-based.

Footnotes

* Note: The score of 1·174 for unknown cytogenetics was derived from the coefficient for the dummy variable indicating missing cytogenetics. As expected, this score was similar to the average cytogenetics score amongst patients where the cytogenetics was known (1·208).
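The average cytogenetics score of 1·208 can be reproduced from the AML11 counts in Table I, coding favourable and intermediate cytogenetics as 1 and adverse as 2. This is a sketch of that arithmetic; the variable names are illustrative.

```python
# AML11 patients with known cytogenetics (Table I).
favourable, intermediate, adverse = 30, 637, 175

# Coding used in the regression index: 1 = intermediate/favourable, 2 = adverse.
known = favourable + intermediate + adverse
mean_score = ((favourable + intermediate) * 1 + adverse * 2) / known

print(f"mean cytogenetics score among {known} patients = {mean_score:.3f}")
```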

Acknowledgements

Our thanks go to the clinicians who entered their patients into the AML11 and AML14 trials (a list of these is supplied in: Goldstone et al, 2001; Burnett et al, 2007, 2009), allowing a very large data set to be used to produce reliable results, and also to the patients for agreeing to participate in the trials in order to help improve our knowledge and, hopefully, lead to better treatment being available for future patients with AML. Numerous cytogenetic laboratories contributed to the provision of cytogenetic data. We thank the trial teams at the Birmingham Clinical Trials Unit (Liz Brettell, Juliette Gooch, Noreen Aktar, Rachel Bell, Liz Lawson) and the Clinical Trial Service Unit, Oxford (Rachel Clack, Jill Crowther, Cathy Hope, Sue Knight, Angela Radley) for data management.
