Dr Nick Goulden, Senior Lecturer in Paediatric Haematology, University of Bristol and Bristol Children's Hospital, Oncology Day Beds, Bristol Children's Hospital, St Michael's Hill, Bristol BS2 8JD, UK. E-mail: email@example.com
It is now more than a decade since techniques capable of detecting residual leukaemia at levels below the threshold of light microscopy (minimal residual disease, MRD) were first described in acute lymphoblastic leukaemia (ALL) (reviewed by Foroni et al, 1999; San Miguel et al, 1999). Initially, it was believed that measurement of MRD would allow much more accurate stratification of relapse risk than conventional prognostic factors such as age, presenting white cell count, leukaemia cytogenetics and morphological assessment of early response to therapy. However, early optimism was dashed by technical flaws in retrospective studies and concerns over clonal stability (Roberts et al, 1996). Gradually, these problems have been overcome by technical innovation and the study of large homogeneous cohorts of patients with prolonged follow-up. Two years ago, this culminated in the publication of three independent prospective studies, each showing that clearance of MRD is an independent prognostic factor in childhood ALL (Cave et al, 1998; Coustan-Smith et al, 1998; van Dongen et al, 1998).
Interest in the clinical application of MRD analysis has been rekindled. Several groups, including the German Berlin–Frankfurt–Münster (BFM) group, are now measuring residual disease in order to stratify therapy. Almost certainly, other collaborative groups will follow this lead. This annotation explores some of the practical issues that must be addressed if the full potential of this technology is to be realized.
Concerns over unnecessary toxicity and the poor prognosis of children who relapse have led to the use of risk-directed protocols in ALL. However, currently available methods for stratification of treatment intensity are inaccurate. This is exemplified by consideration of the probable outcome of the current Medical Research Council (MRC) protocol in the UK, which is based on the highly successful CCG 1800 series (Table I). Here, only two out of every 10 patients destined to relapse will receive the most intensive therapy. Moreover, every child receives two blocks of intensification therapy, despite the fact that studies undertaken 20 years ago demonstrated that 50% of children could be cured without such treatment (Eden et al, 1991; Chessells et al, 1992; Rivera, 1994).
Table I. An estimate of the outcome (5-year event-free survival, EFS) for every 100 patients treated according to the current MRC 97 amended 99 protocol.
Number of relapses (%)
The number of relapses (% of total relapse) expected in each group is also shown. Figures are based on reports of CCG 1882, 1881 and 1891 (Nachman et al, 1998; Pui et al, 1998).
All three prospective studies published in 1998 demonstrated that MRD analysis could predict outcome within groups of children with homogeneous clinical risk features receiving identical chemotherapy (Cave et al, 1998; Coustan-Smith et al, 1998; van Dongen et al, 1998). This is exemplified by the results of the MRD study based on the BFM 90 protocol (van Dongen et al, 1998). Here, therapy was stratified according to leukaemic cell mass, immunophenotype, the presence of the Philadelphia translocation and the response to 7 d of prednisolone. Marrow from a cohort of 129 patients was examined for the presence of MRD at 1 month and 3 months from diagnosis. The distribution of clinically assigned risk groups and outcome did not differ significantly between children in whom MRD was assessed and the overall population receiving treatment. In each case, at least one marker of MRD with a minimum sensitivity of detection of one leukaemic cell in 10 000 normal ones was used. There are four major points to note from this study:
1) The risk of relapse as defined by MRD analysis was more accurate than that achieved by clinical methods (Table II).
Table II. Comparison of distribution and outcome of clinical and MRD-based risk groups in the BFM 90 ALL protocol.
Clinical risk group
MRD-based risk group
Outcome (5-year event-free survival) for every 100 patients treated according to the BFM 90 protocol according to clinical and MRD-based risk (van Dongen et al, 1998). The number of relapses (% of total relapse) expected in each group is also shown.
2) The MRD-based high-risk group was larger than that defined by conventional methods. It encompassed equal numbers of clinically determined high- and intermediate-risk patients. The risk of relapse was such that intensification of therapy, within the confines of a randomized prospective study, would be appropriate for these children.
3) Rapid clearance of MRD defined a larger cohort at low risk of failure than that defined by conventional risk factors. This group represented 40% of all patients, three-quarters of whom were drawn from the clinical intermediate-risk group. The risk of relapse is so small (2% at 5 years) that a reduction in therapy should be considered. Once again, any reduction should only be considered as part of a randomized controlled study.
4) Finally, the MRD-based system is far from foolproof. Half the eventual relapses were drawn from an MRD-based intermediate-risk group with an event-free survival of 75%.
Children undergoing bone marrow transplantation (BMT)
Relapse is the commonest cause of death after allogeneic BMT for ALL. Although the presence of detectable MRD after BMT carries a high risk of relapse, this is usually so rapid as to preclude any useful clinical intervention (Knechtli et al, 1998a). In contrast, an assessment of MRD status immediately prior to allogeneic BMT is of value. In a study of 54 patients, all 12 with high levels (0·1–1%) of MRD immediately prior to transplant subsequently relapsed (Knechtli et al, 1998b), whereas 26 out of 33 patients with no evidence of MRD at a level above one leukaemic cell in 10 000 normal cells remained in remission (P < 0·001 in multivariate analysis). As a result, our centre now offers experimental therapies, including intensified pretransplant chemotherapy, to children with high-level MRD preBMT.
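The proportions behind this comparison can be tabulated as a quick check. The sketch below uses only the counts quoted above (Knechtli et al, 1998b); the group labels and the function name are illustrative, and the nine patients who fall in neither group are omitted because the text does not give their outcome.

```python
from fractions import Fraction

# Counts quoted in the text for MRD status immediately before allogeneic BMT.
# The dictionary keys are illustrative labels, not terms from the original report.
pre_bmt_groups = {
    "high-level MRD (0.1-1%)": {"patients": 12, "relapsed": 12},
    "no MRD above 1 in 10 000": {"patients": 33, "relapsed": 33 - 26},
}


def relapse_rate(group):
    """Return the fraction of patients in a group who relapsed."""
    return Fraction(group["relapsed"], group["patients"])


for label, group in pre_bmt_groups.items():
    rate = relapse_rate(group)
    print(f"{label}: {group['relapsed']}/{group['patients']} relapsed "
          f"({float(rate):.0%})")
```

With these counts, every patient in the high-level group relapsed, compared with 7 of 33 in the MRD-negative group.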
Increasing experience with alternative donor grafts means that it is now possible to offer an allograft to every patient (Aversa et al, 1999; Heslop, 1999). However, BMT is both expensive and dangerous. Each unrelated donor BMT costs in excess of Eur 90 000 and transplant-related mortality is at least 10%. Thus, it is imperative to limit this treatment to only those children who can benefit most. Ideally, this requires randomized trials between BMT and chemotherapy, and comparisons of different transplant regimes. However, small numbers, physician and parental bias, and the pace of technological advance preclude this (Chessells, 1998). We therefore advocate that measurement of MRD as a marker of disease load at transplant should be included in reports of clinical experience of BMT for ALL.
When and how often should MRD be studied?
At least two serial measurements of MRD should be made during the first 6 months of chemotherapy
Early reports claimed that up to a third of children destined to remain in long-term remission had evidence of MRD in the second year of therapy (Roberts et al, 1996). Subsequent analysis of unselected groups of patients has shown that this is not the case. It is now clear that most patients, regardless of clinical outcome, lack detectable MRD greater than one leukaemic cell in 10 000 normal cells within 6 months of diagnosis (Foroni et al, 1999). Moreover, re-expansion of leukaemia prior to relapse is so rapid that, in many cases, the re-emergence of MRD precedes relapse by only a matter of weeks (Goulden et al, 1998; van Dongen et al, 1998). Thus, prediction of clinical outcome must rely on early clearance of MRD. This can be done in several ways. One approach is to define a threshold, at a given time-point during therapy, above which the likelihood of relapse is so great that clinical intervention is indicated. Several authors have defined a level of 1% at the end of a three- or four-drug induction (Brisco et al, 1994; Cave et al, 1998). An alternative is to define a later time at which a positive result is universally correlated with relapse. In two small retrospective studies, each child with evidence of MRD after 5 months of treatment according to MRC trials UKALL X or XI had relapsed (Evans et al, 1998; Goulden et al, 1998). Unfortunately, the false negative rate of single time-point analysis is high; fewer than half of those destined to relapse can be identified in this way.
Single time-point analysis also fails to account for the efficacy of subsequent treatment (Table II). For example, up to a third of children with high levels of MRD at the end of induction in the BFM and European Organization for Research and Treatment of Cancer (EORTC) studies did not relapse (Cave et al, 1998; van Dongen et al, 1998). Consequently, we recommend two serial measurements of MRD during the early months of therapy. This may not be necessary in children with a negative result at the end of induction. In the BFM study, only one out of 55 children with no evidence of MRD, above one cell in 10 000 normal cells, had a subsequent positive result (van Dongen et al, 1998).
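The logic of combining two serial measurements can be sketched as follows. This is a minimal illustration and not the rule used by any trial: the function name, the group labels and the `high_cutoff` value are assumptions made for the example, and only the detection limit of one cell in 10 000 is taken from the text.

```python
DETECTION_LIMIT = 1e-4  # one leukaemic cell in 10 000 normal cells (from the text)


def classify_serial_mrd(first, second, high_cutoff=1e-3, limit=DETECTION_LIMIT):
    """Assign an illustrative risk group from two serial MRD levels.

    Levels are fractions of leukaemic cells (0.01 means 1%). A value of None,
    or any value below `limit`, is treated as MRD negative. `high_cutoff` is a
    hypothetical threshold; each protocol would have to define its own.
    """
    def is_negative(level):
        return level is None or level < limit

    neg_first, neg_second = is_negative(first), is_negative(second)
    if neg_first and neg_second:
        return "low"
    # min() is only reached when both levels are positive, so neither is None.
    if not neg_first and not neg_second and min(first, second) >= high_cutoff:
        return "high"
    return "intermediate"


print(classify_serial_mrd(1e-2, 2e-3))   # persistently high at both time-points
print(classify_serial_mrd(1e-2, None))   # high after induction, then cleared
print(classify_serial_mrd(None, None))   # negative at both time-points
```

The second call shows why a single measurement can mislead: a child with 1% MRD at the end of induction who then clears the disease is not placed in the highest-risk group.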
Is it necessary to define the significance of MRD for each protocol?
The clinical significance of MRD at any time-point is a function of previous and subsequent chemotherapy
This is best illustrated by a recent report from the COALL group (Zur Stadt et al, 1999). Here, polymerase chain reaction (PCR)-based methods, similar to those used by the BFM, were used to measure MRD at the end of 35 d of induction chemotherapy. COALL reported two to three times more patients with high levels of MRD (> 1%) than that documented at the same time-point in the BFM study, possibly attributable to the absence of asparaginase in induction in the COALL regime. However, the 5-year event-free survival of the COALL and BFM clinical protocols was not statistically different. This point is particularly topical in the UK. The recent switch from early to delayed intensification strategies means that the clinical significance of an MRD result in the new UK protocol cannot be inferred from studies based on previous MRC trials, UKALL X, XI and ALL97.
Which is the best method for measurement of MRD?
The utility of any method chosen for application to a clinical protocol depends on the aim of the study
Three types of leukaemia-specific targets have been widely used as markers of MRD in ALL. These are cytogenetic abnormalities, antigen receptor gene rearrangements and leukaemia-specific immunophenotypes. A bewildering number of techniques have been described for the detection of each type of target. Several factors should be taken into account when assessing the clinical utility of any method. As a generalization, cost is directly proportional to the complexity of a technique.
The most widely applicable system available for the measurement of MRD involves PCR of gene rearrangements. Amplification with four sets of primers can detect at least one clone-specific rearrangement in more than 90% of patients (Goulden et al, 1998). Approximately 98% of cases can be studied if 30 primer sets are used. This latter system identifies two clonal rearrangements in 80% of cases, thus minimizing the impact of clonal evolution (Pongers-Willemse et al, 1999; Szczepanski et al, 1999). The number of children amenable to study using flow cytometry (which detects MRD on the basis of a ‘leukaemia-specific’ immunophenotype) is now more than 80% when novel antibodies, generated by microarray technology, are used (Campana & Coustan-Smith, 1999). Less than 40% of ALL in childhood is characterized by a clone-specific translocation amenable to PCR. However, the lack of requirement for sequence analysis combined with excellent sensitivity and stability makes this the method of choice in international studies of Ph+ and infant ALL.
At present, there is no universally applicable marker of MRD in ALL. Overall, the best approach might combine flow cytometry and gene rearrangement PCR, as these techniques provide comparable results (Neale et al, 1999). This would maximize the number of children open to study. However, many might regard such a blanket approach as prohibitively expensive. In practice, it is probable that up to 10% of children will be excluded from study by any single system. If MRD-based stratification is shown to be successful in the long term, the counselling of such children and their families will become an issue.
A marker of residual leukaemia should be homogeneously distributed throughout the leukaemic clone and remain stable during the course of the disease. Neither antigen receptor gene rearrangements nor leukaemia-specific immunophenotypes fulfil these criteria (Roberts et al, 1996; San Miguel et al, 1999). A number of effective strategies have been devised to circumvent this problem. These include preferential amplification of antigen receptor genes least affected by subclone formation and following at least two leukaemia-specific targets in each case (Pongers-Willemse et al, 1999). Exponents of flow cytometry follow several leukaemia-specific immunophenotypes if possible (Coustan-Smith et al, 1998). Whereas these strategies decrease the incidence of a false negative result to less than 10%, the study of multiple markers increases complexity and cost.
The sensitivity of any given technique is directly proportional to its complexity. Less sensitive methods will be of value if the aim is to define patients with high levels of disease who will be at highest risk of relapse. For example, antigen receptor fingerprinting relies on detection of a clonal rearrangement of an identical size to that seen at diagnosis. Sequence analysis is not required (Foroni et al, 1999). Fingerprinting, however, will fail to detect levels of MRD of 1 in 10 000 normal cells or less in many patients (Owen et al, 1997). This will be crucial if reduction of therapy is planned on the basis of a negative result at the end of induction. More complex and expensive sequence-based strategies will probably be required in such a protocol.
Quantification of residual leukaemia is important for two reasons. Firstly, it provides an assessment of the sensitivity of an individual marker of MRD, particularly important when defining a negative result. Secondly, it is possible to correlate the risk of relapse with the kinetics of decay of residual disease (Cave et al, 1998; Coustan-Smith et al, 1998; van Dongen et al, 1998). Flow cytometry is directly quantitative. In contrast, until very recently, quantitative PCR required either limiting dilution or competitor assays (Brisco et al, 1994; Ouspenskaia et al, 1995; Cave et al, 1998). The complexity of these precludes their widespread clinical use. Most investigators have therefore relied on semiquantitative methods in which an estimate of the level of residual disease in a sample is obtained by comparison with logarithmic dilutions. This problem has been overcome by the development of TaqMan and LightCycler technologies for quantitative PCR. Accurate measurement of residual disease will now be possible within a few hours of receiving a sample (Pongers-Willemse et al, 1998). Although real-time methods require substantial investment in hardware and sequence analysis, they are probably more reproducible between laboratories and are already accepted as a gold standard by many in the field.
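Comparison with logarithmic dilutions amounts to reading a sample against a standard curve. The sketch below assumes a real-time PCR read-out in which the threshold cycle (Ct) falls linearly with log10 of the target level; the Ct values are invented for illustration and the function names are hypothetical, so this is not a description of any published assay.

```python
import math


def fit_standard_curve(levels, ct_values):
    """Least-squares fit of Ct = slope * log10(level) + intercept."""
    xs = [math.log10(level) for level in levels]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_level(ct, slope, intercept):
    """Invert the standard curve to estimate the MRD level for a sample Ct."""
    return 10 ** ((ct - intercept) / slope)


# Invented ten-fold dilution series of diagnostic material in normal cells.
dilutions = [1e-1, 1e-2, 1e-3, 1e-4]
cts = [23.0, 26.4, 29.8, 33.2]

slope, intercept = fit_standard_curve(dilutions, cts)
level = estimate_level(28.1, slope, intercept)
print(f"slope = {slope:.2f} cycles per log10, estimated MRD = {level:.1e}")
```

A sample whose Ct lies beyond the most dilute standard would be reported as below the sensitivity of that marker rather than extrapolated.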
The expertise and high set-up and running costs required to perform these studies mean that they must be centralized. However, this will require a mechanism for transport of adequate samples to the testing centres. A particular concern is the instability of samples for RNA-based PCR (Cross et al, 1993). In theory, samples for flow cytometry may be fixed locally and transported at a later date. Selection of a method that provides results and identifies inadequate samples within a clinically acceptable time frame is mandatory.
The cost per child of the three most clinically useful methods is presented in Table III.
Table III. Cost (in Euros) for each of the three widely applicable techniques for the routine study of MRD in ALL of childhood.
Real-time gene rearrangement PCR
Assumes each follow-up sample is analysed fresh.
Assumes simultaneous analysis of three stored follow-up samples.
Costs include consumables and salaries of technical staff. Data concerning the cost of immunophenotyping and real-time PCR were provided by Professor J. J. van Dongen (personal communication).
Attempts to define the predictive value of the analysis of blood should be included in clinical studies
Information concerning the significance of MRD has come from testing bone marrow. Regular bone marrow aspiration causes many practical difficulties in children. As a result, samples are often dilute and inadequate. In addition, the focal nature of residual leukaemia, first demonstrated in animal models, has now been confirmed in humans (Martens et al, 1987; Sykes et al, 1998). Thus, a single aspirate may provide an unrepresentative result when MRD is present at levels less than one leukaemic cell in 1000 normal cells. In B-lineage ALL, the level of MRD present in a blood sample is at least one log less than that seen in a simultaneous marrow sample (Sykes et al, 1998). No published study has compared the predictive value of MRD measurement in blood and marrow. This important omission must be addressed in the near future.
How can quality be assured?
Internal and external quality assurance are necessary for the translation of MRD technology from a research tool to a routine clinical test
The failure of early studies of MRD to provide consistent results was, in part, owing to the immaturity of the technology (Roberts et al, 1996). In the case of PCR-based methods, rigorous control of contamination and an assessment of the amount of amplifiable DNA or RNA in a sample are now recognized as essential. Equally important is the need to standardize techniques of bone marrow aspiration. Sequential samples taken from the same marrow puncture are increasingly haemodilute. This may lead to progressive dilution of the amount of MRD present and can be avoided by instructing operators to provide MRD samples from the first aliquot from a fresh puncture.
Centres that provide clinical MRD results should participate in external quality assurance schemes. Generic schemes already exist to monitor the performance of laboratories undertaking diagnostic PCR and flow cytometry. The development of standardized PCR primers and protocols through the Europe-wide BIOMED-1 concerted action will facilitate additional quality control via exchange of samples (Pongers-Willemse et al, 1999).
What are the financial implications and who should pay?
Incorporation of MRD into clinical protocols may save money
Development of this technology has been funded by research charities, but the cost of clinical studies will fall on purchasers of healthcare. These organizations must first decide whether they believe that current results of ALL therapy are acceptable, taking into account the cost of unnecessary toxicity and of expensive intensive treatments such as BMT. They must also consider the advice of those who argue that, to date, MRD analysis has no proven place in the treatment of childhood ALL.
We contend that MRD analysis should pay for itself. In the next 5 years, approximately 1500 children will be treated for ALL in the UK. At the most, MRD analysis will cost Eur 4·5 million for these patients. This is the equivalent of 50 allogeneic bone marrow transplants. If MRD analysis helps to prevent relapse and thus precludes the need for BMT in 50 children (an improvement in event-free survival without BMT of 4%), the investment will have been recouped. In addition, assessment of MRD before BMT will lead to a more rational application of this procedure. Cost savings can also be expected from the reduction of therapy. Previous protocols have demonstrated that 50% of children do not require intensification therapy for cure. We calculate that the cost of supportive care for each intensification block in the UK is approximately Eur 2000.
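The arithmetic behind this argument can be laid out explicitly. The sketch below uses only figures quoted in this annotation; the variable and function names are illustrative, and the intensification saving is a theoretical ceiling that assumes every child who does not need intensification could be identified and spared both blocks.

```python
# Figures quoted in the text (all costs in Eur).
children_treated = 1500             # UK patients over the next 5 years
mrd_programme_cost = 4_500_000      # upper estimate for MRD analysis in all patients
cost_per_bmt = 90_000               # unrelated donor BMT (quoted as "in excess of")
intensification_block_cost = 2_000  # supportive care per intensification block
blocks_per_child = 2                # every child currently receives two blocks

cost_per_child = mrd_programme_cost / children_treated
break_even_transplants = mrd_programme_cost / cost_per_bmt
break_even_fraction = break_even_transplants / children_treated


def intensification_saving(children_spared, blocks_omitted=blocks_per_child):
    """Supportive-care cost avoided if blocks are omitted for some children."""
    return children_spared * blocks_omitted * intensification_block_cost


# Hypothetical ceiling: half of all children spared both blocks.
max_saving = intensification_saving(children_treated // 2)

print(f"MRD analysis per child: Eur {cost_per_child:,.0f}")
print(f"Transplants avoided to break even: {break_even_transplants:.0f} "
      f"({break_even_fraction:.1%} of children)")
print(f"Ceiling on intensification savings: Eur {max_saving:,}")
```

On these figures, MRD analysis costs at most Eur 3000 per child, and avoiding 50 transplants corresponds to roughly 3% of the children treated, close to the 4% improvement quoted above.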
Financial constraints are a reality of modern healthcare. It is very tempting to try to cut corners with cheaper, less sensitive methods. Whereas such a strategy may identify children at very high risk of relapse, these may be the most difficult patients to cure. In addition, parents are increasingly concerned not only that their child will survive leukaemia but also that toxicity is minimized. If groups such as the BFM are able to show that sensitive assessment of MRD allows a reduction in the intensity of therapy without an increased risk of relapse, it will no longer be possible to justify intensified therapy for all.
What are the priorities for clinical studies of MRD?
It is now time to examine whether stratification of therapy according to MRD status can bring us closer to the ultimate therapeutic goal of cure of the greatest number of patients with minimum side-effects
We suggest that the following studies are particularly important in this respect:
1) An assessment of the logistics of sample collection and the establishment of internationally validated, widely applicable methods for the quantification of MRD.
2) A comparison of the predictive value of the analysis of blood and marrow.
3) Blinded analysis of the clinical significance of MRD for each treatment protocol. This will involve the recruitment of several hundred patients and require at least 4 years of follow-up. The protocol should then be used as the control arm in a randomized trial of a modification(s) of that therapy based on MRD results. In the UK, this would mean that a protocol in which treatment changed according to MRD assessment would not be feasible prior to 2003.
4) International collaborative studies of novel and intensified therapies for the relatively small numbers of children identified to be at very high risk of relapse. These should include assessment of the impact of pharmacologically guided therapy and immunotherapy including immunotoxin and allogeneic BMT.
5) Studies of reduction of therapy in patients identified to be at very low risk (< 2%) of relapse.
There is now good evidence that alteration of therapy can improve the prognosis for children who show a slow early response to therapy, assessed by light microscopy. MRD analysis simply seeks to extend the correlation between response and outcome that has been a feature of clinical studies for 20 years. Early disenchantment has been replaced by enthusiasm for application in clinical studies. It is vital that these are properly funded and designed with scrupulous attention to logistics and quality assurance.
We thank the Leukaemia Research Fund of Great Britain for their financial support over the last decade. We also wish to acknowledge the dedication and technical skills of all laboratory and clinical staff within the Bristol Children's Hospital Leukaemia Research Group.