Response assessment in Waldenström macroglobulinaemia: update from the VIth International Workshop


Correspondence: Dr Roger G. Owen, HMDS Laboratory, Level 3, Bexley Wing, St James's University Hospital, Beckett Street, Leeds LS9 7TF, UK.

E-mail: rogerowen@nhs.net

Summary

This report represents a further update of the consensus panel criteria for the assessment of clinical response in patients with Waldenström macroglobulinaemia (WM). These criteria have been updated in light of further data demonstrating an improvement in categorical responses with new drug regimens as well as acknowledgement of the fact that such responses are predictive of overall outcome. A number of key changes are proposed but challenges do however remain and these include the variability in kinetics of immunoglobulin M (IgM) reduction with different treatment modalities and the apparent discrepancy between IgM and bone marrow/tissue response noted with some regimens. Planned sequential bone marrow assessments are encouraged in clinical trials.

Waldenström macroglobulinaemia (WM) is a distinct B-cell lymphoproliferative disorder characterised by the presence of immunoglobulin M (IgM) monoclonal gammopathy and bone marrow infiltration by lymphoplasmacytic lymphoma (Owen et al, 2003; Swerdlow et al, 2008). Criteria for the formal assessment of response were proposed following the Second International Workshop on WM (Weber et al, 2003) and these were further revised following the Third International Workshop (Kimby et al, 2006). The emergence of additional clinical data has resulted in a need to re-evaluate the current criteria. These data include the increasing frequency of high quality responses with newer combinations as well as the confirmation that categorical response is predictive of outcome (Gertz et al, 2009; Treon et al, 2009, 2011a; Laszlo et al, 2010; Tedeschi et al, 2012). Similarly, it is becoming clear that discrepancies can exist between IgM responses and bone marrow/tissue responses, which has led to a re-evaluation of the value of repeat bone marrow assessment (Chen et al, 2007; Treon et al, 2007, 2011b; Varghese et al, 2009; Barakat et al, 2011). These current proposals represent a consensus report produced following the Sixth International Workshop on WM. It is acknowledged that the principal role of these criteria is to promote uniform reporting of clinical trial data. The principles outlined may also aid physicians in the routine clinical management of individual patients although it is recognized that clinical benefit can be obtained in the absence of high quality categorical responses.

Categorical response definitions

The categorical response definitions proposed are provided in detail in Table 1. A number of key changes are proposed. Categorical responses may now be determined either by IgM M protein quantitation by densitometry or by total serum IgM quantitation by nephelometry, given that they appear to provide similar levels of correlation with bone marrow response (Tripsas et al, 2012). It should however be noted that IgM values as assessed by nephelometry are systematically higher than M protein values determined by densitometry (Riches et al, 1991; Murray et al, 2009). It is crucial that sequential response assessments for individual patients are performed in the same laboratory using the same methodology. Further considerations include the knowledge that the biological variability of M protein quantitation by densitometry and of immunoglobulin quantitation by nephelometry is 8% and 13%, respectively (Katzmann et al, 2011).

Table 1. Categorical response definitions

| Response category | Definition |
| --- | --- |
| Complete response (CR) | Absence of serum monoclonal IgM protein by immunofixation; normal serum IgM level; complete resolution of extramedullary disease, i.e., lymphadenopathy and splenomegaly if present at baseline; morphologically normal bone marrow aspirate and trephine biopsy |
| Very good partial response (VGPR) | Monoclonal IgM protein is detectable; ≥90% reduction in serum IgM level from baseline (a); complete resolution of extramedullary disease, i.e., lymphadenopathy/splenomegaly if present at baseline; no new signs or symptoms of active disease |
| Partial response (PR) | Monoclonal IgM protein is detectable; ≥50% but <90% reduction in serum IgM level from baseline (a); reduction in extramedullary disease, i.e., lymphadenopathy/splenomegaly if present at baseline; no new signs or symptoms of active disease |
| Minor response (MR) | Monoclonal IgM protein is detectable; ≥25% but <50% reduction in serum IgM level from baseline (a); no new signs or symptoms of active disease |
| Stable disease (SD) | Monoclonal IgM protein is detectable; <25% reduction and <25% increase in serum IgM level from baseline (a); no progression in extramedullary disease, i.e., lymphadenopathy/splenomegaly; no new signs or symptoms of active disease |
| Progressive disease (PD) | ≥25% increase in serum IgM level (a) from lowest nadir (requires confirmation) and/or progression in clinical features attributable to the disease |

(a) Sequential changes in IgM levels may be determined either by M protein quantitation by densitometry or total serum IgM quantitation by nephelometry.

A new category of Very Good Partial Response (VGPR) is also proposed, which is defined by the presence of monoclonal IgM on immunofixation and/or ≥90% reduction in serum IgM levels from baseline along with complete resolution in extramedullary disease, i.e., lymph node and splenic disease, if present at baseline. This category is supported by numerous data demonstrating improved depth of responses with newer therapeutic combinations, such as bortezomib-containing regimens and purine analogue/alkylator/monoclonal antibody combinations (Treon et al, 2009; Laszlo et al, 2010; Tedeschi et al, 2012). Similarly, it has become clear that categorical response predicts outcome, at least in terms of progression-free survival (PFS) and rituximab-based therapies. In this context, patients who achieve a VGPR have an outcome similar to those patients achieving complete responses (CR) (Treon et al, 2011b). Similarly, the minor response (MR) category has been further validated by new data demonstrating an improved outcome compared to those patients with stable (SD) or progressive disease (PD) (Gertz et al, 2009).
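For readers who handle trial data programmatically, the definitions in Table 1 can be sketched as a simple decision rule. The following is an illustrative sketch only, not part of the consensus criteria: the `Assessment` record, its field names and the `classify_response` function are hypothetical, and several simplifications are assumed. In particular, the immunofixation result is consulted only for CR, a patient with a ≥90% IgM reduction but incomplete resolution of extramedullary disease is assigned PR, and progression (which Table 1 measures from the nadir and which requires confirmation) is not modelled and is returned only as "possible PD".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assessment:
    baseline_igm: float            # serum IgM at baseline (g/l)
    current_igm: float             # serum IgM now, same laboratory and method
    igm_normal: bool               # within the laboratory's normal range
    immunofixation_negative: bool  # no monoclonal IgM on immunofixation
    marrow_normal: Optional[bool]  # None when the marrow was not examined
    # One of: "absent_at_baseline", "resolved", "reduced", "stable", "progressed"
    extramedullary: str
    new_signs_or_symptoms: bool


def classify_response(a: Assessment) -> str:
    """Map one assessment onto a Table 1 category (simplified sketch)."""
    # Fractional reduction from baseline; negative values mean an increase.
    reduction = (a.baseline_igm - a.current_igm) / a.baseline_igm
    resolved = a.extramedullary in ("absent_at_baseline", "resolved")
    improved = resolved or a.extramedullary == "reduced"

    # Clinical progression overrides any IgM-based category.
    if a.new_signs_or_symptoms or a.extramedullary == "progressed":
        return "possible PD"
    if (a.immunofixation_negative and a.igm_normal and resolved
            and a.marrow_normal is True):
        return "CR"
    if reduction >= 0.90 and resolved:
        return "VGPR"
    if reduction >= 0.50 and improved:
        return "PR"
    if reduction >= 0.25:
        return "MR"
    if reduction > -0.25:
        return "SD"
    # IgM rose by >=25% from baseline: needs nadir-based review and confirmation.
    return "possible PD"
```

For example, a fall in IgM from 40 g/l to 3 g/l with resolved lymphadenopathy and persistent monoclonal IgM would be returned as "VGPR" under these assumptions.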

The panel reiterated the value of sequential assessment of response following the completion of therapy given the delayed IgM responses seen particularly in the context of purine analogue and monoclonal antibody-based therapy (Del Giudice et al, 2005; Varghese et al, 2009; Tedeschi et al, 2012). It is essential therefore that the categorical responses reported in clinical trials be the best-recorded response, irrespective of the time point at which those responses were documented. The reporting of time to best response is also to be encouraged in clinical trials as this is a particularly meaningful criterion for those patients with high IgM concentrations and hyperviscosity syndrome. It was further affirmed that CR be confirmed by morphological assessment of the marrow and that confirmation with a second immunofixation assay should also be performed.
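The reporting of best response and time to best response can be illustrated with a short sketch. It assumes that each assessment has already been assigned a category, that the categories are ranked from PD (lowest) to CR (highest), and that time to best response is measured in days from the initiation of treatment to the first documentation of the best category. The origin of that interval is an assumption made here, as are the names `RANK` and `best_response`.

```python
from datetime import date

# Assumed ordering of categories from worst to best.
RANK = {"PD": 0, "SD": 1, "MR": 2, "PR": 3, "VGPR": 4, "CR": 5}


def best_response(treatment_start, assessments):
    """Return (best category, days from treatment start to its first record).

    `assessments` is a list of (date, category) pairs in any order.
    """
    # Highest rank wins; among equal ranks the earliest date wins.
    best_date, best_category = max(
        assessments,
        key=lambda item: (RANK[item[1]], -item[0].toordinal()),
    )
    return best_category, (best_date - treatment_start).days


history = [
    (date(2012, 4, 1), "MR"),
    (date(2012, 7, 1), "PR"),
    (date(2013, 1, 1), "VGPR"),  # delayed response after therapy completed
    (date(2013, 4, 1), "VGPR"),
]
result = best_response(date(2012, 1, 1), history)
```

In this invented history the best response is a VGPR first documented 366 days after the start of treatment, irrespective of the fact that it was reached well after the earlier MR and PR.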

The serum free light chain assay (sFLC) has been widely applied in multiple myeloma and has proved particularly informative in patients with non-secretory and light-chain only disease. Limited data is available in WM but the assay appears to be informative in the majority of patients and may provide an earlier indication of both response and progression (Itzykson et al, 2008; Leleu et al, 2008, 2011a). A similar assay (heavy chain/light chain assay, HLC), which allows the quantification of IgM kappa and IgM lambda, has recently been developed and is based upon the unique junctional epitopes that exist between heavy and light chains and initial reports have suggested a potential role in WM response assessment (Leleu et al, 2011b; Manier et al, 2011). The panel however considered there were insufficient data at this time to incorporate sFLC and HLC assessments into the revised criteria and further prospective evaluation is encouraged.

It should be noted that these response criteria are applicable only to patients with symptomatic WM and that the assessment of response in patients with IgM-related disorders such as anti-myelin-associated glycoprotein neuropathy, cryoglobulinaemia and cold agglutinin disease will probably require specific criteria to assess clinical benefit.

Relapse and progression criteria

These criteria were also reviewed in light of the increasing incidence of CR and VGPR reported with newer therapeutic combinations (Treon et al, 2009; Laszlo et al, 2010; Tedeschi et al, 2012). It is again stressed that progression, when this is defined solely on the basis of increasing IgM concentrations, is not necessarily an indication to reintroduce treatment. Relapse from CR is defined by the reappearance of monoclonal IgM protein and/or recurrence of bone marrow involvement, lymphadenopathy/splenomegaly or symptoms attributable to active disease.

Progression from PR is defined by a ≥25% increase in IgM level from the lowest recorded value, confirmed by a repeat assessment. The development of new signs and symptoms of disease, including Bing-Neel syndrome and histological transformation, is also considered as evidence of disease progression. An absolute increase of at least 5 g/l is required to define progression when the IgM level is the only applicable criterion.
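The IgM-based progression rule can be sketched as follows for a chronological series of serum IgM values. This is an illustration under stated assumptions rather than a definitive implementation: the function names are hypothetical, the nadir is taken as the lowest value recorded before the measurement under review, and "confirmed by a repeat assessment" is interpreted as the next measurement also meeting both thresholds against the same nadir.

```python
def meets_thresholds(value, nadir, min_abs_increase):
    """True when value is >=25% above the nadir and the absolute rise is large enough."""
    return value >= 1.25 * nadir and (value - nadir) >= min_abs_increase


def first_confirmed_progression(igm_series, min_abs_increase=5.0):
    """Return the index of the first confirmed IgM progression, or None.

    `igm_series` holds serum IgM values (g/l) in chronological order,
    all measured in the same laboratory with the same method.
    """
    nadir = igm_series[0]
    # The last value cannot be confirmed by a later one, so stop before it.
    for i in range(1, len(igm_series) - 1):
        nadir = min(nadir, igm_series[i - 1])  # lowest value before index i
        if (meets_thresholds(igm_series[i], nadir, min_abs_increase)
                and meets_thresholds(igm_series[i + 1], nadir, min_abs_increase)):
            return i
    return None
```

With the invented series 40, 4, 6, 6.5 g/l the rise from the nadir of 4 g/l exceeds 25% but the absolute increase is below 5 g/l, so no progression is flagged; with 40, 20, 10, 12, 16, 17 g/l progression is flagged at the value of 16 g/l and confirmed by the following value.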

Bone marrow assessment

The previous response criteria mandated morphological assessment of the bone marrow for the confirmation of CR but it was not considered necessary in patients achieving less than a CR. Bone marrow appearances in WM are, by definition, heterogeneous and the extent and pattern of infiltration as well as the extent of plasma cell differentiation can vary considerably from patient to patient (Owen et al, 2001; Morice et al, 2009; Pasricha et al, 2011). Emerging data however has demonstrated there can be discrepancies between serum IgM and bone marrow responses. IgM responses are typically slow with purine analogue and monoclonal antibody-based therapy, as it appears these agents selectively deplete the CD20+ B-cell component with sparing of the CD138+ plasma cell component (Varghese et al, 2009; Barakat et al, 2011). In this context it is possible to demonstrate significant B-cell depletion in the marrow but suboptimal IgM responses. Satisfactory IgM responses are subsequently documented in the majority of patients with maximum responses documented at a median of 6 months following the completion of therapy in fludarabine-treated patients, for instance (Varghese et al, 2009). Conversely, bortezomib-containing regimens and other novel agents, such as the mammalian target of rapamycin inhibitor everolimus, may demonstrate excellent IgM responses but discordant bone marrow responses (Chen et al, 2007; Treon et al, 2007, 2011a).

Serial bone marrow assessment is encouraged for all patients enrolled in clinical trials irrespective of their IgM response. Similarly, repeat marrow assessment can provide significant value in the routine management of individual patients. In order to make a detailed assessment of residual infiltrates it is recognized that both bone marrow aspirate and trephine biopsies should be obtained and that these should be routinely supplemented by flow cytometric and immunohistochemistry studies. Attempts should be made to characterise residual infiltrates with respect to their B-cell and plasma cell content and immunohistochemical assessment of trephine biopsy sections currently provides the optimal method (Morice et al, 2009; Varghese et al, 2009; Barakat et al, 2011). CD138 and/or IRF4 (also known as MUM1) may be used to demonstrate residual plasma cells while CD20 may be used to define residual B-cell infiltration although additional markers may be necessary in rituximab-treated patients due to masking of the antigenic site, which may be seen in post-treatment specimens (Treon et al, 2001). PAX5 is a potentially useful marker in this context as expression will be confined to the B-cell component but CD19 and CD79 may be difficult to interpret as expression is seen in both B-cells and plasma cells (Morice et al, 2009; Varghese et al, 2009; Barakat et al, 2011).

The most appropriate time point at which to assess bone marrow response following the completion of treatment is unknown and may vary with the type of treatment given. A standardized approach is desirable, as this will facilitate meaningful comparisons of future clinical trial data. It is therefore suggested that bone marrow responses be formally assessed 4–6 weeks from the completion of induction therapy. It is however recognized and encouraged that assessment at additional time points may be desirable in some clinical protocols, particularly those examining maintenance therapies.

The panel also recognized the potential value and impact of minimal residual disease (MRD) studies in WM. Numerous studies in myeloma and chronic lymphocytic leukaemia (CLL) have demonstrated that MRD is demonstrable in a significant proportion of patients in conventional CR and that this is highly predictive of outcome (Rawstron et al, 2002; Moreton et al, 2005; Paiva et al, 2008). Currently, flow cytometry appears to be the most appropriate technique as it is applicable to the vast majority of patients with myeloma and CLL and has a reproducible sensitivity of 0·01%. Categories incorporating MRD-negative remissions have therefore been introduced in both myeloma and CLL (Hallek et al, 2008; Rajkumar et al, 2011). There is encouraging but limited MRD data in WM (García-Sanz et al, 2011) and, given the heterogeneity in cellular content and the differential responses seen with various therapies, additional data is required before MRD assessment can be included in the consensus criteria.

Efficacy measures for clinical trials

It is recognized that there is very limited randomised clinical trial data in WM and the assessment of the efficacy of different regimens frequently involves the comparison of non-randomised phase II trials. It is therefore essential that all clinical trials report their efficacy results in a standardized manner. In this context the crucial parameters to report are overall survival (OS), PFS and time to progression (TTP). Given the improvement in categorical responses and increasing rates of CR it is also encouraged that disease-free survival (DFS) is reported for patients achieving CR and duration of response (DOR) is reported for all responding patients. Given the variability in kinetics of monoclonal IgM reduction with different treatments, determining the time taken to achieve best response is also considered a worthwhile efficacy measure.

It is recognized that progression in WM, as defined, does not always result in the reintroduction of treatment and the collection of data relating to the time to next treatment (TTNT) is strongly encouraged. Similarly, it is also acknowledged that deaths can occur in WM due to co-morbid conditions and the reporting of cause-specific survival (CSS) is also encouraged. The proposed efficacy measures are defined in detail in Table 2.

Table 2. Efficacy measure definitions

| Endpoint | Definition |
| --- | --- |
| Overall survival | Time from the initiation of treatment to death from any cause |
| Cause-specific survival | Time from the initiation of treatment to death censoring for deaths from unrelated causes |
| Progression-free survival (PFS) | Time from the initiation of treatment to disease progression or death from any cause |
| Time to progression (TTP) | Time from the initiation of treatment to disease progression with deaths due to unrelated causes censored |
| Disease-free survival (DFS) | Time from the first documentation of complete response to disease progression with deaths due to unrelated causes censored |
| Duration of response (DOR) | Time from the first documentation of response to disease progression with deaths due to unrelated causes censored |
| Time to next treatment (TTNT) | Time from the initiation of treatment to next therapy |
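Three of the endpoints in Table 2 can be sketched as (time, event) pairs of the kind used in time-to-event analysis, where the event flag is False for a censored observation. The `Patient` record, its field names and the `death_related` flag are hypothetical, times are expressed in days, and patients without an event are assumed to be censored at last follow-up; none of these details is specified in the table.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Patient:
    treatment_start: date
    last_follow_up: date
    progression: Optional[date] = None
    death: Optional[date] = None
    death_related: bool = False  # assumed flag: death attributed to WM


def _days(p: Patient, end: date) -> int:
    return (end - p.treatment_start).days


def overall_survival(p: Patient) -> Tuple[int, bool]:
    """Death from any cause is the event."""
    if p.death is not None:
        return _days(p, p.death), True
    return _days(p, p.last_follow_up), False


def cause_specific_survival(p: Patient) -> Tuple[int, bool]:
    """Deaths from unrelated causes are censored at the date of death."""
    if p.death is not None:
        return _days(p, p.death), p.death_related
    return _days(p, p.last_follow_up), False


def progression_free_survival(p: Patient) -> Tuple[int, bool]:
    """Whichever comes first of progression or death from any cause is the event."""
    events = [d for d in (p.progression, p.death) if d is not None]
    if events:
        return _days(p, min(events)), True
    return _days(p, p.last_follow_up), False
```

For a patient who progressed one year after starting treatment and died of an unrelated cause a year later, this sketch records a PFS event at progression, an OS event at death and a censored CSS observation at the date of death.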

Haemopoietic recovery

Incomplete or suboptimal haemopoietic recovery remains a difficult issue in WM response assessment because treatment itself often impacts haemopoietic recovery. The presence of persistent cytopenias is clearly relevant in the management of individual patients but the panel reaffirmed their view that haemopoietic recovery should not be included as a criterion in response assessment. Repeat bone marrow assessment is however encouraged in those patients with adequate IgM response but poor haemopoietic recovery in order to fully exclude refractory disease.

Imaging studies

It was reaffirmed that computerised tomography (CT) scanning (of chest, abdomen and pelvis) be performed in all patients prior to commencing therapy and that repeat scanning be used in determining categorical response for those with measurable disease. It is recognized that the precise measurement of nodal disease was problematical in many cases, particularly in those patients with multifocal nodal involvement and complex nodal masses. It was therefore proposed that complete resolution of extramedullary disease be required for attainment of CR and VGPR and reduction in nodal and splenic involvement was needed for the attainment of a PR, assuming that the IgM criteria were met. Positron emission tomography (PET) appears to be informative for the presence of extramedullary disease in approximately 80% of patients with active WM but further prospective studies are needed (Banwait et al, 2011).

Conclusions and future directions

It is recommended that these criteria be adopted in prospective clinical trials, as meaningful comparisons of non-randomised data can only be made in the context of uniform reporting of overall response and efficacy outcomes. Similarly, the prospective evaluation of novel serological, immunophenotypic and imaging methods in the context of clinical trials protocols is also encouraged.

Acknowledgements

RGO and SPT conceived of the project and wrote the paper. All authors were involved in consensus discussions and reviewed the manuscript.
