Evaluation of tumor response after locoregional therapies in hepatocellular carcinoma†
Are response evaluation criteria in solid tumors reliable?
We thank Mrs. Diana Segarra for her editorial assistance.
Evaluation of response to treatment is a key aspect in cancer therapy. Response Evaluation Criteria in Solid Tumors (RECIST) are used in most oncology trials, but those criteria evaluate only unidimensional tumor measurements and disregard the extent of necrosis, which is the target of all effective locoregional therapies. Therefore, the European Association for the Study of the Liver (EASL) guidelines recommended that assessment of tumor response should incorporate the reduction in viable tumor burden. The current report provides an assessment of the agreement/concordance between both RECIST and the EASL guidelines for the evaluation of response to therapy.
The authors evaluated a cohort of 55 patients within prospective studies, including 24 patients with hepatocellular carcinoma who underwent transarterial chemoembolization (TACE) with drug eluting beads (DEB-TACE) and 31 patients who underwent percutaneous ablation (percutaneous ethanol injection [PEI]/radiofrequency [RF]). Triphasic helical computed tomography scans were performed at baseline, at 1 month, and at 3 months after procedure, and 2 independent radiologists evaluated tumor response.
Evaluating response according to RECIST criteria, no patients achieved a complete response (CR), 21.8% of patients achieved a partial response (PR) (none in the PEI/RF group), 47.3% of patients had stable disease (SD), and 30.9% of patients had progressive disease (PD). When response was evaluated according to the EASL guidelines, 54.5% of patients achieved a CR, 27.3% of patients achieved a PR, 3.6% of patients had SD, and 14.5% had PD. The κ coefficient was 0.193 (95% confidence interval, 0.0893-0.2967; P < .0001).
RECIST missed all CRs and underestimated the extent of partial tumor response because of tissue necrosis, wrongly assessing the therapeutic efficacy of locoregional therapies. This evaluation should incorporate the reduction in viable tumor burden as recognized by nonenhanced areas on dynamic imaging studies. Cancer, 2009. © 2008 American Cancer Society.
Evaluation of response to treatment is a key aspect in cancer therapy, because objective response may become a surrogate marker of improved survival. Several criteria have been developed to allow a uniform response assessment,1, 2 including those of the World Health Organization (WHO), which published criteria for response evaluation that became the most commonly used by investigators around the world.3, 4 Nonetheless, in subsequent years, several problems in the interpretation of these criteria appeared, leading to numerous modifications or clarifications and resulting in a situation in which response criteria no longer were comparable among research organizations. In this scenario, the Response Evaluation Criteria In Solid Tumors (RECIST) guidelines were published by the National Cancer Institute in 2000 with the objective of unifying criteria of response assessment.5 These rely on the measurement of the greatest dimension of all target lesions, and response is categorized as a complete response (CR) (the disappearance of all target lesions), a partial response (PR) (a decrease ≥30% in the sum of the greatest dimension of target lesions), progressive disease (PD) (an increase ≥20% in the sum of the greatest dimension of target lesions and/or the appearance of new lesions and/or unequivocal progression of existing nontarget lesions), and stable disease (SD) (neither sufficient shrinkage to qualify as a PR nor sufficient increase to qualify as PD).
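The RECIST categories described above can be illustrated with a short Python sketch. It is only an illustration of the thresholds quoted in the text: the function name `recist_response` and its parameters are hypothetical, and the percentage change is measured against the baseline sum, which is a simplifying assumption for a single follow-up scan.

```python
from typing import Sequence


def recist_response(baseline_mm: Sequence[float],
                    followup_mm: Sequence[float],
                    new_lesions: bool = False,
                    nontarget_progression: bool = False) -> str:
    """Classify response from the greatest dimension (mm) of each target lesion.

    Assumption: the baseline sum is the reference for both shrinkage and growth.
    """
    base_sum = sum(baseline_mm)
    follow_sum = sum(followup_mm)
    if base_sum <= 0:
        raise ValueError("baseline sum of greatest dimensions must be positive")

    # New lesions or unequivocal nontarget progression mean PD regardless of size.
    if new_lesions or nontarget_progression:
        return "PD"
    # CR: all target lesions have disappeared.
    if follow_sum == 0:
        return "CR"

    change = (follow_sum - base_sum) / base_sum
    if change >= 0.20:    # increase of at least 20% in the sum
        return "PD"
    if change <= -0.30:   # decrease of at least 30% in the sum
        return "PR"
    return "SD"


# A lesion that keeps its 40-mm diameter is SD, whatever happens inside it.
print(recist_response([40.0], [40.0]))
print(recist_response([60.0], [40.0]))
print(recist_response([40.0], [50.0]))
```

Because only the outer dimension enters the calculation, a lesion of unchanged diameter is classified as SD by this rule even if it no longer enhances.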
Despite the improvement of RECIST compared with previous criteria, it has been demonstrated that their applicability in different neoplasms is less than optimal.6 Since the publication of the RECIST criteria 7 years ago, several reports have been published regarding the low reliability of RECIST criteria in evaluating response in different types of tumors, such as prostate cancer,7 malignant pleural mesothelioma,8-10 nonsmall cell lung cancer,11, 12 gastrointestinal stromal tumor,13, 14 soft tissue sarcoma,15 neuroendocrine tumors,16 and disseminated pediatric malignancy.17 In addition, with the advent of new drugs, most of which are cytostatic rather than cytotoxic, the assessment of activity by measuring tumor shrinkage may not be appropriate at all.18-20 Finally, RECIST criteria do not take into account changes in tumor viability that may be associated with tumor response. In patients with hepatocellular carcinoma (HCC), the objective of all effective locoregional therapies (ablation and chemoembolization) is to obtain necrosis of the tumor, regardless of the shrinkage of the lesion. Even if extensive tumor necrosis is achieved, this may not be paralleled by a reduction in the greatest dimension of the lesion. Therefore, in 2000, a panel of experts on HCC of the European Association for the Study of the Liver (EASL) agreed that estimating the reduction in viable tumor volume (recognized as nonenhanced areas using dynamic imaging techniques) should be considered the optimal method for assessing local response to treatment in patients with HCC.21 Accordingly, most authors reporting results of locoregional therapy for HCC evaluate tumor response according to this recommendation.22-26
The objective of this study was to compare the concordance/agreement of tumor response evaluated by RECIST criteria versus the EASL guidelines in a cohort of patients who received the locoregional therapies frequently used in the setting of HCC: percutaneous ablation (by ethanol injection or radiofrequency) and transarterial chemoembolization (TACE).
MATERIALS AND METHODS
The current study included 2 cohorts of patients who were selected from 2 prospective investigations assessing TACE with drug eluting beads (DEB-TACE) (24 patients) and percutaneous ablation (31 patients) performed by our group. 22, 27 Both studies were approved by the ethics committee of our institution.
The DEB-TACE Cohort
This cohort included 27 consecutive, asymptomatic patients with untreated, intermediate HCC and compensated Child-Pugh A cirrhosis without vascular invasion or extrahepatic spread (Barcelona Clinic Liver Cancer [BCLC] stage B) from February 2004 to May 2005. The patients received 2 treatments with DEB-TACE separated by 2 months, except for 1 patient who developed persistent, complete arterial obstruction during the first procedure, which precluded a second treatment. Response was not assessed in 3 patients (2 because of liver abscess and 1 because of arterial dissection),22 leaving 24 patients for the current analysis. The diameter of the DEB ranged between 500 μm and 700 μm, and these beads were preloaded with doxorubicin adjusted for body surface and bilirubin (100 mg/m2 in patients with bilirubin <1.5 mg/dL and 75 mg/m2 in patients with bilirubin 1.5-3 mg/dL; maximum dose, 150 mg; median dose, 143 mg).
The Percutaneous Ablation Cohort
This study included 42 patients and was designed to evaluate the use of contrast-enhanced ultrasound for assessing tumor response after percutaneous ethanol injection (PEI) or radiofrequency (RF) in patients with early HCC (BCLC stage A). 27 We included in this RECIST assessment the 31 patients who had basal and post-treatment computed tomography (CT) scans obtained with the same scan and within the period of time recommended by RECIST criteria.
Assessment of Response
All patients underwent a multiphasic study (nonenhanced, arterial, portal and late venous phases) performed with a helical CT scanner (Somatom Plus 4; Siemens, Erlangen, Germany) with 120 mL of nonionic contrast agent. Slice collimation was 5 mm in the arterial and portal phases and 8 mm in the other 2 phases, and the pitch was 1.5. Images were reconstructed at 5-mm or 8-mm intervals. CT scans were obtained within 1 month before treatment, 1 month after percutaneous ablation or the second DEB-TACE treatment, and 3 months after the last procedure. Tumor assessment was made using 5-mm interval, axial reconstructed images obtained during the arterial phase. Response was defined according to RECIST criteria5 and the EASL recommendation using WHO criteria and taking into account tumor necrosis recognized by nonenhanced areas. According to these latter criteria, a CR was defined as the absence of enhanced tumor areas, reflecting complete tissue necrosis; a partial response (PR) was defined as a decrease >50% of enhanced areas, reflecting partial tissue necrosis; PD was defined as an increase >25% in the size of ≥1 measurable lesion(s) or the appearance of new lesions; and SD was defined as a tumor response between PR and PD.21 The responses were evaluated blindly by 2 experienced abdominal radiologists (C.A. and J.R.); in case of a discrepancy in response assessment, the radiologists reviewed the images together, and a decision was reached by consensus.
The values for baseline patient characteristics were expressed as the mean ± standard deviation or the median and range, as appropriate. Comparisons between groups were done by using the Student t test or the Mann-Whitney test for continuous variables and the chi-square test or the Fisher exact test for categorical variables. Agreement between both tumor response criteria was assessed by the κ coefficient. A conventional P value of .05 was considered statistically significant. Calculations were done with the SPSS software package (SPSS, Inc., Chicago, Ill).
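The EASL-based definitions given above can be written as a small decision rule. The following Python sketch is only an illustration of those definitions: the function name `easl_response` and its parameters are hypothetical, and it assumes that the enhanced (viable) tumor area has already been measured on arterial-phase images in consistent units.

```python
def easl_response(baseline_enhanced: float,
                  followup_enhanced: float,
                  max_lesion_increase: float = 0.0,
                  new_lesions: bool = False) -> str:
    """Classify response from enhanced (viable) tumor area, per the text above.

    baseline_enhanced / followup_enhanced: enhanced tumor area (same units).
    max_lesion_increase: largest fractional size increase of any measurable
        lesion (0.30 means +30%); a hypothetical input for this sketch.
    """
    if baseline_enhanced <= 0:
        raise ValueError("baseline enhanced area must be positive")

    # PD: increase >25% in at least 1 measurable lesion, or new lesions.
    if new_lesions or max_lesion_increase > 0.25:
        return "PD"
    # CR: no enhanced tumor areas remain (complete necrosis).
    if followup_enhanced == 0:
        return "CR"
    # PR: decrease >50% of enhanced areas (partial necrosis).
    decrease = (baseline_enhanced - followup_enhanced) / baseline_enhanced
    if decrease > 0.50:
        return "PR"
    # SD: anything between PR and PD.
    return "SD"


# A fully necrotic lesion is a CR here even if its outer diameter is unchanged.
print(easl_response(1200.0, 0.0))
print(easl_response(1200.0, 400.0))
print(easl_response(1200.0, 900.0))
```

The contrast with a diameter-based rule is that the input is the enhancing area rather than the lesion's outer dimension, so necrosis without shrinkage still counts as response.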
In total, 55 patients were included in this study: 24 patients received DEB-TACE, and 31 patients underwent percutaneous ablation (14 by RF and 17 by PEI). Table 1 summarizes the main characteristics of these 2 cohorts. All patients had cirrhosis, which in the majority was secondary to hepatitis C viral infection (n = 39 patients; 70.9%). The median age was 67 years (range, 41-79 years), and there was a predominance of men in both groups (n = 38 men; 69.1%). Liver function was preserved in both groups but was significantly better in the DEB-TACE group, because well preserved liver function is required in patients who are candidates for TACE. The median basal α-fetoprotein level was 15 ng/mL, and there were no statistically significant differences between the 2 groups. The size and the number of nodules were significantly greater in the DEB-TACE group, as expected, and all patients in that group fit within the BCLC intermediate stage. Conversely, most patients in the PEI/RF group presented with a solitary focus, and the nodule size did not exceed 50 mm.
Table 1. Main Patient Characteristics
|Characteristic||DEB-TACE (n=24)||PEI/RF (n=31)||Total (n=55)||P|
|Age: Median (range), y||65.5 (51-79)||69 (41-78)||67 (41-79)||NS|
|No. of men/women||19/5||19/12||38/17||NS|
|No. with HCV/HBV/ethanol/others||13/1/6/4||26/2/3/0||39/3/9/4||NS|
|Child A/B disease, % of patients||100/0||73/27||85/15||.007|
|Basal AFP, ng/mL|| || || || |
| Median (range)||26.5 (1-78,487)||13 (1-1486)||15 (1-78,487)||NS|
| <10 ng/mL, %||42||35||38|| |
| 10-100 ng/mL, %||37||55||47|| |
| >100 ng/mL, %||21||10||15|| |
|AST, mean±SD IU/L||81.3±53.9||89.1±70.5||85.7±63.3||NS|
|ALT, mean±SD IU/L||95.4±94.7||77.4±68.7||85.2±80.8||.04|
|Bilirubin, mean±SD mg/dL||1±0.4||1.3±0.9||1.2±0.7||NS|
|Alkaline phosphatase, mean±SD IU/L||253.8±147.8||290±143.8||273.2±145.4||NS|
|GGT, mean±SD IU/L||152.7±105.4||98.3±81.9||193±96.3||NS|
|Albumin, mean±SD g/L||40.4±4||37.4±4.6||38.8±4.5||.02|
|Prothrombin rate, mean±SD %||84.1±7.5||78.4±14.9||80.8±12.3||NS|
|Total nodule size: Median (range), mm||66.5 (33-150)||21 (14-52)||34 (14-150)||<.001|
|No. of nodules: Median (range)||2 (1-8)||1 (1-2)||1 (1-8)||<.001|
Tumor response was evaluated 1 month after treatment in all patients and at 3 months after the procedure in all of the 24 patients who received DEB-TACE and in 26 patients who underwent PEI/RF; this second CT scan evaluation was not performed in 5 patients: In 3 patients, a CR was not achieved by the percutaneous ablation, and retreatment was conducted; 1 patient who had a CR according to the EASL guidelines underwent liver transplantation 2 months after PEI, and the explant analysis confirmed complete necrosis of the tumor; and, finally, in 1 patient who achieved a CR according to the EASL guidelines, the follow-up assessment at 3 months was based on contrast-enhanced ultrasound, whereas a CT scan was delayed to 6 months later. At that time point, the response was classified again as complete.
Evaluation of tumor response is summarized in Table 2. According to EASL criteria, an objective response was achieved in 87.1%, 75%, and 81.8% of patients in the PEI/RF group, the DEB-TACE group, and the total cohort, respectively. Because of its curative capability, as expected, the CR rate was significantly greater in the PEI/RF group compared with the DEB-TACE group (74.2% vs 29.2%, respectively; P = .009). Conversely, according to RECIST criteria, the PEI/RF group did not achieve any objective response: SD was obtained in 19 patients (61.3%), and 12 patients (38.7%) had PD. In the DEB-TACE group, no CR was achieved, a PR was achieved by 12 patients (50%), 7 patients (29.2%) had SD, and 5 patients (20.8%) had PD. Therefore, although DEB-TACE is a locoregional therapy with no curative capacity, its objective response rate was significantly greater than that achieved with PEI/RF when RECIST criteria were considered (50% vs 0%, respectively; P < .001).
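Table 2. Tumor Response Evaluated by European Association for the Study of the Liver Guidelines and Response Evaluation Criteria in Solid Tumors
|Treatment/Criteria||CR, No. (%)||PR, No. (%)||SD, No. (%)||PD, No. (%)||Objective response, No. (%)|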
|DEB-TACE|| || || || || |
| RECIST||0 (0)||12 (50)||7 (29.2)||5 (20.8)||12 (50)|
| EASL||7 (29.2)||11 (45.8)||1 (4.2)||5 (20.8)||18 (75)|
|PEI/RF|| || || || || |
| RECIST||0 (0)||0 (0)||19 (61.3)||12 (38.7)||0 (0)|
| EASL||23 (74.2)||4 (12.9)||1 (3.2)||3 (9.7)||27 (87.1)|
|Total|| || || || || |
| RECIST||0 (0)||12 (21.8)||26 (47.3)||17 (30.9)||12 (21.8)|
| EASL||30 (54.5)||15 (27.3)||2 (3.6)||8 (14.5)||45 (81.8)|
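The objective response rates quoted in the text are simply the CR and PR counts of Table 2 divided by the number of patients in each group. A minimal Python sketch of that arithmetic follows; the dictionary layout and the function name `objective_response_rate` are illustrative, and the counts are copied from Table 2 in the order CR, PR, SD, PD.

```python
# Counts from Table 2, ordered as (CR, PR, SD, PD).
TABLE_2 = {
    ("DEB-TACE", "RECIST"): (0, 12, 7, 5),
    ("DEB-TACE", "EASL"): (7, 11, 1, 5),
    ("PEI/RF", "RECIST"): (0, 0, 19, 12),
    ("PEI/RF", "EASL"): (23, 4, 1, 3),
}


def objective_response_rate(cr: int, pr: int, sd: int, pd: int) -> float:
    """Objective response = CR + PR, as a percentage of all patients."""
    total = cr + pr + sd + pd
    return 100.0 * (cr + pr) / total


for (group, criteria), counts in TABLE_2.items():
    print(f"{group:9s} {criteria:6s} {objective_response_rate(*counts):5.1f}%")

# Pooled cohort: add the two treatment groups category by category.
for criteria in ("RECIST", "EASL"):
    pooled = [a + b for a, b in zip(TABLE_2[("DEB-TACE", criteria)],
                                    TABLE_2[("PEI/RF", criteria)])]
    print(f"Total     {criteria:6s} {objective_response_rate(*pooled):5.1f}%")
```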
Table 3 summarizes the comparison of tumor response evaluation between the EASL and RECIST criteria. RECIST criteria, as discussed above, did not identify an objective response in any patient who underwent PEI/RF despite the curative potential of this procedure; and, paradoxically, among the 23 CRs that were identified by EASL criteria, RECIST classified the same responses as SD in 17 patients and PD in 6 patients. Similarly, the 4 PRs that were identified by EASL criteria were classified as SD in 1 patient and PD in the remaining 3 patients. Finally, RECIST classified all responses that were categorized as PD by EASL criteria in the same category. The evident lack of a correlation between these tumor response criteria was reflected by κ coefficients of 0.301 (95% confidence interval [CI], 0.0498-0.552; P < .003), 0.0761 (95% CI, 0.0042-0.1481; P = .002), and 0.193 (95% CI, 0.0893-0.2967; P < .001) in the DEB-TACE group, the PEI/RF group, and both groups, respectively.
Table 3. Correlation of Tumor Response Evaluation Between European Association for the Study of the Liver Guidelines and Response Evaluation Criteria in Solid Tumors
|Treatment/EASL response||RECIST CR||RECIST PR||RECIST SD||RECIST PD||κ||95% CI||P|
|DEB-TACE|| || || || ||0.301||0.0498-0.552||<.003|
| CR||0||6||1||0|| || || |
| PR||0||6||5||0|| || || |
| SD||0||0||1||0|| || || |
| PD||0||0||0||5|| || || |
|PEI/RF|| || || || ||0.0761||0.0042-0.1481||<.002|
| CR||0||0||17||6|| || || |
| PR||0||0||1||3|| || || |
| SD||0||0||1||0|| || || |
| PD||0||0||0||3|| || || |
|Total|| || || || ||0.193||0.0893-0.2967||<.0001|
| CR||0||6||18||6|| || || |
| PR||0||6||6||3|| || || |
| SD||0||0||2||0|| || || |
| PD||0||0||0||8|| || || |
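The κ coefficients can be checked from the cross-tabulated counts in Table 3. The following Python sketch computes the standard unweighted Cohen κ (observed agreement corrected for the agreement expected by chance). The function name `cohen_kappa` and the variable names are illustrative, and the orientation of the table (rows EASL, columns RECIST, both in the order CR, PR, SD, PD) is assumed from the accompanying text. The study reports only that SPSS was used, so this is not necessarily the exact routine the authors ran, and the confidence intervals are not reproduced here.

```python
from typing import List


def cohen_kappa(table: List[List[int]]) -> float:
    """Unweighted Cohen kappa for a square agreement table of counts."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of patients on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the marginal totals, category by category.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Rows: EASL CR, PR, SD, PD. Columns: RECIST CR, PR, SD, PD (Table 3).
DEB_TACE = [[0, 6, 1, 0], [0, 6, 5, 0], [0, 0, 1, 0], [0, 0, 0, 5]]
PEI_RF = [[0, 0, 17, 6], [0, 0, 1, 3], [0, 0, 1, 0], [0, 0, 0, 3]]
# Pooled cohort: cell-by-cell sum of the two groups.
TOTAL = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(DEB_TACE, PEI_RF)]

for name, table in (("DEB-TACE", DEB_TACE), ("PEI/RF", PEI_RF), ("Total", TOTAL)):
    print(f"{name}: kappa = {cohen_kappa(table):.3f}")
```

Under these assumptions, the sketch gives κ values of approximately 0.301, 0.076, and 0.193 for the DEB-TACE group, the PEI/RF group, and the pooled cohort, in line with the coefficients reported in Table 3.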
The accurate evaluation of response to treatment is a key aspect in cancer therapy, because an objective response may become a surrogate marker of improved survival. Therefore, the development of reliable and reproducible criteria universally applicable to allow comparisons between different groups is mandatory. Thus, the National Cancer Institute developed RECIST; and, since its publication, it has been used commonly as a tool in oncology investigations. However, RECIST evaluates only unidimensional tumor measurements and disregards the extent of necrosis, which is the objective of all effective locoregional therapies widely used for HCC, including ablation and intra-arterial procedures like chemoembolization. In light of these findings, the Barcelona-2000 EASL clinical guidelines recommended that assessment of tumor response should be performed taking into account the reduction in viable tumor burden as recognized by nonenhanced areas by dynamic CT or magnetic resonance imaging studies, 21 and this criterion has been used widely by the majority of groups dealing with HCC.
To our knowledge, there are no prospective studies specifically aimed at comparing the EASL criteria versus RECIST in patients with HCC who received locoregional treatments. However, several studies have established an association between complete necrosis of the tumor and survival. We recently evaluated the outcome predictors in a cohort of 282 patients with early HCC who underwent percutaneous ablation in a single center. An initial CR was achieved by 192 patients and was maintained until the end of follow-up by 80 patients. In that study, a multivariate analysis of the predictors of survival clearly demonstrated that initially achieving complete tumor necrosis was associated with significantly better survival (odds ratio, 1.83; 95% CI, 1.1-3.1; P = .020).25 Those results were reproduced by an Italian study in a similar population.28 Other studies have evaluated the response with both RECIST and EASL criteria after radioembolization with Yttrium-90. Sato et al23 published their preliminary data evaluating tumor response after radioembolization with Yttrium-90 by using WHO criteria, RECIST, and EASL criteria and reported objective tumor response rates of 24%, 31%, and 71%, respectively. In addition, WHO criteria and RECIST detected no CRs, whereas 10% of patients achieved a CR according to EASL criteria. Furthermore, the same group recently analyzed tumor responses in 76 HCC lesions that were treated by Yttrium-90 using RECIST, the WHO criteria, the size of necrosis, and RECIST combined with necrosis. The tumor response rate was 23% according to RECIST, 26% according to WHO criteria, 57% according to necrosis criteria, and 59% according to the combined criteria; in addition, response was detected earlier according to necrosis criteria compared with the use of size criteria alone. For these reasons, those authors concluded that the use of combined size and necrosis criteria might lead to a more accurate assessment of response to Yttrium-90 radioembolization than size criteria alone.29
In our study, RECIST criteria missed all CRs obtained by tumor necrosis and underestimated the extent of partial tumor response because of tissue necrosis. This finding was particularly relevant for patients who received percutaneous ablation. These therapies have curative capability, and they constitute 1 of the most commonly used treatments in HCC, particularly in those countries where liver transplantation is not feasible. Ablation repeatedly has produced high CR rates (>90%) if necrosis is taken into account. 30-32 However, if RECIST were used, then the rate of initial objective response would be near 0%. Thus, its established efficacy would be neglected.
Similar conclusions were reached when assessing transarterial procedures. The objective of these treatments is to achieve ischemic necrosis of the tumors, which is associated with an increase in survival.26, 33, 34 Therefore, it seems reasonable to evaluate response by measuring the extent of tumor necrosis instead of the mere measurement of size by unidimensional or bidimensional systems. Again, RECIST criteria underestimate tumor response to these treatments, and their use should be discouraged in this setting.
Finally, with the advent of biologic therapies, additional concerns about the reliability of RECIST criteria will emerge. Several promising molecules implicated in cancer cell signaling currently are under phase 2/3 evaluation. The majority of these treatments have demonstrated cytostatic, rather than cytotoxic, properties, so that shrinkage of the tumor after treatment is not expected, and RECIST would not detect the potential efficacy of these promising approaches. 18, 19 This is true for bevacizumab in metastatic colorectal cancer,35 erlotinib in nonsmall cell lung cancer,36 temsirolimus in renal cancer,37 and, more recently, sorafenib in liver cancer, the sole systemic agent that unequivocally has improved survival in patients with HCC.38 Because of the finding that most of these agents act through the inhibition of neoangiogenesis, the assessment of necrosis could evaluate a potential tumor response more accurately, and it would prevent the rejection of a promising treatment because of an underestimation of its real antitumor activity.
In summary, the current analysis indicates that RECIST has no value in the assessment of tumor response after locoregional therapies in patients with HCC. Thus, RECIST should not be used, and this finding prompted us to recommend basing the evaluation of response on measurements of the reduction in viable tumor burden as recognized on dynamic imaging studies. 39
Conflict of Interest Disclosures
This action has been performed within the cooperation framework established by the Transversal Cancer Action approved by the Council of Ministers on October 11, 2007, in accordance with the agreement between The Carlos III Health Institute (ISCIII), which is an autonomous entity currently belonging to the Ministry of Science and Innovation, and the Spanish Biomedical Research Network (CIBER) for the area of hepatic and digestive disorders. Supported in part by a grant from the Carlos III Institute of Health (PI 06/132). Alejandro Forner is supported in part by a grant from the Carlos III Institute of Health (PI 05/645) and from the BBVA Foundation; Maria Varela is supported by a grant from the Scientific Foundation of the Spanish Cancer Association; and Amelia J. Hessheimer, Carlos Rodriguez-Lope, and Maria Reig are supported by a grant from the BBVA Foundation.