Potential conflict of interest: W. Sieghart received speaker and consulting fees and research grants from Bayer Schering Pharma; H. Heinzl received a research grant from Roche; M. Trauner received speaker fees and travel grants from Roche; M. Peck-Radosavljevic received speaker and consulting fees and research grants from Bayer Schering Pharma, Lilly Pharma, and Boehringer Ingelheim.
We aimed to establish an objective point score to guide the decision for retreatment with transarterial chemoembolization (TACE) in patients with hepatocellular carcinoma (HCC). In all, 222 patients diagnosed with HCC and treated with multiple TACE cycles between January 1999 and December 2009 at the Departments of Gastroenterology/Hepatology of the Medical Universities of Vienna (training cohort) and Innsbruck (validation cohort) were included. We investigated the effect of the first TACE on parameters of liver function and tumor response and their impact on overall survival (OS, log-rank test) and developed a point score (ART score: Assessment for Retreatment with TACE) in the training cohort (n = 107, Vienna) by using a stepwise Cox regression model. The ART score was externally validated in an independent validation cohort (n = 115, Innsbruck). An increase of aspartate aminotransferase (AST) by >25% (hazard ratio [HR] 8.4; P < 0.001), an increase in Child-Pugh score of 1 point (HR 2.0) or ≥2 points (HR 4.4) (P < 0.001) from baseline, and the absence of radiologic tumor response (HR 1.7; P = 0.026) remained independent negative prognostic factors for OS and were used to create the ART score. The ART score differentiated two groups (0-1.5 points; ≥2.5 points) with distinct prognosis (median OS: 23.7 versus 6.6 months; P < 0.001), and a higher ART score was associated with major adverse events after the second TACE (P = 0.011). These results were confirmed in the external validation cohort and remained significant irrespective of Child-Pugh stage and the presence of ascites prior to the second TACE. Conclusion: An ART score of ≥2.5 prior to the second TACE identifies patients with a dismal prognosis who may not profit from further TACE sessions. (HEPATOLOGY 2013;57:2261–2273)
Hepatocellular carcinoma (HCC) is the fifth most common cancer worldwide and develops predominantly in patients with liver cirrhosis.1 The Barcelona Clinic Liver Cancer (BCLC) staging system2, 3 links tumor characteristics with liver function and evidence-based therapeutic strategies and has been endorsed by the European4 and the American5 HCC management guidelines. Accordingly, curative treatments such as orthotopic liver transplantation (OLT), resection, or radiofrequency ablation (RFA) are reserved for patients with early stage HCC (BCLC stage 0/A). Unfortunately, HCC is commonly diagnosed at intermediate (BCLC stage B) or advanced (BCLC stage C) tumor stages,6, 7 where only palliative treatment options can be offered, resulting in a limited overall survival (OS) of 11-20 months.
Transarterial chemoembolization (TACE) is the recommended treatment modality for asymptomatic, large, or multifocal HCC without macrovascular invasion or extrahepatic metastasis (intermediate HCC, BCLC stage B). As most patients with HCC also suffer from liver cirrhosis, not only tumor characteristics but also the degree of liver dysfunction are of prognostic importance for patients undergoing TACE. Several studies showed8 that baseline tumor characteristics like tumor size or extent, alpha-fetoprotein (AFP) values, as well as baseline Child-Pugh score, presence of ascites, and several baseline lab values, e.g., AST9 are associated with OS of HCC patients.
Furthermore, tumor-related dynamics after TACE are important for patient prognosis, as radiologic and biochemical (AFP) tumor responses have been associated with improved patient outcome.10-12
Finally, deterioration of liver function after TACE may negatively affect patient prognosis, and liver function may worsen further after repeated TACE sessions or even preclude any subsequent antitumor treatment.
The aim of this study was to establish a clinically usable point score to guide the decision for retreatment with TACE in patients with HCC. Using a stepwise multivariate regression model, we developed a novel point score that predicts patient outcome from patient characteristics prior to the second TACE and from the dynamics of tumor-related and liver function-related parameters after the first TACE session.
Materials and Methods
All patients, >18 years old at the time of the first TACE cycle, diagnosed with HCC by histology or dynamic imaging (computed tomography [CT] / magnetic resonance imaging [MRI] scans) according to the European Association for the Study of the Liver (EASL) diagnostic criteria4 who were treated with conventional TACE (cTACE), transarterial embolization (TAE), or TACE with drug-eluting beads (DEB-TACE) (hereafter summarized and referred to as TACE) at the Department of Gastroenterology and Hepatology of the Medical University of Vienna between January 1999 and December 2009 (n = 231) were screened for eligibility (Fig. 1).
Patients with HCC at BCLC stage A or B and preserved liver function (Child-Pugh stage A or B) who received at least two TACE sessions within 3 months (≤90 days) were included and formed the training cohort for all further analysis.
Patients were excluded if they received TACE before OLT or resection, or if patients received TACE for HCC recurrence after OLT. Additionally, patients who received TACE despite poor liver function (Child-Pugh C) and patients at BCLC stage C were excluded.
The results of the training cohort were then confirmed in an independent external validation database. This database includes all HCC patients >18 years diagnosed by dynamic imaging (CT/MRI) or histology according to EASL diagnostic criteria4 who received TACE between January 2001 and January 2008 at the Medical University of Innsbruck (n = 252). The selection criteria for the validation cohort were the same as for the training cohort (Fig. 1).
In both institutions the presence of Child-Pugh C cirrhosis, portal vein thrombosis, or Eastern Cooperative Oncology Group (ECOG) >1 were considered contraindications for retreatment with TACE.
Collection of Data.
This study was approved by the Ethics Committees of the Medical Universities of Vienna and Innsbruck.
Baseline imaging (triphasic CT/MRI scan) was performed 5-7 days before the first TACE session. HCC was staged according to the BCLC classification2, 3 and by the International Union Against Cancer (UICC) tumor node metastasis (TNM) classification, 6th edition.13
In both institutions, radiologic tumor response was assessed by CT/MRI scan prior to the second TACE session (at most 90 days after the first TACE) according to EASL criteria.4 Objective tumor response was defined as partial response to the first TACE session, while stable disease (SD) and progressive disease (PD) were judged as the absence of objective tumor response. Patients with complete response (CR) after the first TACE did not receive a further TACE session and were therefore not included in this analysis.
All laboratory values including AFP levels as well as liver function parameters including the Child-Pugh score14 were determined 1 day before the first TACE session and 1 day before the second TACE session.
Additionally, we determined the change in the Child-Pugh score (hereafter designated Child-Pugh score increase) between the timepoints pre-TACE-1 and pre-TACE-2. Changes in all other liver function-related laboratory parameters (AST, alanine aminotransferase [ALT], etc.) between the first and second TACE were analyzed as outlined in the Statistics section.
AFP response was defined as a decrease in AFP by 50% in patients with pre-TACE-1 values of ≥200 kU/L.12 We formed three AFP groups for univariate analysis: pre-TACE-1 AFP ≥200 kU/L with response, pre-TACE-1 AFP ≥200 kU/L without response, and pre-TACE-1 AFP <200 kU/L.
We recently demonstrated15 that elevated C-reactive protein (CRP) values have strong prognostic significance for patients with HCC. Thus, CRP values (<1 mg/dL and ≥1 mg/dL) prior to the second TACE were included in the statistical analysis.
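The three-group AFP classification can be illustrated with a minimal Python sketch. The function name and group labels are hypothetical, and the sketch assumes that "decrease by 50%" means a decrease of at least 50% from the pre-TACE-1 value.

```python
from typing import Optional

AFP_THRESHOLD = 200.0  # kU/L, baseline cutoff taken from the text


def afp_group(afp_pre_tace1: float, afp_pre_tace2: Optional[float]) -> str:
    """Assign a patient to one of the three univariate AFP groups."""
    if afp_pre_tace1 < AFP_THRESHOLD:
        return "baseline <200"
    if afp_pre_tace2 is None:
        raise ValueError("follow-up AFP is required when baseline AFP >= 200 kU/L")
    # Assumption: a response is a decrease of at least 50% from baseline.
    responded = afp_pre_tace2 <= 0.5 * afp_pre_tace1
    return "response" if responded else "no response"
```

Patients with a baseline AFP below 200 kU/L form their own group regardless of the follow-up value, which mirrors the grouping used for the univariate analysis.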
Adverse events that occurred within 4 weeks after TACE or were unequivocally TACE-related were documented according to the Common Terminology Criteria for Adverse Events v. 3.0 (CTCAE).16
At the Medical University of Vienna, TAE, cTACE, and DEB-TACE were performed as described.17, 18 Also, the TACE technique used at the Medical University of Innsbruck (validation cohort) has been reported.19 More information is outlined in the Supporting Methods section. Both institutions used a treatment on demand TACE schedule and no TACE session was performed in the presence of complete radiologic response.
Study Design and Statistical Analyses.
The study design is provided in Fig. 2. Patient characteristics prior to the first and second TACE are presented with descriptive statistics. The chi-squared test (or Fisher's exact test) was used to compare categorical outcomes between groups.
OS was defined as the time from the day prior to the second TACE session until death or last follow-up. Survival curves were calculated using the Kaplan-Meier method. Median survival times (OS) and their 95% confidence intervals (CIs) are reported.
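For readers unfamiliar with the estimator, the following is a minimal pure-Python sketch of the Kaplan-Meier method and of the median survival time derived from it. It is not the SPSS/SAS procedure used in this study; the function names and the toy data in the usage note are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times  : follow-up (e.g., months) from the day before the second TACE
    events : 1 if the patient died, 0 if censored at last follow-up
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Patients censored at t still count as at risk at t.
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which S(t) falls to 0.5 or below; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

For example, `median_survival(kaplan_meier([2, 4, 4, 6, 9, 12], [1, 1, 0, 1, 0, 1]))` returns 6 for this toy dataset, because the estimated survival first drops below 0.5 at the death observed at 6 months.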
The log-rank test was used to assess the effects of patient variables (pre-TACE 1 and pre-TACE 2) as well as tumor response variables and variables representing worsening of liver function (between TACE-1 and TACE-2) on OS. The effect of continuous variables (e.g., AST, ALT, γGT etc.) on OS was assessed for each variable by forming four groups at its quartiles. When the respective log-rank test was significant, a spline-based approach was applied to assess the functional form of the variable on OS.20 Based on this graphical representation a clinically sensible and applicable transformation of the respective variable was chosen.
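The quartile grouping of a continuous variable can be sketched as follows. This is an illustration only: it uses the default ("exclusive") quantile method of Python's `statistics` module, which is an assumption and may differ from the quartile definition used by the statistical packages in this study.

```python
import statistics


def quartile_groups(values):
    """Map each value to a quartile group 1..4 based on the sample quartiles."""
    q1, q2, q3 = statistics.quantiles(values, n=4)

    def group(v):
        if v <= q1:
            return 1
        if v <= q2:
            return 2
        if v <= q3:
            return 3
        return 4

    return [group(v) for v in values]
```

The resulting group labels would then serve as the stratifying factor for the log-rank test.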
Variables with P < 0.05 in the univariate analysis were entered as candidate variables into a stepwise Cox regression model (conditional backward selection).
The regression coefficients of the Cox regression model were multiplied by 2 and rounded in order to obtain easy to use point numbers facilitating the bedside calculation of the ART score.
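As a rough illustration of this step, the sketch below doubles the Cox coefficients implied by the hazard ratios reported in the abstract, using B = ln(HR). The dictionary keys are hypothetical names. Because the rounding granularity is not stated in this section, the sketch stops at the doubled coefficients and leaves the final rounding to Table 3.

```python
import math

# Hazard ratios as reported in the abstract; the Cox coefficient is B = ln(HR).
HAZARD_RATIOS = {
    "ast_increase_gt_25pct": 8.4,
    "child_pugh_increase_1": 2.0,
    "child_pugh_increase_ge_2": 4.4,
    "no_radiologic_response": 1.7,
}

# Doubled coefficients, i.e., the values before rounding to point numbers.
DOUBLED_COEFFICIENTS = {
    name: 2.0 * math.log(hr) for name, hr in HAZARD_RATIOS.items()
}

for name, value in DOUBLED_COEFFICIENTS.items():
    print(f"{name}: 2B = {value:.2f}")
```

Since the published hazard ratios are themselves rounded, the doubled values obtained this way only approximate those computed from the unrounded coefficients.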
To avoid overoptimistic results due to model fitting and evaluation in the same dataset, we evaluated the prognostic performance of the ART score in an independent external validation cohort.
All reported P-values are results of two-sided tests. A significance level of 0.05 was applied throughout. Statistical analyses were performed using IBM SPSS v. 20.0 (SPSS, Armonk, NY) and SAS 9.3 (SAS Institute, Cary, NC).
Results
Patient characteristics of both cohorts prior to the first and second TACE are shown in Table 1. In the training cohort (n = 107), the majority of patients were at BCLC stage B (n = 94, 88%), and 27% of patients (n = 29) had received an antitumor therapy prior to the first TACE, including liver resection (n = 6), PEI (n = 19), and RFA (n = 4), while 73% of patients received TACE as first-line therapy (n = 78).
Table 1. Patient Characteristics
Abbreviations: HBV, hepatitis B virus infection; HCV, hepatitis C virus infection; BCLC, Barcelona Clinic Liver Cancer; TNM, Tumor Nodes Metastasis.
Four missing Child-Pugh stages in the validation cohort.
Two missing values in the validation cohort.
Four missing values in the training cohort, 2 in the validation cohort.
Most patients (n = 77, 72%) were treated with chemoembolization (cTACE versus DEB-TACE: n = 56 versus 21), while 28% of patients (n = 30) received TAE only. Between TACE 1 and TACE 2, 32 patients had a Child-Pugh score increase of at least 1 point, while 59 patients showed no change and 16 patients showed a decrease of the Child-Pugh score by at least 1 point. Prior to the second TACE, the majority of patients (n = 72, 67%) had Child-Pugh A cirrhosis. Overall, the median number of TACE interventions was 3 (range 2-12). The median time interval between the first and second TACE was 45 days (range 13-90).
In the validation cohort (n = 115), the majority of patients were at BCLC stage B (n = 79, 69%), and 9 patients (8%) had received an antitumor therapy prior to the first TACE, including liver resection (n = 7) and RFA (n = 2). In all, 114 patients were treated with cTACE with lipiodol and epirubicin, and one patient received DEB-TACE. Between TACE 1 and TACE 2, 27 patients had a Child-Pugh score increase of at least 1 point, while 66 patients showed no change and 18 patients showed a decrease of the Child-Pugh score by at least 1 point. Prior to the second TACE, most patients had Child-Pugh A cirrhosis (n = 69, 62%). Overall, the median number of TACE interventions was 3 (range 2-20). The median time interval between the first and second TACE was 42 days (range 26-85).
Univariate Analysis of Prognostic Factors in the Training Cohort.
In the training cohort (n = 107), 88% of the patients (n = 94) died during the observational period between January 1999 and December 2011, and 12% of patients (n = 13) were still alive (n = 8) or lost to follow-up (n = 5). The median OS of the whole training population was 16.2 months (95% CI, 13.4-19.0) (Table 2). The median time of follow-up was 70.5 months.
Table 2. Univariate Analysis of Prognostic Factors in HCC Patients Treated with TACE in the Training Cohort
Of the patient characteristics (Table 1), only Child-Pugh stage (pre-TACE 2, P = 0.004), tumor extent (pre-TACE 1, P = 0.047), and CRP-levels (pre-TACE 2, P = 0.001) had a significant impact on OS (Table 2).
Tumor response variables like radiologic tumor response (median OS: response versus nonresponse: 18.8 versus 9.3 months [95% CI: 14.2-23.4 versus 7.3-11.4 months], P = 0.001), as well as an AFP decrease >50% (median OS: AFP response versus no AFP response versus baseline AFP <200 kU/L: 16.7 versus 8.5 versus 16.7 months [95% CI: 12.1-21.3 versus 3.4-13.6 versus 12.5-20.9 months], P = 0.005) from baseline were significantly associated with a better outcome.
We next evaluated the impact of liver function deterioration between pre-TACE-1 and pre-TACE-2 on patient outcome. Of all liver function-related laboratory parameters, only the quartiles of AST increase were associated with worse survival (data not shown). Subsequent spline-based analysis of the influence of AST increase on the hazard ratio of death revealed a clear sigmoid shape (Supporting Fig. 1). An HR of 1.5 (in comparison to an AST increase of 0%) was considered clinically relevant and corresponded to a (rounded) AST increase of 25%, which was therefore used as the dichotomous cutoff for further analysis. An AST increase >25% was associated with a worse median OS (increase versus no increase: 6.4 versus 20.2 months [95% CI: 4.8-8.0 versus 15.9-24.6 months], P < 0.001). Similarly, an increase in the Child-Pugh score of 1 or more points after the first TACE was significantly associated with a poor median OS (Table 2).
Given that the time of AST and Child-Pugh score assessment was heterogeneous (13-90 days after TACE 1), we also evaluated whether the time of assessment had any influence on our results. For this purpose, we formed two groups based on the median of the time interval between TACE 1 and TACE 2 and analyzed the distribution of the variables AST increase >25% and Child-Pugh increase with respect to the median time between TACE 1 and TACE 2. As shown in Supporting Table 1, there was no accumulation of AST increase >25% or Child-Pugh increase at earlier timepoints of assessment. Finally, we evaluated whether the prognostic significance of AST increase >25% and Child-Pugh increase differed depending on the time of their assessment. As shown in Supporting Table 2, the time of assessment had no influence on the prognostic significance of either variable.
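The two liver function variables described above can be derived from paired pre-TACE-1 and pre-TACE-2 measurements as in the following sketch. The function names and category labels are hypothetical, and the AST change is assumed to be expressed relative to the pre-TACE-1 value.

```python
def relative_increase_pct(before: float, after: float) -> float:
    """Percentage change from the pre-TACE-1 to the pre-TACE-2 value."""
    if before <= 0:
        raise ValueError("baseline value must be positive")
    return 100.0 * (after - before) / before


def ast_increase_gt_cutoff(ast_pre_tace1: float, ast_pre_tace2: float,
                           cutoff_pct: float = 25.0) -> bool:
    """True if AST rose by more than the cutoff (25% in this study)."""
    return relative_increase_pct(ast_pre_tace1, ast_pre_tace2) > cutoff_pct


def child_pugh_increase_category(cp_pre_tace1: int, cp_pre_tace2: int) -> str:
    """Classify the change in Child-Pugh score between the two timepoints."""
    delta = cp_pre_tace2 - cp_pre_tace1
    if delta >= 2:
        return "+>=2 points"
    if delta == 1:
        return "+1 point"
    return "no increase"  # unchanged or decreased score
```

Note that the cutoff is strict: an AST rise of exactly 25% (e.g., from 40 to 50 U/L) does not count as an increase >25%.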
Stepwise Cox Regression Model with All Factors Predicting OS in the Training Cohort.
According to the univariate analysis (Table 2), the significant parameters Child-Pugh stage (pre-TACE 2), tumor extent (pre-TACE 2), CRP levels (pre-TACE 2), AFP response, radiologic response, AST increase >25%, and Child-Pugh score increase were entered into a Cox regression analysis.
After the stepwise removal of variables that were not significant (step 1: AFP response, P = 0.42; step 2: Child-Pugh stage, P = 0.15; step 3: tumor extent, P = 0.27; step 4: CRP, P = 0.12), only radiologic tumor response, AST increase of >25%, and Child-Pugh score increase of 1 point or ≥2 points (Table 3) remained significant predictors of OS. The calculated regression coefficients (B-values) were multiplied by 2 and rounded in order to facilitate the calculation of the ART score (Table 3).
Table 3. Results of Multivariate Stepwise Backward Cox Regression Analysis of Prognostic Factors in Patients With HCC Treated With TACE in the Training Cohort
The regression coefficients (B) were multiplied by 2 and rounded in order to facilitate the bedside calculation of the ART score.
Variables: Child-Pugh score increase (+1 point; +≥2 points); AST increase >25%; radiologic tumor response.
ART Score Predicts OS in the Training and the Validation Cohort.
We next calculated the ART score for all patients for whom all three parameters were available (training cohort: n = 97, validation cohort: n = 107).
In the training cohort, the ART score identified two subgroups with distinct prognosis (Fig. 3A). Patients with an ART score of 0-1.5 points had a median OS of 23.7 months (95% CI, 16.2-32.2 months). In contrast, patients with an ART score ≥2.5 points had a median OS of 6.6 months (95% CI, 4.5-8.8 months; P < 0.001) (Fig. 3B). The ART score performed equally well in all three transarterial techniques used in the training cohort (Fig. 3C-E).
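A minimal sketch of how the score and its two prognostic groups could be computed is given below. The point values are an assumption: the body of Table 3 is not reproduced in this text, and the values used here are roughly consistent with doubling ln(HR) from the abstract and with the reported cutoffs, but they should be checked against Table 3. All names are hypothetical.

```python
# ASSUMED point values; the Table 3 values are not reproduced in this text.
# Verify against the published table.
ASSUMED_POINTS = {
    "cp_increase_1": 1.5,
    "cp_increase_ge_2": 3.0,
    "ast_increase_gt_25pct": 4.0,
    "no_radiologic_response": 1.0,
}


def art_score(cp_increase: int, ast_increase_pct: float,
              radiologic_response: bool, points=ASSUMED_POINTS) -> float:
    """Sum the points of the three parameters assessed before the second TACE."""
    score = 0.0
    if cp_increase >= 2:
        score += points["cp_increase_ge_2"]
    elif cp_increase == 1:
        score += points["cp_increase_1"]
    if ast_increase_pct > 25.0:
        score += points["ast_increase_gt_25pct"]
    if not radiologic_response:
        score += points["no_radiologic_response"]
    return score


def art_group(score: float) -> str:
    """Return the prognostic group using the cutoffs reported in the text."""
    return ">=2.5 points" if score >= 2.5 else "0-1.5 points"
```

With these assumed values, no combination of parameters yields a score of exactly 2, which fits the reported grouping into 0-1.5 points and ≥2.5 points.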
Of patients in the training cohort with an ART score of 0-1.5 points (n = 60), 53 (88%) received more than 2 TACE sessions, while of patients with an ART score ≥2.5 points (n = 37), 24 (65%) received more than 2 TACE sessions (P = 0.006, chi-squared test). Of note, the ART score remained of prognostic significance regardless of whether patients received ≤3 or >3 TACE cycles overall (Supporting Fig. 2a). Detailed information about how patients achieved an ART score of ≥2.5 points is provided in Supporting Table 3.
Crucially, the ART score showed similar results in the independent external validation cohort (n = 107; Fig. 3F; Supporting Fig. 2b): The median OS of patients with an ART score of 0-1.5 points (n = 74) was 27.6 months (95% CI, 22.5-33.5 months) and 8.1 months (95% CI, 5.7-10.5 months, P < 0.001) for patients with an ART score ≥2.5 points (n = 33).
Of patients in the validation cohort with an ART score of 0-1.5 points (n = 74), 55 (74%) received >2 TACE sessions, while of patients with an ART score ≥2.5 points (n = 33), 21 (64%) received >2 TACE sessions (P = 0.26, chi-squared test). Similar to the training cohort, the ART score remained of prognostic significance irrespective of the number of TACE cycles applied in the validation cohort (Supporting Fig. 2b).
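The chi-squared comparisons of retreatment frequency can be reproduced from the counts given in the text. The sketch below assumes the Pearson statistic without continuity correction; the text names only the "chi-squared test", so the exact variant is an assumption.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail P-value is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Rows: ART score 0-1.5 versus >=2.5 points.
# Columns: more than 2 TACE sessions versus exactly 2 TACE sessions.
_, p_training = chi2_2x2(53, 7, 24, 13)
_, p_validation = chi2_2x2(55, 19, 21, 12)
print(f"training P = {p_training:.3f}, validation P = {p_validation:.2f}")
```

Under this assumption the two P-values come out at approximately 0.006 and 0.26, in line with the values reported for the training and validation cohorts.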
ART Score Predicts OS in Several Clinical Subgroups.
The ART score remained a significant predictor of OS when the training or the independent validation cohort was stratified according to important clinical characteristics prior to the second TACE: an ART score of ≥2.5 points identified subgroups of different prognosis in patients with Child-Pugh stages B7, B≥8, presence of ascites, and normal or elevated CRP levels (Fig. 4A-F).
Furthermore, higher ART score values were associated with more documented clinical adverse events within 4 weeks after the second TACE in both cohorts (Table 4).
Table 4. Association of ART Score With Adverse Events After Second TACE in the Training and Validation Cohorts
Pooled P-value for both cohorts (Fisher's exact test).
Outcomes: AE ≥ grade 3 within 4 weeks after first TACE; AE ≥ grade 3 within 4 weeks after second TACE; unscheduled hospitalizations after second TACE; 30-day mortality after second TACE.
Discussion
Most patients with HCC suffer from liver cirrhosis. Thus, deterioration of liver function after TACE may jeopardize a survival benefit from this treatment. In this regard, a panel of experts recently proposed an algorithm for retreatment with TACE.8 In this algorithm, deterioration of liver function after the first TACE was considered a reason to avoid further TACE cycles and to switch patients to other evidence-based treatments like sorafenib therapy.21
However, liver function deterioration was not defined in detail in this algorithm and may range from subtle changes in liver-related laboratory parameters to severe hepatic decompensation. The decision making for retreatment with TACE was therefore left to the subjective clinical judgment of the managing physician.8
The aim of this study was to establish an objective tool to guide the decision process for the retreatment with TACE in patients with HCC.
We found that both the lack of a radiologic tumor response and deterioration of liver function (defined as an AST increase >25% and/or an increase of the Child-Pugh score) after the first TACE were associated with a dismal prognosis for patients who were retreated with TACE. In our Cox regression model, these parameters remained independent and statistically significant, while baseline characteristics prior to the first TACE dropped out (Table 3). These results strongly underline the importance of the antitumor and hepatic effects of the first TACE.
Therefore, we developed the ART score based on the regression coefficients of the significant variables of our multivariate regression model. The ART score identified two distinct groups of different prognosis and was a significant predictor of OS (Fig. 3A,B). Crucially, our findings obtained in the training cohort could be confirmed in an external, independent validation cohort (Fig. 3F). Furthermore, the ART score retained prognostic significance even when the training or the validation cohort was stratified according to Child-Pugh stage, the presence of ascites, or CRP elevation prior to the second TACE session (Fig. 4A-F), or according to the overall number of TACE cycles applied (Supporting Fig. 2a,b).
Taking a closer look at these data, the ART score meets the goal of identifying patients who will not profit from retreatment with TACE. Patients who gained 2.5 or more points after the first TACE had an OS of about 7 months, with a tight 95% CI of 5.7-10.5 months (Fig. 3F). Therefore, the OS of this BCLC stage B subpopulation identified by the ART score is as poor as the overall survival of the placebo group in the SHARP trial,22 which included patients with more advanced BCLC stage C. In contrast, patients with less than 2.5 points in the ART score had a good prognosis, with a median OS of 28 months and a lower limit of the 95% CI of 22.5 months (Fig. 3F). This finding is especially relevant for patients with Child-Pugh B cirrhosis prior to the second TACE, since the use of TACE in this patient population is still a matter of debate.8 In our study, the ART score was even predictive in different Child-Pugh B subgroups (Child-Pugh B7 points and Child-Pugh B≥8 points) (Fig. 4B,C). Of note, patients with a Child-Pugh stage B7 prior to the second TACE and a favorable ART score of 0-1.5 points had an excellent prognosis (Fig. 4B). A similar trend was observed in patients with a Child-Pugh stage B≥8 prior to the second TACE (Fig. 4C). With an ART score of 0-1.5 points, these patients showed an OS similar to that observed in BCLC stage B patients treated with sorafenib in the SHARP trial (OS for sorafenib versus placebo: 14.5 versus 11.4 months).23
These findings are of key clinical relevance for several reasons. First, the ART score is simple and easily applicable in a real-life clinical setting, even in countries with limited healthcare resources. Second, the application of the ART score may protect patients with subtle, otherwise unrecognized or neglected laboratory changes from detrimental retreatment with TACE. Third, the use of the ART score may also prevent undertreatment with TACE. Difficulties encountered by the recent multicenter TACE trial (SPACE study: TACE + sorafenib or placebo), which excluded patients at Child-Pugh stage B>7 points and patients with ascites of any grade from further TACE sessions, could potentially have been avoided by using such a score. In this otherwise well-designed study,24 the strict retreatment criteria were at least in part responsible for 36% of patients in the sorafenib arm and 19% in the placebo arm receiving only one baseline TACE session, even though only patients with optimal baseline characteristics (BCLC B, Child-Pugh A, ECOG 0) were enrolled in this trial. The fact that almost a third of the patients in either group received further TACE sessions after they went off protocol further underlines the danger that inadequate retreatment criteria pose to protocol compliance and, consequently, to the success of multicenter TACE studies. The ART score developed here is able to identify patients with good prognosis despite the presence of Child-Pugh stage B 7-9 points (Fig. 4B,C) or ascites (Fig. 4D) and would therefore provide a robust, objective, evidence-based tool to guide retreatment with TACE in future clinical trials.
Finally, given the association of higher ART score values with SAEs and unplanned admissions (Table 4) and with poorer OS (Figs. 3, 4), the application of this score may spare patients suffering and reduce consequential costs by avoiding treatment-related side effects.
The retrospective nature and the heterogeneous TACE types (TAE, cTACE, DEB-TACE) in the training cohort may be potential limitations of this study. However, we confirmed the results in all three TACE types in the training cohort (Fig. 3C-E) and in a completely independent external patient cohort (Table 1, Figs. 3F, 4) in which most patients received conventional TACE. Additionally, the outcome of our patient population within the different Child-Pugh stages (Table 2) matches the published survival data reported in prospective clinical trials and meta-analysis3 and, thus, further confirms the validity of our data. Another limitation may be the ART score assessment at heterogeneous timepoints between the first and second TACE (13-90 days), since the ART score is composed of laboratory changes that may be reversible over time. However, time-related sensitivity analysis (Supporting Tables 1 and 2) gave no indication that the time of the ART score assessment influenced the results of this study. Finally, the ART score was developed by using the radiologic EASL response criteria. Although the prognostic performance of EASL criteria in the setting of TACE seems to be equal to the performance of mRECIST criteria,25 the latter may be more adequate to distinguish the prognosis of patients with partial response from that of subjects with stable disease.26 This may reflect the different definitions of partial response in the two models: greater than 50% tumor reduction for EASL and greater than 30% for mRECIST criteria. Given that radiologic response is a parameter of the ART score, there is a need for prospective studies validating the ART score that include mRECIST criteria in the study design.
In summary, we developed a novel and externally validated, noninvasive, objective, widely applicable prognostic (ART) score for patients with HCC allocated to retreatment with TACE. Patients with 2.5 or more points in the ART score prior to the second TACE may not profit from further TACE sessions. Based on the TACE retreatment algorithm published by Raoul et al.,8 we propose that these patients should rather receive other evidence-based treatments such as sorafenib therapy (Supporting Fig. 3). Our data warrant validation of this new concept in a prospective clinical trial.