Sixty-five gene-based risk score classifier predicts overall survival in hepatocellular carcinoma

Authors


  • Potential conflict of interest: Nothing to report.

Abstract

Clinical application of the prognostic gene expression signature has been delayed due to the large number of genes and complexity of prediction algorithms. In the current study we aimed to develop an easy-to-use risk score with a limited number of genes that can robustly predict prognosis of patients with hepatocellular carcinoma (HCC). The risk score was developed using Cox coefficient values of 65 genes in the training set (n = 139) and its robustness was validated in test sets (n = 292). The risk score was a highly significant predictor of overall survival (OS) in the first test cohort (P = 5.6 × 10−5, n = 100) and the second test cohort (P = 5.0 × 10−4, n = 192). In multivariate analysis, the risk score was a significant risk factor among clinical variables examined together (hazard ratio [HR], 1.36; 95% confidence interval [CI], 1.13-1.64; P = 0.001 for OS). Conclusion: The risk score classifier we have developed can identify two clinically distinct HCC subtypes at early and late stages of the disease in a simple and highly reproducible manner across multiple datasets. (HEPATOLOGY 2011)

Hepatocellular carcinoma (HCC) is the third leading cause of cancer death worldwide and accounts for an estimated 600,000 deaths annually.1 Although surgical resection for HCC provides the best chance for a cure, the prognosis after surgery differs considerably among patients. Because of this clinical heterogeneity, predicting the recurrence or survival of HCC patients after surgical resection remains challenging. An accurate stratification reflecting the prognosis of HCC patients would help select the therapy with the potential to confer the best survival, so considerable effort has been devoted to establishing such a stratification (or staging) model for HCC by using clinical information and pathological criteria.2, 3 Currently, several clinical classification systems, including the Cancer of the Liver Italian Program, the Barcelona-Clinic Liver Cancer (BCLC), the Chinese University Prognostic Index, and the Japanese Integrated Staging schema, have been developed and used in clinics.4–7 Although these staging systems have proven useful,8 their predictive accuracy remains limited and they fail to capture the biological characteristics of HCC that might account for the clinical heterogeneity.

With the recent advances in gene expression profiling technology, improvement in prediction models for risk assessment in HCC has been reported.9–18 However, although these gene expression signatures might better reflect the biological characteristics of HCC tumors, the complexity of prediction models based on such signatures has hampered their clinical usefulness. To overcome this limitation, we developed a simple risk scoring system that can predict overall survival (OS) of patients after surgical resection for HCC.

Abbreviations

AUC, area under the curve; BCLC, Barcelona-Clinic Liver Cancer; GEO, Gene Expression Omnibus; HBsAg, HBV surface antigen; HBV, hepatitis B virus; HCC, hepatocellular carcinoma; INSERM, Institute for Health and Medical Research; LCI, Liver Cancer Institute; LOOCV, leave-one-out-cross-validation; MSH, Mount Sinai Hospital; NCBI, National Center for Biotechnology Information; NCI, National Cancer Institute; OS, overall survival; ROC, receiver-operating characteristic.

Materials and Methods

Patients and Gene Expression Data.

Gene expression and clinical data from the National Cancer Institute (NCI), Mount Sinai Hospital (MSH), and Liver Cancer Institute (LCI) HCC cohorts, as reported in previous studies, were acquired from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database (accession numbers GSE1898, GSE4024, GSE9843, and GSE14520).11, 13, 15–17 Gene expression data from HCC patients at the French National Institute for Health and Medical Research (INSERM) were obtained from ArrayExpress, another public microarray database (accession number E-TABM-36).9

In addition to these gene expression data from previous studies, we included gene expression data from 100 patients with HCC (the Korean cohort) as an independent validation cohort for the risk score. Tumor specimens and clinical data were obtained from HCC patients undergoing hepatectomy as primary treatment for HCC at Seoul National University, Seoul, and Chonbuk National University, Jeonju, Korea. One hundred surgically removed frozen HCC specimens were used for microarray experiments. Samples were frozen in liquid nitrogen and stored at −80°C until RNA extraction. The study protocols were approved by the Institutional Review Boards at both institutions, and all participants provided written, informed consent. Gene expression data from the Korean cohort were generated using the Illumina microarray platform (Illumina, San Diego, CA). Patients in the Korean cohort were followed up prospectively at least once every 3 months after surgery.

Most of the patients in the two validation cohorts were men (83% for Korean cohort and 87.5% for LCI cohort), Child-Pugh class A (92% for Korean cohort and 87% for LCI cohort), and had cirrhosis (64% for Korean cohort and 92.0% for LCI cohort). Hepatitis B virus (HBV) infection was determined by serological positivity for HBV surface antigen (HBsAg) or anti-HBe antibodies. All patients received surgical resection and a majority of patients had a single tumor at the time of resection (96% for Korean cohort and 78% for LCI cohort). Patients were staged according to the TNM 6th edition (2006) and Barcelona Clinic Liver Cancer staging system.19 Tumor size was based on the largest dimension of the tumor specimen. Tumor grade was scored using the modified nuclear grading scheme outlined by Edmondson and Steiner.20 Grades 1 and 2 were defined as well-differentiated and grades 3 and 4 as moderately/poorly differentiated. The majority of patients in the three cohorts had not received anti-HBV treatment after surgery. The Eastern Cooperative Oncology Group (ECOG) performance score of all patients was 0 or 1. The presence of cirrhosis was also confirmed on the surgical specimen. OS was defined as the time from surgery to death and censored when a patient was alive at last contact.

Table 1 shows the pathologic and clinical characteristics of the patients in all five cohorts. All patients had undergone surgical resection as their primary treatment. Patient data were retrospectively collected from medical records. BCLC staging is based on preoperation data, and vasculature invasion is pathologically defined as the presence of endolymphatic or lymphovascular tumor emboli within tumors. Survival data are not publicly available for the MSH and INSERM cohorts; thus, these patients were not used for survival analyses.

Table 1. Clinical and Pathological Features of HCC Patients
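| Variable | | NCI Cohort | Korean Cohort | LCI Cohort | MSH Cohort | INSERM Cohort |
|---|---|---|---|---|---|---|
| Number of patients | | 139 | 100 | 192 | 91 | 57 |
| Gender | male | 102 (73%) | 83 (83%) | 168 (87.5%) | 27 (30%) | 46 (81%) |
| | female | 37 (27%) | 17 (17%) | 24 (12.5%) | 54 (59%) | 11 (19%) |
| | NA | | | | 10 (11%)c | |
| Age | Median | 57 | 55 | 50 | 68 | 66 |
| | range | 19-85 | 25-70 | 21-77 | 42-80 | 18-79 |
| AFPa (>300 ng/ml) | + | 55 (40%) | 32 (32%) | 88 (46%) | 15 (17%) | |
| | − | 73 (52%) | 68 (68%) | 104 (54%) | 54 (59%) | |
| | NA | 11 (8%) | 0 | | 22 (24%) | 57 |
| HBVb | + (%) | 63 (45%) | 58 (58%) | 187 (97.4%) | | 16 (28%) |
| | − (%) | 76 (55%) | 17 (17%) | 5 (2.6%) | | 41 (72%) |
| | NA | | 25 (25%) | | 91 (100%) | |
| Edmondson grade | 1 | 2 (1.4%) | 10 (10%) | 65 (34%) | | |
| | 2 | 57 (41%) | 51 (51%) | 116 (60%) | | |
| | 3 | 74 (53.3%) | 33 (33%) | 9 (5%) | | |
| | 4 | 6 (4.3%) | 6 (6%) | 0 (0%) | | |
| | NA | | | 2 (1%) | 91 | 57 |
| Cirrhosis | Yes | 69 (49.6%) | 64 (64%) | 176 (91.6%) | | |
| | No | 70 (50.4%) | 34 (34%) | 16 (8.4%) | | |
| | NA | 0 | 2 (2%) | 0 | 91 | 57 |
| Child-Pugh class | A | | 92 (92%) | 167 (87%) | | |
| | B | | 4 (4%) | 25 (13%) | | |
| | C | | 4 (4%) | | | |
| | NA | 139 | | | 91 | 57 |
| Vasculature invasion | Yes | | 48 (48%) | 42 (21%) | | |
| | No | | 52 (52%) | 126 (66%) | | |
| | NA | 139 | | 24 (13%) | 91 | 57 |
| AJCC stage | I | | 35 (35%) | 85 (44%) | | |
| | II | | 17 (17%) | 76 (40%) | | |
| | III | | 48 (48%) | 31 (16%) | | |
| | IV | | 0 (0%) | 0 | | |
| | NA | 139 | | | 91 | 57 |
| BCLC stage | 0 | | 0 (0%) | 17 (9%) | | |
| | A | | 53 (53%) | 136 (71%) | | |
| | B | | 37 (37%) | 21 (11%) | | |
| | C | | 6 (6%) | 17 (9%) | | |
| | D | | 4 (4%) | 0 (0%) | | |
| | NA | 139 | | | 91 | 57 |
| Death | | 74 | 43 | 40 | NA | NA |

a Alpha-fetoprotein.
b Hepatitis B virus.
c Gender information is not available for patients who received liver transplantation.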

RNA Isolation, Microarray Experiments, and Gene Expression Data.

For generation of gene expression data from the Korean cohort, total RNA was isolated from tissue samples using a mirVana RNA Isolation labeling kit (Ambion, Austin, TX). Five hundred nanograms of total RNA were used for labeling and hybridization, in accordance with the manufacturer's protocols (Illumina). After the bead chips were scanned with an Illumina BeadArray Reader (Illumina), the microarray data were normalized using the quantile normalization method in the Linear Models for Microarray Data package in the R language environment (http://www.r-project.org).21 The expression level of each gene was log2-transformed for further analysis. Primary microarray data are available from the NCBI GEO public database (accession number GSE16757).
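The normalization step can be illustrated with a minimal Python sketch. It is not the implementation in the R package used in the study: the function name `quantile_normalize` and the toy intensities are hypothetical, each inner list stands for one sample, and tied values are ranked in arbitrary order rather than averaged.

```python
import math

def quantile_normalize(columns):
    """Give every sample (column) the same distribution: the mean of the sorted columns."""
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Mean intensity at each rank across all samples.
    mean_by_rank = [sum(col[r] for col in sorted_cols) / len(columns) for r in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # probe indices, lowest to highest
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = mean_by_rank[rank]
        normalized.append(out)
    return normalized

# Hypothetical raw intensities: three samples, three probes each.
samples = [[2.0, 4.0, 8.0], [8.0, 2.0, 4.0], [4.0, 16.0, 2.0]]
normalized = quantile_normalize(samples)
# Log2 transformation applied after normalization, as described in the text.
log2_values = [[math.log2(v) for v in col] for col in normalized]
```

After normalization every sample contains the same set of values, so differences between samples reflect probe rank rather than array-wide intensity shifts.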

Statistical Analysis.

BRB-ArrayTools were primarily used for statistical analysis of gene expression data22 and all other statistical analyses were performed in the R language environment. We estimated patient prognoses using Kaplan-Meier plots and the log-rank test. Stratification of patients in the NCI cohort according to Seoul National University (SNU) recurrence signature was done as described.18
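As an illustration of the log-rank comparison that accompanies the Kaplan-Meier plots, the following minimal Python sketch computes the two-group log-rank statistic. It is not the R code used in the study: the function name `logrank_test` and the follow-up times are hypothetical, and the P-value is taken from the chi-square distribution with 1 degree of freedom.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_a = expected_a = variance = 0.0
    # Loop over the distinct times at which at least one death occurred.
    for t in sorted({u for u, e, _ in data if e == 1}):
        n = sum(1 for u, _, _ in data if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e == 1)     # deaths at time t
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value

# Hypothetical follow-up in months; event = 1 for death, 0 for censored.
high_times, high_events = [2, 3, 4, 5, 6], [1, 1, 1, 1, 1]
low_times, low_events = [8, 10, 12, 14, 16], [1, 0, 1, 0, 0]
chi2, p_value = logrank_test(high_times, high_events, low_times, low_events)
```

Patients censored before a given death time drop out of the risk set, which is how the definition of OS censoring given above enters the test.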

Receiver-operating characteristic (ROC) curve analyses were carried out to estimate the discriminatory power of the prognostic gene expression signatures and clinical variables. We calculated the area under the curve (AUC), which ranges from 0.5 (for a noninformative predictive marker) to 1 (for a perfect predictive marker), and used a bootstrap method (1,000 resamplings) to calculate the 95% confidence interval (CI) for the AUC.
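The AUC and its bootstrap interval can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis code of the study: the function names, the percentile form of the interval, the fixed random seed, and the toy data are all hypothetical, and a label of 1 is assumed to mean death within 3 years.

```python
import random

def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(scores, labels, n_boot=1000, seed=0):
    """Percentile 95% CI for the AUC from n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the two classes
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]

# Hypothetical risk scores and 3-year outcomes for ten patients.
scores = [9.1, 8.9, 8.7, 8.5, 8.4, 8.2, 8.0, 7.9, 7.6, 7.2]
labels = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
point_estimate = auc(scores, labels)
lower, upper = bootstrap_ci(scores, labels)
```

With only ten patients the interval is wide; the cohorts analyzed here are large enough to give the narrower intervals reported in the Results.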

We used multivariate Cox proportional hazards regression analysis to evaluate independent prognostic factors associated with OS, and as covariates we used the 65-gene risk score, tumor stages, and pathologic characteristics.23 P < 0.05 indicated statistical significance and all statistical tests were two-tailed. A heatmap of gene expression was generated using Cluster and TreeView software.24 GoMiner was used to group genes on the basis of their gene ontology (GO) characteristics.25

Development and Validation of a 65-Gene Risk Scoring System.

To generate a risk score, we adopted a previously developed strategy using the Cox regression coefficient of each gene among a 65-gene set from the NCI cohort.26 The risk score for each patient was derived by multiplying the expression level of each gene by its corresponding coefficient and summing the products over all 65 genes (risk score = Σ [Cox coefficient of gene Gi × expression value of gene Gi]). The patients were then dichotomized into high-risk and low-risk groups using the 50th percentile (median) of the risk score as the threshold value. The median risk score in the NCI cohort was 8.36. The coefficients and the threshold value (8.36) derived from the NCI cohort were directly applied to gene expression data from the Korean, LCI, MSH, and INSERM cohorts to divide the rest of the patients into high-risk and low-risk groups. Gene expression data and the master prediction model are available as Supporting Data 1.
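The scoring rule can be illustrated with a short Python sketch. It is a toy example, not the master prediction model in Supporting Data 1: only three of the 65 coefficients from Table 2 are used, the expression values are hypothetical, and treating a score strictly above the threshold as high risk is an assumption.

```python
import statistics

def risk_score(expression, coefficients):
    """Sum of (Cox coefficient x log2 expression value) over the signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def classify(score, threshold):
    """Assumed rule: a score strictly above the training threshold is high risk."""
    return "high" if score > threshold else "low"

# Three coefficients from Table 2; the full score uses all 65 genes.
coefficients = {"F3": 0.788, "RALA": 0.878, "SLC2A2": -0.38}

# Hypothetical log2 expression values for a toy training set.
training = [
    {"F3": 8.0, "RALA": 9.0, "SLC2A2": 12.0},
    {"F3": 10.0, "RALA": 10.5, "SLC2A2": 7.0},
    {"F3": 9.0, "RALA": 9.5, "SLC2A2": 10.0},
]
training_scores = [risk_score(p, coefficients) for p in training]
threshold = statistics.median(training_scores)  # plays the role of 8.36 in the NCI cohort

# The training coefficients and threshold are applied unchanged to a new patient.
new_patient = {"F3": 10.5, "RALA": 10.0, "SLC2A2": 8.0}
label = classify(risk_score(new_patient, coefficients), threshold)
```

Because the coefficients and threshold are fixed in the training cohort, scoring a patient from another cohort requires only that patient's expression values for the signature genes.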

Results

Sixty-five Gene Expression Signature in HCC and Development of the 65-Gene Risk Score.

To identify a limited number of genes whose expression pattern is significantly associated with the prognosis of HCC, we used two previously identified gene expression signatures. The NCI proliferation signature (1,016 gene features) was identified when two major clusters of HCC patients were uncovered by the hierarchical clustering method and the signature was found to be significantly associated with OS and recurrence-free survival (RFS).13, 15, 16 The Seoul National University (SNU) recurrence signature (628 gene features) was developed to predict the likelihood of recurrence after surgical treatment of HCC.18 We hypothesized that the genes present in both signatures would be better predictors than genes present in only one signature, and that the expression patterns of these shared genes would therefore be sufficient to predict the prognosis of HCC patients. When the two gene lists were compared with each other, only 65 genes overlapped (Fig. 1A).

Figure 1.

Stratification of HCC patients in the NCI cohort with a 65-gene risk score. (A) Venn diagram of gene lists from two independently generated prognostic expression signatures. (B) Risk scores in the NCI cohort. Each bar represents the risk score for an individual patient. (C) Kaplan-Meier plots of the two subgroups in the NCI cohort stratified by risk score. (D) Kaplan-Meier plots of the two subgroups in the NCI cohort stratified by NCI proliferation signature. (E) Kaplan-Meier plots of the two subgroups in the NCI cohort stratified by SNU recurrence signature. OS, overall survival.

In order to develop a new risk assessment model for prognosis with 65 genes, we adopted a previously developed strategy that generates the risk score using the Cox regression coefficient of each gene in the prognostic signature.26 The risk score for each patient was calculated using the regression coefficient of each gene in the 65-gene signature (Table 2). HCC patients in the NCI cohort were then dichotomized into a high-risk and a low-risk group for death using the 50th percentile cutoff (8.36) of the risk score as the threshold value (Fig. 1B). The OS rates were significantly lower in the patient group with the high risk score (P = 1.0 × 10−4 by the log-rank test; Fig. 1C). When the predicted outcomes of the new, reduced model were compared with those from the original prognostic signatures, the statistical significance of the 65-gene risk score in discriminating between HCC patients with different prognoses was similar to that of the two original gene expression signatures (Fig. 1D,E). We also assessed the predictive performance of the three prognostic models for 3-year OS by calculating AUCs from ROC analysis. Not surprisingly, the AUC of the 65-gene risk score (0.68; 95% CI, 0.604-0.761) was highly similar to those of the original prognostic models (Supporting Fig. 1). This result strongly suggests that the expression patterns of the 65 genes are sufficient to predict the prognosis of HCC patients, although this gene set represents only 5.8% of genes in the NCI proliferation signature and 10.3% of genes in the SNU recurrence signature.

Table 2. Regression Coefficients of 65 Genes from Univariate Cox Regression Analysis
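| Gene | Coefficient | SE | Z-score | P-value | HR | HR 95% CI |
|---|---|---|---|---|---|---|
| ACSL5 | −0.349 | 0.12 | −2.9 | 0.0037 | 0.706 | 0.558-0.893 |
| ADH1B | −0.401 | 0.193 | −2.08 | 0.038 | 0.67 | 0.459-0.978 |
| ADH6 | −0.258 | 0.169 | −1.53 | 0.13 | 0.772 | 0.554-1.08 |
| ALDOA | 0.226 | 0.146 | 1.55 | 0.12 | 1.25 | 0.942-1.67 |
| APOC3 | −0.297 | 0.0885 | −3.36 | 0.00079 | 0.743 | 0.625-0.884 |
| AQP9 | −0.269 | 0.066 | −4.08 | 4.50E-05 | 0.764 | 0.671-0.87 |
| ARPC2 | 0.242 | 0.207 | 1.17 | 0.24 | 1.27 | 0.849-1.91 |
| BPHL | −0.645 | 0.215 | −2.99 | 0.0028 | 0.525 | 0.344-0.8 |
| C1orf115 | −0.519 | 0.128 | −4.06 | 4.90E-05 | 0.595 | 0.463-0.764 |
| C4BPB | −0.353 | 0.0873 | −4.04 | 5.30E-05 | 0.703 | 0.592-0.834 |
| CDO1 | −0.347 | 0.101 | −3.43 | 6.00E-04 | 0.706 | 0.579-0.862 |
| CHI3L1 | −0.195 | 0.0652 | −2.99 | 0.0028 | 0.823 | 0.724-0.935 |
| COBLL1 | −0.389 | 0.13 | −3 | 0.0027 | 0.677 | 0.525-0.874 |
| CRAT | −0.772 | 0.192 | −4.02 | 5.80E-05 | 0.462 | 0.317-0.673 |
| CRYL1 | −0.439 | 0.112 | −3.94 | 8.10E-05 | 0.644 | 0.518-0.802 |
| CTSC | 0.0338 | 0.139 | 0.244 | 0.81 | 1.03 | 0.788-1.36 |
| CXCR4 | 0.296 | 0.175 | 1.69 | 0.091 | 1.34 | 0.954-1.89 |
| CYB5A | −0.255 | 0.102 | −2.5 | 0.012 | 0.775 | 0.634-0.946 |
| CYP27A1 | −0.38 | 0.114 | −3.35 | 0.00081 | 0.684 | 0.547-0.854 |
| CYP2J2 | −0.613 | 0.165 | −3.72 | 2.00E-04 | 0.541 | 0.392-0.748 |
| CYP4F12 | −0.198 | 0.125 | −1.59 | 0.11 | 0.82 | 0.643-1.05 |
| DDIT4 | 0.305 | 0.141 | 2.17 | 0.03 | 1.36 | 1.03-1.79 |
| EPHX2 | −0.382 | 0.154 | −2.48 | 0.013 | 0.682 | 0.504-0.923 |
| ETV5 | 0.228 | 0.181 | 1.26 | 0.21 | 1.26 | 0.881-1.79 |
| F10 | −0.325 | 0.0858 | −3.79 | 0.00015 | 0.722 | 0.61-0.855 |
| F3 | 0.788 | 0.174 | 4.54 | 5.70E-06 | 2.2 | 1.57-3.09 |
| F5 | −0.312 | 0.0946 | −3.3 | 0.00096 | 0.732 | 0.608-0.88 |
| GJB1 | −0.421 | 0.096 | −4.39 | 1.10E-05 | 0.656 | 0.544-0.792 |
| GPHN | −0.521 | 0.229 | −2.27 | 0.023 | 0.594 | 0.379-0.93 |
| HN1 | 0.414 | 0.147 | 2.82 | 0.0047 | 1.51 | 1.14-2.02 |
| HNF4A | −0.567 | 0.213 | −2.66 | 0.0078 | 0.567 | 0.373-0.862 |
| IGFBP3 | 0.0252 | 0.0985 | 0.256 | 0.8 | 1.03 | 0.846-1.24 |
| IQGAP1 | 0.34 | 0.186 | 1.83 | 0.068 | 1.4 | 0.976-2.02 |
| IQGAP2 | −0.539 | 0.134 | −4.02 | 5.90E-05 | 0.583 | 0.449-0.759 |
| ITPR2 | −0.632 | 0.186 | −3.4 | 0.00067 | 0.531 | 0.369-0.765 |
| KHK | −0.514 | 0.14 | −3.66 | 0.00025 | 0.598 | 0.455-0.787 |
| LAMB1 | 0.53 | 0.147 | 3.6 | 0.00032 | 1.7 | 1.27-2.27 |
| LECT2 | −0.138 | 0.0702 | −1.97 | 0.049 | 0.87 | 0.759-1.0 |
| MST1 | −0.312 | 0.0882 | −3.54 | 0.00041 | 0.732 | 0.616-0.87 |
| MTSS1 | −0.435 | 0.118 | −3.7 | 0.00022 | 0.647 | 0.514-0.815 |
| PAH | −0.381 | 0.106 | −3.58 | 0.00034 | 0.683 | 0.555-0.842 |
| PFKFB3 | 0.521 | 0.168 | 3.11 | 0.0019 | 1.68 | 1.21-2.34 |
| PKLR | −0.386 | 0.12 | −3.22 | 0.0013 | 0.68 | 0.537-0.86 |
| PKM2 | 0.358 | 0.126 | 2.85 | 0.0044 | 1.43 | 1.12-1.83 |
| PLG | −0.256 | 0.0753 | −3.4 | 0.00067 | 0.774 | 0.668-0.897 |
| PLOD2 | 0.0948 | 0.134 | 0.709 | 0.48 | 1.1 | 0.846-1.43 |
| PPT1 | 0.134 | 0.18 | 0.746 | 0.46 | 1.14 | 0.804-1.63 |
| RALA | 0.878 | 0.23 | 3.81 | 0.00014 | 2.41 | 1.53-3.78 |
| RGN | −0.392 | 0.0965 | −4.07 | 4.80E-05 | 0.675 | 0.559-0.816 |
| RGS1 | 0.255 | 0.111 | 2.3 | 0.021 | 1.29 | 1.04-1.6 |
| RGS2 | 0.268 | 0.0937 | 2.86 | 0.0043 | 1.31 | 1.09-1.57 |
| RNASE4 | −0.258 | 0.147 | −1.76 | 0.079 | 0.772 | 0.579-1.03 |
| SERPINA10 | −0.391 | 0.143 | −2.74 | 0.0061 | 0.676 | 0.511-0.894 |
| SERPINC1 | −0.228 | 0.0601 | −3.79 | 0.00015 | 0.796 | 0.708-0.896 |
| SERPINF2 | −0.352 | 0.086 | −4.1 | 4.20E-05 | 0.703 | 0.594-0.832 |
| SFTPC | −0.269 | 0.183 | −1.48 | 0.14 | 0.764 | 0.534-1.09 |
| SLC22A7 | −0.476 | 0.123 | −3.87 | 0.00011 | 0.621 | 0.488-0.79 |
| SLC2A2 | −0.38 | 0.0736 | −5.17 | 2.30E-07 | 0.684 | 0.592-0.79 |
| SLC30A1 | −0.337 | 0.144 | −2.34 | 0.019 | 0.714 | 0.539-0.946 |
| SLC38A1 | 0.184 | 0.141 | 1.3 | 0.19 | 1.2 | 0.911-1.59 |
| SPHK1 | 0.356 | 0.153 | 2.33 | 0.02 | 1.43 | 1.06-1.93 |
| SULT2A1 | −0.351 | 0.087 | −4.04 | 5.40E-05 | 0.704 | 0.593-0.835 |
| TBX3 | −0.294 | 0.15 | −1.95 | 0.051 | 0.745 | 0.555-1.0 |
| TM4SF1 | 0.321 | 0.104 | 3.07 | 0.0021 | 1.38 | 1.12-1.69 |
| TSPAN3 | 0.416 | 0.184 | 2.26 | 0.024 | 1.52 | 1.06-2.17 |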
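The columns of Table 2 are related by the standard Wald formulas for a Cox coefficient: Z = coefficient/SE, HR = exp(coefficient), and 95% CI = exp(coefficient ± 1.96 × SE). The minimal Python sketch below recomputes one row from its coefficient and SE; the function name `wald_summary` is hypothetical, and small differences from the tabulated values are expected because the published coefficients are rounded.

```python
import math

def wald_summary(coef, se, z_crit=1.959964):
    """Return Z-score, two-sided P-value, hazard ratio, and 95% CI bounds."""
    z = coef / se
    # Two-sided P-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    hr = math.exp(coef)
    lower = math.exp(coef - z_crit * se)
    upper = math.exp(coef + z_crit * se)
    return z, p_value, hr, lower, upper

# Coefficient and SE for F3 as listed in Table 2.
z, p_value, hr, lower, upper = wald_summary(0.788, 0.174)
```

For F3 this gives a hazard ratio of about 2.2 with a confidence interval of roughly 1.56 to 3.09, in line with the tabulated row.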

To test whether genes not shared by the two prognostic signatures have similar discriminatory power, two additional risk scores were generated from 65 genes that were randomly selected from the nonoverlapping gene lists in each prognostic signature and applied to the NCI and SNU cohorts. As expected, the NCI proliferation signature risk score showed significant predictive performance on patients in the NCI cohort (Supporting Fig. 2B). However, it failed to show significant predictive performance on patients in the SNU cohort (Supporting Fig. 2C). The SNU recurrence signature risk score also showed opposite predictive performance on patients from the two different cohorts (Supporting Fig. 2B,C). However, the common gene risk score showed consistent predictive performance on patients from both cohorts. These data suggest that genes shared by two independent prognostic signatures might be more robust than those present in only one signature.

Validation of the 65-Gene Risk Score.

We next sought to validate the risk score using expression data of the 65 genes from an independent HCC cohort. Gene expression data for 100 tumors from Korean patients with HCC were collected and used as an independent test set. The coefficients and threshold value (8.36) derived from the NCI cohort were directly applied. When patients in the Korean cohort were stratified according to their risk score, the patient group with a low risk score had a significantly better prognosis (P = 5.6 × 10−5 for OS, log-rank test) (Fig. 2A) than patients with a high risk score. The risk score was further validated in another independent cohort (LCI cohort, P = 5.0 × 10−4 for OS, log-rank test) (Fig. 2B). Taken together, these results demonstrate that it is possible to determine a risk score on the basis of the expression of a small number of genes.

Figure 2.

OS of HCC patients stratified by risk score in the Korean and LCI HCC cohorts. HCC patients in the Korean cohort (A) and LCI cohort (B) were stratified by the 65-gene risk score. OS, overall survival.

Sixty-five-Gene Risk Score Is an Independent Risk Factor for OS.

We next combined clinical data from the two test cohorts and assessed the prognostic association between our newly developed 65-gene risk score and other known clinical risk factors using univariate Cox regression analyses. In addition to the alpha-fetoprotein (AFP) level, tumor size, grade, and vasculature invasion, which are already well-known risk factors, the risk score was a significant indicator for OS (Table 3). We then included all relevant clinical variables in a multivariate Cox regression analysis. Importantly, the risk score remained a significant prognostic risk factor (hazard ratio [HR] 1.36, 95% CI 1.13-1.64, P = 0.001 for OS) (Table 3).

Table 3. Univariate and Multivariate Cox Proportional Hazard Regression Analyses of Clinical Variables Associated with Overall Survival of HCC Patients in Validation Cohort
| Variable | Univariate Hazard Ratio (95% CI) | Univariate P-value | Multivariate Hazard Ratio (95% CI) | Multivariate P-value |
|---|---|---|---|---|
| Gender (M or F) | 1.22 (0.63-2.37) | 0.54 | 0.87 (0.43-1.76) | 0.71 |
| Age (>60) | 1.15 (0.7-1.89) | 0.57 | 1.27 (0.74-2.15) | 0.38 |
| AFP (>300 ng/mL) | 1.9 (1.23-2.93) | 0.003 | 1.63 (1.0-2.59) | 0.04 |
| Tumor size (quintiles) | 1.57 (1.32-1.88) | 3.1 × 10−7 | 1.41 (1.16-1.71) | 4.0 × 10−4 |
| Grade (1,2,3,4) | 1.34 (0.99-1.82) | 0.05 | 0.95 (0.69-1.33) | 0.79 |
| Vasculature invasion (Y or N) | 2.07 (1.33-3.22) | 0.001 | 1.35 (0.85-2.16) | 0.19 |
| Risk score (quintiles) | 1.53 (1.28-1.82) | 2.2 × 10−6 | 1.36 (1.13-1.64) | 0.001 |

AFP, alpha-fetoprotein.

We next carried out ROC analysis to assess predictive performance of 3-year OS of 65-gene risk scores in a pooled test cohort and compared it with other clinical variables that showed significance in univariate analysis (tumor size, vasculature invasion, grade, and AFP). The AUC of risk score (0.699; 95% CI, 0.636-0.764) is very close to that of tumor size (0.691; 95% CI, 0.628-0.755), the most significant clinical variable in univariate analysis (Fig. 3). Taken together, these findings suggest that the risk score retains its prognostic relevance even after the classical clinicopathological prognostic features have been taken into account.

Figure 3.

Comparison of ROC curves of clinical variables and risk score in validation cohorts. Clinical variables and the 65-gene risk score were applied to patients in pooled validation cohorts and their prognostic significance was estimated by AUC from ROC analysis for 3-year OS. AUC: area under the curve, CI: 95% confidence interval of AUC. TS: tumor size, VI: vasculature invasion. RS: risk score, AFP: alpha-fetoprotein.

We further tested the independence of the risk score over current staging systems. When the risk score was applied to patients with early stage (BCLC stage A) and intermediate and advanced stage (BCLC stage B and C) HCC, it successfully identified high-risk patients in different BCLC stages (Fig. 4). The risk score was also independent of American Joint Committee on Cancer (AJCC) stages (Supporting Fig. 3). We next tested whether the new risk score can improve the discrimination of prognosis over BCLC stages. Performance of the combined model (BCLC and risk score) was substantially improved over the baseline models with only BCLC or only the risk score, as evidenced by an increase in AUC from ROC analysis (Supporting Fig. 4A). Moreover, subset ROC analysis within each BCLC stage clearly demonstrated an incremental value of the risk score over the current staging system (Supporting Fig. 4B).

Figure 4.

Kaplan-Meier plots of OS of HCC patients stratified by BCLC stages and risk score. Patients were stratified by BCLC stage (A) or risk score (B,C). P-values were obtained from the log-rank test.

Because vasculature invasion is the clinical variable best known to be significantly associated with OS of HCC after surgical resection,27–31 we next tested how independent the new risk score is of vasculature invasion. As expected, the prognosis of patients without vasculature invasion was significantly better than that of patients with invasion (Supporting Fig. 5A). When the risk score-based stratification was applied separately to invasion-positive and -negative patients, it successfully identified high-risk patients in both subgroups (Supporting Fig. 5B,C). Importantly, when all stratifications were combined, the risk score even identified patients without vasculature invasion whose prognosis was worse than or similar to that of patients with invasion (Supporting Fig. 5D).

We next examined the potential association of risk score with underlying liver disease by including Child-Pugh class and cirrhosis information into the analysis. As expected, Edmondson grade reflecting pathological characteristics of tumors showed an incremental association with risk score. The number of patients with a high risk score is slightly increased in higher grades. However, indices for underlying liver disease lack any association with risk score (Supporting Table 1), indicating that risk score does not reflect biological characteristics associated with underlying liver disease.

Molecular Characteristics of HCC Associated with 65-Gene Risk Score.

We grouped the 65 genes in the risk score by GO category to summarize the biological characteristics of the risk score. Not surprisingly, genes involved in signal transduction are enriched among those whose expression is positively associated with poor prognosis (high risk genes, Supporting Table 2), whereas genes associated with normal metabolic functions of the liver are enriched among low risk genes (Supporting Table 3).

In addition, we used gene expression data from the MSH cohort, for whom many biological characteristics are available.11 Ninety-one patients from the MSH cohort were stratified according to risk score by applying the coefficients and threshold value (8.36) derived from the NCI cohort. All three signaling events (phosphorylation) examined in the previous study with the MSH cohort were significantly associated with the risk score (Supporting Table 4). We found that a high risk score was significantly associated with enriched phosphorylation of AKT (P = 0.003, χ2 test), IGFR1 (P = 2.2 × 10−4, χ2 test), and RPS6 (P = 3.6 × 10−5, χ2 test). Mutation of TP53 was not associated with the risk score (P = 0.93), whereas a high frequency of mutations of CTNNB1 (beta-catenin) was significantly associated with a low risk score (23/27 mutations, P = 0.05, χ2 test). To validate the association between risk score and CTNNB1 mutations in HCC, patients in the INSERM cohort (n = 57) were stratified by risk score using the same cutoff threshold of 8.36.9 Of 17 HCC tumors with CTNNB1 mutations, 16 were in the low-risk group, and this association was statistically significant (Supporting Table 5; P = 0.015, χ2 test).

Discussion

By applying a multistep exploration and validation strategy (Supporting Fig. 6), we identified and validated a risk score based on expression patterns of 65 genes that can easily quantify the likelihood of OS in HCC patients who have undergone surgical resection as the primary treatment.

Several lines of evidence strongly support that the risk score is an independent and significant predictor of prognosis. First, the risk score was a significant predictive factor for OS in the combined validation cohort in multivariate analysis (Table 3). Second, the risk score can identify high-risk patients both among those with early stage HCC (BCLC stage A) and among those with intermediate or advanced stage HCC (BCLC stage B and C) (Fig. 4). The strength and independence of the risk score over the current staging systems remained significant even when the AJCC staging system was applied (Supporting Fig. 3). Third, the risk score identified poor-prognosis patients among those without vasculature invasion, who are typically considered to have a good prognosis (Supporting Fig. 5). Fourth, the risk score was the most significant predictor of 3-year survival of patients in ROC analysis (Fig. 3). Taken together, these results strongly support the notion that the risk score identifies clinical characteristics significantly associated with the prognosis of HCC that are not recognized by current staging criteria.

Although it is interesting to see that the risk score reflects biological characteristics (Supporting Table 4), these associations need to be validated in future studies. For example, activation of AKT is the most commonly altered signaling event in many cancers and many genetic alterations lead to activation of AKT.32 Thus, it is currently uncertain whether AKT is the driver of tumor development in patients with a high risk score and whether it would be a potential therapeutic target for these patients. However, the significant association of risk score with CTNNB1 mutations is in good agreement with the results of previous studies demonstrating a significant correlation between CTNNB1 mutations and a favorable prognosis among patients with HCC.33, 34 Moreover, TBX3, one of the canonical downstream target genes of CTNNB1,35 was included in our 65-gene signature, and its expression was associated with a better prognosis, which strongly supports the activation of CTNNB1 in the low-risk group in all HCC patients examined. It is also noteworthy that the risk score does not reflect the status of underlying liver disease, indicating that there might be room for improvement. A previous study identified a prognostic gene expression signature from surrounding nontumor tissues of patients with HCC that better reflects biological characteristics of underlying liver disease than tumors.12 The risk score might be improved by incorporating genomic data from surrounding tissues that do not overlap with but are complementary to those from tumor tissues.

Classification of human cancers into more homogenous clinical groups such as stages and grades significantly improved the treatment of patients by standardizing patient care. Molecular classification of cancers further improved patient care by enabling the development of treatments tailored to the abnormalities present in each patient's cancer cells. Currently, decision-making for HCC treatment in the clinical setting is mainly based on clinical data, which is best reflected in BCLC staging and its associated treatment algorithm.2 However, this staging method offers little or no information about the biological characteristics of HCC that would be critical for tailored treatment in the future. Importantly, the risk score may provide clues to biological characteristics of tumors (i.e., activation of CTNNB1) as well as prognostic characteristics. Thus, it would provide an opportunity for developing rationalized clinical trials based on the molecular characteristics of tumors that are supplemental to current staging systems. Because our data showed that a small number of genes (65 genes) is sufficient to identify patients with a poor prognosis (Supporting Fig. 1), it opens up the possibility that simpler and more accessible clinical technologies, such as quantitative reverse transcription polymerase chain reaction, can replace complicated microarray technologies to develop easy-to-use prognostic models with small samples from biopsies.

Our current stratification strategy is limited by its assumption that there are two major prognostic HCC subgroups. Although this assumption is largely supported by the results of previous studies,10, 12, 13, 15, 16, 18 we cannot rule out the possibility that there are more than two prognostic groups of HCC patients, given the genetic heterogeneity of the disease. However, because our method generates continuous risk scores, it is easy to adjust cutoff criteria to restratify HCC patients according to the degree of genetic heterogeneity. Future studies should clarify this result.

In conclusion, the use of a risk score as defined by an expression pattern of 65 genes can identify HCC patients with poorer prognosis in a reliable and reproducible manner across independent patient cohorts. However, because of heterogeneity in ethnic backgrounds and potential differences in patient care among hospitals, the conclusions of the current study should be validated in a larger, independent cohort. Moreover, at present it is unclear whether the risk score offers information about the potential benefits of adjuvant therapies after surgical resection. Thus, prospective validation using tissues from patients who have received adjuvant therapies is necessary in future studies, with proper incorporation of analyses to correlate the risk score with underlying liver diseases, identify patterns of recurrence, and determine the impact of subsequent therapies.
