
Tumor stroma reaction-related gene signature predicts clinical outcome in human hepatocellular carcinoma


To whom correspondence should be addressed. E-mail: jiafan99@yahoo.com


The tumor stroma is a source of many potential new tumor biomarkers, among which the immune response is of major importance. Little is known regarding how changes in stromal gene expression affect hepatocellular carcinoma (HCC) progression. We analyzed the expression of 28 stroma-related genes using quantitative RT-PCR in 122 HCC samples, and related the results to patient prognosis. Hierarchical clustering based on expression of all 28 genes, or of six Th1 cell markers, classified HCC patients into prognostically different subgroups. We further identified a five-gene classifier (protective genes IRF1 and GZMB and risk genes CRTH2, VEGF, and MMP7) as a significant independent prognosticator for recurrence (hazard ratio [HR], 4.80; 95% confidence interval [CI], 2.31–9.96; P = 2.6 × 10⁻⁵) by multivariate analyses. Importantly, the classifier was further validated in an independent set of 172 HCC samples (HR, 2.21; 95% CI, 1.20–3.00; P = 0.002). The predictive ability of the classifier, as measured by area under the curve (0.713 and 0.613 for original and validation cohorts, respectively), was comparable to that of vascular invasion, Barcelona Clinic Liver Cancer stage, and TNM stage. The densities of various tumor stromal cells were analyzed by immunostaining. Comparing the immunostaining and gene expression data showed significant association of the gene classifier with the amount of reactive stroma in both cohorts. Thus, the signature reveals the strong prognostic capacity of immune responses, angiogenic activity, and ECM remodeling, highlighting the importance of stromal biology in HCC progression. This novel predictor may contain targets suitable for new therapeutic interventions, or for use as independent prognosticators. (Cancer Sci 2011; 102: 1522–1531)

Hepatocellular carcinoma is the fifth most common cancer worldwide by annual incidence and the third leading cause of cancer death.(1) Surgical resection provides an opportunity for cure, but the outcome remains grave mainly due to frequent tumor recurrence. Over 80% of postoperative recurrence occurs in the liver remnant, which can be either intrahepatic metastasis from the primary tumor or de novo multicentric tumor. It is suggested that early recurrence is most likely the consequence of occult metastasis from the initial tumor, whereas late recurrence more likely represents multicentric tumors.(2)

Recently, late recurrence has been shown to be associated with the gene expression signature of non-tumoral liver tissue adjacent to the primary tumor.(3–5) In contrast, clinicopathologic factors and molecular features of the primary tumor are suggested to be associated with early recurrence. Several specific gene expression signatures derived from the cancerous tissues of HCC have been identified for predicting early recurrence due to intrahepatic metastasis.(6,7) Nonetheless, to date, these cancer cell-oriented predictive systems are not superior to morphological classification, share no overlapping predictor genes, and include few disease-related genes.(8,9) The stromal cells play a critical role in tumor development, tumor control, and response to treatment. It should be noted that features derived from these non-neoplastic bystander cells are much more stable than those derived from tumor cells, which makes stroma-based predictive systems applicable across different cancer types. One example is that the intratumoral density of lymphocytes, vascular endothelial cells, and fibroblasts is of strong prognostic significance in various solid tumors, including HCC and colorectal and breast cancers.(10–12) Another example is that stroma-derived signatures from breast cancer,(13) colorectal cancer,(14–16) and lymphomas(17) contained disease-related genes with functional overlaps. Of particular importance, an inflammation gene signature in peritumor tissue that predicts venous metastasis in HCC was successfully reproduced in lung cancer.(18,19) Notably, HCC, a hypervascular malignancy, usually arises in inflamed fibrotic or cirrhotic liver characterized by inflammatory cell infiltration and tissue remodeling,(20) and the liver itself is an immunological organ.(21) Thus, the immune status and stromal reactions at the tumor site can largely influence the biological behavior of HCC. However, the exact roles of the stromal responses in human HCC are still unclear.

We carried out a systematic expression analysis, using real-time quantitative RT-PCR, of stroma-related genes that were suggested to be involved in HCC progression. Our primary aim was to infer prognostic information from stromal responses in HCC. A secondary aim was to investigate whether the prognostic immune-specific gene profile in colorectal cancer(14–16) could be reproduced in HCC. We hypothesized that the resulting gene expression signature might provide a molecular gauge for the presence and significance of the signals emanating from tumor–stroma interactions in HCC. We ultimately identified a five-gene signature, which we tested for its ability to predict HCC recurrence from the initial tumor in an independent group of patients.

Materials and Methods

Patients and clinical assessment.  Using computer-generated random numbers, frozen specimens from two independent cohorts of HCC patients were taken from our prospectively established tissue bank for gene expression analysis. The original cohort consisted of 122 patients treated between February 2002 and November 2005. The validation cohort consisted of 172 patients treated from January 2002 to December 2006. The inclusion and exclusion criteria were described previously:(22,23) (i) distinctive pathologic diagnosis of HCC; (ii) no anticancer treatment or distant metastasis before surgery; (iii) primary and curative resection, defined as macroscopically complete removal of the tumor; and (iv) complete clinicopathologic and follow-up data. In addition, patients were only included if tumor RNA was of sufficient quality (28S/18S ratio >1.5 and A260/A280 ratio between 1.9 and 2.1, as we described previously).(24) Tumor staging was according to the American Joint Committee on Cancer/International Union Against Cancer TNM (6th edition)(25) and BCLC(26) staging systems. Tumor differentiation was graded by the Edmondson-Steiner grading system. Liver function of all patients was classified as Child–Pugh class A. Post-surgical surveillance, adjuvant therapy, and treatment modalities after relapse were given according to a uniform guideline as previously described.(22,23,27) In brief, all patients were followed up every 2–6 months after operation until March 2010. Serum AFP level and liver ultrasonography were assessed at each visit by independent doctors without knowledge of the study. A CT scan or MRI was carried out every 6 months. If metastasis or recurrence was suspected, CT or MRI was done immediately. Time to recurrence and OS were calculated from the date of surgery to the date of the first recurrence and death, respectively. Data were censored at the last follow-up for patients without relapse or death.

The median duration of follow-up was 50.0 (range, 2.0–97.0) and 39.0 (range, 2.0–89.0) months for the original and validation cohorts, respectively. Detailed clinicopathologic features are provided in Table 1. Informed consent was obtained from each patient, and the study was approved by the institutional review board of Zhongshan Hospital (Shanghai, China). The set-up of the study is provided in Figure 1(A). Using 24 months as the cut-off, intrahepatic recurrences were divided into early and late recurrence.(2) We specifically regarded late intrahepatic recurrence as a multicentric new tumor, and consequently treated it as censored data in analyzing TTR. In contrast, extrahepatic recurrence was considered more likely to be true metastasis from the primary tumor. This may provide a simple basis for distinguishing a true HCC relapse from de novo hepatocarcinogenesis: extrahepatic and early intrahepatic recurrences represent true relapse, whereas late intrahepatic recurrence represents a de novo tumor (Fig. 1B).
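The recurrence rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Recurrence` record, its field names, and the `ttr_event` helper are hypothetical, and censoring a late intrahepatic recurrence at the time it was detected is our reading of the text rather than a documented detail of the analysis.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

EARLY_CUTOFF_MONTHS = 24.0  # cut-off between early and late recurrence, from the text


@dataclass
class Recurrence:
    # Hypothetical record layout; field names are illustrative only.
    months_to_recurrence: Optional[float]  # None if no recurrence was observed
    site: Optional[str]                    # "intrahepatic", "extrahepatic", or "both"
    last_followup_months: float


def ttr_event(r: Recurrence) -> Tuple[float, int]:
    """Return (time in months, event flag) for a TTR analysis.

    event = 1 for a true relapse (extrahepatic or early intrahepatic),
    event = 0 for censored observations.
    """
    if r.months_to_recurrence is None:
        # No relapse: censored at last follow-up.
        return (r.last_followup_months, 0)
    if r.site == "intrahepatic" and r.months_to_recurrence > EARLY_CUTOFF_MONTHS:
        # Late intrahepatic recurrence: treated as a multicentric new tumor.
        # Assumption: censored at the time the new tumor was detected.
        return (r.months_to_recurrence, 0)
    # Extrahepatic (with or without intrahepatic) or early intrahepatic recurrence.
    return (r.months_to_recurrence, 1)
```

For instance, an intrahepatic recurrence at 30 months would be censored, whereas one at 12 months, or any extrahepatic recurrence, would count as an event.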

Table 1.   Clinicopathologic characteristics of the original and validation cohorts of patients with hepatocellular carcinoma
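| Variables | Original cohort, RS low | Original cohort, RS high | P-value | Validation cohort, RS low | Validation cohort, RS high | P-value |
|---|---|---|---|---|---|---|
| No. of patients | 61 | 61 | | 85 | 87 | |
| Age (years), median (range) | 53 (28–75) | 50 (32–75) | 0.07* | 53 (18–83) | 51 (24–75) | 0.24* |
| Female | 7 | 11 | | 19 | 11 | |
| Hepatitis history† | | | | | | |
| No | 6 | 4 | | 6 | 6 | |
| AFP (ng/mL), median (range) | 48 (1–60 500) | 270 (1–60 500) | 0.07* | 119 (0–60 500) | 165 (2–60 500) | 0.69* |
| ALT (U/L), median (range) | 43 (6–696) | 35 (10–271) | 0.60* | 33 (96–724) | 41 (12–540) | 0.96* |
| γGT (U/L), median (range) | 68 (8–421) | 70 (9–811) | 0.66* | 56 (14–405) | 71 (18–425) | 0.31* |
| Liver cirrhosis | | | | | | |
| Yes | 57 | 47 | | 73 | 78 | |
| Tumor size (cm) | | | | | | |
| >5 | 29 | 32 | | 45 | 44 | |
| Complete | 32 | 37 | | 43 | 45 | |
| Tumor number | | | | | | |
| Multiple | 13 | 15 | | 21 | 25 | |
| Vascular invasion | | | | | | |
| No | 39 | 29 | | 55 | 42 | |
| III–IV | 26 | 35 | | 31 | 40 | |
| TNM stage | | | | | | |
| II | 20 | 18 | | 24 | 36 | |
| IIIA | 8 | 19 | | 16 | 19 | |
| BCLC stage | | | | | | |
| B | 21 | 22 | | 18 | 16 | |
| C | 11 | 18 | | 21 | 34 | |
| Intra- and extrahepatic | 2 | 2 | | 6 | 9 | |
| Intrahepatic, ≤24 months | 7 | 26 | | 15 | 31 | |
| Intrahepatic, >24 months | 12 | 6 | | 12 | 8 | |
| No recurrence | 38 | 22 | | 47 | 29 | |

*P-value calculated using Student’s t-test. **P-value calculated by Fisher’s exact test. †There were four and three HCV+HBV+ cases in the original and validation cohorts, respectively. All other hepatitis positive cases were HBV+HCV. AFP, alpha fetoprotein; ALT, alanine aminotransferase; BCLC, Barcelona Clinic Liver Cancer; γGT, gamma glutamyl transpeptidase; RS, recurrence score.
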
Figure 1.

 Data analysis schematic (A) and determination of a true recurrence versus a multicentric occurrence (B) of hepatocellular carcinoma. Mon, months.

Gene selection and real-time RT-PCR.  Gene selection involved two steps. First, a group of genes related to antitumor immunity, immunosuppression, and inflammation was analyzed in colorectal cancer, with the identification of a prognostic Th1 immune profile.(14–16) This group of genes was included. Second, we included additional genes described as modulators of inflammation, angiogenic process, and ECM remodeling. Thus, a total of 28 genes were selected. Two primer sets were designed: TRAV24, forward 5′-AGA AAG GAC GAA TAA GTG CCA-3′, reverse 5′-CAG GGT CGG GTT CTG GATA-3′; and CD45RO, forward 5′-AGA AAG GAC GAA TAA GTG CCA-3′, reverse 5′-CAG GGT CAG GGT TCT GGA TA-3′. All other primers were purchased from SABiosciences (Frederick, MD, USA) (Table 2).

Table 2.   List of genes analyzed
| No. | Symbol | UniGene | Description | Assay† | Band size (bp) |
|---|---|---|---|---|---|
| 1 | IRF1 | Hs.436061 | Interferon regulatory factor 1 | PPH00320E | 63 |
| 2 | TRAV24 | | T-cell receptor alpha variable 10 | ‡ | 186 |
| 3 | CD3Z | Hs.156445 | T-cell surface glycoprotein CD3 zeta chain | PPH01484A | 151 |
| 4 | GATA3 | Hs.524134 | GATA binding protein 3 | PPH02143A | 167 |
| 5 | CRTH2 | Hs.299567 | Chemoattractant receptor-homologous | PPH00418A | 123 |
| 6 | GZMB | Hs.1051 | Granzyme B; cytotoxic serine protease B | PPH02594A | 85 |
| 8 | TBX21 | Hs.272409 | T-box 21; T-cell-specific T-box transcription factor | PPH00396A | 136 |
| 9 | IFNG | Hs.856 | Interferon, gamma | PPH00380B | 129 |
| 10 | CD45RO | Hs.192039 | CD45 antigen; Leukocyte-common antigen | ‡ | 108 |
| 11 | IDO | Hs.840 | Indoleamine-pyrrole 2,3 dioxygenase 1 | PPH01328B | 85 |
| 12 | ARG1 | Hs.440934 | Arginase, type I | PPH20977A | 108 |
| 13 | B7H1 | Hs.521989 | Programmed cell death 1 ligand 1 | PPH21094A | 171 |
| 14 | TGFB1 | Hs.645227 | Transforming growth factor, beta 1 | PPH00508A | 91 |
| 15 | TNFA | Hs.241570 | Tumor necrosis factor | PPH00341E | 54 |
| 16 | IL1B | Hs.126256 | Interleukin 1, beta | PPH00171B | 117 |
| 17 | IL8 | Hs.561078 | Interleukin 8 | PPH00568A | 126 |
| 18 | IL10 | Hs.193717 | Interleukin 10 | PPH00572B | 99 |
| 19 | IL23A | Hs.622865 | Interleukin 23, alpha subunit p19 | PPH01688A | 187 |
| 20 | VEGF | Hs.73793 | Vascular endothelial growth factor A | PPH00251B | 183 |
| 21 | PDGFRA | Hs.74615 | Platelet-derived growth factor receptor, alpha polypeptide | PPH00219B | 184 |
| 22 | STAT3 | Hs.463059 | Signal transducer and activator of transcription 3 | PPH00708E | 157 |
| 23 | PTGS2 | Hs.196384 | Prostaglandin-endoperoxide synthase 2 | PPH01136E | 68 |
| 24 | NOS2A | Hs.434386 | Nitric oxide synthase 2A; NOS, type II | PPH00173E | 132 |
| 25 | HIF1A | Hs.654600 | Hypoxia-inducible factor 1, alpha subunit | PPH01361B | 84 |
| 26 | MMP9 | Hs.297413 | Matrix metallopeptidase 9 (gelatinase B) | PPH00152E | 63 |
| 27 | MMP7 | Hs.2256 | Matrix metallopeptidase 7 | PPH00809E | 58 |
| 28 | NEMO | Hs.43505 | NFkappaB essential modulator | PPH00660A | 119 |
| 29 | TBP | Hs.590872 | TATA box binding protein | PPH01091E | 57 |
| 30 | HPRT1 | Hs.412707 | Hypoxanthine phosphoribosyltransferase 1 | PPH01018B | 89 |

†SABiosciences gene expression assay ID. ‡Primers for TRAV24 and CD45RO were designed in our laboratory, as provided in Methods.

Total RNA was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA, USA), and purified using the RNeasy Mini kit (Qiagen, Valencia, CA, USA), including a DNaseI digestion. First-strand cDNA was synthesized from 1 μg total RNA with oligo(dT)18 primers using SuperScript III Reverse Transcriptase (Invitrogen). Gene expression was analyzed using a Real-Time SYBR Green/ROX PCR Master Mix (SABiosciences) on the ABI Prism 7900 Sequence Detection System (Applied Biosystems, Foster City, CA, USA). Duplicate RT samples were used in each assay and collapsed by averaging. The thermal cycling conditions comprised an initial denaturation step at 95°C for 10 min and 40 cycles at 95°C for 15 s and 60°C for 1 min.

The relative changes in gene expression were calculated by the −ΔΔCt method using RNA pools from 10 disease-free normal liver donors as a calibrator and normalized against two housekeeping genes (TBP and HPRT1), as we previously described.(24)
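As an illustration of the normalization just described, the following Python sketch computes a −ΔΔCt value for one gene in one sample. It rests on assumptions that the text does not spell out: the two housekeeping genes are combined by averaging their Ct values, and the function names and the example Ct values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def delta_ct(ct_target: float, ct_references: Sequence[float]) -> float:
    """Ct of the target gene minus the mean Ct of the reference genes.

    Assumption: the two housekeeping genes (TBP and HPRT1) are combined
    by a simple arithmetic mean of their Ct values.
    """
    return ct_target - mean(ct_references)


def neg_delta_delta_ct(sample_target: float, sample_refs: Sequence[float],
                       calib_target: float, calib_refs: Sequence[float]) -> float:
    """-ddCt of a sample relative to the calibrator (log2 scale).

    Positive values mean higher expression than in the calibrator.
    """
    ddct = delta_ct(sample_target, sample_refs) - delta_ct(calib_target, calib_refs)
    return -ddct


def fold_change(neg_ddct: float) -> float:
    """Convert a -ddCt value into a linear fold change, 2 ** (-ddCt)."""
    return 2.0 ** neg_ddct


# Hypothetical Ct values: tumor sample vs pooled normal liver calibrator.
value = neg_delta_delta_ct(24.0, [20.0, 22.0], 26.0, [20.0, 22.0])
print(value, fold_change(value))
```

With these made-up numbers the target gene has a ΔCt of 3 in the sample and 5 in the calibrator, giving a −ΔΔCt of 2, which corresponds to a fourfold higher expression.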

Tissue microarray and immunohistochemistry.  Tissue microarrays were constructed as described previously.(22,23,27) Triplicates of 1 mm-diameter cylinders of representative areas in tumor center, away from necrotic, hemorrhagic, and major fibrotic areas, were included in each case. Serial sections (4 μm thick) were placed on slides coated with 3-aminopropyltriethoxysilane.

The mouse mAbs used were anti-human CD57, CD8, CD45RO, granzyme B, CD68, CD34, αSMA, VEGF (Novocastra Laboratories, Newcastle, UK), and FOXP3 (AbD Serotec, Raleigh, NC, USA). The rabbit pAbs used were anti-human IRF-1 (Abcam, Cambridge, MA, USA), CRTH2 (Abnova, Walnut, CA, USA), and MMP7 (Proteintech Group, Chicago, IL, USA). Immunohistochemistry was carried out using a two-step protocol (Novocastra Laboratories) as we previously described.(22,23,27)

The number of positively stained cells was counted manually by two independent investigators blinded to the patient information. The results were expressed as the mean (±SE) number of cells per 1-mm core across the triplicate cores of each patient’s sample.

Data analyses.  Hierarchical clustering was carried out with Cluster software (version 3.0), and visualized with the TreeView software version 1.0.6 (http://rana.lbl.gov/EisenSoftware.htm).(28) For this analysis, all values for each gene were normalized by the median.

Statistical analyses were carried out using SPSS version 15.0 (SPSS, Chicago, IL, USA). Univariate analysis was carried out using the Kaplan–Meier method, and groups were compared by the log-rank test. Multivariate analysis was done using the Cox multivariate proportional hazards regression model in a stepwise manner (backward, conditional). The Pearson chi-square test or Fisher’s exact test was used to compare qualitative variables, and quantitative variables were analyzed by Student’s t-test or Pearson’s correlation test. The predictive accuracy was assessed by ROC curve analysis. P < 0.05 (two-tailed) was considered significant.
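For readers unfamiliar with the Kaplan–Meier method named above, a minimal product-limit estimator can be written in plain Python. This is a didactic sketch, not the SPSS procedure used in the study; the function name and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. recurrence) occurred, 0 if censored
    Returns (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same time point; censored patients
        # at time t are still counted as at risk at t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Toy example: four patients, one censored at 2 months.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

In the toy example the estimate drops to 0.75 at 1 month, is unchanged by the censored patient at 2 months, and then falls to 0.375 and 0 at 3 and 4 months.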

For calculating RS, a stepwise multivariate Cox’s proportional hazards model was used to identify the best subset of genes, with recurrence as a dependent variable. To allow for direct comparisons, data from the two cohorts were subjected to normalization using median across samples for each gene. Patients were classified as having a high- or a low-risk gene signature, with the median of the RS as cutoff, based on the RS defined in the original cohort. In the validation cohort, both the regression coefficients and the cut-off value of RS derived from the original cohort were applied directly.
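The cohort handling described above, per-gene median normalization followed by a cut-off fixed in the original cohort, might look as follows in Python. Several details are assumptions made for illustration: values are taken to be on a log scale so that normalization is a subtraction, scores equal to the cut-off are assigned to the low-risk group, and all names and numbers are hypothetical.

```python
from statistics import median
from typing import List, Sequence


def median_normalize(values: Sequence[float]) -> List[float]:
    """Centre one gene's expression values on their median across samples.

    Assumption: values are log-scale (-ddCt), so centring is a subtraction.
    """
    m = median(values)
    return [v - m for v in values]


def classify(scores: Sequence[float], cutoff: float) -> List[str]:
    """Label each recurrence score as high or low risk relative to a cut-off.

    Assumption: scores exactly at the cut-off are labelled "low".
    """
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical scores. The cut-off is derived once, from the original cohort,
# and then applied unchanged to the validation cohort.
original_scores = [-1.0, 0.0, 0.5, 2.0]
cutoff = median(original_scores)
validation_scores = [0.3, 0.1]
print(cutoff, classify(validation_scores, cutoff))
```

Fixing both the coefficients and the cut-off before looking at the validation cohort is what keeps the second cohort an independent test rather than a second round of model fitting.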


Results

Hierarchical clustering and prognosis.  As an introductory analysis of these genes, we used unsupervised hierarchical clustering to group patients and genes based on the similarity of the expression patterns of the 28 genes. The 122 patients in the original cohort were classified into a cluster tree with two subgroups (Cluster I, n = 65; Cluster II, n = 57; Fig. 2A). Except that Cluster I preferentially included more cases with elevated serum AFP level (P = 0.010) and larger tumor size (P = 0.018, chi-square test), the two clusters were independent of other clinicopathologic factors. Univariate analysis showed a significant difference in TTR (median months, 96.0 vs 16.4; P = 7.1 × 10⁻⁵) and a borderline difference in OS (median months, 72.0 vs 27.0; P = 0.068) between the two groups (Fig. 2B–C). Prognostic information for conventional clinicopathologic features is listed in Table S1. Further, multivariate analysis was carried out with all variables that were significant on univariate analysis, to control for confounders. The multivariate analysis revealed that the 28-gene clusters (HR, 3.45; 95% CI, 1.79–6.67; P = 0.0002) and TNM stage (HR, 3.40; 95% CI, 2.24–5.14; P < 0.0001) remained significant for recurrence (Tables 3 and S1). The results suggested that the gene signature reflected primarily the tumor biological behavior (i.e. tumor size and serum AFP level) and recurrence status of these patients.

Figure 2.

 Hierarchical cluster and survival differences between clusters of patients with hepatocellular carcinoma. Patients in the original cohort were clustered hierarchically based on expression of all 28 genes (A–C) or six Th1 cell markers (D–F). Cluster tree of individual patient samples and overall pattern of median-centered gene expression data are shown. Red indicates high expression and green indicates low expression. Kaplan–Meier curves divided by hierarchical clustering using all 28 genes (B,C) or Th1 cell markers (E,F) showed significance.

Table 3.   Multivariate analysis on risk of recurrence in the original cohort
| Variable | Hazard ratio | 95% CI | P-value* |
|---|---|---|---|
| Model A | | | |
| 28-gene cluster (cluster II vs I) | 3.45 | 1.79–6.67 | 0.0002 |
| TNM stage (III vs II vs I) | 3.40 | 2.24–5.14 | <0.0001 |
| Model B | | | |
| Six-gene cluster (cluster ii vs i) | 2.70 | 1.37–5.31 | 0.0040 |
| Tumor multiplicity (multiple vs single) | 3.05 | 1.62–5.74 | 0.0010 |
| Vascular invasion (yes vs no) | 3.67 | 1.86–7.26 | <0.0001 |

*P-value was determined by Cox multivariate proportional hazard regression model in a stepwise manner (backward, conditional). CI, confidence interval.

Hierarchical clustering with adaptive Th1 cell markers.  Expression of a set of seven Th1 adaptive immunity-specific genes was associated with both longer disease-free and overall survival in colorectal cancer.(14–16) Six of the seven genes were also included in our study (TBX21, CD3Z, IFNG, IRF1, GZMB, and GNLY). A hierarchical tree classifying the patients according to expression of the six genes revealed that high expression was associated with better clinical outcomes: Cluster i (n = 67), with high expression of the six genes, had significantly longer survival (median months, 96.0 vs 22.0; P = 0.004) and longer time to recurrence (median months, 96.0 vs 14.0; P = 1.2 × 10⁻⁵) than the low-expression Cluster ii (n = 55) (Fig. 2D–F). On multivariate analysis, the six-gene cluster remained significant for TTR (HR, 2.70; 95% CI, 1.37–5.31; P = 0.004), but showed no significance for OS (Tables 3 and S1). Thus, results derived from colorectal cancer can be reproduced at the transcript level in HCC, supporting the notion that recurrence and survival are governed in large part by the state of the local adaptive immune response.

Creation of a prognostic model with a smaller number of genes.  Hierarchical clustering can only be applied retrospectively and cannot be used to classify future patients. We therefore tried to identify a smaller number of genes relevant to HCC recurrence, and to create a prognostic model that could be applied prospectively. Five genes (GZMB, IRF1, CRTH2, VEGF, and MMP7) were identified as independent prognosticators in the multivariate Cox model. GZMB and IRF1 were protective genes (negative coefficients), whereas the other three were risk genes (positive coefficients). We then calculated the RS for each case as a linear combination of the gene expression values weighted by their estimated regression coefficients from the Cox model. Recurrence Score = (−6.306 × IRF1 value) + (−9.614 × GZMB value) + (2.983 × CRTH2 value) + (3.849 × VEGF value) + (4.334 × MMP7 value).
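The recurrence score formula translates directly into code. The sketch below uses the published coefficients and the median RS reported for the original cohort (0.0975) as the cut-off; the function names, the dictionary layout, and the example expression values are hypothetical, and the inputs are assumed to be median-normalized expression values as described in Methods.

```python
from typing import Mapping

# Cox regression coefficients from the recurrence score formula in the text.
RS_COEFFICIENTS = {
    "IRF1": -6.306,   # protective
    "GZMB": -9.614,   # protective
    "CRTH2": 2.983,   # risk
    "VEGF": 3.849,    # risk
    "MMP7": 4.334,    # risk
}

# Median RS of the original cohort, used as the cut-off in the text.
RS_CUTOFF = 0.0975


def recurrence_score(expression: Mapping[str, float]) -> float:
    """Linear combination of the five expression values and their coefficients."""
    return sum(coef * expression[gene] for gene, coef in RS_COEFFICIENTS.items())


def risk_group(score: float, cutoff: float = RS_CUTOFF) -> str:
    """Assumption: scores above the cut-off are high risk, all others low risk."""
    return "high" if score > cutoff else "low"


# Hypothetical patient with elevated risk genes and reduced protective genes.
patient = {"IRF1": -0.05, "GZMB": -0.02, "CRTH2": 0.10, "VEGF": 0.08, "MMP7": 0.04}
score = recurrence_score(patient)
print(round(score, 4), risk_group(score))
```

Because the protective genes carry negative coefficients, a patient whose IRF1 and GZMB values lie below the cohort median is pushed toward a higher score, as are patients with above-median CRTH2, VEGF, or MMP7.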

Using the median of RS (median, 0.0975; range, −3.625 to 2.556) as the cut-off, the five-gene signature was significantly associated with both recurrence and death in the original cohort. The patients with a high RS had significantly shorter median TTR (16.3 vs 96.0 months, P = 9.1 × 10⁻⁶) and OS (25.2 vs 96.0 months, P = 0.033) than the patients with a low RS (Fig. 3A–C). On multivariate analyses, the high-risk five-gene signature, presence of vascular invasion, and tumor multiplicity were significantly associated with high risks of recurrence among the 122 patients (Tables 4 and S1). Patients with a high RS were approximately five times more likely to develop an early tumor recurrence than those with a low RS (HR, 4.80; 95% CI, 2.31–9.96; P = 2.6 × 10⁻⁵).

Figure 3.

 Development and validation of the five-gene signature in patients with hepatocellular carcinoma. (A,E) Heat map showing individual genes included in the gene signature. Each column represents an individual patient. The slope of the red triangle indicates the magnitude of the corresponding recurrence scores. P, protective genes; R, risk genes. Survival and recurrence were predicted by the five-gene classifier in the original cohort (B,C) and tested in the validation cohort (F,G). The receiver operating characteristic curves show that the five-gene classifier was comparable to vascular invasion and tumor stages in both the original (D) and validation (H) cohorts, in terms of the values of the area under the curve. BCLC, Barcelona Clinic Liver Cancer stage; RS, recurrence score; VI, vascular invasion.

Table 4.   Multivariate analysis on risk of recurrence in the original and validation cohorts
| Variable | Hazard ratio | 95% CI | P-value* |
|---|---|---|---|
| Original cohort (n = 122) | | | |
| RS (high vs low) | 4.80 | 2.31–9.96 | 2.6 × 10⁻⁵ |
| Tumor multiplicity (multiple vs single) | 3.56 | 1.86–6.80 | <0.0001 |
| Vascular invasion (yes vs no) | 3.68 | 1.80–7.52 | <0.0001 |
| Validation cohort (n = 172) | | | |
| RS (high vs low) | 2.21 | 1.20–3.00 | 0.0020 |
| Tumor differentiation (poor vs well) | 1.90 | 1.63–3.04 | 0.0060 |
| TNM stage (IIIA vs II vs I) | 2.23 | 1.63–3.04 | <0.0001 |

*P-values determined by Cox multivariate proportional hazard regression model in a stepwise manner (backward, conditional). CI, confidence interval; RS, recurrence score.

To further evaluate the predictive ability of our model for recurrence, we carried out a ROC analysis and measured the AUC, with AUC = 0.5 corresponding to no discrimination and AUC = 1.0 indicating perfect prediction. Based on the AUC, the predictive value of RS for recurrence (0.713, 95% CI, 0.618–0.808, P = 0.0001) was slightly higher than that of vascular invasion (0.705, 95% CI, 0.607–0.803, P = 0.0002), and similar to those of BCLC stage (0.717, 95% CI, 0.619–0.815, P = 0.0001) and TNM stage (0.757, 95% CI, 0.665–0.849, P < 0.0001) (Fig. 3D).
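The meaning of the AUC can be made concrete with a small Python sketch. It uses the rank-based (Mann–Whitney) formulation, in which the AUC is the probability that a randomly chosen patient with recurrence has a higher score than a randomly chosen patient without, with ties counted as one half. This is an illustrative implementation, not the SPSS routine used in the study, and the function name and data are hypothetical.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    scores : predictor values (e.g. recurrence scores, or 0/1 for a binary factor)
    labels : 1 for patients with the event (recurrence), 0 for those without
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # correctly ordered pair
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


# Perfect separation, no discrimination, and a binary predictor.
print(auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
print(auc([1, 1, 1, 1], [1, 0, 1, 0]))
print(auc([1, 1, 0, 0], [1, 0, 1, 0]))
```

The tie handling matters when a continuous score is compared with dichotomous factors such as vascular invasion, because a binary predictor produces many tied pairs.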

Validation of the five-gene prognosis predictor.  Results in the validation cohort were similar to those in the original cohort. In the validation cohort, patients with a high-risk five-gene signature had a significantly shorter TTR (median months, 20.1 vs 84.0; P = 0.001) and a trend toward shorter OS (median months, 43.2 vs 84.0; P = 0.076) than those with a low-risk gene signature (Fig. 3E–G). On multivariate analyses, the high-risk five-gene signature was significantly and independently predictive for TTR (Tables 4 and S2). The AUC of the RS for recurrence (0.613, 95% CI = 0.618–0.808, P = 0.0001) was comparable with those of the conventional clinicopathologic factors, including vascular invasion (0.640, 95% CI = 0.607–0.803, P = 0.0002), BCLC stage (0.635, 95% CI = 0.619–0.815, P = 0.0001), and TNM stage (0.650, 95% CI = 0.665–0.849, P < 0.0001) (Fig. 3H).

Furthermore, we analyzed the five-gene signature in patients with stage I or stage II disease, separately and together. Among patients with stage I disease, stage II disease, or both combined, those with a high-risk gene signature tended to have a shorter TTR than those with a low-risk gene signature (P = 0.093, P = 0.057, and P = 0.006, respectively; Fig. S1). These results support the robustness of the RS model and reinforce the view that stroma-related genes play a crucial role in HCC recurrence.

Importantly, none of the genes in the signature, used in isolation, achieved the prognostic significance of the five-gene signature. This may be because the strength of such an approach lies in its ability to capture patterns or clusters of pathway deregulation rather than single genes.

Stromal signature is characteristic of reactive stroma.  A key question about the stromal gene signature is whether it measures the number of non-tumoral cells. Comparison of the immunostaining and gene expression data showed a significant association of the stromal gene signature with the amount of reactive stroma. Tumors with a high-risk gene signature contained higher numbers of intratumoral CD34+ (i.e. microvessel density) (P = 0.013, P = 0.013) and CD68+ (P = 0.014, P = 0.004) cells, but lower numbers of CD8+ (P = 0.004, P = 0.017), CD57+ (P = 0.005, P = 0.072), CD45RO+ (P = 0.037, P = 0.027), and granzyme B+ (P = 0.027, P = 0.047) cells for the original and validation cohorts, respectively (Fig. 4). Additionally, significant positive correlations were detected between protein and mRNA expression of granzyme B (r = 0.37, P < 0.0001 and r = 0.26, P = 0.034, for the original and validation cohorts, respectively). However, the numbers of intratumoral αSMA+ (P = 0.56, P = 0.11) and FOXP3+ (P = 0.097, P = 0.19) cells showed no significant difference between high- and low-risk signatures in the original and validation cohorts (Fig. 4).

Figure 4.

 Biological interpretation of the five-gene signature in hepatocellular carcinoma cells. Representative pictures showing intratumoral reactive stroma CD8+ (A), CD57+ (B), CD34+ (C), FOXP3+ (D), granzyme B+ (E), CD45RO+ (F), CD68+ (G), and αSMA+ (H) cells. Brown color indicates positive staining. Magnification, ×400. Correlations of reactive stromal content with the gene signature are shown under the micrographs. Tumors were divided into two groups by the five-gene classifier. Means and SEs are indicated by the bar in scatter columns. HCC-1, original cohort; HCC-2, validation cohort.

Likewise, in situ immunostaining of the five gene products showed that positive staining of granzyme B (Fig. 4E) and CRTH2 (Fig. 5A) was predominantly detected on tumor-infiltrating inflammatory cells, whereas tumor cells and tumor stroma both stained positive for IRF-1, MMP7, and VEGF (Fig. 5B–D). These results confirmed that the stromal gene signature was representative of tumor–stroma interactions in HCC.

Figure 5.

 Cellular location of the protein expression of genes by immunohistochemistry. (A) Positive staining of CRTH2 was detected on tumor-infiltrating inflammatory cells rather than tumor cells. (B) Both tumor-infiltrating inflammatory cells and tumor cells showed nuclear staining of IRF-1. (C) Tumor stromal cells showed strong intensity for MMP7, with weak staining on the plasma of tumor cells. (D) Both tumor cells and tumor stroma were positive for VEGF. Positive staining cells are brown. Red arrows denote tumor-infiltrating inflammatory cells. Magnification, ×400.


Discussion

Many metastasis-associated genes are expressed not by the tumor cell itself but by the cells of the microenvironment, and genes overexpressed in the tumor itself can also have a principal role in mediating tumor–host interactions.(29) We evaluated a group of marker genes representing intratumoral stromal reactions, and, in particular, the immune response. Through retrospective hierarchical analysis and prospective establishment of a predictive model, that is, a five-gene based recurrence score (GZMB, IRF-1, CRTH2, VEGF, and MMP7), we showed the pivotal role of the stromal response in modulating and driving HCC progression. Remarkably, this stromal reaction-based prognostic predictor can forecast disease outcomes with an accuracy similar to that of conventional clinicopathologic predictors in terms of the AUC, a finding that was further validated in an independent dataset.

The good outcome cluster overexpresses a distinct pair of antitumor immune-related genes (GZMB and IRF-1) with simultaneously reduced expression of the Th2 polarization marker (CRTH2). This is consistent with our previous studies that highlighted the clinical importance of the intratumoral ratio of effector to regulatory T-cells.(27) In contrast, stroma from individuals in the poor outcome cluster, who were nearly five times more likely to suffer recurrences, shows markers of increased ECM remodeling (MMP7) and angiogenic response (VEGF), as well as a shift of the immune balance toward immune tolerance (overexpression of CRTH2 versus decline of GZMB and IRF-1). The results indicate that looking at patterns of deregulation representing differential immune reactions as well as angiogenesis and ECM remodeling could lead to a better understanding of clinically relevant HCC subphenotypes, and create potential new targets for therapeutic agents. First, MMP7, secreted mainly by tumor cells and stromal cells, counters apoptosis, orchestrates angiogenesis, regulates immunity, and promotes metastasis and invasion.(30) MMP7 is a validated anticancer target, with several MMP inhibitors being tested in phase III trials in lung and breast cancers.(31) Second, VEGF, released from the tumor itself, inflammatory cells, and other stromal cells, participates heavily in the neovascularization of HCC. A recent phase III trial showed that sorafenib, a multikinase inhibitor of VEGF receptor and other receptor tyrosine kinases, significantly improved survival in HCC patients with advanced tumors.(32) Interestingly, data have indicated that MMP7 could free VEGF from its endogenous VEGF trap to release an active VEGF protein, a prerequisite for VEGF-driven angiogenesis.(33,34) This could provide an intrinsic basis for the inclusion of this pair of genes in our prognostic model.

It has been reported that a set of seven genes representing Th1 adaptive immunity was protective against recurrence in colorectal cancer.(14–16) In this study, we revealed a similar phenomenon in HCC, supporting the notion that Th1 adaptive immunity has a beneficial effect on clinical outcome. Further, a previous study has suggested that a unique Th1/Th2 immune profile in tumor-adjacent liver tissues was associated with risk of intrahepatic metastasis in HCC.(19) The current study suggests that similar biological information could be measured from the tumor itself. Hepatocellular carcinoma is a heterogeneous disease that is broadly resistant to common chemotherapy and radiotherapy. We propose that shifting the balance of the cancer suppressive microenvironment toward more efficient effector immune responses could be a promising therapeutic strategy in HCC. Our five-gene model, as such, has the potential to provide the rationale and guide for future personalized immunotherapies.

Although one may argue that many other important genes involved in tumor stromal responses are not included, our approach has the power to reveal gene expression programs not otherwise easily identifiable, and this rational choice allowed identification of key genes and their biological significance. These findings indicate that the coverage of the panel is sufficient to capture stromal reaction-derived molecular information. However, the results should be interpreted with caution and need to be confirmed in further large-scale, multicenter, prospective studies.

As a confirmation, the numbers of immune cells within the tumor were reduced in individuals in the poor outcome cluster, as measured by CD8+ and granzyme B+ effector immune cells and CD45RO+ memory T-cells. In contrast, the numbers of CD68+ macrophages and CD34+ microvessels increased significantly in the poor outcome cluster. Furthermore, immunohistochemistry suggested that the protein products of the five genes were mainly present in tumor stroma, in particular in tumor-infiltrating inflammatory cells, and in tumor cells as well. Therefore, these data support the view that the signature presented here reflects the tumor stromal reactions, and in turn suggest that antitumor immune cells as well as macrophages and endothelial cells may contribute to this stromal reaction-based signature.

In conclusion, we identified a unique tumor–stroma reaction-related profile that is strongly associated with HCC prognosis, providing an accurate and rational method of risk stratification and pointing to potential new therapies. Moreover, given the successful introduction of anti-angiogenic drugs such as sorafenib in advanced HCC, we suggest that patients with a high risk of tumor progression might benefit from a combination of sorafenib with a novel immunotherapy.


Acknowledgments

This work was supported by the National Key Sci-Tech Special Project of China (Grant No. 2008ZX10002-018/019), Shanghai Rising-Star Program (Grant No. 10QA1401300), the National Natural Science Foundation of China (Grant Nos 81071992 and 30901432), and the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20090071120026).

Disclosure Statement

The authors have no conflict of interest to declare.


Abbreviations

AFP: alpha fetoprotein
AUC: area under the curve
BCLC: Barcelona Clinic Liver Cancer
CI: confidence interval
CT: computed tomography
HCC: hepatocellular carcinoma
HR: hazard ratio
OS: overall survival
r: correlation coefficient
ROC: receiver operating characteristic curve
RS: recurrence score
TTR: time to recurrence