Genomewide transcriptomic profiling identifies a gene signature for predicting recurrence in early‐stage hepatocellular carcinoma

Dear Editor, A majority of patients with hepatocellular carcinoma (HCC) suffer from tumor recurrence even after curative treatments. Such events can be categorized as either early phase recurrence (within 2 years of treatment) or late phase recurrence (after 2 years).1 The prognosis for patients with early phase recurrence (EPR) is generally much worse than for those with late phase recurrence.2–4 As per the AJCC guidelines (7th edition) and the BCLC recommendations, an early-stage HCC with an AJCC-TNM stage-I and BCLC stage-0/A is defined by the following features: a solitary tumor, 5 cm or smaller in size, and a cancer without pathological vascular invasion. Approximately 25% of early-stage HCCs are likely at risk of developing EPR following curative treatment.5 Unfortunately, clinically useful biomarkers for predicting EPR in patients with early-stage HCC are currently lacking.4,6
In the present study, we undertook, for the first time, a systematic and comprehensive biomarker discovery and validation approach to identify a novel gene expression signature for detecting EPR in early-stage HCC patients. Our study design included multiple biomarker discovery and validation phases (Figure S1). We first analyzed the TCGA RNA-Seq cohort comprising 59 early-stage HCC patients with (n = 21) and without EPR (n = 38). Among the 20,502 transcripts, 342 genes were differentially expressed. Following exclusion of highly correlated genes, the association of each gene with recurrence-free survival was assessed using Cox's proportional hazards (CoxPH) regression model and LASSO regression analysis, yielding a panel of eight candidate genes (Figure 1A) that were not correlated with each other (Figure 1B). We next plotted the receiver operating characteristic (ROC) curves in the TCGA cohort. This eight-gene panel accurately predicted EPR (area under the curve [AUC] = 0.88; Figure 1C), which was subsequently validated in the GSE76427 cohort comprising 26 early-stage HCC patients with (n = 15) and without EPR (n = 11; AUC = 0.81; Figure 1D). Patients who were categorized as high-risk exhibited higher cumulative recurrence rates compared to the low-risk group in both the TCGA and GSE76427 cohorts (p < 0.0001 and p = 0.0030, respectively; Figures 1E and 1F).
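To make the AUC values reported above easier to interpret, the following is a minimal sketch of how an AUC can be computed from per-patient risk scores and EPR labels. It uses the rank-based (Mann-Whitney) definition: the probability that a randomly chosen patient with EPR receives a higher score than a randomly chosen patient without EPR. The function name `roc_auc` and the toy scores are hypothetical illustrations, not the study's data or software.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (EPR, non-EPR) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # patients with EPR
    neg = [s for s, y in zip(scores, labels) if y == 0]  # patients without EPR
    if not pos or not neg:
        raise ValueError("need at least one EPR and one non-EPR case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six patients, label 1 = EPR, 0 = no EPR.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.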
Next, we interrogated the performance of the eight-gene panel in two independent clinical cohorts totaling 130 HCC patients (Table 1). In cohort-1 (n = 54), 13 cases (24.1%) experienced EPR, while EPR occurred in 27 of 76 cases (35.5%) in cohort-2. We excluded the TSSK1B gene from further analysis due to low expression in FFPE tissues. We evaluated the expression of the remaining seven genes in the clinical cohorts using qRT-PCR (Table S1). These experiments revealed that three genes (SNX24, C1QTNF8, and HERC5) were commonly expressed in both clinical cohorts (Figure S2) and hence were selected for further analysis. We thereafter confirmed the predictive accuracy of the three-gene panel in the TCGA and GSE76427 cohorts (Figure S3).
We assessed the predictive potential of this three-gene panel in the clinical training cohort (cohort-1). Using the partial likelihood of the CoxPH model, we obtained a risk scoring formula for the three-gene panel as follows: (0.8777 × SNX24) + (0.2245 × HERC5) + (0.00483 × C1QTNF8). In this formula, we used -ΔCT values for determining the expression of each gene. The three-gene panel demonstrated significant predictive power in detecting EPR in early-stage HCC patients (AUC = 0.82; Figure 2A, left panel). The waterfall plot with the risk scores and heatmap of gene expression for each patient is shown in Figure 2B (left panel). Cumulative recurrence rate analysis revealed that the high-risk patients, categorized based upon our three-gene panel, had a significantly higher recurrence rate in the clinical training cohort (p = 0.0004; Figure 2C, left panel). In univariate CoxPH regression analyses, our three-gene panel emerged as the only significant predictor of early recurrence (HR 15.68, 95% CI 2.03-120.83, p = 0.0082; Figure 3A and Table S2) compared to all other clinical factors. In addition to our three-gene panel, we included tumor size (high-risk: tumor more than 2 cm)7 and operative procedure (high-risk: non-anatomical resection)8 in the multivariate CoxPH regression analyses because they are clinically important prognostic factors. In these analyses, our three-gene panel emerged as the only independent significant predictor of EPR (HR 19.51, 95% CI 2.50-152.40, p = 0.0046; Figure 3A and Table S2).
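The risk scoring formula is a weighted sum of the -ΔCT values of the three panel genes (SNX24, HERC5, and C1QTNF8), and can be sketched as follows. The coefficients are those given in the formula; the helper names, the definition ΔCT = CT(target) − CT(reference), and the example values are assumptions made for illustration, since the letter does not name the reference gene or give per-patient values.

```python
# Coefficients taken from the risk scoring formula in the text.
COEFFICIENTS = {"SNX24": 0.8777, "HERC5": 0.2245, "C1QTNF8": 0.00483}


def neg_delta_ct(ct_target, ct_reference):
    """Return -ΔCT, assuming ΔCT = CT(target) - CT(reference); reference gene is hypothetical."""
    return -(ct_target - ct_reference)


def risk_score(expression):
    """Weighted sum of -ΔCT values for the three panel genes."""
    missing = set(COEFFICIENTS) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(COEFFICIENTS[gene] * expression[gene] for gene in COEFFICIENTS)


# Hypothetical patient: -ΔCT values invented for illustration only.
example_patient = {"SNX24": -2.0, "HERC5": -1.0, "C1QTNF8": -4.0}
example_score = risk_score(example_patient)
print(round(example_score, 5))
```

Because SNX24 carries by far the largest coefficient, differences in its expression dominate the score when the three genes vary over similar ranges.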
Subsequently, the predictive potential of this three-gene panel was validated by applying the same risk-score model and statistical analyses to the patients in the independent testing cohort (cohort-2). As illustrated in Figure 2A (right panel), our three-gene panel was a significant predictor of EPR in this cohort as well (AUC = 0.64). Similarly, Figure 2B (right panel) depicts the waterfall plot for the risk scores and heatmap of gene expression in cohort-2 patients. The cumulative recurrence rates revealed that, based upon our three-gene panel, the high-risk patients exhibited a significantly poorer prognosis compared to the low-risk patients (p = 0.041; Figure 2C, right panel). Finally, the univariate CoxPH regression analysis using the three-gene panel and various clinicopathological factors demonstrated that our gene panel was the only significant predictor of recurrence in patients within the testing cohort as well (HR 2.25, 95% CI 1.01-5.02, p = 0.047; Figure 3A and Table S2). In multivariate [...] (Figure 3A and Table S2). Next, we established a combination signature which included our three-gene panel, tumor size, and operative method. This combination model was superior to the individual factors and significantly improved the overall predictive accuracy in both cohort-1 (AUC = 0.86; Figure 3B) and cohort-2 patients (AUC = 0.74; Figure 3B). Taken together, our novel combination signature had significantly higher predictive value for EPR in early-stage HCC patients than the individual factors.
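As an illustration of how cumulative recurrence rates of the kind compared above can be estimated from follow-up data with censoring, the sketch below computes one minus the Kaplan-Meier recurrence-free estimate. The letter does not state which estimator was used, so Kaplan-Meier is an assumption here, and the function name and follow-up times are hypothetical.

```python
def cumulative_recurrence(times, events):
    """Return [(time, cumulative recurrence)] at each recurrence time.

    times  : follow-up time per patient (e.g. months)
    events : 1 if recurrence was observed at that time, 0 if censored
    Uses 1 - S(t), where S(t) is the Kaplan-Meier recurrence-free estimate.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    recurrence_free = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        recurrences = 0  # recurrences observed exactly at time t
        leaving = 0      # all patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            recurrences += data[i][1]
            leaving += 1
            i += 1
        if recurrences > 0:
            recurrence_free *= 1.0 - recurrences / at_risk
            curve.append((t, 1.0 - recurrence_free))
        at_risk -= leaving
    return curve


# Hypothetical follow-up for six patients (months); 1 = recurrence, 0 = censored.
toy_times = [6, 10, 14, 20, 24, 24]
toy_events = [1, 0, 1, 1, 0, 0]
toy_curve = cumulative_recurrence(toy_times, toy_events)
for month, rate in toy_curve:
    print(month, round(rate, 3))
```

Computing one such curve for the high-risk group and one for the low-risk group gives the two curves whose difference is then tested statistically.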
We would like to acknowledge a few potential limitations of our study. First, tumor size (p = 0.0448) and total bilirubin (p = 0.0445) in clinical cohort-2 did not satisfy the proportional hazards assumption according to Schoenfeld residuals and were therefore not suitable for CoxPH analyses. Second, the performance of our three-gene panel in clinical cohort-2 was not as robust, potentially due to the size of the cohort analyzed; hence, further clinical validation in larger prospective cohorts is needed to assess the predictive accuracy of our three-gene panel.
In summary, our genome-wide, systematic biomarker discovery and validation efforts resulted in the establishment of a novel three-gene signature that could significantly predict EPR in patients with early-stage HCC, highlighting its potential clinical significance in the identification of high-risk HCC patients undergoing surgical resection.