Donor livers are precious resources and it is, therefore, ethically imperative that we employ optimally sensitive and specific transplant selection criteria. Current selection criteria, the Milan criteria, for liver transplant candidates with hepatocellular carcinoma (HCC) are primarily based on radiographic characteristics of the tumor. Although the Milan criteria result in reasonably high survival and low-recurrence rates, they do not assess an individual patient's tumor biology and recurrence risk. Consequently, it is difficult to predict on an individual basis the risk for recurrent disease. To address this, we employed microarray profiling of microRNA (miRNA) expression from formalin fixed paraffin embedded tissues to define a biomarker that distinguishes between patients with and without HCC recurrence after liver transplant. In our cohort of 64 patients, this biomarker outperforms the Milan criteria in that it identifies patients outside of Milan who did not have recurrent disease and patients within Milan who had recurrence. We also describe a method to account for multifocal tumors in biomarker signature discovery.
Hepatocellular carcinoma (HCC) is one of the most common cancers worldwide (1,2) and is a major cause of cancer mortalities particularly in Africa and Asia (3). The incidence of HCC is increasing in western countries due to the hepatitis C virus epidemic (4) and, more recently, the obesity epidemic leading to nonalcoholic steatohepatitis (5). Curative treatment is currently limited to surgical resection and liver transplantation, but resection results in recurrence rates of greater than 70% within 5 years (6) and most patients (80%) present with extensive disease that is not amenable to surgery (7).
Transplanting within the Milan criteria (1 tumor ≤ 5 cm, 2–3 tumors ≤ 3 cm, and no evidence of intrahepatic vascular invasion or extrahepatic spread) is the accepted standard of care as 5-year survival rates of 80% or greater and 5-year recurrence rates of less than 15% can be achieved (8). However, these criteria are based primarily on radiographic characteristics and do not assess individual tumor biology and risk of recurrence or overall survival. Clearly, there are some patients outside of Milan criteria who have a low incidence of recurrence and there are those within Milan who do suffer from recurrent HCC after transplant. A biomarker indicative of a tumor's propensity for recurrence would be of great value in optimizing overall patient outcomes.
There is increasing evidence, primarily from global genomic studies (9,10), that metastatic potential is inherent in the primary tumor from an early stage and that this information can be used to predict long-term outcomes. Several groups have begun to study this in HCC using microarray technology to define messenger RNA (mRNA) and microRNA (miRNA) expression profiles that correlate with survival, recurrence and metastatic disease in the hopes of describing clinically useful biologic metrics to guide patient selection and appropriate therapeutic interventions (11–17). However, HCC is frequently associated with multifocal intrahepatic disease and this causes problems for defining such gene expression signatures. Intrahepatic metastasis arises from local dissemination of the primary tumor and can account for up to 75% of multifocal lesions whereas de novo lesions (multiple primary tumors) arise as a result of the diseased liver milieu that is predisposed to oncogenesis (18–23). Thus, in a patient with multifocal HCC, any biomarker must take into account the possibility that the tumors are not clonally related and, thus, will have different genetic profiles and associated metastatic potential. The question then arises as to which tumor (and which gene expression profile) is associated with the phenotype being studied. To address this we have devised a simple approach, the MIN–MAX method, to account for multiple expression patterns of the same miRNA in patients with multifocal disease.
In this study, we sought to define a miRNA biomarker that reliably distinguishes patients with and without HCC recurrence after liver transplant. We believe that such a biomarker can be used in conjunction with the current Milan criteria to improve selection decisions in liver transplant candidates with HCC. Furthermore, such a metric of HCC tumor biology could also be used to direct other therapeutic interventions such as surgical resection, ablative therapy, chemotherapy and radiation.
Materials and Methods
Patient cohort description
Our study was performed with approval of the University of Rochester Research Subjects Review Board (RSRB00029467). A total of 95 tumor nodules were studied from 69 patients who underwent liver transplantation for HCC at the University of Rochester Medical Center between 1996 and 2008. Forty patients had recurrent HCC within 3 years of transplant and 29 had no recurrent disease within 3 years. Patients were well-matched with regard to etiology, gender and age (Table 1). Patients in the nonrecurrent group tended to have lower grade tumors, earlier tumor stage and less vascular invasion compared to the recurrent group, but there was significant overlap in these characteristics between groups (Table 1). In the nonrecurrent group, 17 of 29 patients were within Milan criteria and in the recurrent group, 11 of 40 patients were within Milan. Milan criteria for this cohort were determined by pathologic evaluation of explanted livers at the time of transplant.
Table 1. Patient cohort demographics. Patients received liver transplants for HCC (n = 69), 29 of whom did not have recurrent HCC within 3 years and 40 of whom did. Patients were equally matched for etiology, age, sex, race and HCV antibody status. Recurrent patients were more likely to have multifocal disease (42.5% vs. 17.2%), to be outside Milan criteria (72.5% vs. 41.4%), to have more advanced HCC stage and less differentiated tumor grade, to have vascular invasion (70% vs. 24.1%) and to have larger mean tumor size.
Representative H&E sections of the formalin-fixed paraffin embedded (FFPE) blocks from the explanted tumors were reviewed by our pathologist (C.R.) to ensure presence of >70% viable tumor. Tissue cores (7 mm diameter) were then obtained from the corresponding blocks and re-embedded for further processing and miRNA isolation.
miRNA purification and array hybridization
MiRNA was isolated from FFPE liver tumor tissues using the Roche High Pure miRNA isolation kit (Roche Diagnostics, Mannheim, Germany). MiRNA extraction was performed from individual tissue blocks using seven sections of 10 microns each. One to three extractions were performed for each tumor to generate sufficient miRNA for microarray analysis. All samples were assessed for presence of enriched miRNA using an Experion Bioanalyzer (Bio-Rad, Hercules, CA, USA). MiRNAs were labeled using the FlashTag Biotin RNA labeling kit (Genisphere, Hatfield, PA, USA) according to the provided protocol and then hybridized to Affymetrix GeneChip miRNA 1.0 microarrays (Affymetrix, Santa Clara, CA, USA). These arrays comprise 46 228 probe sets representing over 6703 miRNA sequences (71 organisms) from the Sanger miRNA database (V.11) and an additional 922 sequences of human snoRNA and scaRNA from the Ensembl database and snoRNABase. Array hybridization, washing and staining were performed at the Upstate Medical University microarray core facility in Syracuse, New York, per the manufacturer's instructions and arrays were scanned with a GeneChip Scanner 7G Plus. Data files (.cel files) were generated using the miRNA-1_0_2X gain library file. Hybridization quality metrics were assessed using the AffyMir miRNA QCTool program (version 126.96.36.199, Affymetrix, Santa Clara, CA, USA). Only human miRNAs were considered in our analysis. All arrays were preprocessed using Robust Multiarray Average (RMA; Ref. 24). RMA was performed on all 46 228 probe sets, after which nonhuman probe sets were removed, leaving 847 human miRNA probe sets. All data have been deposited in the Gene Expression Omnibus (GEO accession number: GSE30297, http://www.ncbi.nlm.nih.gov/geo/).
Description of the Min–Max procedure for biomarker construction with multifocal tissue samples
The preprocessed data consisted of miRNA expression estimates for each feature, miR-X, in each sample. In the case of multifocal tissue samples, more than one sample was obtained from a single patient. To create a biomarker for patient prognosis, we combined the miRNA expression estimates from each collection of samples belonging to the same patient. This was done by constructing two new probe features, miR-X_MIN and miR-X_MAX, defined as the minimum and maximum expression of miR-X across all samples from a given patient. As it cannot be generally anticipated whether high or low expression is associated with recurrence, miR-X_MIN and miR-X_MAX were treated as separate features in biomarker selection. By construction, the MIN and MAX features were identical for unifocal patients. Although both the MIN and MAX features for a given miRNA can be statistically significant in a univariate analysis, we expect to use at most one from each pair in any final biomarker.
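The MIN–MAX construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the input layout (a list of patient/expression pairs), the function name `min_max_features` and the toy probe names are all assumptions made for the example.

```python
from collections import defaultdict


def min_max_features(samples):
    """Collapse per-sample expression into per-patient MIN and MAX features.

    `samples` is a list of (patient_id, {probe: expression}) pairs; a
    multifocal patient contributes several pairs (hypothetical layout).
    """
    by_patient = defaultdict(list)
    for patient_id, expression in samples:
        by_patient[patient_id].append(expression)

    features = {}
    for patient_id, profiles in by_patient.items():
        row = {}
        for probe in profiles[0]:
            values = [profile[probe] for profile in profiles]
            row[probe + "_MIN"] = min(values)
            row[probe + "_MAX"] = max(values)
        features[patient_id] = row
    return features


# Toy data: patient "A" is multifocal (two nodules), patient "B" is unifocal.
samples = [
    ("A", {"miR-X": 5.0, "miR-Y": 2.0}),
    ("A", {"miR-X": 7.5, "miR-Y": 1.0}),
    ("B", {"miR-X": 6.0, "miR-Y": 3.0}),
]
features = min_max_features(samples)
```

Each probe yields two features per patient, which is why 847 probe sets give 1694 features, and for the unifocal patient the two features coincide.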
Array quality was assessed using a suite of widely used quality measures (25). The miR-X_MIN and miR-X_MAX features were constructed as previously described. Hierarchical clustering (Euclidean distance with complete agglomeration; see Ref. 26) was used to assess both similarity of expression within subjects and within recurrence status.
The primary outcome was defined as recurrence free survival time. The observation time of recurrence free subjects was, therefore, considered a right-censored survival time. The ability of each feature to predict survival time was assessed using a univariate Cox proportional hazards model. The resulting p value was interpreted as a measure of the feature's association with recurrence. The false discovery rate (FDR) adjusted p values were estimated using the Benjamini–Hochberg procedure (27).
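For readers unfamiliar with the Benjamini–Hochberg adjustment, the following is a minimal sketch of the standard step-up procedure. The p values are made up for illustration, and the function name is hypothetical; the 0.20 threshold mirrors the 20% FDR used in this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


p_values = [0.001, 0.04, 0.03, 0.5]  # made-up univariate Cox p values
q_values = benjamini_hochberg(p_values)
significant = [q <= 0.20 for q in q_values]  # features retained at 20% FDR
```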
A problem inherent in the development of biomarkers is the tendency for multivariate models to overfit data. This results in biomarkers that are initially reported to perform extremely well but whose performance cannot be reproduced. This tendency can be controlled to some degree using cross validation (CV). However, the unpredictability of multivariate models remains a problem, particularly when individual features exhibit a high degree of correlation. To address this concern, we propose the following procedure to generate a biomarker:
1. For each feature, fit a univariate Cox proportional hazards model and record the p value and direction of association. Positive association means that greater feature expression yields longer recurrence free survival and negative association, the opposite.
2. Using the p values from step (1), rank the features from most to least significant. Retain only a fixed number of the most significant features. We investigate the optimal number of features to retain during CV.
3. Create a survival score by robustly standardizing the expression estimate of each retained feature by subtracting the median and dividing by the interquartile range (IQR), where the median and IQR are calculated across patients for each feature. For features whose direction of association from step (1) was negative, reverse the sign of the survival score so that a higher score is always associated with greater survival.
4. Define an initial biomarker as the survival score of the most significant feature from step (2). Then proceed down the list of features from step (2), moving from more to less significant. For each feature, define a new (potentially improved) biomarker as the current biomarker plus the survival score of that feature from step (3). Compare the performance of the new biomarker to the current biomarker (assessed by the coefficient of determination (R2) from a Cox regression) and keep the better one.
5. The final biomarker in step (4) will be the sum of the survival scores of those features that improved performance.
Additional predictor(s), such as the Milan criteria, can be easily incorporated into this procedure by adding the additional predictor(s) to the Cox regression models in steps (1) and (4).
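Steps (3) to (5) can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the function names, the toy feature names and the toy data are hypothetical, and the performance measure is a squared Pearson correlation with follow-up time, swapped in as a simple stand-in for the Cox regression R2 (it ignores censoring).

```python
import statistics


def robust_standardize(values):
    """Step (3): subtract the median and divide by the IQR across patients."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]  # assumes IQR > 0


def greedy_biomarker(ranked_scores, performance):
    """Steps (4)-(5): add survival scores in ranked order, keeping a feature
    only if the summed biomarker scores better under `performance`."""
    names = list(ranked_scores)
    kept = [names[0]]
    biomarker = list(ranked_scores[names[0]])
    best = performance(biomarker)
    for name in names[1:]:
        candidate = [b + s for b, s in zip(biomarker, ranked_scores[name])]
        value = performance(candidate)
        if value > best:
            kept.append(name)
            biomarker, best = candidate, value
    return kept, biomarker, best


def squared_correlation(outcome):
    """Stand-in performance measure (NOT the Cox R2 used in the paper)."""
    def performance(biomarker):
        n = len(outcome)
        mx, my = sum(biomarker) / n, sum(outcome) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(biomarker, outcome))
        sxx = sum((x - mx) ** 2 for x in biomarker)
        syy = sum((y - my) ** 2 for y in outcome)
        return sxy * sxy / (sxx * syy)
    return performance


# Toy inputs (all hypothetical): months of recurrence-free follow-up and three
# features already ranked by univariate p value, each with its direction.
months = [3, 8, 12, 20, 30, 36]
ranked = {
    "miR-A_MIN": ([1.0, 2.0, 2.5, 4.0, 5.5, 6.0], +1),
    "miR-B_MAX": ([9.0, 7.0, 8.0, 4.0, 3.0, 1.0], -1),
    "miR-C_MIN": ([5.0, 1.0, 6.0, 2.0, 4.0, 3.0], +1),
}
# Flip the sign of negatively associated features so higher always means
# longer recurrence-free survival.
scores = {
    name: [direction * z for z in robust_standardize(values)]
    for name, (values, direction) in ranked.items()
}
kept, biomarker, best = greedy_biomarker(scores, squared_correlation(months))
```

Because each candidate is compared against the current sum, a feature that is highly correlated with those already included tends to add little and is dropped, which is the behavior the procedure is designed to have.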
To assess the predicted performance of our biomarker in an independent data set, we performed a CV procedure that incorporated all steps used to create our biomarker. Specifically, we performed K-fold CV training on 56 subjects and testing on the remaining 8 subjects (the total number of subjects available was 64). Within each CV sample, a new biomarker was created (following the steps described above) and biomarker scores were obtained for all subjects. The quantiles of the biomarker scores for the test subjects constituted the CV prediction. We repeated this procedure 500 times yielding 8 × 500 = 4000 total CV predictions.
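The resampling arithmetic can be checked with a short sketch. It assumes one reading of the procedure that is consistent with 8 × 500 = 4000 predictions, namely that each of the 500 repetitions holds out a fresh random set of 8 subjects; the function names and the seed are hypothetical, and the biomarker fitting itself is left as a comment.

```python
import random


def repeated_holdout(subject_ids, test_size, repeats, seed=0):
    """Yield (train, test) splits; each repeat holds out `test_size` subjects."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = subject_ids[:]
        rng.shuffle(shuffled)
        yield shuffled[test_size:], shuffled[:test_size]


def quantile_rank(score, all_scores):
    """Fraction of all subjects whose score is at or below `score`."""
    return sum(s <= score for s in all_scores) / len(all_scores)


subjects = list(range(64))
n_predictions = 0
for train, test in repeated_holdout(subjects, test_size=8, repeats=500):
    assert len(train) == 56 and not set(train) & set(test)
    # A biomarker would be rebuilt here from `train` only and then scored on
    # all subjects; the quantile ranks of the test subjects' scores would be
    # recorded as the CV predictions for this repetition.
    n_predictions += len(test)
```

The key design point is that feature ranking and selection happen inside each split, so the test subjects never influence which features enter the biomarker.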
Four types of biomarkers were evaluated, defined by how the samples were combined for multifocal patients (MIN–MAX or mean) and whether the Milan criteria were included as an additional predictor. For each model, a receiver operating characteristic (ROC) curve was plotted. The area under the curve (AUC) is a straightforward way to assess the performance of a given biomarker.
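As a reminder of how an ROC curve and its AUC follow from a set of scores, here is a minimal sketch. The scores and outcomes are invented, and the function names are hypothetical. Because a higher survival score indicates longer recurrence-free survival, the example negates the score to obtain a risk value before computing the curve.

```python
def auc(risk, recurred):
    """AUC as the fraction of (recurrent, nonrecurrent) pairs in which the
    recurrent patient has the higher risk; ties count one half."""
    pos = [r for r, y in zip(risk, recurred) if y]
    neg = [r for r, y in zip(risk, recurred) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def roc_curve(risk, recurred):
    """List of (1 - specificity, sensitivity) points, one per threshold."""
    n_pos = sum(recurred)
    n_neg = len(recurred) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(risk), reverse=True):
        tp = sum(r >= threshold and y for r, y in zip(risk, recurred))
        fp = sum(r >= threshold and not y for r, y in zip(risk, recurred))
        points.append((fp / n_neg, tp / n_pos))
    return points


# Made-up survival scores (higher = better prognosis) and recurrence status.
survival_scores = [2.1, 1.4, 0.5, 0.2, -1.8, -2.5]
recurred = [False, False, True, False, True, True]
risk = [-s for s in survival_scores]

area = auc(risk, recurred)
curve = roc_curve(risk, recurred)
```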
Quality assessment of miRNA purified from FFPE tissues
We consistently obtained high yields of miRNA from FFPE blocks based on electrophoresis and spectroscopy (Figure S1). Furthermore, when we hybridized miRNA obtained from freshly frozen cell lines to the arrays and compared these results to array hybridization of miRNA from the same cell lines that had first been formalin fixed and paraffin embedded, we noted excellent correlation (R2 = 0.88–0.90, Figure S2).
Quality metrics and univariate analysis
Seven arrays were removed due to poor quality as assessed by one of the six quality metrics considered (Table S1). This resulted in 88 samples and 64 subjects for further analysis. The MIN–MAX procedure yielded 1694 features based on 847 probes.
The univariate analysis yielded 60 significant features at 20% FDR (Table 2). A majority of the miRNAs distinguishing recurrence from nonrecurrence have been shown by others (15,16) to be relevant to hepatocellular carcinogenesis (Table S2). We may expect both the MIN and MAX feature for some probes to be significant, particularly when there tends to be smaller variation within a multifocal sample, and the two features would then be approximately equal. This effect was relatively small in our cohort. The 60 significant features represented 50 distinct probe sets (of which 10 were represented by both MIN and MAX features).
Table 2. Univariate analysis of miRNAs significant for recurrent HCC within 3 years of transplant. Top 60 probes with FDRs <0.2 are shown. MIN and MAX represent the minimum and maximum expression probe features for a given miRNA. Bolded probes are those selected between 70–97% in cross validation
hsa-miR-99a-star_st_MIN
hsa-miR-143-star_st_MIN
Unsupervised hierarchical clustering results are refined using MIN–MAX
The results of unsupervised hierarchical clustering of all 88 samples (using all 847 miRNA probe sets without MIN–MAX) show that patients with recurrent disease tend to cluster together (Figure 1A). Employing the MIN–MAX method reduces the results to 64 patients with a very similar clustering, suggesting that information is not grossly distorted or lost as a result of the MIN–MAX procedure (Figure 1B). When we apply the MIN–MAX method and then perform a univariate Cox regression analysis, clustering of the patients using probes with FDR < 0.2 reveals a clearer distinction between recurrent and nonrecurrent patients (Figure 1C). To further investigate the clustering between recurrence and nonrecurrence samples, we examined the first two principal components (Figure S3). As we observed in the hierarchical clustering, there is a cluster of primarily recurrence samples and a cluster of mixed recurrence and nonrecurrence.
HCC recurrence miRNA biomarker discovery and its comparison to the Milan criteria
We generated our proposed biomarker using all available data. The biomarker consists of 67 miRNAs that significantly distinguished patients with HCC recurrence after transplant from those without recurrence (Figure 2) with R2 = 0.848 and AUC = 0.989. Analysis of recurrence-free survival shows that the biomarker clearly delineates patients with and without recurrence within 3 years of transplant (Figure 3A) with a p value of 1.6 × 10^-11. Applying the biomarker to patients in our cohort outside Milan (Figure 3B) and inside Milan (Figure 3C) also yields statistically significant separation (p = 6.9 × 10^-5 and p = 2.8 × 10^-8, respectively), demonstrating that the biomarker can identify patients outside of Milan who have favorable biology and patients within Milan who have unfavorable tumor biology (as measured by disease recurrence). In fact, in our cohort, the biomarker identified 8 of 11 patients within Milan who recurred and 9 of 12 patients outside of Milan who did not recur (Table 3). Note that all 67 miRNAs in this biomarker are employed in the analyses presented.
Table 3. Cohort characteristics of patients according to Milan and recurrent disease and performance of miRNA biomarker. The biomarker (BM) successfully identifies 9 of 12 patients who were outside Milan without recurrent disease and 8 of 11 patients within Milan who did have recurrence
Within Milan (n = 28): 11 with recurrence (8/11 identified by BM)
Outside Milan (n = 41): 12 without recurrence (9/12 identified by BM)
Table S3 lists all probe features appearing in at least 50% of CV biomarkers for the MIN–MAX model with Milan incorporated. The median number of features used in the CV fits was 75 with a minimum of 44 and a maximum of 144. We note that collinearity plays an important role, in that the fitting procedure tends to exclude features that are highly correlated with features already incorporated in the biomarker. It is interesting to note that the highest ranking feature in the univariate analysis (hsa-miR-194_st_MIN, Table 2) has Pearson's correlation coefficients of 0.58, 0.81 and 0.57 with features hsa-miR-125b-2-star_st_MIN, hsa-miR-122_st_MIN and hsa-miR-182_st_MIN, respectively. These are all among the top five ranked features, but the latter two appear in less than 50% of CV biomarkers.
Biomarker discovery is facilitated by the MIN–MAX method
Perhaps the most common way to handle multifocal data is to compute the average expression across samples for each patient. However, this ignores the possibility of heterogeneity in both the tumor phenotype and expression profile. In this study, it may be reasonable to assume that tumors in nonrecurrent patients are more homogeneous as they all lack the recurrence biomarker. However, recurrence likely only requires the recurrence biomarker to be present in one of a patient's tumors. The results of our CV comparison of the mean and the MIN–MAX procedures support this hypothesis (Figure S4), as the methods performed comparably with regard to specificity but the MIN–MAX procedure had much better sensitivity.
Cross validation reveals synergy between biomarker and Milan
The four ROC plots are shown in Figure 4. The sensitivity and specificity attained by Milan are superimposed. The MIN–MAX procedure is clearly able to exceed Milan in prognostic accuracy. Furthermore, this accuracy is enhanced by incorporating Milan itself in the biomarker.
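For orientation, the sensitivity and specificity of Milan alone can be worked out from the cohort counts given in the patient cohort description, treating "outside Milan" as a prediction of recurrence. This is a back-of-envelope sketch using the full 69-patient cohort; the point plotted in Figure 4 may differ slightly if it was computed on the 64 subjects retained after quality control. Variable names are illustrative.

```python
# Counts from the cohort description: 40 recurrent patients (11 within Milan)
# and 29 nonrecurrent patients (17 within Milan).
recurrent_total, recurrent_within = 40, 11
nonrecurrent_total, nonrecurrent_within = 29, 17

# "Outside Milan" is treated as a positive prediction of recurrence.
sensitivity = (recurrent_total - recurrent_within) / recurrent_total
specificity = nonrecurrent_within / nonrecurrent_total

print(f"Milan sensitivity: {sensitivity:.3f}")
print(f"Milan specificity: {specificity:.3f}")
```

These values agree with the Table 1 percentages for patients outside Milan (72.5% of recurrent and 41.4% of nonrecurrent patients).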
Figure S5 demonstrates the construction of the biomarker, indicating the improvement in the R2 value as additional features are added. The R2 value clearly rises faster when Milan is included in the model, indicating greater predictive ability of the probes when Milan information is incorporated.
Finally, CV was used to assess the optimal number of features to retain in step (2) of the biomarker generation procedure. The resulting AUC statistics are shown in Figure S6. The prognostic ability of the various markers is clearly sensitive to this parameter, but the superiority of the MIN–MAX biomarker which incorporates Milan is evident over the whole range.
There is a growing consensus that evaluation of HCC tumor biology via molecular characterization holds most promise in achieving accurate clinical risk stratification of patients (28,29). Intriguing preliminary results with various biologic metrics such as fractional allelic imbalance (30) and gene expression profiles (11–14) strongly suggest that addition of tumor biology information to current liver transplant selection criteria is possible and desirable. We are rapidly approaching the point where it will be reasonable to pursue biomarker testing in pretransplant tumor biopsy specimens to more efficiently direct our resources and improve transplant outcomes, as well as to direct other available therapeutic modalities for HCC.
MiRNAs are attractive markers as they are known master regulators of gene expression and are highly effective in classifying tissue types and tumor tissues of origin (31,32). One potential advantage to studying miRNA over mRNA in biomarker signature building is that there are only just over 1400 miRNAs compared to the over 20 000 mRNAs. Therefore, statistical analysis is inherently less noisy and tighter. Also, the small size and stability of miRNAs make them far more amenable to analysis from FFPE tissues compared to much larger and less stable mRNAs. One logistical concern is that HCC often presents as multifocal disease and that subcentimeter lesions often demonstrate radiographic characteristics that are highly suspicious for HCC (i.e. arterial enhancement, venous washout and T2 brightness on MRI). In these cases, obtaining appropriate amounts of tissue for biomarker testing may often prove to be logistically cumbersome, if not sometimes impossible. Protocols for reliable miRNA amplification from needle biopsy samples and statistical probability calculations of biomarker performance with multifocal lesions will be necessary to begin to address these issues.
Evidence already exists that the many miRNAs comprising the HCC recurrence biomarker may be important to hepatocellular carcinogenesis. MiR-194 has been shown to be expressed in hepatic epithelial cells and to suppress HCC metastasis in a murine model (33). This miRNA has also been shown to be downregulated in human HCCs that metastasize (15). MiR-125b-2* is expressed in human fetal liver cells (34) and its dysregulated expression is noted in colorectal cancer with liver metastases (35). MiR-182 expression has been shown in two independent studies comparing HCC to adjacent uninvolved liver to be significantly upregulated (16,36). All HCC recurrence miRNA studies to date have been performed in the context of hepatic resection rather than transplant, and it will be important to perform a similar analysis of the biomarker on a cohort of hepatic resection patients from our own institution to assess its generalizability.
When we look at the variation in expression of particular miRNAs in the context of individual patients, there is more probe expression variation in recurrent patients versus nonrecurrent (Figure S7). In fact, of the 847 miRNAs, over 85% show greater average within-patient variance in the recurrence group than in the nonrecurrence group. This is strongly suggestive of the distributional mixture, which would result from nonhomogeneity of genetic response among multifocal samples. This argues strongly that we should be using the MIN–MAX approach to select which of the varied expression levels for a given miRNA is driving the phenotype.
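The comparison of within-patient variation can be sketched as below. The text does not spell out exactly how the average within-patient variance was computed, so this is one plausible reading: for each miRNA, take the sample variance across each multifocal patient's nodules, average over patients within a group, and count the miRNAs for which the recurrent group's average is larger. All names and values are hypothetical.

```python
import statistics
from collections import defaultdict


def mean_within_patient_variance(samples, probe):
    """Average, over multifocal patients, of the variance of `probe`
    across that patient's nodules."""
    by_patient = defaultdict(list)
    for patient_id, expression in samples:
        by_patient[patient_id].append(expression[probe])
    variances = [
        statistics.variance(values)
        for values in by_patient.values()
        if len(values) > 1  # unifocal patients have no within-patient variance
    ]
    return statistics.mean(variances)


def fraction_more_variable(recurrent, nonrecurrent, probes):
    """Fraction of probes with larger mean within-patient variance in the
    recurrent group than in the nonrecurrent group."""
    higher = sum(
        mean_within_patient_variance(recurrent, p)
        > mean_within_patient_variance(nonrecurrent, p)
        for p in probes
    )
    return higher / len(probes)


# Toy data: one multifocal patient per group, two nodules each.
recurrent = [
    ("R1", {"miR-X": 4.0, "miR-Y": 2.0}),
    ("R1", {"miR-X": 8.0, "miR-Y": 2.2}),
]
nonrecurrent = [
    ("N1", {"miR-X": 5.0, "miR-Y": 3.0}),
    ("N1", {"miR-X": 5.4, "miR-Y": 4.0}),
]
fraction = fraction_more_variable(recurrent, nonrecurrent, ["miR-X", "miR-Y"])
```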
Previously published gene expression profiling studies of HCC using microarrays to assay mRNA changes have not rigorously addressed the multifocal issue (11–17). The rationale for this is based on the observation that metastatic tumors from an individual patient tend to have gene expression profiles that are far more similar to that patient's primary tumor compared to expression profiles from other patients’ tumors (33,34). Our analysis, however, demonstrates that miRNA expression profiles can vary significantly between multifocal tumors from the same patient and this is therefore likely the case with mRNA. Application of the MIN–MAX procedure to mRNA signature discovery in HCC may significantly reduce false discovery of genes thought to be associated with important clinical outcomes such as survival, metastasis and recurrence. We further believe that the MIN–MAX approach is generalizable and can be applied to other multifocal disease processes such as the analysis of metastatic lesions or premalignant lesions in HCC and other cancer types.
Our cohort is somewhat unique in that we experienced a high degree of HCC recurrence. We attribute this to our previously aggressive practice of transplanting patients who were outside of Milan criteria, because our overall recurrence rate of patients transplanted within Milan is 10.5% over this 12-year period. The only change in our immunosuppressive management over this period was the introduction of mycophenolate mofetil in 2001 and we were unable to detect an era effect on recurrence as a result of this change (data not shown).
Our study is limited by the relatively small number of patients studied and the lack of external validation. Expansion of the test cohort to several hundred patients, with examination of all tumor nodules in every patient, is necessary both to further assess the performance of the MIN–MAX procedure and to refine the miRNA biomarker. We are currently using qPCR to verify the 67-miRNA biomarker and to complete signature building on an additional 150 tumors from 50 patients. In addition, we have identified another 150 tumors from 50 more patients to perform external CV. External validation has been predicted to require fewer patients (35), but complete surveillance of each individual's entire tumor burden is essential to clearly define which miRNAs are truly driving the clinical phenotype of interest. Although the 67-miRNA biomarker we report here performs well in our cohort, we expect that expansion of our study will result in a more comprehensive and clinically robust biomarker. Therefore, we do not propose that the particular biomarker presented in this study be used as a clinical adjunct to Milan. Rather, we present compelling evidence that a biologic metric can be developed that could, with further study, enhance the efficiency and performance of the Milan criteria.
We demonstrate that global miRNA analysis of FFPE samples from explanted HCCs can be used to develop molecular signatures defining clinically important outcomes and we present a preliminary miRNA biomarker that distinguishes patients with and without recurrent HCC within 3 years of transplant. We have further shown that the MIN–MAX method is effective in directing appropriate probe selection when analyzing multifocal specimens. Upon refinement of this biomarker with study of a much larger cohort and external validation with additional patients, we envision utilizing this biologic metric in concert with the existing Milan criteria to more efficiently utilize resources and improve outcomes in liver transplant for patients with HCC. The biomarker may also be used to help rationally direct other HCC treatments such as chemotherapy, ablation and resection.
The authors of this manuscript have no conflicts of interest to disclose as described by the American Journal of Transplantation.