Abstract


Cancer is a genetic disease with frequent somatic DNA alterations. Studying recurrent copy number aberrations (CNAs) in human cancers would enable the elucidation of disease mechanisms and the prioritization of candidate oncogenic drivers with causal roles in oncogenesis. We have comprehensively and systematically characterized CNAs and the accompanying gene expression changes in tumors and matched nontumor liver tissues from 286 hepatocellular carcinoma (HCC) patients. Our analysis identified 29 recurrently amplified and 22 recurrently deleted regions with a high level of copy number changes. These regions harbor established oncogenes and tumor suppressors, including CCND1 (cyclin D1), MET (hepatocyte growth factor receptor), CDKN2A (cyclin-dependent kinase inhibitor 2A) and CDKN2B (cyclin-dependent kinase inhibitor 2B), as well as many other genes not previously reported to be involved in liver carcinogenesis. Pathway analysis of cis-acting genes in the amplification and deletion peaks implicates alterations of core cancer pathways, including cell-cycle, p53 signaling, phosphoinositide 3-kinase signaling, mitogen-activated protein kinase signaling, Wnt signaling, and transforming growth factor beta signaling, in a large proportion of HCC patients. We further credentialed two candidate driver genes (BCL9 and MTDH) from the recurrent focal amplification peaks and showed that they play a significant role in HCC growth and survival. Conclusion: We have demonstrated that characterizing the CNA landscape in HCC will facilitate the understanding of disease mechanisms and the identification of oncogenic drivers that may serve as potential therapeutic targets for the treatment of this devastating disease. (Hepatology 2013;58:706–717)

Abbreviations: AJCC, American Joint Committee on Cancer; BH, Benjamini-Hochberg's method; CCND1, cyclin D1; CDKN, cyclin-dependent kinase inhibitor; CGC, Cancer Gene Census; CFAs, colony formation assays; CHD1L, chromodomain helicase DNA binding protein 1-like; CIN, chromosome instability index; CNA, copy number aberration; DFS, disease-free survival; DSS, disease-specific survival; FDR, false discovery rate; FGF19, fibroblast growth factor 19; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; HBV, hepatitis B virus; HCC, hepatocellular carcinoma; HCV, hepatitis C virus; IHC, immunohistochemical; IRB, institutional review board; MAPK, mitogen-activated protein kinase; MET, hepatocyte growth factor receptor; mRNA, messenger RNA; MSigDB, the Molecular Signatures Database; MTDH, metadherin; PI3K, phosphoinositide 3-kinase; siRNA, small interfering RNA; SNP, single-nucleotide polymorphism; TGF-β, transforming growth factor beta; VEGF, vascular endothelial growth factor.

Hepatocellular carcinoma (HCC) is the fifth-most common cancer and the third-most common cause of cancer-related death worldwide. It has high prevalence in Southeast Asia because of endemic hepatitis B virus (HBV) infection and is refractory to nearly all currently available anticancer therapies.[1] Extensive studies of HCC have implicated aberrant activation of many signaling pathways involved in cellular proliferation,[2] survival,[3] differentiation,[4] and angiogenesis.[5] Although these studies have increased the understanding of HCC tumorigenesis, few studies provide reliable information on how frequently these targets and pathways are altered in HCC patients. A number of genome-wide gene expression profiling studies have been performed using clinical samples from various geographic regions across the world: These studies have highlighted specific genes and molecular pathways in the pathogenesis of HCC and have proposed molecular classifications of HCC.[5-7] To further elucidate the mechanism of hepatocarcinogenesis, it is useful to reconstruct molecular events at both the gene expression and DNA copy number levels. With the rapid development of high-density single-nucleotide polymorphism (SNP) array and array-based comparative genomic hybridization, it has become feasible to characterize CNAs involved in tumor development and progression across the entire genome. Several groups have applied these technologies to identify copy number aberrations (CNAs) in HCC and nominated putative driver genes.[5, 8-10] However, many of the previous studies were limited by the modest size of the studied cohorts, whereas others lacked a coherent dataset, including both copy number and gene expression measurements from the same set of patients, which hindered a fully integrated analysis. 
It is also useful to comprehensively characterize HCC cell line models so that putative driver genes that are driven by CNAs can be studied in preclinical models carrying the matching genetic alterations. However, a comprehensive collection of characterized HCC cell line models is still lacking.

In this study, we comprehensively and systematically analyzed the genome-wide CNAs and accompanying gene expression changes in 286 primary HCCs and 30 HCC cell lines. This allowed us to characterize the genomic landscape of HCC and to identify regions in the HCC genome that have undergone recurrent high-level focal amplifications or deletions. Pathway analysis of cis-acting genes in these CNA regions suggests that key cancer pathways are altered in a significant proportion of HCC patients. We further proposed a fully integrated approach to identify candidate oncogenic drivers from recurrent focal amplicons and credentialed two candidate drivers (BCL9 and MTDH) by demonstrating that they play a significant role in tumor growth and survival in relevant HCC cell line models.

Patients and Methods

Patient Samples and HCC Cell Lines

A total of 286 pairs of fresh frozen tumor and adjacent nontumor liver tissues containing no necrosis or hemorrhage were collected from primary HCC patients who were treated with surgical resection at Samsung Medical Center (Seoul, Korea) from July 2000 to May 2006 (Table 1 and Supporting Table 1; Supporting Materials). Informed consent was obtained from each patient included in the study. This study was approved by the institutional review board (IRB) of Samsung Medical Center (IRB approval no.: SMC 2010-04-083). A list of HCC cell lines used in this study and their sources can be found in Supporting Table 2.

Table 1. Major Demographic and Clinicopathological Characteristics of the HCC Cohort and Their Associations With the CIN Score Inferred Based on CNA Data

Characteristics                     No. of Patients  Percentage  Average CIN Score  P Value
Gender                   Male       234              81.8        0.147              0.4212(b)
                         Female     52               18.2        0.155
Age                      ≤55        169              59.1        0.156              0.0174(b)
                         >55        117              40.9        0.137
Edmondson grade          I          33               11.5        0.097              <0.0001(a)
                         II         231              80.8        0.154
                         III        22               7.7         0.170
Tumor size, cm           <5         185              64.7        0.137              <0.0001(b)
                         ≥5         101              35.3        0.170
Serum AFP, ng/mL         <20        108              39.3        0.135              0.0036(b)
                         ≥20        167              60.7        0.159
Etiology                 HBV        215              75.2        0.151              0.2410(b)
                         HCV        31               10.8        0.130
                         Nonviral   40               14.0        0.150
Cirrhosis                Absent     138              48.3        0.159              0.0091(b)
                         Present    148              51.7        0.139
Intrahepatic metastasis  Absent     223              78.0        0.141              0.0001(b)
                         Present    63               22.0        0.176
Vascular invasion        Absent     133              46.5        0.127              <0.0001(b)
                         Present    153              53.5        0.168
Child Pugh class         A          273              95.5        0.151              0.0012(a)
                         B          7                2.4         0.145
                         C          6                2.1         0.062
AJCC T stage             I          120              42.0        0.128              <0.0001(a)
                         II         119              41.6        0.153
                         III        41               14.3        0.184
                         IV         6                2.1         0.227
BCLC stage               0-A        160              56.0        0.138              0.2090(a)
                         B          102              35.7        0.164
                         C-D        24               8.4         0.156

(a) Linear trend test.
(b) One-way analysis of variance test.
Abbreviations: AFP, alpha-fetoprotein; BCLC, Barcelona Clinic Liver cancer classification.

Table 2. Top Amplification and Deletion Peaks in HCC

CNA Type  Coordinates                 Cytoband  Q Value   CN(a)  Freq(b) (%)  Ng(c)  Cancer Genes
Amp       chr1:144972831-145840949    1q21.1    7.67E-06  3.34   8.7          6      BCL9
Amp       chr1:148404398-149128019    1q21.2    1.42E-07  3.41   12.9         18     ARNT
Amp       chr1:160642541-160848810    1q23.3    0.00063   3.38   13.6         3
Amp       chr1:169589071-169793533    1q24.3    0.0033    3.39   13.3         1
Amp       chr1:172028389-173247133    1q25.1    0.0001    3.38   13.6         8
Amp       chr1:174153135-174560923    1q25.2    0.00177   3.38   9.4          1
Amp       chr1:176873122-177726747    1q25.2    0.00919   3.34   12.9         7      ABL2
Amp       chr1:192720574-199110962    1q31.3    0.00142   3.39   17.5         26
Amp       chr1:201955938-202299197    1q32.1    1.46E-05  3.40   9.1          5
Amp       chr1:218208694-218420253    1q41      0.00095   3.42   11.5         5
Amp       chr1:222306937-222598181    1q42.11   0.03031   3.44   8.7          3
Amp       chr1:240117360-240262550    1q43      0.00056   3.43   11.9         2
Amp       chr7:104735313-104993152    7q22.2    0.00034   3.24   4.5          3
Amp       chr7:115453170-117095648    7q31.2    0.03354   3.70   4.5          10     MET
Amp       chr8:43185346-43718880      8p11.1    0.02048   3.67   4.9          1
Amp       chr8:70950679-71167540      8q13.3    1.02E-07  3.40   13.6         1
Amp       chr8:81240525-81578110      8q21.13   0.00112   3.51   14.0         2
Amp       chr8:85232762-86743439      8q21.2    0.00349   3.61   13.6         8
Amp       chr8:95531123-95955799      8q22.1    0.00829   3.38   14.7         5
Amp       chr8:98748412-98833836      8q22.1    0.01252   3.48   12.9         1      MTDH
Amp       chr8:100952268-102199087    8q22.2    0.00053   3.58   12.6         11     COX6C
Amp       chr8:120914233-121111552    8q24.12   3.70E-06  3.43   16.4         3
Amp       chr8:124165409-124517984    8q24.13   0.01129   3.45   14.7         7
Amp       chr8:144743066-144752919    8q24.3    7.33E-13  4.00   10.1         2
Amp       chr11:68927460-69253688     11q13.2   6.17E-14  4.26   4.9          3      CCND1, FGF19
Amp       chr13:94729994-94859252     13q32.1   2.14E-09  3.55   5.9          1
Amp       chrX:118773235-118985176    Xq24      0.00211   3.49   8.0          7
Amp       chrX:122664307-122921771    Xq25      4.21E-16  3.62   10.8         2
Amp       chrX:154419146-154626451    Xq28      0.01031   3.67   5.2          1
Del       chr1:611233-6084993         1p36.33   7.23E-08  1.22   8.0          75     TNFRSF14
Del       chr1:611233-64445113        1p36.11   0.0466    1.21   7.0          720    CDKN2C, ARID1A
Del       chr4:79750406-106290512     4q22.1    3.85E-06  1.22   19.6         104
Del       chr4:140435863-140597962    4q31.1    5.37E-05  1.16   15.0         1
Del       chr4:185374591-185785944    4q35.1    5.05E-20  1.12   14.7         1
Del       chr6:149843873-150182502    6q25.1    0.00013   1.16   7.3          6
Del       chr6:132054580-170899992    6q26      0.01268   1.21   6.3          180    TNFAIP3
Del       chr8:1-35529086             8p21.3    0.00013   1.21   15.7         187    WRN
Del       chr9:19035503-19284396      9p22.1    0.00113   1.20   8.4          3
Del       chr9:21855960-22437906      9p21.3    9.53E-37  1.05   12.6         2      CDKN2A, CDKN2B
Del       chr10:89119360-90477498     10q23.31  0.00202   1.14   4.9          7      PTEN
Del       chr12:18783247-20413518     12p12.3   0.00229   1.17   4.5          2
Del       chr13:20929488-89683583     13q13.1   0.02859   1.20   7.0          198    BRCA2
Del       chr13:46367970-56646435     13q14.3   8.72E-18  1.17   9.1          44     RB1
Del       chr16:269213-357757         16p13.3   2.81E-12  1.17   8.7          3
Del       chr16:57109851-57257852     16q21     0.03055   1.22   8.7          1
Del       chr16:76623474-78186623     16q23.1   0.0021    1.22   10.1         1
Del       chr16:86666729-87147695     16q24.2   0.00016   1.24   8.7          1
Del       chr17:295512-583057         17p13.3   0.03639   1.18   4.5          1
Del       chr17:641813-847151         17p13.3   0.00298   1.26   4.2          1
Del       chr17:10214726-10473769     17p13.1   4.07E-10  1.21   11.2         4
Del       chr18:30725121-76117153     18q21.31  0.01391   1.17   4.9          145    SMAD4

Shown are peaks with GISTIC2 residual Q values ≤0.05 and peak frequency ≥4%. The Cancer Genes column lists representative cancer genes in each CNA peak selected based on the CGC (http://www.sanger.ac.uk/genetics/CGP/Census/). BCL9 and MTDH were identified as putative HCC drivers in this study. Full sets of genes under each CNA peak can be found in Supporting Table 3.
(a) Estimated copy number.
(b) High peak frequency using cutoffs of 3 and 1.3 for amplifications and deletions, respectively.
(c) Total number of genes in the peak.

SNP Genotyping Array and Gene Expression Array Data

Gene expression profiling was performed at Expression Analysis (Durham, NC) on Illumina Human HT-12 v4 BeadChips, according to the array manufacturer's protocol. Data were processed using an in-house pipeline to derive gene-summarized expression values (Supporting Materials). Genotyping was performed on the Human Omni1-Quad BeadChip by Illumina FastTrack Services (Illumina, San Diego, CA), where samples were processed according to the manufacturers' instructions. Raw data were processed using an in-house pipeline to obtain copy number segments and gene-summarized copy number estimates (Supporting Materials).

Copy Number Data Analysis

In primary HCC samples, copy number gain and loss cutoffs were selected to be 2.3 and 1.7, respectively, based on an assessment of replicate samples from the same SNP arrays. Copy numbers ≥3 and ≤1.3 were considered high-level amplifications and deletions, respectively, which represent conservative thresholds as primary tumor samples are typically a mixture of tumor cells and surrounding or infiltrating stromal cells. In HCC cell lines, we used a minimum copy number cutoff of 2.7 to select models with amplifications and treated models with copy numbers >1.7 and <2.3 as copy number neutral. GISTIC2 analysis[11] was performed on segmented copy number data using default parameters.
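The cutoffs above can be sketched as a small classifier. This is a minimal illustration, not the study's in-house pipeline: the function names and labels are hypothetical, and treating the 2.3 and 1.7 gain/loss cutoffs as inclusive is an assumption, because the text does not state how boundary values were handled.

```python
def classify_primary_cn(cn: float) -> str:
    """Label a copy number estimate from a primary HCC sample.

    Thresholds follow the text: >=3 high-level amplification, <=1.3 high-level
    deletion, 2.3 gain cutoff, 1.7 loss cutoff. Inclusive comparison at 2.3
    and 1.7 is an assumption made for this sketch.
    """
    if cn >= 3.0:
        return "high_amp"
    if cn >= 2.3:
        return "gain"
    if cn <= 1.3:
        return "high_del"
    if cn <= 1.7:
        return "loss"
    return "neutral"


def classify_cell_line_cn(cn: float) -> str:
    """Label a copy number estimate from an HCC cell line.

    The text defines only two classes for cell lines: amplified (minimum
    copy number 2.7) and neutral (>1.7 and <2.3). Everything else is
    returned as "other", a hypothetical catch-all label.
    """
    if cn >= 2.7:
        return "amplified"
    if 1.7 < cn < 2.3:
        return "neutral"
    return "other"
```

For example, the estimated copy numbers reported in Table 2 for the 11q13.2 amplification peak (4.26) and the 9p21.3 deletion peak (1.05) fall into the high-level classes under these rules.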

Chromosome Instability Index Score

We devised a chromosome instability index (CIN) score to measure the degree of CNAs across the entire genome of a tumor, taking into account both the total chromosomal regions that are altered in a tumor and the amplitude of these alterations. Specifically, for a tumor genome segmented into L segments, where l_i and a_i are the length and mean value (as log2 intensity ratios between tumors and matched normal samples) of segment i, the CIN score is defined as shown in Equation (1):

  • display math(1)
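As an illustration only of how segment length and amplitude can be combined into a single number, the sketch below computes a length-weighted mean of absolute log2 ratios. This particular functional form is an assumption chosen for the example and should not be read as the authors' Equation (1); the function name `cin_score` and the tuple layout are likewise hypothetical.

```python
def cin_score(segments):
    """Length-weighted mean absolute log2 ratio over all segments.

    `segments` is a sequence of (length, mean_log2_ratio) pairs, matching
    the l_i and a_i of the text. The weighting scheme is an assumed,
    illustrative form and not the published definition.
    """
    segments = list(segments)
    total_length = sum(length for length, _ in segments)
    if total_length <= 0:
        raise ValueError("total segment length must be positive")
    weighted = sum(length * abs(log_ratio) for length, log_ratio in segments)
    return weighted / total_length
```

Under this assumed form, a genome with no altered segments scores 0, and the score grows with both the fraction of the genome altered and the size of the log2 ratio shifts.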
Statistical Analysis

Associations with disease-specific survival (DSS) and disease-free survival (DFS) were assessed using Cox's proportional hazards regression model (see Supporting Materials for definitions of DSS and DFS). P values were corrected for multiple testing (of all genes on the microarray) using Benjamini-Hochberg's (BH) method[12] to calculate the false discovery rate (FDR). To assess the ability of copy number traits to predict patient outcomes, we compared the number of copy number traits associated with clinical outcomes to the number obtained from a permuted dataset in which the sample labels were randomly shuffled for each trait independently. cis-correlations between a gene's copy number and its own messenger RNA (mRNA) expression across tumors were calculated using Pearson's correlation. P values associated with the resulting correlation coefficients were corrected for multiple hypothesis testing using the BH method. The null correlation distribution was obtained by shuffling the sample labels between each copy number and expression vector independently for all genes.
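The BH correction mentioned above can be sketched in a few lines. This is a generic textbook implementation of the step-up procedure written for illustration, assuming a plain list of P values; it is not the analysis code used in the study, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted P values in the same order as the input.

    Each P value is scaled by m / rank (rank 1 = smallest), then a running
    minimum from the largest rank downward enforces monotonicity.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A gene would then be called significant at FDR ≤0.05 when its adjusted value is at most 0.05.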

Pathway Analysis

Genes with expression changes driven by somatic CNAs were selected from GISTIC2 amplification or deletion peaks with significant cis-correlation (FDR ≤0.05). We used the canonical pathway database from the Molecular Signatures Database (MSigDB),[13] excluding gene sets with fewer than 10 or more than 500 members. Overrepresentation of selected genes among these pathways was assessed using Fisher's exact test. The FDR was calculated based on 100 permutations where random gene sets of the same size were tested. The final top 17 pathways were selected based on (1) enrichment FDR ≤0.05 and (2) at least 30% of HCCs in the studied cohort having at least one gene in the pathway altered by somatic CNAs.
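The overrepresentation test described above can be sketched as a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The function and argument names below are hypothetical, the one-sided direction is an assumption, and the permutation-based FDR step from the text is not reproduced here.

```python
from math import comb


def enrichment_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn at random
    from n_universe genes, of which n_pathway belong to the pathway."""
    max_overlap = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


def fold_enrichment(n_universe, n_pathway, n_selected, n_overlap):
    """Observed overlap divided by the overlap expected by chance."""
    expected = n_selected * n_pathway / n_universe
    return n_overlap / expected
```

With a toy universe of 10 genes, a 4-gene pathway, and 3 selected genes of which 2 fall in the pathway, the tail probability is 40/120 and the fold enrichment is 2/1.2.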

Quantitative Reverse-Transcription Polymerase Chain Reaction

Total RNA was extracted from cell lines using the RNeasy Plus Mini Kit (Qiagen, Valencia, CA). The Taqman gene expression assay was performed using the TaqMan RNA-to-CT 1-step Kit protocol (catalog no.: 4392938; Applied Biosystems, Foster City, CA), according to the manufacturer's instructions. Data were derived from three independent experiments. Data analysis was performed using Stratagene's software (Stratagene, La Jolla, CA), where threshold cycle values were unlogged and normalized by the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) reference. Knockdown percentage was calculated as percent reduction in average signal from siBCL9 or siMTDH cells, relative to siControl cells (set to 100%), in each assay.

Cell Proliferation Assay

The cell proliferation assay was performed using the CyQuant Direct Cell Proliferation assay (Life Technologies Corporation, Carlsbad, CA), according to the manufacturer's protocol. Data were derived from five independent experiments. Percent inhibition was calculated as percent reduction in average signal from siBCL9 or siMTDH cells, relative to siControl cells (set to 100%), in each assay. P values between siControl and siBCL9 or siMTDH samples were calculated using a two-sample t test.
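The knockdown percentage and percent inhibition defined in the assay descriptions above share the same arithmetic, sketched here. The function names are hypothetical, and the 2^-ΔCt conversion assumes ideal amplification efficiency, which the text does not state; the vendor software may apply a different normalization.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Convert threshold cycles to a linear-scale value normalized to a
    reference gene such as GAPDH (assumes a doubling per cycle)."""
    return 2.0 ** -(ct_target - ct_reference)


def percent_reduction(treated, control):
    """Percent reduction of the mean treated signal relative to the mean
    control signal, with the control mean set to 100%."""
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control mean must be nonzero")
    return 100.0 * (1.0 - mean(treated) / control_mean)
```

The same `percent_reduction` call applies to qRT-PCR signals, CyQuant readings, or colony counts, with siControl replicates passed as `control`.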

Soft Agar Assay

Cells expressing small interfering RNAs (siRNAs) targeting BCL9, MTDH, or control were suspended in a top layer of RPMI growth media and 0.35% Ultrapure LMP agar (Life Technologies) and plated on a bottom layer of growth media and 0.6% LMP agar in a 96-well plate. Soft agar colonies were stained with 0.5 μM of calcein-AM solution (Life Technologies) and counted 5-14 days after plating with an Acumen eX3 multiplate reader (TTP LabTech Ltd., Melbourn, UK). Data were derived from five independent experiments. Percent inhibition was defined as percent reduction in average number of colonies formed in siBCL9 or siMTDH cells, relative to siControl cells (set to 100%), in each assay. P values between siControl and siBCL9 or siMTDH samples were calculated using a two-sample t test.

Results

Genomic Landscape of HCC

To characterize the genomic landscape of HCC, we compiled a collection of snap-frozen tumor and adjacent nontumor liver tissues from 286 patients who were treated with surgical resection (Table 1). Both RNA and DNA were isolated from all samples and profiled on the Illumina Human HT-12 v4 BeadChips and Human Omni1-Quad SNP genotyping arrays (Illumina), respectively.

Based on the SNP genotyping array data, we derived the somatic copy number profiles of the 286 HCCs using their matched nontumor liver tissue as references. On average, there are 200 somatic copy number gain events and 247 somatic copy number loss events per HCC, accounting for 12.0% and 11.3% of the genome, respectively. A genome-wide view of the segmented copy numbers revealed that most chromosome arms have undergone large-scale copy number gains or losses, with frequent gains observed on 1q, 6p, 7p, 7q, 8q, 13q, and 17q and frequent losses on 1p, 4q, 8p, 9p, 9q, 13p, 16p, and 16q (Fig. 1A). We also devised a CIN score, which is a single metric that summarizes the extent of CNAs in individual tumors (see Patients and Methods). We found that the CIN scores were positively associated with various features of tumor progression, such as American Joint Committee on Cancer (AJCC) stage, Edmondson grade, and tumor size, in agreement with our understanding of somatic CNAs as a cumulative process as a tumor advances (Table 1). On the other hand, the CIN scores were negatively associated with patients' age, the Child-Pugh score, and cirrhosis, which reflect overall liver function and pathological state of the non-HCC liver (Table 1). In addition to clinical HCC samples, we also profiled 30 HCC cell lines on the same gene expression and SNP genotyping array platforms. Overall, the spectrum of CNAs in HCC cell lines recapitulates primary HCCs (Fig. 1A).

Figure 1. CNA landscape of primary HCCs and HCC cell lines. (A) Heatmap showing the CNA pattern of HCC genome. Columns represent markers along chromosomes and are sorted according to their genomic coordinates; rows represent tumors separated into primary HCCs (upper) and HCC cell lines (lower). Copy number changes of the 22 autosomes are shown in shades of red for copy number gains and shades of blue for copy number losses. Histograms on top of the heatmap show frequency of copy number gains (above the horizontal axis and in red) and losses (below the horizontal axis and in blue) in the primary HCC population. (B) Focal amplification and deletion peaks identified by GISTIC2 in primary HCCs. For both plots, markers on SNP arrays are plotted on the vertical axis and sorted by their genomic coordinates; horizontal axes show the statistical significance of each peak as Q values determined by GISTIC2. Vertical green lines represent the default GISTIC2 Q value cutoff of 0.25. Tick marks indicate the positions of high confidence peaks reported in Table 2, with number of genes in each peak shown in brackets and representative known cancer genes shown in parentheses.

To assess the extent to which somatic CNAs in HCC drive downstream transcriptional programs, we calculated the correlation between a gene's somatic copy number and its mRNA expression in cis across our patient cohort. Overall, there were 3,152 genes for which at least 10% of their expression variation (i.e., correlation coefficient ≥0.316) can be explained by their own copy number changes, whereas by chance only one gene was expected at the same level of correlation (FDR = 3.17 × 10⁻⁴) (Supporting Fig. 1A). This is consistent with previous studies,[14] and suggests that somatic CNA significantly contributes to the expression landscape in HCC. In addition, somatic copy numbers of 661 and 206 genes were also significantly associated with DSS and DFS in our cohort, respectively (P < 1 × 10⁻⁴), whereas by chance one could expect only two and one genes, respectively, at the same P value cutoff (Supporting Fig. 1B). Hence, somatic CNAs in HCC are clinically relevant and may provide novel prognostic markers. We also observed a nonrandom distribution of CNA-to-CNA correlations where unlinked loci were frequently correlated to each other (Supporting Fig. 2). As expected, adjacent loci were highly correlated, whereas at a higher level some chromosome arms became either unlinked (e.g., 6p versus 6q and 17p versus 17q) or anticorrelated (e.g., 1p versus 1q and 8p versus 8q). In addition, numerous correlations between unlinked loci were observed, suggesting coselection of these genomic regions (e.g., 1p versus 16p, 1q versus 4q, and 5q versus 19q) as previously reported.[14]
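The cis-correlation cutoff follows from the 10% variance-explained criterion: r² ≥ 0.10 corresponds to r ≥ √0.10 ≈ 0.316, and one expected gene among 3,152 observed gives 1/3152 ≈ 3.17 × 10⁻⁴. A minimal sketch of the correlation and a label-shuffling null is given below; the function names, the fixed seed, and the per-gene permutation count are assumptions for illustration rather than details of the study's pipeline.

```python
import math
import random

# Correlation needed for copy number to explain 10% of expression variance.
R_CUTOFF = math.sqrt(0.10)


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def null_correlations(copy_number, expression, n_perm, seed=0):
    """Correlations after shuffling sample labels of the expression vector,
    approximating the chance-level distribution for one gene."""
    rng = random.Random(seed)
    shuffled = list(expression)
    results = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        results.append(pearson_r(copy_number, shuffled))
    return results
```

Comparing the number of genes with observed r ≥ `R_CUTOFF` against the number reaching that level in the shuffled data gives an empirical false discovery estimate of the kind quoted above.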

Figure 2. Validation of BCL9 as an oncogenic driver in HCC. (A and B) Relationship between BCL9 somatic copy number and mRNA expression (A), and between BCL9 mRNA and protein expression (B), in primary HCCs. P value for the mRNA-protein association was derived from a linear trend test. (C and D) qRT-PCR data showing knockdown efficiency of siRNA targeting BCL9 in all tested cell line models for CyQuant proliferation assays (C) and CFAs (D). BCL9 expression levels were normalized by the GAPDH reference and set to be 1 in siControl experiments. Data were derived from three independent experiments and plotted as mean ± standard deviation. (E and F) Quantification of proliferation by CyQuant (Life Technologies Corporation, Carlsbad, CA) proliferation assays (E) and CFAs (F) in BCL9 and control siRNA-transfected cells. Percentage numbers shown above the bars for each cell line denote the average percent of inhibition in BCL9 siRNA-transfected cells, relative to siControl transfected cells. Asterisk indicates that the P value between siBCL9 and siControl is ≤0.05. The two HCC models with BCL9 amplification and the two unamplified control models were labeled by square brackets on the x axis.

Identification of Candidate Oncogenic Drivers in HCC

Although the overall CNA pattern is broadly consistent with the literature on HCC,[5, 9, 10, 14] the size and quality of our dataset should provide greater power to accurately localize and identify both large-scale and focal chromosomal alterations. To identify regions of copy number changes that may be responsible for driving tumorigenesis, we applied the GISTIC2 algorithm,[11] which incorporates both amplitude and frequency of CNAs to determine their statistical significance. Amplification or deletion peaks identified by GISTIC2 represent recurrent overlapping CNAs among multiple tumors, thus providing a finer resolution for mapping putative oncogenes and tumor-suppressor genes. Our GISTIC2 analysis identified 146 focal events, including 99 amplification peaks and 47 deletion peaks (Fig. 1B; Supporting Table 3). The median size of amplification peaks is 0.24 Mb (ranging from 1.5 kb to 11.6 Mb), containing an average of ∼5 genes per peak (excluding peaks that contain no genes, or “gene-less” hereafter). The median size of deletion peaks is 2.8 Mb (ranging from 46 kb to 122 Mb), containing an average of ∼100 genes per peak. We found that amplification peaks were significantly smaller than deletion peaks (P = 2.6 × 10⁻⁷; Supporting Fig. 3), and that genes under the amplification peaks tended to have stronger cis-correlation than those under deletion peaks, whereas both showed stronger cis-correlation compared to genes not located within any peak (Supporting Fig. 3). These observations support the disease relevance of the CNA peaks and are consistent with the assumption that oncogene activation is more locus specific than tumor-suppressor inactivation in cancer. We also thoroughly examined the association of GISTIC2 peaks to clinical and outcome variables (summarized in Supporting Table 4).

Table 3. Top Pathways Overrepresented Among cis-Acting Genes in CNA Peaks

Canonical Pathways              Fold Enrichment(a)  P Value   FDR       %HCCs(b)
Pathways in cancer              1.90                0.00015   0.00067   50.3
Cell cycle                      3.29                0         0         44.8
Ubiquitin-mediated proteolysis  3.29                0         0         39.5
Wnt signaling pathway           2.01                0.0038    0.0094    38.1
TGF-β signaling pathway         2.35                0.0048    0.012     37.1
Insulin signaling pathway       3.07                0         0         36.4
p53 pathway                     3.17                0.00019   0.00071   36.4
Oxidative phosphorylation       2.87                0.000005  0.000047  35.7
Basal transcription factors     4.67                0.000033  0.00025   35.7
ChREBP2 pathway                 3.44                0.00097   0.0031    34.6
eIF4 pathway                    6.31                0.000005  0.000046  32.9
Endocytosis                     3.13                0         0         32.5
Neurotrophin signaling pathway  2.40                0.00048   0.0018    32.2
MAPK signaling pathway          1.70                0.0051    0.012     31.5
Lysosome                        3.61                0         0         31.1
Apoptosis                       2.48                0.0021    0.0053    30.4
PI3K pathway                    3.57                0.0028    0.00672   30.4

(a) Calculated against all human genes in the MSigDB.
(b) Percent of HCC patients with at least one gene altered in the pathway.
Abbreviations: ChREBP2, carbohydrate response element-binding protein 2; eIF4, eukaryotic translation initiation factor 4.

Figure 3. Validation of MTDH as an oncogenic driver in HCC. (A and B) Relationship between MTDH somatic copy number and mRNA expression (A), and between MTDH mRNA and protein expression (B), in primary HCCs. P value for the mRNA-protein association was derived from a linear trend test. (C and D) qRT-PCR data showing knockdown efficiency of siRNA targeting MTDH in all tested cell line models for CyQuant (Life Technologies Corporation, Carlsbad, CA) proliferation assays (C) and CFAs (D). MTDH expression levels were normalized by the GAPDH reference and set to be 1 in siControl experiments. Data were derived from three independent experiments and plotted as mean ± standard deviation. (E and F) Quantification of proliferation by CyQuant proliferation assays (E) and CFAs (F) in MTDH and control siRNA transfected cells. Percentage numbers shown above the bars for each cell line denote the average percent of inhibition in MTDH siRNA-transfected cells, relative to siControl transfected cells. Asterisk indicates that the P value between siMTDH and siControl is ≤0.05. The two HCC models with MTDH amplification and the two unamplified control models were labeled by square brackets on the x axis.

We next focused on higher confidence peaks with a residual Q value (by GISTIC2) ≤0.05 and a high-level alteration frequency of at least 4% in our cohort, resulting in 29 amplification peaks and 22 deletion peaks (excluding gene-less peaks) (Table 2). The most highly amplified peak is located at chromosome 11q13.2 and contains three genes, including cyclin D1 (CCND1) and fibroblast growth factor 19 (FGF19), both of which have recently been reported to be amplified in HCC and validated as bona fide HCC drivers.[9] Hepatocyte growth factor receptor (MET) is one of 10 genes in the amplification peak located at 7q31.2, encodes the receptor for hepatocyte growth factor, and has been implicated as an oncogene in several cancer types, including HCC.[2] Many clinical compounds are available that specifically inhibit MET, thus providing an actionable path forward for testing MET as a potential target in HCC. Another gene of interest is chromodomain helicase DNA binding protein 1-like (CHD1L), which has been shown to interact with poly(ADP-ribose) and is involved in chromatin relaxation subsequent to DNA damage. Recent studies[15] have established its oncogenic role in HCC both in vitro and in vivo. Overall, we found a number of genes in the Cancer Gene Census (CGC)[16] under the top amplification peaks (those not reviewed here include BCL9, ARNT, ABL2, REL, XPO1, COX6C, ATF1, and BCL11B). Consistent with previous findings in HCC, the most frequently deleted peak is located at chromosome 9p21.3 and encompasses cyclin-dependent kinase inhibitor 2A and 2B (CDKN2A and CDKN2B, respectively), two well-documented tumor-suppressor genes that play a regulatory role in the CDK4/6 and p53 pathways in cell-cycle G1 progression. Other well-known tumor suppressor genes located within the top deletion peaks include PTEN, RB1, BRCA2, and SMAD4.

In addition to these well-known cancer genes, which recapitulated important drivers in HCC, our analysis also revealed other chromosomal regions that have undergone recurrent CNAs in HCC, affecting a greater number of genes not previously known to be involved in HCC. For example, seven additional amplification peaks were identified, each containing a single gene in the peak. These include TMLHE, A26A1, ABCC4, MTDH, PRDM14, BAT2D1, and RFWD2, which may be worth testing as potential drivers in HCC. Further studies are necessary to determine the function of these genes to understand their roles in hepatocarcinogenesis and identify potential therapeutic targets for HCC.

CNAs Affect Key Cancer Pathways in HCC

Another approach to gain insight into these candidate driver genes is to organize them into molecular pathways and cellular processes and search for patterns of pathway alterations. In addition to placing the candidate CNA drivers into a mechanistic context, this approach can also identify other genes on the altered pathway for which therapeutic options may be available. However, one challenge of CNA-based driver gene discovery is that passenger genes are often coamplified or codeleted in the same regions as the true driver genes, even after applying algorithms such as GISTIC2, which attempts to pinpoint the exact location of drivers by examining the minimal overlapping regions across a large tumor population. To alleviate this potential contamination from passenger genes, we focused on genes under GISTIC2 peaks with significant cis-correlation to their own mRNA (i.e., the so-called cis-acting genes). Our analysis showed that cell cycle was the most enriched pathway affected by somatic CNA involving cis-acting genes, such as CCND1, CDC16/23/25C, and CDKN2A/2B, together affecting 44.8% of HCCs in our study cohort (Table 3 and Supporting Table 5). The KEGG “Pathways in Cancer” was altered more frequently in our cohort than any other pathway, affecting more than half (50.3%) of the tumors, underscoring the broad-spectrum effect of somatic CNAs in targeting multiple key pathways in cancer simultaneously. More specifically, we also identified individual cancer-related molecular pathways that were significantly overrepresented among cis-acting genes driven by somatic CNAs, including Wnt signaling, transforming growth factor beta (TGF-β) signaling, the TP53 pathway, mitogen-activated protein kinase (MAPK) signaling, and the phosphoinositide 3-kinase (PI3K) pathway, many of which have established roles in HCC and therapeutic implications that may influence drug discovery and development. A detailed view of frequent somatic CNAs in critical signaling pathways identified in our HCC cohort is summarized in Supporting Fig. 4. Taken together, these results provided new insights into HCC carcinogenesis and prompted us to search for novel driver genes and potential therapeutic targets in these somatic CNA regions.

Validation of Candidate Oncogenic Drivers

To generate testable hypotheses that could be followed up experimentally in appropriate model systems, we focused on cis-acting candidate driver genes (i.e., with positive cis-correlation and an FDR ≤0.05) that are in a highly amplified peak with ≥4% frequency and ≤10 genes in the peak. We further filtered the list to those genes with ≥2-fold overexpression in the amplified tumors, compared to adjacent nontumor liver tissues, and with at least two HCC cell lines carrying the same gene amplification. Of the 14 candidate drivers from seven amplicons (Supporting Table 6), some were well-established oncogenic drivers in HCC, including CCND1, FGF19, and CHD1L.[9, 15] We were able to perform functional testing on two additional genes (BCL9 and MTDH), based on reagent availability and previous knowledge of their involvement in cancer. To test the hypothesis that HCCs with focal amplification of the candidate driver are more dependent on the driver for growth and survival, compared to HCCs without the gene amplification, we selected four HCC cell lines for each candidate driver to perform target knockdown using RNA interference: two with amplification of the target and two that were copy number neutral.
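
The selection criteria above form a simple filter cascade, which can be sketched in Python as follows. The thresholds are the ones stated in the text; the record layout, field names, and the four example genes are hypothetical and do not correspond to entries in Supporting Table 6.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str
    cis_corr: float            # copy number vs. mRNA correlation
    fdr: float                 # FDR of the cis-correlation
    high_amp_freq: float       # fraction of tumors with high-level amplification of the peak
    genes_in_peak: int         # number of genes under the peak
    fold_vs_nontumor: float    # expression in amplified tumors / adjacent nontumor liver
    amplified_cell_lines: int  # HCC cell lines carrying the same amplification


def passes_filters(c: Candidate) -> bool:
    """Apply the criteria from the text, one condition per line."""
    return (
        c.cis_corr > 0                   # positive cis-correlation
        and c.fdr <= 0.05                # FDR <= 0.05
        and c.high_amp_freq >= 0.04      # highly amplified peak in >= 4% of tumors
        and c.genes_in_peak <= 10        # focal peak with <= 10 genes
        and c.fold_vs_nontumor >= 2.0    # >= 2-fold overexpression
        and c.amplified_cell_lines >= 2  # at least two amplified cell lines
    )


# Hypothetical records; values are invented to exercise each filter.
candidates = [
    Candidate("GENE_A", 0.6, 0.001, 0.08, 3, 2.5, 2),   # passes every filter
    Candidate("GENE_B", 0.5, 0.010, 0.03, 5, 3.0, 3),   # peak too infrequent
    Candidate("GENE_C", 0.4, 0.020, 0.10, 25, 2.2, 2),  # peak too broad
    Candidate("GENE_D", 0.7, 0.001, 0.06, 4, 1.4, 4),   # overexpression too weak
]

selected = [c.gene for c in candidates if passes_filters(c)]
print(selected)
```

Keeping each criterion on its own line makes it easy to relax a single threshold and see how the candidate list changes.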

BCL9 encodes B-cell CLL/lymphoma 9 and is involved in the Wnt/β-catenin signaling pathway by mediating the recruitment of pygopus to the nuclear β-catenin/TCF complex.[17] Although a t(1;14)(q21;q32) translocation involving BCL9 and IGL has been found in B-cell acute lymphoblastic leukemia,[18] neither BCL9 translocation nor gene amplification has been reported in HCC. In our HCC cohort, BCL9 was located in the amplification peak at 1q21.1, which is highly amplified in 8.7% of HCCs (Table 2 and Supporting Table 3). There was a significant correlation between its somatic copy number and gene expression in primary HCCs (Fig. 2A), and protein expression measured by immunohistochemical (IHC) staining (Supporting Materials; Supporting Fig. 5A) correlated well with mRNA expression (Fig. 2B; Supporting Table 7). Transient transfection of siRNA SMARTpool for BCL9 significantly reduced gene expression of BCL9 in all four cell lines tested (Fig. 2C,D) and significantly decreased cell growth and survival in both proliferation assays (Fig. 2E) and colony formation assays (CFAs) (Fig. 2F) in MHCC97H and MHCC97L, the two cell lines with BCL9 gene amplification (Supporting Fig. 6). By contrast, siRNA-mediated inhibition of BCL9 gene expression had minimal effect on the SK-HEP-1 cell line, which is copy number neutral for BCL9, although the other BCL9 copy-number–neutral cell line (HUH6) showed significant growth inhibition upon BCL9 knockdown, suggesting that mechanisms other than BCL9 amplification may confer dependence on BCL9 expression.

Our analysis also identified a peak at 8q22.1 containing a single gene (MTDH), which encodes metadherin. MTDH has been implicated as an oncogene in a number of cancer types, including HCC.[19] However, previous work in HCC has not yet established the dependency of MTDH-driven tumorigenesis on MTDH focal amplification, especially in relevant preclinical models that harbor the MTDH amplification. In our study, MTDH was highly amplified in 12.9% of HCCs (Table 2 and Supporting Table 3). There was a significant cis-correlation between somatic copy number and mRNA expression of MTDH in primary HCCs (Fig. 3A), and protein expression by IHC (Supporting Fig. 5B) correlated well with mRNA expression (Fig. 3B; Supporting Table 8). We further identified two HCC models (MHCC97H and SNU-398) with amplification of the MTDH locus (Supporting Fig. 6). Transient transfection of siRNA SMARTpool for MTDH significantly reduced the gene expression levels of MTDH in all four cell lines tested (Fig. 3C,D). In the two MTDH-amplified HCC models, siRNA-mediated inhibition of MTDH gene expression significantly decreased cell growth and survival in both proliferation assays (Fig. 3E) and CFAs (Fig. 3F), whereas knockdown had a less-prominent effect on the two MTDH copy-number–neutral lines (L-02 and SMMC-7721).

Discussion

Our study represents one of the most comprehensive characterizations of the genomic landscape in a large primary HCC cohort and models. In addition to revealing the overall CNA patterns in primary HCC patients, we identified and characterized focal amplification and deletion peaks and prioritized potential oncogenic drivers and molecular pathways that may be implicated in hepatocarcinogenesis. Although our findings are broadly consistent with the HCC literature, our sample size and resolution provide increased power to accurately identify and localize both large-scale and focal chromosomal alterations. Many of the CNA peaks from our analysis contain well-established genes known to be implicated in HCC or other cancer types. For example, genes contained in the most highly amplified peak (CCND1 and FGF19) and in the most frequently deleted peak (CDKN2A and CDKN2B) have been reported and validated as oncogenic drivers and tumor suppressors in HCC, respectively, supporting the validity of our data and analysis pipeline.

Our approach extends previous literature reports that interrogated both somatic CNAs and gene expression changes in HCC in two ways. First, our dataset included both somatic copy number and gene expression data from the same set of primary HCC patients, allowing us to fully integrate the two data types when prioritizing driver genes by requiring significant cis-correlation and overexpression of the candidate drivers in the specific subset of tumors carrying the CNAs. Second, our approach selected appropriate preclinical models for testing the candidate driver genes, including both cell lines with the gene amplification to assess activity, as well as cell lines without the amplification to establish differential response to target knockdown between the models, in order to gain confidence that the oncogenic effect is truly CNA driven.

We also noticed some differences between our study and previous reports. For example, using the Affymetrix SNP6 arrays on 58 HCC tumor and normal pairs, Jia et al.[8] identified a putative oncogene, HEY1, which was not identified in our analysis. Although HEY1 was amplified in ∼13.7% of HCCs in our cohort, it was not assigned by GISTIC2 to any amplification peak; however, our analysis did identify the tumor suppressor, TRIM35, and another putative oncogene, SNRPE, that were originally reported in the Jia et al. study. Chiang et al.[5] studied a cohort of 100 HCCs that were primarily hepatitis C virus (HCV) positive, and identified a focal amplicon containing vascular endothelial growth factor A (VEGFA). However, VEGFA was amplified in only ∼3.3% of our HCC cohort and was not highlighted by our analysis. Overall, such discrepancies may arise from one or more of the following factors: differences in sample origin, etiology, quality and degree of stromal contamination, variations in copy number measurement technologies, different data processing and analysis algorithms, and different thresholds used for selecting genes.
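
To make the last of these factors concrete, the Python sketch below shows how the reported amplification frequency of one gene depends on the copy number cutoff. The cutoffs 2.3 (gain) and 3.0 (high-level amplification) are those quoted in the Supporting Table 3 caption; the copy number values are invented, and treating a value equal to the cutoff as a positive call is an assumption.

```python
def amplification_frequency(copy_numbers, cutoff):
    """Fraction of tumors whose inferred copy number is at or above `cutoff`.

    Whether the boundary value counts as amplified is an assumption here.
    """
    if not copy_numbers:
        return 0.0
    return sum(1 for cn in copy_numbers if cn >= cutoff) / len(copy_numbers)


# Invented inferred copy numbers for one gene across eight tumors.
copy_numbers = [2.0, 2.4, 2.5, 3.2, 1.9, 2.1, 4.0, 2.3]

gain_freq = amplification_frequency(copy_numbers, 2.3)      # any copy number gain
high_amp_freq = amplification_frequency(copy_numbers, 3.0)  # high-level amplification only

print(gain_freq, high_amp_freq)
```

With these toy values the same gene is called altered in 62.5% of tumors at the gain cutoff but in only 25% at the high-level cutoff, so two studies applying different cutoffs to similar data could report quite different frequencies.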

Our study implicated two putative oncogenic driver genes (BCL9 and MTDH) as important for growth and survival in HCC. We found that amplification of BCL9 was significantly associated with poor DFS in our HCC cohort (P = 0.03; Supporting Fig. 7), which may indicate a distinct clinical behavior of HCC patients carrying BCL9 amplification. In addition, somatic copy numbers of both BCL9 and MTDH were positively associated with advanced AJCC tumor stage (Supporting Fig. 7), suggesting that the aberration of either gene may be involved in maintaining the aggressive phenotype of an established tumor. We also performed preliminary functional characterizations of both putative drivers by siRNA-mediated target knockdown in HCC cell lines that carry the respective target amplification and compared the results with those from models without the amplification. We noted that results on BCL9 were mixed in the HUH6 cell line, which is copy number neutral with respect to BCL9 but had decreased viability upon BCL9 knockdown in one of the assays. Because BCL9 is involved in the Wnt/β-catenin–signaling pathway,[17] there may exist other mechanisms for activating this pathway in HUH6 cells: It has been shown that the Wnt pathway may be activated in the HUH6 cell line as a result of β-catenin mutations.[20] Blocking the Wnt/β-catenin pathway by knocking down BCL9 gene expression could then lead to tumor growth inhibition in HUH6 cells, which may be addicted to this pathway for their tumorigenic properties. More research is needed to fully validate these two genes as oncogenic drivers in HCC and to explore their utility in targeted cancer therapy. Our work nevertheless demonstrates a proof of concept that systematic clinical genomics approaches, such as the one presented here, could be valuable in uncovering novel, clinically relevant cancer driver genes, and that testing of such genes needs to be performed in relevant preclinical models, both with and without the corresponding genetic aberration.

Future directions of our work include high-throughput dropout screens to systematically test all genes within the focal amplicons, an unbiased approach similar to the forward genetic screening by Sawey et al.[9] One of the biggest challenges in CNA-driven target identification is to distinguish true driver gene(s) from passengers in a focal amplicon. It has been shown that multiple drivers may even coexist in a highly focal amplicon, such as CCND1 and FGF19.[9] It would be valuable to perform unbiased screening to validate all candidate somatic CNA drivers in appropriate models and then dissect key attributes that distinguish drivers from passengers to facilitate future in silico algorithm development. Toward this end, the genomic characterization of a comprehensive collection of 30 HCC cell line models in our study will also serve as a valuable resource for future research in this direction.

Acknowledgments

The authors thank Drs. John Lamb and Soonmyung Paik for scientific discussion in this study, Peter C. Roberts for facilitating data management and transfer, and Sylvie Sakata for study support.

References

  • 1
    Parkin DM. The global health burden of infection-associated cancers in the year 2002. Int J Cancer 2006;118:3030-3044.
  • 2
    Takayama H, LaRochelle WJ, Sharp R, Otsuka T, Kriebel P, Anver M, et al. Diverse tumorigenesis associated with aberrant development in mice overexpressing hepatocyte growth factor/scatter factor. Proc Natl Acad Sci U S A 1997;94:701-706.
  • 3
    Villanueva A, Chiang DY, Newell P, Peix J, Thung S, Alsinet C, et al. Pivotal role of mTOR signaling in hepatocellular carcinoma. Gastroenterology 2008;135:1972-1983, 1983.e1-e11.
  • 4
    Sicklick JK, Li YX, Jayaraman A, Kannangai R, Qi Y, Vivekanandan P, et al. Dysregulation of the Hedgehog pathway in human hepatocarcinogenesis. Carcinogenesis 2006;27:748-757.
  • 5
    Chiang DY, Villanueva A, Hoshida Y, Peix J, Newell P, Minguez B, et al. Focal gains of VEGFA and molecular classification of hepatocellular carcinoma. Cancer Res 2008;68:6779-6788.
  • 6
    Lee JS, Heo J, Libbrecht L, Chu IS, Kaposi-Novak P, Calvisi DF, et al. A novel prognostic subtype of human hepatocellular carcinoma derived from hepatic progenitor cells. Nat Med 2006;12:410-416.
  • 7
    Hoshida Y, Nijman SM, Kobayashi M, Chan JA, Brunet JP, Chiang DY, et al. Integrative transcriptome analysis reveals common molecular subclasses of human hepatocellular carcinoma. Cancer Res 2009;69:7385-7392.
  • 8
    Jia D, Wei L, Guo W, Zha R, Bao M, Chen Z, et al. Genome-wide copy number analyses identified novel cancer genes in hepatocellular carcinoma. Hepatology 2011;54:1227-1236.
  • 9
    Sawey ET, Chanrion M, Cai C, Wu G, Zhang J, Zender L, et al. Identification of a therapeutic strategy targeting amplified FGF19 in liver cancer by oncogenomic screening. Cancer Cell 2011;19:347-358.
  • 10
    Woo HG, Park ES, Lee JS, Lee YH, Ishikawa T, Kim YJ, Thorgeirsson SS. Identification of potential driver genes in human liver carcinoma by genomewide screening. Cancer Res 2009;69:4059-4066.
  • 11
    Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol 2011;12:R41.
  • 12
    Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 1995;57:289-300.
  • 13
    Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 2005;102:15545-15550.
  • 14
    Lamb JR, Zhang C, Xie T, Wang K, Zhang B, Hao K, et al. Predictive genes in adjacent normal tissue are preferentially altered by sCNV during tumorigenesis in liver cancer and may rate limiting. PLoS One 2011;6:e20090.
  • 15
    Chen L, Chan TH, Yuan YF, Hu L, Huang J, Ma S, et al. CHD1L promotes hepatocellular carcinoma progression and metastasis in mice and is associated with these processes in human patients. J Clin Invest 2010;120:1178-1191.
  • 16
    Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, et al. A census of human cancer genes. Nat Rev Cancer 2004;4:177-183.
  • 17
    Kramps T, Peter O, Brunner E, Nellen D, Froesch B, Chatterjee S, et al. Wnt/wingless signaling requires BCL9/legless-mediated recruitment of pygopus to the nuclear beta-catenin-TCF complex. Cell 2002;109:47-60.
  • 18
    Willis TG, Zalcberg IR, Coignet LJ, Wlodarska I, Stul M, Jadayel DM, et al. Molecular cloning of translocation t(1;14)(q21;q32) defines a novel gene (BCL9) at chromosome 1q21. Blood 1998;91:1873-1881.
  • 19
    Yoo BK, Emdad L, Su ZZ, Villanueva A, Chiang DY, Mukhopadhyay ND, et al. Astrocyte elevated gene-1 regulates hepatocellular carcinoma development and progression. J Clin Invest 2009;119:465-477.
  • 20
    de La Coste A, Romagnolo B, Billuart P, Renard CA, Buendia MA, Soubrane O, et al. Somatic mutations of the beta-catenin gene are frequent in mouse and human hepatocellular carcinomas. Proc Natl Acad Sci U S A 1998;95:8847-8851.

Supporting Information

Additional Supporting Information may be found in the online version of this article.

Filename (size): Description
hep26402-sup-0001-suppinfo.doc (2806K):

Figure S1. Association between somatic CNA, mRNA expression and clinical outcome. (A) Distribution of genome-wide cis-correlation between somatic CNAs and mRNA expression levels across the HCCs (red) and those obtained from the permutated dataset where sample labels were randomly scrambled (blue). (B) Cumulative Distribution of Cox regression p-values for associating somatic CNAs to clinical outcomes including both disease specific survival (DSS, blue) and disease-free survival (DFS, red), in comparison to same distributions calculated from a permutated dataset where sample labels were randomly scrambled (“DSS perm” in green and “DFS perm” in black). X-axis of the plot shows the –log10 of the Cox regression p-value cutoffs, and Y-axis is the number of genes with a p-value smaller than the corresponding cutoff on the X-axis.

Figure S2. Pair-wise DNA/DNA correlations reveal significant associations between unlinked loci. Pair-wise Pearson correlations computed from ∼20k gene copy number are ordered by genes' chromosomal positions through the genome on the X and Y axes with red indicating a positive correlation and blue indicating a negative correlation. The red diagonal represents the correlation of genes with themselves.

Figure S3. Distributions of GISTIC2 peak statistics. (A) Peak size distribution and comparison between amplification and deletion peaks. (B) Relationship between peak frequency and peak size for amplification peaks. (C) Relationship between peak frequency and peak size for deletion peaks. (D) Relationship between peak frequency and peak amplitude. Peak frequencies were calculated based on copy number cutoffs of 3 and 1.3 for amplification and deletion peaks, respectively. Peak amplitudes were taken as the average copy number of a peak among patients called positive for the peak. (E) Distribution of cis-correlations for genes not in any GISTIC2 peak, in deletion or amplification peaks. P-values shown were based on two-sample t-tests.

Figure S4. Frequent somatic copy number alterations in critical signaling pathways in HCC.

Figure S5. Immunostaining in HCCs for BCL9 and MTDH (HRP, original magnification ×200) showing high levels of immunoreactivity for BCL9 in the nucleus (A) and MTDH in the cytoplasm (B).

Figure S6. Inferred copy numbers and expression levels of BCL9 and MTDH in a panel of 30 HCC cell line models. (A) BCL9; (B) MTDH. Cell lines colored in green were used as amplified models for each candidate driver in the functional validation, and those in pink were used as controls (i.e. copy number neutral with respect to the target).

Figure S7. Association with clinical outcomes and AJCC tumor stages for the putative CNA drivers BCL9 and MTDH. Panels (A-C) show data for BCL9; panels (D-F) show data for MTDH. Patients were separated into two groups based on the amplification status of BCL9 and MTDH. Differences in disease-specific and disease-free survival were assessed by Kaplan-Meier curves and the associated log rank test. For association with AJCC tumor stage, a linear trend test was performed (p-values shown in parentheses).

hep26402-sup-0002-supptab1.doc (439K): Table S1. Major demographic and clinicopathological parameters of the HCC cohort. For AJCC T Stage, categories "3a" and "3b" were merged as "3". For BCLC stage, categories "0", "A1", "A2", "A3" and "A4" were merged into 1; categories "B", "C" and "D" were coded as 2, 3 and 4, respectively.
hep26402-sup-0003-supptab2.doc (37K): Table S2. Cell lines used in this study and their sources. SIBS – Shanghai Institutes of Biological Sciences (Shanghai, China); HSRRB – Japan Health Science Research Resources (Japan); ATCC – American Type Culture Collection (Virginia, USA); CrownBio – Crown Bioscience Inc. (Beijing, China)
hep26402-sup-0004-supptab3a.doc (335K): Table S3. All CNA peaks predicted by GISTIC2 analysis. "Peak frequency" was calculated using copy number cutoffs of 2.3 and 1.7 for copy number gains and losses, respectively. "High peak frequency" was calculated using cutoffs 3.0 and 1.3 for amplifications and deletions, respectively. Full sets of genes under each CNA peak can be found in "Wang_HCC_CNA_landscape_Table_S3_full.docx".
hep26402-sup-0005-supptab3b.doc (335K): Table S3. All CNA peaks predicted by GISTIC2 analysis. "Peak frequency" was calculated using copy number cutoffs of 2.3 and 1.7 for copy number gains and losses, respectively. "High peak frequency" was calculated using cutoffs 3.0 and 1.3 for amplifications and deletions, respectively. Column "Peak affected genes" lists the full set of genes covered under each CNA peak, whereas the column "CGC genes in peak" only shows those that belong to Cancer Gene Census.
hep26402-sup-0006-supptab4.doc (346K): Table S4. Association of average copy number of the focal amplification and deletion peaks identified by GISTIC2 with clinical and outcome variables. Values shown in the table are signed nominal p-values. For age, gender, etiology, tumor size, serum AFP, cirrhosis, intrahepatic metastasis and vascular invasion, two-sample t-tests were used: a positive p-value indicates higher average copy number of the peak in the group stated in the header. AJCC T stage, BCLC stage, Child-Pugh class and Edmondson grade were treated as ordinal variables and linear trend tests were performed: a positive p-value indicates a positive trend and vice versa. Cox regression analysis was used to evaluate the association with disease-specific overall survival and disease-free survival: a positive p-value indicates that increased copy number is associated with poor outcome. Order of the peaks in this table is the same as that of Table S3. The two genes validated in this study, BCL9 and MTDH, are in amplification peaks #2 and #51, respectively. Significant associations at a nominal p-value <0.05 level are shaded in red for positive associations and blue for negative associations.
hep26402-sup-0007-supptab5a.doc (243K): Table S5. All pathways enriched among cis-acting genes in CNA peaks. Full sets of genes that map to each pathway can be found in "Wang_HCC_CNA_landscape_Table_S5_full.docx".
hep26402-sup-0008-supptab5b.doc (243K): Table S5. All pathways enriched among cis-acting genes in CNA peaks. The column "Genes" lists full sets of genes that map to each pathway.
hep26402-sup-0009-supptab6.doc (57K): Table S6. Candidate driver genes selected based on focal CNA, expression changes and model availability. "Amplification frequency" is calculated based on inferred copy numbers of individual genes (which may differ from the GISTIC2 peak frequency).
hep26402-sup-0010-supptab7.doc (249K): Table S7. Somatic copy number, gene expression and IHC staining results for BCL9. For definition of stain positivity please refer to Supplementary Methods.
hep26402-sup-0011-supptab8.doc (249K): Table S8. Somatic copy number, gene expression and IHC staining results for MTDH. For definition of stain positivity please refer to Supplementary Methods.

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.