Molecular characterization of colorectal cancer using whole‐exome sequencing in a Taiwanese population

Abstract Next‐generation sequencing (NGS) technology is currently used to establish mutational profiles in many heterogeneous diseases. The aim of this study was to evaluate the mutational spectrum in Taiwanese patients with colorectal cancer (CRC) to help clinicians identify the best treatment method. Whole‐exome sequencing was conducted in 32 surgical tumor tissues from patients with CRC. DNA libraries were generated using the Illumina TruSeq DNA Exome, and sequencing was performed on the Illumina NextSeq 500 system. Variants were annotated and compared to those obtained from publicly available databases. The analysis revealed frequent mutations in APC (59.38%), TP53 (50%), RAS (28.13%), FBXW7 (18.75%), RAF (9.38%), PIK3CA (9.38%), SMAD4 (9.38%), and SOX9 (9.38%). TCF7L2 mutations were also detected, but at a lower frequency. Two or more mutations were found in 22 (68.75%) samples. The mutation rates for the WNT, P53, RTK‐RAS, TGF‐β, and PI3K pathways were 78.13%, 56.25%, 40.63%, 18.75%, and 15.63%, respectively. RTK‐RAS pathway mutations were correlated with tumor size (P = 0.028). We also discovered 23 novel mutations in NRAS, PIK3CA, SOX9, APC, SMAD4, MSH3, MSH4, PMS1, PMS2, AXIN2, ERBB2, PIK3R1, TGFBR2, and ATM that were not reported in the COSMIC, The Cancer Genome Atlas, and dbSNP databases. In summary, we report the mutational landscape of CRC in a Taiwanese population. NGS is a cost‐effective and time‐saving method, and we believe that NGS will help clinicians to treat CRC patients in the near future.

consumption of red and/or processed meat, and a history of diabetes. 2,3 Epidermal growth factor receptor (EGFR) has been recognized as an effective anticancer target during the last few years. Monoclonal antibodies used to block EGFR in combination with chemotherapy or radiation have yielded improved outcomes in CRC patients with extended RAS wild-type tumors. Mutations in the RAS and BRAF genes are associated with resistance to anti-EGFR therapy in metastatic CRC (mCRC). 4 RAS and BRAF oncogene mutations are mutually exclusive and occur in 36.97% and 4.24% of CRC patients, respectively, as described in our previous work. 5 Thus, identifying unique genomic profiles and molecular phenotypes could help establish the best treatment method for patients with resistance to anti-EGFR therapy.
CRC is one of the most active areas of next-generation sequencing (NGS) application, and the number of studies employing NGS continues to increase. The Cancer Genome Atlas (TCGA) project studied more than 224 CRC cases and showed that 24 genes, including APC, TP53, SMAD4, PIK3CA, and KRAS, contained significant mutations. Three genes (ARID1A, SOX9, and FAM123B/WTX) were frequently mutated. 6 Ashktorab et al analyzed 63 Iranian patients using targeted exome sequencing, found higher mutation rates of MSH3, MSH6, APC, and PIK3CA, and hypothesized a larger role for these genes in CRC. They suggested the adoption of a specific informed genetic diagnostic protocol and tailored therapy in this population. 7 Because patients with RAS wild-type CRC can be non-responders to EGFR-targeted therapy, Geibler et al analyzed cell lines and tumor specimens to identify predictive markers using NGS, EGFR methylation and expression, and E-cadherin expression. The authors revealed ATM mutations and low E-cadherin expression as novel supportive predictive markers. 8 Adua et al analyzed primary tumor and liver metastasis samples from 7 KRAS wild-type patients and compared the genotypes of 22 genes associated with anti-EGFR therapy before and after chemotherapy. The results showed marked genotypic differences between pre- and post-treatment samples, which were likely attributable to tumor cell clones selected by therapy. 9 Gong et al analyzed 315 cancer-related genes and introns of 28 frequently rearranged genes in 138 mCRC cases using FoundationOne. They identified a novel KRAS mutation (R68S) associated with an aggressive phenotype. The authors reported that ERBB2-amplified tumors may benefit from anti-HER2 therapy, and that hypermutated tumors or tumors with high tumor mutational burden with MSI-H or POLE mutation may benefit from anti-PD-1 therapy. 10 This study examined genetic alterations in CRC in a Taiwanese population.
We performed whole-exome sequencing (WES) to detect the mutational status in all human protein-coding genes using fresh frozen tissue from 32 Taiwanese patients with CRC. was used to demultiplex data and convert BCL files into FASTQ files. Sequenced reads were trimmed for low-quality sequences and aligned to the human reference genome (hg19) using the Burrows-Wheeler Aligner. 11 Finally, single nucleotide polymorphisms and small insertion and deletion mutations were called in individual samples by the Genome Analysis Toolkit and VarScan using default settings. 12,13 We then used ANNOVAR to functionally annotate genetic variants. 14 The following criteria were used to select confident somatic single nucleotide variants: mutant allele frequency >5%; global minor allele frequency <1% or NA (based on the ExAC and 1000 Genomes databases); exclusion of known harmless variants present in ClinVar or the in-house polymorphism database; and predicted pathogenicity by all three software programs (SIFT, PolyPhen-2, and CADD).
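The selection criteria listed above can be sketched as a simple filter. This is only an illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, its field names, and the boolean prediction flags are hypothetical, and the per-tool cutoffs for SIFT, PolyPhen-2, and CADD are not given in the text, so each tool's verdict is passed in as a precomputed flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are assumptions)."""
    gene: str
    alt_reads: int
    total_reads: int
    population_maf: Optional[float]  # None = "NA" in ExAC / 1000 Genomes
    known_benign: bool               # harmless in ClinVar or in-house database
    sift_damaging: bool
    polyphen_damaging: bool
    cadd_damaging: bool


def allele_frequency(v: Variant) -> float:
    """Mutant allele frequency: variant reads divided by total reads."""
    return v.alt_reads / v.total_reads if v.total_reads else 0.0


def passes_filters(v: Variant, min_af: float = 0.05, max_maf: float = 0.01) -> bool:
    """Apply the four criteria stated in the text."""
    rare = v.population_maf is None or v.population_maf < max_maf
    pathogenic = v.sift_damaging and v.polyphen_damaging and v.cadd_damaging
    return (allele_frequency(v) > min_af
            and rare
            and not v.known_benign
            and pathogenic)


# Read counts for NRAS R68I are taken from the results; the prediction
# flags are assumed to be True because the variant was retained.
nras_r68i = Variant("NRAS", 25, 118, None, False, True, True, True)
# A made-up common polymorphism that should be rejected.
common_snp = Variant("GENE_X", 60, 120, 0.12, False, True, True, True)

print(passes_filters(nras_r68i))   # True
print(passes_filters(common_snp))  # False
```

Treating a missing population frequency as "rare" mirrors the "or NA" clause in the criteria; a stricter pipeline could instead drop variants with no population data.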

| Statistical analysis
Comparisons between clinicopathological features and the status of critical pathway mutations in CRC were performed using Fisher's exact test. Two-sided P-values < 0.05 were considered statistically significant.
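As a worked illustration of the test, the sketch below computes a two-sided Fisher's exact P-value for the tumor size versus RTK-RAS pathway comparison. The 2x2 counts are an assumption back-calculated from the percentages reported in the results (57.89% corresponds to 11 of 19 patients with tumors ≤4 cm, and 15.38% to 2 of 13 patients with tumors >4 cm); the function itself is a plain hypergeometric implementation written for this example, not the software used in the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    lo = max(0, col1 - row2)
    hi = min(col1, row1)

    def weight(k: int) -> int:
        # Unnormalized probability of k events in the first row (exact integer).
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    extreme = sum(weight(k) for k in range(lo, hi + 1) if weight(k) <= observed)
    return extreme / comb(row1 + row2, col1)


# Rows: tumor size <=4 cm, >4 cm; columns: RTK-RAS mutated, not mutated.
# Counts inferred from the reported percentages (an assumption).
p_value = fisher_exact_two_sided(11, 8, 2, 11)
print(round(p_value, 3))  # 0.028
```

Comparing integer weights rather than floating-point probabilities avoids rounding ties when deciding which tables count as "at least as extreme".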

| WES analysis and coverage
Using massively parallel sequencing on a NextSeq platform, we generated a mean of 157 M raw reads per sample, of which 141 M were aligned to the human reference genome (hg19; Table 2). The mean depth of the target regions for the 32 samples was 119× (range 34.79-197.53×). The coverage of the target regions exceeded 97.97%. Figure 1 gives an overview of the approach used to identify variants.
Beyond the well-established point mutations in codons 12 and 13 of exon 2 of KRAS, we identified mutations in codon 117 of exon 4 (K117N, 11.11%) and codon 146 of exon 4 (A146T, 11.11%). One mutation (11.11%) in codon 68 (exon 3) of NRAS was also detected; this was a novel alteration (R68I). The non-synonymous variant at locus 115256508 had a C-to-A change mapped in the small GTP-binding protein domain, with an allele fraction of 21.19% (total reads 118, variant count 25) ( Figure S1A). Together, these non-KRAS exon 2 mutations constituted 33.33% of all RAS mutations ( Figure 3).
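The percentages in this paragraph follow from simple ratios, sketched below. The read counts come from the text; the total of 9 RAS mutations is an assumption inferred from each single mutation being reported as 11.11% of RAS mutations.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to two decimals, as reported in the text."""
    return round(100 * numerator / denominator, 2)


# Allele fraction of NRAS R68I: variant reads / total reads.
allele_fraction = pct(25, 118)

# Share of a single mutation among all RAS mutations
# (9 is inferred from the reported 11.11%, not stated directly).
total_ras_mutations = 9
single_share = pct(1, total_ras_mutations)

# K117N + A146T + NRAS R68I = 3 mutations outside KRAS exon 2.
non_kras_exon2_share = pct(3, total_ras_mutations)

print(allele_fraction, single_share, non_kras_exon2_share)
# 21.19 11.11 33.33
```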

FIGURE 2 Proportion of RAS and RAF mutations and RAS/RAF wild-type status identified by WES. WES, whole-exome sequencing
None of the CRC patients with RAS mutations harbored a concomitant mutation in RAF. The remaining patients (62.5%) were RAS/RAF wild-type ( Figure 2).

| TCF7L2 mutations
Two patients (6.25%) had tumors with TCF7L2 mutations. The identified variants were R471C, F357L, and G424E, and each patient carried two of the three TCF7L2 variants.

| SOX9 mutations
Three patients (9.38%) had SOX9 frameshift mutations. One patient had an S431fs mutation, another a G484fs mutation, and the third an S485fs mutation. The G484fs and S485fs mutations were novel variants ( Figure S1C).

E583* in MSH4, R265Q in PMS1, and L633I in PMS2. Among these, MSH3 A61delinsAAPA and E456K, MSH4 E583*, PMS1 R265Q, and PMS2 L633I were novel mutations (Figure S1F-I). The numbers of variants discovered in the MMR wild-type and mutation carriers are listed in Tables S1 and S2.

| Pathway mutations and associations
We compared the clinicopathological features of CRC patients with the mutation status of the five critical pathways. The RTK-RAS pathway mutation rate was significantly higher in patients with a tumor size ≤4 cm than in those with a tumor size >4 cm (57.89% versus 15.38%, P = 0.028). No clinicopathological variables were significantly correlated with WNT, PI3K, TGF-β, or P53 pathway mutations (Table 3).

| DISCUSSION
All of the mutated genes discussed in our study have been previously classified as driver genes that confer a selective growth advantage to tumor cells harboring the mutations. Like other cancers, CRC tumors may carry a single driver gene mutation or several. Tumors with only one driver mutation always carry it in an oncogene, whereas tumors with multiple driver mutations contain a combination of oncogene and tumor suppressor gene mutations. 15 In our study, of the 4 samples with a single mutation (Table 4), 1 (25%) harbored a mutation in an oncogene (KRAS), and of the 22 samples with 2 or more mutations (Tables 5 and 6), 15 (68.18%) contained a combination of mutations in both oncogenes and tumor suppressor genes. The integrative analysis of WES data provides insights into pathways that are dysregulated in CRC. The WNT signaling pathway was dysregulated in 78.13% of cases. WNT pathway mutations have been reported in 84.5% of CRC cases, which is higher than the mutation rate detected in our study. 16 In 2012, the TCGA consortium reported that up to 93% of CRC cases involved at least 1 alteration in a known WNT regulator. 6 Hyperactivation of the WNT pathway initiates the development of CRC, which predominantly occurs through inactivation of the APC gene. 17 Several agents have been investigated to target this pathway, including WNT inhibitors (e.g., Rofecoxib, PRI-724, CWP232291) and a monoclonal antibody against frizzled receptors (e.g., vantictumab). 18 In addition to APC and SOX9, we also identified a novel mutation in AXIN2 (p.R459L) (Figure S1J). The affected AXIN2 residue, R459, is located in the region that interacts with β-catenin. The frequency of alterations in the RTK-RAS and PI3K pathways was 40.63% and 15.63%, respectively. RTK-RAS and PI3K pathway mutations have been found in 60.7% and 30% of CRCs, respectively. 16 In a normal cell, the RTK-RAS and PI3K pathways control cell proliferation, differentiation, and survival. 19,20 In a malignant cell, constitutive and aberrant activation of components of these pathways leads to increased cell growth, survival, and metastasis. Small molecule inhibitors, such as Sorafenib and PLX4720, which are currently being used to target BRAF p.V600E, have been developed to target the RTK-RAS and PI3K pathways. NVP-BEZ235 and BGT226 are being used to target the PI3K pathway in various cancers. 21 In addition to NRAS and PIK3CA, we identified novel mutations in ERBB2 (p.W9fs) and PIK3R1 (p.S147* and p.L161*) (Figure S1K,L). The PIK3R1 p.S147* and p.L161* mutations were mapped to the Rho GTPase-activating protein domain.
In our study population, the mutation rates of the TGF-β and P53 pathways were 18.75% and 56.25%, respectively. TGF-β and P53 pathway mutations have been described in 28.9% and 69% of CRCs, respectively. 16 The TGF-β signaling pathway has pleiotropic functions, including the regulation of cell growth, apoptosis, cell motility, and invasion. TGF-β signaling plays a key role in tumor initiation, development, and metastasis. Many TGF-β pathway inhibitors, such as antisense oligonucleotides, neutralizing antibodies, and receptor kinase inhibitors, have been used in preclinical trials. For example, galunisertib is a TGFβR1 inhibitor that prevents signal transduction. 22 The P53 protein is activated under cellular stress, such as DNA damage, oncogene activation, oxidative free radicals, and UV irradiation. Activation of P53 can induce cell cycle arrest, senescence, and apoptosis. Small molecules that activate this pathway, such as MIs, nutlins, and RITA, have been tested as therapeutic agents in CRC. 23 In addition to SMAD4, we identified novel mutations in TGFBR2 (p.D549A) and ATM (p.E650*) (Figure S1M,N). Our relatively low rate of mutations in these 5 critical pathways may reflect our small sample size.
Most CRC samples can be grouped by WNT-, RTK-RAS-, P53-, TGF-β-, and PI3K-dysregulated pathways. In our study population, 3 samples (3/32, 9.38%) had no mutation in any of these pathways. Of these 3 samples, 2 had alterations in the Notch signaling pathway (CTBP2, CREBBP, KAT2B, DVL2, and PSEN2). Deregulation of Notch signaling in CRC has been reported. 24 The third sample exhibited alterations in cell adhesion molecules (CNTN2, HLA-DRB1, HLA-DRB5, and NRXN3). This indicates that it may be necessary to identify other dysregulated pathways to achieve therapeutic benefits.
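The grouping described above can be sketched as a lookup from mutated genes to pathways. This is a minimal illustration under stated assumptions: the gene-to-pathway map below is a partial, hypothetical assignment based on the genes named in this discussion, and the four toy samples are made up for demonstration; they are not patient data from the study.

```python
# Partial gene-to-pathway map (an assumption for illustration only).
PATHWAY_GENES = {
    "WNT": {"APC", "SOX9", "AXIN2", "TCF7L2"},
    "RTK-RAS": {"KRAS", "NRAS", "BRAF", "ERBB2"},
    "PI3K": {"PIK3CA", "PIK3R1"},
    "TGF-beta": {"SMAD4", "TGFBR2"},
    "P53": {"TP53", "ATM"},
}


def pathways_hit(mutated_genes):
    """Return the set of pathways with at least one mutated member gene."""
    mutated = set(mutated_genes)
    return {name for name, genes in PATHWAY_GENES.items() if genes & mutated}


def pathway_rates(samples):
    """Fraction of samples with at least one mutation in each pathway."""
    n = len(samples)
    return {
        name: sum(name in pathways_hit(genes) for genes in samples.values()) / n
        for name in PATHWAY_GENES
    }


# Toy input: sample ID -> mutated genes (hypothetical, not study data).
toy_samples = {
    "S1": ["APC", "TP53", "KRAS"],
    "S2": ["APC", "PIK3CA"],
    "S3": ["CTBP2", "CREBBP"],  # no hit in the five pathways
    "S4": ["SMAD4", "TP53"],
}

rates = pathway_rates(toy_samples)
unassigned = [sid for sid, genes in toy_samples.items() if not pathways_hit(genes)]

print(rates)
print(unassigned)  # ['S3']
```

Samples that land in `unassigned` are the ones that would need a wider search, such as the Notch signaling or cell adhesion genes mentioned above.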
We also compared the clinicopathological data of CRC patients with the mutational status of important signaling pathways in cancerous tissues. RTK-RAS pathway mutations were correlated with tumor size (P = 0.028). These results suggest that tumor progression is not linked to increased genetic instability, although this may be due to our small sample size and the fact that nearly half of the cases were stage II (48.39%); more samples are needed to confirm our results.
In conclusion, we identified recurrent mutations in genes such as APC, TP53, KRAS, and FBXW7, as well as unreported mutations in NRAS, PIK3CA, SOX9, APC, SMAD4, MSH3, MSH4, PMS1, PMS2, AXIN2, ERBB2, PIK3R1, TGFBR2, and ATM in a group of Taiwanese CRC patients. The data presented herein provide a more comprehensive molecular characterization of this deadly disease and point to possibilities for targeted treatment.