Comparative study on the mutational profile of adenocarcinoma and squamous cell carcinoma predominant histologic subtypes in Chinese non‐small cell lung cancer patients

Abstract
Background: The distinction in the mutational profile between the common histological types, lung adenocarcinoma (LUAD) and squamous cell lung carcinoma (LUSC), has been well established. However, comprehensive mutation profiles of the predominant histological subtypes within LUAD and LUSC remain elusive.
Methods: We analyzed the mutational profile of 318 Chinese NSCLC patients of adenocarcinoma and squamous cell carcinoma predominant subtypes from seven hospitals using capture‐based ultra‐deep sequencing of 68 lung cancer‐related genes.
Results: Of the 318 NSCLC patients, 215 were diagnosed with LUAD and 103 with LUSC. Adenocarcinoma in situ and acinar adenocarcinoma were the most predominant subtypes of LUAD, while keratinizing squamous cell carcinoma was the most predominant subtype of LUSC. Among the LUAD subtypes, EGFR sensitizing mutations were most prevalent in the invasive lepidic subtype. More than half of the patients with preinvasive adenocarcinoma in situ, minimally invasive, acinar, micropapillary and papillary subtypes were also EGFR‐mutants. Patients with colloid, invasive mucinous, and fetal subtypes had the fewest EGFR mutations. Moreover, KRAS mutations were prevalent in patients with invasive mucinous, colloid, enteric and solid subtypes. A total of 90% of the LUSC patients harbored mutations in TP53, wherein all patients except five with the nonkeratinizing subtype were TP53 mutants. PIK3CA amplifications were most prevalent in keratinizing, followed by basaloid and nonkeratinizing subtypes.
Conclusion: These data suggest that the mutational profiles among the predominant histological subtypes were very distinct, which provided a reliable tool to improve treatment decisions.


Introduction
Lung cancer is the leading cause of cancer-related death in China and around the world. 1,2 Non-small cell lung cancer (NSCLC) accounts for about 85% of lung cancer cases diagnosed, with two major histological types: adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) accounting for nearly 50% and 30% of NSCLC, respectively. 3 Historically, the histological classification of NSCLC had not been a major determining factor in treatment guidance. 4 It is only in the past decade that it has become apparent that LUAD and LUSC have distinctive mutation profiles, responsible for their divergent responses to targeted therapies. Development of targeted therapies such as EGFR-TKIs has revolutionized the management of EGFR-mutant LUAD patients. 5,6 Furthermore, pemetrexed 7 and bevacizumab 8 were approved for nonsquamous NSCLC. On the other hand, nivolumab has recently been approved by the U.S. Food and Drug Administration for metastatic LUSC patients. 9 Due to the development of therapeutic agents approved only for particular histological types, the need for histopathological classification grew significantly over the years.
Conventionally, histological classification of NSCLC relied primarily on morphology using light microscopy with hematoxylin-eosin and mucin stains. However, certain cytology samples obtained from small biopsies, such as poorly differentiated adenocarcinoma and squamous cell carcinoma, are morphologically indistinguishable. 10 Immunohistochemistry (IHC) markers were introduced in the diagnosis of NSCLC to improve the accuracy and reproducibility of histological classification and are now widely used in the subtyping of NSCLC. In line with the growing need for histologic classification of NSCLC, the World Health Organization (WHO) recently revised the guidelines for the classification of lung tumors. The amendments included an emphasis on the role of histology in personalized medicine and modification of the histologic criteria and classifications for both LUAD and LUSC subtypes, following the recommendations from the International Association for the Study of Lung Cancer, American Thoracic Society and European Respiratory Society (IASLC/ATS/ERS). 4 The reclassification of the LUAD and LUSC subtypes was based on the predominant morphologic pattern as well as the general pattern of invasion. LUAD with a predominantly lepidic, nonmucinous pattern is characterized as preinvasive adenocarcinoma in situ, minimally invasive adenocarcinoma, or invasive adenocarcinoma with lepidic component depending on the invasion pattern, while other invasive LUAD with identifiable patterns are classified as invasive mucinous, colloid, fetal, enteric, acinar, papillary, micropapillary, and solid subtypes. 4 On the other hand, LUSC is further classified as preinvasive squamous cell carcinoma in situ and invasive squamous cell carcinoma of keratinizing, nonkeratinizing and basaloid subtypes. 4 Each histological type and subtype arises from multiple risk factors, both genetic and environmental, and thus has its own unique mutational profile.
Molecular profiling of an individual tumor's genome facilitates the understanding of the distinct molecular mechanisms that regulate cancer progression and the discovery of potential therapeutic targets. Due to advancements in molecular profiling technologies, the mutation profiles of LUAD and LUSC have been well elucidated. [11][12][13] However, the molecular distinction between the specific histological subtypes within LUAD and LUSC is just beginning to be understood. Among LUAD subtypes, EGFR mutations are more prevalent in lepidic (formerly termed bronchioloalveolar) tumors, while KRAS mutations are more common in invasive LUAD subtypes, particularly invasive mucinous and solid subtypes. 11,[13][14][15][16][17][18][19] On the other hand, limited information is available on the molecular distinction among LUSC subtypes. 13 In this multi-center comparative study, we aimed to characterize the mutational profiles of the major histological subtypes of adenocarcinoma and squamous cell carcinoma in Chinese NSCLC patients. This study emphasizes the need for mutational profiling in all NSCLC patients to identify actionable mutations amenable to targeted therapy.

Patient enrollment
This cohort included 318 Chinese stage I-IV NSCLC patients with either adenocarcinoma or squamous cell types from seven hospitals. All patients underwent complete tumor staging according to the seventh edition tumor, node, and metastasis (TNM) criteria of NSCLC. 20 Lung cancer histology was classified according to the predominant subtype following the 2015 World Health Organization (WHO) histopathology classification. 4 The patients' clinical data, including demographic information, smoking status and cancer histological subtype, were reviewed. Tumor samples were obtained either by surgical or needle biopsy procedures and sequenced for mutational analysis. The study had been approved by the relevant regulatory and independent ethics committees or institutional review boards of all the participating hospitals. Written informed consent was obtained from each patient for the use of their tissue samples.

Tissue DNA extraction
DNA was extracted from formalin-fixed, paraffin-embedded (FFPE) tumor tissues using the QIAamp DNA FFPE tissue kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions.

Capture-based targeted DNA sequencing
A minimum of 50 ng of DNA was required for NGS library construction. Tissue DNA was sheared using a Covaris M220, followed by end repair, phosphorylation, and adaptor ligation. Fragments of 200-400 bp were selected using beads (Agencourt AMPure XP Kit, Beckman Coulter, Brea, CA, USA), followed by hybridization with capture probe baits, hybrid selection with magnetic beads and PCR amplification. The Lung Core panel from Burning Rock Biotech (Guangzhou, China), consisting of 68 lung cancer-related genes spanning 345 kb of the human genome, was used. The quality and the size of the fragments were assessed using a Qubit 2.0 Fluorimeter with the dsDNA high sensitivity assay kit (Life Technologies, Carlsbad, CA, USA). Indexed samples were sequenced on a Nextseq500 (Illumina, Inc., Madison, WI, USA) with paired-end reads.

Sequence data analysis
Sequence data were mapped to the reference human genome (hg19) using Burrows-Wheeler aligner v.0.7.10. Local alignment optimization, variant calling and annotation were performed using Genome Analysis Tool Kit v.3.2 and VarScan. Variants were filtered using the VarScan fpfilter pipeline, and loci with depth less than 100 were filtered out. Base calling in tissue samples required at least five and eight supporting reads for small insertion-deletions (INDELs) and single nucleotide variants (SNVs), respectively. INDELs and SNVs with population frequency over 0.1% in the ExAC, 1000 Genomes, dbSNP or ESP6500SI-V2 databases were grouped as SNPs and excluded from further analysis. Remaining variants were annotated with ANNOVAR and SnpEff v.3.6. Analysis of DNA translocation was performed using Factera v.1.4.3.
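The filtering thresholds described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `Variant` record, its field names and the `passes_filters` helper are hypothetical, and the sketch assumes that each variant already carries its depth, its supporting read count and the highest population frequency found across the listed databases.

```python
from dataclasses import dataclass
from typing import Optional

MIN_DEPTH = 100                          # loci with depth below 100 are filtered out
MIN_ALT_READS = {"INDEL": 5, "SNV": 8}   # minimum supporting reads per variant type
MAX_POP_FREQ = 0.001                     # population frequency over 0.1% is treated as SNP


@dataclass
class Variant:
    """Hypothetical record for one called variant (field names are assumptions)."""
    gene: str
    kind: str                            # "SNV" or "INDEL"
    depth: int                           # total read depth at the locus
    alt_reads: int                       # reads supporting the variant allele
    pop_freq: Optional[float] = None     # highest database frequency, None if absent


def passes_filters(v: Variant) -> bool:
    """Return True if the variant survives the depth, read-support and SNP filters."""
    if v.depth < MIN_DEPTH:
        return False
    if v.alt_reads < MIN_ALT_READS[v.kind]:
        return False
    if v.pop_freq is not None and v.pop_freq > MAX_POP_FREQ:
        return False
    return True


if __name__ == "__main__":
    calls = [
        Variant("EGFR", "SNV", depth=850, alt_reads=120),
        Variant("KRAS", "SNV", depth=90, alt_reads=30),                    # low depth
        Variant("TP53", "INDEL", depth=400, alt_reads=5),
        Variant("BRAF", "SNV", depth=600, alt_reads=200, pop_freq=0.02),   # common SNP
    ]
    print([v.gene for v in calls if passes_filters(v)])
```

Keeping the thresholds as named constants makes it easy to see which cut-off removed a given call when the filter is applied to real data.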

Statistical analysis
All the data were analyzed using R software (R version 3.4.0; R: The R Foundation for Statistical Computing, Vienna, Austria). Significance between the groups was calculated using Fisher's exact test with P < 0.05 considered as statistically significant. P-values were adjusted for variables such as age, gender, smoking history, and pathological stage when applicable.
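As an illustration of the group comparison, the following Python sketch computes a two-sided Fisher's exact test for a 2x2 table using only the standard library. The study itself used R, so this is an assumed re-implementation rather than the authors' code; the function name is hypothetical, and the two-sided P-value is taken as the sum of all table probabilities no larger than that of the observed table, which is one common convention. The example uses the oncogenic driver counts reported in this article (166 of 215 LUAD and 10 of 103 LUSC patients).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is at most as likely as the observed one
    # (a small relative tolerance guards against floating-point ties).
    total = sum(p for p in map(prob, range(lowest, highest + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


if __name__ == "__main__":
    # Rows: LUAD, LUSC; columns: driver alteration present, absent.
    p = fisher_exact_two_sided(166, 215 - 166, 10, 103 - 10)
    print(f"P = {p:.3g}")
```

This sketch performs no covariate adjustment; adjusting for age, gender, smoking history or stage would require a regression model rather than an exact test on a single table.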

Comparison of the mutational profile of LUAD and LUSC patients
Of 297 patients, 199 (92.6%) were LUAD and 94 (91.3%) were LUSC. A total of 31 patients (16 LUAD and 15 LUSC) had no mutation detected from our panel.
From the somatic mutational profile of the patients, a distinct pattern between LUAD and LUSC was observed. Mutations in the oncogenic drivers were more prevalent in LUAD than LUSC patients (P < 0.001), with 77.2% (166/215) of the LUAD patients harboring oncogenic driver alterations. Oncogenic drivers included EGFR L858R and exon 19 deletion (19del), KRAS G12, G13 and Q61, BRAF V600E, MET exon 14 skipping and amplification, ERBB2 exon 20 insertion, ALK fusions, RET fusions, ROS1 fusions and NRG1 fusions. Among them, 37.7% (81/215) of the patients carried EGFR L858R or 19del mutations. Mutual exclusivity was generally observed among driver mutations in LUAD. However, a rare co-occurrence of EGFR L858R and KRAS Q61H was found in one LUAD patient. Moreover, we also observed concurrent EGFR L858R and ERBB2 S310F mutations in two adenocarcinoma patients, and EGFR L858R with ERBB2 amplification in three LUAD patients (Table S3). In addition to sensitizing mutations in EGFR, we found three rare EGFR mutations: EGFR G719A detected in one patient, and EGFR S758I and L838V concurrently detected in another patient (Table S3). A total of 24 patients (11.2%, 24/215) had EGFR compound mutations. KRAS mutations, including G12, G13 and Q61, were detected in 17.7% (38/215) of patients. BRAF mutations were detected in 5.6% (12/215) of patients, none of which was BRAF V600E. A rare disruptive in-frame BRAF insertion, R506_K507insVLR, was detected in one LUAD patient (Table S4). Mutations in ALK, RET and ROS1, including fusions and other types of mutations, were detected in 25 (11.6%), 8 (3.7%) and 12 (5.6%) LUAD patients, respectively. A majority (77.8%, 7/9) of the patients with ALK fusion had EML4 as the fusion partner. Of these nine patients, four had a single EML4-ALK fusion, while three patients had EML4-ALK and a concurrent ALK fusion with previously unreported partners, such as EXOC6B, TTN, ACVR1, and TACR1.
The two remaining patients had other unreported ALK fusion partners, including ERBB4 found in one patient and both PRR20A and RHOB detected in another patient. A summary of the ALK fusions detected in our cohort is listed in Table S4. Moreover, a patient with a driver CCDC6-RET fusion also had a concurrent, previously unreported RET fusion with CCSER2. In addition, CD74-NRG1 fusions were detected in three LUAD patients. Rare mutations and concurrent mutations in oncogenic genes are summarized in Table S3.
In LUSC, only 10 (9.7%, 10/103) patients carried alterations in oncogenic drivers. No mutual exclusivity was observed among driver mutations. Interestingly, four LUSC patients carried EGFR sensitizing mutations and one carried a KRAS mutation. Such mutations were believed to occur exclusively in LUAD. BRAF mutations were found in four (3.9%, 4/103) LUSC patients; however, no BRAF V600E mutations were detected. EML4-ALK and CD74-ROS1 fusions were each found in one patient, while no RET fusion was found in LUSC patients.
Among the patients with preinvasive adenocarcinoma in situ (AIS) subtype, EGFR mutations were detected in 52% (16/31) of the patients (Fig 2b). No alterations in ALK, ERBB2, MET, and RET were found in AIS patients (Fig 2b). Only one AIS patient harbored a ROS1 fusion. Meanwhile, TP53 mutations were found in 19% (6/31) of the AIS patients (Fig 2c). Among the patients with no mutations detected from our panel, five were AIS patients.
In patients with minimally invasive adenocarcinoma subtype (MIA), the only oncogenic driver genes with detected alterations were EGFR, ERBB2 and KRAS (Fig 2b), wherein half (4/8) of the MIA patients were EGFR mutants. ERBB2 and KRAS alterations were each found in one patient. No gene alterations were detected in ALK, BRAF, MET, RET, ROS1 and TP53 in MIA patients (Fig 2b). One MIA patient was negative for mutations in our gene panel.
Among all the patients in the cohort, patients with adenocarcinoma of the invasive type generally had alterations in all oncogenic driver genes. In particular, gene alterations in ALK, MET and RET were only found in these patients (Fig 2b). TP53 mutations were also more prevalent in patients with invasive adenocarcinoma (48%, 74/153, Fig 2c).
Patients with invasive mucinous adenocarcinoma subtype (IMA) had the highest prevalence of KRAS mutations, with 43% (6/14) of IMA patients being KRAS-positive (Fig 2b). Instead of sensitizing mutations, only EGFR amplifications were found, in two IMA patients (Fig 2b). No BRAF, ERBB2, MET and RET alterations were detected in any of these patients (Fig 2b). One patient with IMA was negative for mutations in our gene panel.
Among the 10 colloid adenocarcinoma subtype (COL) patients, 30% (3/10) were positive for KRAS mutations (Fig 2b). Interestingly, two KRAS-positive patients also had concomitant BRAF mutations (Fig 2b). Moreover, two patients had ALK fusion; one patient had ERBB2 amplification and one had both EGFR 19del and ERBB2 amplification (Fig 2b). Two patients with COL were negative for mutations in our gene panel.
Apart from ERBB2 amplification and KRAS mutation in each of the two fetal adenocarcinoma (FET) patients in the cohort, no other gene alterations in any oncogenic driver or in TP53 were detected (Fig 2b,c).

Mutational profile of squamous cell lung carcinoma patients (LUSC) according to subtype
In contrast to LUAD, LUSC had a predominance of mutations in TP53 and copy number variations in particular genes, while oncogenic driver mutations were very few (P < 0.001, Fig 1c, Figure S2). The mutation count among the LUSC histological subtypes was comparable, with the median for basaloid squamous cell carcinoma (BSC) patients slightly higher but not statistically different from the other subtypes (P = 0.45, Fig 3a).

The relationship between molecular and clinical features in LUAD and LUSC patients
In LUAD patients, TP53 mutations were associated with older age. The age of TP53-positive LUAD patients ranged from 43 to 81 years with a median age of 62 years, while the age of wild-type TP53 patients ranged from 29 to 79 years with a median age of 59 years (adjusted P < 0.001, Figure S3). KRAS mutations were also found to be associated with smoking status (adjusted P = 0.039). There was no significant correlation between age, gender and genetic alterations among the LUSC patients.

Discussion
The understanding and management of lung cancer has advanced significantly in the past decade. Even with the increasing importance of molecular testing to identify actionable mutations for targeted therapy, histopathological classification of cancer subtypes is still an essential component of clinical diagnosis and making optimal treatment decisions, particularly in patients with no actionable mutations.
To the best of our knowledge, our study is the first to use a unified strategy to compare the mutational profile of LUAD and LUSC predominant histological subtypes in Chinese NSCLC patients.
In our cohort of 215 LUAD NSCLC patients, adenocarcinoma in situ and acinar subtypes were the two most prevalent LUAD histological subtypes. This is in contrast to the histological prevalence in Caucasian patients, where the top two subtypes were acinar followed by solid. 11,21 Since previous studies have only used traditional methods of molecular testing, existing literature on the genetic alterations in various histological subtypes is mostly limited to the detection rates of EGFR and KRAS mutations. EGFR mutations are detected in 10%-30% of LUAD patients; however, this prevalence increases to approximately 50% among Chinese LUAD patients. 13,22 Hence, we only considered the reports that included Chinese patients. A previous study reported that among Chinese LUAD patients, lepidic and micropapillary subtypes had the most EGFR mutations, with approximately 70% EGFR-mutant patients in each subtype, while the solid subtype had the fewest EGFR-mutant patients. 18 Meanwhile, another study reported EGFR mutation detection rates of 68.8%, 70.7%, 69.5%, 22.5%, 80.0%, and 25.0% in Chinese patients with lepidic, papillary, acinar, solid, micropapillary and mucinous subtypes, respectively, with no EGFR mutation detected in the one case of fetal adenocarcinoma. 23 In contrast, in our cohort, EGFR sensitizing mutations were generally more common in preinvasive and minimally invasive subtypes. Meanwhile, invasive subtypes such as acinar, micropapillary, and papillary also had a substantial number of EGFR-mutant patients. However, the fewest EGFR sensitizing mutations were found in patients with colloid (1/10), invasive mucinous (0/14) and fetal (0/2) subtypes. The prevalence of KRAS mutations in our cohort, in turn, was consistent with the reported prevalence. 17 In our cohort, KRAS mutations were also more prevalent in invasive mucinous (6/14), colloid (3/10), enteric (4/22) and solid (6/24) subtypes.
Furthermore, we revealed distinct mutation profiles for Chinese LUAD and LUSC patients. In contrast to LUAD patients, the LUSC patients in our cohort had significantly more amplification events and TP53 mutations. Similar to the findings of the TCGA 12 and a study among 104 Korean LUSC patients by Kim et al., 24 significantly fewer EGFR and KRAS mutations were detected in Chinese LUSC than LUAD patients. However, another mutational profiling study involving 157 Chinese LUSC patients reported mutation incidences of 56% for TP53, 8.9% for CDKN2 and 8.9% for PIK3CA, 25 and these incidence rates were significantly lower than those observed in our cohort. Studies involving larger cohorts are necessary to validate the incidence rates.
Interestingly, we detected several rare mutations in known oncogenic genes including EGFR and BRAF and unreported fusion partners for RET, FGFR1 and ALK in our cohort. With the increasing use of molecular profiling in the clinical setting, more novel mutations and fusion partner genes are being uncovered. However, the clinical significance of these rare mutations and novel fusion partners requires further study. Our study is limited by its retrospective nature. Clinical data for some of the patients were incomplete, which limits the analysis of clinical features and histological subtypes of the cohort. The limited availability of tissue samples also limits our ability to explore the clinical significance of rare mutations detected in the study. Future prospective multi-center studies with a larger cohort are required to further explore the mutation profile of the different subtypes. It would be interesting to explore the stratification of molecular subtypes according to distinct mutation signatures and their clinical responses to certain inhibitors.
In summary, this comparative study revealed the mutational distinction between LUAD and LUSC as well as their predominant histological subtypes in the Chinese population. Given the inherent genetic heterogeneity among tumor subtypes, we further emphasize the need to include comprehensive mutational profiling in the standard management of lung cancer patients of all histological subtypes to understand the genetic landscape of the tumor and further inform clinical decisions.

Supporting Information
Additional Supporting Information may be found in the online version of this article at the publisher's website:
Figure S1 Mutational spectrum of the LUAD patients. Each column represents a patient and each row represents a gene. Top plot represents the overall number of mutations a patient carried. Side bars represent the percentage of patients with a certain mutation. Different colors denote different types of mutation. Negative denotes the absence of any mutation. ACI, acinar adenocarcinoma; AIS, adenocarcinoma in situ; COL, colloid adenocarcinoma; ENT, enteric adenocarcinoma; FET, fetal adenocarcinoma; IMA, invasive mucinous adenocarcinoma; LPA, lepidic adenocarcinoma; MIA, minimally invasive adenocarcinoma; MP, micropapillary adenocarcinoma; PAP, papillary adenocarcinoma; Solid, solid adenocarcinoma; Unknown, LUAD with unclassified subtype.
Figure S2 Mutational profile of LUSC patients according to histological subtypes. Each column represents a patient and each row represents a gene. Top plot represents the overall number of mutations a patient carried. Side bars represent the percentage of patients with a certain mutation. Different colors denote different types of mutation. Negative denotes the absence of any mutation. BSC, basaloid squamous cell carcinoma; KSC, keratinizing squamous cell carcinoma; NKSC, nonkeratinizing squamous cell carcinoma; SIS, squamous cell carcinoma in situ; Unknown, LUSC with unclassified subtype.
Figure S3 The relationship between molecular and clinical features in LUAD patients. Box plot illustrating the relationship between age of the LUAD patients and TP53 mutation. X-axis denotes the TP53 mutation status, negative for wild-type. Y-axis denotes the age of the patients in years.