Gene expression changes associated with malignant transformation of oral potentially malignant disorders.

BACKGROUND
A large number of oral squamous cell carcinomas (OSCCs) are believed to be preceded by oral potentially malignant disorders (OPMD) that have an increased likelihood of malignant transformation compared to clinically normal mucosa. This study was performed to identify differentially expressed genes between OPMDs that underwent malignant transformation (MT) and those that did not, termed 'non-transforming' (NT) cases.


METHODS
Total RNA was extracted from formalin-fixed paraffin-embedded tissue biopsies of 20 OPMD cases with known clinical outcomes (10 MT vs. 10 NT). Samples were assessed for quantity, quality and integrity of RNA prior to sequencing. Analysis for differential gene expression between MT and NT was performed using statistical packages in R. Genes were considered to be significantly differentially expressed if the False Discovery Rate corrected p-value was < 0.05.


RESULTS
RNA yield was variable but RNA purity was good (A260/A280 >1.90). Analysis of RNA-Sequencing outputs revealed 41 genes (34 protein-coding; 7 non-coding) that were significantly differentially expressed between MT and NT cases. The log2 fold change for the statistically significant differentially expressed genes ranged from -2.63 to 2.48, with 23 protein-coding genes being downregulated and 11 protein-coding genes being upregulated in MT cases compared to NT cases.


CONCLUSION
Several candidate genes that may play a role in malignant transformation of OPMD have been identified. Experiments to validate these candidates are underway. It is anticipated that this work will contribute to better understanding of the aetiopathogenesis of OPMD and development of novel biomarkers.


| INTRODUCTION
Epidemiological studies estimate that more than 300,000 new cases of, and 145,400 deaths from, oral cancers (inclusive of lip cancers) occur annually. 1 Approximately two-thirds of new oral cancer cases occur in men, and around 77% of oral cancer deaths are in less developed nations. 1 Most oral cancers are oral squamous cell carcinomas (OSCCs) and a proportion of OSCCs are believed to be preceded by clinical entities termed "oral potentially malignant disorders" (OPMDs). 2 OPMDs are defined as clinical disorders carrying an increased risk of OSCC developing in the oral mucosa, either in recognisable lesions or in clinically "normal" oral mucosa. 2 There are several possible clinical outcomes for an OPMD: the lesion may remain unchanged, increase in size, regress in size, disappear completely or undergo malignant transformation (MT).
Several epidemiological studies conducted in different areas of the world have shown that most OPMD do not undergo MT although they may persist. 2,3 A contemporary systematic review and meta-analysis described a mean overall MT rate of 12.1% in oral epithelial dysplasia (OED) whilst a recent systematic review found a 7.9% prevalence rate of MT in OPMD. 3,4 Currently, there is no reliable method to determine the clinical outcome of patients with OPMDs.
To compensate for the limitations in predicting malignant change, biomarkers have been sought based on an improved understanding of the underlying molecular pathogenesis of OSCC. Numerous individual biomarkers have been studied, but none have been validated for use in clinical practice.
By studying differential gene expression (DGE) between normal and abnormal tissue, in-depth understanding of the genetic pathways involved in carcinogenesis can be elucidated. Studies based on DGE have allowed researchers to dissect and examine the cancer transcriptome in a way that was not possible using conventional molecular biological methods. DGE has also contributed to the paradigm shift away from single biomarkers towards the use of gene expression signatures for diagnosis or prognosis.
The ability to identify patient sub-groups with similar molecular patterns in various tumour types has enabled researchers to define new molecular cancer sub-types, supporting better targeted therapy and patient care. A prime example is breast cancer, where at least five molecular sub-types with prognostic correlation were discovered. The findings were then further refined and validated, resulting in a predictive gene signature. 5 The lack of prognostic biomarkers in OPMD is a cogent reason to perform DGE-based studies to identify gene signatures for early diagnosis, therapy or prognosis in OPMD to inform targeted therapy. A recent meta-analysis performed by De There are very few DGE studies on OPMD or oral epithelial dysplasia (OED). 7-11 As yet, only one study has investigated DGE between OPMD that transformed to OSCC and those that did not. 7 Saintigny et al (2011) proposed gene expression-based prediction models that showed superior prognostic accuracy when compared to models using clinico-pathologic risk factors. 7 As such, further studies of DGE between OPMD that undergo malignant transformation and those that do not would provide much needed insight into the molecular mechanisms that lead to malignant transformation in OPMDs.
Whole transcriptome analysis is a major advancement in studying and understanding gene expression as it allows researchers to obtain a comprehensive view of the transcriptional profile at a given moment in time. A widely used method for profiling the whole transcriptome in a "snapshot" manner is RNA-Sequencing (RNA-Seq).
As it captures the whole transcriptome, RNA-Seq is able to detect gene transcripts and is suitable for assessing genes that are differentially expressed between different disease states. In this study, we have used RNA-Seq as a discovery platform to identify transcripts of genes that may be involved in the malignant transformation of OPMD.

| Patients
OPMD cases for this study were selected from a previously studied cohort of OPMD patients. 12 A case was classified as having undergone MT when there was progression from an OPMD to oral squamous cell carcinoma (OSCC) after a period of six months or more from the time of the initial diagnosis of OPMD.
The following exclusion criteria were applied: i) Previous history of head and neck cancer; ii) Previous history of radiotherapy to the head and neck region; iii) Patients with hereditary/acquired conditions that are linked to an increased risk of head and neck SCC (such as ataxia telangiectasia, xeroderma pigmentosum, Fanconi anaemia etc); iv) Patients that were diagnosed as having chronic hyperplastic candidosis; v) Cases with incomplete/inconsistent records; vi) Cases with inadequate/damaged/unavailable FFPE tissue for analysis.
Demographic (age at diagnosis, sex) and clinico-pathological data (site, clinical diagnosis) were recorded for each patient. The clinical outcome and the time to either malignant transformation or last follow-up were also recorded for each patient. All histopathological assessments were performed following a modified three-tier system adapted from the work published by Speight et al (2015) involving three oral and maxillofacial pathologists. 13 The cases were graded using the three-tiered (mild, moderate or severe) World Health Organization (WHO) 2017 classification and binary grading systems. 2 The pathologists were blinded to the clinical outcome of OPMD patients during the assessment and grading exercise.

| Bioinformatic analysis of RNA-Seq data
FastQ files generated from the sequencing runs were downloaded from the Illumina server using BaseMount, the command line interface for Illumina BaseSpace. Read quality of the FastQ files was assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc), and MultiQC (http://multiqc.info) was used to obtain summary statistics for the quality control tests on read quality. Reads were quantified against transcripts using "Kallisto". 14 To obtain gene-level counts, "tximport", a package for the R statistical programming language (R Foundation for Statistical Computing, Vienna, Austria), was used. Gene annotation was obtained from Ensembl transcript IDs using the R package "biomaRt". 15 The R package DESeq2 was used for normalisation and testing for differential gene expression by use of negative binomial generalised linear models. 16 Genes were considered to be significantly differentially expressed when the p-value, corrected for the False Discovery Rate (FDR) using the Benjamini-Hochberg method, was less than 0.05.
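The Benjamini-Hochberg correction and the 0.05 threshold described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used DESeq2 in R; the function names and the example p-values are hypothetical and serve only to show how raw p-values become FDR-adjusted values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


def significant_genes(p_by_gene, alpha=0.05):
    """Keep genes whose adjusted p-value is below alpha (hypothetical helper)."""
    genes = list(p_by_gene)
    adjusted = benjamini_hochberg([p_by_gene[g] for g in genes])
    return {g: q for g, q in zip(genes, adjusted) if q < alpha}


# Hypothetical raw p-values, for illustration only.
example = {"geneA": 0.01, "geneB": 0.04, "geneC": 0.03, "geneD": 0.005}
hits = significant_genes(example)
```

In this toy input all four genes pass the threshold because the adjusted values stay below 0.05; with thousands of genes tested, as in a whole-transcriptome analysis, the same raw p-values would be adjusted far more heavily.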
A hypergeometric test was carried out to assess over-representation of gene ontology (GO) terms amongst genes found to be significantly differentially expressed. The R package "GOStats" was used to implement this test. 17
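As a rough illustration of the over-representation test, the sketch below computes the upper-tail hypergeometric probability for one GO term in Python. It is an assumption-laden stand-in for what GOStats does in R: the function name, argument names and the numbers in the usage example are hypothetical, and GOStats' handling of the gene universe and the GO hierarchy is not reproduced here.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, drawn, overlap):
    """P(X >= overlap) for a hypergeometric draw without replacement.

    population: number of genes in the background universe
    annotated:  number of background genes carrying the GO term
    drawn:      number of differentially expressed genes
    overlap:    number of differentially expressed genes carrying the term
    """
    max_k = min(annotated, drawn)
    total = comb(population, drawn)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Hypothetical toy numbers: 10 background genes, 4 annotated to the term,
# 3 differentially expressed genes, 2 of which carry the term.
p_value = hypergeom_upper_tail(population=10, annotated=4, drawn=3, overlap=2)
```

A small p-value indicates that the term appears among the differentially expressed genes more often than expected by chance given the background; in practice the resulting p-values across many terms would also need correction for multiple testing.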

| Ethics
This study was approved by the National Research Ethics Service Committee Northeast (Evaluation of the prognostic potential and functional significance of biomarkers in oral cancer; NRES Committee Northeast - Sunderland 11/NE/0118) and complies with UK legislation and guidelines.

| Differential gene expression (DGE)
All samples passed the quality control assessments to proceed for downstream analysis. Reads were assessed using FastQC and overall quality was high. RNA yield was variable but RNA purity was good (A260/A280 > 1.90). As expected, the RNA was highly degraded (RIN 1.4-2.6). Bioinformatic analysis of RNA-Seq outputs revealed 41 genes that were significantly differentially expressed between MT and NT cases.

| GO enrichment analysis
To discover the functions of the differentially expressed genes, we performed GO enrichment analysis (Table S1). Twenty of the most significant GO biological process (GOBP) terms associated with the identified significantly differentially expressed genes are listed in Table 3. Of these, three GOBP terms were noted to have a high degree of association with oral carcinogenesis: regulation of response to wounding (Genes: IER3, CD46 and FAM46A), regulation of response to DNA damage stimulus (Genes: IER3, SPIDR and MUC1) and regulation of Notch signalling pathway (Genes: DLX2 and CD46).
A recent study by Conway et al (2015) also employed RNA-Seq to assess DGE in "normal", OED and OSCC tissues; however, all three tissue states ("normal", OED and OSCC) were obtained from the same excision specimen. 8 Due to the well-recognised theory of field change in OPMD patients, it is understood that histologically "normal" tissue may not be molecularly "normal" and free from molecular change which introduces a confounder to the results obtained by
The relatively small number of significantly differentially expressed genes identified in our study highlights the high degree of similarity between cases that undergo MT and those that do not.

This finding is consistent with the overall clinico-pathological features of OPMDs whereby it is difficult to accurately predict the clinical outcome of a patient with OPMD. Instead of focusing too much on individual genes, more emphasis should be placed on the pathways and biological processes involved.
Three of the GOBP terms found from the enrichment analysis, "regulation of response to DNA damage stimulus", "regulation of response to wounding" and "regulation of Notch signalling pathway", have been shown to be associated with carcinogenesis and have some degree of association with one another. 18-21 The relationship between DNA damage response (DDR) and carcinogenesis is well established, and in recent years there has been interest in the regulatory effect of the Notch signalling pathway on DDR. 22,23 The association between regulation of wounding, cancer and the Notch signalling pathway has also been studied with renewed interest in recent years, consistent with the hypothesis that cancer is an "over-healing wound". 18 shown recently to be overexpressed in breast and ovarian cancers as well as advanced stages of gastric adenocarcinoma suggesting a potential role in carcinogenesis. 27-29 The study by Lee et al (2011) suggests that DLX2 may be involved in tumour progression via metabolic-stress induced necrosis. 28 DLX2 has also been implicated in transforming the role of transforming growth factor β (TGFβ) from a tumour suppressor to a tumour promoter by increasing the expression of the mitogenic transcription factor c-Myc, directly suppressing TGFβ receptor II and reducing expression of the cell-cycle inhibitor p21CIP1. 29 The role of DLX2 in oral carcinogenesis, however, is currently unknown.

Decreased expression of CD46, which encodes a complement regulatory protein (membrane co-factor protein), was detected in cases that underwent malignant transformation. CD46 is also known as a complement restriction factor as it facilitates inactivation of C3b and C4b of the complement system. Interestingly, other studies have shown that CD46, together with other complement restriction factors such as CD55 and CD59, is expressed at higher levels in head and neck cancer tissue than in non-tumour tissue, suggesting that these proteins may play a role in tumour evasion of the complement system. 30 The decreased expression of CD46 observed in our study differs from that seen in OSCCs, suggesting that CD46 is dynamically expressed during oral carcinogenesis with possible temporal differences in expression before, during and after malignant transformation.
Archived formalin-fixed paraffin-embedded (FFPE) tissues are an invaluable resource that can be successfully used for molecular-based assays despite the degradation that often accompanies fixation and embedding of tissues in paraffin wax. Our study adds to the increasing body of work on utilisation of FFPE material for gene expression studies.
One of the limitations of our study is the relatively small number of cases included compared to the study by Saintigny et al (2011) that had an 86-patient cohort. 7 This was due to strict quality control resulting in exclusion of poor quality RNA samples. Another limitation is that gene expression studies only allow a snapshot of the transcriptomic profile at a given point in time, and as such give a very simplistic and static representation of a dynamic temporal process.
Furthermore, an OPMD that was categorised as being a non-transforming case may eventually undergo MT. However, RNA-Seq analysis for this study was to serve only as an initial broad overview of the transcriptomic differences between OPMD cases that undergo MT and those that do not.
In summary, our study has identified candidate genetic pathways that may play a role in malignant transformation of OPMD.
Experiments to validate these pathways and relevant genes are currently underway, and it is anticipated that this work will contribute to better understanding of the pathogenesis of OPMD and the development of novel prognostic biomarkers.