The role of YTH domain containing 2 in epigenetic modification and immune infiltration of pan‐cancer

Abstract

YTH domain containing 2 (YTHDC2) is the largest N6‐methyladenosine (m6A) binding protein of the YTH protein family and the only member with ATP‐dependent RNA helicase activity. To further analyse its biological role in epigenetic modification, we comprehensively explored YTHDC2 in pan‐cancer in terms of gene expression, genetic alteration, protein‐protein interaction (PPI) network, immune infiltration, diagnostic value and prognostic value, using a series of databases and bioinformatic tools. We found that missense mutation of YTHDC2 was associated with a different prognosis in uterine corpus endometrial carcinoma (UCEC), and that its methylation level was associated with markedly different prognoses in adrenocortical carcinoma (ACC), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), lung squamous cell carcinoma (LUSC) and UCEC. The main molecular mechanisms of YTHDC2 involved catalytic activity, helicase activity, snRNA binding, spliceosome and mRNA surveillance. Additionally, YTHDC2 was notably correlated with tumour immune infiltration. Moreover, YTHDC2 had a high diagnostic value for seven cancer types and a prognostic value for brain lower grade glioma (LGG), rectum adenocarcinoma (READ) and skin cutaneous melanoma (SKCM). Collectively, YTHDC2 plays a significant role in epigenetic modification and immune infiltration and may be a potential biomarker for diagnosis and prognosis in certain cancers.

the initiation of translation and promote protein synthesis by interacting with eIF3 [7]. YTHDF2 mediates m6A-dependent mRNA decay [8]. YTHDF3 plays an important part in the translation of m6A-containing mRNAs through the sequential recruitment of other effectors [9]. YTHDC1 regulates gene splicing and modulates the export of m6A-modified mRNA [2]. YTHDC2 is the largest member (approximately 160 kDa) of the YTH protein family and is distinct from the other YTH proteins because of its helicase domains. YTHDC2 preferentially binds conserved m6A-modified motifs and executes its function as an m6A reader by enhancing translation efficiency and decreasing mRNA abundance [10][11][12][13]. Previous studies have shown that YTHDC2 has a crucial effect on cancer metastasis through increased translation efficiency of HIF-1α in colon cancer patients [14] and contributes significantly to the proliferation of hepatocellular carcinoma cells [15]. YTHDC2 expression has also been associated with prognosis, apoptosis activation and ubiquitin-mediated proteolysis in head and neck squamous cell carcinoma (HNSCC) [16].

However, although certain studies have examined YTHDC2, no single study has evaluated its effects across a wide range of cancer types. Given the growing interest in pan-cancer genetic analysis, which helps assess the diagnostic and prognostic value of genes as a whole, we explored YTHDC2 expression in pan-cancer to further identify the influence of YTHDC2 on tumour promotion and suppression. We observed that YTHDC2 expression differed significantly between cancers and normal tissues and was down-regulated in most tumours. We also found that YTHDC2 not only showed a high diagnos-

| Gene expression analysis
RNA-seq data and the relevant clinical data, in level 3 transcripts per million reads (TPM) format, for 15,776 samples across 33 tumour types from The Cancer Genome Atlas (TCGA; https://portal.gdc.cancer.gov/; tcga_RSEM_gene_tpm) and the Genotype-Tissue Expression (GTEx) database (dataset ID: gtex_RSEM_gene_tpm) were downloaded from UCSC Xena (https://xenabrowser.net/datapages/). The RNA-seq data in TPM format were then log2-transformed for expression comparison between samples. R software v3.6.3 was used for statistical analysis, and the ggplot2 package was used for visualization. The Wilcoxon rank-sum test was used to compare the two groups, and p < 0.05 was considered statistically significant (ns, p ≥ 0.05; *, p < 0.05; **, p < 0.01; ***, p < 0.001) [17]. We also displayed the differential expression of YTHDC2 between tumours and adjacent normal tissues using the Tumor Immune Estimation Resource (TIMER).
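The transformation and group comparison described above can be sketched in a few lines. The authors worked in R; the Python sketch below is only an illustration under stated assumptions. The pseudocount of 1 inside the logarithm, the function names, and the use of a normal approximation with tie correction (no continuity correction and no exact small-sample p-value) are choices made here, not details given in the text, and the toy TPM values are made up.

```python
import math
from statistics import NormalDist


def log2_tpm(tpm, pseudocount=1.0):
    """Log2-transform TPM values; the pseudocount is an assumption, not from the paper."""
    return [math.log2(x + pseudocount) for x in tpm]


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using a normal approximation.

    Returns (U statistic for x, p-value). Illustrative only: statistical
    software may use an exact test or a continuity correction instead.
    """
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_counts = {}
    for v in combined:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u1, min(p, 1.0)


def stars(p):
    """Map a p-value to the significance labels used in the text."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


# Made-up TPM values for one gene in tumour and normal samples.
tumour = log2_tpm([3.0, 5.5, 2.1, 4.4])
normal = log2_tpm([9.8, 12.0, 8.7, 10.1])
u_stat, p_value = rank_sum_test(tumour, normal)
print(u_stat, stars(p_value))
```

With only a handful of samples per group the normal approximation is rough, so a sketch like this is better suited to checking the logic than to reproducing published p-values.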

| Genetic alteration analysis
Genomic profiles, including the alteration frequency, mutation type and copy number alterations (CNA) across all TCGA tumours, were obtained from the cBioPortal website (https://www.cbioportal.org/). Kaplan-Meier plots with log-rank p-values were generated from data on the overall survival (OS), progression-free survival (PFS), disease-free survival (DFS) and disease-specific survival (DSS) of UCEC cases with or without YTHDC2 genetic alteration [18,19]. Additionally, we used Gene Set Cancer Analysis (GSCA; http://bioinfo.life.hust.edu.cn/) [20], an integrated genomic and immunogenomic web-based platform for gene set cancer research, to assess the correlation between YTHDC2 methylation and prognosis in cancers.

| PPI network analysis
The online STRING (https://string-db.org/) tool provides investigators with systematic and comprehensive functional annotation tools for identifying the biological significance of an extensive list of genes. We obtained 50 YTHDC2-binding proteins by setting the following main parameters: minimum required interaction score ('medium confidence [0.400]'), meaning of network edges ('evidence'), max number of interactors to show ('no more than 50 interactors' in 1st shell) and active interaction sources ('Experiments, Text mining, Databases'). Then, Cytoscape (version 3.7.2) was applied for visualization of PPI networks.

| Functional and pathway enrichment analysis
In our study, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were conducted for YTHDC2-binding proteins, using the clusterProfiler package for statistical analysis and the ggplot2 package for visualization [21,22].

2.5 | Distribution of YTHDC2 expression in molecular subtypes and immune subtypes of cancers
TISIDB (cis.hku.hk/TISIDB/) [23] is a web portal for tumour and immune system interaction, which integrates multiple heterogeneous data types. We explored the association between YTHDC2 expression and molecular subtypes or immune subtypes across TCGA tumours using the TISIDB database.

TCGA tumours. Additionally, we used GSCA [20] to assess the correlation between YTHDC2 expression and immune infiltration. We also used the immunedeconv package for reliable immune score evaluation. Furthermore, we examined the correlation between YTHDC2 expression and immune checkpoint-related genes (SIGLEC15 [27], IDO1 [28], CD274 [29], HAVCR2 [30], PDCD1 [31], CTLA4 [26], LAG3 [32] and PDCD1LG2 [33]) using R software. In the resulting figure, the horizontal axis represents the expression of immune checkpoint-related genes, and the vertical axis represents different tumour tissues. Colours represent correlation coefficients: blue indicates positive correlation, red indicates negative correlation, and darker colours indicate stronger correlation (*p < 0.05, **p < 0.01, ***p < 0.001).
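The text above does not name the correlation measure used for the checkpoint genes. The discussion mentions Spearman correlation for the immune score analysis, so the following Python sketch assumes a Spearman rank correlation here as well. It is a stdlib-only illustration rather than the authors' R code; the function names and the toy expression values are hypothetical, and no p-value is computed.

```python
import math


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = midranks(x), midranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / math.sqrt(var_x * var_y)


# Made-up log2 expression values for five samples of one tumour type.
ythdc2 = [4.1, 5.3, 3.8, 6.0, 4.9]
checkpoints = {
    "CD274": [2.0, 2.9, 1.7, 3.5, 2.4],
    "CTLA4": [3.1, 1.2, 3.3, 0.8, 2.0],
}
for gene, values in checkpoints.items():
    print(gene, round(spearman_rho(ythdc2, values), 3))
```

Repeating the loop over tumour types would give one coefficient per gene and tumour, which is the kind of grid a tumour-by-gene correlation figure displays.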

| Diagnostic value analysis
Clinicopathological data for patients across 33 tumour types from the TCGA database and the corresponding normal tissue data from the GTEx database were extracted to assess the diagnostic value of YTHDC2 by receiver operating characteristic (ROC) curve analysis, using the pROC package for analysis and the ggplot2 package for visualization. Note: the area under the ROC curve (AUC) lies between 0.5 and 1. The closer the AUC is to 1, the better the diagnostic performance. An AUC of 0.5-0.7 indicates low accuracy, an AUC of 0.7-0.9 indicates a certain accuracy, and an AUC above 0.9 indicates high accuracy.
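The AUC and the accuracy bands in the note can be illustrated with a short Python sketch. This is not the pROC implementation: it computes the AUC as the fraction of tumour-normal pairs ranked correctly (ties count one half) and then reports max(AUC, 1 - AUC), an assumption made here so that values fall in the 0.5 to 1 range the note describes even when the marker is lower in tumours. How the boundary values 0.7 and 0.9 are assigned is also an assumption, and the function names and toy values are hypothetical.

```python
def auc_pairwise(positives, negatives):
    """AUC as the probability that a positive scores above a negative (ties = 0.5)."""
    if not positives or not negatives:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def diagnostic_auc(tumour, normal):
    """Direction-free AUC in [0.5, 1]; the max() step is an assumption of this sketch."""
    auc = auc_pairwise(tumour, normal)
    return max(auc, 1.0 - auc)


def accuracy_label(auc):
    """Accuracy bands from the note; boundary handling at 0.7 and 0.9 is assumed."""
    if auc > 0.9:
        return "high"
    if auc >= 0.7:
        return "certain"
    return "low"


# Made-up log2 expression values; the marker is lower in tumours here.
tumour_expr = [2.1, 2.8, 3.0, 2.4, 3.6]
normal_expr = [3.9, 4.4, 3.2, 4.8, 4.1]
auc = diagnostic_auc(tumour_expr, normal_expr)
print(round(auc, 3), accuracy_label(auc))
```

The pairwise definition is quadratic in the number of samples, which is fine for illustration; rank-based formulas give the same value more efficiently on large cohorts.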

| Survival prognosis analysis
Kaplan-Meier plots were used to assess the relationship between YTHDC2 expression level and prognosis for various cancer types. The survival package was used for statistical analysis of the survival data, and the survminer package was used for visualization. The log-rank test was used for hypothesis testing, and p < 0.05 was considered statistically significant.
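For readers unfamiliar with these two methods, the Python sketch below shows a Kaplan-Meier estimate and a two-group log-rank test written from their textbook definitions. It is an illustration only and not the survival package code the authors used; the function names are hypothetical, the follow-up times are made up, and how patients were split into expression groups is not specified in the text, so the two groups are simply passed in.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events holds 1 for an observed event and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square with 1 df, p-value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p


# Made-up follow-up times (months) and event indicators for two groups.
low_t, low_e = [5, 8, 12, 14, 20], [1, 1, 1, 0, 1]
high_t, high_e = [10, 18, 25, 30, 36], [1, 0, 1, 0, 0]
print(kaplan_meier(low_t, low_e))
print(logrank_test(low_t, low_e, high_t, high_e))
```

The survival curve only drops at observed event times; censored patients leave the risk set without lowering the estimate, which is why they are counted in `leaving` but not in `deaths`.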

| Analysis of YTHDC2 expression in cancers
We displayed YTHDC2 expression in normal tissues from the GTEx database (Figure 1A

| Genetic alteration analysis of YTHDC2
The results of the preliminary analysis of the genetic alteration status of YTHDC2 are presented in Figure 2. The highest alteration frequency (8.36%) occurred in patients with UCEC, with mutation as the main alteration type. Furthermore, samples of CESC with genetic alteration only had mutation of YTHDC2, with alteration frequencies of 4.35% and 3.19%, respectively. In addition, patients with ACC had a higher amplification frequency (2.2%) than all the other tumours, and all mature B-cell neoplasm cases with genetic alteration had deep deletion (2.08%) of YTHDC2 (Figure 2B). We also observed 268 mutations of YTHDC2, presented in Figure 2C with their types, sites and case numbers, and found that the key alteration type was missense mutation. As shown in Figure 2D-G, UCEC patients with YTHDC2 genetic alteration had a better prognosis than those without YTHDC2 genetic alteration in terms of PFS, DFS and DSS. However, patients with YTHDC2 hypermethylation had a worse OS than patients with YTHDC2 hypomethylation in ACC, CESC, LUSC and UCEC (Figure 3).

| Enrichment analysis of YTHDC2-related genes in cancers
We screened out 50 targeted binding proteins of YTHDC2 using the STRING website and displayed the interaction network with Cytoscape (Figure S1). Then, we conducted GO and KEGG pathway enrichment analyses to further investigate the molecular mechanism of YTHDC2 (Figure 4, Figure S2). The results suggested that the biological process (BP) terms were primarily related to RNA splicing, RNA 3'-end processing, DNA-templated transcription termination and termination of RNA polymerase II transcription; the major as-

| Molecular subtype and immune subtype analysis in cancers
We used the TISIDB database to show the correlation between YTHDC2 expression and molecular subtypes or immune subtypes across TCGA tumours. As shown in Figure S3

| Immune infiltration analysis in cancers
We used TIMER to assess the correlation between YTHDC2 ex-

We further explored the correlation between YTHDC2 and eight immune checkpoint-related genes and found that YTHDC2 was significantly associated with the expression of SIGLEC15, IDO1, CD274, HAVCR2, PDCD1, CTLA4, LAG3 and PDCD1LG2 in almost all types of cancers except for CESC and TGCT (Figure 5).

| Diagnostic value of YTHDC2 in cancers
We performed ROC curve analysis to evaluate the diagnostic

contributed to the proliferation in oesophageal squamous-cell carcinoma [37]. However, a review of the literature found no study that comprehensively assessed its distinct influence on pan-cancer.

| Survival prognosis of YTHDC2 in cancers
In the present study, we first examined YTHDC2 expression in TCGA tumours and found that YTHDC2 was significantly downregulated in the majority of cancers compared with normal samples, in-

Then, we observed that missense mutation was the primary mutation type of YTHDC2, and the highest incidence occurred in

As previous studies demonstrated, YTHDC2 can promote the translation efficiency of mRNAs with secondary structures, probably owing to its RNA helicase activity [38], and its RNA-binding domains contribute to the interaction between m6A-containing mRNAs and ribosomes. Additionally, YTHDC2 promotes RNA degradation to regulate mRNA stability [39]. Thus, we conducted GO and KEGG analysis of 50 targeted binding proteins of YTHDC2 to further investigate its molecular mechanism, revealing that catalytic activity, helicase activity, snRNA binding, spliceosome and mRNA surveillance were the main enriched terms across the BP, MF and pathway analyses.

Prior studies have noted the crucial correlation between m6A modification and the tumour microenvironment, implying that diverse m6A modification patterns play a vital role in the diversity and complexity of the tumour microenvironment [40,41]. According to Spearman correlation analysis of immune score and YTHDC2 gene expression, our study found that YTHDC2 was critically involved in the immune infil-

[45][46][47] and predictive model [48,49].

In summary, our study helps uncover the cancer-promoting or cancer-suppressing effects of YTHDC2 across various cancer types and provides evidence on the role of YTHDC2 in tumour immune infiltration, diagnostic value and clinical prognosis.

CONFLICT OF INTEREST
The authors declare that there is no conflict of interest.

CONSENT TO PARTICIPATE
Not applicable.

CONSENT FOR PUBLICATION
Not applicable.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analysed in this study, which can be found in UCSC Xena (https://xenabrowser.net/datapages/).