Prognostic signature and immune efficacy of m1A‐, m5C‐ and m6A‐related regulators in cutaneous melanoma

Abstract Cutaneous melanoma (CM) is an aggressive cancer; given that initial and specific signs are lacking, diagnosis is often late and the prognosis is poor. RNA modification has been widely studied in tumour progression. Nevertheless, little progress has been made in the signature of N1‐methyladenosine (m1A), 5‐methylcytosine (m5C), N6‐methyladenosine (m6A)‐related regulators and the tumour microenvironment (TME) cell infiltration in CM. Our study identified the characteristics of m1A‐, m5C‐ and m6A‐related regulators based on 468 CM samples from the public database. Using univariate, multivariate and LASSO Cox regression analysis, a risk model of regulators was established and validated by a nomogram on independent prognostic factors. The gene set variation analysis (GSVA) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) clarified the involved functional pathways. A combined single‐sample gene set enrichment analysis (ssGSEA) and CIBERSORT approach revealed TME of regulator‐related prognostic signature. The nine‐gene signature stratified the patients into distinct risk subgroups for personalized prognostic assessment. Additionally, functional enrichment, immune infiltration and immunotherapy response analysis indicated that the high‐risk group was correlated with T‐cell suppression, while the low‐risk group was more sensitive to immunotherapy. The findings presented here contribute to our understanding of the TME molecular heterogeneity in CM. Nine m1A‐, m5C‐ and m6A‐related regulators may also be promising biomarkers for future research.

RNA modification is a way of post-transcriptional regulation, which works as an additional link between transcription and translation and is crucial for the event of many diseases. Over one hundred styles of RNA modifications have been discovered, the most common one being m 6 A methylation. 3  Immunotherapy showed superb clinical effectiveness in a minority of CM patients with long-lasting effects. 9,10 Yet, the vast majority of patients have to endure the cost of frequent and serious immune-related adverse events, which greatly hampers the actual efficacy. 11 It has been widely accepted that tumour microenvironment (TME) plays a critical role in malignancy evolvement and immune regulation. TME incorporates not solely tumour cells but also stromal cells (fibroblasts and macrophages), and immune cell infiltration (ICI), chemokines and growth factors. Alternative TME components directly or indirectly interact with tumour cells to cause changes in a variety of physical behaviours, such as apoptosis resistance, 12 hypoxia tolerance 12 and immune dysfunction. 13 With the deepening of the understanding of the TME, several studies have demonstrated that TME plays a pivotal role in tumour progression, immune escape and immunotherapy response. 14 Therefore, an extensive analysis of the TME landscape can effectively improve the ability to guide and predict the immunotherapy response.
This study set out to identify the potential signature of m 1 A-, Cox regression analyses, we established a regulator-related risk predictive model to distinguish the level of risk among patients with CM. The importance and originality of this study are that it further revealed the underlying connection between the regulator-related risk predictive signature and the ICI characteristics of TME. This novel signature could be used to evaluate the sensitivity of CM patients to immunotherapy.

| Selection of m 1 A-, m 5 C-and m 6 A-related regulators
A total of 48 m 1 A-, m 5 C-and m 6

| Data Acquisition
All of the clinical patient data, mutations, copy-number variation (CNV) and mRNA expression data were downloaded from the TCGA

| Establishment and validation for the prognostic signature of m 1 A-, m 5 C-and m 6 Arelated regulators
A total of 46 regulators expressed in TCGA-SKCM were enrolled in the survival analysis ( Figure S5). The identification of m 1 A-, m 5 C-and m 6 A-related prognostic genes was carried out using univariate Cox regression analysis, and genes were considered significant with a cut-off of p < 0.05. The selected factors in the LASSO regression were analysed by multivariate analysis. The risk score was generated as follows: The patients were stratified into high-risk and low-risk groups based on the median risk score. For the evaluation of the overall survival (OS) of high-and low-risk groups, the Kaplan-Meier (K-M) survival analysis was performed by the R package 'survival'. The same analysis was also conducted in the external validation set. Clinical information for the training set and the external validation set is risk score = e sum(each gene's expression levels × corresponding coefficient) e sum(each gene's mean expression levels × corresponding coefficient) .
presented in Table 1 and Table S5, respectively. The assessment of risk score prognostic efficiency was conducted based on the areas under the curve (AUCs) of the time-dependent receiver operating characteristic (ROC) curve in the R package 'TimeROC'. 19

| Independent prognostic roles of the regulator-related signature
The univariate and multivariate Cox regression analyses were performed to test the hypothesis that the prognostic gene signature could be independent of other clinical parameters (including gender, radiotherapy, chemotherapy, N stage, T stage, M stage, stage and age). P < 0.05 was considered to be statistically significant.

| Development of a nomogram
A nomogram was constructed based on the independent prognostic factors by the survival and the rms R package. The calibration curves and the concordance index (C-index), ranging from 0.5 to 1.0, were used to gauge the model's ability to predict prognosis. The value of 0.5 and 1.0 represents a random chance and the optimal performance with the prognosis model, respectively.

| GSVA and functional annotation
The 'GSVA' R package was used to test the enrichment of m 1 A-, m 5 Cand m 6 A-related regulatory gene signatures in the normalized gene expression table. Non-parametric tests and unsupervised method were bound to compare the number of the pathway and biological process activity in the samples of an expression data set. 20 Adjusted P with a value less than 0.05 was considered statistically significant.

| Pathway analysis for the differentially expressed genes (DEGs) of the regulator-related risk model
The DEGs between different risk groups were analysed with func-

| Estimation of TME immune cell infiltration
We utilized the ssGSEA algorithm to quantify the relative abundance of each cell infiltration in the cutaneous melanoma TME. The enrichment score calculated by the ssGSEA was utilized to represent the relative abundance of each TME-infiltrating cell in each sample. The gene set for marking each TME infiltration immune cell type was the difference between the empirical cumulative distribution function, which is similar to the one used in GSEA, but instead based on absolute expression rather than differential expression.

| Statistical analysis
All analyses were carried out by the R software (version 3.5.2).
Distributions of OS were compared using the log-rank test. The  (Table S1). The incidence of SNV and CNV of 46 regulators was summarized in TCGA skin cutaneous melanoma (SKCM). Figure 2 presents the landscape of alteration obtained from the preliminary analysis of the 46 regulators. Missense mutation was the most frequent mutation event ( Figure 2A). The top 20 mutated genes were identified based on the number of mutations in TCGA-SKCM using the 'maftools' R package ( Figure 2B). Among 245 samples, 183 experienced mutations of regulators, with a frequency of 74.69%. It was found that the TET1 exhibited the highest mutation frequency. The investigation of the 46 regulators exhibited a prevalent CNV alteration and amplification in copy number. Figure 2C displays the CNV alteration location of the 46 regulators. A total of 467 CM samples were included for SNV studies. For CNV analysis, there were 416 CM samples. Survival analysis revealed that CM with SNV had a worse prognosis than that without mutations ( Figure 3A). In contrast, there were no significant differences between CNV and wild type ( Figure 3C). Four of 44 regulators were identified in 468 CM samples that had a significant association with different SNV patterns (Table S2). DNMT3B and UNG with SNV were linked to reduced mRNA expression, while increases were observed for DNMT1 and HNRNPC ( Figure 3B). Dose compensation effects were a contributing factor for the relationship between CNV alterations and gene expression levels. 22 The next analysis focused on how CNV patterns attributed to expression of regulators. The correlation between the two in 416 CM samples is interesting because increased copy numbers of 45 regulators showed high expression, while deletions presented low expression ( Figure S1). Only IGF2BP lacked a meaningful result (Table S2). These findings indicated that SNV and CNV of m 1 A-, m 5 C-and m 6 A-related genes could affect the expression of key regulatory molecules and contribute to CM progression.

| Identification of the regulatory gene expression relevant to clinical prognosis
To explore the clinical signature of m 1 A-, m 5 C-and m 6

| Construction of regulator-related prognostic risk model
The results presented above indicated that the m 1 A-, m 5 C-and  (Table S3). The regression coefficient of the 12 regulators was computed using the LASSO Cox regression analysis ( Figure 4B and Figure 4C). We identified nine regulators: UNG, FMR1, MBD4, MBD2, NEIL1, WTAP, UHRF2, YTHDF1 and RBM15B ( Figure 4D). To calculate the patient's risk score, a multivariate Cox regression analysis with nine genes was conducted (Table S4). The distribution of the risk score, vital status and expression levels of the corresponding nine regulators in the TCGA data set is shown in Figure 5A and Figure 5B. Using the median risk score to divide patients into the high-risk and low-risk groups, the K-M   Figure 5C).

| Internal and external validation of the nineregulator-related risk model
In order to examine the performance of the risk model based on nine regulators, we calculated the AUC at 3, 5 and 7 years ( Figure 6A). All were greater than 0.64. To further validate the efficacy of the nineregulator-related gene signatures, we also performed the above analysis in the GSE10 0797 data set (external validation set). Based on median risk values, 13 CM patients were classified as high-risk group and 12 as low-risk group. As the risk score increased, so did the number of deaths ( Figure 5E). The expression pattern of prognostic regulators between the two groups was almost identical to that in the training set ( Figure 5G)  (Table 2 and Table S4); the results indicated that risk score, N stage, T stage and age were independent prognostic factors of OS in CM ( Figure 5D). To develop a clinically applicable way for the prediction of survival status in CM patients, a nomogram was established to predict the 3-year and 5-year OS probability in 468 CM patients ( Figure 6C). The calibration plot for nomogram suggested its high predictive accuracy and sensitivity in CM patients ( Figure 6D).

| Functional enrichment analyses for the nineregulator-related risk subgroups
The GSVA enrichment analysis was employed to investigate the underlying biological activities among the high-and low-risk groups. As shown in Figure 7A and Table S6, the high-risk group was markedly en-  Figure 7C; Figure S4).

| Immune infiltration characteristics of TME within nine-regulator-related risk subgroups
The ssGSEA was conducted to investigate the ICI pattern related to the risk score based on transcriptome profiling data for 468 SKCM patients from the TCGA database. The low-risk group was remarkably enriched in innate ICI, which mainly included natural killer cells, macrophages, activated dendritic cells (aDCs), B cells and T cells ( Figure 8A). Previously, the low-risk group identified matched survival advantage in TCGA-SKCM ( Figure 5C). The results of the GSVA showed that the high-risk group was significantly associated with stromal activation. It has been reported that T-cell suppression could activate the TME stroma. 23 Coincidentally, ssGSEA showed a significant decrease in T-cell activity in the high-risk group ( Figure 8B). Therefore, we speculated that stromal activation in the high-risk group inhibited the antitumour effect of immune cells. Based on the above analyses, we were surprised to find that the two groups had radically distinct TME cell infiltration characterization. Based on the three scores generated by the ESTIMATE algorithm, we analysed the relationship between scores and high-/low-risk groups.
As shown in Figure 8C, we could see that high-and low-risk groups had a significant effect on immune, stromal and ESTIMATE scores (all   1 (0.99-1.3 p < 0.001). The highest ESTIMATE score was found in the high-risk group (p = 4.1e-08). The low-risk group had the highest immune and stromal scores, whereas the high-risk group had the lowest. We then used a deconvolution algorithm, the CIBERSORT method, for determining the immune cell type in CM, to compare the component differences in immune cells between the high-and low-risk groups. We found that there were significant differences in the compositions of TME cell types between the high-and low-risk groups ( Figure 8D). Therefore, we concluded that the immune-related biological processes and pathways associated with the risk score might result from the observed significant differences in the various immune cell types.

| Assessment of immunotherapy response in nine-regulator-related risk subgroups
A relationship between polymorphisms of human leucocyte antigen (HLA) and the tumour proliferation and immune escape has been studied in the literature. 24 Pioneering investigations revealed that immunotherapy targeting immune checkpoints (ICPs) offered great hope for the clinical treatment of human cancers. Given that the HLA was differentially expressed between high-and low-risk groups ( Figure 8E), it was necessary to analyse the correlation between the expression levels of several well-known immune checkpoint molecules (ICMs) in distinct risk subgroups. In the high-risk group, ICMs were lowly expressed, suggesting that the low-risk group could be more suitable for immunotherapy. CTLA-4 and PD-1 as well-known ICBs improved OS in especially CM patients with metastatic or advanced disease. 25,26 Potential response to immunotherapy in patients from the different risk subgroups was modelled on TIDE instructions, and T-cell dysfunction and rejection were used to predict the performance of ICBs in the two subgroups. We found that the low-risk group (23.923%, 50/209) was significantly more eligible for immunotherapy compared with the high-risk group (17.703%, 37/209) ( Figure 8F). SubMap is an unsupervised clustering method that can match subclasses in two independent gene expression data sets. 27 Herein, an inspection of the SubMap analysis data in Figure 8G revealed that the low-risk group was more likely to respond to anti-PD-1 therapy (p < 0.05).

| DISCUSS ION
CM is a potentially deadly form of skin cancer, and its pathogenesis remains controversial. 25 As a consequence of the molecular hetero-  Figure S1).   TME plays a vital role in the therapeutic resistance in CM patients. 14 In particular, the specific mechanism of interaction between TME and immune cell infiltration (ICI) significantly influences CM prognosis. 36,37 In this study, the high-risk group with survival disadvantage was rich in immune full activation pathway, which might be

ACK N OWLED G M ENTS
We thank LetPub (www.letpub.com) for its linguistic assistance during the preparation of this manuscript.

CO N FLI C T O F I NTE R E S T
The authors confirm that there are no conflicts of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data analysed in this study were derived from the public domain resources: http://gdc.cancer.gov and http://www.ncbi.nlm.nih.gov/geo/.