Prognostic impact and immunotherapeutic implications of NETosis‐related gene signature in gastric cancer patients

Abstract The role of NETosis and its related molecules remains unclear in gastric cancer. The data used in this study was directly downloaded from the Cancer Genome Atlas (TCGA) database. All analysis and plots are completed in R software using diverse R packages. In our study, we collected the list of NETosis‐related genes from previous publications. Based on the list and expression profile of gastric cancer patients from the TCGA database, we identified the NETosis‐related genes significantly correlated with patients survival. Then, CLEC6A, BST1 and TLR7 were identified through LASSO regression and multivariate Cox regression analysis for prognosis model construction. This prognosis model showed great predictive efficiency in both training and validation cohorts. We noticed that the high‐risk patients might have a worse survival performance. Next, we explored the biological enrichment difference between high‐ and low‐risk patients and found that many carcinogenic pathways were upregulated in the high‐risk patients. Meanwhile, we investigated the genomic instability, mutation burden and immune microenvironment difference between high‐ and low‐risk patients. Moreover, we noticed that low‐risk patients were more sensitive to immunotherapy (85.95% vs. 56.22%). High‐risk patients were more sensitive to some small molecules compounds like camptothecin_1003, cisplatin_1005, cytarabine_1006, nutlin‐3a (−)_1047, gemcitabine_1190, WZ4003_1614, selumetinib_1736 and mitoxantrone_1810. In summary, our study comprehensively explored the role of NETosis‐related genes in gastric cancer, which can provide direction for relevant studies.

about 90% of all cases. 5The factors contributing to GC are multifaceted, ranging from chronic gastritis, and Helicobacter pylori infection, to prolonged dietary and environmental exposures. 6Globally, it ranks as the fifth most diagnosed cancer and the third leading cause of cancer-related deaths. 7Although the prevalence has been decreasing in many Western countries due to dietary improvements and better food preservation, it remains prevalent in Eastern Asia, Eastern Europe and South America. 8The early detection and understanding of its underlying molecular mechanisms are crucial for improving treatment strategies and patient outcomes. 9Tosis is an intriguing and unique form of programmed cell death, primarily exhibited by neutrophils, one of the frontline defenders in our immune system. 10Unlike traditional apoptosis or necrosis, during NETosis, neutrophils expel a web-like structure composed of their DNA, histones, and various antimicrobial proteins. 11These expelled networks, termed Neutrophil Extracellular Traps (NETs), are primarily designed to ensnare and neutralize pathogens. 12,13However, as with many physiological processes, there's a double-edged sword to NETosis. 14In recent years, a growing body of research has started to unveil the multifaceted roles of NETosis, especially its implications in non-infectious diseases. 15Among these, its association with cancer has generated significant interest.While on one hand, NETs might play a protective role by trapping circulating tumour cells, and preventing their metastasis, they have also been implicated in fostering tumour progression.This is attributed to the inflammatory and pro-thrombotic nature of NETs, which can create a favourable microenvironment for tumour growth and spread. 16reover, components of NETs can also induce DNA damage in adjacent cells, potentially driving tumorigenesis. 17Furthermore, some studies suggest that the very presence of tumours can stimulate NETosis, thereby creating a feedback loop that can exacerbate disease progression. 18Understanding the intricate dance between NETosis and cancer holds promise not just in deciphering tumour biology, but also in potentially unveiling novel therapeutic targets that could revolutionize cancer management.As the dynamics of NETosis continue to be unravelled, its impact on cancer prognosis, diagnosis and therapy is anticipated to be profound.

| Data acquisition
Expression data of RNA sequencing and relevant clinical information of all STAD samples were downloaded from the Cancer Genome Atlas (TCGA) database (https:// tcga-data.nci.nih.gov/ tcga/ ). 19After removing samples without clinical information or with less than 30 days of clinical follow-up (Table 1).Then STAR-Counts data was transformed into transcript per million (TPM) form and normalized through log2 (TPM + 1).Single-cell sequencing data of STAD patients were downloaded from the TISCH database (http:// tisch.comp-genom ics.org/ home/ ).And 61 NETosis-related genes (NRGs) were retrieved from previous publications. 12,20,21

| Construction and verification of prognostic signature based on NRGs
All the NRGs were included in the univariate Cox regression analysis, and the NRGs meaningful for univariate analysis were further incorporated into LASSO analysis and multivariate Cox regression analysis.The prognostic signature was established according to NRGs screened from multivariate Cox regression analysis. 22An external validation based on GSE84437cohort was performed to validate the predictive performance of the signature using multiple methods including differential clinical parameters analysis, Kaplan-Meier (KM) method analysis, time-dependent receiver operating characteristic (ROC) and independent prognostic analysis. 22

| Functional analysis of the signature between two groups
With the 'gsva' R package, GSVA analysis was applied to the hallmark gene set to evaluate pathway enrichment. 23GSE134520 and GSE167297 cohorts were downloaded from the online single-cell portal database of the TISCH database (http:// tisch.comp-genom ics.[27]

| Exploration of phylogenomic features
Somatic mutation data used to calculate tumour mutational burden (TMB) were retrieved from from cBioPortal website (https:// www.cbiop ortal.org/ datasets). 28With maftools, we screened genes with p-values lower than 0.05 that were different between the two risk groups and analysed how these mutations interact.In addition, copy number variation (CNV) analysis was applied using GISTIC_2.0(https:// cloud.genep attern.org).Copy number gain burden and loss burden at the focal and arm levels were calculated based on the output files from GISTIC_2.0.

| Immune infiltration analysis
Single sample gene set enrichment analysis (ssGSEA) algorithm was implemented to estimate the scores of immune cells and immune functions between the two groups.The ESTIMATE algorithm was used to calculate the stromal scores, immune scores and ESTIMATE scores, which represented tumour immune score and immune cell infiltration status in two groups.

| Immunotherapy and chemotherapy response prediction
The potential response to immune checkpoint blockade (ICB) of patients with STAD was assessed between high-and low-risk groups based on the tumour immune dysfunction and exclusion (TIDE) (http:// tide.dfci.harva rd.edu). 29,30SubMap (https:// cloud.genep attern.org/ gp) analysis was performed to estimate the immune checkpoint inhibitors response of PD-1 and CTLA4 in two groups.The drug sensitivity to chemotherapy in STAD patients was estimated through the public database Genomics Drug Sensitivity in Cancer (GDSC). 31

| Statistical analysis
Statistical significance was determined by a p-value of 0.05 for all two-sided tests.A t-test was performed on independent samples with continuous variables with a normal distribution to compare their means and standard deviations.For the comparison of continuous variables with skewed distributions, the Wilcoxon rank-sum test was used.

| Expression landscape of NRGs and signature construction
After the acquisition of all NRGs, differentially expressed NRGs were screened by differential expression analysis using 'limma' R packages (Figure 1A-E).The univariate Cox regression analysis was then performed and seven NRGs affecting prognosis were selected including TNF (ENSG00000232810), BST1 (ENSG00000109743), CD93 (ENSG00000125810), CLEC6A (ENSG00000205846), ENTPD4 (ENSG00000197217), SELP (ENSG00000174175) and TLR7 (ENSG00000196664) (Figure 1F).analyses between riskScore and different clinicopathological features, we found that for STAD patients involved in a high-risk group, the higher riskScore is related to the high grade and stage (Figure 2H).

| Validation and evaluation of the signature
In addition to the STAD samples included from the TCGA database as the training cohort, an external validation cohort from GSE84437 was also obtained to validate the accuracy of the prognostic signature.Time-dependent ROC curve analyses showed the overall accuracy was excellent among the both training and validation cohorts (Figure 3A, B).Also, KM survival curves showed that patients in the high-risk group have a worse survival performance (Figure 3C, D).Meanwhile, from Figure 3E, F, we can see that higher levels of riskScore were associated with higher mortality risks for STAD patients in both training and validation cohorts.To verify whether the riskScore could act as an independent prognostic factor, we performed univariate and multivariate independent prognosis analyses, and the results showed the riskScore could serve as the independent prognostic factor with the lowest p-value (Figure 3G, H).

| Functional enrichment analysis and single-cell analysis
In terms of the hallmark gene set as a reference gene set, GSVA was performed to investigate the potential biological functions and signalling pathways between the two groups.We found that in the high-risk group, HYPOXIA, MESENCHYMAL_TRANSITION and OESTROGEN_RESPONSE_EARLY were enriched while XENOBIOTIC_METABOLISM, FATTY_ACID_METABOLISM and INTERFERON_ALPHA_RESPONSE were enriched in the low-risk group (Figure 4A).Next, based on the top 100 differentially expressed genes selected from two risk groups, KEGG analysis was conducted, and the top five pathways in KEGG enrichment analysis were shown in both the high-risk group (Figure 4B) and the lowrisk group (Figure 4C).Following this, the expression distribution of these NRGs in multiple cell subgroups was analysed both in the GSE134520 cohort and the GSE167297 cohort (Figure S1A, B).

| Genomic features analysis including TMB and CNV
Based on analysis of the TCGA database, we found that STAD patients had a higher TMB level than patients with other types of cancers (Figure 5A).Patients in the low-risk group had a lower level of TMB score than patients in the high-risk group (Figure 5B).Meanwhile, to identify differences in mutated genes between high-and low-risk groups, analysis of differentiation of mutated genes was performed, and we found that almost all genes were prone to be more susceptible to mutation in the low-risk group (Figure 5C).Then, somatic mutations data of STAD samples were also analysed, and we found more somatic mutations including non-synonymous, synonymous mutations and all mutations enriched in patients with low-risk (Figure 5E-G).Meanwhile, the interrelationships of these genes are also shown in Figure 5D.
For each patient with STAD and GISTIC scores were calculated

| Multiple immune analyses revealed the correlations between riskScore and the immunotherapy response
We explored the relationship between immune cell scores and three hub NRGs and found that there was a strong positive correlation among them (Figure 7A).Besides, the stromal scores, immune scores and ESTIMATE scores were significantly lower in the low-risk group than in the high-risk group (Figure 7B).The most of immune cell scores and immune function scores including B cells, T cells, APC co-inhibition, and type I IFN response were significantly lower in the low-risk group than in the high-risk group (Figure 7C, D).The expression of immune checkpoints like PDCD1 and TIGIT was also higher in the high-risk group than in the low-risk group (Figure 7E).Furthermore, the distribution of TIDE scores in STAD patients was exhibited (Figure 8A).
Compared to the high-risk group, the TIDE, dysfunction and exclusion scores were significantly lower in the low-risk group (Figure 8B).Patients in the low-risk group had a higher proportion of responders to immunotherapy than high-risk group (Figure 8C).
The result of the SubMap analysis also revealed that patients in the low-risk group may respond to anti-CTLA4 immunotherapy than in the high-risk group (Figure 8D).In addition, we found that Camptothecin_1003, Cisplatin_1005, Cytarabine_1006, Nutlin−3a (−)_1047, Gemcitabine_1190, WZ4003_1614, Selumetinib_1736, and Mitoxantrone_1810 witnessed significant drug sensitivity differences between high-and low-risk group (Figure 8E).

| DISCUSS ION
GC, often viewed as a significant public health concern, maintains a consistently high prevalence globally. 32This malignancy subtly initiates from the inner lining cells of the stomach and progressively develops. 33Many patients, in the early stages, remain asymptomatic, leading to frequent late-stage diagnoses. 34When low-risk patients and found that many carcinogenic pathways were upregulated in the high-risk patients.Meanwhile, we investigated the genomic instability, mutation burden and immune microenvironment difference between high-and low-risk patients.Moreover, we noticed that low-risk patients were more sensitive to immunotherapy (85.95% vs. 56.22%).High-risk patients were more sensitive to some small molecules compounds like camptothecin_1003, cispla-tin_1005, cytarabine_1006, nutlin-3a (−)_1047, gemcitabine_1190, WZ4003_1614, selumetinib_1736 and mitoxantrone_1810.
Distinctive pathway enrichments were observed between high-risk and low-risk groups.Notably, in the high-risk group, pathways such as HYPOXIA, MESENCHYMAL_TRANSITION and OESTROGEN_RESPONSE_EARLY were prominently enriched.
Hypoxia is a well-documented factor in tumour progression, often contributing to enhanced tumour cell survival, migration and invasion. 35Concurrently, mesenchymal transition, a hallmark of epithelial-mesenchymal transition, is frequently associated with tumour metastasis and resistance to therapies, further highlighting its relevance in a high-risk patients. 36The enrichment of the early oestrogen response might suggest a hormonal influence in the progression or maintenance of the high-risk phenotype, hinting at potential therapeutic targets. 37

Figure
Figure2Bdisplays the correlations between these NRGs from univariate Cox regression analysis.To increase the predictive accuracy, we opted to additionally perform LASSO regression analysis and all these NRGs were selected for further analysis (Figure2B, C).Following that, these NRGs were included to conduct multivariate Cox regression analysis, and a prognostic signature composed of three NRGs including CLEC6A, BST1 and TLR7 was successfully constructed with a riskScore formula: riskScore = the expression of CLEC6A* -0.325645123010234 + the expression of BST1* 0.0423937986799213 + the expression of TLR7* 0.0736051785940703 (Figure2D).Then, STAD patients were stratified into a high-risk group and a low-risk group according to the median cutoff value.Based on these signature NRGs, Figure 2E-G illustrates the general probability of survival for patients based on Kaplan-Meier curves.And in clinical association

F I G U R E 1
Expression of NETosis-related genes (NRGs) and univariate prognostic analysis.(A-E) Results of differential expression analysis of all the NRGs between normal and STAD samples.(F) Results of univariate prognostic analysis of all the NRGs.* = p < 0.05; ** = p < 0.01; *** = p < 0.001; **** = p < 0.0001.tocompare CNV between the two risk groups (Figure6A, B).Further, based on the differential CNV analysis, we found that STAD patients in the high-risk group possessed lower CNV losses or gains at arm level than STAD patients in the low-risk group (Figure6C, D).

F I G U R E 2
Identification of NETosis-related genes (NRGs)-based prognostic signature.(A) Correlation analysis between the NRGs screened from the univariate prognostic analysis.(B, C) LASSO regression analysis screened seven NRGs. (D) Forest plots of three prognostic NRGs identified by multivariate Cox regression analysis, (E-G) Kaplan-Meier survival curves of signature NRGs in STAD.(H) Pie charts showing the Chi-squared test of clinicopathologic factors between two groups.
symptoms such as upper abdominal pain or indigestion emerge, the disease may have already advanced to a critical phase.Lifestyle choices, dietary habits and familial genetics can all elevate the risk of contracting this ailment.Confronted with such a health menace, the pursuit of superior preventive and therapeutic approaches becomes an urgent endeavour in the medical community.In our study, we collected the list of NETosis-related genes from previous publications.Based on the list and expression profile of GC patients from the TCGA database, we identified the NETosisrelated genes significantly correlated with patients survival.Then, CLEC6A, BST1 and TLR7 were identified through LASSO regression and multivariate Cox regression analysis for prognosis model construction.This prognosis model showed great predictive efficiency in both training and validation cohorts.We noticed that the highrisk patients might have a worse survival performance.Next, we explored the biological enrichment difference between high-and F I G U R E 3 Verification and evaluation of the signature.(A, B) Both training and validation cohorts were evaluated based on AUC values under the time-dependent receiver operating characteristic (ROC) curves.(C, D) Analysing the Kaplan-Meier survival curves in both the training and validation cohorts to compare overall survival between the two risk groups.(E, F) Distribution of survival status and survival time for patients with different risk scores as well as the expression levels of three prognostic NETosis-related genes (NRGs) in both training and validation cohorts.(G, H) Prognostic value of riskScore measured by prognostic value of risk score measured by multivariate Cox regression analyses.

F I G U R E 5
Conversely, the low-risk group displayed enrichment in pathways such as XENOBIOTIC_ METABOLISM, FATTY_ACID_METABOLISM and INTERFERON_ ALPHA_RESPONSE.Xenobiotic metabolism pathways play a crucial role in the detoxification of foreign substances, suggesting that the low-risk group might possess a heightened capability to metabolize potentially harmful compounds.The prominence of fatty acid metabolism indicates potential differences in energy utilization or lipid synthesis between the two groups.Meanwhile, the enrichment of interferon-alpha response is particularly intriguing. 38Interferons, especially alpha-type, are renowned for their antiviral and antitumor properties.Their upregulation in the low-risk group might reflect an innate capacity to mount a more robust defence against tumour F I G U R E 4 Analysis of single cells and functional enrichment.(A) GSVA analysis revealed two risk groups enriched in different functional pathways.(B, C) KEGG analysis showed the top five terms enriched in high-risk and low-risk groups.progressionor external pathogens.Taken together, these findings illuminate the complex molecular landscape that differentiates highrisk from low-risk groups.Understanding these nuances not only enhances our comprehension of disease pathogenesis but also paves the way for targeted therapeutic strategies.Meanwhile, a compelling observation was the stark difference in immune cell scores and immune function scores between the low-risk and high-risk groups.The majority of these scores, including those for B cells, T cells, APC co-inhibition and type I IFN response, were notably diminished in the low-risk group compared to the high-risk group.The lower scores for B cells and T cells in the low-risk group suggest a potential reduction in adaptive immune responses.39This is intriguing, as one might typically associate robust adaptive immune responses with effective antitumor defences.However, it's possible that in this context, a heightened adaptive immune response may reflect or contribute Correlation between risk score and tumour mutational burden (TMB).(A) TMB distribution of pan-cancer in the Cancer Genome Atlas (TCGA).(B) Patients in the low-risk group had a higher level of TMB.(C) A total of eight genes were identified mutated more frequently in the low-risk group.(D) The heatmap showed the correlation between these eight genes identified by maftools analysis.(E-G) In patients with low riskScore, the number of somatic mutations, including non-synonymous mutations, synonymous mutations and all mutations, was higher.* = p < 0.05; ** = p < 0.01; *** = p < 0.001; **** = p < 0.0001.to an inflammatory microenvironment, which has been implicated in promoting tumour progression and metastasis in numerous malignancies.40Furthermore, the decreased scores for APC coinhibition in the low-risk group might indicate reduced levels of immune checkpoint molecules, which are often hijacked by tumours to evade immune surveillance.This could mean that the low-risk group has a diminished requirement for co-inhibitory signalling, possibly due to a more quiescent immune landscape.

F I G U R E 6
Copy number variation (CNV) analysis between two groups.(A) Overall copy number GISTIC score of each STAD patient.(B) Overall copy number percentage of each STAD patient.(C) Between different risk groups, there was a significant difference in amplification and deletion frequencies.(D) Patients with low riskScore had a higher level of burden of number gain or loss at arm level.* = p < 0.05; ** = p < 0.01; *** = p < 0.001; **** = p < 0.0001.F I G U R E 7 The immune infiltration landscape between two risk groups.(A) The relationship between immune cells scores and three hub NETosis-related genes (NRGs).(B) The comparison of the stromal scores, immune scores and ESTIMATE scores between high-and lowrisk groups.(C, D) The immunity infiltration differences between the two groups.(E) The difference in expression of immune checkpoint between two groups.* = p < 0.05; ** = p < 0.01; *** = p < 0.001; **** = p < 0.0001.Lastly, the attenuated type I IFN response in the low-risk group could reflect reduced antiviral or antitumor activities.Type I interferons are crucial players in innate immunity, often acting as the first line of defence against pathogens and malignancies.Their reduced activity could suggest a more general suppression of innate immune responses in the low-risk group.In sum, our findings underscore the intricacies of the immune landscape in the context of disease risk.A deeper understanding of these immunological F I G U R E 8 Prediction of immunotherapy response and chemotherapy sensitivity.(A) The distribution of TIDE scores in patients with STAD.(B) The comparison of TIDE, dysfunction and exclusion scores between high-and low-risk groups.(C) The proportion of responders to immunotherapy in two groups according to the TIDE algorithm.(D) Response prediction to immunotherapy between high-and low-risk groups according to the SubMap algorithm.(E) Response prediction to chemotherapy between two groups.