Identification of angiogenesis‐related subtypes, the development of a prognosis model, and features of tumor microenvironment in colon cancer

Angiogenesis is associated with tumor progression, prognosis, and treatment effect. However, the angiogenesis’ underlying mechanisms in the tumor microenvironment (TME) still remain unclear. Understanding the dynamic interactions between angiogenesis and TME in colon adenocarcinoma (COAD) is necessary. We downloaded the transcriptome data and corresponding clinical data of colon cancer patients from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO) databases, respectively. We identified two distinct angiogenesis‐related molecular subtypes (subtype A and subtype B) and assessed the clinical features, prognosis, and infiltrating immune cells of patients in the two subtypes. According to the prognostic differential genes, we defined two different gene clusters to further explore the correlation between angiogenesis and tumor heterogeneity. Then, we construct the prognostic risk scoring model angiogenesis‐related gene (ARG‐score) including seven genes (ARMCX2, latent transforming growth factor β binding protein 1, ADAM8, FABP4, CCL11, CXCL11, ITLN1) using Lasso‐multivariate cox method. We analyzed the correlation between ARG‐score and prognosis, clinicopathological features, TME, molecular feature, cancer stem cells (CSCs), and microsatellite instability (MSI) status. To assess the application value of ARG‐score in clinical treatment, immunophenotype score was used to predict patients’ immunotherapy response in colon cancer. We found the mutations of ARGs in TCGA–COAD dataset from genetic levels and discussed their expression patterns based on TCGA and GEO datasets. We observed important differences in clinicopathological features, prognosis, immune feature, molecular feature between the two molecular subgroups. Then, we established an ARG‐score for predicting OS and validated its predictive capability. A high ARG‐score characterized by higher transcription level of ARGs, suggested lower MSI‐high (MSI‐H), lower immune score, and worse clinical stage and survival outcome. Additionally, the ARG‐score was remarkably related to the CSCs index and immunotherapy sensitivity. We found two new molecular subtypes and two gene clusters based on ARGs and established an ARG‐score. Multilayered analysis revealed that ARGs were remarkably correlated to the heterogeneity of colon cancer patients and explained the process of tumorigenesis and progression better. The ARG‐score can help us better assess patients’ survival outcomes and provide guidance for individualized treatment.

in colon cancer.We found the mutations of ARGs in TCGA-COAD dataset from genetic levels and discussed their expression patterns based on TCGA and GEO datasets.We observed important differences in clinicopathological features, prognosis, immune feature, molecular feature between the two molecular subgroups.Then, we established an ARG-score for predicting OS and validated its predictive capability.A high ARG-score characterized by higher transcription level of ARGs, suggested lower MSI-high (MSI-H), lower immune score, and worse clinical stage and survival outcome.Additionally, the ARG-score was remarkably related to the CSCs index and immunotherapy sensitivity.We found two new molecular subtypes and two gene clusters based on ARGs and established an ARG-score.Multilayered analysis revealed that ARGs were remarkably correlated to the heterogeneity of colon cancer patients and explained the process of tumorigenesis and progression better.The ARG-score can help us better assess patients' survival outcomes and provide guidance for individualized treatment.

K E Y W O R D S
angiogenesis, ARG-score, colon cancer, gene cluster, molecular subtype, TME

INTRODUCTION
Colon cancer is the third most common among all malignancies and its mortality rate ranked the second highest in the world. 1 Colon cancer contains a variety of pathological types, such as adenocarcinoma, mucinous adenocarcinoma, and neuroendocrine carcinoma, among which the incidence of colon adenocarcinoma (COAD) is as high as more than 90%. 2 Treatment, including surgery, chemotherapy, radiation, and immunotherapy, significantly increased survival time of colon cancer patients. 3owever, the 5-year survival rate of patients is still unsatisfactory because of distant metastasis and drug resistance. 4t is important to identify the molecular mechanism of the proliferation and progression of COAD and develop accurate biomarkers for individualized treatment.
Angiogenesis is important for tumor occurrence and deterioration, and the progression is regulated by growth factors, proangiogenic cytokines, and neovascularization antagonists. 4Instead of the normal blood vessels, tumor neovascularization is characterized by weak vascular wall, extensive vascular endothelial space, high vascular permeability, and structural disorder. 5Angiogenesis-related factors are often overexpressed in tumors. 6Neovascularization is immature and highly heterogeneous, leading to reducing immune cell infiltration and activity and increasing the probability of cancer metastasis. 7Angiogenesis provides additional nutrient supply to tumor cells, so antiangiogenesis is a promising antitumor therapeutic method. 5[10] Immune checkpoint inhibitors for PD-1, PD-L1, and CTLA-4 are used in the molecularly targeted therapy of colon cancer. 11,12However, only a small percentage of people receive sustained benefits from immunotherapy.Additionally, genetic changes within the tumor cells themselves, factors in the surrounding environment can also promote tumor development or lead to immune escape. 13Cumulative studies suggest that tumor microenvironment (TME) is related to both tumor invasion and immune response. 14ME is consisted of several factors, including tumor cells, blood vessels, infiltrating immune cells, stromal cells, tissue fluids, and cytokines. 15Intratumor angiogenesis can interfere with the recruitment of immune cells in TME and regulate immune cells activity, thus affecting tumor progression and immune sensitivity. 16VEFG can promote the recruitment and proliferation of immunosuppressive cells, such as Treg cells, MDSCs, and M2-TAMS. 16In addition, vascular endothelial growth factor (VEGF)-induced over angiogenesis produces hypoxic TME after abnormal blood vessel formation, creating a more immunosuppressive environment. 17Angiotensin-2 can also inhibit the secretion of TNF-α, thus suppress the anticancer activity of monocytes. 18Analyzing the high heterogeneity and complexity of TME, as well as the dynamic correlation between angiogenesis and TME, is critical for the advancement of anticancer combination therapies.
This article systematically discussed the transcription level angiogenesis-related genes (ARGs) and their value on TME, prognosis, and therapeutic response of colon cancer patients.Two distinct angiogenesis subtypes were identified in COAD with The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) datasets.Patients were then divided into two gene clusters according to the differentially expressed genes (DEGs) obtained by the two ARG-related molecular subtypes.Further, a scoring model (ARG-score) was established to accurately predict survival outcome and responses to immunotherapy in colon cancer patients.

Data sources
We downloaded three colon cancer datasets from TCGA (https://portal.gdc.cancer.gov/)and GEO (https://www.ncbi.nlm.nih.gov/geo/)(GSE39582, GSE17356) and excluded the patients without survival outcomes.Finally, 1202 COAD patients in total were included in the study.Among them, the TCGA and GSE39582 datasets (N = 1025) were used for model building and internal validation, and the external validation of model use the data from GSE17356 (N = 177) cohort.The RNA-seq, somatic variation, and clinical data were obtained from the TCGA and GEO database, respectively.The clinical data for these COAD patients was listed in Table S1.A total of 36 ARGs were searched from the MSigDB Team (Hallmark Gene set) (Table S2).

Consensus clustering analysis of ARGs
The "ConsensusClusterPlus" package was used to perform unsupervised cluster analysis.The patients were classified into two distinct molecular subgroups based on the transcription level of 36 ARGs. 19,20Gene set variation analysis (GSVA) was carried out to investigate the differences of ARGs in biological function with the Kyoto Encyclopedia of Genes and Genomes (KEGG) gene set (c2. cp.kegg.v7.2). 21

Relationship between molecular subtypes with the clinicopathological features and prognosis of colon cancer
In order to explore the clinical significance of two ARGrelated molecular subtypes identified by consensus clustering, the relationships between ARG-related molecular subtypes, clinicopathological features, and prognosis were compared.Age, gender, T stage, N stage, M stage, and TNM stage acted as clinicopathological features.Kaplan-Meier (K-M) analyses were employed to evaluate the differences in prognosis between the two subgroups.

Molecular and immune characteristics between the ARG-related subgroups
Single-sample gene set enrichment analysis (ssGSEA) algorithm 22 was used to determine the levels of immune cell infiltration of each patient in different ARG-related molecular subtypes.We also compared the expression level of ARGs in the two subgroups.

DEG identification and functional annotation
The "limma" R package was employed to identify the DEGs between the two ARG-related molecular subtypes, and adjusted p value <0.05 and | log fold change | >0.585 were defined statistically significant for identifying DEGs.Based on these DEGs, the "clusterProfiler" R package was used to perform Gene Ontology (GO) and KEGG analyses, and adjusted p value <0.05 was set as thresholds.

Construction of the angiogenesis-related prognostic ARG-score in colon cancer
In order to better predict the prognosis of patients, an angiogenesis-related prognostic ARG-score model was constructed according to the ARG_related molecular subtypes.First, 925 DEGs between the 2 ARG_related molecular subtypes were subjected to univariate Cox regression analysis to identify those linked to overall survival.Second, the 1025 patients from TCGA-COAD dataset and GSE39582 dataset were classified into different gene clusters (gene cluster A, gene cluster B) for further analysis using an unsupervised clustering method based on the expression of 396 prognostic DEGs.Then, 1025 patients in total were randomly divided into training group (n = 513) and testing group (n = 512) at a ratio of 1:1.Finally, based on the expression value of prognostic DEGs, the LASSO Cox regression model was constructed to develop the prognostic ARG-score with the R package "glmnet" in the training group.The ARG-score was calculated as follows: ARGscore = Σ(Expi × Coefi), where Expi is the expression value of each gene i, and Coefi is the regression coefficient in gene i from LASSO Cox regression analysis.Similarly, the risk scores of patients in the testing group (n = 512) and GSE17356 (n = 177) dataset were calculated according to the formula constructed by the training group and divided them into the high-risk group and the low-risk group according to the median risk of the training group, respectively.

Relationship between the ARG-score and clinicopathological features
The relationship between the ARG-score and the clinicopathological characteristics of patients with COAD was identified by chi-square test.Univariate and multivariate analyses were used to distinguish independent prognostic factor of COAD patients.The K-M curves were performed to analyze survival outcome of patients between high-and low-risk.

Relationship between of ARG-score and molecular and immune features
The each patient's immune and stromal scores was calculated with ESTIMATE algorithm. 23The fractions of 22 human immune cell subsets of every colon cancer sample were assessed by the CIBERSORT algorithm. 24Gene set enrichment analysis (GSEA) was utilized to analyze the underlying mechanisms of ARGs between the two risk groups.We also analyzed the relationship between the two groups of the transcription level of ARGs and immune checkpoint-related genes by Wilcoxon rank sum tests.Finally, we explored the cancer stem cell (CSC) index, microsatellite instability (MSI) and somatic mutations in the high-low risk group.

Immunotherapy susceptibility analysis
The immunophenotype score (IPS) was utilized to evaluate the response to immune checkpoint inhibitors therapy. 25he IPS of colon cancer patients was acquired from TCIA database and the IPS of different risk group was compared to value the susceptibility to immune checkpoint blocking therapy.

Statistical analysis
All statistical analyses were applied with R 4.

Genetic and transcriptional alterations of ARGs in colon cancer
We first used the TCGA-COAD dataset to determine the transcription levels of 36 ARGs in tumor and normal samples.Twenty-eight DEGs in total were identified, and most of them were significantly upregulated in COAD tissue specimens (Figure 1C).Next, we identified the rate of somatic mutations of 36 ARGs with TCGA-COAD dataset.Figure 1A shows that 149 of 399 (37.34%) tumor samples presented genetic mutations, with the highest mutation frequency of VCAN, and following by COL5A2 and POSTN, among the 36 AAGs.Then, we analyzed the interactions among the 36 ARGs and their survival significance and found that most genes had significant positive correlations (Figure 1B).Meanwhile, we further plotted the K-M survival curves and found that 27 ARGs were associated with prognosis (Figure S1), suggesting the underlying function ARGs in COAD oncogenesis.

Identification of the ARG-related molecular subtypes
To comprehensively understand the transcription pattern of ARGs involved in tumor progression, 1025 patients from 2 corresponding colon cancer datasets (TCGA-COAD, GSE39582) were put together for further analysis.
We found that when k = 2, the intragroup correlation was highest and the intergroup lowest, suggesting the stability and reliability of classifying the 1025 patients into subtypes A (n = 468) and B (n = 557) (Figure 2A,B).PCA analysis showed obvious distinction in the angiogenesis expression profiles between the two subgroups (Figure 2C), indicting superior performance of consensus clustering.The K-M analysis showed that there were significant survival difference between the two subtypes and a longer OS time was in subtype B (p = 0.018 < 0.05) (Figure 2D).Additionally, the genomic transcription level and clinicopathological features between the two molecular subtypes were compared.No significant difference in clinicopathological features between the two subtypes was found, whereas the most of ARGs were highly expressed in subtype A when compared with subtype B (Figure 2E).

Characteristics of the TME in two RAG-score subgroups
Based on the GSVA, we observed that subtype A was remarkably involved in cancer-associated pathways (such as pancreatic cancer, small cell lung cancer, prostate cancer, and melanoma) and metastasis-associated pathways (such as extracellular matrix [ECM] receptor interaction, focal adhesion, and cell adhesion molecules cams), whereas the subtype B was significantly involved in metabolism-associated pathways (such as propanoate metabolism, nitrogen metabolism, retinol metabolism, and TCA cycle) (Figure 2F, Table S3).The levels of 23 immune cell infiltration in the samples were calculated by ssGSEA algorithm (Figure 2G).The higher infiltration levels of activated CD4 T cell, activated CD8 T cell, activated B cell, activated dendritic cell, CD56dim NK cell, CD56bright NK cell, eosinophilna cell, gamma delta T cell, immature B cell, immature DC cell, MDSC cell, macrophage, mast cell, NK T cell, NK cell, plasmacytoid DC, regulatory T cell, T follicular helper cell, type 1 T helper cell, and type 2 T helper cell were obviously observed in the ARG-related subtype A than those in the ARGrelated subtype B, whereas the lower infiltrations of type 17 T helper cell were remarkably observed in subtype A compared to those in subtype B. There also existed considerable statistical differences in transcription level of immune checkpoint-related genes between the two molecular subtypes (Figure 2H).Most immune checkpointrelated genes, including CTLA4, PDCD1 (PD-1), and PDCD1LG2 (PD-L2), are highly expressed in molecular subtype A compared with subtype B.

DEGs identification and functional annotation based on the ARG-related subtypes
According to two ARG-related subtypes, we found 925 DEGs (Table S4) using the R package "limma," and identified biological functions with GO and KEGG.GO analysis revealed that these DEGs were involved in metastasisassociated biological processes, for example, collagen containing ECM, ECM organization, and ECM structural constituent (Figure 3A, Table S5).KEGG analysis showed the enrichment of cancer-and metastasis-associated pathways (Figure 3B, Table S6), suggesting that ARGs may play a crucial role in tumor metastasis.Then, univariate Cox regression model identified that among 925 DEGs, 396 DEGs were related with OS of COAD patients (Table S7).In order to further analyze the underlying biological functions of ARGs in colon cancer, we performed consensus clustering analysis to divide the patients into 2 gene clusters according to the 396 prognostic DEGs (Figure 3C,D).PCA analysis showed that gene clusters still had a good stratification performance on patients (Figure 3E).K-M curve showed that the patients in gene clusters A had significantly poorer prognostic outcome (p = 0.013 < 0.05) (Figure 3F). Figure 3G shows that the higher expression level of most ARGs was obviously found in gene clusters A than that in gene clusters B, which was consistent with the expected result of ARG-related molecular subtypes, suggesting that ARGs can regulate the pathological mechanism of colon cancer.

Establishment of the angiogenesis-related prognostic ARG-score in colon cancer
Based on the 396 DEGs associated with prognosis, we constructed a angiogenesis-related prognostic ARG-score to assessed the patient's prognostic risk.First, 1025 patients in total were randomly divided into training group (n = 513) and testing group (n = 512) at a ratio of 1:1.Second, the prognostic ARG-score in the training group was developed by the LASSO Cox regression analysis with the "glmnet" R packet (Figure 4A,B).Finally, seven genes were acquired for constructing the ARG-score with the formula was as follows: ARG-score = Σ(Expi × Coefi), where Coefi and Expi denoted the risk coefficient and expression of each gene, respectively.In the TCGA dataset, we further analyzed the prognostic value of seven genes involved in model construction.Univariate Cox regression model shown that the seven genes were correlated with prognosis (Figure S2A), of which four were dangerous factors (ARMCX2, latent transforming growth factor β binding protein 1 [LTBP1], ADAM8, and FABP4) (hazard ratio [HR] >1) (Figure S2B-E) and three were protective factors (CCL11, CXCL11, ITLN1) (HR < 1) (Figure S2F-H).The risk-score of every patients in training group (n = 513) was further calculated and the patients was assigned into the high-risk group (n = 256) and low-risk group (n = 257) using the median risk value.Figure 4C displays the patients' distribution in the two molecular subtypes, two gene clusters, and two ARG-score groups.As shown in Figure 4D, a higher risk score was found in molecular subtype A (p < 2.22e − 16).Similarly, significant differences were remarkably observed in AAG-score between the gene clusters, and gene cluster A had a higher risk score (p < 2.22e − 16) (Figure 4E).By analyzing the training group, the proportion of deaths was found higher in the high-risk group than the low-risk group (Figure 4F,G).Furthermore, K-M curve showed that the prognosis of low-risk patients was significantly better (Figure 4H).The AUCs for the 1-, 3-, 5-year survival were 0.658, 0.711, and 0.664 (Figure 4I).

Validation of the ARG-score
In order to verify the practicality and credibility of the model, the patients in testing group (n = 512) and GSE17536 (n = 177) dataset were used for internal validation and external validation.The risk scores of patients in the testing group were calculated according to the formula constructed by the training group and divided them into the high-risk group and the low-risk group according to the median risk of the training group, respectively.In the testing group, a higher proportion of death was found in the high-risk group than the low-risk group (Figure 5A,B).Figure 5C shows the survival outcomes of patients with high-risk score were obviously worse than those with lowrisk score (p = 0.001 < 0.05), and the AUCs for the 1-, 3-, 5 year survival were 0.577, 0.620, and 0.644, respectively (Figure 5D).In the GSE17536 dataset, we reached similar conclusions, namely, a higher proportion of deaths in the high-risk group (Figure 5E,F) and poorer prognostic outcomes in the patients with high-risk score (Figure 5G).ROC curves were also plotted and the AUCs for the 1-, 3-, 5 year survival in the GSE17536 were 0.643, 0.637, and 0.620, respectively (Figure 5H).

Relationship of ARG-score and clinicopathological features
We used chi-square tests to explore the relationship between the risk score and clinicopathological features and found significant differences in T stage, N stage, M stage, and TNM stages between the high and low-risk groups in the training group (Figure 6A).The clinicopathological features were observed as obviously different between high and low-risk groups in testing group and GSE17536 dataset (Figure S3A,B).Meanwhile, we further explored the relationship between T, N, M, and TNM stage and risk score by Wilcoxon signed-rank test, separately.We observed obviously differences in risk scores for T, N, M, and TNM stages, and the higher the risk score of patients with clinical stage deterioration (Figure 6B-E).We also observed that patients with high-risk scores tend to have poor clinical staging in testing group (Figure S3C-F) and GSE17536 dataset (Figure S3G).The univariate Cox regression analysis indicated that the ARG-score was related to the prognosis of the training group (p < 0.001, HR = 1.514, 95% CI: 1.350 − 1.699) (Figure 6F).Simultaneously, the multivariate Cox regression model yielded same results (p < 0.001, HR = 1.344, 95% CI: 1.168 − 1.548) (Figure 6G).Univariate and multivariate Cox regression analysis in testing group (p < 0.001, HR = 1.345, 95% CI: 1.231 − 1.469 and p < 0.001, HR: 1.216, 95% CI: 1.086 − 1.362) (Figure S4A,B) and GSE17536 dataset (p = 0.004, HR = 1.810, 95% CI: 1.165 − 2.811 and p = 0.003, HR: 2.042, 95% CI: 1.275 − 3.271, 2.042) (Figure S4C,D) reconfirmed that ARG-score was independent prognostic factor for patients with colon cancer.

Immune and molecular characteristics for patients in two risk groups
We calculate the relative abundance of 22 infiltrating immune cells per sample in the low-and high-risk groups by CIBERSORT algorithm and explored the correlations between the risk score and human immune cell subsets (Figure S5).The ARG-score was positively associated with the infiltration of M2 macrophages, activated mast cells, neutrophils, regulatory T cells (Tregs), and M0 macrophages but negatively correlated with M1 macrophages, follicular helper T cell, naive B cell, activated dendritic cell, resting dendritic cell, eosinophils, plasma cells, activated memory CD4+ T cell, resting memory CD4+ T cell, and CD8+ T cell.Furthermore, we found that most infiltrating immune cells were obviously related to the genes of the ARG-score model (Figure 7A).We also observed that a low ARG-score was obviously related to a high immune score, whereas a high ARG-score was related to a high stromal score (Figure 7B).The boxplot (Figure 7C,D) shown that 28 ARGs and 25 immune-related genes had different expression levels in the low-risk group and the high-risk group.We further explored the gene enrichment level in different ARGs-score subgroups using GSEA software in TCGA-COAD dataset.Gene sets of high ARG-score group were related to tumor metastasis (Figure 7E), whereas the gene sets of low ARG-score group were related to metabolism-related pathway (Figure 7F).Table S8 shows detailed results for GSEA.

Correlation analysis between ARG score and MSI and CSC index
Previous studies have shown that immunotherapy is more effective for the patients with high MSI (MSI-H).We found that low ARG-score was remarkably related to MSI-H status, meanwhile high ARG score was related to microsatellite stable (MSS) status (Figure 8A,B).Additionally, we explored the underlying association between the ARG-score and CSC index in TCGA-COAD dataset.The ARG-score was negatively related to the CSC (R = − 0.83, p = 3.2e − 16 < 0.05), suggesting that CRC cells with lower ARG-score had more distinct stem cell features and a lower degree of cell differentiation (Figure 8C).

Mutation and immunotherapy sensitivity
We then explored the difference of the somatic mutations in the low and high-risk ARG-score groups.The top 10 genes with the mutation incidences in two ARG-score groups were identical: KRAS, APC, TTN, TP53, PIK3CA, SYNE1, MUC16, ZFHX4, FAT4, and RYR2 (Figure 8D,E).Meanwhile, IPS was calculated to assess the immunotherapeutic response of the patients.In Figure 8F, a higher IPS was found in the low-risk group compared with the highrisk group, indicating that immunotherapy was likely more effective for the patients with low ARGs score.

DISCUSSION
One of the main features of all cancers is angiogenesis and the new blood vessels can drive tumor progression. 26umor progression, invasion, and metastasis are inseparable from tumor angiogenesis.Angiogenesis is closely correlated with the treatment of tumors.In the TME, endothelial junctions with tumor vessels are frequently disrupted, leading to increased permeability and interstitial fluid pressure and poor perfusion, which in part hinder drug delivery.These biological capabilities provided a theoretical foundation for the clinical practice of ARGs in predicting therapeutic responses of colon cancer patients.Inhibiting angiogenesis could improve chemotherapy treatment. 27Antiangiogenesis has been shown to achieve good clinical benefits in the treatment of solid tumors. 28At present, the FDA has approved two major classes of antiangiogenic drugs: single-target inhibitors and multi-target inhibitors. 29VEGF is a crucial target molecule of antitumor angiogenesis. 30Significantly, patients are prone to resistant to antiangiogenic drugs.At the same time, angiogenesis can regulate the infiltration degree and the function state of immune cells in the TME. 16Angiogenic factors including VEGF interfere with the entry of immune cells into the TME by regulating the level of adhesion molecules. 31,32][35] Immune checkpoint inhibitors combined with antiangiogenic drugs has made great progress for clinical cancer therapy. 16Nevertheless, there are still difficulties in tumor immunotherapy, especially immune resistance and lack of immune cell infiltration.It is essential to understand the interplay between angiogenesis and the TME, and accurately stratify patients for individualized treatment.
In our study, we defined the transcriptional level of ARGs in the TCGA-COAD dataset.Although the frequency of ARGs mutations (0%-9%) was low, most of them were upregulated in COAD patients and correlated to prognosis.COAD patients were grouped into 2 subtypes (subtype A and subtype B) based on 36 ARGs.In the two subgroups, remarkable differences were observed in the patients' outcomes, immune cell infiltration, and capability.Compared to subtype B, a poorer OS was found in the patients with subtype A, and most ARGs were upregulated in subtype A. There were also remarkable differences in TME features between the two subgroups.The subtypes A were also featured by remarkable cancer-associated pathways and metastasis-associated pathways.Furthermore, the transcription levels of immune checkpoint-related genes between different subtypes were obviously related to ARGs.Most of immune checkpoint-related genes were obviously elevated in subtype A, which means that subtype B benefit more from immune checkpoint inhibitors.We found two gene sets according to the DEGs between the two ARG-related molecular subtypes.Compared with the patients with gene cluster B, patients with gene cluster B characterized by higher expression level of most ARGs and worse OS.We constructed and validated an ARG-score model to quantify the prognostic risk of per patient.The ARGscore was an independent prognostic parameter according to variable Cox regression model.The 1-, 3-, and 5-year ROC curves confirmed the predictive robustness of the ARG-score model.We observed that seven genes in the ARG-score model construction were associated with prognosis, three (CCL11, CXCL11, and ITLN1) of which were protective genes and four (ARMCX2, LTBP1, ADAM8, and FABP4) were dangerous genes.Identification of dangerous genes can help us detect cancer at an earlier stage and guide treatment.The high expression status of ARMCX2 has been shown to be associated with poor prognosis in patients with gastric cancer. 36LTBP1 was observed to be abnormally highly expressed in esophageal carcinoma (ESCA) and positively correlated with lymphatic metastasis.At the same time, overexpression of LTBP1 can mediate the occurrence of EMT in ESCA. 37ADAM8 has been shown to promote metastasis and recurrence of colon cancer by regulating the validation response. 38At the same time, in pancreatic cancer, breast cancer, and glioma, abnormal expression of ADAM8 is involved in mediating the invasion and metastasis of tumor cells. 39Patients with low-and high-risk ARG-scores showed considerable difference in clinicopathological parameters, prognosis, TME, immune checkpoints, CSC index, MSI, and drug susceptibility.Patients with high ARG-score had a poorer OS and worse clinical staging, suggesting a high ARG-score were able to predict an unsatisfactory prognosis.GSEA also revealed the high ARG-score was correlated to cancerassociated pathways and metastasis-associated pathways.Correlation analysis between IPS and ARG-score revealed that immunotherapy was more effective for the patents with low risk.
Solid malignant tumors are a complex system, including tumor cells, immune cells, vascular endothelial cells, and ECM components.Immune cells in the TME are often "tumor-associated" and enhance the growth of tumor by exerting immunosuppressive functions.Mounting evidence revealed the inevitable association between angiogenesis and innate immunity. 40,413][44][45] By exploring the features of TME and angiogenesis, it is expected to further suppress tumor development and metastasis. 13In the high and low-risk groups, we not only explored the transcription levels of 36 ARGs but also calculated immune scores.We found that the expression of 28 ARGs differed between the high and low-risk groups, and 26 genes were upregulated in the high-risk group.Meanwhile, we observed lower immune scores in the high-risk group compared to the low-risk group.These outcomes are in accord with previous studies that angiogenesis can inhibit immune cell infiltration, and suggesting the ARG-score could be used as an indicator to predict patient prognostic risk and guide immunotherapy.
Understanding the correlation between the TME and ARG-score could be conducive to exploring new treatment strategies for the therapy of COAD or influencing the TME to improve the prognosis of the patients.In our research, the ARG-score was negatively related to the infiltration levels of DCs, CD4+ T cell, and CD8+ T cell and positively related to the degree of infiltration of macrophages M2 cell and T cells regulatory (Tregs).
In the TME, CD8+ T cells compose the major immune cell set and can further differentiate into cytotoxic T lymphocytes and exhibit cytotoxicity to tumor cells. 468][49] CD4+ T cells indirectly regulate dendritic cell activation and maturation, thereby activating CD8+ T cells. 50Tregs are an important immunosuppressive subset of CD4+ T cells that maintain homeostasis and limit immune activation, thereby suppressing tumor immune responses. 51M2 macrophages can promote the proliferation of tumor cells by promoting the secretion of IL-10 and decreasing IL-12 and IL-23. 52Our findings are consistent with previous findings that low ARG-score correlates with tumor-suppressing immune cells, and high ARG-score correlates with tumor-promoting immune cells.
MSI-H status is associated with good patient outcomes.A meta-study analysis of 32 clinical studies involving more than 7000 patients showed that regardless of stage II, III, or IV colorectal cancer, patients with MSI-H had a lower risk of death than MSS. 53Our results presented a similar finding that low-risk patients were associated with MSI-I, indicating that the patients with low-risk have a longer OS.Finally, COAD patients with low ARG-scores were found responded positively to anti-pd1 and anti-CTLA-4 therapy, suggesting that the patients with low-risk may more sensitive for immune checkpoint inhibitors than those with high risk.
While, there existed some shortcomings about our study.The raw data from public databases were obtained retrospectively, and selection bias may affect their robustness and accuracy.Vivo and in vitro experimental studies are needed to confirm the interaction between ARGs and tumor.A large number of prospective cohort studies are needed to validate the performance of ARG-score in clinical applications.

CONCLUSION
We comprehensively expored the molecular and immune mechanisms of ARGs in the pathogenesis of colon cancer and found that the expression and role of ARGs in colon tumors were significantly heterogeneous.The classification of different molecular subtypes and genetic subtypes facilitates precision and individualized treatment.At the same time, we established a prognostic ARG-score model, and subsequent multidimensional analysis proved to be a promising indicator for diagnosis and treatment.

A U T H O R C O N T R I B U T I O N S
All authors contributed to the study's conception and design.Changjing Wang and Baokun Li performed data collection and analysis.Feifei Wang and Guanglin Wang

F I G U R E 1
Genetic and transcriptional alterations of angiogenesis-related genes (ARGs) in colon cancer.(A) Mutation frequencies of 36 ARGs in 399 patients with colon cancer from the Cancer Genome Atlas (TCGA) dataset.(B) Interactions among ARGs in colon cancer.The line connecting the ARGs represents their interaction, with the line thickness indicating the strength of the association between ARGs.Blue and pink represent negative and pink positive correlations, respectively.(C) The expression level of 36 ARGs between normal and tumor tissues.

F
I G U R E 2 Angiogenesis-related gene (ARG)-related molecular subtypes and clinicopathological and biological features of two distinct subtypes of samples divided by consistent clustering.(A and B) Consensus matrix heatmap defining two clusters (k = 2) and their correlation area.(C) PCA analysis showing a remarkable difference in transcriptomes between the two molecular subtypes.(D) Survival curve showed that the OS of the distinct ARG-related molecular subtypes was different.(E) Differences in clinicopathologic features and expression levels of ARGs between the two distinct molecular subtypes.(F) Biological pathways in two distinct molecular subtypes from gene set variation analysis (GSVA) software, in which activated and inhibited pathways were plotted using red and blue lines, respectively.(G) Abundance of 23 infiltrating immune cell types in the two distinct molecular subtypes.(H) Expression level of immune checkpoint-related genes between the two distinct molecular subtypes.

F I G U R E 3
Selection of gene clusters based on differentially expressed genes (DEGs).(A and B) Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of DEGs among two angiogenesis-related gene (ARG)-related molecular subtypes.(C and D) Consensus matrix heatmap defining two gene clusters (k = 2) and their correlation area.(E) PCA analysis indicating a remarkable difference in transcriptomes between the two gene clusters.(F) Survival curve for OS of the two gene cluster.(G) Expression level of ARGs between the two gene clusters.F I G U R E 4 Construction of the angiogenesis-related gene (ARG)-score in the training group.(A) Cross-validation for tuning the parameter selection in the LASSO regression.(B) LASSO regression of these prognosis-related differentially expressed genes (DEGs).(C) Alluvial diagram of subgroup distributions in groups with different ARG-scores and clinical outcomes.(D) Differences in ARG-score between the two ARG_related molecular subtypes.(E) Differences in ARG-score between the two distinct gene clusters.(F and G) Ranked dot and scatter plots indicating the ARG-score distribution and patient survival status.(H) Survival curve for two risk groups.(I) ROC curves to predict the sensitivity and specificity of 1-, 3-, and 5-year survival according to the ARG-score.

F I G U R E 5
Validation of the angiogenesis-related gene (ARG)-score in testing group and GSE17536 dataset.(A and B) Ranked dot and scatter plots indicating the ARG-score distribution and patient survival status in testing group.(C) Survival curve for two risk groups in testing group.(D) ROC curves to predict the sensitivity and specificity of 1-, 3-, and 5-year survival according to the ARG-score in testing group.(E and F) Ranked dot and scatter plots indicating the ARG-score distribution and patient survival status in GSE17536 dataset.(G) Survival curve for two risk groups in GSE17536 dataset (H) ROC curves to predict the sensitivity and specificity of 1-, 3-, and 5-year survival according to the ARG-score inGSE17536 dataset.

F I G U R E 6
The correlation and independent prognosis analysis of angiogenesis-related gene (ARG)-score and clinicopathological features in training group.(A) Heatmap of the correlation between clinicopathological features and ARG-score.(B-E) Boxplots of the correlation among T stage, N stage M stage, and TNM stage, respectively.(F and G) Univariate and multivariate analyses of the prognostic value of the ARG-score.

F I G U R E 7
Immune and molecular feature between the high and low-risk groups.(A) Relationships between the abundance of immune cells and selected genes in the prognostic model.(B) Correlations between angiogenesis-related gene (ARG)-score and immune and stromal scores.(C) Differences in the expression of ARGs between the high and low-risk groups.(D) Differences in the expression of immune checkpoint-related genes between the high and low-risk groups.(D-F) Gene set enrichment analysis (GSEA) in high and low-risk group, respectively.

F I G U R E 8
Comprehensive analysis of the angiogenesis-related gene (ARG)-score in colon cancer.(A and B) The correlation analysis between ARG-score and microsatellite instability (MSI).(C) The correlation analysis between ARG-score and cancer stem cells (CSC) index.(D and E) The waterfall plots of somatic mutation features established with high and low ARG-scores.