Immune landscape and a promising immune prognostic model associated with TP53 in early‐stage lung adenocarcinoma

Abstract Purpose TP53 mutation, one of the most frequent mutations in early‐stage lung adenocarcinoma (LUAD), triggers a series of alterations in the immune landscape, progression, and clinical outcome of early‐stage LUAD. Our study was designed to unravel the effects of TP53 mutation on the immunophenotype of early‐stage LUAD and formulate a TP53‐associated immune prognostic model (IPM) that can estimate prognosis in early‐stage LUAD patients. Materials and methods Immune‐associated differentially expressed genes (DEGs) between TP53 mutated (TP53MUT) and TP53 wild‐type (TP53WT) early‐stage LUAD were comprehensively analyzed. Univariate Cox analysis and least absolute shrinkage and selection operator (LASSO) analysis identified the prognostic immune‐associated DEGs. We constructed and validated an IPM based on the TCGA and a meta‐GEO composed of GSE72094, GSE42127, and GSE31210, respectively. The CIBERSORT algorithm was analyzed for assessing the percentage of immune cell types. A nomogram model was established for clinical application. Results TP53 mutation occurred in approximately 50.00% of LUAD patients, stimulating a weakened immune response in early‐stage LUAD. Sixty‐seven immune‐associated DEGs were determined between TP53WT and TP53MUT cohort. An IPM consisting of two prognostic immune‐associated DEGs (risk score = 0.098 * ENTPD2 expression + 0.168 * MIF expression) was developed through 397 cases in the TCGA and further validated based on 623 patients in a meta‐GEO. The IPM stratified patients into low or high risk of undesirable survival and was identified as an independent prognostic indicator in multivariate analysis (HR = 2.09, 95% CI: 1.43–3.06, p < 0.001). Increased expressions of PD‐L1, CTLA‐4, and TIGIT were revealed in the high‐risk group. Prognostic nomogram incorporating the IPM and other clinicopathological parameters (TNM stage and age) achieved optimal predictive accuracy and clinical utility. Conclusion The IPM based on TP53 status is a reliable and robust immune signature to identify early‐stage LUAD patients with high risk of unfavorable survival.


| INTRODUCTION
Lung cancer is the leading reason for tumor-associated death, and it is histologically categorized into two major subtypes: small-cell lung carcinoma (SCLC) and non-small-cell lung carcinoma (NSCLC), accounting for approximately 15% and 85% of all cases, respectively. 1,2 Lung adenocarcinoma (LUAD), the most frequent subtype of NSCLC, comprises over 40% of all patients, which is derived from mucus-secreting type II alveolar cells in small airway epithelium. 3,4 High rate of metastasis and invasiveness, one of the most striking characteristics of LUAD, yields a 5-year survival rate of merely 19%. 5 Moreover, a staggering proportion (57%) of LUAD cases initially diagnosed with metastatic neoplasm achieves a 5-year survival rate of merely 5%. 6 Conversely, low-dose computerized tomography (CT) is conducive to the rapid screening of early-stage LUAD. 7,8 Those with localized tumor exhibit a relatively desirable 5-year survival rate of up to 83% (TNM stage IA) and 71% (TNM stage IB). 9 Based on several large and randomized clinical trials, a majority of early-stage LUAD cases are not recommended to receive adjuvant systemic chemotherapy following surgical intervention, which is partly ascribed to chemotherapy-associated toxic effects far exceeding the potential survival advantages for these individuals. 10,11 Therefore, it is essential to identify the subset of early-stage LUAD patients (TNM I and Ⅱ stage) with high possibility of recurrence and death, for whom additional systemic chemotherapy is required. 8,12 Immune escape has been considered as an emerging hallmark of lung cancer. 13 Novel and promising immune checkpoint inhibitors' (ICIs) treatment targeted programmed death 1(PD-1)/programmed death ligand 1 (PD-L1) has been revealed a prominent and durable response in approximately 20% NSCLC patients. 14 The ideal method of patient selection for the optimal improvement in the effectiveness of frontline immunotherapy for NSCLC remains to be determined. 15 The levels of immunosuppressive molecules (such as PD-L1, PD-1, and indoleamine 2,3-dioxygenase [IDO]), mutational landscape, and burden as well as mismatch repair deficiency have been identified as potential predictors of patient's response to ICIs therapy. 16,17 For example, a significantly increased progression-free survival (PFS) was revealed in anti-PD-1 monotherapy (Nivolumab) combined with anti-CTLA-4 monotherapy (Ipilimumab) versus chemotherapy alone in NSCLC individuals with a high tumor mutational burden (TMB) rather than in those with a low TMB. 18,19 The subset of NSCLC cases with strongly PD-L1positive neoplasms is the primary driver of clinical benefit from anti-PD-1 monotherapy (Pembrolizumab) in the whole research population. 20 Additionally, certain immune-associated clinicopathological parameters, such as enhanced levels of tumor-infiltrating cytotoxic lymphocytes (CTLs), seem to be prognostic biomarkers and are significantly correlated with better prognosis in early-stage LUAD, further highlighting the significance of multifarious components of the immune system during the initiation and progression of lung cancer. 21 Thus, comprehensive exploration of the complex cross-talk between tumor and immune microenvironment will be instrumental in evaluating their prognostic potential in early-stage LUAD and optimizing tumor-associated immunotherapy strategies. 22 Tumor suppressor p53, a transcription factor, exerts its tumor suppressive function primarily via its transcriptional modulation of its downstream target genes. 23 The products of p53 target genes are proved to participate in a sequence of crucial biological pathways, including cell proliferation and apoptosis, DNA damage repair, anti-oxidant function, metabolism and angiogenesis as well as immunoreaction, thus making contributions to the tumor-suppression effect of p53. [24][25][26] TP53, the encoding gene of p53 protein, is the most commonly mutated gene in multiple cancers and is subsistent in approximately 37%-50% LUAD cases. [27][28][29] Mutant p53 potentially triggers chromosomal/genomic instability and further results in a high TMB, which is frequently related to more aggressive malignancy and unsatisfactory prognosis in LUAD. [30][31][32] Nevertheless, several studies have highlighted that TP53-mutated LUAD is characterized with higher PD-L1 expression by malignant cells, boosted T-cell infiltration and tumor immunogenicity, leading to an increased response to ICIs. 7,[32][33][34] Therefore, TP53 mutation burden is a pivotal determinant and biomarker for targeted therapy response and prognosis of patients.
In our report, we combined TP53 mutation status information with mRNA expression profiles to investigate the association between TP53 mutation and immune landscape in early-stage LUAD. An individualized IPM on the basis of immune-related gene whose expression is influenced by TP53 mutation status was developed and validated in different populations and platforms, which can be served as a promising prognostic signature to improve early-stage LUAD patient management.

| Data source and patient selection
We retrospectively analyzed the raw data of somatic single nucleotide mutation, mRNA expression, and corresponding clinical information of early-stage LUAD patients from four public datasets, including The Cancer Genome Atlas (TCGA) database and three independent datasets retrieved from Gene Expression Omnibus (GEO; GSE72094 based on platform GPL15048, GSE42127 based on platform GPL6884 and GSE31210 based on platform GPL570). Only early-stage LUAD cases with sufficient clinical annotation were incorporated into our study. Certain clinicopathological variables, such as age at diagnosis, gender, race, TP53 mutant information, status of residual tumor, TNM stage, survival time, and survival status, were extracted for each patient. A total of 1020 early-stage LUAD patients were included, consisting of 397 cases from TCGA, 336 cases from GSE72094, 111 cases from GSE42127, and 176 cases from GSE31210. We selected the TCGA dataset as independent training cohort. The remaining three GSE datasets were merged into one meta-GEO as validation cohort. Both gene expression data and clinicopathological information for early-stage LUAD are publicly available. All analyses in our study were performed strictly followed to the guidelines and regulations of above databases. The flow chart of overall study design was illustrated in Figure 1.
Specifically, the somatic mutation status of 561 LUAD cases (workflow type: SomaticSniper Variant Aggregation and Masking) and RNA sequencing (RNA-seq) data and the corresponding clinicopathologic parameters of 504 The flow diagram of our study. DEGs, differently expressed genes; FC, fold change; FDR, false discovery rate; LASSO, least absolute shrinkage and selection operator; ROC, receiver operating characteristic; OS, overall survival early-stage LUAD samples were extracted from the TCGA (https://tcga-data.nci.nih.gov/tcga/; up to May 30th, 2020). 35 The gene symbols were annotated using "clusterProfiler" R package. 36 Of above early-stage LUAD patients, a total of 397 cases with mRNA expression profiles, TP53 mutant status and clinical data were selected for subsequent analysis in our study. The RNA-seq data were handled through log2-scale transformation and was further applied to trimmed mean of M values (TMM) normalization via "edgeR" R package. 37 Average expression value was estimated and utilized when multiple expression values of the same gene were detected. 35 Microarray data and the corresponding clinical information of early-stage LUAD individuals from GSE72094, GSE42127, and GSE31210 were downloaded from Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm. nih.gov/geo/) through "GEOquery" R package. 38 To accord with standard normal distribution, these data were eliminated batch effects via "sva" R package and normalized by scale function of "limma" R package, which were further merged into a meta-GEO cohort (including 623 early-stage LUAD patients) to externally validate the practicability of the IPM. 35

| Gene set enrichment analysis
To investigate the underlying association between TP53 mutant status and immune-associated biological pathways in early-stage LUAD, gene set enrichment analysis (GSEA) was conducted between early-stage LUAD cases without (n = 218) and with (n = 179) TP53 mutation in the TCGA cohort through GSEA software downloaded from https://www. broad insti tute.org/gsea/. 39 p < 0.05 was considered statistically significant.

| Differentially expressed genes analysis based on TP53 status
DEGs between TP53 MUT and TP53 WT early-stage LUAD were analyzed by "limma" R package. 40 |log 2 fold change (FC) | > 1 and false discovery rate (FDR) < 0.01 were considered as the cutoff values to identify DEGs. All TP53-related DEGs and above immune-related genes obtained from GSEA were overlapped to acquire immune-associated DEGs between TP53 WT and TP53 MUT early-stage LUAD individuals.

| Construction and validation of an immune prognostic model
A total of 397 early-stage LUAD patients in the TCGA database had sufficient information, including mRNA expression profiles, TP53 mutant status, survival time and survival status. All TP53-related immune DEGs and the survival data of 397 cases were analyzed by univariate Cox regression analysis utilizing "survival" R package. 41 When p value was below 0.05 and |hazard ratio (HR)|was not equal to 1, the corresponding DEGs were considered as prognostic immunerelated genes, which were all incorporated into subsequent investigation. LASSO Cox analysis was implemented through "glmnet" R package, further determining the most significantly prognostic immune-related DEGs. 42 The penalization coefficient was used for evaluating the weight of model parameters. Nonsignificant indicators were shrunk to zero, while residual DEGs were applied for establishing a prognostic risk score model. Subsequently, an IPM was constructed based on corresponding coefficients of the prognostic DEGs: risk score = β mRNA1 * Expr mRNA1 + β mRNA2 * Expr mRNA2 + ⋯ + β mRNAn * Expr mRNAn , where Expr is the DEG expression level and β is the LASSO Cox regression coefficient. All early-stage LUAD cases in the TCGA dataset were stratified into the high-or low-risk group according to the optimal cutoff value identified by X-tile 3.6.1 software. 43 To determine the distinguishing capability of the IPM for patients' prognosis, the differences of overall survival (OS) between high-or low-risk group were compared through Kaplan-Meier curves with log-rank test using "survival" R package. 41 To evaluate the predictive efficiency of the IPM, the time-dependent receiver operating characteristic (ROC) curve with area under the curve (AUC) value were developed through "survival ROC" R package. 44 Similarly, the identical median value and formula of risk score based on the TCGA cohort were further implemented in a meta-GEO validation group to identify the robustness of our IPM.

| Estimation of immune cell type fractions
CIBERSORT, a deconvolution algorithm described by Alizadeh et al, has the capability to quantify cell fractions from abundant tissue gene expression profiles. To detect the relative fractions of 22 infiltrating immune cell types in the early-stage LUAD samples, the CIBERSORT (http://ciber sort.stanf ord.edu/) and the LM22 were utilized. 45 LM22, the leukocyte gene signature matrix with 547 genes, is an approach to accurately differentiate 22 human hematopoietic cell phenotypes, including B cells, T cells, natural killer (NK) cells, macrophages, dendritic cells (DCs), and myeloid subsets. CIBERSORT generates a p value for the deconvolution of each specimen through Monte Carlo sampling, thus measuring the confidence of the results. When p value is below 0.05, the outcome of the concluded fractions of immune cell populations generated through CIBERSORT is considered receivable. 45 Therefore, in our report, CIBERSORT integrated with LM22 were applied for quantifying the percentage of immune cells in TP53 MUT and TP53 WT early-stage LUAD samples in the TCGA database. Patients with a CIBERSORT p < 0.05 were selected for subsequent analysis.

| Independence of the IPM from conventional clinicopathological factors
A total of 397 early-stage LUAD cases with sufficient clinicopathological data such as age, gender, race, TP53 mutant status, residual tumor status, TNM stage, and survival data were subjected to subsequent analysis. Univariate and multivariate Cox regression analysis were applied for exploring whether the predictive capacity of an IPM was independent of conventional clinicopathological variables.

| Development and validation of nomogram model
We incorporated all statistically significant clinicopathological parameters identified via multivariate Cox analysis and further established a visualized nomogram model through "rms" and "survival" R package, thus predicting the 3-,5-,10-year OS probability of patients. 46,47 The predictive capability of the nomogram was evaluated through measuring discrimination and calibration utilizing a bootstrap approach under 1000 resampling. Discrimination was assessed via Concordance index (C-index). The C-index is closer to 1, implying more accurate predictive ability of the nomogram model. 48 Calibration curves were applied for evaluating the consistency between the predicted survival probability of nomogram and the actual observed possibility. The 45 reference line indicated the optimal predictive performance. 46 The ROC curve with AUC value and the decision curve analysis (DCA) were generated through using "survival ROC" and "rmda" R package, respectively, which can evaluate the predictive precision and clinical utility of nomogram. 44,49 3 | RESULTS

TCGA database
To investigate the prognostic role of 67 immune-associated DEGs in patients, we performed univariate Cox analysis and further found that six of 67 DEGs, including C8B, C6, PLA2G1B, Ectonucleoside triphosphate diphosphohydrolase family member 2 (ENTPD2), Macrophage migration inhibitory factor (MIF) and C8A, were significantly related to patients' OS (p < 0.05; Table 1). Moreover, LASSO Cox analysis was conducted to identify two immune-associated DEGs (ENTPD2 and MIF) with the optimal prognostic efficiency in patients. In the TCGA cohort, ENTPD2 (HR = 1.12, 95% CI: 1.02-1.24, p = 0.0243) and MIF (HR = 1.21, 95% CI: 1.02-1.43, p = 0.0277) were considered as risky genes in earlystage LUAD and were ultimately chosen to establish an IPM ( Table 1). The following formula can be used to calculate the risk score of each patient:IPM risk score = 0.098 * ENTPD2 expression value +0.168 * MIF expression value. We estimated the risk score of every patient and identified optimal cutoff value via X-tile software.
All 397 early-stage LUAD cases were stratified into high-or low-risk group in accordance with the optimal F I G U R E 3 Gene set enrichment analysis (GSEA) results of significant immune-associated biological processes enriched in TP53 WT earlystage lung adenocarcinoma (LUAD) individuals. TP53 mutated, TP53 MUT cutoff point (0.99). Survival analysis revealed that a higher risk score was correlated with significantly worse prognosis of cases in the TCGA training database (HR = 1.85, 95% CI: 1.27-2.72, p = 0.002; Figure 4A). The risk score distribution, heatmap concerning two immune genes' expression, and survival status for every patient were illustrated in Figure 4B. There was a high uniformity between the expression levels of ENTPD2 and MIF and the IPM risk score. Specifically, the high-risk cases displayed significantly enhanced levels of ENTPD2 and MIF compared with their low-risk counterparts. The time-dependent receiver operating characteristic (ROC) curve and the corresponding area under the ROC curves (AUC) was applied for evaluating the prognostic accuracy of our IPM. The AUC values for OS of early-stage LUAD were 0.612 (95% CI: 0.466-0.736) at 0.5 years, 0.662 (95% CI: 0.594-0.751) at 1 years, 0.658 (95% CI: 0.598-0.721) at 2 years, 0.707 (95% CI: 0.646-0.763) at 3 years and 0.645 (95% CI: 0.543-0.738) at 5 years ( Figure 4C).

GEO cohort
A meta-GEO cohort composed of 623 early-stage LUAD cases was utilized to externally validate the robustness of our IPM. The expression values of ENTPD2 and MIF gene in the meta-GEO validation cohort were normalized, with an average value of 0 and a standard deviation (SD) of 1. 50 On the basis of the identical score formula and cutoff point in the TCGA training group, all cases in a meta-GEO group were categorized into high-or low-risk groups. As revealed in Figure 4D, cases with high risk score were characterized with an inferior prognosis compared with those in the low-risk group (HR = 2.10, 95% CI: 1.47-3.01, p < 0.001), which was roughly consistent with the results in the TCGA group. Figure 4E illustrated the IPM score, gene expression and survival status of early-stage LUAD cases in the meta-GEO validation cohort. Moreover, the diagnostic performance of the IPM attained an AUC of 0.645 (95% CI: 0.552-0.715) at 0.5 years, 0.684 (95% CI: 0.612-0.749) at 1 years, 0.629 (95% CI: 0.515-0.633) at 2 years, 0.605 (95% CI: 0.539-0.645) at 3 years and 0.738 (95% CI: 0.637-0.758) at 5 years ( Figure 4F). Therefore, these findings highlight that the IPM is robust and applicable in varying populations and platforms.
Recently, Zuo et al. established a novel 8-gene prognostic signature for predicting the long-term OS of patients with early-stage non-small-cell lung cancer (NSCLC). 51 Firstly, they investigated four lung cancer datasets to estimate DEGs between early-stage NSCLC and normal adjacent lung tissue. Furthermore, Cox proportional hazards models was performed on those DEGs to determine the candidate prognostic genes. Ultimately, a risk score model for these hub genes was developed in the TCGA cohort. Similarly, in 2020, Liang et al also proposed a promising 21-gene-based prognostic immune prediction model for early-stage LUAD patients in the TCGA cohort. 52 The C-index, a frequently utilized evaluation indicator for survival models, was estimated through "pec" R package utilizing a bootstrap manner under 1000 resampling, thus comparing the predictive performance of the previously published prognostic signatures and our IPM. A higher C-index value indicates the more favorable predictive performance of the model. 53 The C-index of our IPM to predict 1 to 5-year OS surpassed that of the 8-gene and 21-gene prognostic signatures in TCGA and meta-GEO group, highlighting that the IPM displayed relatively desirable performance to predict the prognosis of early-stage LUAD patients ( Figure 4G,H).

TP53 status
As revealed in Figure 5A, TP53 mutant early-stage LUAD patients exhibited a 1.66-fold higher risk (HR = 1.66, 95% CI: 1.16-2.38, p = 0.006) than patients without TP53 mutation. We stratified all cases in the TCGA group into two subgroups based on TP53 status, thus uncovering whether the predictive power of the IPM was independent of TP53 status. The results indicated that high risk score exerted a significantly negative effect on the prognosis of earlystage LUAD patients in TP53 WT subgroup (HR = 1.96, 95% CI: 1.14-3.40, p = 0.016; Figure 5B) and TP53 MUT subgroup (HR = 1.74, 95% CI: 1.01-2.97, p = 0.044; Figure 5C), respectively. Similarly, correlation analysis also demonstrated that there was a negative association between IPM risk score and survival time in both TP53 WT (R = −0.26, p = 0.00012) and TP53 MUT (R = −0.42, p = 9.9e-09) subgroups ( Figure 5D). Furthermore, the predictive performance of our IPM was independent of T A B L E 1 Univariate Cox analysis of immune-associated DEGs between TP53 WT and TP53 MUT early-stage LUAD TP53 status via univariate and multivariate Cox analysis ( Figure 5E). Concerning that a majority of TP53 alterations were missense mutation (106/179) in our report, we then investigated whether the IPM influenced the prognosis of patients in different TP53 mutation subtypes. As revealed in Figure 5F, cases with high risk score were correlated with diminished survival in the TP53 missense mutation subgroup (HR = 2.62, 95% CI: 1.17-5.87, p = 0.019).

| Strengthened immune response in early-stage LUAD patients with low IPM risk score
Gene set enrichment analysis was further conducted to disclose immune-related processes between low-risk (n = 182) and high-risk (n = 215) early-stage LUAD cases in the TCGA training cohort. Cases with low risk score were significantly  Table S4). Conversely, there was no statistically significant immune-associated biological processes enriched in patients with high risk score (Table S5). Thus, low IPM risk score potentially portends an enhanced immune phenotype while attenuated immune response is prone to occur in those cases with high risk score.

| Immune landscapes between earlystage LUAD patients with low and high risk score
The discrepancies concerning the compositions of 22 tumor-infiltrating immune cells between low-and high-risk early-stage LUAD cases were evaluated through CIBERSORT combined with LM22 signature matrix. The distribution of the risk score, clinicopathologic characteristics and immune cell composition were evaluated in the TCGA cohort. A higher risk score was significantly correlated with more advanced TNM stage and the status of with tumor ( Figure 6A). Additionally, the proportions of immune cells were diverse among different early-stage LUAD samples, highlighting that alterations of intra-tumoral immune cells potentially serve as inherent characteristics to represent individual variations ( Figure 6A). Generally, the five most common immune cell fractions in early-LUAD tissues were CD4 + T cells, DCs, NK cells, CD8 + T cells and B cells, and the sum of their average proportions was over 50% ( Figure 6B). As revealed in Figure 6C, patients with low risk score were characterized with an increased proportion of resting DCs (p = 0.0168) and eosinophils (p = 0.0089). Conversely, cases in the high-risk group had higher infiltration of activated mast cells (p = 0.0413) and monocytes (p = 0.0389). We performed principal components analysis (PCA) to identify the difference between low-and high-risk early-stage LUAD patients on the basis of above-identified immune cell subpopulations. As showed in Figure 6D, immune cell subpopulations stratified high-risk and low-risk cases into two discrete sections, emphasizing that the immune landscape of early-stage LUAD cases in the low-risk group was greatly different from those with high risk score. The heterogeneity of immune microenvironment in early-stage LUAD potentially exerts significant implications for identifying prognosis. In order to unravel the associations between the IPM risk score and responses to immunotherapy of early-stage LUAD patients, we specially focused on the relationship between IPM risk score and the levels of crucial immunotherapy-related genes. There was a significantly negative correlation between risk score and the expression of CD8A (Pearson correlation coefficient [PCC] = −0.138, p = 0.0058) and CD4 (PCC = −0.258, p = 1.823e-07). In contrast, the IPM risk score was significantly positively associated with PD-L1 (also known as CD274, PCC = 0.163, p = 0.0011), CTLA-4 (PCC = 0.141, p = 0.005) and TIGIT (PCC = 0.109, p = 0.031; Figure 7A; Table 2), whereas there was no significant correlation between risk score and the levels of  (Table 2). Similarly, violin plots depicted in Figure 7B illustrated that cases with low IPM risk score had significantly increased levels of CD4 and CD8A while significantly diminished expression of PD-L1, CTLA-4, and TIGIT compared with the high-risk patients (p < 0.05). Thus, the expression levels of pivotal immunotherapy-associated genes are potentially correlated with the risk stratification of early-stage LUAD patients.

| Analysis of IPM-associated biological function and pathway
GO analysis was performed to predict the potential biological significance of immune-associated DEGs between the highand low-risk groups. A total of 17 immune-associated DEGs were obtained through overlapping 178 risk score-associated DEGs and 94 IPM-related immune genes, which were considered as risk score-associated immune DEGs ( Figure 8A). Those genes were submitted for GO analysis to reveal the underlying biological functions and pathways ( Figure 8B,C). These results demonstrated that risk score-associated immune DEGs in the TCGA cohort were primarily enriched in the humoral immune response, antimicrobial humoral response and positive regulation of cytokine secretion (Table S6).

| The IPM is independent of traditional clinicopathological factors
Univariate and multivariate Cox regression analysis were applied for assessing the contribution of an IPM as an independent prognostic parameter to the OS of early-stage LUAD cases in the TCGA cohort. As revealed in Figure 9A, univariate Cox analysis demonstrated that several clinicopathologic parameters, such as TP53 status, TNM stage, age and IPM risk score, exerted an effect on the prognosis of early-stage LUAD cases. Above significant variables were incorporated   results indicated that the risk score was correlated with OS and the predictive capability of the IPM was independent of traditional clinicopathological variables for the prognosis of early-stage LUAD cases. Moreover, we made a comparison of the C-index between our IPM and traditional clinicopathological factors. Among six prognosis-predictive parameters (including TP53 mutant status, gender, TNM stage, age, race and residual tumor), our IPM displayed the greatest average C-index (0.625) than other traditional clinicopathological parameters (0.509 to 0.579; Figure 9B). Thus, above findings suggested a more satisfactory capability of the IPM to estimate the OS of early-stage LUAD cases.

| Construction and identification of a nomogram model based on the IPM
A nomogram model was further developed through integrating all significant parameters (the IPM, TNM stage and age) identified via multivariate Cox analysis, which can confer physicians with a quantitative method to estimate the survival of early-stage cases. In the nomogram model, above parameters were assigned scores based on a point scale with a range of 0 to 100. The score of each parameter was identified through plotting upward a straight line. The scores of the parameters of each patient were added up and considered as the total points. For each early-stage LUAD patient, the survival rate of 3, 5, and 10 years was estimated through plotting downward a perpendicular line from the total point axis to the result axis. Specifically, one early-stage LUAD patient at TNM stage Ⅱ (100 points), with high risk (66 points) and age over 60 years (17 points) obtained a total point of 183. The perpendicular line plotted from the total point axis at a numeric value of 183 to the result axis revealed that the survival probability of 3, 5, and 10 years was 63%, 35% and 19%, respectively. Thus, TNM stage made the greatest contributions to risk points with a range of 0 to 100, followed by IPM risk score (ranging from 0 to 66) and age (ranging from 0 to 17; Figure 9C).
The nomogram displayed a relatively desirable C-index of 0.709 (95% CI: 0.615-0.779). The deviation correction line was close to the 45-degree reference line in the calibration curves, indicating the satisfactory consistency between  Figure 9D). We made a contrast of the predictive efficiency of nomogram model and additional clinical variables (such as age, TP53 status and IPM as well as TNM stage) through establishing time-dependent ROC curves. Concerning the ROC curve of 3-year OS, nomogram model achieved the highest AUC of 0.842, followed by IPM (AUC = 0.78), TNM stage (AUC = 0.62), TP53 status (AUC = 0.609) and age (AUC = 0.542; Figure 10A). The nomogram to predict 5-year OS achieved an AUC value of 0.87, which was more satisfactory than that of IPM (0.742), TNM stage (0.617), TP53 status (0.586) and age (0.585; Figure 10B). The AUC for the nomogram, IPM, TNM stage, TP53 status and age to predict 10-year OS were 0.916, 0.717, 0.632, 0.641 and 0.609, respectively ( Figure 10C). Furthermore, DCA curve was established to evaluate the clinical utility and benefit of the nomogram and additional parameters. The DCA of the nomogram displayed the greatest net benefits, followed by TNM stage, IPM, TP53 status and age ( Figure 10D). Collectively, above findings highlighted that the nomogram, constituted by IPM, TNM stage and age, is an optimal model with relatively desirable clinical utility to predict long-term OS of early-stage LUAD.

| DISCUSSION
A growing number of publications have demonstrated the complicated interaction between the immune microenvironment and the malignant transformation and development of LUAD. 54 TP53 mutant status is associated with the percentages of immune cell infiltration and the expression levels of immune checkpoints, which potentially serves as an indicator for evaluating the effectiveness of immunotherapy in LUAD. 55,56 Thus, reliable prognostic biomarkers associated with the tumor immune landscape and TP53 status potentially hold great promise for recognizing promising molecular targets and strengthening patient management in the era of immunotherapy.
In our report, we focused on the effect of TP53 mutation on modulating the immune landscape in early-stage LUAD based on bioinformatic approaches. Initially, we demonstrated that TP53 WT early-stage LUADs displayed a distinctly intensive local immune response compared with TP53 MUT counterparts through GSEA analysis. Our conclusion is supported by another report where revealed that TP53 WT female LUADs rather than TP53 MUT cases were correlated with the longest survival and the increased percentages of certain immune infiltrates, such as INF-γ, TNF and macrophages monocytes. While immunosuppressive molecule PD-L1 levels were greater in TP53 MUT LUAD compared with TP53 WT counterpart. 57,58 Furthermore, we established a prognostic signature based on two immune-associated DEGs (MIF and ENTPD2) whose expression was related to the TP53 status and the prognosis of early-stage LUAD. Our IPM can further classify clinically defined groups of early-stage LUAD patients into subgroups with different survival outcomes. High expression levels of MIF and ENTPD2 signified high IPM risk score, further indicating the undesirable prognosis in early-stage LUAD.
Migration inhibitory factor is structurally conservative homotrimer in mammals and is considered as well-acknowledged functional immune cytokine with a crucial effect on host immune system and whole inflammatory cascade. MIF is primarily derived from multiple immune cells, such as macrophages monocytes, T and B cells, eosinophils, neutrophils and mast cells. [59][60][61][62] MIF is initially proved to impede the directionless migration of macrophages in vitro to characterize delayed type hypersensitivity. 63,64 MIF can also stimulate the secretion of pro-inflammatory mediators (including TNF-α, IL-1β, IL-6, COX2, and IFN-γ). 65 MIF has been demonstrated to be significantly upregulated in various tumors, including colon cancer, 66 prostate cancer, 67 malignant melanoma, 68 head and neck cancer, 69 glioblastoma 70 and breast cancer 71 as well as lung cancer. [72][73][74][75][76][77] High expression level of MIF usually indicates advanced cancer phenotype and unsatisfactory prognosis in above tumors. Specifically, the lung cancer cells with resistance to cisplatin facilitates M2 polarization of tumor-associated microphages (TAMs) through secreting MIF, thus accelerating angiogenesis and epithelial-mesenchymal transition (EMT) as well as distant metastasis of lung cancer. 74 Overexpression of MIF also activates NF-κB and further upregulates HIF-1α, thus amplifying proliferation and Warburg effect in lung cancer. 75 CXCR4/ MIF axis positively modulates tumor growth and EMT interaction in NSCLC. 76 MiR-608 mitigates tumor migration and invasion via directly targeting MIF in LUAD. 77 Thus, MIF can develop into an attractive target for pharmacological intervention for the therapy of lung cancer.
ENTPD2 (also named CD39L1 or NTPDase 2), is considered as a pivotal ectoenzyme involved in extracellular ATP hydrolysis. 78,79 AMP is generated via ENTPD2-mediated hydrolyzation of extracellular ATP and is further hydrolyzed by CD73 to adenosine that stimulates tumor proliferation and metastasis as well as drug resistance. 79 ENTPD2 upregulation is demonstrated in papillary thyroid carcinoma-derived cells, esophageal cancer cells, gliomas cells, and liver cancer cells in contrast to normal cells. [80][81][82][83] Overexpression of ENTPD2 is also revealed in HCC clinical samples, which also indicates undesirable prognosis for HCC. The hypoxic microenvironment stimulates ENTPD2 overexpression mediated by hypoxia-inducible factor-1 (HIF-1) in liver cancer cells. ENTPD2 transforms extracellular ATP into 5′-AMP and further suppresses the differentiation of myeloid-derived suppressor cells (MDSCs), thus facilitating the maintenance of MDSCs and the development of tumor immunosuppressive microenvironment. Depletion of ENTPD2 can impede tumor growth and strengthen the efficiency of immune checkpoint inhibitors. 82 Thus, ENTPD2 is harnessed by tumor cells to shun immune-mediated demolition.
We also demonstrated an enhanced immune landscape in the low-risk group through functional enrichment analysis. Specifically, increased proportions of resting DCs and eosinophils while decreased abundance of activated mast cells and monocytes were observed in early-stage LUAD cases with low risk score. Notably, there were several contradictory findings concerning the effect of mast cells on the prognosis of LUAD. A report demonstrated that mast cells was associated with angiogenesis and unsatisfactory prognosis in stage I LUAD. 84 Mast cells could facilitate growth and metastasis by generating IL-1β in LUAD progression. 85 Conversely, certain studies revealed that greater abundance of mast cells was related to better prognosis and prolonged survival in early-stage LUAD cases. 54,86 We also found relatively enhanced expression of CD4 and CD8A indicative of a strengthened immune response in the low-risk group. 7 Conversely, patients with high risk score had increased levels of immunosuppressive molecules (such as PD-L1, CTLA-4 and TIGIT). Notably, in several reports on the anti-PD-1/PD-L1 treatment for NSCLC, PD-L1 expression in tumors has been considered as the criteria and a predictive biomarker for inferior prognosis. Interestingly, PD-L1-expressing NSCLC individuals have a higher possibility to obtain clinical benefit from immunotherapy. 15,87 Based on the aforementioned results, dysregulation of immune contexture is a potential reason for the OS differences between patient subgroups stratified by the IPM. The immunosuppressive microenvironment is one of the reasons why high-risk subgroup was characterized with unsatisfactory clinical outcome.
In benchmark comparisons, our IPM achieved more satisfactory prognostic performance than a novel 21-gene-based immune-related gene signature and an 8-gene prognostic signature that have been published for early-stage LUAD. Additionally, we conducted Cox analysis to identify that IPM risk score was an independent prognostic parameter. ROC curve with AUC values revealed that our IPM had more satisfactory predictive power than traditional prognostic variables. We further developed a novel nomogram model through leveraging the complementary value of the IPM and clinicopathologic characteristics (including age and TNM stage), further highlighting that combining both could confer an intuitionistic and accurate scoring system to estimate the OS of early-stage LUAD in clinical practice.
Several limitations are deserved to be discussed in our research. Initially, our report was retrospective and these public databases were devoid of certain crucial clinicopathologic information (such as smoking and drinking status, family history, whether performed by surgical intervention and specific surgery types, whether receiving neoadjuvant chemoradiotherapy). 46,48 Thus, further multicenter clinical trials of larger sample size and more detailed clinical data are required to externally validate our findings. Furthermore, our IPM was constructed based on two immune genes. The molecular functions and biological effects of above genes are warranted to be investigated individually and conjunctively, further favoring their clinical utilization. Ultimately, our IPM risk score was estimated in accordance with the gene expression values. The intra-tumor heterogeneity potentially resulted in sampling bias.

| CONCLUSION
In conclusion, we established and validated a two immune gene-based and TP53 status-associated IPM through analyzing the TP53 mutation data, RNA expression, and clinical information of early-stage LUADs in multiple independent datasets across different platforms. The IPM was considered as an independent prognostic biomarker for early-stage LUAD. A nomogram model including the IPM and other clinicopathologic parameters was formulated to quantitatively predict the long-term prognosis of early-stage LUAD individuals, thus assisting physicians to make better clinical decisions.

CONFLICT OF INTEREST
The authors have declared that no competing interest exists.

AUTHOR CONTRIBUTIONS
Wei Lin and Chengde Wu designed/planned the study and wrote the paper. Chengde Wu acquired and analyzed data and performed computational modeling. Chengde Wu and Xiang Rao performed imaging analysis. Wei Lin, Chengde Wu, and Xiang Rao participated in discussion of related data.

ETHICAL STATEMENT
The TCGA and GEO database is publicly available, and our study was performed based on the guideline of these databases. All patient information was anonymized and de-identified in the TCGA and GEO database. Thus, our study was exempted from the ethics committee approval and patients' informed consent.