A novel signature based on pairwise PD‐1/PD‐L1 signaling pathway genes for predicting the overall survival in patients with hepatocellular carcinoma

Dear Editor, Hepatocellular carcinoma (HCC) is ranked as the most prevalent subgroup of liver malignancies in the world, representing about 90% of primary liver cancers.1 To date, no widely accepted molecular biomarkers are available for survival stratification with HCC.2,3 Recently, a novel algorithm was developed based on the relative orderings of mRNA expression patterns and had yielded excellent results.4,5 Substantial breakthroughs have been founded in programmed death 1 (PD-1)/ programmed death 1 ligand (PD-L1) signaling pathway in various cancers.6 Here, we used the TCGA cohort to develop a signature and other two databases to confirm the prognostic model based on pairwise PD-1/PD-L1 signaling pathway genes. As shown in the flowchart (Figure 1A), the mRNA expression matrixes and their clinical characteristics were retrieved from the TCGA, GEO, and the International Cancer Genome Consortium (ICGC-JP cohort) databases. The clinicopathological information of the three databases was displayed in Table S1. Finally, 146 potential genes were acquired from the KEGG and Reactome pathway databases (Table S2). We filtered out the candidate genes computed by median absolute deviation (MAD) > 0.5. We created all possible pairs of genes based on the shared 34 genes. Among gene pairs, two genes (a and b) formed a gene pair in alphabetical order. If the expression values of genea > geneb in a particular sample, the gene pair produced a score of 1; if the situation is opposite, the score equals 0. The association of these gene pairs was evaluated with univariate and LASSO Cox regression analysis after 1000 iterations (Figure 1B-C), and finally, 12 gene pairs, included 13 genes,were identified (Table S3). The risk score was computed via the expression patterns of these pairwise genes weighted by the respective coefficient obtained from LASSO algorithm. On the basis of time-dependent ROC curve, the ideal cutoff of the risk score was calculated at 0.573 to stratify individuals into the highand

low-risk subgroups ( Figure 1D). The corresponding signature risk score and individuals survival status between groups were illustrated in Figure 2A. The novel model could split individuals into low-and high-risk subgroups with different outcome (Figure 2B), and with the increased risk score, individuals have a greater risk of mortality (P = .00197; Figure 2C). Additionally, the area under curve (AUC) of the new model for 1-, 3-, and 5-year outcome in TCGA group was 0.733, 0.716, and 0.722, respectively ( Figure 2D). The performance of the model was successfully verified in the two external cohorts ( Figure 2E-L). The risk score was clearly higher in individuals with increased AFP (≥300 ng/mL), advanced Tumor, Node, Metastasis (TNM) staging (stage III-IV), advanced pathological grade (grades III-IV), patients without family history of cancer, and patients with vascular invasion (all P < .05). However, there were no obviously differences in age, neoplasm status, gender, and history of prior cancer ( Figure S1).
Multivariate Cox analysis demonstrated that the model risk score was an independent predictive indicator for OS ( Figure 3A) after adjusting for multiple clinical and pathological parameters in TCGA portal. Moreover, the independent prognostic factor was verified in two validation cohorts ( Figure 3B-C). A nomogram was therefore developed ( Figure 3D). The C-index for the nomogram and TNM stage was 0.706 (95% CI: 0.648-0.764) and 0.551 (95% CI: 0.479-0.623), respectively. Calibration plots exhibited highly consistent with the actual observations ( Figure 3E). According to decision curve analysis, the nomogram illustrated high clinical application potential and net benefits than the TNM stage examined ( Figure 3F).
The outline of immune cell abundance between two subgroups measured by CIBERSORT is illustrated in Figure 4A. It was revealed that memory B cells, resting mast cells, were elevated, and CD8 T cells were reduced in highrisk subgroup ( Figure 4B-D). The model risk score was positively associated with M0 macrophage, Tregs, neutrophil, Gene ontology (GO) analysis of the 13 signaling pathway genes illustrated that the signature was mainly enriched in immune response processes ( Figure S2A). KEGG enrichment analysis demonstrated that the signature was mainly involved in PD−L1/PD−1 checkpoint pathway in cancer, Fc epsilon RI signaling pathway, and B-/T-cell receptor signaling pathways ( Figure S2B). GSEA performed between two risk subgroups exhibited that multiple gene sets were markedly involved in high-risk group ( Figure S2C) and several metabolism pathways were mainly involved in lowrisk group ( Figure S2D).
Traditional approaches that previous studies used were required scaling and appropriate normalization among various measurement platforms. 2,7 We designed a novel method for the identification of functional associations between pairwise genes based on relative expression orderings of two genes within-sample, without the need of standardization, batch correction and scaling among different platforms. Besides, the cutoff of the risk score can be applied to other validation cohort in the future because it was computed as expression levels (0 or 1) of these pairwise genes weighted by the coefficients, presenting some significant advantages than previous signatures. Moreover, to identify reliable signature of HCC prognosis, we developed a signature for patients with HCC and confirmed its performance in other two external validation cohorts with large cancer samples for biomarker mining through multidimensional bioinformatics methods. The signature was accurate and robust, giving that the data resource was adequate and the analytical procedures were comprehensive. Therefore, all findings reveal that the new model can present a precise overall survival (OS) prediction of HCC patients and can measure HCC OS in an individualized way and can be readily translated into clinical utility according to the cutoff value determined from previous formula.
In conclusion, we proposed and validated a pairwise PD-1/PD-L1 gene panel that can accurately predict OS of patients with HCC. The signature can help to the effect of immunotherapy for HCC patients and be used as a promising marker to evaluate individuals who could benefit from immunotherapy.

ETHICS APPROVAL AND CONSENT TO PARTICIPATE
The data used were publicly available to all, so the ethics approval was waived.

DATA AVAILABILITY STATEMENT
All datasets that support the present study are publicly available to all at GEO (https://www.ncbi.nlm.nih.gov/ geo/) database, ICGC portal, and the TCGA portal.