An integrated multiomics analysis of rectal cancer patients identified POU2F3 as a putative druggable target and entinostat as a cytotoxic enhancer of 5-fluorouracil

Rectal cancer (RC) accounts for one-third of colorectal cancers (CRC), and 40% of these are locally advanced rectal cancers (LARC). The use of neoadjuvant chemoradiotherapy (nCRT) significantly reduces the rate of local recurrence compared to adjuvant therapy or surgery alone. However, after nCRT, up to 40%-60% of patients show a poor pathological response, while only about 20% achieve a pathological complete response. In this scenario, the identification of novel predictors of tumor response to nCRT is urgently needed to reduce LARC mortality and to spare poorly responding patients from unnecessary treatments. Therefore, by combining gene and microRNA expression datasets with proteomic data from LARC patients, we developed an integrated network centered on seven hub-genes putatively involved in the response to nCRT. In an independent validation cohort of LARC patients, we confirmed that differential expression of NFKB1 , TRAF6 and STAT3 is correlated with the response to nCRT. In addition, the functional enrichment analysis also revealed that these genes are strongly related to hallmarks of cancer and inflammation, whose

dysfunction may causatively affect LARC patient's response to nCRT. Furthermore, by constructing the transcription factor-module network, we hypothesized a protective role of POU2F3 gene, which could be used as a new drug target in LARC patients.
Finally, we identified and tested in vitro entinostat, a histone deacetylase inhibitor, as a chemical compound that could be combined with a classical therapeutic regimen in order to design more efficient therapeutic strategies in LARC management.

| INTRODUCTION
Colorectal cancer (CRC) ranks third in incidence and mortality globally, with similar numbers in men and women. 1 One-third of CRCs are represented by rectal cancer (RC) and 40% of RCs are locally advanced rectal cancer (LARC) at diagnosis. Recently, some studies have shown that colon and rectal cancer differ in terms of their clinical-pathological characteristics, carcinogenic pathways, and therapy. 2,3 The standard approach for LARC includes preoperative neoadjuvant chemoradiotherapy (nCRT) followed by total mesorectal excision. The use of nCRT significantly reduces the rate of local recurrence compared to adjuvant therapy or surgery alone. 4 However, after nCRT, up to 40%-60% of patients show a poor pathological response, while only about 20% achieve a pathological complete response. 5 To date, the most promising biomarker to monitor LARC treatment is the carcinoembryonic antigen (CEA); however, its prognostic potential is insufficient. 6 Further, the few studies performed on LARC to identify potential predictors of nCRT response are contradictory and remain inconclusive, principally due to discrepancies in patient selection, sample size, nCRT regimen, and parameters used to evaluate a patient's response to therapy. Thus, the identification of novel predictors of LARC response to nCRT is urgently needed to reduce LARC mortality and to spare poorly-responding patients from unnecessary treatments. 6 At the macroscopic level, the complex interactions among genetic, environmental, and lifestyle factors are at the basis of complex diseases, including cancer. 7 In parallel, at the microscopic level, complex diseases are typically caused by a combination of molecular alterations and their interplay. 7 The reductionist approach devoted to dissecting the individual gene biomarkers has been used in almost all existing studies, relying on an underlying hypothesis that changes in gene expression cause and explain different phenotypes. However, it is often not possible to distinguish whether gene expression variations are causative or merely an effect of intricate rewiring of regulatory and signaling cascades in complex diseases. 7 Genes do not work alone, they interact with each other and are controlled by many transcription factors to form networks or pathways, to carry out biological functions. In addition, it has been demonstrated that even if genetic modifications in tumor cells initiate and drive malignancies, the acellular component of the tumor microenvironment (TME), called extracellular matrix (ECM), coevolves with cancer cells, creating a dynamic signaling circuitry that promotes cancer diffusion and reduces the response to therapy. Taken together, these factors support the use of a computational integrated approach of experimental-driven evidence, including transcriptomic and proteomic datasets, to reconstruct the underlying intricate interaction networks at the basis of cancer onset and resistance to therapy.
In our study, we applied such a holistic approach to identify novel potential biomarkers for the prediction of LARC patients' response to nCRT. Among these, POU2F3 was identified as a key gene in determining patients' response and survival, and we hypothesized that upregulating it could improve response to chemotherapy. We validated this hypothesis on CRC cell lines by performing POU2F3 upregulation using the histone deacetylase inhibitor (HDAC) entinostat.
Indeed, combination treatment of entinostat with the classical chemotherapeutic 5 fluorouracil (5-FU) was proven to be more cytotoxic compared to single 5-FU treatment or entinostat treatment alone.

| Patients
A series of n = 21 paired (matched) healthy rectum (HR) and rectal cancer (RC) tissue samples from LARC patients were collected from the Tissue biobank of the First surgery clinic (Department of Surgery, Oncology and Gastroenterology-University of Padova). All of the patients fulfilled the following criteria: histologically confirmed primary adenocarcinoma of the rectum, tumor within 12 cm of the anal verge on proctoscopic examination, clinical stage cT3-4 and/or N0-2, resectable disease and age ≥18 years. Patients with a known history of a hereditary colorectal cancer syndrome and patient who were not nCRT naive were excluded. Patients' clinic-pathological characteristics were summarized in

| Tissue decellularization
HR and RC destined to decellularization process were treated with one to three detergent-enzymatic treatment (DET) cycles. Each DET cycle was processed as indicated in reference 9. After decellularization, matrices were rinsed in PBS +3% penicillin/streptomycin (pen/strep) and then stored at À80 C until used.

| DNA isolation and quantification
To assess total DNA content within the native HR and RC compared to decellularized matrices (respectively, DHR and DRC), 20 mg of each specimen was treated using DNeasy Blood & Tissue kit (Qiagen) under manufacturer's instruction. DNA samples were then quantified using Nanodrop 2000 spectrophotometer at 260/280 nm ratio (Thermo Scientific).

| DHR and DRC proteomics
A total of n = 6 paired HR and RC (n = 3 TRG 1-2, R and n = 3 TRG 4-5, NR) decellularized tissues (average weight 2 mg, range 0.5-3.4 mg) were analyzed by mass spectrometry following the protocol published by Naba et al 10 and as indicated in Supplementary methods (Data S1).

| Resources and datasets
In our study, we reused our previous published differential genes and microRNAs derived from datasets including n = 86 LARC patients treated with nCRT. Gene expression profile dataset GSE4540 11 was performed on platform GPL570 Affymetrix Human Genome U133 Plus 2.0 Array (HG-U133_Plus_2). microRNA expression profile dataset GSE68204 12 was performed on miRNA microarray platform Rel 12.0 (V3) manufactured by Agilent SurePrint Technology. The molecules were used to carry out the network analysis in combination with proteomic data.

| Network analysis
To gather a more complete picture of the molecular landscape of LARC response to treatment, we integrated proteomics and transcriptomics data from the above-mentioned datasets. Protein-protein interactions (PPIs) were retrieved using IID v.2020-05 13 ; microRNAs-gene interactions were retrieved using mirDIP 4.1, 14 16 retaining only pathways with adjusted P-value (BH method) <.01. Due to KEGG returning mainly disease pathways, this database was removed from the analysis. For network annotation, each gene was annotated with the pathway with the lowest adjusted P-value (BH method) it belonged to. Gene Ontology enrichment analysis was performed using clusterProfiler_3. 16

| Cells maintenance and expansion
The

| Identification of differentially abundant proteins between decellularized healthy rectum and rectal cancer
Matched samples from both HR and RC specimens were decellularized using a detergent-enzymatic treatment. The DNA amount was quantified to confirm nuclear content depletion and cells removal.
Both HR and RC samples were completely decellularized after one DET cycle with a reduction of 95.78% and 95.81%, respectively (P-value <.001 for both; native HR and RC tissue were used as control; Figure 1A Figure 1E). For each protein identified and quantified, a DRC vs DHR ratio has been calculated, and results were expressed as fold change (FC). Venn diagram presented in Figure 1F shows that proteins shared  Figure 1G,H). R and NR patients were discriminated by the PCA analysis. The heatmap analysis showed that NR samples cluster together, while in the R samples, two out of three samples cluster together and one sample clusters separately. Proteins responsible for groups' separation were obtained using PLS-DA as supervised analysis, and the best classification performance was obtained using two components (Accuracy 85%, R2 0.98, Q2 0.43). The VIP score (Variable Importance in Projection) used to identify the important features for model construction is reported in Figure 1I. As it can be seen, COL4A3, COPB1 and MVP proteins were those mostly responsible for the separation between R and NR. A subsequent Significance Analysis of Microarray (SAM), designed to address the false discovery rate, confirms these results, indicating the three proteins as the most significant ones (FDR 0.11, P-values <.001 for all, Figure S1A).

| Network analysis, annotation and validation
The ECM proteomic signature discriminating LARC patients' response to nCRT was integrated with our previously identified genes and micro-RNAs sets in order to build a comprehensive multiomic view, combining transcriptional, posttranscriptional and proteomics results. On these bases, we first generated a protein-protein interaction network between COL4A3, COPB1, MVP (the ECM-derived proteomic set of response to nCRT); and CXCL9, CXCL10, CXCL11, IDO1, MMP12, AKR1C3 and HLA-DRA (a previously-published transcriptional gene set linked to response to nCRT). 20 We identified a total of 77 common interactors able to connect the transcriptomics identified genes to the proteomics ones (Table S1). Subsequently, we selected only those common interactors that are also targets of our previously published micro-RNAs linked to response to nCRT in LARC: miR-125b-5p, miR-299-5p and miR-154-5p. 21 Based on this, we identified the following seven after nCRT compared to TRG 3-5 (NR, red line) (respectively P < .001; P < .001 and P < .01, Figure 2B).

| Pathway enrichment analyses and gene ontology
Performing pathway enrichment analysis using the sets of genes from the transcriptomics, proteomics and seven central genes (combined network), we obtained 50 enriched pathways, mainly linked to inflammation ( Figure S1B and Table S2). We then focused on enriched pathways that included genes from at least two of the three gene sets (transcriptomic, proteomics and seven central genes), and 20 pathways were retained ( Figure S2 and Table S3). Of these, only Reactome  (Table S4). Of these, only "Extracellular matrix regulation," "Extracellular matrix structure regulation" and "Regulation of vasculature development" molecular functions included all three datasets ( Figure 3A).  However, only limited data exist for solid cancers. 23 On this basis, we sought to determine in vitro whether POU2F3 is targeted by entinostat, and whether this causes its upregulation. Subsequently, in order to validate the in silico data available in the three database analyzed, we also selected nadolol as a drug candidate reported to have an opposite effect, on respect to entinostat, on POU2F3. 24 HCT-15,

| Targeting the upstream regulator
HCT-116 and SW480 CRC cell lines exposed to nadolol and entinostat (both at 50 μM) showed respectively a significantly downregulated and upregulated expression of POU2F3 compared to the same lines nontreated (controls, P-value <.035, <.0001 and <.001, respectively, for nadolol; and P-value <.0001, <.0098 and <.001, respectively, for entinostat, Figure 4B-D). As nadolol-derived downregulation of POU2F3 was only tested as control, it was not further considered. Finally, since we hypothesized that the upregulation of POU2F3 could cause the subsequent overexpression of its down-stream regulated genes NFKB1, STAT3 and NR3C1, we confirmed that HCT-116 CRC cell lines exposed to entinostat (at 50 μM for 24 and 48 hours) showed a statistically significant downregulation of NFKB1, but an upregulation of STAT3 and NR3C1 ( Figure S3A).

| Single and combined cytotoxicity evaluation of 5-FU and entinostat treatments
To test the cytotoxic effect of entinostat alone or in combination with LARC chemotherapeutics, 5-FU was chosen as reference drug, since it represents the backbone of all chemotherapy approaches in the treatment of LARC patients. 4 Thus, HCT-15, HCT-116 and SW480 were either treated with 5-FU or entinostat in the range of 0.1 to 100 μM.
As depicted in Figure 5A

| DISCUSSION
Currently, the best treatment option for patients with LARC involves nCRT, followed by curative surgery. Unfortunately, after nCRT, only approximately 20% of LARC patients show a complete pathological response, while in 40%-60% of patients, the response is poor or absent. 25 Thus, finding predictive molecular markers to nCRT would be of great clinical significance to improve the clinical management of LARC patients.
Several studies proposed potential biomarker predictors of response to nCRT in rectal cancer at transcriptomic, posttranscriptomic and proteomic levels. 26 However, none has been successfully clinically applied, mostly due to the difficulty to distinguish whether gene or proteomic expression changes are causative or merely an effect of a complex disease as cancer is. In our study, we overcame these limitations by simultaneously investigating the transcriptomic, posttranscriptomic and proteomic expression data to develop a multiomic holistic approach that integrates multiomic data to reflect the casual relationship between the integrated network and phenotype. 7 First, we integrated the analysis of the tumor ECM, the most abundant compartment of the TME, to the transcriptomic and post- Collagen increases compared to nonmetastatic CRC patients and healthy controls. 29 The MVP protein is involved in protection against cellular stresses, such as DNA-damaging agents, irradiation, hypoxia, hyperosmotic, and oxidative conditions. The export from the nucleus of DNA-damaging drugs could be a molecular function of MVP involved in drug-resistance phenomena. 30,31 Subsequently, in contrast to the conventional methods relying on a single omic profile of the cell, we developed a protein-protein interaction network integrating multiomic data across different cohorts to reveal more likely causal relationships between common interactors and therapy resistance phenotype in LARC patients.
Another Conversely, the biological processes were enriched for ECM organization, extracellular structure organization, regulation of vasculature development and different processes in the regulation of both innate and adaptive immunity. In this landscape it clearly emerges that the deregulated stroma is a hallmark of cancer able not only to contribute to tumor onset and progression but also able to modulate the response to chemotherapy. 38 Considering the growing evidences supporting that the immune landscape of CRC is extremely complex and linked to chemoresistance, it is urgent to understand even better the role of the tumor stroma in the molecular mechanisms of immune response and chemoresistance. 39,40 Third, through investigating the transcription factor-target interactions, we found that POU2F3 could be the hidden-player of the network which, although not directly, acts as a common mediator of the different deregulated molecular mechanisms discussed above, and may be the target of effective drugs. Indeed, we demonstrated that an increased expression of POU2F3 in the rectal cancer cohort of the TCGA elicits a protective role by increasing their patients' survival. POU2F3 (POU class 2 homeobox 3; also known as SKN-1a/OCT-11) is a transcription factor required for the generation of a rare chemosensory cell type found in the gastrointestinal and respiratory tracts known as Tuft cells. 41 Tuft cells' primary function is the release of bioactive immunomodulator molecules in response to external stimuli, and this immune control at the TME level could have a crucial role in determining drug resistance. 42,43 Also, recent studies reported that POU2F3 expression has been used to recognize different types of pulmonary neuroendocrine cancer, including small cell lung cancer (SCLC). 44  In conclusion, in this article we presented a new network-based approach to identify putative hidden biomarkers of complex diseases and novel treatment options by integrating multiomic data with network information. The analysis based on such an integration can potentially lead to new insight into complex diseases at the systems level. Therefore, by combining transcriptomic and posttranscriptomic data sets and proteomic data from LARC patients, we developed a reconstructed interaction network of seven central genes putatively involved in the response to nCRT. We identified several common interactors, which could serve as biomarkers for patient response prediction to nCRT. In addition, the functional enrichment analysis also revealed that these identified targets are strongly related to hallmarks of cancer and inflammation, whose dysfunction may affect LARC patients' response to nCRT. Furthermore, by constructing the transcription factor-module network, we hypothesized a protective role of POU2F3 gene, which could be used as a new drug target in LARC patients. Finally, we identified entinostat as chemical compound which, targeting POU2F3, could be combined with classical therapeutic regimen in order to design a more efficient therapeutic strategy in LARC.

AUTHOR CONTRIBUTIONS
Edoardo D'Angelo and Chiara Pastrello: planned and conducted the study, collected and interpreted the data and drafted the manuscript.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT
The raw mass spectrometry data generated in our study have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD035408. Other data that support the findings of our study are available from the corresponding author upon request.

ETHICS STATEMENT
Our study was conducted according to the principles expressed in the Declaration of Helsinki. Written informed consent was obtained from every enrolled patient and the protocol was approved by the institu-