STAT3-dependent systems-level analysis reveals PDK4 as an independent predictor of biochemical recurrence in prostate cancer

Prostate cancer (PCa) has a broad spectrum of clinical behaviour, hence biomarkers are urgently needed for risk stratification. We previously described the protective effect of STAT3 in a prostate cancer mouse model. By utilizing a gene co-expression network in addition to laser microdissected proteomics from human and murine prostate FFPE samples, we describe STAT3-induced downregulation of the TCA cycle/OXPHOS in PCa on transcriptomic and proteomic level. We identify pyruvate dehydrogenase kinase 4 (PDK4), a key regulator of the TCA cycle, as a promising independent prognostic marker in PCa. PDK4 predicts disease recurrence independent of diagnostic risk factors such as grading, staging and PSA level. Furthermore, PDK4 expression is causally linked to type 2 diabetes mellitus, which is known to have a protective effect on PCa. We conclude that this effect is related to PDK4 expression and that PDK4 loss could serve as a biomarker for PCa with dismal prognosis.


Introduction
Prostate Cancer (PCa) is the second most frequent cancer and the fifth leading cause of death from cancer in men worldwide (Bray et al., 2018). The diagnosis of PCa is largely based on the histopathological evaluation of biopsies, which are graded by the Gleason score (GSC) (Gleason and Mellinger, 1974). In 2005, the GSC was modified by the International Society of Urological Pathology (ISUP) (Epstein et al., 2005), resulting in the ISUP grade, which ranges from I to V (National Collaborating Centre for Cancer, 2014). PCa shows a wide variety in clinical behaviour, ranging from harmless, indolent tumors to aggressive metastatic disease Lotan, 2014, Sathianathen et al., 2018). As a consequence, treatment following biopsy of the prostate is individualized and based on four main criteria: the amount of tumor in the biopsy, the histological GSC/ISUP grading, clinical staging and -to a lesser extent -the level of prostate specific antigen (PSA) in the serum (National Collaborating Centre for Cancer, 2014). Nonetheless, there is a significant risk of over-and undertreatment (Sathianathen et al., 2018), and additional biomarkers for risk stratification are urgently needed.
Molecular characterization reveals PCa as a highly heterogeneous disease with diverse genetic, epigenetic and transcriptomic alterations (The Cancer Genome Atlas Research Network, 2015, Taylor et al., 2010. As a consequence, there is a strong need to define molecular subgroups of PCa to identify potential targets for treatment. In a previous attempt, our group studied the role of signal transducer and activator of transcription 3 (STAT3) in PCa, which turned out to exert tumor suppressor activities (Pencik et al., 2015). Physiologically, STAT3 plays a central role in signal transduction in a wide range of transcriptional pathways. In cancer, it has been shown to promote stem-cell like characteristics of tumor cells, tumor cell survival and proliferation. STAT3 enhances metastatic potential and immune evasion (Huynh et al., 2019). Owing to these facts, it is generally considered an oncogene. However, others have shown that STAT3 may also have tumor suppressive capabilities (Huynh et al., 2019, Pencik et al., 2015. In this study, we have analyzed the system-level effects of STAT3 on the transcriptome and proteome of human PCa samples and a previously established (Pencik et al., 2015) Stat3 PCa mouse model ( Figure   1A). For transcriptomic analyses, we used The Cancer Genome Atlas -Prostate Adenocarcinoma (TCGA PRAD) RNA-Seq dataset (The Cancer Genome Atlas Research Network, 2015) and established a gene coexpression network. We found a negative association of STAT3 expression to the tricarboxylic acid (TCA) cycle / oxidative phosphorylation (OXPHOS) and ribosomal biogenesis. These results were corroborated by findings in shotgun proteomics of laser-microdissected formalin fixed and paraffin embedded (FFPE) PCa material from both human and murine samples. Furthermore, we showed elevated levels of metabolites of the TCA cycle in PtenStat3 pc-/mice, using a targeted metabolomics approach.
Our experimental results therefore support the notion that STAT3 is an important negative regulator of TCA/OXPHOS (Huynh et al., 2019), thereby influencing aggressiveness of PCa. Furthermore, gene expression of PDK4, which inhibits pyruvate oxidation through the TCA cycle and thereby negatively impacts OXPHOS , was significantly downregulated in low STAT3 patients. We show that high PDK4 expression is significantly associated with a lower risk of biochemical recurrence (BCR), and it predicts disease recurrence independent of ISUP grading in low/intermediate risk primary tumors. In addition, PDK4 is an independent predictor of BCR compared to ISUP grading and clinical staging, as well as pathological staging and pre-surgical PSA-levels in primary and metastatic tumors, identifying PDK4 as a promising prognostic marker in PCa.

Figure 1: Identification of STAT3 associated pathways in prostate cancer
A. Overview of transcriptomic (top) and proteomic (bottom) analyses.
B. Overexpression analysis of enriched KEGG pathways of significantly differentially expressed genes between low STAT3 versus high STAT3 groups in TCGA PRAD. See also Table S1, Figures S1 and S2.

Low STAT3 expression in primary PCa is associated with increased OXPHOS and ribosomal biosynthesis
In order to gain insight into the effects of STAT3 expression in primary PCa, we employed two different approaches of analyzing TCGA PRAD RNA-Seq data of 498 patients ( Figure 1A).
Firstly, samples were ranked according to STAT3 expression and split into three groups: "high STAT3" consisted of the 1 -0.8 th quantile (n=100), "low STAT3" of the 0.2 nd quantile (n=100) and "medium STAT3" of all samples in between (n=298). We compared low STAT3 to high STAT3 samples and found 1194 genes to be significantly differentially expressed (log-FC ≥ 1, adj. p-value ≤ 0.05, Table S1). Gene set testing using the Ensemble Of Gene Set Enrichment Analyses (EGSEA) method (Alhamdoosh et al., 2017) (Methods) showed gene sets directly associated with STAT3 signaling to be downregulated ( Figure S1A-B, Table S1).
Interestingly, the hallmark signature gene set for OXPHOS was strongly upregulated, so was the KEGG pathway Ribosome (Table S1). Overexpression analysis of differentially expressed genes showed the ribosome and OXPHOS among upregulated KEGG pathways ( Figure 1B, Figure S2A -B), while Gene Ontologies (GO) Cellular Component (CC) showed enrichment of genes coding for ribosomal subunits, RNA metabolism and protein localization, which are involved in translation (Steitz, 2008) (Table S1).
Secondly, we used Weighted Gene Co-Expression Network Analysis (WGCNA) (Methods) Horvath, 2008, Langfelder andHorvath, 2012) to create a network of co-expressed gene clusters from the entire dataset. Gene clusters consist of groups of genes which are highly interconnected by their high absolute correlation. Gene clusters can be analyzed for common biological motives and for their association with a trait of interest, such as tumor grade, stage or expression of a specific gene. We generated a network from 13,932 genes and 382 patients which resulted in 13 gene clusters that we analyzed for characteristic biological themes by using overexpression analysis (Yu et al., 2012) (Figure 2A and 2B). Some gene clusters indicated specific overexpression of distinct biological motives. Genes in cluster 2 for example, were mainly associated with cellular respiration (mitochondrial respiratory complex assembly, OXPHOS) and RNA splicing. Cluster 3 represented ribosomal translation and protein targeting to the endoplasmic reticulum (ER). Cluster 11 was associated with epigenetic processes (histone and chromatin modification, gene silencing). We subsequently investigated which gene clusters were associated with STAT3 expression by using the cluster eigengene (= the first principal component, Methods) and found strong correlations for three clusters. While genes in "epigenetic"-cluster 11 (Pearson correlation; ρ = 0.59, adj. p-value = 8e-36) showed a positive correlation with STAT3 expression, "OXPHOS"-cluster 2 (ρ = -0.67, adj. p-value = 7e-50) and "ribosomal"-cluster 3 (ρ = -0.74, adj. p-value = 1e-65) were negatively correlated ( Figure 3A and 3B, Table S2). We also investigated the correlation of gene clusters with clinical traits representing tumor aggressiveness. Clusters correlating with the clinical traits BCR, GSC, and the risk groups pathological tumor (pT) and lymph node (pN) staging showed different correlations than those correlated with STAT3 expression ( Figure 3B).
To confirm the negative association of STAT3 with OXPHOS and ribosomal activity, we assessed the correlation of individual genes in cluster 2 and cluster 3 with STAT3. For each cluster, we selected the 50 genes that were most strongly negatively correlated with STAT3 (ρ ≤ -0.6), while at the same time being highly associated with the respective gene cluster (ρ ≥ 0.8). Those of cluster 2 showed enrichment of GO Biological Process (BP) terms belonging to three groups: Oxidative phosphorylation (45.71%), mitochondrial ATP synthesis coupled protein transport (42.86%) and mitochondrial translational elongation (11.43%) ( Figure 3C). In cluster 3, GO BP terms were associated with four groups, consisting of SRP-dependent cotranslational protein targeting to membrane (50%), ribosomal small subunit biogenesis (38.89 %), negative regulation of ubiquitin protein ligase activity (5.56%) and cytoplasmic translation (5.56%) ( Figure 3D).
As a conclusion, both our first and our second analysis suggest a negative correlation of STAT3 expression to genes associated with both increased OXPHOS and ribosomal activity.  A. Graphical representation of the network of cluster eigengenes (= their first principal component).

Dendrogram and heatmap plots show the relationships between the eigengenes and STAT3.
Correlations between cluster eigengenes and the trait are indicated by a color bar ranging from red (1) to blue (-1).
C. and D. Network representation of enriched GO BP terms of top 50 genes most strongly negatively correlated with STAT3 (GS ≤ -0.6, adj. p-value ≤ 0.01) in cluster 2 (C, blue, MM ≥ 0.8, adj. p-value ≤ 0.01) and cluster 3 (D, pink, MM ≥ 0.8, adj. p-value ≤ 0.01), respectively. Node size indicates the percentage of associated genes. Similar colors indicate terms of the same GO group. GS = Gene significance, MM = module membership, GO = Gene Ontology, BP = Biological Process.

Proteomics analysis of human FFPE-samples shows high TCA/OXPHOS in low STAT3 PCa
After the investigation of STAT3-effects on the gene expression level, we examined its impact on the protein level ( Figure 1A). Low and high STAT3 levels were preselected by immunohistochemical (IHC) analyses of STAT3 in patient samples. We conducted shotgun proteomics experiments with FFPE patient material, comparing low STAT3 with high STAT3 PCa and a healthy prostate control group (n= 4 in each group, Methods). To specifically focus on cell autonomous mechanisms, tumor and control material was procured by laser-microdissection (LMD) of prostate epithelial cells, and a label-free quantification (LFQ) approach was used to obtain protein intensities. Samples showed a clear separation of groups after principal component analysis (PCA) ( Figure 4A). We identified 2316 proteins on average (1722 -2610), of which 86 were differentially expressed across all groups (log-FC ≥ 1, adj.p-value < 0.05). Among the 22 proteins we found to be significantly differentially expressed between low STAT3 and high STAT3 groups (log-FC ≥ 1, adj. p-value < 0.05, Figure 4B, Table S3), 13 were mitochondrial proteins. Of interest, succinate dehydrogenase complex iron sulfur subunit B (SDHB) (log-FC = 3.56, adj. p-value = 0.01) and isocitrate dehydrogenase (NADP (+)) 2 (IDH2) (log-FC = 2.31, adj. p-value = 0.0539) were significantly up-regulated.
STAT3 could not be detected in this experiment, which may be due to archived FFPE material, which could impede proteomics coverage. Gene set testing between low STAT3 and high STAT3 showed mitochondrial proteins and metabolic processes to be upregulated (Table S3). Consistent with the results of our TCGA analysis, several metabolic KEGG pathways, among them the TCA cycle and OXPHOS ( Figure 4C, Table   S3), were upregulated.

Proteomics from murine FFPE-samples show increased ribosomal activity in PtenStat3 pc-/--tumors
We wanted to know if proteomics from a PCa mouse model would reflect the results we obtained from human data. We used a previously established genetic PCa mouse model (Alonzi et al., 2001, Pencik et al., 2015, Suzuki et al., 2001, Wu et al., 2001) (Methods) with conditional loss of either Pten (referred to as Whereas Pten pc-/mice show slow, localized tumor progression, the additional deletion of Stat3 leads to rapid tumor growth, dissemination and early death (Pencik et al., 2015). We selected triplicates from each genotype (wild type (WT), Pten pc-/and PtenStat3 pc-/-) and performed LMD and LFQ shotgun proteomics on FFPE tumors and controls (Methods). We were able to detect 2994 proteins on average (2052 -3465), with 1510 being differentially expressed between all three groups (log-FC ≥ 1, adj. p-value < 0.05). PCA showed a clear separation between groups, and STAT3 was the strongest differentially expressed protein in PtenStat3 pc-/--compared to Pten pc-/tumors (log-FC = -5.34427, adj.p-value = 0.0005; Figure 4a-b Table S4).
Comparing PtenStat3 pc-/--to Pten pc-/mice, we found significant upregulation of KEGG pathways associated with ribosome and protein processing in ER. In addition, PI3K-Akt signaling was upregulated, whereas several pathways related to immune response were downregulated ( Figure 4D). Gene set testing on GO BP terms, comparing PtenStat3 pc-/--to Pten pc-/tumors, showed upregulation of ribosome biogenesis, translational initiation, rRNA metabolic process, protein localization to ER and establishment of protein localization to ER, among others (Table S4). STAT3-associated regulation of ribosomal activity on protein level was consistent with our human TCGA samples and corresponded to gene cluster 3. In summary, our experimental results indicate a STAT3-dependent repression of ribosomal biogenesis and translation, which suggests that low expression of STAT3 or Stat3 loss is associated with more aggressive tumors with a high need of energy supply for growth, dissemination and metastasis (Donati et al., 2012).
For comparison with our human proteomic data, we performed gene set testing with a subset of metabolic KEGG pathways, including TCA/OXPHOS, which were shown to be enriched in human proteomic samples (Table S3). We found the TCA cycle to be also significantly upregulated in PtenStat3 pc-/mouse tumors (Table S4). Since differences in log-FCs were not high between groups, we sought to additionally investigatie Stat3-dependent changes in the TCA cycle activity on metabolite level.

Metabolomics show increased TCA cycle activity in PtenStat3 pc-/mouse tumors
In order to assess Stat3-dependent changes in TCA cycle metabolite levels, we performed a targeted metabolomics experiment on WT, Pten pc-/and PtenStat3 pc-/mice (with biological replicates n= 5 for WT and Pten pc-/and n= 3 for PtenStat3 pc-/-, see Methods). Hence, we measured absolute amounts (nmol/µg) of pyruvate, citrate, α-ketoglutarate, succinate, fumarate and malate in mouse prostate tumors and WT prostates. PtenStat3 pc-/prostate tumors showed significantly higher amounts of pyruvate (Anova with TukeyHSD, adj. p-value = 0.01), fumarate (adj. p-value = 0.027) and malate (adj. p-value = 0.029) compared to WT tumors ( Figure 5A, Table S5). Citrate levels were not statistically different between groups (Table   S5). Succinate was the only metabolite with lower amounts in PtenStat3 pc-/--or WT-compared to Pten pc-/prostates. Whereas there was a trend of upregulation of TCA cycle metabolites between WT and Pten pc-/-, only succinate showed a significant difference (adj. p-value = 0.023). Generally, measured metabolite concentrations showed a trend to be higher in PtenStat3 pc-/compared to Pten pc-/mice, but due to the high variability in metabolite levels of Pten pc-/mice, significance was not reached in these samples. The results of all ANOVAS with Tukey multiple comparisons of means can be found in the supplementary data (Table   S5).

Figure 5: Increased TCA/OXPHOS is associated with tumor aggressiveness in PCa.
A. Boxplot representing metabolite concentrations in nmol/µg of 5 metabolites in WT, Pten pc-/and PtenStat3 pc-/prostates. Jitter represents biological replicates. ANOVA test and Tukey multiple comparisons were performed to assess significance. N.s. values are not stated due to better readability, but all p-values can be found in Table S5. Colors represent groups (Red (PtenStat3Ko) = PtenStat3 pc-/-, blue (PtenKo) = Pten pc-/-, grey (WT) = wild type). N.s. = not significant. X-axis indicates the respective metabolites, y-axis the metabolite concentration in nmol/µg.
B. and C. Boxplots representing SDHB (B) and IDH2 (C) protein expression levels detected by IHC. Jitter represents single values in groups. Kruskal-Wallis test and Dunn's all pairs test were performed to assess significance. GL = Gleason grade, IHC = Immunohistochemistry.

Increased TCA/OXPHOS is associated with tumor aggressiveness in PCa
By reason of our results of the proteomic analyses, we re-analyzed our TCGA RNA-Seq data for TCA/OXPHOS pathways. We used a subset of metabolic KEGG pathways that was significantly enriched in human proteomic samples for gene set testing of low STAT3 vs. high STAT3 groups. Besides OXPHOS, we also found the TCA cycle to be significantly enriched (Table S1).
It is established, that PCa tumorigenesis is accompanied by enhanced TCA/OXPHOS activity (Costello and Franklin, 2006, Costello et al., 1997, Cutruzzolà et al., 2017. Our data suggest a STAT3-dependent downregulation of TCA/OXPHOS, supporting the notion of a tumor suppressive function of STAT3 in primary PCa. To compare TCA/OXPHOS enzyme levels between PCa and healthy prostate, we performed immunohistochemical (IHC) stainings of a tissue microarray (TMA) consisting of primary PCa and adjacent tumor free tissue from 83 patients (Methods). We stained for SDHB and IDH2. Together with SDHA, SDHC and SDHD, SDHB forms the succinate dehydrogenase (SDH) complex (or respiratory complex II, CII), which is located in the inner mitochondrial membrane. SDH/CII participates in both the TCA cycle by oxidizing succinate to fumarate and OXPHOS by shuttling electrons. IDH2 is the TCA cycle enzyme that converts isocitrate to α-ketoglutarate (Anderson et al., 2018, Stelzer et al., 2016. Both SDHB and IDH2 showed higher expression levels in tumors than in normal tissue (Kruskal Wallis test and Dunn's all pairs test: SDHB: adj. p-value = 1.4e-05, IDH2; adj. p-value. = 5.3e-07). Moreover, GL5 areas showed a stronger expression of both SDHB (adj. p-value = 0.00044 to GL3 and 0.00014 to GL4) and IDH2 (adj. p-value = 2.4e-05 to GL3 and 0.00461 to GL4), when compared to GL 3 or 4 areas ( Figure 5B-C). These data confirm that increased TCA/OXPHOS is associated with tumor aggressiveness in PCa.

Low-PDK4 expression is significantly associated with earlier disease recurrence in PCa
Considering that low STAT3 expression correlates with increased TCA cycle activity and enhanced OXPHOS, we were looking for differentially expressed genes that might cause this effect. Pyruvate dehydrogenase kinase 4 (PDK4) was significantly downregulated in low STAT3 RNA-Seq samples (log-FC = -1.126, adj. p-value = 1.47E-07); it is known to inhibit metabolic flux through the TCA cycle and thereby downregulate OXPHOS (Jeoung, 2015;Zhang et al., 2014) Figure 6B) and in primary and metastatic tumors combined (beta: -0.9815, HR: 0.3747, p.-value: 1.87e-06, Figure 6C). When compared to diagnostic risk factors, it predicted BCR in low/intermediate risk primary tumors (= clinical staging T1c-T2c) independent of ISUP grades (multivariate Cox proportional hazards model, Figure 7A). In addition, PDK4 was a significant predictor independent of ISUP grading and clinical tumor staging, as well as pathological tumor staging and pre-surgical PSA-levels in primary and metastatic tumors combined ( Figure   7B-C). PDK4 expression was also significant relative to the occurrence of chemotherapy, hormone therapy and radiation therapy ( Figure 7D).
Considering the possibility of a dataset-specific effect of PDK4 expression, we additionally tested 4 other datasets with the survExpress tool (Methods) (Aguirre- Gamboa et al., 2013). They all showed a similar trend: low-PDK4 patients have a higher chance for earlier BCR or death. The Sboner Rubin Prostate dataset (Sboner et al., 2010) consists of survival data of 281 patients with primary PCa from a watchful waiting cohort with up to 30 years clinical follow up. In this cohort, patients in the low-PDK4/high-risk group, had a higher chance of earlier death (Risk Groups HR = 1.4 (confidence interval (CI) 1 ~ 1.98), p-value = 0.05, Figure S3A). Difference in survival for PDK4 groups was significant for Gleason 6 (Risk Groups HR = 2.92 (CI 1.16 ~ 7.36), p-value = 0.023, Figure S3B) and 8 (Risk Groups HR = 3.06 (CI 1.06 ~ 8.85), p-value = 0.039, Figure S3D) and has a p-value = 0.057 in Gleason 7 (Risk Groups HR = 1.85, (CI 0.98 ~ 3.51), Figure   S3C). It has to be considered, that due to the advanced age of PCa patients (85% of all cases are diagnosed in patients >65 years (National Collaborating Centre for Cancer, 2014)) and the duration of the follow-up period of up to 30 years, patients in this study might possibly have suffered from multiple co-morbidities, which also affect survival time.
We also analyzed the TCGA-PRAD dataset (The Cancer Genome Atlas Research Network, 2015) which includes data on survival time, but no reliable information on BCR. Since the overall number of patient deaths is only 10 in this dataset, statistical significance was not reached (HR = 3.14, (CI 0.65 ~ 15.11), pvalue = 0.15). Nevertheless, 8 out of 10 patients who died are in the low-PDK4/high-risk group after a median split of sample groups ( Figure S4A). We tested two additional datasets -Gulzar (Gulzar et al., 2013) and Lapointe (Lapointe et al., 2004) -that are considerably smaller (n= 89 with 24 events in Gulzar and n = 29 with 7 events in Lapointe). In those, PDK4 did not reach significance, presumably because of the smaller sample sizes. However, there is a clear trend of low PDK4 showing risk of earlier BCR ( Figures   S4B-C). In conclusion, all datasets revealed a trend of low-PDK4 patients having a higher risk for earlier BCR or death.
Other tested candidates that were linked to TCA/OXPHOS or regulated ribosomal pathways, such as hypoxia-inducible factor-1α (HIF-1α) and CCR4-NOT transcription complex subunit 1 (CNOT1), were not predictive of increased risk to earlier disease recurrence ( Figure S5A and S5B).

Discussion
In this study we have used a gene co-expression network analysis in addition to proteomics from laser microdissected human and murine FFPE samples to identify PDK4 as a highly relevant independent candidate prognostic marker in PCa. We here for the first time demonstrate that PCa patients with low PDK4 expression have a higher risk of earlier disease recurrence, independent of ISUP grading and tumor staging.
Moreover, specifically in low-intermediate risk T1c-T2c tumors, PDK4 proves to be a significant predictor of earlier BCR in the dataset tested, independent of ISUP grading. Therefore, PDK4 is a strong candidate marker for risk stratification of the large group of T1c-T2c tumors, which are prone to over-or undertreatment. The influence of PDK4 on the PCa disease course may also have an impact on the treatment of type 2 diabetes mellitus (T2DM) with drugs targeting PDK4 as discussed below.
Due to the slow clinical progression rate of PCa, BCR is generally used for risk determination. On account of the protracted nature of a prospective study we evaluated the effect of PDK4 on PCa BCR retrospectively.
We therefore believe that it might be beneficial to conduct additional prospective studies to support the postulated effects of PDK4 on PCa outcome.
The generation of mitochondrial adenosine triphosphate (ATP) through aerobic respiration via TCA/OXPHOS is the primary source of energy in most normal cells (Hanahan andWeinberg, 2011, Stacpoole, 2017). Prostate epithelial cells, however, are characterized by a physiological downregulation of TCA/OXPHOS, caused by citrate secretion and zinc accumulation in the cell (Costello and Franklin, 2006, Costello et al., 1997, Cutruzzolà et al., 2017. This is due to the highly specialized role of prostate epithelial cells, which excrete citrate-rich prostatic fluid (Costello et al., 1997). In most cancers, malignant transformation is accompanied by a shift from aerobic respiration via TCA/OXPHOS to aerobic glycolysis, an event also known as the Warburg effect (Hanahan andWeinberg, 2011, Stacpoole, 2017). Primary PCa cells, however, do not show the Warburg effect. On the contrary, the malignant shift from healthy prostate cells to primary PCa cells involves the upregulation of TCA/OXPHOS (Cutruzzolà et al., 2017, Costello andFranklin, 2006). PDK4, which is part of the pyruvate dehydrogenase kinase (PDK) family, plays a central role in the regulation of TCA/OXPHOS . PDK4 phosphorylates the pyruvate dehydrogenase complex (PDC) subunits and thereby inhibits the formation of acetyl-coenzyme A from pyruvate. This leads to a down-regulation of metabolic flux through the TCA cycle , Jeoung, 2015, Stacpoole, 2017. PDK4 is a known STAT5 target gene (White et al., 2007) and a putative STAT3 target gene: ENCODE ChIP-Seq against STAT3 in human HeLa-S3 cells shows binding to the promoter region of PDK4 (Chen et al., 2013, Davis et al., 2018, Kuleshov et al., 2016, The Gene Ontology Consortium, 2018. Thus, the gene expression profile of low STAT3 PCa samples, showing PDK4 downregulation and TCA/OXPHOS upregulation, reflects a PCa-specific metabolic setting and emphasizes the increased aggressiveness of those cancers compared to high STAT3 tumors. The crucial role of PDK4 as inhibitor of PDC activity renders it important as a target gene in many cancers and metabolic disorders (Jeoung, 2015, Yamane et al., 2014. In non-prostate cancer cells, high PDK4 facilitates the transition from OXPHOS to aerobic glycolysis and is therefore considered a risk factor enhancing the Warburg effect . High PDK4 is associated with poor survival in breast cancer (Guda et al., 2018) and increased cell growth in bladder cancer cell lines (Woolbright et al., 2018). In addition to direct interaction with PDC, PDK4 has been shown to enhance the Warburg effect via mammalian target of rapamycin (mTOR) and HIF-1α. In mouse embryonic fibroblasts (MEFs) and Eker leiomyoma tumor-3 (ELT3) cells, Liu et al. show, that PDK4 activates mTOR signaling via cAMP-response element-binding Protein (CREB) and Ras homolog enriched in brain (RHEB) . The mTOR effector HIF-1α and its downstream target pyruvate kinase isozyme M2 (PKM2) were elevated in PDK4 overexpressing cells and reduced in PDK4 knockdown cells. Both HIF-1α and PKM2 have been known to modulate key processes required for the Warburg effect (Courtnay et al., 2015).
Conversely, in cancer cells that have undergone tumor progression via epithelial-mesenchymal transition (EMT), a low-PDK4-mediated metabolic shift from glycolysis to OXPHOS was reported, and knockdown of PDK4 was sufficient to induce EMT in human non-small cell lung cancer (NSCLC) cell lines (Sun et al., 2014). In accordance with these findings, Sun et al. show reduced overall survival of NSCLC patients with low PDK4 expression. Yang et al. show that downregulation of PDK4 is associated with earlier recurrence and lower survival time in hepatocellular carcinoma (Yang et al., 2019). In concurrence with our data, Chen et al. (Chen et al., 2018) show that prostate tumors exhibit higher gene expression and higher protein levels of both PDC subunit pyruvate dehydrogenase A1 (PDHA1) and the PDC activator pyruvate dehydrogenase phosphatase 1 (PDP1). Mengual et al. find PDK4 to be significantly higher expressed in both tumors and post-prostatic massage urine samples from PCa patients compared to the respective control groups (Mengual et al., 2014).
PDK4 is elevated in patients with T2DM and other metabolic disorders involving insulin resistance, such as obesity (Kulkarni et al., 2012, Lee, 2014, Stacpoole, 2017. Furthermore, it has been established that the development of diabetes and insulin resistant states is causally linked to PDC inhibition through PDK upregulation (Stacpoole, 2017, Jeoung, 2015. Therefore, the PDC/PDK4 axis is an important therapeutic target in the treatment of both diabetes and cancer. Dichloroacetate (DCA), for example, is a PDK inhibitor that is most active against PDK2, but also against PDK1 and PDK4, and has been used as an investigational drug for over 30 years in diabetes, cancer and other diseases (Stacpoole, 2017). It has been subjected to a large number of clinical trials but has lacked pharmaceutical support due to its non-patentability. Attempts to synthesize similar small-molecule inhibitors of PDK were made, but failed in clinical trials (Stacpoole, 2017). Nonetheless, PDK inhibitors remain promising targets in the treatment of T2DM and cancer.
T2DM is reported to have a protective effect on the development of PCa (Baradaran et al., 2009, Choi et al., 2016. Our indication of a beneficial effect of PDK4 upregulation on the clinical course of PCa suggests that a T2DM-induced inhibition of the TCA cycle via the PDC/PDK4-axis is a potential cause of this protective effect, thereby leading to an indolent disease or potentially preventing the development of high risk PCa. As a note of caution, our findings suggest that therapeutic targeting of PDK4 in patients with both T2DM and PCa may result in increased tumor aggressiveness. Hence, our data may be of high clinical importance. For proteomic characterization, we used human FFPE-material up to 21 years old. Patient material is routinely processed and stored as FFPE specimen for diagnostic purposes in the clinic. Academic pathological institutions possess large archives of fully annotated FFPE-patient material that can be used for retrospective research. Recent developments in LC-MS/MS sample preparation protocols have made it possible to use those archives for proteomic sample-characterization (Ostasiewicz et al., 2010, Wisniewski et al., 2013. This is particularly useful in diseases such as PCa, that frequently have a protracted clinical progression rate and may take years for disease recurrence and development to metastatic disease to occur. Previous proteomic studies focused on the characterization of the proteome (and transcriptome) of primary prostate cancer (Iglesias-Gato et al., 2016, Sinha et al., 2019, on proteomic and transcriptomic disease evolution (Latonen et al., 2018) or on the establishment of a diagnostic panel via machine learning (Kim et al., 2016). In contrast to the above studies, our approach is focused on the specific effects of loss of the key transcription factor STAT3 in prostate cells only.
Although STAT3-signaling is linked to various regulatory events causing increased proliferation, stemness and inflammation and therefore has oncogenic properties, STAT3 can also act as tumor suppressor (Huynh et al., 2019). The deletion of Stat3 in prostate epithelial cells in a loss of Pten PCa mouse model leads to increased tumor growth and early death (Pencik et al., 2015). It is well established that STAT3 is able to control the activity of mitochondria, the electron transport chain (ETC) and the ER both via transcriptional control and via direct binding to these cell compounds (Avalle et al., 2018, Huynh et al., 2019, Poli and Camporeale, 2015, Wegrzyn et al., 2009). On the one hand, STAT3 expression is associated with increase in glycolysis and the suppression of the ETC (Wegrzyn et al., 2009, Huynh et al., 2019. Specifically, the activation of STAT3 is linked to the induction of HIF-1α, which suppresses OXPHOS and reprograms TCA (Camporeale et al., 2014, Demaria et al., 2010, Niu et al., 2008, Pawlus et al., 2014, Poli and Camporeale, 2015. Likewise, HIF-1 was shown to transcriptionally upregulate PDK4 (Courtnay et al., 2015). On the other hand, however, STAT3 can be directly associated with mitochondrial complexes, improving ETC activity and transcription of mitochondrial genes (Huynh et al., 2019, Wegrzyn et al., 2009. Our data support the former concept of STAT3-dependent downregulation of TCA/OXPHOS in accordance with HIF-1α upregulation and additionally suggest inhibition of the TCA cycle via PDK4. We did not find a significant influence of HIF-1α on the relapse of PCa, nor was CNOT1, which is associated with inhibition of ribosomal translation initiation, significantly associated with a worse survival outcome.
In summary, the present study uses a systems-biology approach to unveil the effects of loss of STAT3 in PCa. In this setting we show an association of STAT3 to TCA/OXPHOS, ribosomal biogenesis and translation. Our data do not only substantiate previous research on the effects of STAT3, but in a prostate specific context also explain the tumor suppressive functions of STAT3 from a cell autonomous point of view. We here identify PDK4 as a promising independent prognostic marker for PCa which will facilitate to separate good from bad prognostic PCa. In addition, the known protective effect of diabetes on the development of PCa could be owing to the upregulation of PDK4. Since PDK4 inhibitors are promising therapeutics in T2DM, these drugs may potentially increase the aggressiveness of PCa. Therefore, our

LK is a member of the scientific advisory board of CBmed -Center for Biomarker Research in Medicine
GmbH. MB and WW are members of the Scientific Board of CBmed.

CONTACT FOR REAGENT AND RESOURCE SHARING
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Lukas Kenner (lukas.kenner@meduniwien.ac.at).
Only primary tumor samples (n=489) were selected. For data pre-processing, R package edgeR v.3.24.3  was used. Raw data were transformed to counts per million (cpm) values and genes that were expressed in less than 70% of samples were omitted. Gene expression distributions were normalized using weighted trimmed mean of M-values (TMM) method (Robinson and Oshlack, 2010).

Weighted gene co-expression network analysis (WGCNA)
TCGA PRAD RNA-Seq data were used to generate a weighted gene co-expression network with WGCNA v.1.66 R package as described by Horvath, 2008, Langfelder andHorvath, 2012). For creation of a trait matrix, TCGA PRAD clinical data were acquired via GDC Legacy Archive. Patients without information on disease recurrence were excluded. Following clinical traits were used for analyses: Biochemical disease recurrence (BCR), pathological tumor staging (pT), pathological lymph node staging (pN) and histological grading with Gleason Score (GSC). Pathological staging was split into low to intermediate risk (indicated as 1) and high to very high risk (indicated as 2) groups. For pT, the low to intermediate risk group consisted of T2abc-and the high to very high risk group of T3-T4 samples.
For pN, low to intermediate risk was assigned to N0 samples, high to very high risk to N1 samples. The emergence of BCR was indicated as 1, otherwise as 0. GSCs were not split into groups. STAT3-expression was included from RNA-Seq data.
RNA-Seq data was acquired and prepared as described above. Only samples with matching trait data were used for network creation (n=397). Gene expression data was voom-transformed with limma v.3.38.3 R package (Ritchie et al., 2015, Law et al., 2014 and outliers were removed by sample clustering. 382 samples and 13932 genes were used for network construction. First, a correlation matrix was created using biweight midcorrelation of genes. Second, an adjacency matrix was established from the correlation matrix with a soft thresholding power beta of 6. Third, a topological overlap matrix (TOM) was calculated from the adjacency matrix (Zhang and Horvath, 2005). The TOM provides information on the interconnectedness of genes by a similarity measure: it indicates, whether two genes share co-expression to a similar set of other genes Horvath, 2005, Yip andHorvath, 2007). For the creation of gene clusters (= modules), hierarchical clustering based on TOM-based dissimilarity was performed. Minimum gene cluster size was set to 30. Genes which did not belong to any cluster were summarized as cluster 13. To compare expression profiles of gene clusters, the 1 st principal component (= module eigengene (ME)) of each cluster was calculated and clusters with similar eigengenes (r > 0.75) were merged. Genes in each gene cluster were tested for overrepresentation of GOs and KEGG pathways with clusterProfiler v.3.10.1 (Yu et al., 2012). GOs CC, Molecular Function (MF) and BP were tested separately. Significance was defined by an adj. p-value ≤ 0.05, adjustment method was Benjamini-Hochberg.
Gene clusters were associated to external traits by correlating MEs with trait data (= cluster-trait correlation) by Pearson correlation. Student asymptotic p-values for given correlations were adjusted by Benjamini-Hochberg method. Likewise, correlation of each gene to both the respective gene cluster (= module membership, MM) and STAT3-expression (= Gene significance, GS) was calculated by Pearson correlation. Student asymptotic p-values were calculated and adjusted with Benjamini-Hochberg method. Significance was defined by an adj. p-value ≤ 0.05.

Human tissue micro array (TMA) generation
For generation of a TMA, we used FFPE-material from a patient cohort of 83 patients with primary PCa who underwent radical prostatectomy from 1993 to 2003. The TMA consists of 2 spots from tumor, prostatic intraepithelial neoplasia (PIN) and normal prostate areas from the same patient. Whole mount prostate FFPE-blocks were sliced into 3 µm thick sections, mounted on slides and stained with hematoxylin and eosin. Subsequently, a pathologist marked the respective areas on the slides. To generate the TMA, cores of 2 mm diameter were cut out of the donor block and placed into the recipient TMA block using a manual tissue arrayer (Beecher Instruments). Tissue sections (3 µm thick) were placed onto superfrost slides.

Immunohistochemistry (IHC)
IHC was performed on FFPE TMAs using consecutive sections. The following antibodies were used: anti-IDH2 (rabbit polyclonal, 1:100 dilution; Proteintech, 15932-1-AP, RRID: AB_2264612) and anti-SDHB (mouse monoclonal, 1:100 dilution; Abcam; ab14714, RRID: AB_301432). Staining was performed using the Benchmark Ultra automated staining system (Ventana, Roche). The procedure was conducted as follows: heat pre-treatment, antigen retrieval with CC1 buffer (pH 8.5) for 64 min., incubation with the antibody for 32 min. and counterstaining with hematoxylin and bluing reagent for 8 min. each. After automated staining, the slides were washed with water, then dehydrated in increasing concentrations of ethanol (70%, 80%, 96%, absolute alcohol) until xylol, covered with the mounting medium Shandon Consul-Mount TM (Thermo Scientific) and analyzed by standard light microscopy. Antibodies were validated for FFPE IHC. As positive controls, human colon cancer for IDH2 and human muscle tissue for SDHB were used.

Sample selection and preparation for laser microdissection (LMD)
From the TMA and patient cohort described above, STAT3 protein expression was quantified by a pathologist after IHC staining (Pencik et al., 2015). We selected 7 patients with no STAT3 expression (0 positive cells) as low STAT3 group and 7 patients with ≥ 11% positive cells as high STAT3 group (see also supplementary information). Additionally, 7 healthy prostate FFPE samples were included as control group, stemming from BCa patients. To facilitate LMD, we created a TMA for each patient. Whole mount prostate FFPE-blocks were sliced into 3 µm thick sections, mounted on slides and stained with hematoxylin and eosin. A pathologist marked tumor areas with GL 4 or 5 on the slides. For each patient, a TMA block was created with 2 mm diameter spots using a manual tissue arrayer (Beecher Instruments).
For LMD of murine samples, FFPE tumor material was used from WT, Pten pc-/and PtenStat3 pc-/mice (n=3 for each genotype). Blocks were sliced into 3 µm thick sections, mounted on slides and stained with hematoxylin and eosin. Tumor areas were marked by a pathologist. Since mouse tumors are much smaller and contain only few stroma compared to human PCa, there was no need to create sample-TMAs for LMD.

LMD for proteomic analysis
For LMD of human samples, a Palm Zeiss Microbeam 4 was used. Sample TMA blocks were cut into 10 μm thick sections and mounted on superfrost slides. For LMD of mouse samples, a Leica LMD6000 was used. Tissue blocks were cut into 10 μm thick sections and mounted on membrane slides (PEN Membrane, 2.0 µm, Leica). LMD was conducted similarly for mouse and human samples: for each sample, a slide was stained with hematoxylin and eosin for inspection before LMD. To obtain the minimum amount of tissue (100 nl = 0.1 mm³) necessary for consecutive LC-MS/MS analysis, at least 10 mm² of target area was lasermicrodissected. To obtain proteomic profiles solely from the tumor, stroma and immune cells were excluded from dissection. Microdissected FFPE-samples were stored at -20°C before LC-MS/MS analysis.

Protein extraction and enzymatic digestion
100 nl (10 mm 2 of 10 µm slides) of FFPE-material per sample was used for analysis. Lysis of microdissected tissue was carried out in 50% trifluoroethanol (TFE), 5 mM dithiothreitol (DTT), 25 mM ammonium bicarbonate (ABC) at 99°C for 45 min. followed by 5 min. sonication (Bioruptor, Diagenode). After centrifugation at 16,000 g for 10 min., the cleared protein lysate was alkylated with 20 mM iodoacetamide for 30 min. at room temperature. Upon vacuum centrifugation, digestion was carried out in 5% TFE, 50 mM ABC to which 0.15 µg of LysC and 0.15 µg of trypsin were added for digestion overnight at 37°C. The following day, digestion was arrested by adding trifluoroacetic acid (TFA) to 1% and the digestion buffer removed by vacuum centrifugation. Peptides were suspended in 2% acetonitrile, 0.1% TFA and purified on C18 StageTips. Finally, purified peptides were resolved in 2% acetonitrile, 0.1% TFA and the entire sample was injected for MS analysis in a single shot measurement. Protocols were adapted from Roulhac et al. andWang et al. (Wang et al., 2005, Roulhac et al., 2011).

LC-MS/MS analysis
LC-MS/MS analysis was performed on an EASY-nLC 1000 system (Thermo Fisher Scientific) coupled online to a Q Exactive HF mass spectrometer (Thermo Fisher Scientific) with a nanoelectrospray ion source (Thermo Fisher Scientific). Peptides were loaded in buffer A (0.1% formic acid) into a 50cm long, 75 μm inner diameter column in house packed with ReproSil-Pur C18-AQ 1.9 μm resin (Dr. Maisch HPLC GmbH) and separated over a 270 minute gradient of 2-60% buffer B (80% acetonitrile, 0.1% formic acid) at a 250 nl/min flow rate. The Q Exactive HF operated in a data dependent mode with full MS scans (range 300-1,650 m/z, resolution 60,000 at 200 m/z, maximum injection time 20 ms, AGC target value 3e6) followed by high-energy collisional dissociation (HCD) fragmentation of the five most abundant ions with charge ≥ 2 (isolation window 1.4 m/z, resolution 15,000 at 200 m/z, maximum injection time 120 ms, AGC target value 1e5). Dynamic exclusion was set to 20s to avoid repeated sequencing. Data were acquired with the Xcalibur software (Thermo Scientific).

LC-MS/MS data analysis
Xcalibur raw files were processed using the MaxQuant software v.1.5.5.2 (Cox and Mann, 2008), employing the integrated Andromeda search engine (Cox et al., 2011b) to identify peptides and proteins with a false discovery rate (FDR) of < 1%. Searches were performed against the Human or Mouse UniProt database (August 2015), with the enzyme specificity set as "Trypsin/P" and 7 as the minimum length required for peptide identification. N-terminal protein acetylation and methionine oxidation were set as variable modifications, while cysteine carbamidomethylation was set as a fixed modification. Matching between runs was enabled in order to transfer identifications across runs, based on mass and normalized retention times, with a matching time window of 0.7 min. Label-free protein quantification (LFQ) was performed with the MaxLFQ algorithm (Cox et al., 2014, Schaab et al., 2012, Tyanova et al., 2016a, Tyanova et al., 2015, Cox et al., 2011a, Cox and Mann, 2008 where a minimum peptide ratio count of 1 was required for quantification. Data pre-processing was conducted with Perseus software (Tyanova et al., 2016b); v.1.5.8.6 was used for human data and v.1.5.5.5 for mouse data. Data was filtered by removing proteins only identified by site, reverse peptides and potential contaminants. After log2 transformation, biological replicates were grouped and outlier removal was conducted for human samples. For human samples, we pursued analysis with the 4 most similar samples of each group after unsupervised hierarchical clustering. For mouse samples, we continued analyses with three replicates per group. LFQ intensities were filtered for valid values with a minimum of 3 valid values per group, after which missing data points were replaced by imputation. The resulting datasets were exported for further statistical analyses using R. Filtered, normalized and log2 transformed data were imported and PCA and unsupervised hierarchical clustering was performed. Plots were generated with ggplot2 v.3.1.1. (Wickham, 2016.), gplots v.3.0.1.1 (Gregory R. Warnes et al., 2019) and EnhancedVolcano v.1.0.1 (Blighe, 2019) R packages. Differential expression was conducted as described in the "Statistical Information" section on differential expression analysis.

Standards and solvents
Acetonitrile (ACN), methanol (MeOH) and water were of LC-MS grade and ordered at Fisher Scientific (Vienna, Austria) or Sigma Aldrich (Vienna, Austria). Ammonium bicarbonate, ammonium formate and ammonium hydroxide were ordered as the eluent additive for LC-MS at Sigma Aldrich. Formic acid was also of LC-MS grade and ordered at VWR International (Vienna, Austria). Sodium hydroxide (NaOH) was ordered from Sigma Aldrich (Vienna, Austria). Metabolite standards were purchased from Sigma Aldrich (Vienna, Austria) or Carbosynth (Berkshire, UK).

Sample preparation of mouse organs
Analysis was conducted with n = 5 biological replicates of wild type and Pten pc-/mice and n= 3 biological replicates of PtenStat3 pc-/mice. For wild type and Pten pc-/-, technical replicates (n=3) were made from 2 biological samples in each group. For PtenStat3 pc-/mice, technical replicates (n=3) were made for all biological samples. The prostates of sacrificed 19 week old mice were immediately collected, quickly washed in fresh PBS, snap-frozen in liquid N2, and stored on dry ice in Petri-dishes until extraction. Tissue pieces were transferred into glass vials and 50 µl of fully 13 C labelled internal standard from ISOtopic solutions e.U. (Vienna, Austria) and 950 µl extraction solvent were added (80% MeOH, 20% H2O, both LC-MS-grade (Sigma Aldrich, Vienna, Austria)). Subsequently, the tissue was homogenized with a probe sonicator head (Polytron PT 1200E handheld homogenizer, Kinematica) in the extraction solvent. After homogenization, the contents of the glass tubes were transferred to a 2 ml Eppendorf-tube and the glass tubes were washed two more times with 500 µl extraction solvent to transfer all tissue content to the Eppendorf-tube. The Eppendorf-tubes were thoroughly vortexed and kept on dry ice during the processing of other samples.
The samples were centrifuged (14,000 g, 4°C, 20 min), four 400 µl aliquots were extracted into LC-vials and 3x 100 µl were used for pooled quality controls (QC) for each sample, respectively. Remaining extraction solvent on the pellets was discarded. Aliquots were evaporated until dryness in a vacuum centrifuge. The dried samples and the high molecular pellets were stored at -80°C until measurement.

Quantification of metabolites with LC-HRMS
Before the LC-HRMS analysis, the samples were reconstituted in water with thorough vortexing, diluted either 1 to 100 or 1 to 40 in water, and adjusted to be in total of 500 µl 50: 50 H2O: ACN.
Quantification was carried out by external calibration using U13C-labelled internal standards. The internal standard always originated from the same aliquot as used for the extraction, and was diluted to the same extent as the sample.
Sample measurements were randomized, and within every 10 injections a pooled QC sample, a QC with standards and a blank was injected. HRMS was conducted on a high field Thermo Scientific™ Q Exactive HF™ quadrupole-Orbitrap mass spectrometer equipped with an electrospray source. Full mass scan data with resolution of 120 000, maximum injection time (IT) of 200 ms, automatic gain control (AGC) target of 1e6 in the mass range of 65-900 m/z was acquired with positive-negative-polarity switching.
Targeted analysis of the metabolomics data was carried out with Thermo Trace Finder 4.1 software. In all cases the [M-H]ion was extracted with 5 ppm mass tolerance.

Quantification of total protein content from pellet
The pellets resulting from extraction with 80% MeOH were dissolved in 0.2M NaOH solution, diluted 1:10 and quantified for total protein content with the Micro BCA Protein Assay kit from Thermo (Rockford, USA), according to the manufacturer's instructions.
Absolute metabolite amounts were normalized to the protein content. If multiple technical replicates were available from the same organ, the sum of metabolite concentrations was calculated for each metabolite and it was normalized with the sum of total protein content for that organ. This way, absolute metabolite amounts (nmol) were normalized to the total protein content (µg) from the tissue of origin.

QUANTIFICATION AND STATISTICAL ANALYSIS Statistical analysis of WGCNA clusters
Statistical analyses were conducted as described in the section "Weighted gene co-expression network analysis (WGCNA)". Generally, correlations were assessed by Pearson correlation and Students asymptotic p-values were calculated. P-values were adjusted by Benjamini-Hochberg method. Significance was defined as adj. p-value ≤ 0.05.

Differential expression analysis
Differential gene and protein expression analysis was conducted using limma v.3.38.3 (Ritchie et al., 2015) R package. Limma uses linear models and borrows information across genes using empirical Bayes method and is therefore applicable for analyses of high-dimensional omics data with limited sample size. RNA-Seq data was transformed using voom. Proteomic differential expression was calculated using the algorithm for single-channel microarray gene expression data. Groups for comparison were defined in a design matrix.
Linear models were fitted for expressions of each gene / intensities of each protein. Empirical Bayes method was used to borrow information across genes/proteins. Multiple testing correction was performed using the Benjamini-Hochberg method. Differential expression was defined as minimum log-FC ≥ 1 and adj. p-value ≤ 0.05.

Gene Set testing
For gene set testing of transcriptomic and proteomic data, the Ensemble Of Gene Set Enrichment Analyses (EGSEA) R package v.1.10.1 (Alhamdoosh et al., 2017), was used. EGSEA allows to use results from up to twelve Gene Set Enrichment (GSE)-algorithms, covering competitive and self-contained methods (Goeman and Bühlmann, 2007), to calculate collective gene set scores. We used the collective gene set score results from 11 of those methods, namely from ora, gage, camera and gsva (competitive null hypothesis) along with roast, safe, padog, plage, zscore, ssgsea and globaltest (self-contained hypothesis).
We tested for enrichment on all eight Molecular Signatures Database (MSigDB) provided collections, including Gene Ontologies (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and hallmark gene sets (Liberzon et al., 2015, Liberzon et al., 2011, The Gene Ontology Consortium, 2018, Ashburner et al., 2000, Kanehisa et al., 2019, Kanehisa and Goto, 2000, Kanehisa et al., 2017, Subramanian et al., 2005. Significance was defined by an adj. p-value ≤ 0.05, adjustment method was Benjamini-Hochberg. For overrepresentation analyses of WGCNA gene clusters and genes correlated to STAT3 expression, please refer to the section "Weighted gene co-expression network analysis (WGCNA)" on biological theme comparison and GO enrichment of top 50 genes in the method details section.

Statistical analysis of targeted metabolomics data
To evaluate differences between 3 groups for 6 metabolites, ANOVA and Tukey Honest Significant Differences (HSD) test were performed for each metabolite after normality testing. Normality was tested using Pearson chi-square normality test and Levene's test for homogeneity of variance (center = median).
Visual inspection of the distribution of data was conducted using Q-Q plots and density plots. Significance was defined as p-value ≤ 0.05 after ANOVA and as adj. p-value ≤ 0.05 after TukeyHSD (95% family-wise confidence level). Since this was an exploratory experiment used for hypothesis generation, no p-value adjustment was performed between the six individual ANOVAs. Results of all ANOVAs can be found in Table S5. Statistical tests were performed using the R software environment with packages DescTools v.0.99.28 (Signorell and al., 2018) and nortest v.1.0-4 (Gross and Ligges, 2015). Plots were generated with ggpubr v.0.2 (Kassambara, 2018) and ggplot2 v.3.1.1 (Wickham, 2016.).

DATA AND SOFTWARE AVAILABILITY
The  Table S1, related to Figure 1, Figure S1 and S2. TCGA PRAD differentially expressed genes and enriched gene sets.

Exclusion of p14 ARF /CDKN2A from analysis
Based on previous findings (Pencik et al., 2015) which indicate the regulation of p14 ARF by STAT3, resulting in more aggressive tumors after loss of both, p14 ARF expression was initially included in the sample groups to allow for concurrent analysis of both STAT3 and p14 ARF .
From the TMA and patient cohort described in the Methods section, both STAT3 and p14 ARF protein expression were quantified by a pathologist after IHC staining (Pencik et al., 2015). Patients in the low and high STAT3 group have also low and high p14 ARF , respectively.
Since WGCNA cluster-trait analysis for both STAT3 and CDKN2A (the gene encoding for p14 ARF ) showed no influence of CDKN2A on respective STAT3 correlated clusters, p14 ARF was neglected from further analyses and the focus set solely on STAT3.