Integrating proteomics into precision oncology

DNA sequencing and RNA sequencing are increasingly applied in precision oncology, where molecular tumor boards evaluate the actionability of genetic events in individual tumors to guide targeted treatment. To work toward an additional level of patient characterization, we assessed the abundance and activity of 27 proteins in 134 patients whose tumors had previously undergone whole-exome and RNA sequencing within the Molecularly Aided Stratification for Tumor Eradication Research (MASTER) program of the National Center for Tumor Diseases, Heidelberg. Proteomic and phosphoproteomic targets were selected to reflect the most relevant therapeutic baskets in MASTER. Among six different therapeutic baskets, the proteomic data supported treatment recommendations that were based on DNA and RNA analyses in 10% to 57% of cases and frequently suggested alternative treatment options. In several cases, protein activities explained the patients' clinical course and provided potential explanations for treatment failure. Our study indicates that the integrative analysis of DNA, RNA and protein data may refine therapeutic stratification of individual patients and, thus, holds potential to increase the success rate of precision cancer therapy. Prospective validation studies are needed to advance the integration of proteomic analysis into precision oncology.

| INTRODUCTION
Cancer is considered to be a "disease of the genome". 1 Initiatives such as The Cancer Genome Atlas and the International Cancer Genome Consortium have been fundamental for a better understanding of the genomic basis of many frequent tumor entities. 2,3 Together with numerous studies focusing on rare tumor entities, 4,5 they have identified driver gene alterations. 6-8 However, while some tumor entities are initially caused by recurrent alterations affecting specific pathways or even particular genes, 4,5,9,10 the diversity of sequence variants, intratumor heterogeneity and cellular plasticity steadily increase with disease progression. 11 These secondary events may also be attributable to therapies and contribute to increasing tumor aggressiveness. 12 In precision oncology programs, such findings can now be exploited toward stratifying patients for therapy with targeted agents. 13,14 Unbiased whole-exome sequencing (WES) and whole-genome sequencing (WGS) are thus increasingly applied in precision oncology programs, and are often complemented by transcriptome profiling. 15 The Molecularly Aided Stratification for Tumor Eradication Research (MASTER) program was launched at the National Center for Tumor Diseases (NCT) in Heidelberg, Germany, in 2013. 16,17 Despite the clinical impact that MASTER and similar genome-driven precision oncology programs have, the lack of information on the biological relevance of genomic variants in the specific clinical context of an individual patient poses a major challenge. 1,18 Genetic and transcriptomic data cannot predict protein levels and, even less, pathway activities in a way that would fully explain tumor biology. 19 As a consequence, response to targeted treatment is often hard to predict, particularly for an individual patient who presents with a complex personal spectrum of genomic variants. 14
Given that most targeted therapeutics act on proteins 20 and that posttranslational modifications such as protein phosphorylation play crucial roles in cell signaling, 21 it seems consistent to directly integrate proteomic and phosphoproteomic information into tumor profiling programs. 22,23 Yet, broad proteomic analysis is only beginning to be considered in precision oncology. 24-26 Here, we have quantified the levels of 27 total and phosphorylated proteins in 134 tumor samples from the MASTER program in a retrospective proof-of-concept study. The cohort included more than 15 disease entities to represent a broad range of expression levels as well as phosphorylation states for the analyzed proteins. Protein targets were selected to reflect pathway activities in six interventional treatment baskets currently recommended by MASTER. 16 We established rules for interpretation of the proteomic data and deduced treatment recommendations. These were compared with the recommendations previously made based on DNA/RNA analysis. 17 We demonstrate that proteomic data often confirm data from genomic analysis, but in other cases support alternative baskets or indicate potential mechanisms of treatment failure or secondary resistance.

| Proteomic data
Expression levels of proteins and phosphoproteins were measured via reverse phase protein array (RPPA) technology and target-specific antibodies as previously described. 27 Briefly, tissues were lysed in T-PER tis- were measured for every ninth slide to evaluate total protein levels and to rule out false measurements due to potential evaporation effects during the printing process. Target proteins were specifically detected using primary antibodies that had been validated by western blotting prior to the study. 28 All antibodies used throughout the study are described in Table 2. Arrays without primary antibodies were otherwise processed identically to serve as "blank" controls. Fluorescent signals were detected using an Odyssey Infrared Imaging System (LI-COR Biosciences, Lincoln, NE) at 700 nm and at 21 μm spatial and 16 bit optical resolution.

| Data analysis
Signal intensities were quantified using GenePixPro 7.0 (Molecular Devices, Sunnyvale, CA), and RPPA raw data preprocessing and quality control were performed using the RPPanalyzer R package. 29 Fast Green FCF total protein intensities, "blank" controls and dilution series were used for data normalization and quality control. 27 Only antibodies that showed significant enrichment over blank and had signals linear to total protein concentration in the dilution series were included in the study. Signal intensities representing protein expression and phosphorylation levels were visualized in a heatmap using unsupervised hierarchical clustering 30,31 with Euclidean distance and Ward's agglomerative method. 32 Pairwise protein correlations as well as correlations between proteomic and transcriptomic data were estimated with the Pearson correlation coefficient 33 in GraphPad Prism 6.0. Pairwise protein correlations were plotted with the corrplot R package 34 in Tinn-R Editor 6.01.01.05 (https://nbcgib.uesc.br/tinnr/en/).
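As an illustration of the pairwise correlation step only, the following minimal sketch computes Pearson correlation coefficients between targets in plain Python. It is not the GraphPad Prism or corrplot workflow used in the study; the function names `pearson_r` and `pairwise_correlations` and the input layout (one list of per-patient signals per target, all in the same patient order) are assumptions made for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(signals):
    """Return {(target_a, target_b): r} for every unordered pair of targets.

    `signals` maps a target name to its per-patient signal intensities
    (hypothetical layout; all lists must share the same patient order).
    """
    names = sorted(signals)
    return {
        (a, b): pearson_r(signals[a], signals[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

Calling `pairwise_correlations` on a dictionary of targets yields one coefficient per pair, which could then be passed to any plotting tool; significance testing is deliberately left out of this sketch.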

| Decision criteria for proteomic-based selection of interventional baskets
As RPPA generates relative protein expression data, signal intensities for every target (ie, total protein or phosphosite) were sorted for  received a score of 1. For every interventional basket, the scores for all proteins and phosphoproteins considered in that basket were then summed up to calculate a "total basket score." Patients were then ranked by this score for every basket individually. Patients attaining a total basket score ≥1.5 received a proteomic-based recommendation for this therapeutic group.
There were five refinements to this rule:
1. The RAF-MEK-ERK pathway was classified as active only if ERK1_ERK2_T202_Y204 was above the 75th percentile and at least one of the BRAF and RAS total proteins was also above the 75th percentile.
2. Phospho-AKT (PI3K-AKT-mTOR basket) was tested with two different antibodies. As the data strongly correlated, we took both sites into the calculation of scores by giving a score of 1 for patients above the 95th percentile and of 0.5 for patients above the 75th percentile for each antibody/phosphosite.
3. Patients ranking above the 95th percentile for ERBB2_Y1112 signals received a score of −2 while patients above the 75th percentile received a score of −1, because ERBB2_Y1112 is not associated with enhanced phosphotransferase activity of this protein. 35
4. In the Tyrosine Kinases basket, patients having a score of at least 2 were directly regarded for this basket (Supplementary Table 3). Patients with a score of 1 were added when they additionally had a score of at least 0.5 in downstream targets phospho-AKT and/or phospho-ERK1/2.
5. In the basket DNA Damage Response, low levels of the considered proteins are an indication of a potentially dysfunctional DNA damage response system and thus suggest potential clinical actionability. 36,37 Therefore, patients in the 5th percentile (ranks 1-6) received a score of 2 and patients in the 25th percentile (ranks 7-33) a score of 1 for each target protein in this basket.
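The rank-based scoring described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names `rank_scores` and `basket_recommendations` are hypothetical, the cut-offs are taken as the integer floor of 5% and 25% of the cohort size (which reproduces ranks 1-6 and 7-33 for 134 patients), and the default scores of 2 and 1 for ordinary targets are an assumption inferred from refinements 3 and 5, because the base rule is not fully preserved in the text. Ties are broken by input order, and the conditional refinements 1 and 4 are not modelled.

```python
def rank_scores(signals, high_is_active=True, top_score=2.0, mid_score=1.0):
    """Score every patient for one target from the patient's rank in the cohort.

    signals: dict mapping patient ID -> relative RPPA signal for this target.
    high_is_active: True scores the highest signals; False scores the lowest
        ones (as for the DNA Damage Response basket).
    top_score / mid_score: scores for the outer 5% and outer 25% of ranks.
        The defaults 2 and 1 are an assumption; use 1 and 0.5 for each
        phospho-AKT antibody, or -2 and -1 for ERBB2_Y1112.
    """
    n = len(signals)
    n_top = n * 5 // 100    # 6 patients for n = 134
    n_mid = n * 25 // 100   # 33 patients for n = 134
    ordered = sorted(signals, key=signals.get, reverse=high_is_active)
    scores = {}
    for rank, patient in enumerate(ordered, start=1):
        if rank <= n_top:
            scores[patient] = top_score
        elif rank <= n_mid:
            scores[patient] = mid_score
        else:
            scores[patient] = 0.0
    return scores


def basket_recommendations(target_scores, threshold=1.5):
    """Sum per-target scores into a total basket score per patient.

    target_scores: iterable of dicts as returned by rank_scores, one per
        target belonging to the basket.
    Returns only patients whose total basket score is >= threshold.
    """
    totals = {}
    for scores in target_scores:
        for patient, score in scores.items():
            totals[patient] = totals.get(patient, 0.0) + score
    return {p: total for p, total in totals.items() if total >= threshold}
```

Keeping the per-target scoring separate from the summation makes it straightforward to mix targets with different score weights within one basket before the 1.5 threshold is applied.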

| Generation and initial validation of the proteomic dataset
Tumor samples were obtained from 134 patients recruited into the MASTER precision oncology program (https://www.nct-heidelberg.de/forschung/nct-master.html), reflecting a broad spectrum of cancer entities (Table 1, Supplementary Table 1 Table 2. In a first quality assessment of the data, the validity and reproducibility of our approach were ascertained. Initially, potential batch effects resulting from laboratory factors such as personnel, sample batches and processing dates could be excluded (data not shown).
Next, we confirmed reproducibility of the applied method by testing  technology. In contrast to EGFR, the correlation coefficient between phosphorylation sites in ERBB2 was weaker when the tested sites were indicative of different functionalities (Supplementary Figure 1F).
Although phosphorylation of residues Y1221 and Y1222 indicates activity of the ERBB2 receptor tyrosine kinase, residue phosphorylation at Y1112 has been associated with inhibition of phosphotransferase activity and marking of the protein for degradation. 35 Catalytic activity of ERBB2 should precede inhibition of this phosphotransferase activity, which could explain the correlation coefficient of r = 0.4881 and that 6/6 patients above the 95th percentile and 14/28 above the 75th percentile for ERBB2_Y1112 were still recommended for the Tyrosine Kinases basket based on the proteomic data (Supplementary Table 3).

| Correlation of signaling pathway activities within and between baskets
Next, we analyzed the pairwise correlation for all proteins tested in the respective interventional baskets. Consistent with the results obtained from the testing of different phosphosites within the same proteins, we observed strong correlations for most proteins and phosphoproteins within particular baskets (Figure 1). Correlation was also high between proteins of related baskets, for example, proteins in the baskets Tyrosine Kinases and PI3K-AKT-mTOR as well as RAF-MEK-ERK were strongly correlated (P < .01). Activated tyrosine kinase/PI3K/MAPK signaling was also strongly correlated with the expression levels of PD-L1 (P < .01), which is in line with previous reports that PD-L1 is regulated by EGFR. 39 However, CTLA4 and PD-L1 protein expression did not correlate, which is consistent with the tumors and cell types expressing those proteins as well as with their different activities. 40 Expression levels of neither protein correlated with tumor cell content (not shown).
The ERBB2_Y1112 and the ERBB4_Y1162 signals stood out in the Tyrosine Kinases basket as these two sites correlated best with RB1_S780 as well as with PDPK1_S241 (P < .01). These findings are in line with the functionalities that have been associated with the respective sites. ERBB2_Y1112 is connected with degradation of ERBB2 via recruitment of ubiquitin ligases 35 while ERBB4_Y1162 has been related to induction of cell growth, 41 which is in line with the observed correlation of the latter site with RB1_S780 (P < .01). PDPK1 phosphorylates AKT at residue T308 and is negatively regulated by 14-3-3 proteins. 42 Phosphorylation of PDPK1 at S241 increases its interaction with 14-3-3 proteins and could thus also explain the poor correlation between PDPK1_S241 and AKT_T308 in the proteomic dataset (Figure 1).

| Entity-independent distribution of protein-based therapeutic recommendations
We then performed unsupervised hierarchical cluster analysis, which

| Decision criteria for protein-based therapeutic recommendations
Although the heatmap shown in Figure 2 was suggestive of pathway activities for particular patients, a more objective approach was needed to guide proteomics-based treatment recommendations. Since RPPA generates relative quantitative information, we next ranked patients for each tested protein/phosphosite among all other patients in the cohort, as was previously suggested. 46,47 We reasoned that proteins and phosphosites ranking in the upper and lower percentiles should have the highest predictive value for pathway activities.
Accordingly, all tumors showing signals above the 95th as well as the 75th percentiles for each target were marked (solid blue and red lines, respectively, in Figure 3A, colored boxes for individual patients in Figure 3B, and numerically for all patients in Supplementary Figure 3).
Based on this grouping, we defined decision criteria for protein-guided treatment recommendations: (a) Signals and associated interventional baskets above the 95th percentile were given higher priority than signals mapping above the 75th percentile. (b) Active protein forms (ie, phosphorylated proteins) were given higher priority than total protein levels within a particular basket. (c) Concordant signals for different proteins within the same basket were regarded as a strong indication of pathway activity. 46 Based on this rationale and other criteria that were specific for particular baskets and are described in detail in Section 2, all patients were given scores that were finally used to recommend one or the other interventional basket for treatment of the respective patients. A detailed listing of patients, scores and baskets is provided in Supplementary Table 3 and the recommended baskets are also indicated for individual patients in Figure 2. Table 3 shows that the numbers of patients within every basket was very vari-  Table 3).
Although the phosphorylation states of some tyrosine kinases are indicative of the activity of the respective signaling pathways and, thus, also suggestive of a potential therapeutic relevance, proteomic criteria may not be of similar relevance for other baskets. The smallest number of patients was assigned to the Immune Evasion basket based on proteomic information (20 patients). Firstly, just two proteins were evaluated for this basket and these two proteins did not correlate well (Figure 1). This observation is in line with the literature, which suggests that the predictive value of PD-L1 expression for successful immunotherapy is high, 48 while that of CTLA-4 appears to be of lower relevance. 49 Secondly, particularly in this basket, genomic analysis provides important information as to the therapeutic potential of immunotherapy for the respective patients (eg, numbers of SNVs and indels, mutations in relevant genes). The value of proteomic information is, therefore, strongly affected by the number and predictive impact of the marker proteins in a particular therapeutic basket and should always be assessed in conjunction with molecular data from genomic analysis.

| Partial overlap of protein-based vs genome- and transcriptome-guided therapeutic recommendations
To evaluate this further, we next wanted to assess the overlap of rec-  Table 2 for details on  antibodies and Supplementary  Tables 2 and 3 Figure 2 and also in Supplementary Table 3.
Proteomic data supported the MASTER recommendation based on DNA and RNA analysis alone in 10% to 57% of cases, depending on the respective interventional basket (Table 3). It has been noted that some recommendations based on Genomic/Transcriptomic data had a basket termed "Other" (Table 3

| Assessment of individual cases
For a more fine-grained analysis and to fully leverage the integration of proteomic data into MASTER, we next exemplarily evaluated three patients who were suggested for the Cell Cycle Regulation basket based on proteomic data (Figure 2B, Table 4, Supplementary Table 3).
Patient 139 had been diagnosed with a chondrosarcoma more than 7 years prior to inclusion into MASTER. Since WES of a metastasis had revealed an amplification of the CDK4 gene and the RB1 gene was found to be intact and expressed, as judged by RNA sequencing, the molecular tumor board recommended therapy with a CDK4/6 inhibitor. In addition, amplifications of EGFR and ERBB3 were found, rendering tyrosine kinase inhibitors targeting EGFR or ERBB3 potential therapeutic candidates. The proteomic data specifically supported the option of tyrosine kinase inhibition (Figure 3B, Table 4). Several activating phosphosites on tyrosine kinases (ALK, ERBB2, ERBB3) ranked high. Also, phosphorylation of AKT1/2 at serine 473/474, indicative of active downstream signaling, was strongly elevated. In contrast, proteomic analysis did not show high expression levels of any of the proteins considered for the Cell Cycle Regulation basket.
In the clinic, the patient had received treatment with trofosfamide,    The proteomic data indeed indicated an upregulation of ERBB4 activity; however, this was not reflected by a similarly upregulated signaling in downstream targets, that is, phosphorylation of ERK1/2 or AKT ( Figure 3, Table 4, Supplementary Table 3 Although this small overlap may be surprising at first sight, the poor correlation between proteomic and transcriptomic data (Supplementary Figure 4)  Along these lines, our findings suggest that the combination of molecular testing of DNA, RNA and protein should further improve rational therapy recommendations made by molecular tumor boards (our secondary objective). This is exemplified by two of the three cases described in some detail earlier. While for patient 139, the DNA damage response basket had some indication with support from both protein and DNA data, this was not the case for patient 133 (Table 4).
While DNA provides highly valuable information on single nucleotide as well as structural variants, RNA and, even more so, protein data inform on the presence and activities of a tumor's functional modules.
Our study indicates that molecular evidence for therapeutic recommendations in advanced-stage cancer patients, as it is currently used in precision oncology, should be complemented by the inclusion of proteomic data. Amplification and increased transcription of genes like CDK4/6 or loss of genetic material, for example, of CDKN2A/B, do not always translate into activation of a particular oncogenic pathway.
Levels of RNA and protein do not correlate well, and only direct protein analysis informs on expression levels as well as on activity states (eg, phosphorylation), suggesting this to be a superior readout of tumor physiology. This requires quantifying total as well as phosphorylated proteins to reflect protein and pathway activities. Proteomic data should thus help distinguish putative from true drivers and should be particularly valuable in cases where no strong driver events are detected by genetic analysis alone. To this end, other targeted as well as unbiased proteomic methods should be further examined in prospective trials for their value in clinical decision-making. 26,57 However, the results presented here warrant validation in prospective trials toward refining knowledge on functional tumor states and, in consequence, improving treatment recommendations and the success of targeted treatment. Especially in cases where proteomic data contradict genetic findings, more research is warranted to assess the respective value of both methods. To our knowledge, this is the first study that has generated proteomic data for patients who had entered an ongoing precision oncology trial, aiming to assess the potential additional value of proteomic data. Summing up, our study suggests that proteomic data, once positively evaluated in prospective trials in combination with genomic and transcriptomic data, might add valuable information to the decision-making process in interdisciplinary molecular tumor boards and should potentially be integrated into precision oncology programs.

CONFLICT OF INTEREST
AS is a member of the Science Advisory Board/Speaker's Bureau of Astra Zeneca, Bayer, BMS, Eli Lilly, Illumina, Janssen, MSD, Novartis, Pfizer, Roche, Seattle Genetics, Takeda and ThermoFisher, and is supported by grants from Bayer, BMS and Chugai.

ETHICS STATEMENT
Patient tissue samples were collected with informed consent under protocol NCT MASTER, S-206/2011 in accordance with its regulations and after approval by the Ethics Committee of Heidelberg University.

DATA AVAILABILITY STATEMENT
The proteomic data that have been collected in the course of this study are made available in Supplementary Table 2. Processed data for each patient and every basket, including recommendations based on genomic/transcriptomic analysis, are presented in Supplementary Table 3. Other data that support the findings of this study are available from the corresponding authors upon request.