Investigating the concordance in molecular subtypes of primary colorectal tumors and their matched synchronous liver metastasis

To date, no systematic analysis has assessed the concordance of molecular classifications between primary tumors (PT) and matched liver metastases (LM) in metastatic colorectal cancer (mCRC). We investigated concordance between PT and LM for four clinically relevant CRC gene signatures. Twenty-seven fresh and 55 formalin-fixed paraffin-embedded pairs of PT and synchronous LM from untreated mCRC patients were retrospectively collected and classified according to the MSI-like, BRAF-like and TGFB activated-like signatures and the Consensus Molecular Subtypes (CMS) classification. We investigated classification concordance between PT and LM and the association of the TGFBa-like and CMS classifications with overall survival. Fifty-one successfully profiled matched pairs were used for analyses. PT and matched LM were highly concordant for the BRAF-like and MSI-like signatures (90.2% and 98% concordance, respectively). In contrast, 40% to 70% of PT that were classified as mesenchymal-like, based on the CMS and the TGFBa-like signature, respectively, lost this phenotype in their matched LM (60.8% and 76.5% concordance, respectively). This molecular switch was independent of the microenvironment composition. In addition, a significant change in subtypes was also observed when using methods developed to detect cancer cell-intrinsic subtypes. More importantly, the molecular switch did not influence survival. PT classified as mesenchymal had worse survival than nonmesenchymal PT (CMS4 vs CMS2, hazard ratio [HR] = 5.2, 95% CI = 1.5-18.5, P = .0048; TGFBa-like vs TGFBi-like, HR = 2.5, 95% CI = 1.1-5.6, P = .028). The same was not true for LM. Our study highlights that the origin of the tissue may have major consequences for precision medicine in mCRC.

KEYWORDS
colorectal cancer molecular classification, gene expression profile of primary and synchronous liver metastasis, molecular concordance between primary and liver metastasis

| INTRODUCTION
Colorectal cancer (CRC) is one of the most common cancers worldwide, with an estimated 1.2 million cases and over 600 000 deaths per year. 1 Due to its relatively asymptomatic progression, patients are frequently diagnosed with metastatic disease, which is associated with a five-year survival rate of around 10%. 2 Since biopsies and surgical tissue of metastatic lesions are difficult to obtain, treatment choice is mainly driven by the analysis of the archived primary tumor.
Coding mutations have been reported to be highly concordant between primary tumors (PT) and matched liver metastases (LM). 3 This is also the case for epigenetic and microbiome profiles. [4][5][6] In contrast, copy number profiles are discordant, 7,8 possibly pointing to larger genomic differences between PT and LM.
CRC can also be classified into different molecular subtypes based on gene expression patterns. [9][10][11][12][13][14][15] The different molecular subtypes are characterized by the activation of different biological processes, such as microsatellite instability (MSI) and immune infiltration signaling, canonical epithelial signaling activation, metabolic dysregulation and mesenchymal characteristics. Although these subgroups have different prognoses, their predictive value, especially regarding the efficacy of targeted agents, remains under investigation. In this context, the MoTriColor consortium is currently exploring the efficacy of specific treatment strategies in molecularly defined CRC subgroups. 16
[…] Oncologico Veneto (IOV-Padua). Samples were collected from treatment-naive cases with synchronous liver metastases at the time of diagnosis and with available clinical-pathological annotations. We restricted our study to these inclusion criteria to exclude potential effects of earlier treatments or different metastatic locations. 19 The clinicopathological annotations are reported in Table 1.

| Microarray processing and quality control
Total RNA was isolated from fresh-frozen and FFPE tissues with at least 30% of tumor cells. If possible, tissue enrichment was performed for samples that did not meet these criteria. RNA isolation and microarray processing were performed as described previously. 9,10,12,13 For fresh tissue, RNA was isolated using the RNeasy micro kit (Qiagen, Hilden, Germany). Quality was assessed using an RNA 6000 Nano total RNA-Chip (Agilent Technologies, Santa Clara, California). Only samples with RIN ≥ 6 were included in further analyses. Two hundred nanograms of total RNA were reverse transcribed, amplified and labeled with either Cy3 (sample) or Cy5 (reference sample) using the QuickAmp Labeling kit (Agilent Technologies), and subsequently purified using the Qiagen RNeasy mini kit. Cy3-labeled cDNA and Cy5-labeled cDNA were pooled (equimolar) and hybridized to the microarray.
For FFPE tissues, RNA was isolated using the RNeasy FFPE kit […] Probes that showed nonuniformity of the signal, as identified by the feature extraction software, were omitted from further analyses.
Image analysis of the scanned arrays was performed to quantify fluorescent intensities using Feature Extraction software versions 9.5 and 11.5.1.1 (Agilent Technologies) for fresh and FFPE tissues, respectively. The feature extraction process included within-array normalization, which was performed using the default method for Agilent microarrays (Lowess correction using a linear polynomial [locally weighted linear least squares regression]).
Background correction was not applied. The final data sets contained expression values for 32 164 unique probes for our entire cohort.
Expression values were calculated as sample/reference ratios using within-array normalized signals (log10[Cy3/Cy5]) for fresh tissue and represented the gMeanSignal intensities for FFPE tissue.
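As a minimal sketch of the fresh-tissue calculation described above, the per-probe expression value is the base-10 logarithm of the sample (Cy3) to reference (Cy5) signal ratio. The probe names and intensities below are hypothetical and serve only as illustration; the function name `log10_ratio` is likewise an assumption and is not part of the Feature Extraction software.

```python
import math


def log10_ratio(cy3_signal: float, cy5_signal: float) -> float:
    """Return log10(sample / reference) for one probe.

    Both inputs are assumed to be within-array normalized intensities.
    """
    if cy3_signal <= 0 or cy5_signal <= 0:
        raise ValueError("signals must be positive to take a log ratio")
    return math.log10(cy3_signal / cy5_signal)


# Hypothetical normalized intensities: (Cy3 sample, Cy5 reference)
probes = {
    "probe_A": (1200.0, 600.0),   # sample twice as bright as reference
    "probe_B": (150.0, 1500.0),   # sample ten times dimmer than reference
    "probe_C": (800.0, 800.0),    # equal signal
}

log_ratios = {name: log10_ratio(cy3, cy5) for name, (cy3, cy5) in probes.items()}

for name, value in log_ratios.items():
    print(f"{name}: {value:+.3f}")
```

A positive value indicates higher signal in the sample than in the reference, a negative value the opposite, and zero indicates equal signal.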

| Study population
To gain insight into the concordance of the transcriptomic profiles of PT and their matched LM, we collected 82 matched PT-LM pairs from mCRC patients.
As summarized in Figure 1, 48 (= 24 pairs) fresh tissue samples were processed and all passed quality control (QC). The FFPE tissue cohort contained 76 samples (= 38 pairs) that were available for molecular subtyping. When we compared the success rate of sample processing on the gene expression array, we did not observe statistically significant differences between the fresh (94.4%) and the FFPE (87.8%) cohorts (P = .259).
We next sought to investigate whether tissue preservation could influence the gene expression read-out. To this end, we looked at the expression of genes belonging to the MSI-like signature, the TGFBa-like signature and the CMS classification in the 11 patient pairs for which we received both fresh and FFPE tissues.
Unsupervised clustering of these pairs showed that samples derived from the same patient clustered together irrespective of tissue type (Figures 2A and S1A,B). This effect was most apparent when considering the MSI-like and TGFBa-like signatures (Figures 2A and S1A). Therefore, we concluded that gene expression differences between samples from the same patient were mainly due to intratumor heterogeneity rather than the tissue preservation method, as previously reported for other solid malignancies. 25

| Primary CRC and matched liver metastasis differ at gene expression level
We next aimed to investigate whether the transcriptomic profiles of the PT differed from those of their matched LM. Considering that the tissue preservation method did not influence the transcriptomic profiles, we combined the fresh and FFPE pairs. As reported in Figure 1, the final cohort of 51 successfully profiled matched pairs comprised 13 fresh pairs and 38 FFPE pairs. For patient characteristics, see Table 1. Overall, the distribution of the major clinical-pathological characteristics was similar between the three centers, except for tumor grading (P < .001), with samples from ICO being mainly characterized by well-differentiated tumors.
Furthermore, unsupervised clustering of the transcriptome profiles of these 51 matched pairs showed two major clusters without any obvious correlation with molecular subtyping calls or categorical clinical-pathological variables. As reported in Figure 2B […] Figure 3 as well as in Table S1.
Overall, we observed high concordance for the BRAF-like and the MSI-like signatures between PT and LM, while lower concordance was observed for the TGFBa-like signature and the CMS classification. Four PT were classified as BRAF wt-like while their matched LM were classified as BRAF m-like. One PT was classified as BRAF m-like while its matched LM was classified as BRAF wt-like (Figure 3A). The overall concordance in terms of BRAF-like signature between PT and LM was 90.2%; the number of switches was not statistically significant (P = .177) (Table S1). Only one matched pair was not concordant in terms of MSI signature, with the PT classified as MSI-like and its matched LM as MSS-like (Figure 3A). The overall concordance of MSI-like signature between PT and LM was 98%; the number of switches was again not statistically significant (P = .313; Table S1).
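The concordance figures above follow directly from the discordant-pair counts, as the following sketch illustrates. The counts (51 pairs; 4 and 1 BRAF-like switches; 1 and 0 MSI-like switches) are taken from the text. The significance test, however, is an assumption: the text here does not name the test used for the switches, so an uncorrected McNemar chi-square test is shown only as one common choice for paired binary calls, and its P values need not match those reported exactly.

```python
import math

N_PAIRS = 51  # successfully profiled matched PT-LM pairs (from the text)


def concordance(n_pairs: int, switch_a: int, switch_b: int) -> float:
    """Fraction of pairs whose PT and LM receive the same call."""
    return (n_pairs - switch_a - switch_b) / n_pairs


def mcnemar_uncorrected(b: int, c: int) -> tuple[float, float]:
    """Uncorrected McNemar test on the two discordant counts.

    ASSUMPTION: illustrative choice of test, not confirmed by the source.
    With 1 degree of freedom, the chi-square tail probability equals
    erfc(sqrt(statistic / 2)).
    """
    if b + c == 0:
        return 0.0, 1.0
    statistic = (b - c) ** 2 / (b + c)
    return statistic, math.erfc(math.sqrt(statistic / 2))


# Discordant counts from the text: (switches in one direction, switches in the other)
signatures = {
    "BRAF-like": (4, 1),  # 4 wt-like -> m-like, 1 m-like -> wt-like
    "MSI-like": (1, 0),   # 1 MSI-like -> MSS-like, none in the other direction
}

for name, (b, c) in signatures.items():
    pct = 100 * concordance(N_PAIRS, b, c)
    stat, p = mcnemar_uncorrected(b, c)
    print(f"{name}: concordance = {pct:.1f}%, chi2 = {stat:.2f}, P = {p:.3f}")
```

Only the discordant pairs enter a McNemar-type test, which is why a signature with 90.2% concordance can still show a non-significant number of switches when those switches occur in both directions.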
Two pairs switched from AB/TGFBi-like in the PT to C/TGFBa-like in the LM (Figure 3A). More importantly, 10 out of 14 pairs (71%) whose PT were classified as C/TGFBa-like were classified as AB/TGFBi-like in their matched LM, giving an overall concordance of 76.5% (Table S1). This significant switch (P = .020) was also observed for the CMS4 classification (Figure 3B). Thirteen out of 32 pairs (40.6%) whose PT were classified as CMS4 were classified as CMS2 in their corresponding LM. One pair, whose PT was classified […]
[…] (Figure S4A). These results confirmed that tumors classified as positive for a mesenchymal phenotype have a worse prognosis than tumors classified as nonmesenchymal. 14,15 In contrast, no differences in median overall survival (mOS) were observed among LM classified as CMS2 vs CMS4 (51.6 vs 42.1 months, respectively, HR = 1.5, 95% CI = 0.7-3.5, P = .28; Figure 4B) or as TGFBa-like vs TGFBi-like (59.7 vs 45.4 months, respectively; Figure S4B). Finally, when we compared matched pairs that switched phenotype with those that did not, we did not observe major differences. Even if exploratory, these analyses confirmed previous observations, 14 […]
Despite these limitations, our cohort represents a unique series of synchronous mCRC in which only LM were analyzed. Bearing in mind the limitations reported above, our data suggest that the transcriptomic profile of the PT, rather than that of the matched LM, is the driver of patient outcome. This may indicate that the PT has intrinsic properties that remain constant despite changes induced by a different microenvironment. Our data argue in favor of using the PT, rather than the distant metastases, for molecular analyses of mCRC.