Consensus molecular subtype transition during progression of colorectal cancer

The consensus molecular subtype (CMS) classification divides colorectal cancer (CRC) into four distinct subtypes based on RNA expression profiles. The biological differences between CMSs are already present in CRC precursor lesions, but not all CMSs pose the same risk of malignant transformation. To fully understand the path to malignant transformation and to determine whether CMS is a fixed entity during progression, genomic and transcriptomic data from two regions of the same CRC lesion were compared: the precursor region and the carcinoma region. In total, 24 patients who underwent endoscopic removal of T1–2 CRC were included. Regions were subtyped for CMS and DNA mutation analysis was performed. Additionally, a set of 85 benign adenomas was CMS-subtyped. This analysis revealed that almost all benign adenomas were classified as CMS3 (91.8%). In contrast, CMS2 was the most prevalent subtype in precursor regions (66.7%), followed by CMS3 (29.2%). CMS4 was absent in precursor lesions and originated at the carcinoma stage. Importantly, CMS switching occurred in a substantial number of cases and almost all (six out of seven) CMS3 precursor regions showed a shift to a different subtype in the carcinoma part of the lesion, which in four cases was classified as CMS4. In conclusion, our data indicate that CMS3 is related to a more indolent type of precursor lesion that is less likely to progress to CRC and, when progression does occur, it is often associated with a subtype change that includes the more aggressive mesenchymal CMS4. In contrast, an acquired CMS2 signature appeared to be rather fixed during early CRC development. Combined, our data show that subtype changes occur during progression and that CMS3 switching is related to changes in the genomic background through acquisition of a novel driver mutation (TP53) or selective expansion of a clone, but also occurred independently of such genetic changes.


Introduction
Colorectal cancer (CRC) is a highly heterogeneous disease, presenting with significant variation in tumour biology, prognosis, and response to therapy. Part of this heterogeneity is already reflected at its premalignant stage and several distinct molecular pathways of CRC development have been described.
The majority of sporadic CRCs develop through the adenoma-carcinoma sequence as proposed by Fearon and Vogelstein [1]. This sequence describes a step-wise accumulation of somatic mutations initiated by truncating mutations in APC (detected in 80% of cases), followed by activating mutations in oncogenes, mainly KRAS (43%), and inactivating mutations in tumour suppressor genes such as TP53 (60%). TP53 mutations can contribute to the development of chromosomal instability (CIN), which is a key molecular event in tumour progression [2-4]. Although specific mutations are associated with particular stages of tumour development, the order in which these aberrations occur can vary between CRC lesions.
Alternatively, 15% of non-metastatic CRCs develop via a microsatellite instability (MSI) route, which is caused by defective DNA mismatch repair (MMR). As a consequence, MSI tumours harbour numerous mutations, in particular within highly repetitive microsatellite regions. MMR deficiency in sporadic CRC is generally the result of CpG island promoter methylation and subsequent inactivation of the MLH1 gene [5]. In addition, MSI tumours are associated with mutations in the BRAF oncogene [4,6].
Over the last 20 years, it has become increasingly clear that BRAF mutations and a high CpG island methylator phenotype (CIMP-high) are associated with an alternative pathway of CRC development: the 'serrated neoplasia pathway' [7-10]. These molecular changes appear to be crucial first steps in this pathway, as they can already be detected in the most prevalent precursor lesion of this pathway: sessile serrated lesions (SSLs) [11,12]. Another hallmark of the serrated pathway is the acquisition of MSI, caused by MLH1 promoter hypermethylation, which is not generally found in the conventional adenoma pathway [13].
Other less common pathways of CRC carcinogenesis involve mutations in POLE, resulting in ultra-mutated tumours in the absence of MMR deficiency, and biallelic germline variants in MUTYH, which encodes a base excision repair protein, causing MUTYH-associated polyposis [14,15].
In order to address the heterogeneity present in CRC, a stratification system was proposed dividing CRC into four distinct subtypes based on RNA expression profiles [16]. In brief, these consensus molecular subtypes (CMSs) are characterised as follows: CMS1: MSI status, hypermutation, and immune infiltration; CMS2: activation of WNT and MYC signalling; CMS3: metabolic deregulation; and CMS4: stromal infiltration and angiogenic activation.
Prior research has demonstrated that the biological differences between CMS subtypes are already installed at the premalignant state. For instance, CMS1 and CMS4 expression patterns have been linked to SSLs [17-19], while the majority of sporadic and hereditary conventional adenomas have been linked to expression patterns observed in the epithelial subtypes CMS3 and partly also CMS2 [18,20,21]. Although CMS3 was established as the main precursor CMS subtype, CMS2 was linked to precursor lesions at higher risk of malignant transformation [20]. In addition, the specification into a specific cancer subtype may occur early in tumour development, with certain precursor lesions predisposed to transforming into a carcinoma of a particular CMS subtype. For example, SSLs can progress towards MSI+ CMS1 cancers but also have the potential to transform towards CMS4 CRCs, which appears to involve the onset of TGFβ-mediated signalling [22]. In contrast, cancers belonging to epithelial subtypes (CMS2 and CMS3) are thought to originate from tubular adenomas via the classical Vogelstein sequence [19,22].
The aforementioned studies provide important insights into the role of CMS subtypes in CRC precursor lesions. Not all CMSs seem to pose the same risk of malignant transformation and it appears that progression into a specific cancer subtype is already installed at the premalignant stage, which could have important implications for diagnostic, preventive, and therapeutic strategies. Importantly, molecular subtypes in CRC precursor lesions have only been evaluated on sets of non-progressed precursor lesions, and since only 5% of these lesions progress to CRC [23], this may not provide a complete picture. To fully understand the path to malignant transformation of each CMS, data are warranted on CRC lesions and their precursor of origin. To this end, we analysed and compared transcriptomic and genomic data from two different regions of the same CRC lesion: (1) the precursor region and (2) the carcinoma region in a unique set of patients with early-stage CRC. This approach allowed us to evaluate the role of CMS subtypes in the development of CRC and to gain insight into genetic alterations that occur during progression from precursor lesions to CRC.

Study population
Patients who underwent endoscopic resection for early-stage CRC at the Amsterdam University Medical Centers, location AMC, between 2016 and 2021 were selected retrospectively. All medical records were reviewed, and patients were excluded if they had a hereditary predisposition for CRC, a history of inflammatory bowel disease, evidence of neoadjuvant treatment, or when the lesion was resected piecemeal instead of en bloc. Informed consent was subsequently obtained from all patients involved in this study and the study was conducted according to the guidelines of the Declaration of Helsinki and The Netherlands Code of Conduct that defines the (clinical) research integrity principles for institutions in The Netherlands. Ethical approval for collecting and analysing fresh adenoma samples was given by the medical ethics committee of Amsterdam University Medical Centers, The Netherlands (METC2015_206). Ethical review and approval were waived for analysis of archival colorectal cancer tissue.

Pathology review
Haematoxylin and eosin (H&E) slides and formalin-fixed, paraffin-embedded (FFPE) tumour tissue blocks were collected. Two expert pathologists (AF and HK) reviewed the H&E slides to confirm the diagnoses of T1 or T2 CRC. The amount of carcinoma and precursor tissue, i.e. the residual adenoma region, was assessed and patients were excluded if this was estimated to be insufficient for RNA and DNA extraction. Next, the precursor adenoma region was categorised by level of dysplasia (low grade or high grade) and by morphology (tubular adenoma, tubulovillous adenoma, villous adenoma, or serrated lesion).

Extraction of RNA and DNA
Five 10-μm-thick sections with flanking H&Es were prepared and FFPE tissue deparaffinisation was performed using xylene. RNA and DNA were extracted from two different parts in a histologically defined, single cancer mass: (1) the precursor region and (2) the carcinoma region. Precursor regions were defined as parts showing low-grade or high-grade dysplasia. Carcinoma regions were defined as the parts that showed invasion through the muscularis mucosae. Both parts were encircled for macro-dissection (Figure 1A). In cases where multiple precursor or carcinoma regions were identified within a single cancer mass, these parts were pooled to ensure sufficient RNA and DNA yield. The precursor and carcinoma regions were either directly next to each other within the same tissue block or from adjacent tissue blocks. RNA and DNA were extracted from the same tissue sections using the AllPrep DNA/RNA FFPE Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. RNA quantity and quality were measured using a NanoDrop 2000 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and the TapeStation System (Agilent Technologies, Waldbronn, Germany). DNA quantity was assessed using a Qubit 2.0 Fluorometer (Thermo Fisher Scientific).

CMS classification
The NanoString platform (NanoString Technologies, Seattle, WA, USA) was used for gene expression profiling, applying a custom nCounter codeset developed by our team. This codeset allowed for a highly reliable classification of FFPE samples using a validated algorithm. Details of this approach have been published elsewhere [24]. The codeset contains 62 target genes and three 'housekeeping' reference genes. The NanoString platform is a reliable and robust platform for samples with partly degraded RNA [25-27]. After log2 transformation, posterior probability scores for each subtype were calculated using NanoClassifier and the nearest subtype with the highest probability score was used for further analyses. In addition, a set of six switching and six non-switching pairs of precursor and carcinoma regions were profiled with a larger NanoString set (nCounter® Tumor Signaling 360™ Panel, NanoString Technologies) to obtain more extensive insight into the transcriptome changes observed.
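The final assignment step described above, in which each sample receives the subtype with the highest posterior probability, can be sketched as follows. This is only an illustration of that last step and not the NanoClassifier implementation; the function name `assign_cms` and the example scores are hypothetical.

```python
CMS_LABELS = ("CMS1", "CMS2", "CMS3", "CMS4")


def assign_cms(posteriors):
    """Return (label, probability) for the subtype with the highest posterior.

    `posteriors` maps each CMS label to its posterior probability score.
    The dictionary layout is an assumption made for this sketch.
    """
    missing = [label for label in CMS_LABELS if label not in posteriors]
    if missing:
        raise ValueError(f"missing posterior scores for: {missing}")
    best = max(CMS_LABELS, key=lambda label: posteriors[label])
    return best, posteriors[best]


# Hypothetical scores for one region (illustrative values, not study data).
example_scores = {"CMS1": 0.05, "CMS2": 0.62, "CMS3": 0.28, "CMS4": 0.05}
label, probability = assign_cms(example_scores)
print(label, probability)
```

Keeping the full score vector alongside the call is useful for regions with a mixed probability profile, where the winning subtype alone hides how close the runner-up was.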

Mutation analysis
All samples, both precursor and carcinoma regions, were sequenced using the TruSight Oncology 500 (TSO500) DNA NGS panel (Illumina, San Diego, CA, USA). The ready-to-use TSO500 workflow, developed by Illumina, was utilised for data analysis, generating an individual file per sample with all passing variants and microsatellite instability data as output. Only non-synonymous variants with a read depth greater than or equal to 50 and a population frequency less than 0.01 were retained for analysis.
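The variant selection criteria above can be expressed as a simple filter. The sketch below assumes a hypothetical in-memory record (`Variant`) with invented field names; it does not reflect the actual TSO500 output format, and the example variants are made up for illustration.

```python
from dataclasses import dataclass

MIN_READ_DEPTH = 50              # read depth >= 50, as stated in the text
MAX_POPULATION_FREQUENCY = 0.01  # population frequency < 0.01, as stated


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; field names are assumptions."""
    gene: str
    consequence: str             # e.g. "missense", "nonsense", "synonymous"
    read_depth: int
    population_frequency: float


def passes_filter(variant):
    """True if the variant is non-synonymous, deep enough, and rare."""
    return (
        variant.consequence != "synonymous"
        and variant.read_depth >= MIN_READ_DEPTH
        and variant.population_frequency < MAX_POPULATION_FREQUENCY
    )


# Invented example variants (not study data).
variants = [
    Variant("APC", "nonsense", 412, 0.0),
    Variant("KRAS", "missense", 38, 0.0),       # fails: read depth below 50
    Variant("TP53", "synonymous", 250, 0.0),    # fails: synonymous
    Variant("BRCA2", "missense", 300, 0.03),    # fails: too common
]
kept = [v.gene for v in variants if passes_filter(v)]
print(kept)
```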

MSI status
MMR deficiency was determined by immunohistochemistry for the four MMR proteins (MLH1, MSH2, MSH6, and PMS2) in 20 carcinomas. MMR-proficient tumours were classified as microsatellite-stable (MSS). All lesions were additionally classified according to MSI status based on TSO500 data (Illumina), which tests microsatellite sites for evidence of instability, relative to a set of baseline normal samples. The percentage of unstable microsatellite sites in the total assessed microsatellite sites is reported and the threshold for classification as MSI was >10%. The MSI status of all cases that were analysed using both methods corresponded perfectly.
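The sequencing-based MSI call described above reduces to a percentage and a strict threshold. A minimal sketch follows; the function names are hypothetical, and labelling every sample at or below the threshold as MSS is an assumption made here for illustration.

```python
MSI_THRESHOLD_PERCENT = 10.0  # samples with more than 10% unstable sites are MSI


def percent_unstable(unstable_sites, assessed_sites):
    """Percentage of assessed microsatellite sites that are unstable."""
    if assessed_sites <= 0:
        raise ValueError("assessed_sites must be positive")
    if not 0 <= unstable_sites <= assessed_sites:
        raise ValueError("unstable_sites must lie between 0 and assessed_sites")
    return 100.0 * unstable_sites / assessed_sites


def classify_msi(unstable_sites, assessed_sites):
    """Return 'MSI' when the unstable percentage exceeds the threshold, else 'MSS'."""
    pct = percent_unstable(unstable_sites, assessed_sites)
    return "MSI" if pct > MSI_THRESHOLD_PERCENT else "MSS"


# Illustrative site counts (not study data).
print(classify_msi(40, 120))  # well above the threshold
print(classify_msi(2, 120))   # well below the threshold
```

Because the cut-off is strict (>10%), a sample with exactly 10% unstable sites is not called MSI in this sketch.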

Digital quantification of stroma
The precursor and carcinoma regions were annotated on digital slides. The quantification of stroma was performed using QuPath (version 0.4) [28]. In cases where multiple precursor or carcinoma regions were present, the mean stroma percentage was used.

Set of benign adenomas
Fresh adenoma samples (n = 85) were collected and snap-frozen immediately after resection from patients in the endoscopy programme for the removal of large (>1 cm) colorectal adenomas at the University Medical Centers Amsterdam (location AMC). All patients provided written informed consent (METC2015_206). mRNA was isolated using mechanical disruption with a Heidolph SilentCrusher M Homogeniser (VWR International, Radnor, PA, USA) in combination with TRI Reagent (Sigma, St Louis, MO, USA), followed by an ISOLATE II RNA Mini Kit (Bioline, QC-Biotech, Alphen aan den Rijn, The Netherlands) according to the manufacturer's instructions and used for RNAseq analysis. Subsequent CMS classification was performed with the identical CMS classifier that was used for the NanoString data in order to maintain consistency. CMS classification using the single sample predictor function, provided by the CMSclassifier R package [16], gave identical CMS classification in 95.3% of the cases.

Statistical analysis
Baseline characteristics were analysed using standard descriptive statistics. Contingency tables were constructed to compare the distribution of CMS subtypes between different lesions (precursor versus carcinoma region and low-grade versus high-grade precursor regions) and tumour location. Pearson's χ² test was used to test for statistical significance and a p value less than 0.05 was considered significant. The stroma percentages are depicted as mean with standard deviation, and a two-sided t-test was used to compare precursor and carcinoma regions. Analyses were performed using

Study population
In total, 37 patients met the inclusion and exclusion criteria. After pathological review, nine patients were excluded because of insufficient tissue quantity in the carcinoma and/or the precursor region. For four patients, no tissue was available, resulting in a final panel of 24 patients included for analysis. The average age was 65.2 years (± 10.1) and 15 (62.5%) patients were male (Table 1). The distribution of lesion location was 20.8%, 33.3% and 45.8% for the right colon, left colon, and rectum, respectively. The majority of the precursor regions were classified as tubular adenomas (70.8%), and 66.7% showed high-grade dysplasia. An overview of patient, precursor, and carcinoma characteristics is presented in supplementary material, Table S1.

CMS classification
We compared the CMS classification of two parts within the same lesion: (1) the precursor region and (2) the carcinoma region (Figure 1A). With the use of our NanoString classifier, precursor regions were, in the majority of cases, assigned to CMS2 (16/24, 66.7%), while in seven patients (29.2%) these regions were classified as CMS3 and in only one case as CMS1 (Figure 1C). Importantly, none of the precursor regions were subtyped as CMS4. Subtyping of the carcinoma regions revealed an equal number of CMS1 (n = 1) and CMS2 (n = 16) compared with the precursor regions. The number of CMS3-typed cases was lower in the carcinoma regions (n = 3, 12.5%), compared with the precursor regions. Four (16.7%) carcinoma regions were subtyped as CMS4, while this subtype was absent in the precursor regions, indicating that the prevalence of CMS4 was significantly different between the precursor and carcinoma regions (p = 0.037). CMS2 was more often located at the left colon or rectum (p = 0.011), while CMS4 was more often observed within the right colon (p = 0.012).
Next, we applied the CMS classification on a set of 85 large (>1 cm) adenomas without evidence of malignant transformation. The majority were tubulovillous adenomas (n = 57, 67.1%), followed by tubular adenomas (n = 17, 20.0%) and SSLs (n = 7, 8.2%). High-grade dysplasia was present in five (6.6%) adenomas. A striking difference was observed in the CMS distribution in this set. As opposed to the precursor regions, almost all benign adenomas were classified as CMS3 (n = 78, 91.8%), while only seven (8.2%) were assigned to CMS2 (Figure 1B). This dominance of CMS3 in precursor lesions is in line with previous studies [18,21,29]. Importantly, our NanoClassifier approach was further validated using the set reported by Chang et al [18], as classification of these precursor samples with our classifier revealed a similarly high number of CMS3 classified samples (337/375, 90.1%). This implies that CMS3-like gene expression within adenomas could be regarded as an 'early' subtype.
To elaborate on this idea, we determined the level of dysplasia in the subtyped precursor regions of our set of 24 cancers, which revealed a significant difference in dysplasia grade between subtypes (p = 0.002). In almost all CMS3 classified precursor regions (n = 6 out of 7, 85.7%), low-grade dysplasia was detected, while in contrast, the vast majority of CMS2-typed regions displayed high-grade dysplasia (n = 14 out of 16, 87.5%) (Figure 1C).
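As a worked check of the CMS4 comparison reported in this section (0 of 24 precursor regions versus 4 of 24 carcinoma regions), the Pearson χ² statistic for a 2 × 2 table can be computed directly. The sketch below assumes an uncorrected test with one degree of freedom (the text does not state whether a continuity correction was applied); the function name is hypothetical.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p). With one degree of freedom the upper-tail
    probability equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Rows: precursor, carcinoma. Columns: CMS4, not CMS4 (counts from the text).
chi2, p = pearson_chi2_2x2(0, 24, 4, 20)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the statistic is 48/11 (about 4.36), which corresponds to a p value close to the reported 0.037.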

MSI status
It is well established that both adenomas and SSLs are precursors of CRC, each following a distinct sequence of molecular aberrations: the conventional adenoma-carcinoma sequence and the serrated neoplasia pathway. Acquisition of MSI is an important hallmark in the serrated neoplasia pathway, as opposed to the adenoma-carcinoma sequence. To evaluate which pathway might be at play in our set, we determined the MSI status of all lesions.
Both the precursor region and the carcinoma region of one patient were MSI (patient 19). Accordingly, the precursor region of this lesion was morphologically categorised as an SSL. The molecular aberrations seen [...] S1).

Comparison of the classification between matching regions within one lesion
In most cases (n = 16, 66.7%), a match was seen between the subtype observed in the precursor region and that in the carcinoma region. However, in the other eight (33.3%) cases, a shift in the CMS classification occurred between the separate regions within the same CRC lesion (Figure 2A). Intriguingly, almost all (six out of seven) CMS3 precursor regions showed a shift to a different subtype in the carcinoma part of the lesion. In four cases, the corresponding carcinoma area was classified as CMS4, while the other two cases showed a shift towards CMS2. Strikingly, two CMS2 precursor regions switched to CMS3 in their corresponding carcinoma region. Apart from this, the CMS2 subtype remained remarkably stable, suggesting that an acquired CMS2 signature is rather fixed between these two stages of progression (Figure 2B). The same appeared to be the case for CMS1, although such a conclusion would require a much larger set of CMS1-typed lesions. Importantly, some regions, such as C-014, showed a mixed CMS probability score (Figure 2B). Nevertheless, also in these cases the composition of the CMS signatures clearly changed between the precursor and carcinoma regions, reflecting a true CMS switch.
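The transitions described above can be tallied to confirm that they are consistent with the subtype distributions reported for the precursor and carcinoma regions. In the sketch below, the transition counts are reconstructed from the text (for example, 14 stable CMS2 pairs follow from 16 CMS2 precursors minus the two that switched to CMS3); the helper names are hypothetical.

```python
from collections import Counter

# (precursor subtype, carcinoma subtype) -> number of lesions,
# reconstructed from the counts given in the text.
TRANSITIONS = {
    ("CMS1", "CMS1"): 1,
    ("CMS2", "CMS2"): 14,
    ("CMS2", "CMS3"): 2,
    ("CMS3", "CMS3"): 1,
    ("CMS3", "CMS2"): 2,
    ("CMS3", "CMS4"): 4,
}


def marginals(transitions):
    """Return subtype counts for precursor regions and carcinoma regions."""
    precursor, carcinoma = Counter(), Counter()
    for (source, target), count in transitions.items():
        precursor[source] += count
        carcinoma[target] += count
    return precursor, carcinoma


def count_switches(transitions):
    """Number of lesions whose subtype differs between the two regions."""
    return sum(n for (source, target), n in transitions.items() if source != target)


precursor_counts, carcinoma_counts = marginals(TRANSITIONS)
total = sum(TRANSITIONS.values())
switched = count_switches(TRANSITIONS)
print(dict(precursor_counts), dict(carcinoma_counts))
print(f"{switched}/{total} lesions switched ({100 * switched / total:.1f}%)")
```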

Subtype switching through progression
Multiple reasons exist as to why subtype switching could occur from the precursor to the carcinoma stage. First, one could envision that a completely distinct clonal representation exists within these two regions of a lesion. Second, the subtype switching could be a consequence of progression of a clone or multiple clones from a relatively benign stage towards a more aggressive carcinoma stage. Third, one could envision that subtype switching could be the result of a changed microenvironment leading to a differential gene expression pattern. To discriminate between these options, we evaluated the mutational profiles of both regions (precursor and carcinoma) of all samples using TSO500, a next-generation sequencing assay that analyses 523 cancer-relevant genes. This analysis allowed for the identification of clonal differences or clonal progression between the regions. First, the presence of mutations in known driver genes for CRC carcinogenesis was evaluated [4]. Alterations in APC (n = 41, 85%) and KRAS (n = 35, 73%) were most common, followed by TP53 (n = 18, 38%) and BRAF and FBXW7 (both n = 8, 17%) (Figure 3). APC (p = 0.021) and TP53 (p < 0.001) mutations were more frequently present in CMS2 lesions, corresponding to the classical carcinogenesis sequence, i.e. the Vogelgram, known to be related to this subtype. Both CMS1 samples (precursor and matching carcinoma region) harboured a BRAF V600E mutation and a truncating mutation in RNF43, which is consistent with a classical MSI colon cancer.
Next, we compared the mutational profiles between matching precursor and carcinoma regions of the same CRC lesion (Figure 3). All APC and KRAS mutations present in carcinoma regions were already acquired at the precursor stage, supporting the conventional view that these mutations are early events in CRC carcinogenesis [1]. Interestingly, the majority (13 of 16) of non-switching lesions showed a relatively stable set of mutations, indicating that the same clone(s) were detected in the two regions of the lesion. In patients 10, 11, and 22, who all showed a stable CMS2 classification, a clear expansion of pre-existing mutant clones was observed (Figure 4A).
As mentioned previously, subtype switching could be a consequence of a distinct clonal representation at the carcinoma stage. When patients who showed switches in their CMS class between precursor and carcinoma region were analysed, the data revealed a clearly different genotype in three out of eight cases where a novel TP53 mutation was acquired at the carcinoma stage. This suggests that TP53 mutation acquisition is a possible explanation for some of the observed CMS subtype switches.
Next to de novo mutations, the data were also analysed for evident clone expansion and differential clonal representation between the two regions. In order to evaluate this, we performed mutation analysis based on the variant allele frequencies (VAFs) of all non-silent mutations (supplementary material, Figure S1). In three cases, clone expansion could have induced the observed CMS switch: patients 5, 16, and 24. In patient 24, the VAF of BRAF V600E and SMAD4 mutants increased, while in patient 16, a clear increase in the TP53 and BRAF V600E fraction was observed between the precursor and carcinoma regions (Figure 4A). This selection of a clone among other subclones could well induce a more aggressive tumour state and subsequent switch from CMS3 to CMS4 as observed in these cases. In addition, as BRAF mutations, present in both carcinoma regions, are associated with a higher stromal content, this could well explain the detection of CMS4 in these carcinoma areas. In patient 5, this analysis revealed expansion of the APC and TP53 fraction between the precursor and carcinoma regions.
Combined, these data indicate that the majority of the switching cases could be explained by a change in the genomic background of the lesion. That is, in some cases, progression was associated with the acquisition of a novel driver mutation (TP53) and in other cases with selective expansion of a clone that became more dominant in the carcinoma region.
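The distinction drawn above between a newly acquired mutation and expansion of a pre-existing clone can be illustrated by comparing VAFs between the two regions of a lesion. The sketch below is a simplified illustration rather than the analysis performed in this study: the 0.10 minimum increase is an arbitrary, hypothetical cut-off, and the VAF values are invented.

```python
def classify_vaf_shifts(precursor_vafs, carcinoma_vafs, min_increase=0.10):
    """Label each mutation as 'acquired', 'lost', 'expanded' or 'stable'.

    Both arguments map a mutation name to its variant allele frequency
    (0 to 1); a mutation absent from a region is treated as VAF 0.
    `min_increase` is a hypothetical threshold chosen for illustration.
    """
    labels = {}
    for mutation in sorted(set(precursor_vafs) | set(carcinoma_vafs)):
        before = precursor_vafs.get(mutation, 0.0)
        after = carcinoma_vafs.get(mutation, 0.0)
        if before == 0.0 and after > 0.0:
            labels[mutation] = "acquired"   # de novo in the carcinoma region
        elif after == 0.0 and before > 0.0:
            labels[mutation] = "lost"
        elif after - before >= min_increase:
            labels[mutation] = "expanded"   # pre-existing clone became dominant
        else:
            labels[mutation] = "stable"
    return labels


# Invented VAFs for one lesion (not study data).
precursor = {"APC": 0.30, "BRAF V600E": 0.08}
carcinoma = {"APC": 0.32, "BRAF V600E": 0.35, "TP53": 0.25}
shifts = classify_vaf_shifts(precursor, carcinoma)
print(shifts)
```

In practice, VAFs also depend on tumour cell fraction and copy number in each macro-dissected region, so a raw difference such as this one is only a first approximation of clonal change.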
Interestingly, a similar genomic background was observed in the three remaining patients who showed a switch in their CMS class. One patient switched from CMS2 to CMS3, while the other two patients showed a switch from CMS3 to CMS4. Next to genomic alterations, changes in gene expression patterns could be the result of a remodelled microenvironment, which could be particularly involved in the transition to CMS4. In order to evaluate this, we performed a digital quantification of the amount of stroma within each region. Overall, the stroma percentage was higher in the carcinoma regions compared with the precursor regions (50% ± 11% versus 36% ± 11%, p < 0.001). This increase in stroma was observed in cases that switched to CMS4 as well as in cases that did not show this switch. Importantly, there was no enrichment in stromal percentages when the CMS2 and CMS4 carcinoma regions were compared, indicating that there was no association between the amount of stroma and assignment to CMS4, pointing to a more tumour intrinsic change of gene expression that determined the CMS4 classification (Figure 4B). In agreement, when assessing a larger NanoString gene set (Tumor Signaling 360) on six switching cases in comparison with six non-switching CMS2 cases, we could confirm that typical CMS2 (cell cycle regulation) and CMS4 features (TGFβ pathway activation) were indeed associated with the respective CMS2 and CMS4 samples (Figure 4C). Moreover, this analysis revealed that a strong change in gene expression was evident when going from the precursor to the carcinoma stage in, for instance, stromal enrichment, while this was not the case for immune regulation (Figure 4D). In conclusion, our data revealed that subtype switching is related to changes in the genomic background, but also occurred independently of such genetic changes.

Discussion
In this study, we analysed transcriptomic and genomic data of matching precursor and carcinoma regions in a unique set of early-stage CRC. CMS2 was the most prevalent subtype in precursor regions and an acquired CMS2 signature appeared to be relatively stable during progression to malignancy. On the contrary, CMS3 was related to a more indolent type of precursor lesion that had a lower likelihood of developing into CRC. However, if these CMS3 precursor lesions did progress, they often underwent a subtype switch and transformed into CMS2 or CMS4 cancers. Importantly, such subtype switching occurred in one-third of the lesions analysed, indicating that this is not a rare event during progression.

Our classification results of benign adenomas are in line with previous studies. Komor et al reported CMS3 as the most prevalent subtype, followed by CMS2 [20]. In contrast, in a larger set of precursor lesions, Chang et al established CMS2 as the main subtype [18]. These inconsistencies can be well explained by differences in the applied bioinformatic approaches. Two CMS classification methods for CRC have been developed: the Random Forest classifier (RF) and Single Sample Predictor (SSP) [16]. Chang et al stratified their set using the RF classification method, but in contrast to the RF classifier, the SSP method is not influenced by the composition of the dataset on which it is applied and therefore we believe that the SSP is more suitable in a setting that contains a distribution of subtypes that is distinct from the original set. In agreement, when Chang et al applied the SSP method to their dataset, almost all lesions were classified as CMS3 (n = 370, 97.6%) [18].
Taken together, CMS3 has been well established as the most prevalent subtype in multiple sets of benign adenomas, suggesting that CMS3 might be an 'early' or precursor subtype. It follows from this that CMS3 adenomas could be less likely to progress into CRC. In addition, the propensity to progress might be related to the possibility to transit into another subtype. In contrast, the enrichment of CMS2 in our set of precursor regions, as compared with benign adenomas, suggests that CMS2 adenomas might be at higher risk for malignant transformation or are already transiting towards malignancy. Accordingly, high-risk adenomas, as defined by the presence of two or more cancer-associated events, have previously been linked to CMS2 [20,30].
Currently, the classification of CRC precursor lesions is primarily based on endoscopic and histopathological appearances. In current practice, most guidelines guide the timing of post-polypectomy surveillance colonoscopies by number, size, and dysplasia grade of the polyps removed [31]. Shorter time intervals are recommended in the presence of five or more adenomas or in the case of one or more high-risk adenomas, defined as a lesion of ≥10 mm or presenting with high-grade dysplasia. In this set, 14 out of 16 CMS2 precursor regions presented with high-grade dysplasia and would thus have been identified as high-risk adenomas as defined by current guidelines. This indicates that the CMS classification may not provide strong additional value to the current identification of high-risk adenomas but rather represents a different way of identifying high-grade dysplasia.
The most important conclusion from this study is that a switch in CMS subtype between matching precursor and carcinoma regions occurs in a substantial number of cases (33%). This implies that CMSs are not fixed entities and can evolve throughout tumour development. Distinct pathways of CRC development have been described and specific precursor lesions have been suggested to develop into different subtypes of CRC. The epithelial subtypes (CMS2 and CMS3) are thought to progress via the conventional adenoma-carcinoma sequence, while CMS1 and CMS4 CRCs have been suggested to originate from sessile serrated lesions, linked to the 'serrated neoplasia pathway' [19,22]. Our findings now clearly point to the possibility that precursor regions classified as CMS3 tubular adenomas have the potential to progress into poor-prognosis CMS4 cancers, which was unexpected based on current knowledge. Importantly, switches in CMS were particularly related to CMS3 precursor regions, which almost all displayed a shift, while an acquired CMS2 signature appeared to be relatively fixed at this stage in tumour progression. The same seemed to be the case for CMS1, although such a conclusion would require a much larger set of CMS1-typed lesions. CMS1 is highly related to MSI and the number of MSI cases in our set is lower than the reported incidence in non-metastatic CRC (15%). This is in line with previous studies that reported that MSI is rare in adenomas and T1 CRC [20,32,33]. It is suggested that lesions progress rapidly after acquiring MSI, leaving a small window of opportunity to be detected at early stages, which could explain the low number of CMS1 lesions as observed in this set (n = 1) and previous reports [29].
Mutation analyses of matching precursor and carcinoma regions revealed that de novo TP53 mutations in the carcinoma region are a plausible explanation for some of the observed CMS subtype switches. TP53 inactivation is an important hit in the classical Fearon and Vogelstein pathway and usually occurs after APC loss and activation of KRAS. Inactivating mutations in APC and TP53 together are sufficient to induce CIN [3], which is an important hallmark of tumour progression. As mentioned before, CMS2 is highly related to this pathway and indeed APC and TP53 mutations were associated with CMS2 in our set. In contrast, TP53 mutations are known to be less prevalent in CMS3 [16]. Considering this together, one could envision that TP53 mutation acquisition induces progression from an 'early' CMS3 precursor lesion towards a more advanced CMS2 carcinoma, as was observed in two patients in our set.
The third patient that acquired a TP53 mutation progressed from CMS3 to CMS4. Importantly, the carcinoma region of this case also showed a de novo SMAD4 mutation and expansion of a BRAF V600E clone. Expansion of a BRAF clone also induced transition into CMS4 in patient 16. In agreement, in both mouse models and patient samples, BRAF mutations have been associated with a more malignant phenotype and stromal-rich cancers [34,35].
In conclusion, our data indicate that CMS3 is related to a more indolent type of precursor lesion that is less likely to progress to CRC and, when progression does occur, it is often associated with a subtype change. Importantly, CMS3 precursor lesions also have the capacity to progress into the mesenchymal, poor-prognosis CRC subtype. In contrast, an acquired CMS2 signature appears to be rather fixed during early CRC development. CMS switching occurred in a substantial number of cases and was related to changes in the genomic background through acquisition of a novel driver mutation (TP53) or selective expansion of a clone, but also occurred independently of such genetic changes.

Figure 1. Overview of study approach and CMS classification results. (A) Histology of a representative colorectal cancer lesion. Precursor regions (dashed lines) and carcinoma region (solid line) are encircled. (B) Pie chart depicting CMS distribution in the set of benign adenomas. (C) Pie charts showing CMS distribution of precursor regions and carcinoma regions. The CMS2 and CMS3 precursor regions are further subdivided into low-grade or high-grade morphology. CMS, consensus molecular subtype.

Figure 2. CMS classification of matching precursor and carcinoma regions. (A) Sankey plot comparing CMS subtypes between corresponding precursor (left side) and carcinoma regions (right side). (B) Probability scores for each CMS subtype depicted within every case. Corresponding precursor (P) and carcinoma (C) regions are aligned horizontally. Patients who showed a switch in CMS subtype between the two regions are grouped at the bottom. C, carcinoma region; CMS, consensus molecular subtype; P, precursor region.

Figure 3. Oncoplot of the top 55 altered colorectal cancer genes in our set, comparing matching precursor and carcinoma regions. CMS classification and mutation type are indicated. CMS switching cases are grouped on the right side. C, carcinoma region; CMS, consensus molecular subtype; P, precursor region; indel, insertion or deletion.

Figure 4. Comparative analysis of genomic background. (A) Scatterplots comparing variant allele frequencies between precursor and carcinoma regions of six representative cases. (B) Stroma percentages of all precursor and carcinoma regions quantified on digital slides. Boxplots indicate mean and standard deviation. Cases that switched to CMS4 (right) are compared with all other cases (left). (C) Gene set enrichment analysis of cell cycle and TGFβ signalling comparing CMS2 and CMS4 carcinoma regions. (D) Immune enrichment scores comparing precursor versus carcinoma regions and CMS2 versus CMS4 carcinoma regions. Boxplots depict median and interquartile range. indel, insertion or deletion; NS, not significant; Padj, adjusted p value.

Table 1. Baseline characteristics of all included lesions.