Genetic and immune landscape evolution in MMR‐deficient colorectal cancer

Mismatch repair‐deficient (MMRd) colorectal cancers (CRCs) have high mutation burdens, which make these tumours immunogenic and many respond to immune checkpoint inhibitors. The MMRd hypermutator phenotype may also promote intratumour heterogeneity (ITH) and cancer evolution. We applied multiregion sequencing and CD8 and programmed death ligand 1 (PD‐L1) immunostaining to systematically investigate ITH and how genetic and immune landscapes coevolve. All cases had high truncal mutation burdens. Despite pervasive ITH, driver aberrations showed a clear hierarchy. Those in WNT/β‐catenin, mitogen‐activated protein kinase, and TGF‐β receptor family genes were almost always truncal. Immune evasion (IE) drivers, such as inactivation of genes involved in antigen presentation or IFN‐γ signalling, were predominantly subclonal and showed parallel evolution. These IE drivers have been implicated in immune checkpoint inhibitor resistance or sensitivity. Clonality assessments are therefore important for the development of predictive immunotherapy biomarkers in MMRd CRCs. Phylogenetic analysis identified three distinct patterns of IE driver evolution: pan‐tumour evolution, subclonal evolution, and evolutionary stasis. These, but neither mutation burdens nor heterogeneity metrics, significantly correlated with T‐cell densities, which were used as a surrogate marker of tumour immunogenicity. Furthermore, this revealed that genetic and T‐cell infiltrates coevolve in MMRd CRCs. Low T‐cell densities in the subgroup without any known IE drivers may indicate an, as yet unknown, IE mechanism. PD‐L1 was expressed in the tumour microenvironment in most samples and correlated with T‐cell densities. However, PD‐L1 expression in cancer cells was independent of T‐cell densities but strongly associated with loss of the intestinal homeobox transcription factor CDX2. This explains infrequent PD‐L1 expression by cancer cells and may contribute to a higher recurrence risk of MMRd CRCs with impaired CDX2 expression. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.


Introduction
Loss of DNA mismatch repair (MMR) proteins (MLH1, PMS2, MSH2, or MSH6) in MMR-deficient (MMRd) colorectal cancers (CRCs) establishes a hypermutator phenotype and high mutation burdens.The large number of mutation-encoded neoantigens make these cancers highly immunogenic [1].Following surgical resection, stage 1/2 MMRd CRCs have a lower recurrence risk compared with stage-matched MMR-proficient (MMRp) CRCs, probably through higher immunogenicity.The favourable prognosis of MMRd is largely lost in tumours with lymph node metastases (Stage 3) [2].Yet, the contribution of immune evasion (IE) mechanisms to disease progression remains uncertain [3].Furthermore, immunotherapy with programmed cell death protein 1/programmed death ligand 1 (PD1/PD-L1) checkpoint inhibitors (CPIs) is ineffective in metastatic (stage 4) MMRp CRCs [4], but has high response rates in MMRd CRCs (43.8%) [5], confirming an important role for the PD1/PD-L1 immune checkpoint in IE.Combined PD1 and CTLA4 CPIs achieved even higher response rates in stage 4 [6] and early-stage MMRd CRCs [7].Loss of the intestinal homeobox transcription factor CDX2 is enriched in MMRd CRCs, associated with high-grade tumours that progress and with worse survival [8,9].There remains a major need to assess current and identify novel biomarkers to predict who benefits from CPI.
Biomarkers for CPIs in other tumour types include mutation burden, the number of insertions and deletions (indels) [10], as well as cytotoxic CD8 T-cell infiltrates and expression of PD-L1 [11].Inactivation of genes in the IFN-γ signalling pathway or in HLA class I antigen presentation, such as Beta-2-Microglobulin (B2M), have also been associated with resistance to CPIs in several tumour types [12][13][14][15].Data in MMRd CRCs are less clear: neither CD8 T-cells, PD-L1 expression, nor mutation numbers correlated with CPI responses [4,16].Moreover, disrupting mutation of B2M was not associated with CPI resistance [17].Recent data even showed higher response rates in MMRd CRCs with B2M mutations and that γδ T-cells mediate CPI responses in these [18].
Importantly, this hypermutator phenotype and the large number of small indels with high immunogenic potential [10] distinguish MMRd tumours from other CPI-sensitive cancers, such as melanoma, lung, or renal cancer.We have shown in MMRd gastro-oesophageal cancers that this promotes extreme intratumour heterogeneity (ITH) and subclonal driver evolution [19].ITH hinders biomarker identification, as a single sample may not be representative of the entire cancer cell population [20,21].ITH may thus explain the slow progress in identifying predictive biomarkers in MMRd CRCs, but it has not been systematically assessed.A clear understanding of which genetic drivers are commonly truncal and which biomarkers are unlikely to show ITH, and can therefore be assessed reliably from single samples, and which ones require assays competent to address ITH, may be important for biomarker development.We applied multiregion multiomics and immunohistochemistry (IHC) analysis to reveal ITH of driver aberrations, immune infiltrates, and PD-L1 expression.

Materials and methods
Fifty-five primary tumour regions, 15 lymph nodes, and one distant metastasis from 20 CRCs were sequenced with a panel of 194 genes (supplementary material, Table S1) that are recurrently mutated in MMRd or MMRp CRCs or implicated in IE and may confer CPI resistance [12,15,19,[22][23][24]. 3'-RNA sequencing was performed on 60 samples with sufficient RNA yield and IHC for CD8; PD-L1 and STING immunostaining was performed on 68 samples.The study was performed in accordance with the Declaration of Helsinki and protocols had been approved by an ethics committee (approval numbers: 18/LO/0165, 2015/405, 14/EE0024).All patients had signed written informed consent for study participation.Details for sample collection and preparation, targeted DNA sequencing, mutation calling, HLA mutation analysis, DNA copy number analysis, identification of probable driver aberrations, linear regression model for extrapolation from targeted sequencing to exome mutation number in MMRd CRC, phylogenetic tree reconstruction, IHC, 3'-RNA sequencing, immune cell abundance analysis using single sample gene set enrichment analysis (ssGSEA), expression analysis of CRC cells lines, and statistical analyses are presented in Supplementary materials and methods.

Results
Targeted sequencing of 71 samples from 20 CRCs (Table 1) where IHC reported MMRd enabled high sequencing depths (median: 2659Â).This was critical Genetic and immune landscape evolution in MMRd colorectal cancer to avoid ITH overestimation in samples with low cancer cell content, such as lymph node metastases.The median non-silent mutation number of individual tumour regions was 58 (Figure 1A, and supplementary material, Table S2).Nineteen tumours had high numbers of mutations (28-90 mutations/region).Only 10 mutations/region were identified in T20, suggesting it was MMRp.T20 was excluded after histology review showed patchy loss of MLH1/PMS2, which is a recognised feature that can cause misclassification as MMRd [25].A median of 44 mutations were ubiquitously present in all regions of individual tumours.Thus, an average of 76% of mutations/tumour were truncal.

ITH by subtype and stage
ITH was identified in all cases (median: 16.1% heterogeneous mutations/region, Figure 1A).It was significantly higher in tumours with MLH1/PMS2 than with MSH2/MSH6 loss (Figure 1B).ITH was numerically but not significantly higher in BRAF V600E mutant tumours and did not differ by stage.Ubiquitous mutation numbers did not significantly differ by BRAF status, pattern of MMR protein expression loss, or stage (supplementary material, Figure S1).A high proportion of indels is characteristic of MMRd and a median of 60.8% of non-silent mutations were indels (Figure 1C).The higher indel fraction compared with whole-exome data from MMRd CRCs [10] is probably a consequence of over-representation of microsatellite repeats in driver genes of our panel, high sequencing depths, and the use of a contemporaneous mutation caller with increased sensitivity for indels [26].Two tumours had low indel fractions (T6, T10).Both showed only isolated MSH6 loss.This can be explained by the predominant recognition of base-base mismatches by the MSH2/MSH6 heterodimer, whereas the MSH2/MSH3 and MLH1/PMS2 heterodimers are also important for recognition and repair of indels [27,28].A third tumour with isolated MSH6 loss had a high indel fraction (T8).This tumour harboured an early somatic stop codon mutation in MSH2 (E28X) in all regions.Lynch syndrome patients with MHS2 start codon mutations and preserved MSH2 expression have been described, suggestive of a hypomorphic MSH2 isoform expressed from the M67 downstream start codon [29].Expression of this isoform probably explains the high indel fraction despite preserved MSH2 expression.

Comparison of mutation numbers to whole-exome sequencing studies
To compare the mutation burdens in our samples to those from published MMRd CRC whole-exome sequencing [23], we trained a linear regression model.The number of point mutations within target genes in our panel correlated significantly with mutation numbers from whole-exome sequencing in the training dataset (r = 0.851, supplementary material, Figure S2).Applying the derived linear model equation to our series extrapolated a median of 1,280 mutations in the exome per sample.This is similar to the whole-exome mutation numbers in MMRd CRCs from the training cohort (median: 994) and TCGA (median: 1166).The median extrapolated number of truncal mutations per case was 909 in our series.Long phylogenetic trunks are, hence, characteristic for MMRd CRCs.

Phylogenies reveal branched evolution
All tumours showed branched evolution in the phylogenetic analysis (Figure 1D and supplementary material, Figure S3).Metastases had diverged before detectable genetic diversification of the primary tumour in six of eight cases where multiple primary tumour regions and at least one metastasis were available to assess dissemination timing.Confidence for early dissemination was low in two of six cases (T13, T17) based on bootstrap values <50%.Thus, the ability to metastasise may already have been acquired on the phylogenetic trunk in at least four of eight tumours.

Driver aberration evolution is characterised by a clear hierarchy
The acquisition of new genetic drivers is arguably the most important consequence of cancer evolution, but has not been systematically assessed in MMRd CRC.We identified drivers by cataloguing hotspot mutations in known oncogenes, as well as disrupting aberrations [frameshift, splice-site, nonsense mutations, disruption of the signal peptide, and loss of heterozygosity (LOH)] that were likely to confer biallelic inactivation of tumour suppressor (TS) genes or IE driver genes (Figure 2A, the driver genes assessed are shown in supplementary material, Table S3).A hotspot mutation in an oncogene or LOH of a TS or IE gene combined with a disrupting mutation that is clonal in a sample reliably defines driver aberrations.Yet, many TS and IE genes harboured two disrupting mutations.Even when clonal, two distinct disrupting mutations can only inactivate a gene if they affect  Genetic and immune landscape evolution in MMRd colorectal cancer 229 both alleles.Most detected disrupting mutations were not in close enough proximity to assess, in the short-read sequencing data, whether they were present on different alleles or on the same allele.However, of five cases with multiple B2M mutations in the same sample, three harboured disrupting mutation pairs within the span of paired-end sequencing reads.In all three, mutations were mutually exclusive, i.e. never detected together in a single read (Figure 2B).We therefore accepted two independent clonal disrupting mutations in a TS or IE gene as a marker of biallelic inactivation.All identified driver aberrations (Figure 2A and supplementary material, Table S4) were mapped onto the phylogenetic trees (Figure 1D and supplementary material, Figure S3).

BR Challoner et al
This revealed a clear hierarchy of driver evolution (Figure 2C): WNT/β-catenin (WNT) pathway aberrations, activating mutations of receptor tyrosine kinases/ mitogen-activated protein kinase pathway (RTK-MAPK), and inactivation of TGF-β receptor family members were almost always truncal (87.0%, 86.4%, and 83.7%, respectively).These pathways were altered on the trunk of 89.5%, 94.7%, and 84.2% of tumours, respectively, demonstrating that they are critical for tumour initiation.Mutual exclusivity was observed for APC and RNF43 disrupting mutations in the WNT pathway.This allowed further validation of our algorithm for the identification of drivers in TS genes, as AXIN2 RNA overexpression is an established marker of ligand-independent WNT activation

230
BR Challoner et al (described to occur through APC inactivation or CTNNB1 hotspot mutations) versus ligand-dependent activation (through RNF43 inactivation) [30].Six tumours with two truncal disrupting mutations in APC showed significantly higher AXIN2 expression than tumours with truncal RNF43 drivers (Figure 2D), and similar expression to those with truncal APC disrupting mutations/LOH or CTNNB1 hotspot mutations.Thus, the detection of two independent disrupting mutations, even in a large TS gene such as APC, strongly indicates biallelic inactivation.Aberrations activating the PI3K pathway, inactivation of histone modifiers or of DNA damage response and repair genes were truncal in 36.4-68.4% of occurrences.Inactivation of HLA class I antigen presentation or IFN-γ signalling genes, previously shown to enable IE [12][13][14][15], were predominantly subclonal (71.4%).ITH hence impedes accurate assessment of these critical immunotherapy biomarkers and must be taken into account for biomarker analyses in primary tumour tissue.Disrupting mutations of HLA class I genes were infrequent (five mutations across the six HLA class I gene alleles per genome, see supplementary material, Figure S3) in comparison with drivers in other antigen processing/presentation genes (31 mutations in 15 antigen presentation genes according to supplementary material, Table S3).This perhaps suggests that the disruption of genes with broad effects on antigen presentation is a more effective evolutionary path to IE in tumours with such high neoantigen numbers.

Parallel evolution predominates among IE drivers
We next assessed parallel evolution, which is a strong indicator of Darwinian selection [20,31].This was defined by the presence of at least two distinct driver aberrations affecting genes with similar function on different phylogenetic branches.Eight of 19 tumours showed 12 instances of parallel evolution in nine different driver genes (see supplementary material, Figure S3).Parallel evolution was most common for IE genes (six of nine genes), most frequently involving B2M.The most striking example was observed in T12: NLRC5 (the master regulator of HLA class I expression) was inactivated in three and the antigen transporter TAP1 and the IFN-γ pathway gene JAK1 each in two different ways, leading to pan-tumour inactivation of these three genes in all regions.Pan-tumour IE was observed in one additional case.Importantly, parallel evolution of IE occurred in four of six tumours that harboured any subclonal IE drivers, indicating that these tumours were under intense immune selection pressure.Subclonal IE drivers were absent in all five tumours with truncal IE drivers.Parallel evolution, and mutual exclusivity of truncal and subclonal IE drivers in a total of 11 of 19 tumours, substantiates that these evolved through Darwinian selection.

Driver heterogeneity between primary tumours and metastases
We questioned whether specific drivers evolved in metastases (supplementary material, Figure S3).
Only KMT2D acquired driver aberrations more than once (three of 15 metastases).
IE drivers were acquired by two metastases; B2M in T15 and JAK1, NLRC5 as well as TAP1 in T12.PTEN was inactivated on the common branch of two metastases in T18.No new drivers had evolved on the branches, leading to seven of 15 metastases.Thus, drivers can be unique to metastatic sites, but we found no strong evidence for recurrent selection of specific drivers, consistent with our above observation that metastatic potential may already be encoded on the trunk in many cases.These conclusions are limited to lymph node metastases, as only one distant metastasis was assessed.

Heterogeneity of immune infiltrates
Our next aim was to interrogate the ITH of tumour infiltrating T-cells, which are surrogate markers of cancer immunogenicity [11].We quantified CD8 T-cells using IHC and computational image analysis.Mean CD8 T-cell densities in the primary tumour varied between cases (1.9-21.5% of nucleated cells), questioning how T-cell density is regulated.It did not significantly differ by BRAF status, pattern of MMR protein loss, or stage (Figure 3A).T-cell densities did not correlate with ubiquitous mutation or indel numbers, nor with heterogeneous mutation numbers (supplementary material, Table S5).This is perhaps unsurprising, as all tumours had sufficient truncal mutations (909 extrapolated exonic mutations/tumour) to encode for multiple neoantigens, even when taking conservative estimates that only 0.5-1% of somatic mutations give rise to HLA-presented neoantigens [32,33].T-cell densities also varied within primary tumours (Figure 3B).Dichotomisation using the mean CD8 T-cell fraction in the primary tumour distinguished tumours with low (mean: 4.0%) from tumours with high (mean: 11.7%), albeit frequently variable, T-cell infiltrates.This suggested a tumourintrinsic setpoint, which is, however, accompanied by marked variability within tumours with dense infiltrates.
To obtain more granular insights, we computationally inferred 15 immune cell subtypes (supplementary material, Table S2) from RNA expression data of each tumour region.IHC-measured CD8 T-cells correlated most strongly with activated CD8 T-cells (r = 0.525, supplementary material, Table S6), indicating that inference from expression data worked reliably.Hierarchical clustering showed that most lymph node metastases clustered together (Figure 3D).This cluster was enriched for immune cells expected in the lymph node environment, such as B-cells and activated dendritic cells, but also other immune cell types, including activated CD8 and CD4 cells.Although samples had been macrodissected we cannot completely rule out contamination with immune cells from surrounding lymph node tissue.Yet, this can be reliably excluded in CD8 T-cell IHC, which also supported higher T-cell densities within cancer cell areas in lymph nodes.Primary tumour regions from individual cancers cosegregated into one of two major clusters.Higher Genetic and immune landscape evolution in MMRd colorectal cancer activated and memory B-cells in cluster 1 and higher activated dendritic cells in cluster 2 were the only significant differences between these (Figure 3D).Hence, these clusters did not explain differences in CD8 T-cell densities.
Taken together, we found no correlation of CD8 T-cell densities with mutation numbers or MMRd CRC subtypes.We therefore investigated whether different T-cell densities are a consequence of IE driver evolution.

Coevolution of genetic and immune landscapes
Three distinct patterns of genetic IE evolution were apparent in cases where at least three primary tumour regions had been analysed (Table 2 and supplementary material, Figure S3).These patterns were: (1) tumours without evolution of any genetic IE drivers, which we refer to as IE evolution stasis (T1, T2, T6, T13, T17, T18); (2) tumours with subclonal IE driver evolution in some but not all regions (T4, T5, T14, T15), and (3) tumours with pan-tumour IE through truncal aberrations or parallel evolution in all tumour regions (T3, T7, T12, T16, T19).
CD8 T-cell densities were significantly higher in group 2 (subclonal evolution) compared with groups 1 (stasis) and 3 (pan-tumour evolution) (Figure 4A).Deconvolution from RNA expression data showed that activated CD8 T-cells, effector memory CD8 T-cells, and activated CD4 T-cells were particularly enriched in group 2 (Figure 4B, sample-level data: Figure 4C).This indicates that active selection pressure from abundant CD8 T-cells is the proximate cause for subclonal IE evolution in group 2. Conversely, the absence of this selection pressure is one likely explanation for evolutionary stasis with respect to known IE mechanisms in group 1, whereas pan-tumour IE is probably responsible for the low infiltrates of CD8 T-cells in group 3.These subtypes may also explain the ITH of CD8 T-cell infiltrates (Figure 3B), as group 2 was only found in tumours with high variability, whereas groups 1 and 3 predominated among tumours with low mean T-cell densities and limited variability between regions.
We hypothesised that low CD8 T-cell infiltrates in tumours with IE evolution stasis (group 1) may indicate a previously unrecognised IE mechanism.This was not the consequence of lower numbers of mutations (median group 1: 56, group 2: 51).We assessed CD8 T-cell densities at the tumour margin using IHC (supplementary material, Table S2) and found that this was significantly (p = 0.045) lowest in the evolutionary stasis group (Figure 4D), suggesting that T-cell recruitment may be impaired.This was supported by the numerically lower expression of the CD8 T-cell chemoattractants CXCL9-11 compared with group 2 (Figure 4E).Regulatory T-cells were also more abundant in group 1, but not other immunosuppressive cells, such as myeloid-derived suppressor cells or macrophages (Figure 4B).A recent mouse study showed that cGAS-STING pathway activation promotes CD8 T-cell infiltration into MMRd CRCs [34].Loss of STING expression occurs in CRCs [35] and may impair immunogenicity, yet we detected STING in all group 1 tumours (Figure 4F).Overall, our data are most consistent with impaired recruitment rather than inactivation of CD8 T-cells as an explanation for the absence of IE evolution in group 1.

Distinct effects of impaired antigen presentation and IFN-γ signalling on immune infiltrates
We next assessed whether the disruption of antigen presentation versus the IFN-γ pathway had different impacts on immune infiltrates.We compared immune profiles of primary tumour regions that lost genes in the antigen presentation pathway (n = 7) with proficient regions (n = 4) in the same tumours.B2M was the affected gene in all cases.RNA-inferred activated CD4 and activated CD8 T-cells were significantly less abundant, whereas macrophages, immature B-cells, NK cells, and CD8 effector memory cells were significantly higher in regions with defective B2M (Figure 4G,H).The impact of IFN-γ pathway inactivation could not be assessed within tumours, as only one case contained both subclonal driver and wild-type regions.We therefore compared regions with IFN-γ pathway inactivation (n = 7) to those with defective B2M (n = 13).This revealed no significant difference in activated CD4 or CD8 T-cells, but lower infiltrates for most other immune cell subtypes, including NK cells, demonstrating broad depression of RNA-inferred immune infiltrates (Figure 4I).The enrichment of NK cells is an expected consequence of B2M loss and suggests opportunities for NK cell therapies [13].Sparse infiltrates of most immune cell subtypes in tumours with IFN-γ pathway defects indicate that distinct treatment strategies are required for these.

Mechanisms influencing PD-L1 expression
Expression of PD-L1 by cancer cells, cells in the tumour stroma (predominantly by immune cells, mainly macrophages, in the stroma [36]), or both, is a major non-genetic IE mechanism [37][38][39].To assess the heterogeneity of PD-L1 expression we categorised samples with accepted cut-offs (0: 0%; 1+: 1-4%; 2+: 5-49%; 3+: 50-100% [40]) based on the percentage of IHC PD-L1-positive cancer or stromal cells (supplementary material, Table S2).Stromal expression was common (1+ to 3+ staining in 97.1% of samples) and ITH was modest (Figure 5A,B).Only four of 19 tumours harboured PD-L1-expressing cancer cells and expression was detected in all regions of these.Cancer cell PD-L1 expression has been associated with more aggressive tumour behaviour than stromal PD-L1 expression in CRCs [39,41].We show that this can be a stable characteristic, similar to truncal IE through genetic mechanisms.Absent cancer cell PD-L1 expression in 15 cases was surprising in view of the abundant stromal PD-L1 expression, but is consistent with previous reports in MMRd CRCs [36].
We correlated PD-L1 expression with immune cell infiltrates to investigate the determinants of PD-L1 expression.Because of the bias towards higher immune infiltrates in lymph nodes, we only analysed primary tumour regions.Stromal PD-L1 expression correlated significantly with IHC-quantified CD8 T-cells (r = 0.355, Figure 5C).Comparison with 15 immune cell subtypes from RNA expression showed an even stronger correlation of stromal PD-L1 with activated CD8 T-cells (r = 0.507) and with activated CD4 and dendritic cells (Figure 5D,E).Thus, an adaptive immune response predominated by T-cells strongly associated with stromal PD-L1 expression.In contrast, cancer cell PD-L1 expression did not correlate with IHC-quantified CD8 T-cells and only weakly with activated CD8 T-cells inferred from RNA expression (Figure 5F).Thus, we hypothesised that PD-L1 expression on cancer cells requires additional permissive factors.
We investigated whether genetic drivers were specific to tumours with PD-L1-expressing cancer cells.CD274 (encoding PD-L1) was not amplified (supplementary material, Table S2).We found no enrichment of specific truncal drivers in tumours with PD-L1-positive cancer cells (supplementary material, Figure S3).Published data showed higher PD-L1 expression in CRCs with low expression of the intestinal homeobox transcription factor CDX2 but this did not distinguish PD-L1 expression by stromal or cancer cells nor between MMRd and MMRp tumours [42].In our series, CDX2 RNA expression was significantly negatively correlated with PD-L1-positive cancer cells (r = À0.683)but not with stromal PD-L1 (r = À0.269, Figure 5G).Remarkably, Genetic and immune landscape evolution in MMRd colorectal cancer 233 CDX2 expression was 24.7-fold lower in tumours with PD-L1-positive cancer cells.We sought to validate this by IHC [8] in an independent cohort of 23 MMRd CRCs (supplementary material, Table S7).CDX2 staining was low or absent in tumours with PD-L1-positive cancer cells and those without PD-L1-positive cancer cells showed significantly higher CDX2 (Figure 5H).Moreover, analysis of 57 CRC cell lines [43] confirmed a significant negative correlation of CDX2 and CD274 RNA expression (Figure 5I).

BR Challoner et al
Taken together, stromal PD-L1 was highest in tumour regions with abundant CD8 T-cells, where it probably protects cancer cells from immunemediated destruction, whereas PD-L1 expression by cancer cells was restricted to cases with low or absent CDX2.

IE and metastatic dissemination
We finally explored whether IE mechanisms in cancer cells (including drivers in IE genes or cancer cell PD-L1 expression) associated with the presence of metastases.We found no significant difference between the three evolutionary subtypes (Figure 6A) and the proportion of tumours harbouring any IE mechanism was similar between stage 1/2 tumours and Stage 3/4 tumours (Figure 6B).However, truncal IE mechanisms were exclusively identified in Stage 3/4 tumours (Figure 6C), suggesting that early acquisition of these may foster metastasis development.This was significant ( p = 0.044), but larger studies are required to substantiate this.

Discussion
Novel findings of this systematic analysis of MMRd CRC evolution include the identification of a clear hierarchy of clonal and subclonal driver pathways, the correlation of CD8 T-cell densities with phylogenetic patterns of IE driver evolution, and an association of PD-L1 expression in cancer cells with CDX2 loss.That T-cell densities did not correlate with mutation numbers is perhaps unsurprising, as all tumours had sufficient truncal mutations (909 extrapolated exonic mutations/ tumour) to encode for multiple neoantigens, even when considering conservative estimates that only 0.5-1% of somatic mutations give rise to HLA-presented neoantigens [32,33].The pervasive ITH in these tumours with high CPI response rates reinforces that abundant subclonal mutations do not seem to hinder effective immune responses during these treatments [19].The temporal hierarchy of driver evolution shows that the WNT pathway, RTK/MAPK pathway, and TGF-β receptor family members are critical for cancer initiation.The increasing prevalence of subclonal drivers in PI3K, DNA damage response and repair genes, histone Genetic and immune landscape evolution in MMRd colorectal cancer modifier, and IE genes support their relevance in promoting cancer progression.IFN-γ pathway inactivation is associated with resistance to PD1 inhibitors [12,14] and CTLA4 inhibitors [15] in lung cancers and melanoma.B2M inactivation correlated with better responses in MMRd CRCs.IE through mutations in these pathways is known to occur frequently in MMRd CRCs [23,[44][45][46], but neither these studies nor biomarker analyses in CPI-treated MMRd CRCs interrogated ITH.Our results show that clonality assessment of IE mutations must be taken into account for predictive CPI biomarker development from primary tumour tissue.Multiregion sequencing is unlikely to be clinically feasible.However, circulating tumour DNA sequencing is effective in CRC and methods to determine driver aberration clonality have been developed [47,48].
Tumours with subclonal IE evolution showed the highest CD8 T-cell densities.This may appear counterintuitive, but it is probably testament to ongoing evolutionary adaptation to the immune selection pressure that is active in these tumours at the time of sampling.Low immune cell infiltrates in cases with pan-tumour IE patterns are plausibly explained by the lack of antigen presentation or IFN-γ pathway inactivation.In contrast, the low immune infiltrates in tumours without any IE mutations suggest a novel mode of IE, as no known IE mechanisms were apparent.The specific mechanism could not be identified in this small cohort, but our data indicate T-cell recruitment may be impaired.The low CD8 T-cell densities and expression of T-cell chemoattractants further associated with low myeloid and dendritic cell abundancies.This may also represent failure to trigger any T-cell immune responses, for example through insufficient danger signals.IFN-γ pathway inactivation was associated with broad depression of RNA-inferred immune cell abundances, even in comparison with B2M inactivation.Whether IFN-γ pathway aberrations lead to CPI resistance is therefore an important question.
We finally found that PD-L1 expression by cancer cells strongly associated with CDX2 loss, suggesting a permissive effect for PD-L1 upregulation.Impaired CDX2 expression in bowel cancer confers an increased recurrence risk [8,49,50].This is thought to be a consequence of differentiation loss and increased invasiveness [51].Our data indicate that increased T-cell suppression through PD-L1 upregulation may contribute.However, the role and effect of other coinhibitors of immune responses, such as IDO or LAG3, was not assessed in our study.In addition, cancer cell PD-L1 expression conferred higher susceptibility to CPIs in mouse models [37,39].Whether CDX2 loss identifies tumours that particularly benefit from adjuvant CPIs should be investigated in ongoing trials (e.g.NCT02912559).
The small number of 19 MMRd CRCs is the main limitation of this study.Further validation of our findings in larger cohorts, preferentially from clinical trials that used immunotherapy, should be performed.As multiregion tissue samples are often unavailable, circulating tumour DNA-based assessment of ITH may be a suitable approach.Furthermore, measuring immune cells from RNA expression data can be error prone.These results need to be interpreted cautiously in the absence of an independent validation method.

Figure 1 .
Figure 1.Multiregion sequencing analysis of MMRd CRCs.(A) Ubiquitous and heterogeneous mutations detected in 191 of 194 genes (excluding HLA-A/B/C) by tumour region/metastasis.Stage, BRAF V600E mutations, and MMR protein expression are shown.(B) Fraction of heterogeneous mutations per region by MMR protein loss, BRAF mutation status, and stage.The horizontal lines represent the estimated marginal mean.Significance was assessed with the nested t-test.(C) Fraction of indels among all called mutations by MMR protein loss.(D) Representative phylogenetic trees generated from mutation calls with the PHYLIP parsimony algorithm Pars, annotated with metastatic dissemination timing and IE evolutionary group.Driver aberrations in TS or IE genes were mapped on the branch where the second disrupting aberration had been acquired and in oncogenes on the branch where a hotspot mutation had been acquired.T, tumour; R, primary tumour region; L, lymph node metastasis.Bootstrap values (%) were calculated for phylogenetic trees where three or more tumour regions were available for genetic analysis.

Figure 2 .
Figure 2. Driver aberrations and heterogeneity.(A) Heatmap of identified truncal or subclonal driver aberrations for each tumour.(B) Confirmation that distinct B2M mutations were located on different paired-end reads but never together.Screenshots were obtained from the Integrated Genomics Viewer.(C) Proportion of truncal and subclonal driver aberrations by functional group.(D) Mean AXIN2 expression in primary tumour regions by truncal driver aberrations in APC, RNF43, and CTNNB1.Horizontal lines represent the mean and significance was assessed using a t-test.

Figure 3 .
Figure 3.Immune landscapes of MMRd CRCs.(A) Fraction of CD8 T-cells (IHC) among all nucleated cells per primary tumour region by BRAF mutation status, MLH1 loss, and stage.The horizontal lines represent the estimated marginal mean.(B) CD8 T-cell fraction (IHC) in each primary tumour region and the mean for each tumour where at least three primary tumour regions were available.(C) Linear regression analysis of average CD8 T-cell fraction in the primary tumour and in matched lymph node metastases.The regression line (solid line) and 95% confidence intervals (dotted lines) are shown.(D) Euclidian clustering of 15 immune cell subtype abundances inferred from RNA expression data with ssGSEA;q values were generated with Morpheus with the t-test and multiple testing correction.Significance was assessed using the nested t-test (A) and a t-test (D).False discovery rate multiple testing correction was applied where q values are shown.p < 0.05 and q < 0.1 were considered significant.Correlation coefficients (r) were calculated using the Pearson test (C).

Figure 4 .
Figure 4. Coevolution of genetic and immune landscapes.(A) CD8 T-cell fraction (IHC) measured in 50 primary tumour regions from 15 MMRd CRCs that were classified into three groups based on IE driver evolution patterns.The horizontal lines represent the estimated marginal mean.(B) Abundance of 15 immune cell subtypes inferred from RNA expression data in the three evolution groups.Statistical results were not significant after multiple testing correction and values are not shown.(C) Sample-level data and means (horizontal bars) for immune cell subtypes in (B).(D) CD8 T-cell density at the tumour margin by evolution group.The horizontal lines represent the estimated marginal mean.(E) Expression of the CD8 T-cell chemoattractants CXCL9, CXCL10, and CXCL11.The horizontal lines represent the estimated marginal means.(F) Staining intensity for STING (using IHC) in cancer cells.(G) Comparison of primary tumour regions with B2M defects (n = 7) to all remaining primary tumour regions from the same cases (n = 4).The heatmap shows the mean abundance in regions with B2M defects minus the mean abundance in regions without defects.(H) Abundance difference of immune cell subtypes that were significant in (G), comparing tumour regions with B2M disruption (+) to regions without disruption (À).(I) Comparison of immune cell abundancies in all primary tumour regions with defects of IFN-γ pathway genes (n = 7) to all remaining primary tumour regions with B2M disruption (n = 13).The heatmap shows the mean abundance in regions with IFN-γ defects minus the mean abundance in regions with B2M disruption.Significance analyses were performed using the nested one-way ANOVA and Tukey's comparison of means test (A, D, and E), a paired t-test (G), and a nested t-test (I).

Figure 5 .
Figure 5. Distinct mechanisms influence PD-L1 expression by stromal and cancer cells.(A) Proportion of PD-L1-positive stromal cells and cancer cells by IHC in each analysed sample.(B) Examples of PD-L1 staining in stroma and cancer cells (S: stromal staining, C: cancer cell staining).(C) CD8 T-cell fraction (using IHC) in primary tumour regions by PD-L1 positivity in stromal cells.(D) Correlation of immune cell subtype abundance assessed by ssGSEA with stromal PD-L1 expression.(E) Individual sample data of activated CD8 T-cell abundance (inferred from RNA expression) in primary tumour regions by PD-L1 positivity.(F) CD8 T-cell fraction IHC (left) and activated CD8 T-cells from RNA expression (right) in primary tumour regions by PD-L1 positivity in cancer cells.(G) CDX2 RNA expression in primary tumour regions by PD-L1 positivity in cancer cells.(H) CDX2 staining intensity (0: absent, 1: weak, 2: moderate, 3: strong) by cancer cell PD-L1 expression in an independent validation cohort of 23 MMRd CRCs.(I) Linear regression analysis of PD-L1/CD274 and CDX2 mRNA expression in 57 CRC cell lines from the Cancer Cell Line Encyclopedia.Horizontal bars show the medians in (H) and means elsewhere.Significance was assessed using a Mann-Whitney test (H).False discovery rate multiple testing correction was applied where q values are shown.p < 0.05 and q < 0.1 were considered significant.Correlation coefficients (r) were calculated with Pearson (I) and Spearman rank (C-G) tests.

Table 1 .
Patients and pathological characteristics.
*T20 reclassified as MMRp following targeted gene sequencing and MMR IHC slide review.NOS, not otherwise specified.

Table 2 .
Identified aberrant IE genes incorporated into MMRd CRC evolutionary pattern classification.