Interpretation of somatic POLE mutations in endometrial carcinoma

Abstract
Pathogenic somatic missense mutations within the DNA polymerase epsilon (POLE) exonuclease domain define the important subtype of ultramutated tumours (‘POLE‐ultramutated’) within the novel molecular classification of endometrial carcinoma (EC). However, clinical implementation of this classifier requires systematic evaluation of the pathogenicity of POLE mutations. To address this, we examined base changes, tumour mutational burden (TMB), DNA microsatellite instability (MSI) status, POLE variant frequency, and the results from six in silico tools on 82 ECs with whole‐exome sequencing from The Cancer Genome Atlas (TCGA). Of these, 41 had one of five known pathogenic POLE exonuclease domain mutations (EDM) and showed characteristic genomic alterations: C>A substitutions > 20%, T>G substitutions > 4%, C>G substitutions < 0.6%, indels < 5%, TMB > 100 mut/Mb. A scoring system to assess these alterations (POLE‐score) was developed; based on their scores, 7/18 (39%) additional tumours with EDM were classified as POLE‐ultramutated ECs, and the six POLE mutations present in these tumours were considered pathogenic. Only 1/23 (4%) tumours with non‐EDM showed these genomic alterations, indicating that a large majority of mutations outside the exonuclease domain are not pathogenic. The infrequent combination of MSI‐H with POLE EDM led us to investigate the clinical significance of this association. Tumours with pathogenic POLE EDM co‐existent with MSI‐H showed genomic alterations characteristic of POLE‐ultramutated ECs. In a pooled analysis of 3361 ECs, 13 ECs with DNA mismatch repair deficiency (MMRd)/MSI‐H and a pathogenic POLE EDM had a 5‐year recurrence‐free survival (RFS) of 92.3%, comparable to previously reported POLE‐ultramutated ECs. Additionally, 14 cases with non‐pathogenic POLE EDM and MMRd/MSI‐H had a 5‐year RFS of 76.2%, similar to MMRd/MSI‐H, POLE wild‐type ECs, suggesting that these should be categorised as MMRd, rather than POLE‐ultramutated ECs for prognostication. This work provides guidance on classification of ECs with POLE mutations, facilitating implementation of POLE testing in routine clinical care.
© 2019 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of Pathological Society of Great Britain and Ireland.

Previous work has shown that ECs with a pathogenic POLE EDM typically display characteristic genomic alterations, with a high prevalence of C>A substitutions, frequently exceeding 20%; a low proportion of small insertion and deletion mutations (indels); and an extremely high tumour mutational burden (TMB; > 100 mut/Mb) [12,26]. In the pivotal 2013 EC study from The Cancer Genome Atlas (TCGA), all 17 tumours classified as ultramutated had a POLE EDM, including recurrent P286R and V411L substitutions (eight and five cases, respectively), and one case each of S297F, A456P, M444K, and L424I substitutions [7]. Interestingly, 10 of 231 non-ultramutated ECs in this study also had a POLE mutation either within or outside the exonuclease domain. Following the TCGA report, further studies have confirmed the prevalence of the five pathogenic mutations listed above and identified additional variants of uncertain pathogenicity. The parameters by which to evaluate the latter are ill defined, and thus classification of such cases is challenging, particularly in the absence of whole-exome or whole-genome sequencing (WES/WGS). In order to facilitate the classification of ECs in clinical practice, we aimed to develop a scoring system to estimate the pathogenicity of novel POLE mutations based on the presence or absence of genomic alterations associated with known pathogenic POLE mutations. We also sought to provide pragmatic guidelines for the interpretation of POLE variants in cases analysed by targeted POLE sequencing, where such comprehensive genomic data are unavailable. In doing so, we were mindful that designating a tumour as POLE-ultramutated EC may lead to treatment being withheld, given the very favourable prognosis of this EC molecular subtype, so a conservative approach to diagnosis is warranted.

Data extraction TCGA EC cohort
To analyse the base change proportions of the TCGA cohort of ECs (n = 530), we downloaded the MAF files [generated using Mutect for somatic point mutation calls as well as small insertions and deletions (indels)] from the Genomic Data Commons (https://portal.gdc.cancer.gov/; accessed 27 February 2019). The mutation count comprised somatic coding variants [single nucleotide substitutions (SNVs), including synonymous mutations, and indels]. To calculate tumour mutational burden (TMB), we used 38 Mb as the estimate of the exome size. Microsatellite status, as defined by the Bethesda Protocol classification [27], was obtained from the Genome Data Analysis Center (GDAC) database (https://gdac.broadinstitute.org/; accessed 30 October 2018).
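The TMB calculation described above reduces to a single division. The following is a minimal sketch, assuming the mutation count has already been extracted from the MAF file; the function and constant names are illustrative and are not taken from the study's own code.

```python
# Exome size estimate stated in the text (38 Mb).
EXOME_SIZE_MB = 38.0


def tumour_mutational_burden(n_coding_variants: int,
                             exome_size_mb: float = EXOME_SIZE_MB) -> float:
    """Return TMB in mutations per megabase.

    n_coding_variants is the count of somatic coding variants
    (SNVs, including synonymous mutations, plus indels).
    """
    if n_coding_variants < 0:
        raise ValueError("mutation count cannot be negative")
    return n_coding_variants / exome_size_mb


# Hypothetical tumour with 3800 coding variants: 3800 / 38 = 100 mut/Mb.
print(tumour_mutational_burden(3800))
```

With this exome size, a tumour needs more than 3800 coding variants to exceed 100 mut/Mb.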

Recurrence of somatic POLE mutations in EC and pan-cancer
We searched for each somatic POLE mutation in the complete TCGA (Genomic Data Commons) catalogues and COSMIC (https://cancer.sanger.ac.uk/cosmic, accessed 10 January 2019), annotating their recurrence across all cancer types (pan-cancer) and exclusively within ECs (supplementary material, Table S1). Recurrent mutations were defined as those present in two or more cancer samples in the COSMIC and TCGA databases combined (cases present in both databases were counted only once). A mutation was considered non-recurrent if it was found only once.
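The recurrence rule can be sketched as a set union followed by a count. This is an illustrative sketch only: it assumes that a case present in both databases can be recognised by a shared sample identifier, which the text does not specify, and the sample identifiers below are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Record = Tuple[str, str]  # (sample identifier, POLE mutation)


def recurrence_status(tcga: Iterable[Record],
                      cosmic: Iterable[Record]) -> Dict[str, bool]:
    """Map each mutation to True if recurrent (two or more distinct samples)."""
    # The set union counts a case present in both databases only once.
    unique_cases = set(tcga) | set(cosmic)
    counts = Counter(mutation for _sample, mutation in unique_cases)
    return {mutation: n >= 2 for mutation, n in counts.items()}


# Hypothetical records: sample S1 appears in both databases.
tcga_records = [("S1", "P286R"), ("S2", "A428T")]
cosmic_records = [("S1", "P286R"), ("S3", "P286R"), ("S2", "A428T")]
print(recurrence_status(tcga_records, cosmic_records))
```

Here P286R is seen in two distinct samples and is flagged as recurrent, whereas A428T is seen in one sample only.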

In silico prediction tools
To evaluate the functional status of somatic POLE mutations, we applied six widely used in silico tools: SIFT [30], PROVEAN [31], PolyPhen-2 [32], PANTHER [33], SNAP2 [34], and the meta predictor REVEL [35]. SIFT is a multi-step algorithm using sequence-based predictive features to predict the effect of single-nucleotide polymorphisms (SNPs) [30]. PROVEAN extends this approach, additionally incorporating analysis of in-frame insertions, deletions, and multiple substitutions [31]. PolyPhen-2 implements sequence-based and structure-based predictive features and compares wild-type and mutant alleles through a decision tree [32]; 'possibly damaging' results were interpreted as benign. PANTHER is based on protein sequence, using a metric based on evolutionary conservation on direct ancestors of the organism [33]; 'possibly damaging' and 'probably benign' results were interpreted as benign. SNAP2 is a neural network-based classifier that uses sequence and structural-based data as inputs [34]. REVEL is an ensemble method based on 13 individual tools [35]; scores below 0.5 were considered benign.
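The interpretation rules stated above for PolyPhen-2, PANTHER, and REVEL can be written as one small helper. This is a sketch under stated assumptions: the tool names and category strings are assumed to be supplied as plain text, REVEL scores of 0.5 or above are assumed to count as damaging (the text only states that scores below 0.5 are benign), and no rule is encoded for SIFT, PROVEAN, or SNAP2 because the text gives none.

```python
def is_predicted_damaging(tool: str, result) -> bool:
    """Collapse a tool's raw output to damaging (True) or benign (False)."""
    if tool in ("PolyPhen-2", "PANTHER"):
        # 'possibly damaging' (and PANTHER's 'probably benign') are read as
        # benign, so only 'probably damaging' counts as damaging.
        return str(result).strip().lower() == "probably damaging"
    if tool == "REVEL":
        # Scores below 0.5 are benign; >= 0.5 as damaging is an assumption.
        return float(result) >= 0.5
    raise ValueError(f"no interpretation rule is stated in the text for {tool!r}")


print(is_predicted_damaging("PolyPhen-2", "possibly damaging"))
print(is_predicted_damaging("REVEL", 0.72))
```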

DNA mismatch repair-deficient/microsatellite-unstable, POLE exonuclease domain-mutated endometrial cancer cohort
Tumours with concomitant mismatch repair deficiency (MMRd) and somatic POLE EDM, and clinical follow-up were identified from a pooled cohort of 2988 molecularly profiled ECs across ten participating institutes (a detailed description can be found in León-Castillo et al [36]). Informed consent and ethical approvals were obtained according to local protocols in each participating centre. These tumours were combined with five tumours with concomitant microsatellite instability and POLE EDM from the 2013 TCGA EC cohort [7] for survival analysis.

Statistical analysis
Nominal variables were compared by χ² statistics or Fisher's exact test and ordinal variables using the Mann-Whitney test. All statistical tests were two-sided and statistical significance was accepted at p < 0.05. We generated Kaplan-Meier curves for recurrence-free survival (RFS) and overall survival (OS), and differences were tested by the log-rank test. The median follow-up was estimated by the reverse Kaplan-Meier method.
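For readers unfamiliar with the Kaplan-Meier estimator used for the RFS and OS curves, the following is a minimal product-limit sketch in plain Python. It is not the software used in the study, and the follow-up times and event indicators are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event (e.g. recurrence) was observed at times[i]
    and 0 if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up in years; 1 = recurrence, 0 = censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 0]))
```

The reverse Kaplan-Meier estimate of follow-up applies the same estimator with the event indicator flipped, so that censoring is treated as the event.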

Genomic characteristics of endometrial cancers with somatic POLE mutations in the complete TCGA cohort
To elucidate which genomic alterations best define pathogenic somatic POLE mutations (which we use in this context to mean very likely causal for tumour ultramutation), we used data from 530 ECs profiled by TCGA, including those reported in the 2013 publication [7]. This included 82 tumours with a somatic POLE mutation, of which 59 (72%) were located within the exonuclease domain and 23 (28%) outside the exonuclease domain. The 59 exonuclease domain mutations comprised 21 unique variants, the five most common of which (P286R, 21 cases; V411L, 13 cases; S297F, 3 cases; A456P, 2 cases; and S459F, 2 cases) were classified as pathogenic based on previous reports [7,8,26] and designated as 'hotspot' POLE mutations for the purpose of this study (Table 1).
Of the 41 TCGA ECs with a somatic non-hotspot POLE mutation, 18 were located within the exonuclease domain. Comparing these with the 23 tumours with non-exonuclease domain mutations, non-hotspot POLE exonuclease domain-mutant ECs had a higher TMB (median 164.4 versus 42.8 mut/Mb) and C>A proportion (median 20.2% versus 10.8%), and a lower C>G proportion (median 0.5% versus 1.0%) and indel proportion (median 5.2% versus 9.5%) (Table 2).
MSI status was available for all TCGA ECs, of which 35/82 cases with somatic POLE mutations (42.7%) were MSI-H. Comparison between ECs with hotspot mutations and non-hotspot mutations within and outside the exonuclease domain revealed striking differences: only 4/41 (9.8%) of the TCGA ECs with one of the five hotspot mutations were MSI-H, whereas 14/18 (78%) ECs with a non-hotspot exonuclease domain mutation and 17/23 (74%) ECs with a non-exonuclease domain mutation were MSI-H (p < 0.0001). Analysis of the genomic architecture of these tumours revealed notable differences between groups. Tumours with hotspot POLE mutations and MSI had a high TMB (median TMB of 339.0 mut/Mb, > 100 mut/Mb in all four cases) and a high proportion of C>A and T>G substitutions (median 20.0% and 5.1%, respectively), with a low proportion of C>G substitutions (median 0.3%) and indels (median 2.8%) (Table 2). Tumours with non-hotspot POLE EDM and MSI had a lower TMB (median 207.1 mut/Mb, > 100 mut/Mb in 9/14 cases) and C>A and T>G proportions (median 10.8% and 1.6%, respectively), a similar proportion of C>G substitutions (median 0.5%), and higher indel proportion (median 6.7%) ( […] cancers carrying these mutations, and in differences in the genomic architecture of tumours harbouring both defects. Collectively, these data confirm that different POLE mutations vary in pathogenicity and underscore the need for its reliable estimation to ensure accurate patient classification.
Table 2. Tumour mutation burden and SNV/indel by POLE mutation location and tumour MSI status in TCGA endometrial cancers

Establishing a pathogenicity score for somatic POLE mutations
Motivated by our preliminary analyses, we next used the TCGA WES data to develop a scoring system to assess the pathogenicity of POLE mutations (defined as the likelihood that they are associated with the characteristic ultramutated phenotype), using the hotspot POLE mutations as a truth set.
Taking TMB and C>A, T>G, C>G, and indel proportions as the most discriminating genomic alterations for these pathogenic mutations, and building on previous work [26], we developed a pragmatic scoring system in which tumours scored 1 point for each of the following: TMB > 100 mut/Mb; C>A ≥ 20%; T>G ≥ 4%; C>G ≤ 0.6%; and indels ≤ 5%. All 41 TCGA ECs with a hotspot POLE mutation scored 3-5 points, while 13/41 (31.7%) ECs with a non-hotspot POLE mutation scored ≥ 3 points, including 8/18 with exonuclease domain mutations. By contrast, 19/23 tumours with POLE mutations outside the exonuclease domain had scores ≤ 2, the exceptions being three tumours with a score of 3 (each of which had a likely pathogenic mutation in POLD1: D316G, S478N, or L606M) and one scoring 5 points with a POLE R705W mutation. We therefore chose to focus on mutations in the exonuclease domain, given the infrequent association of non-exonuclease domain mutations with genomic alterations associated with the ultramutated phenotype.
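The five criteria listed above translate directly into a point count. The sketch below is illustrative rather than the authors' implementation; the class and field names are assumptions, and the two example profiles use the group medians reported for MSI tumours purely as stand-in values, not as data from individual tumours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TumourProfile:
    tmb: float         # tumour mutational burden, mut/Mb
    c_to_a_pct: float  # C>A substitutions, % of mutations
    t_to_g_pct: float  # T>G substitutions, % of mutations
    c_to_g_pct: float  # C>G substitutions, % of mutations
    indel_pct: float   # indels, % of mutations


def pole_score(p: TumourProfile) -> int:
    """Score 1 point for each criterion met (0 to 5 points)."""
    criteria = (
        p.tmb > 100,
        p.c_to_a_pct >= 20,
        p.t_to_g_pct >= 4,
        p.c_to_g_pct <= 0.6,
        p.indel_pct <= 5,
    )
    return sum(criteria)


# Stand-in profiles built from reported group medians (illustration only).
hotspot_msi = TumourProfile(339.0, 20.0, 5.1, 0.3, 2.8)
non_hotspot_edm_msi = TumourProfile(207.1, 10.8, 1.6, 0.5, 6.7)
print(pole_score(hotspot_msi), pole_score(non_hotspot_edm_msi))
```

The first profile meets all five criteria, whereas the second meets only the TMB and C>G criteria.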
To define a cut-off for pathogenicity, we applied the POLE-score to hotspot POLE-mutant, non-hotspot POLE EDM and control POLE wild-type ECs (MSS and MSI-H) in the TCGA cohort. Thirty-eight of 41 (92.7%) ECs with a hotspot POLE EDM had a POLE-score of ≥ 5 points (Figure 1). The remaining three tumours, all of which harboured a V411L mutation, scored 4 points. In contrast, of the 18 tumours with a non-hotspot POLE EDM, seven scored ≥ 4 points (all of which carried mutations recurrent in the TCGA or COSMIC EC databases: F367S, L424I, M295R, P436R, M444K, D368Y), five scored 3 points (four of which carried recurrent mutations: A465V, L424V, T278M, L424I; one with a non-recurrent A428T substitution), and six scored ≤ 2 points (one of which had a recurrent mutation). For comparison, all 321 MSS, POLE wild-type ECs scored ≤ 3 points and all 127 MSI-H POLE wild-type ECs scored ≤ 2.
Based on these data, we used a POLE-score of ≥ 4 points to define pathogenicity of POLE mutations in EC. When applying this cut-off, 48 ECs in the TCGA are classified as having pathogenic POLE EDM (all 41 cases with hotspot mutations and seven with non-hotspot variants), comprising 11 unique mutations, all of which are recurrent in TCGA/COSMIC (Table 3). ECs with a POLE-score ≤ 2 were classified as having non-pathogenic POLE EDM, based on the absence of genomic alterations associated with ultramutated phenotype. Cancers with a score of 3 (A465V, L424V, T278M, and A428T) were classified as having a variant of uncertain significance.
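The cut-offs described above map a POLE-score onto three categories. The following is a minimal sketch of that mapping, assuming the five-criterion score (range 0 to 5); the function name and label strings are illustrative.

```python
def classify_by_pole_score(score: int) -> str:
    """Map a POLE-score (0 to 5) to the category used in the text."""
    if not 0 <= score <= 5:
        raise ValueError("POLE-score must lie between 0 and 5")
    if score >= 4:
        return "pathogenic"
    if score == 3:
        return "variant of uncertain significance"
    return "non-pathogenic"


for s in (5, 4, 3, 2):
    print(s, classify_by_pole_score(s))
```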
To validate the POLE-score, we noted the contribution of COSMIC signature 10 in ECs with a POLE

Relationship between pathogenicity of somatic POLE mutations, microsatellite instability, and clinical outcome
The co-existence of POLE mutations and MMRd/MSI in EC [26,38] and the variation in its prevalence by POLE mutation location raise important questions about which is the initial, presumably dominant factor determining tumour phenotype and clinical outcome.
To further investigate this, we used the POLE-score to stratify TCGA cases into predicted pathogenic and non-pathogenic POLE mutations using a score of ≥ 4. Nine of 49 (18.4%) ECs with a predicted pathogenic POLE mutation (including four known hotspot mutations) were MSI-H, compared with 26/33 (78.8%) tumours with a predicted non-pathogenic mutation (p ≤ 0.0001, χ² statistic). Restricting the analysis to tumours with POLE EDM, 9/48 (18.8%) cases with a predicted pathogenic EDM (including hotspot mutations) were MSI-H, as opposed to 9/11 (81.8%) with a predicted non-pathogenic EDM (p ≤ 0.0001, Fisher's exact test). Interestingly, further stratification suggested a similar variation between likely pathogenic POLE mutations, as only 2/34 ECs with a P286R or V411L mutation were MSI-H, compared with 7/14 ECs with one of the other nine predicted pathogenic mutations (p = 0.0012). Thus, POLE mutations co-existent with MSI in EC are more likely to be non-exonuclease, non-pathogenic mutations, though this is not universally the case.
To investigate the clinical outcome of POLE exonuclease domain-mutant EC with concomitant MMRd, we identified 30 such patients from a pooled analysis of 3236 ECs (Table 4). Five-year recurrence-free survival (RFS) for this subgroup was 83.2%, with a 5-year overall survival (OS) of 80.9% (Figure 3) (corresponding figures for 24 patients with stage I disease were 84.2% and 85.4%, respectively) (supplementary material, Figure S1), seemingly contrasting with the 5-year RFS and OS of 92-100% previously reported for POLE exonuclease domain-mutant EC [4,5,7]. To clarify this, we stratified patients according to predicted pathogenic versus non-pathogenic EDM using the POLE-score and analysed their clinical outcome. For cases that lacked WES data and for which POLE EDM had not been previously described in the TCGA, we considered all mutations other than those listed in Table 3 (mutations deemed pathogenic using the POLE-score) as variants of uncertain significance (VUS).
This revealed that the 13 cases with one of the 11 mutations classified as likely pathogenic by POLE-score (Table 3) had a 5-year RFS of 92.3%, whereas that of the cases classified as likely non-pathogenic/VUS was 76.2% (p = 0.40, log-rank test) (Figure 3). While the clinical behaviour of tumours with combined MMRd/MSI and POLE EDM may vary based on the pathogenicity of the latter, this difference was not statistically significant, possibly owing to insufficient power/small numbers of cases, and it is not possible to determine the prognosis of this subgroup with certainty at present.
Figure 3. […] (Table 3) versus all other tumours MMRd-POLEmut (C and D).

Estimation of pathogenicity of somatic POLE mutations in the absence of exome or genome sequencing
Somatic mutation profiling in clinical practice is typically performed by targeted panel sequencing, rather than WES/WGS approaches at present. To develop a classification tool for ECs with somatic non-hotspot POLE mutations that can be implemented using such data, we used mutation location, prior data, and in silico tools which estimate the probability that a mutation is damaging. We first noted that nearly all (> 95%) POLE mutations outside the exonuclease domain are classified as non-pathogenic by POLE-score. We next noted that in the case of exonuclease domain mutations reported in the TCGA, the POLE-score can be used to estimate pathogenicity (Table 3). We finally noted that for POLE EDM not present in the TCGA, in silico prediction tools could be used to estimate pathogenicity. Further exploration of this revealed that 10/11 POLE EDMs classified as pathogenic by POLE-score in the TCGA cases were universally predicted to be disruptive by all six in silico tools, the exception being an L424I substitution predicted to be deleterious by five tools but benign by one (Table 1 and supplementary material, Table S2). However, of five POLE EDMs present in the TCGA but classified as non-pathogenic by POLE-score, one (S461L, POLE-score 2) was predicted to be damaging by all six tools, while another variant (E396G, POLE-score 1) was predicted to be damaging by four tools. Furthermore, of four mutations classified as being of uncertain pathogenicity with a POLE-score […]
[…] classification. The presence of a pathogenic POLE EDM is causal for ultramutated EC, a subtype associated with enhanced immune response [2,42] and excellent clinical outcome [6,7,13]. De-escalating adjuvant treatment in these patients is currently under investigation in the randomised PORTEC4a trial.
However, interpretation of POLE sequence variants is challenging due to lack of standardised criteria, other than for the most common 'hotspot' mutations for which pathogenicity is reliably established. We aimed to generate tools to estimate the pathogenicity of POLE mutations using WES data, and to guide the management of cases where comprehensive genomic profiling is not available. Using cases with recurrent 'hotspot' POLE EDMs as a truth set, we identified their characteristic genomic correlates to generate a 'POLE-score'. In addition to correctly classifying all cases with POLE hotspot mutations in the TCGA cohort, it classified a further six POLE EDMs as likely pathogenic. Four exonuclease domain mutations had a POLE-score of 3 and were classified as being of uncertain pathogenicity, while three cases with POLE mutations outside the exonuclease domain had a POLE-score of 3, all of which carried a plausibly pathogenic POLD1 mutation that could explain the mutational spectrum [8]. Intriguingly, a single case with a POLE mutation outside the exonuclease domain (R705W) was classified as pathogenic by POLE-score. The location of the mutation within the catalytic domain, close to the polymerase active sites, may explain this mutational spectrum; however, the clinical significance of this is unclear at present.
Because POLE-score relies on WES or WGS to estimate TMB and mutation proportions, it is unable to assign pathogenicity in the case of novel POLE mutations detected by targeted sequencing, where breadth is typically inadequate to estimate these parameters. Although this represents a potential challenge in clinical practice where targeted approaches are common, our pooled analysis suggests that this situation is uncommon, affecting only 0.7% of ECs at the time of writing, a figure that will drop over the coming years as more WES/WGS data are accrued. We found that pathogenicity of such variants is not reliably predicted by in silico tools, which have low specificity. We suggest an approach to these tumours (outlined in Table 6), which may guide the use of additional sequencing (e.g. WES) to permit calculation of POLE-score in these cases. Although WES remains relatively costly compared with targeted approaches, such outlay is modest against that of local or systemic therapy, and thus remains a possible approach for cases where a significant treatment decision hangs in the balance.
Our study confirms the complex relationship between POLE mutations and DNA mismatch repair deficiency/microsatellite instability in EC. Perhaps most straightforward are those with POLE mutations outside the exonuclease domain: these appear to be passengers secondary to the hypermutator phenotype and should be classified as MMRd ECs. Co-existence of POLE EDM with MSI/MMRd is relatively uncommon, occurring in 3.4% of cases in TCGA and 0.9% of cases of molecularly subtyped tumours in our pooled series (this variation probably reflects a combination of targeted sequencing with enrichment for pathogenic POLE mutations in the latter cases). This group of tumours is heterogeneous. Those with POLE mutations predicted as pathogenic by POLE-score and MSI had genomic architecture similar to POLE hotspot-mutant/MSS tumours, supporting their classification as POLEmut EC. Those with POLE mutations predicted as non-pathogenic by POLE-score and MSI more closely resembled POLE-wild-type MSI cases, supporting their classification as MMRd EC. POLE EDM in combination with MMR loss causes a distinct mutational signature in EC (COSMIC signature 14) [1,38]; the observation that this is not universal in cases with both defects supports the notion that these tumours are a heterogeneous group, where MSI/MMRd could be acquired after POLE EDM and vice versa, with differing impacts on prognosis. Interestingly, while data were limited, patients with combined pathogenic POLE EDM and MSI appeared to have a good clinical outcome in our pooled cohort (5-year RFS 92.3%), though additional cases are required before this can be concluded.
In conclusion, our work provides guidance in the diagnostic interpretation of POLE mutations in EC in the presence and absence of WES data. Tumours with any of the 11 POLE EDMs identified in the TCGA and classified as pathogenic by POLE-score should be classified as 'POLE ultramutated' EC, independently of MMRd/MSI status. For cases where a POLE EDM not present in the TCGA is identified and WES data are available, POLE-score can be used for classification. In the absence of WES data, classification should be informed by the results of POLE-score on mutations reported in the TCGA and classified in Table 3. In silico prediction tools have limited value but may be able to identify benign changes and triage cases for WES/WGS. The guidelines that we provide will evolve over time but will allow for almost all tumours encountered to be classified into a molecular subtype based on currently available information.
Figure S1. Clinical outcome of MMRd-POLEmut ECs
Table S1. POLE mutations reported in ECs in COSMIC or TCGA