Microsatellite instability status determined by next‐generation sequencing and compared with PD‐L1 and tumor mutational burden in 11,348 patients

Abstract
Microsatellite instability (MSI) testing identifies patients who may benefit from immune checkpoint inhibitors. We developed an MSI assay that uses data from a commercially available next‐generation sequencing (NGS) panel to determine MSI status. The assay is applicable across cancer types and does not require matched samples from normal tissue. Here, we describe the MSI‐NGS method and explore the relationship of MSI with tumor mutational burden (TMB) and PD‐L1. MSI examined by PCR fragment analysis and NGS was compared for 2189 matched cases. Mismatch repair status by immunohistochemistry was compared to MSI‐NGS for 1986 matched cases. TMB was examined by NGS, and PD‐L1 was determined by immunohistochemistry. Among 2189 matched cases that spanned 26 cancer types, MSI‐NGS, as compared to MSI by PCR fragment analysis, had sensitivity of 95.8% (95% confidence interval [CI] 92.24, 98.08), specificity of 99.4% (95% CI 98.94, 99.69), positive predictive value of 94.5% (95% CI 90.62, 97.14), and negative predictive value of 99.2% (95% CI 98.75, 99.57). High MSI (MSI‐H) status was identified in 23 of 26 cancer types. Among 11,348 cases examined (including the 2189 matched cases), the overall rates of MSI‐H, TMB‐high, and PD‐L1 positivity were 3.0%, 7.7%, and 25.4%, respectively. Thirty percent of MSI‐H cases were TMB‐low, and only 26% of MSI‐H cases were PD‐L1 positive. The overlap between TMB, MSI, and PD‐L1 differed among cancer types. Only 0.6% of the cases were positive for all three markers. MSI‐H status can be determined by NGS across cancer types. MSI‐H offers distinct data for treatment decisions regarding immune checkpoint inhibitors, in addition to the data available from TMB and PD‐L1.


Introduction
Microsatellite instability (MSI) involves the gain or loss of nucleotides from microsatellite tracts, which are DNA elements composed of repeating motifs that occur as alleles of variable lengths [1]. MSI can result from inherited mutations or originate somatically. Lynch syndrome results from inherited mutations of known mismatch repair (MMR) genes. Tumors are classified as MMR-deficient (dMMR) if they have somatic or germline mutations in MMR genes.
MSI can also occur due to epigenetic changes or altered microRNA pathways affecting MMR proteins, or without a loss of a known underlying protein [2]. MSI is most commonly found in colon and endometrial cancers (the most common Lynch syndrome cancer types); however, recent analyses have found MSI in as many as 24 cancer types, suggesting that MSI is a generalized cancer phenotype [3][4][5][6].
MSI has been associated with improved prognosis, but until the recent advent of immune checkpoint inhibitors, the predictive use of MSI has been limited. A proof-of-concept study including 87 patients with 12 different cancer types demonstrated the value of MSI status in predicting the response of solid tumors to the anti-PD-1 agent pembrolizumab [5,7]. This ability of MSI to predict pembrolizumab response led to the first tumor-agnostic drug approval by the FDA in May 2017. Additional evidence showed an improved response for MSI-high (MSI-H) patients to the anti-PD-1 agents nivolumab and MEDI0680, the anti-PD-L1 agent durvalumab, and the anti-CTLA-4 agent ipilimumab [7][8][9][10]. These results elevate MSI status as a third, possibly independent, predictive biomarker for immune checkpoint inhibitors, along with PD-L1 and tumor mutational burden (TMB) [11][12][13][14][15][16][17]. Given that patient responses to these drugs can be highly durable [5,7,18], it is critical to identify as many potential responders as possible. Therefore, a method to efficiently determine MSI status for every cancer patient is needed.
Currently, MSI is most commonly detected through polymerase chain reaction (PCR) by fragment analysis (FA) of five conserved satellite regions, which is considered the gold standard method for MSI detection [1,19]. FA is not ideal in the clinic, however, as it requires samples of both tumor and normal tissue. As a result, FA is not always feasible for cases with limited amounts of tissue, including the analysis of cancer metastases, which are commonly submitted as biopsies and may contain few normal cells. Additionally, MSI determination by FA and MMR analysis by immunohistochemistry (IHC) are performed as stand-alone tests and would be inefficient to perform on every patient with cancer, given that the incidence of MSI is only about 5% across cancer types [5].
As broad tumor profiling becomes a common part of care for patients with cancer, it would be preferable to determine MSI status from sequencing panel results. Next-generation sequencing (NGS) was recently found to be feasible to determine MSI status, but the published techniques require the use of paired tumor and normal tissue [3,6]. Given a large database of samples with both broad NGS results and matching MSI/dMMR status by FA/IHC, we hypothesized that we could develop and technically validate an NGS-based MSI assay without the need for matched samples from normal tissue. Here, we describe our process for developing such a method and explore the relationship of MSI with other immunotherapy markers, specifically TMB and PD-L1.

Patient cohort
For development of the NGS assay, 2189 cases were retrospectively selected based on having data available for both the 592-gene sequencing panel and MSI testing by PCR-FA (assay details below). For the TMB, PD-L1, and MSI-NGS comparison, 11,348 patients were retrospectively selected based on available data from commercial comprehensive sequencing profiles performed on their tumors by a commercial laboratory (Caris Life Sciences, Phoenix, AZ, USA) that included PD-L1 by immunohistochemistry (IHC) and the 592-gene sequencing panel. This research used a collection of existing data that were deidentified prior to analysis. As this research was compliant with 45 CFR 46.101(b), the project was deemed exempt from IRB oversight and consent requirements were waived.

Fragment analysis by PCR
MSI-FA was tested by the fluorescent multiplex PCR-based method (MSI Analysis: Promega, Life Sciences, Madison, WI, USA).

Next-generation sequencing
NGS was performed on genomic DNA isolated from formalin-fixed paraffin-embedded (FFPE) tumor samples using the NextSeq platform (Illumina Inc., San Diego, CA, USA). A custom-designed SureSelect XT assay (Agilent Technologies, Santa Clara, CA, USA) was used to enrich the 592 whole-gene targets that comprised a 592-gene NGS panel. All variants were detected with >99% confidence based on allele frequency and baited capture pulldown coverage with an average sequencing depth of over 500X and an analytic sensitivity of 5% variant frequency.

Microsatellite instability by NGS
Microsatellite loci in the target regions of a 592-gene NGS panel were first identified using the MISA algorithm (pgrc.ipk-gatersleben.de/misa/), which revealed 8921 microsatellite locations. Subsequent analyses excluded sex chromosome loci, microsatellite loci in regions that typically have lower coverage depth relative to other genomic regions, and microsatellites with repeat unit lengths greater than five nucleotides. These exclusions resulted in 7317 target microsatellite loci.
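The three exclusion rules above can be sketched as a simple filter. This is a minimal illustration, not the pipeline used in the study: the `MicrosatelliteLocus` record, its field names, and the `low_coverage` flag are hypothetical, and how low-coverage regions were actually defined is not stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicrosatelliteLocus:
    # Hypothetical record layout for one candidate locus.
    chrom: str          # chromosome name, e.g. "chr2"
    start: int          # start coordinate of the repeat tract
    unit: str           # repeat motif, e.g. "A" or "CA"
    low_coverage: bool  # assumed flag: region typically has low depth


SEX_CHROMOSOMES = {"chrX", "chrY"}
MAX_UNIT_LENGTH = 5  # repeat units longer than five nucleotides are excluded


def keep_locus(locus: MicrosatelliteLocus) -> bool:
    """Return True if the locus passes all three exclusion rules."""
    return (
        locus.chrom not in SEX_CHROMOSOMES
        and not locus.low_coverage
        and len(locus.unit) <= MAX_UNIT_LENGTH
    )


def select_target_loci(loci):
    """Keep only the loci that survive the exclusions, in input order."""
    return [locus for locus in loci if keep_locus(locus)]


if __name__ == "__main__":
    candidates = [
        MicrosatelliteLocus("chr2", 1000, "A", False),
        MicrosatelliteLocus("chrX", 2000, "CA", False),      # sex chromosome
        MicrosatelliteLocus("chr7", 3000, "AT", True),       # low coverage
        MicrosatelliteLocus("chr9", 4000, "AAGGCC", False),  # unit too long
    ]
    print(len(select_target_loci(candidates)))
```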
Patient DNA was sequenced by NGS using the 592-gene panel. We examined the 7317 target microsatellite loci and compared them to the reference genome hg19 from the UCSC Genome Browser database (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/). The number of microsatellite loci that were altered by somatic insertion or deletion was counted for each patient sample. Only insertions or deletions that increased or decreased the number of repeats were considered. A locus was not counted more than once even if it had multiple lengths of insertions or deletions. Thresholds were calibrated based on comparison of total number of altered loci per patient to MSI-FA results with the aim to maximize sensitivity while maintaining an appropriately high specificity, positive predictive value (PPV), and negative predictive value (NPV).
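The counting rule can be illustrated with a short sketch. The `Indel` record and its fields are hypothetical. The test for "changes the number of repeats" is an assumption: here an indel qualifies when its length is a nonzero whole multiple of the repeat unit length, which may differ from how the assay makes that call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indel:
    # Hypothetical record for one indel call at a target microsatellite.
    locus_id: str       # identifier of the microsatellite locus
    unit_length: int    # length of the repeat unit at that locus (1-5 nt)
    length_change: int  # signed size in nt: positive insertion, negative deletion


def changes_repeat_count(indel: Indel) -> bool:
    """Assumed criterion: the indel adds or removes whole repeat units."""
    return (
        indel.length_change != 0
        and indel.length_change % indel.unit_length == 0
    )


def count_altered_loci(indels) -> int:
    """Count distinct loci with at least one qualifying indel.

    Collecting locus identifiers in a set ensures a locus contributes
    once even when it carries indels of several different lengths.
    """
    return len({i.locus_id for i in indels if changes_repeat_count(i)})


if __name__ == "__main__":
    sample = [
        Indel("locus_A", 2, +2),  # one repeat unit gained
        Indel("locus_A", 2, -4),  # same locus, second allele: still one locus
        Indel("locus_B", 2, +1),  # does not change the repeat count
        Indel("locus_C", 1, -1),  # one mononucleotide repeat lost
    ]
    print(count_altered_loci(sample))
```

The resulting per-sample count is the quantity that is compared against the calibrated threshold.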

Total mutation burden
TMB was calculated based on the number of nonsynonymous somatic mutations identified by NGS while excluding any known single nucleotide polymorphisms (SNPs) in dbSNP (version 137) or in the 1000 Genomes Project database (phase 3; http://www.internationalgenome.org/) [20]. TMB is reported as mutations per Mb sequenced. The threshold for determining high TMB as greater than or equal to 17 mutations/megabase was established by comparing TMB with MSI by FA in CRC cases, based on reports of TMB having high concordance with MSI in CRC [7,21].
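The TMB calculation described above amounts to a filtered count divided by the sequenced territory. The sketch below is illustrative only: the mutation dictionaries, the `known_snp_ids` set (standing in for dbSNP and 1000 Genomes lookups), and the `megabases_sequenced` argument are hypothetical, and the size of the sequenced region is not given in the text.

```python
HIGH_TMB_THRESHOLD = 17.0  # mutations/Mb, the cutoff stated in the text


def tumor_mutational_burden(mutations, known_snp_ids, megabases_sequenced):
    """Return nonsynonymous somatic mutations per Mb, excluding known SNPs.

    `mutations` is an iterable of dicts with hypothetical keys
    "id" (variant identifier) and "nonsynonymous" (bool).
    """
    if megabases_sequenced <= 0:
        raise ValueError("megabases_sequenced must be positive")
    counted = [
        m for m in mutations
        if m["nonsynonymous"] and m["id"] not in known_snp_ids
    ]
    return len(counted) / megabases_sequenced


def is_tmb_high(tmb: float) -> bool:
    """High TMB is defined as greater than or equal to the threshold."""
    return tmb >= HIGH_TMB_THRESHOLD


if __name__ == "__main__":
    calls = [
        {"id": "var1", "nonsynonymous": True},
        {"id": "var2", "nonsynonymous": False},  # synonymous: not counted
        {"id": "var3", "nonsynonymous": True},   # known SNP: not counted
        {"id": "var4", "nonsynonymous": True},
    ]
    tmb = tumor_mutational_burden(calls, {"var3"}, megabases_sequenced=0.5)
    print(tmb, is_tmb_high(tmb))
```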

PD-L1 IHC
IHC analysis was performed on slides of FFPE tumor samples using automated staining techniques. The procedures met the standards and requirements of the College of American Pathologists.
The primary antibody against PD-L1 was SP142 (Spring Bioscience, Pleasanton, CA, USA), except for NSCLC tumors tested after January 2016. For NSCLC tumors tested after January 2016, the primary PD-L1 antibody clone was 22C3 (Dako, Santa Clara, CA, USA). For the calculations in this manuscript, staining for both antibodies was considered positive if there was staining on ≥1% of tumor cells.

Results
Matched MSI-FA PCR and 592-gene NGS assays from 2189 cases (Fig. 1 and Table 1) were used to calibrate the MSI-NGS assay to classify samples as MSI-H or microsatellite stable (MSS). A cutoff of ≥46 altered loci was chosen with the goal of optimizing the performance of the MSI-NGS test in CRC and endometrial cancers, which are the cancer types for which MSI testing has traditionally had the highest clinical relevance (Fig. 1). Lower cutoffs resulted in unacceptably high levels of MSS-FA CRC cases (Fig. 1) [...]
The relationship between TMB, MSI, and PD-L1 was explored by analyzing 11,348 cases that had results for all three assays (Fig. 2A and Table 3). In this set, the overall rate of MSI-H was 3.0%. Overall, the rate of high TMB was 7.7%, and PD-L1 positivity was 25.4%. Among MSI-H cases, 70% were also high TMB (62.6% with CRC cases removed). Among high TMB cases, 27% were also MSI-H. Only 0.6% of the cases were positive for all three markers, while 69.5% of the cases were negative for all three. Of the total cohort, only 26% of MSI-H cases were PD-L1 positive compared to 44% of high TMB cases.
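The overlap analysis can be thought of as assigning each case a three-part marker profile and tallying the combinations. The sketch below uses the cutoffs stated in the text (≥46 altered loci, ≥17 mutations/Mb, ≥1% of tumor cells stained); the tuple layout of a case and the example values are hypothetical and are not data from this study.

```python
from collections import Counter

MSI_H_CUTOFF = 46         # altered microsatellite loci
TMB_HIGH_CUTOFF = 17.0    # mutations/Mb
PDL1_POSITIVE_CUTOFF = 1  # percent of tumor cells stained


def marker_profile(altered_loci, tmb, pdl1_percent):
    """Return (is_msi_h, is_tmb_high, is_pdl1_positive) for one case."""
    return (
        altered_loci >= MSI_H_CUTOFF,
        tmb >= TMB_HIGH_CUTOFF,
        pdl1_percent >= PDL1_POSITIVE_CUTOFF,
    )


def tabulate_profiles(cases):
    """Count how many cases fall into each marker combination."""
    return Counter(marker_profile(*case) for case in cases)


if __name__ == "__main__":
    # Illustrative values only: (altered loci, TMB, PD-L1 percent).
    cases = [
        (50, 20.0, 5.0),   # positive for all three markers
        (10, 3.0, 0.0),    # negative for all three markers
        (46, 10.0, 0.0),   # MSI-H only
        (0, 17.0, 1.0),    # high TMB and PD-L1 positive, MSS
    ]
    for profile, n in tabulate_profiles(cases).items():
        print(profile, n)
```

Dividing each count by the number of cases gives proportions of the kind reported above, such as the fraction of cases positive for all three markers.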
The overlap between the biomarkers TMB, MSI, and PD-L1 differed among cancer types (Fig. 2B-H). High TMB and MSI-H had 95% overlap for CRC, which was expected, as the TMB cutoff was based on CRC MSI-FA results. However, only 57% of MSI-H endometrial cancer cases were also high TMB. Likewise, ovarian, neuroendocrine, and cervical cancers also had significant percentages of MSI-H cases that were not TMB-high. In contrast, NSCLC and melanoma had few or no MSI-H cases, while still having a significant number of high TMB cases.
Certain cancer types showed interesting relationships regarding MSI and TMB (Fig. 3). In both CRC and

Discussion
MSI-H cancers are a genetically defined subset of cancers with the potential for enhanced responsiveness to anti-PD-1 therapies [5][6][7]. Determining MSI status across cancer types offers the opportunity to identify patients who are likely to respond while avoiding unnecessary toxicities for patients identified as unlikely to respond. Here, we show the development of a sensitive and specific MSI assay by NGS that is comparable to the existing gold standard of PCR-FA methods without requiring matched samples from normal tissue.
The method was calibrated with 2189 cases across 26 cancer types that had both MSI-FA and 592-gene NGS results. This number of matched samples between FA and NGS is a substantially larger calibration set than that used in another published NGS-MSI method [22]. Previously published data using the MSI-NGS method described [...] [23]. Likewise, here, we identified MSI-H in 23 of 26 cancer types. The detection of MSI-H cases in this extensive list of cancer types supports the concept that MSI may be a generalized cancer phenotype [3].
Notably, MSI-H cases that were not TMB-H or PD-L1-positive occurred in significant percentages of ovarian (24%), neuroendocrine (57%), and cervical (33%) cancers. With the recent approval of pembrolizumab for MSI-H patients of any solid tumor type, this subset of patients now has a promising treatment that would not have been identified using either of the other two immunotherapy biomarker assays. Given the lack of overlap of MSI and high TMB in several cancer types, these data do not support substituting TMB analysis with MSI-NGS or vice versa. If future clinical studies show significantly reduced response rates of TMB-low/MSI-H or TMB-high/MSS tumors to pembrolizumab, then this conclusion can be reconsidered.
This MSI-NGS assay has good concordance with the FA method for CRC (100% sensitivity and 99.9% specificity), but its performance is slightly reduced when looking across all cancer types (95.8% sensitivity and 99.4% specificity; PPV of 94.5%). As the FA test was developed for CRC, MSI-NGS discrepancies in non-CRC cancer types may be due to other loci being involved in these cancer types that are not measured by the FA method. This raises the possibility that some of the FA PCR results could be false negatives, rather than the corresponding MSI-NGS results being false positives. Future studies investigating responses to immunotherapies in these discordant patients will help to identify the clinical relevance of these discrepancies. This NGS assay, with broader microsatellite coverage, may be a better predictor of response than the FA assay, which is limited to five microsatellite sites.
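Concordance figures of this kind follow from a 2×2 comparison of MSI-NGS calls against the reference method. The sketch below shows the standard definitions; the counts in the example are purely illustrative and are not taken from this study.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard 2x2 diagnostic metrics against a reference test.

    tp: MSI-H by both tests       fp: MSI-H by NGS, MSS by reference
    fn: MSS by NGS, MSI-H by ref  tn: MSS by both tests
    """
    return {
        "sensitivity": tp / (tp + fn),  # reference positives detected
        "specificity": tn / (tn + fp),  # reference negatives confirmed
        "ppv": tp / (tp + fp),          # positive calls that are correct
        "npv": tn / (tn + fn),          # negative calls that are correct
    }


if __name__ == "__main__":
    # Illustrative counts only, not data from the study.
    metrics = diagnostic_metrics(tp=90, fp=5, fn=10, tn=895)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on how common MSI-H is in the tested population, the same sensitivity and specificity can yield different predictive values in cohorts with different MSI-H prevalence.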
The use of NGS to determine MSI status offers significant advantages over FA by PCR. Due to the large number of microsatellite regions analyzed, this method of NGS analysis of MSI does not require a sample of normal tissue for comparison. The comparison of a large number of microsatellite sequences to a reference human genome was able to provide a level of sensitivity comparable to that achieved using only a few microsatellites and comparing to a normal sample from the same patient.
Thus, with this method, it is feasible to determine MSI status for patients who do not have available normal tissue or for whom it would be a burden to obtain. Coupling the calculation of MSI to data that are already generated by a broad NGS panel allows for MSI status to be determined efficiently for any patient who is already receiving broad NGS results, without adding the cost of an additional stand-alone test or consuming additional tumor tissue that could be used for other testing. Further, while FA by PCR was optimized to analyze CRC [24], our NGS analysis of MSI is a pan-cancer method whose development was technically validated across 26 cancer types.
IHC testing for MMR protein is commonly performed on CRC and endometrial cancer cases to test for Lynch syndrome. Clinical evidence indicates that treatments with the PD-1 inhibitors pembrolizumab and nivolumab both lead to favorable responses in patients with dMMR tumors [5,7,18]. Our NGS-MSI assay has only 87.1% sensitivity for dMMR detection compared to MMR-IHC (Table 2). However, the proteins measured by standard MMR-IHC (MLH1, MSH2, MSH6, and PMS2) are not equal in their contribution to the mismatch repair process. Previous research on endometrial carcinoma found that most MSI-H tumors had loss of MLH1 and PMS2, with concordant loss of the MLH1/PMS2 heterodimer in 48% and with MSI-H in 97% of PMS2-negative cases [25]. As such, there may be a subset of dMMR cases with relatively low microsatellite alterations, which are identified as MSS by NGS, that have lower rates of response to PD-1 inhibition compared with cases that are both MSI-H and dMMR. This hypothesis is supported by data indicating that the subset of dMMR CRC cases called MSS by FA was much less likely to respond to nivolumab than MSI-H cases [18]. Until more data are available, the best choice may be to run both MSI-NGS and MMR-IHC, in lineages where MMR-IHC loss is more common, to identify as many patients as possible. In addition to this question of magnitude of clinical response for dMMR/MSS patients, MMR-IHC testing may be impractical for malignancies with low rates of microsatellite instability as these tests require dedicated slides, consuming valuable tissue for a low yield of pathogenic findings.
Current NCCN guidelines recommend MSI and MMR proficiency testing on patients with colon and endometrial cancer. Considering the landscape of the site-agnostic approval of pembrolizumab for patients with MSI-H cancers, the testing recommendation should now be expanded to include all patients with advanced solid tumors lacking satisfactory treatment options. The method of MSI-NGS addresses many of the disadvantages of both FA and MMR-IHC, thus providing an ideal platform to measure MSI status in all tumors. MSI-NGS is easily added to other malignancy-specific molecular panels, requires no extra tissue, and has lower marginal cost when FA is considered as an add-on test that must be performed along with an NGS panel. With the evolution in cancer care toward molecularly defined diagnoses, validation of NGS measurement of MSI status provides a needed mechanism for all patients with cancer, regardless of malignancy, to achieve testing that can determine whether a potentially life-extending agent may be appropriate.
A question remains regarding the clinical relevance of measuring both TMB and MSI. MSI is measured by NGS through counting insertions or deletions of 2-5 nucleotides in specific areas of the genome known to accumulate errors in microsatellites. In contrast, TMB was measured here by counting nonsynonymous mutations across the sequenced portion of the genome. Therefore, TMB can capture a wider range of mutational signatures because it covers the genome more broadly. While most MSI-H cases are high TMB, the opposite is not true. The comparison here of TMB and MSI in CRC is limited by the fact that the threshold for TMB was based on the CRC MSI-FA results. Our cutoff for high TMB of ≥17 mutations/Mb is similar to the recently published cutoff values of >13.8 and >20 mutations/Mb [6,26]. True biological differences in TMB and MSI appear to exist in certain cancer types. For example, tumors driven primarily by environmentally caused mutations (NSCLC and melanoma) have a much higher proportion of cases with high TMB than MSI (Fig. 3) compared to tumors that are not as strongly associated with environmental causes.
Potential selection bias may limit the ability to extrapolate from this study. The 11,348 cases included in these comprehensive genomic analyses by NGS are generally from patients with advanced, refractory disease who lacked obvious treatment options. This could lead to some bias in the reported MSI frequencies; for example, CRC MSI-H rates are lower in advanced disease than in the overall CRC population [4].
In conclusion, we have used a large database to develop a method to determine MSI status using NGS results. Specifically, this MSI-NGS method is applicable across cancer types and does not require matched normal samples, thus providing an alternative for patients with limited tissue. The investigation of the relationship among TMB, MSI, and PD-L1 revealed a population with MSI-H disease, but low TMB and no PD-L1 expression, thus expanding the pool of potential immunotherapy recipients. Until more clinical data are available to show how these markers work together, the best option may be to continue to measure all three to ensure that as many patients as possible benefit from these drugs.