Identification of neoantigens in esophageal adenocarcinoma

Esophageal adenocarcinoma (EAC) has relatively poor long-term survival and limited treatment options. Promising targets for immunotherapy are short peptide neoantigens containing tumor mutations, presented to cytotoxic T-cells by human leukocyte antigen (HLA) molecules. Despite an association between putative neoantigen abundance and therapeutic response across cancers, immunogenic neoantigens are challenging to identify. Here we characterized the mutational and immunopeptidomic landscapes of tumors from a cohort of seven patients with EAC. We directly identified one HLA-I presented neoantigen from one patient, and report functional T-cell responses from a predicted HLA-II neoantigen in a second patient. The predicted class II neoantigen contains both HLA-I and HLA-II binding motifs. Our exploratory observations are consistent with previous neoantigen studies in finding that neoantigens are rarely directly observed, and that the identification success rate following prediction is on the order of 10%. However, our identified putative neoantigen is capable of eliciting strong T-cell responses, emphasizing the need for improved strategies for neoantigen identification.


Esophageal adenocarcinoma (EAC) is the 14th most common cancer in the UK, with a 10-year survival of 12% [1]. Early-stage treatment of EAC involves resection of the esophagus, whereas later-stage disease is treated with chemoradiotherapy or chemotherapy followed by surgery [2]. Relative to other cancers, EAC is characterized by a high mutational burden, measured as the number of mutations per protein coding region [3]. Many of these mutations appear in EAC driver genes [4,5].

Tumor infiltrating lymphocytes (TILs), specifically cytotoxic CD8+ T-cells and CD4+ helper T-cells, recognize peptides of intracellular and extracellular origin, respectively, presented by class I and II human leukocyte antigen (HLA) molecules. Presented at the cell surface, these HLA-bound peptides form the immunopeptidome. Neoantigen peptides contain tumor mutations, making them attractive therapeutic targets because of their potential to elicit tumor-specific T-cell responses.

Progress in developing neoantigen vaccines has been hindered by the difficulty in identifying neoantigen targets, and by the challenge of overcoming the immunosuppressive tumor microenvironment. Direct identification using immunopeptidomics suggests that observing neoantigens is rare [6]. Attempts to predict neoantigens using HLA binding algorithms show that it is relatively straightforward to create a long list of putative neoantigens, but difficult to reliably select the immunogenic ones [7]. Large-scale studies of EAC report that the density of CD8+ T-cells correlates with the number of somatic mutations [4], but an analysis of the EAC immunopeptidome has yet to be performed.

Here we explore a proteogenomics approach combining whole exome sequencing (WES), gene expression (RNASeq), HLA immunopeptidomics and algorithmic neoantigen prediction to identify neoantigens in seven EAC patients.
We show that EAC has an abundance of somatic mutations and immunopeptides, and that whilst direct observation or prediction of immunogenic neoantigens remains challenging, we were able to identify two neoantigens in two patients, one by direct observation and one by prediction. These findings are an important step towards demonstrating the usefulness of neoantigen-based therapies for EAC.

We collected tissues comprising tumor and adjacent normal tissue, and peripheral blood mononuclear cells (PBMCs), from seven male individuals with esophageal adenocarcinoma (median age 68; Table 1). Whole genome sequencing data for three donors have been previously deposited as part of ICGC project ESAD-UK and EGA data set EGAD00001007785. We sequenced the exomes of tumor and normal tissues, and performed gene expression and immunopeptidomic analysis of the tumor tissues. PBMCs were used for HLA typing and for IFN-γ ELISpot functional assays for patient EN-181-11 (Figure 1) [8].

The mutational landscape of seven esophageal adenocarcinomas
To assess the likelihood of identifying HLA presented neoantigens we first examined the mutational landscape of the seven esophageal adenocarcinomas. Somatic mutations accumulate in the genome over time as cells divide, and in cancer the causes and patterns of somatic mutations help characterize the cancer type and explain its cells' selective advantage [3]. The total number of somatic mutations per coding region of the genome defines the mutational burden of the cancer type. It has been correlated with response to anti-PD1 therapy, and is therefore a proxy for the number of neoantigens presented by tumor cells [9,10]. Across cancers, median mutational burden ranges from 16 to over 300 mutations per megabase [3]. Here, the esophageal adenocarcinomas had a median mutational burden of 124 mutations/Mb, compared with a median of 40 mutations/Mb for normal adjacent tissue (Figure 2A, S1 Table).

[...] indicating the presence of TILs, a necessary but not sufficient requirement for a response to presented neoantigens.

In summary, the mutational landscape of our EAC samples is characterized by a high tumor mutational burden along with the presence of TILs, which are necessary conditions for neoantigen presentation and recognition, respectively.

Figure 2: The mutational landscape of seven esophageal adenocarcinomas. (A) The mutational burden of tumor and normal adjacent tissues from WES, assuming a whole exome size of 30 Mb. The bar marks the median. (B) The proportions of the four single base substitution mutational signatures in each EAC sample extracted from WES. The best fit signatures in the COSMIC v3 database are prefixed SBS. (C-F) The four mutational signatures extracted from WES of seven EAC samples. (G) The proportions of TILs estimated from bulk tumor RNASeq in each EAC sample. (H) The proportions of TILs estimated from bulk tumor RNASeq across the cohort. The bar marks the median.
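As a worked illustration of the burden calculation shown in panel A, the following minimal Python sketch converts a per-sample somatic mutation count into mutations per megabase, assuming the 30 Mb exome size stated in the caption. The function name and the example count are hypothetical and are not taken from the study's pipeline or data.

```python
EXOME_SIZE_MB = 30.0  # whole-exome size assumed in the Figure 2 caption


def mutational_burden(n_somatic_mutations: int,
                      exome_size_mb: float = EXOME_SIZE_MB) -> float:
    """Return the mutational burden of one sample in mutations per megabase."""
    if exome_size_mb <= 0:
        raise ValueError("exome size must be positive")
    if n_somatic_mutations < 0:
        raise ValueError("mutation count cannot be negative")
    return n_somatic_mutations / exome_size_mb


# Hypothetical count, chosen only to illustrate the arithmetic.
example_count = 3720
print(mutational_burden(example_count))  # 124.0
```

Under this assumption, a burden of 124 mutations/Mb corresponds to roughly 3,720 called somatic mutations across the exome.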
Immunopeptidome analysis reveals one putative neoantigen
We next sought to directly observe neoantigens present in the immunopeptidomes of our EAC samples. Using the mutations identified from WES we created individual databases appended with patient-specific mutated sequences (mutanomes) to search for neoantigens in their immunopeptidomes (Figure 1). In total we identified 41,535 HLA class I and II peptides isolated from these tumors by LC-MS/MS analysis at a false discovery rate of 1% (S1 Table). These immunopeptidomes comprised 24,095 unique class I and 8,023 unique class II peptides, forming characteristic HLA length distributions with modes of 9-mers and 15-mers for HLA class I and II peptides, respectively (Figure 3A).

Across the seven patients we identified only one putative neoantigen, from the HLA-I immunopeptidome of patient EN-454-11 (Figure 3B) [15]. This is an 8-mer peptide derived from Nucleolar protein 58 (Gene: NOP58, UniProt: Q9Y2X3, COSMIC: COSV51895876) with a hydrophobic glycine to basic arginine mutation at protein amino acid (AA) residue 97, peptide AA residue 5. The mutation at peptide residue 5 creates a sequence, DAKLRGVI, with anchor residues for HLA-B*08:01 (Figure 3C) [16,17]. Over 40% of the 650 HLA-B*08:01 peptides identified for EN-454-11 were 8-mers, consistent with previous reports of a secondary length preference for 8-mers for this allotype (Figure 3D) [17]. Unfortunately, there were insufficient PBMCs available to perform functional assays for this donor, so we next focused on predicted neoantigens from patient EN-181-11, for whom we could perform a functional assay.

[...] and B, and of 15-mer peptides for class II DRB1 allotypes were calculated using pVACseq [18,19].
Neoantigen rank score is calculated as a function of the predicted binding affinity, the neoantigen agretopicity (the relative increase in predicted binding affinity of mutant peptide to wildtype peptide), the variant allele frequency and gene expression levels (Figure 1, Materials and Methods). Predictions were performed for each peptide length and allotype combination, yielding 15 ranked tables comprising a total of 6842 peptides with binding affinity <500 nM for patient EN-181-11. Nine top ranking putative neoantigens were selected for functional analysis.

Table columns: Gene, WT peptide, MT peptide, Mutation, Length, HLA allotype, WT nM, MT nM, Agretopicity, Gene expression (tpm)

Figure 3: A single putative neoantigen identified from the immunopeptidomes of seven esophageal adenocarcinomas. (A) Histogram of 41,535 eluted HLA class I and II peptides from seven EAC samples. (B) MS/MS spectrum from donor EN-454-11 of putative HLA-

We synthesized both the neoantigen and wild type peptides at their specific lengths (Table 2) and tested their ability to stimulate T-cells present in PBMCs using an IFN-γ release cultured ELISpot assay (Figure 4A). We observed a strong response for only the putative class II neoantigen derived from collagen alpha-1(XII) chain (Gene: COL12A1, UniProt: Q99715). Closer examination of the COL12A1 neoantigen sequence revealed that the first ten amino acids, TLYLIVTDLK, contain the HLA-A*03:01 motif in addition to the HLA-DRB1*03:01 motif in the full-length TLYLIVTDLKTYQIG peptide (Figure 4B-C). Moreover, the observation of COL12A1 wild type peptides in both the class I and II immunopeptidomes of EN-181-11 indicates that this protein is presented in both antigen processing pathways by this tumor (Supporting Information).
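The nesting of a class I length peptide inside the class II 15-mer can be checked with a short enumeration. The sketch below is illustrative only: it lists every contiguous 8-11-mer within TLYLIVTDLKTYQIG and picks out those that start at the N-terminus. It performs no binding prediction, and the helper name `sub_peptides` is hypothetical rather than part of any tool used in the study.

```python
def sub_peptides(peptide: str, min_len: int = 8, max_len: int = 11) -> list[str]:
    """Return all contiguous sub-peptides with lengths in [min_len, max_len]."""
    return [
        peptide[start:start + length]
        for length in range(min_len, max_len + 1)
        for start in range(len(peptide) - length + 1)
    ]


class_ii_peptide = "TLYLIVTDLKTYQIG"  # 15-mer quoted in the text
candidates = sub_peptides(class_ii_peptide)

# Class I length candidates that begin at the N-terminus of the 15-mer.
n_terminal = [p for p in candidates if class_ii_peptide.startswith(p)]
print(len(candidates))             # number of nested 8-11-mers
print("TLYLIVTDLK" in n_terminal)  # True
```

Whether any of these nested peptides is actually processed and presented by HLA-I cannot be inferred from sequence alone, which is why the text turns to the wild type immunopeptidome observations.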

Here we report the first in-depth study of HLA presented neoantigens in esophageal adenocarcinoma, investigating both directly observed and predicted neoantigens from seven patients.

The mutational landscape of these EAC patients described by WES is consistent with previous characterizations of high mutational burden [13], mutational signatures with high proportions of C>T substitutions and evidence of chromosomal instability [4,5,20]. Gene expression analysis estimating the populations of infiltrating lymphocytes indicated that mutations yielding neoantigens may be detectable. However, only one neoantigen could be identified by direct examination using mass spectrometry-based proteomics of HLA-bound peptides eluted from tumor tissues. This is consistent with previous attempts at direct neoantigen identification across multiple cancer types [6,21,22].

Although we were unable to validate the functionality of this observed neoantigen due to the unavailability of PBMCs for this individual, the observed neoantigen had an optimum length and binding motif for one of the patient's HLA molecules. The G>R mutation changes this peptide from a wildtype peptide that would not be expected to bind to HLA, and therefore would not be presented, to a peptide that can bind and be presented. Hence, we believe that this is likely to be a true neoantigen, although we are unable to confirm whether it is also immunogenic.

Evidence from checkpoint inhibitor therapy and T-cell responses to predicted neoantigens suggests that neoantigens are effective at eliciting immune responses [23,24,25]. Therefore, for another patient for whom PBMCs were available, we used the mutational and gene expression information to generate ranked lists of predicted neoantigens for each HLA-A, HLA-B and HLA-DRB1 allotype [18,19].
We tested nine of the top ranked predicted neoantigens and their wildtype equivalents for their ability to stimulate T-cells in an IFN-γ release cultured ELISpot assay and found a single high responding neoantigen (>3,000 spots/million cells; the wildtype peptide did not respond). As with our attempts at direct identification, a one in nine success rate is comparable with previously reported attempts at predicting functional neoantigens [7]. The responding 15-mer peptide was an HLA-DRB1 predicted neoantigen, but on examination the first ten amino acids also comprised an HLA-A neoantigen for this patient.

The identification of a neoantigen containing both an HLA-I and an HLA-II motif corresponds with reports of primarily CD4 responsive neoantigens, even where neoantigens have been predicted as HLA-I peptides eliciting CD8 responses [24,25]. Similar observations of CD4 responses where CD8 responses would be expected to predominate have been reported in studies of viral pathogen-specific peptides [26]. A feature of many neoantigen studies is the use of long peptides containing the neoantigen sequence and the reliance on cellular machinery to process these 20-25mer peptides into appropriate HLA-I or HLA-II length neoantigens [25,27]. Without further deconvolution, such as pre-enrichment for CD8 T-cells prior to ELISpot or single cell RNAseq TCR analysis, it is unclear what peptide processing has occurred and what immune response is being observed [28]. Here we used the specific peptides, but there still remains uncertainty about whether this is a combined CD4/CD8 response or only a CD4 response. Likewise, the reasons for reports of predominantly CD4 responses, especially to [...], remain unclear.

The main limitation of our study was the availability of PBMCs for validation of putative neoantigens.
Our identification of a functional neoantigen in one patient suggests that we would be able to identify others across the cohort if we were able to test them.

Overall, this study confirms that either direct observation or prediction of functional neoantigens is rare with existing methodologies, and thus further work is required to increase the frequency of successful identification [28]. However, our study also demonstrates that identified neoantigens can yield strong immune responses in functional assays, highlighting the potential for the development of neoantigen-based T-cell vaccines and for expanding the treatment options for a cancer with low survival.

[...] plugin ProteinSeqs to derive the amino acid sequences arising from missense mutations for each sample for use in immunopeptide analyses.
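To make the idea of a mutated protein sequence concrete, the sketch below applies a single missense substitution to a protein string and lists the fixed-length peptides that span the mutated residue. It is a minimal illustration and does not reproduce the ProteinSeqs plugin or the study's database construction. The function names and the toy 15-residue protein are hypothetical: its flanking residues are invented and only the embedded 8-mer DAKLRGVI comes from the text.

```python
def apply_missense(protein: str, position: int, ref: str, alt: str) -> str:
    """Substitute one residue at a 1-based position, checking the reference."""
    index = position - 1
    if not 0 <= index < len(protein):
        raise ValueError("position outside protein")
    if protein[index] != ref:
        raise ValueError("reference residue does not match protein")
    return protein[:index] + alt + protein[index + 1:]


def peptides_spanning(protein: str, position: int, length: int) -> list[str]:
    """Return all peptides of the given length that include the 1-based position."""
    index = position - 1
    first = max(0, index - length + 1)
    last = min(index, len(protein) - length)
    return [protein[s:s + length] for s in range(first, last + 1)]


# Toy wildtype sequence with invented flanks (hypothetical, not NOP58).
toy_wildtype = "MSTQDAKLGGVIKEW"
toy_mutant = apply_missense(toy_wildtype, position=9, ref="G", alt="R")
eight_mers = peptides_spanning(toy_mutant, position=9, length=8)
print("DAKLRGVI" in eight_mers)  # True
```

Each spanning peptide places the mutated residue at a different position, so a search database built this way can match the observed peptide whichever window the cell actually presented.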

Neoantigen prediction
Variant call files were prepared for the pvacseq neoantigen prediction pipeline (version 1.5.1) [18,19] by adding tumor and normal DNA coverage, and tumor transcript and gene expression estimates, using vatools (version 4.1.0) (http://www.vatools.org/). Variant call files of phased proximal variants were also created for use with the pipeline [40]. Prediction of neoantigens arising from somatic variants was then performed using pvacseq with the patient HLA allotypes to predict 8-11mer peptides for class I HLA and 15-mer peptides for class II HLA-DRB allotypes. Eight binding algorithms were used for class I predictions (MHCflurry, MHCnuggetsI, NNalign, NetMHC, PickPocket, SMM, SMMPMBEC, SMMalign) and four for class II predictions