Colibactin mutational signatures in NTHL1 tumor syndrome and MUTYH associated polyposis patients

Polyketide synthase (pks) island-harboring Escherichia coli are, under the right circumstances, able to produce the genotoxin colibactin. Colibactin is a risk factor for the development of colorectal cancer and is associated with mutational signatures SBS88 and ID18. This study explores colibactin-associated mutational signatures in biallelic NTHL1 and MUTYH patients. Targeted Next Generation Sequencing (NGS) was performed on colorectal adenomas and carcinomas of one biallelic NTHL1 and 12 biallelic MUTYH patients. Additional fecal metagenomics and genome sequencing followed by mutational signature analysis were conducted for the NTHL1 patient. Targeted NGS of the NTHL1 patient showed somatic APC variants fitting SBS88, which was confirmed using genome sequencing. Furthermore, fecal metagenomics revealed pks genes. Also, in 1 out of 12 MUTYH patients a somatic variant was detected fitting SBS88. This report shows that colibactin may influence development of colorectal neoplasms in predisposed patients.


| INTRODUCTION
Presence of colibactin is a risk factor for the development of colorectal cancer and adenomas.¹,² Colibactin is a genotoxin produced by specific bacteria harboring the polyketide synthase (pks) island, of which Escherichia coli (pks+ E. coli) is one. Mutational signatures associated with colibactin have been characterized and added to the COSMIC database as Single Base Substitution signature 88 (SBS88) and Insertion Deletion signature 18 (ID18).¹,³ Interestingly, a specific splice variant in APC, c.835-8A>G, was previously described to fit SBS88 and was recently proposed as a possible biomarker for the colibactin-associated mutational signature in cancer.²,⁴ As 20%-30% of the general population harbor pks+ E. coli, colibactin may play a role in colorectal cancer patients with and without hereditary colorectal cancer syndromes.⁵,⁶

| METHODS

| Targeted Next Generation Sequencing
DNA was isolated from Formalin Fixed Paraffin Embedded (FFPE) tissue using the Tissue Preparation System (Siemens). Ampliseq Next Generation Sequencing (NGS) libraries (ThermoFisher Scientific) were prepared according to the manufacturer's instructions. Sequencing was performed on an Ion GeneStudio S5 Series sequencer (ThermoFisher Scientific); raw reads were mapped against hg19 and variants were called using Torrent Variant Caller. Three NGS panels were used: a limited polyposis panel including APC, MUTYH, POLE and POLD1; a custom-made panel containing 20 colorectal cancer and polyposis-associated genes; and an Oncomine Comprehensive Assay (OCA) Plus (ThermoFisher) panel containing >500 genes.
H. Morreau, T. van Wezel, and M. Nielsen shared last authorship.
All T>N and delT variants were visualized using Integrative Genomics Viewer. T>N variants within the sequence context 5′-A-(N)-(T/A)-T-(T/A/G)-3′ were determined to fit SBS88.¹,³ DelT variants in a thymine homopolymer flanked at the 5′ side by an adenine homopolymer of 2-4 bases, with a total length of 5-6 base pairs, were determined to fit ID18.
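The SBS88 context rule above can be expressed as a short pattern match. The following is a minimal sketch, not the procedure used in this study (variants were inspected visually). It assumes that the fourth base of the five-base motif is the mutated thymine, and that a variant reported as A>N is evaluated on the opposite strand. The function names are hypothetical.

```python
import re

# Motif from the text: 5'-A-(N)-(T/A)-T-(T/A/G)-3'.
# Assumption: the fourth base (index 3) is the mutated thymine.
SBS88_MOTIF = re.compile(r"A[ACGT][TA]T[TAG]")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def fits_sbs88_context(context):
    """Check a 5-base context (5'->3', mutated T at index 3) against the motif."""
    return SBS88_MOTIF.fullmatch(context.upper()) is not None


def sbs88_window(seq, pos):
    """Extract the 5-base context around the 0-based position `pos`.

    For a reference T the window is read from the given strand; for a
    reference A it is read from the opposite strand, where the base is a T.
    Returns None for other bases or when the window runs off the sequence.
    """
    seq = seq.upper()
    ref = seq[pos]
    if ref == "T":
        start, end = pos - 3, pos + 2
    elif ref == "A":
        start, end = pos - 1, pos + 4
    else:
        return None
    if start < 0 or end > len(seq):
        return None
    window = seq[start:end]
    return window if ref == "T" else reverse_complement(window)


if __name__ == "__main__":
    # Toy sequences, not patient data.
    for seq, pos in [("GGATTTTGG", 5), ("CCAAAATCC", 3), ("GGCTCTCGG", 5)]:
        window = sbs88_window(seq, pos)
        fits = window is not None and fits_sbs88_context(window)
        print(seq, pos, window, fits)
```

Reading A>N variants from the opposite strand keeps a single motif definition for both orientations, which is why a variant reported as A>T can be shown with an ATTTT context.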

| Fecal metagenomics
DNA was extracted and libraries were prepared according to the manufacturer's protocol, and sequencing was performed on the NovaSeq6000 platform (Illumina). The analyses were performed in a manner partly comparable to the method described by Nooij et al.,⁷ but with direct read mapping. In short, reads mapping to GRCh38 were removed and the remaining reads were quality-trimmed. These reads were screened for the presence of the pks island by mapping to the colibactin gene cluster (accession ID AM229678), after which technical artifacts were removed. The preprocessing and pks screening workflows are available at https://git.lumc.nl/snooij/metagenomics-preprocessing and https://git.lumc.nl/snooij/screen_pks_in_polyposis_fecal_metagenomes.
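The final step of such a screen, summarizing how many pks genes are supported by mapped reads, could look like the sketch below. This is an illustration only and not the linked workflow: the gene names (clbA through clbS, the common naming of the 19 genes), the read-count threshold, the function name and the example counts are all assumptions.

```python
from string import ascii_uppercase

# Assumption: the 19 pks genes are named clbA through clbS.
PKS_GENES = [f"clb{letter}" for letter in ascii_uppercase[:19]]


def detected_pks_genes(read_counts, min_reads=1):
    """Return the pks genes with at least `min_reads` mapped reads.

    `read_counts` maps gene name to the number of reads mapped to that gene
    after artifact removal; genes absent from the mapping count as zero.
    The threshold is a hypothetical parameter, not a value from the study.
    """
    return [gene for gene in PKS_GENES if read_counts.get(gene, 0) >= min_reads]


if __name__ == "__main__":
    # Made-up counts for illustration; not data from this report.
    example_counts = {"clbA": 4, "clbB": 12, "clbC": 0, "clbN": 2}
    found = detected_pks_genes(example_counts)
    print(f"{len(found)} of {len(PKS_GENES)} pks genes detected: {', '.join(found)}")
```

A stricter threshold, or a requirement on the covered fraction of each gene, would reduce the influence of spurious alignments at the cost of sensitivity in low-abundance samples.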

| Genome sequencing
DNA was isolated from FFPE tissue blocks using the NucleoSpin DNA FFPE XS kit (BIOKE) according to the manufacturer's instructions. Sequencing was performed on the NovaSeq6000 platform (Illumina). The raw sequencing reads were aligned to a reference genome (GRCh38) and processed, and mutational signature assignment was performed using mSigAct::sparseAssignSignatures followed by the mSigAct signature presence test, as previously described.³

| RESULTS

| Biallelic NTHL1 patient
We describe the case of a 38-year-old man with a biallelic pathogenic germline NTHL1 variant (NTHL1 tumor syndrome; NTS) diagnosed with two colorectal cancers: a cT3bN1M1 adenocarcinoma of the rectum and a pT1 adenocarcinoma in a pedunculated polyp in the sigmoid colon. Furthermore, a non-advanced tubular adenoma in the ascending colon was removed by snare polypectomy. The patient had a maternal aunt with breast cancer at the age of 38 and a paternal grandfather with salivary duct cancer at an age above 80. Germline pathogenic variant analysis on leukocyte DNA and somatic mosaicism analysis on DNA isolated from the colorectal neoplasms were performed simultaneously. A homozygous germline pathogenic variant was identified in NTHL1: c.244C>T p.(Gln82*), alias p.(Gln90*). Targeted NGS on both colorectal carcinomas (T2-T3) and the tubular adenoma (T1) removed during index colonoscopy revealed the colibactin-associated APC variant c.835-8A>G in T2 and two other APC variants in T1 and T3, c.2008A>T and c.1600A>T, depicted in Figure 1A and Table 1. The sequence context of c.2008A>T (ATTTT) and c.1600A>T (ATTTT) showed that these two variants also fit SBS88.
Fecal metagenomics showed the presence of 6 of the 19 pks genes.
Although Formalin Fixed Paraffin Embedded (FFPE) material is not optimal for genome sequencing, mutational signature analyses revealed a significant enrichment of SBS88 in one (T1) of the two analyzed colorectal lesions (T1-T2). Neither of these lesions showed an enrichment of ID18 or SBS30 (associated with biallelic NTHL1 variants). The distribution of mutational signatures in T1 is depicted in Figure S1.
A BRAF variant, c.1781A>G, detected in biallelic MUTYH patient 3 fits SBS88. As shown in Figure 1B, this variant was found in an adenoma lacking variants fitting SBS36. Moreover, the adenoma (T1) shared two APC and two SMAD4 variants with another adenoma (T2), suggestive of a clonal relationship between the adenomas.
To detect additional somatic variants, an OCA Plus NGS panel was performed on T1, but the tumor mutational burden was too low to determine a mutational signature.

| DISCUSSION
In this study, NGS of a patient with a biallelic pathogenic NTHL1 variant showed somatic variants fitting SBS88. Presence of the colibactin-associated signature was confirmed using genome sequencing, and pks genes were detected in a stool sample using fecal metagenomics. A previous study describing exome sequencing of colorectal neoplasms of two biallelic NTHL1 patients showed 18 somatic T>N variants, but none of these variants fit SBS88.⁸ Another exome sequencing study of mono-allelic NTHL1 patients also did not show a contribution of colibactin-associated mutational signatures.⁹ This is therefore the first study to present a colibactin influence in a biallelic NTHL1 patient.
Strikingly, mutational signature analysis of this same patient did not show a contribution of SBS30. Although more research is needed, the previous study investigating the NTHL1 signature in multiple neoplasms of biallelic NTHL1 patients showed that SBS30 did not contribute to the same extent in all neoplasms.⁸
Furthermore, colorectal carcinomas and adenomas of 12 biallelic MUTYH patients were analyzed using NGS. This showed, as expected, the KRAS variant c.34G>T in the majority of samples (63%).¹⁰ Moreover, in one adenoma of 1 out of 12 patients a BRAF variant, c.1781A>G, was found fitting SBS88. Unfortunately, this colibactin-associated mutational signature could not be confirmed using a broad NGS panel. Still, this variant was recently described as one of the top 10 recurring somatic variants associated with SBS88-positive colorectal cancers.⁴ Therefore, this variant hints toward colibactin mutagenesis in this adenoma.
Also, the APC variants c.835-8A>G and c.1600A>T are described among these top 10 recurring variants, supporting our finding that they fit SBS88. Neither variant was common in the 3916 SBS88-negative colorectal cancers included in that article (c.835-8A>G: N = 18, 0.46%; c.1600A>T: N = 3, 0.08%). These findings suggest that these APC variants could be used as biomarkers for SBS88 lesions.
Although further research is required, since these numbers are low, this report highlights that the presence of pks+ E. coli might be considered an additional risk factor for the development of colorectal malignancies in patients with a known predisposition to colorectal cancer or polyposis.

FIGURE 1 (A) Biallelic NTHL1 (p.Q90*) patient with three colorectal lesions with APC variants fitting the colibactin mutational signature (SBS88) in red. (B) The only biallelic MUTYH patient with (two) colorectal lesions without the MUTYH-associated mutational signature (SBS36). One of these two lesions showed a BRAF variant fitting the colibactin mutational signature (SBS88) in red. Created with Biorender.com.

TABLE 1 Patient characteristics and variants found using targeted Next Generation Sequencing. Unless otherwise specified, all variants were considered to be (likely) pathogenic. N polyps, number of adenomas at time of collection; ad, adenoma; CRC, colorectal carcinoma; VAF, variant allele frequency; Other, variants in genes other than NTHL1, MUTYH, KRAS or APC. (3) Variants with unknown pathogenicity. a Variants fitting SBS88 (colibactin-associated mutational signature). b Variants fitting SBS36 (mutational signature associated with MUTYH inactivation).