Detection of disease‐causing mutations in prostate cancer by NGS sequencing

Abstract Gene mutations may affect the fate of many tumors, including prostate cancer (PCa); therefore, the search for specific mutations associated with tumor outcome might help the urologist to identify the best treatment for PCa patients, such as surgical resection, adjuvant therapy, or active surveillance. Genomic DNA (gDNA) was extracted from 48 paraffin-embedded PCa samples and paired normal tissues. Next, gDNA was amplified and analyzed by next-generation sequencing (NGS) using a gene panel specific for PCa. Raw data were refined to exclude false-positive mutations; thus, variants with coverage and frequency lower than 100× and 5%, respectively, were removed. Mutation significance was assessed with the Genomic Evolutionary Rate Profiling, ClinVar, and Varsome tools. Most of the 3000 mutations (80%) were single nucleotide variants, and the remaining 20% were indels. After raw data processing, 312 variants were selected. The most frequently mutated genes were KMT2D (26.45%), FOXA1 (16.13%), ATM (15.81%), ZFHX3 (9.35%), TP53 (8.06%), and APC (5.48%). Hotspot mutations in FOXA1, ATM, ZFHX3, SPOP, and MED12 were also found. Truncating mutations of ATM, lesions lying in hotspot regions of SPOP and FOXA1, and mutations of TP53 correlated with poor prognosis. Importantly, we also found some germline mutations associated with hereditary cancer-predisposing syndrome. gDNA sequencing of 48 cancer tissues by NGS allowed the detection of new tumor variants and confirmed lesions in genes linked to prostate cancer. Overall, somatic and germline mutations linked to good or poor prognosis could represent new prognostic tools to improve the management of PCa patients.
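The false-positive filter described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Variant` record, the field names, and the example calls are hypothetical, and the sketch assumes that a variant is retained only when it meets both thresholds (coverage of at least 100× and frequency of at least 5%).

```python
from dataclasses import dataclass

# Thresholds taken from the abstract; how they are combined is an assumption.
MIN_COVERAGE = 100     # minimum read depth (100x)
MIN_FREQUENCY = 0.05   # minimum variant frequency (5%)


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one called variant."""
    gene: str
    change: str
    coverage: int      # read depth at the variant position
    frequency: float   # fraction of reads carrying the variant (0-1)


def passes_filter(v: Variant) -> bool:
    """Keep a variant only if coverage AND frequency reach the thresholds."""
    return v.coverage >= MIN_COVERAGE and v.frequency >= MIN_FREQUENCY


def filter_variants(variants):
    """Return the variants that survive the false-positive filter."""
    return [v for v in variants if passes_filter(v)]


# Made-up calls for illustration only (not data from the study).
calls = [
    Variant("GENE_A", "change_1", coverage=850, frequency=0.48),
    Variant("GENE_B", "change_2", coverage=60, frequency=0.30),   # low coverage
    Variant("GENE_C", "change_3", coverage=400, frequency=0.02),  # low frequency
]
kept = filter_variants(calls)
print([f"{v.gene} {v.change}" for v in kept])
```

In this sketch a call is discarded as soon as either threshold is missed, which is the stricter reading of the sentence in the abstract.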


| INTRODUCTION
Prostate cancer (PCa) is the most common noncutaneous cancer in men in Europe, with the highest incidence of clinically diagnosed PCa found in Northern and Western Europe (Mottet et al., 2021). In the absence of early diagnosis, the mortality rate for PCa patients is very high, making it about the sixth most fatal cancer in men (Dejous & Krishnan, 2020). Patients with high-grade disease characterized by T3-4 stage, lymph node invasion, or extraprostatic extension have a high risk (more than 40%) of disease recurrence 5-10 years after diagnosis (Spratt et al., 2018). Currently, the main tool for PCa detection is the analysis of prostate-specific antigen (PSA) serum levels combined with digital rectal examination (DRE). However, PSA serum detection remains one of the most controversial topics in the urologic literature, since it leads to overdiagnosis and overtreatment of positive subjects (Mottet et al., 2021). Moreover, no benefit in either overall survival (OS) or cancer-specific survival (CSS) was observed in patients screened by PSA (Mottet et al., 2021). Prostate cancer treatments depend on tumor stage and include active surveillance (AS), surgery, hormone therapy, radiotherapy, or a combination of these treatments (Dejous & Krishnan, 2020). Moreover, early diagnosis and prediction of disease outcome are crucial to increasing patient OS (Dejous & Krishnan, 2020). Genomic alterations deeply affect cancer biology and disease course in tumors, including PCa. In particular, the fusion of the genes ERG and TMPRSS2 is one of the most frequent genomic alterations observed in PCa (Gasi Tandefelt et al., 2014). Moreover, somatic mutations in genes linked to tumor progression, such as oncogenes or tumor suppressor genes, were also identified (Gandhi et al., 2018). The detection of gene mutations linked to PCa outcome might improve the knowledge of this tumor, increasing prognostic tools and therapeutic options.

| Tissue collection
Paraffin-embedded tumor samples (23 GS6, 11 GS7, 11 GS8, and 3 GS9) were collected from 48 patients who underwent radical prostatectomy in the years 2010-2015. The diagnosis of cancer samples was evaluated by a genitourinary pathologist on hematoxylin and eosin (H&E)-stained slides. Selected samples (both tumor and normal tissues from the same patient) were cut into eight 10 µm sections, followed by a final 4 µm H&E-stained section to confirm tumor cellularity. This is a retrospective study approved by the Ethics Committee (no 151095). Written consent regarding tissue analysis and outcome data was collected for all enrolled cases. This study follows the guidelines of the Helsinki Declaration.

| Prostate panel design
(Frank et al., 2018; Robinson et al., 2015). The PC panel consists of two DNA primer pools (pool 1: 337 amplicons, pool 2: 331 amplicons) capable of amplifying coding regions of at most 150 bp in length to ensure optimal amplification. All gene information for the PC panel is reported in Table 1.

| Genomic DNA extraction, sample enrichment, and NGS sequencing
Genomic DNA (gDNA) was extracted with the QIAamp FFPE tissue kit (Qiagen) according to the manufacturer's instructions. gDNA quantity and quality were assessed using the Qubit® 2.0 fluorometer (Thermo Fisher Scientific) and the Qubit® dsDNA HS Assay Kit. gDNA was diluted to a final concentration of 5 ng/μl with deionized water.
Libraries were prepared from 10 ng of gDNA using the PC Panel.
Overall, gDNA was subjected to library preparation according to the Ion AmpliSeq Library Kit 2.0 (Thermo Fisher Scientific). Target regions were initially amplified (20 PCR cycles) by multiplex PCR; after thermal cycling, the amplicons produced from pool 1 and pool 2 were combined and partially digested. Next, they were ligated to barcoded adapters and purified. Before sequencing, libraries were quantified using the Agilent™ 2100  (Table S1). As shown in Figure

| Recurrent mutations
We identified some recurrent mutations in different subjects (Figure 2). In particular, the V1822D (n = 3) and G2502S (n = 4) substitutions in APC were considered benign variants.
The mutation E365K (n = 18) in ATM was classified as a variant of uncertain significance and showed a high frequency in our cohort (37.5%). The benign variant D1853N (n = 6) was also identified in this gene.
Interestingly, the mutation P72R was the most frequent germline variant in our cohort, present in approximately 40% of cases.
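The cohort frequencies quoted for the recurrent variants follow directly from the carrier counts and the cohort size of 48. The short Python sketch below reproduces that arithmetic; it assumes that each n counts distinct patients, and the function and variable names are illustrative only.

```python
COHORT_SIZE = 48  # number of patients in the study

# Carrier counts (n) reported in the text for recurrent variants.
recurrent = {
    ("APC", "V1822D"): 3,
    ("APC", "G2502S"): 4,
    ("ATM", "E365K"): 18,
    ("ATM", "D1853N"): 6,
}


def cohort_frequency(n_carriers: int, cohort_size: int = COHORT_SIZE) -> float:
    """Percentage of the cohort carrying a variant."""
    if not 0 <= n_carriers <= cohort_size:
        raise ValueError("carrier count must lie between 0 and the cohort size")
    return 100.0 * n_carriers / cohort_size


for (gene, change), n in recurrent.items():
    print(f"{gene} {change}: {n}/{COHORT_SIZE} carriers, "
          f"{cohort_frequency(n):.1f}% of the cohort")
```

For E365K this gives 18/48 = 37.5%, which matches the frequency reported above.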

| Hotspot mutations
We found hotspot mutations in different genes ( Figure  We found that about 66% of mutations in AR were located in the ligand-binding domain (LBD) and were characterized as pathogenic lesions. Three of these were close together, while the fourth was located at the end of LBD. Finally, we discovered several lesions where three of these variants lay very close together while the others were spread along this motif.

| Linkage between gene mutation and disease outcome
Mutations found in our cohort were matched with patient follow-up data. As shown in Figure 4, the percentage of mutated genes differed between the groups with good and poor prognosis. The mutation frequency of MED12, AR, CHD1, OR5L1, and KMT2D was lower in patients with poor prognosis. In particular, lesions found in KMT2D were much more common in the group of patients with good prognosis. Conversely, mutations detected in FOXA1, SPOP, ATM, and TP53 were mainly found in patients with poor prognosis, while the mutation percentage of APC, COL5A1, ZFHX3, and CDK12 was substantially unchanged. In more detail, different FOXA1 variants lying in the forkhead domain were linked to biochemical recurrence, as were those found in SPOP. Moreover, the truncating lesions R805X and L2692X as well as the substitution R3008H in ATM were associated with poor prognosis. Similarly, lesions in TP53 such as Y163H, T172Ifs, and R267P were associated with both higher Gleason score and tumor progression (Table 2).

FIGURE 1 Mutation frequency of genes related to prostate cancer detected in a cohort of 48 subjects by NGS analysis. The most mutated genes are KMT2D, FOXA1, and ATM, while in RB1, PTEN, and PIK3CA few variants were detected. NGS, next-generation sequencing

| Germline mutations and cancer familiarity
We detected different germline variants with likely pathological significance and a possible hereditary cancer-predisposing syndrome in our PCa cohort. In particular, these germline mutations were observed in 10 patients (about 20%) and affected several genes, including ATM, KMT2D, TP53, and CDK12. Many germline mutations were found in cases with metastasis and high Gleason score. In fact, of the 10 patients with germline variants, two had a Gleason score of 9, three of 8, four of 7, and only one of 6. The germline variants R3008H and R805X in ATM as well as the substitution P1275L in CDK12 correlated with a family history of cancer. In particular, we found that the mother of the case carrying the R3008H substitution suffered from breast cancer, while the patient carrying the truncating mutation R805X showed a strong family history of cancer. His father suffered from gastric carcinoma, while his mother was diagnosed with lung cancer.
In addition, two brothers died of lung carcinoma and a sister died of blood cancer (Figure 5). Finally, the mother of the case with the P1275L substitution in CDK12 suffered from breast cancer.
No hereditary cancer predisposition linked to the germline mutations K1992T, G2023R, and L2492R in ATM or to R466C, R5229H, and S5357T in KMT2D was observed (Table 3).
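The counts reported in this section can be set against the Gleason score (GS) breakdown of the whole cohort given under Tissue collection (23 GS6, 11 GS7, 11 GS8, and 3 GS9). The Python sketch below is a purely descriptive calculation, not a statistical test performed in the study; it assumes that the tumor-sample breakdown maps one-to-one onto the 48 patients, and all names are illustrative.

```python
# Gleason score (GS) breakdown of the 48 cases, from the Tissue collection section.
cohort_by_gs = {6: 23, 7: 11, 8: 11, 9: 3}

# Patients with germline variants per GS, as reported in this section.
carriers_by_gs = {6: 1, 7: 4, 8: 3, 9: 2}


def carrier_percentage(gs: int) -> float:
    """Percentage of cases with a given GS that carry a germline variant."""
    return 100.0 * carriers_by_gs[gs] / cohort_by_gs[gs]


total_carriers = sum(carriers_by_gs.values())
total_cases = sum(cohort_by_gs.values())
overall_percentage = 100.0 * total_carriers / total_cases

print(f"Carriers overall: {total_carriers}/{total_cases} "
      f"({overall_percentage:.1f}%)")
for gs in sorted(cohort_by_gs):
    print(f"GS{gs}: {carriers_by_gs[gs]}/{cohort_by_gs[gs]} "
          f"({carrier_percentage(gs):.1f}%)")
```

Under these assumptions, 10 of 48 patients (20.8%, the "about 20%" quoted above) carry a germline variant, and the carrier share is 1 of 23 among GS6 cases against 2 of 3 among GS9 cases. With groups this small, the comparison is only indicative.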

| DISCUSSION
The most common alteration found in prostate cancer is the fusion between the androgen-regulated TMPRSS2 gene and the ERG oncogene  (Ma et al., 2018). Mutations of residues F102, S119, W131, and F133 have already been observed in PCa (Barbieri et al., 2012; Boysen et al., 2015; Ma et al., 2018), while the lesion D130fs has never been detected before. The link between SPOP mutations and poor prognosis is not well defined, because some authors report that the impairment of SPOP function is associated with less adverse pathologic features and a favorable prognosis (Liu et al., 2018). Our observations indicate that all SPOP pathogenic lesions are found in patients who developed biochemical recurrence or lymph node metastasis, but they do not correlate with the most serious cases.
No link between CHD1 or CDK12 mutations and cancer progression was observed in our cohort, except for the germline variant P1275L in CDK12, which will be discussed later.

Cell Biology International
Mutations of FOXA1, a protein that functions as a pioneer factor to facilitate AR transactivation and PCa growth (Zhao et al., 2014), are very frequent in our cohort. FOXA1 is a transcription factor that modulates AR-driven transcription, and in PCa its mutations strictly affect residues of the forkhead domain (Barbieri et al., 2012).
Consistently, the most of Moreover, mutations in this region promote PCa progression by regulating the expression of genes that mediate EMT and metastasis (Gao et al., 2019). Furthermore, it was observed that FOXA1 mutations are associated with a worse clinical outcome (Shah & Brown, 2019). In our cases, most of the mutations found in the forkhead domain of FOXA1 are associated with biochemical recurrence.

| Tumor suppressor proteins
Many tumors, including prostate cancer, arise, develop, and expand due to mutations in tumor suppressor genes including KMT2D, PTEN, RB1, TP53, and ZFHX3. KMT2D is the most mutated gene in our cohort.
Eighty-three mutations were detected in this gene, suggesting that the dysfunction of this protein may affect prostate carcinogenesis. In fact, it is emerging that this gene is one of the most frequently mutated in a variety of tumors, including PCa (Guo et al., 2013).
Moreover, mutations in KMT2D are more frequent in metastatic than in primary tumors (Testa et al., 2019). In contrast to these observations, we report that mutations of KMT2D are prevalent in PCa patients with good outcome. On the other hand, most of the KMT2D mutations found in our cases have a low frequency or are classified as benign, except the somatic stop gain E568X, which is associated with biochemical recurrence. The germline variants R466C, R5229H, and S5357T will be discussed later.
FIGURE 5 Pedigree of a case with the germline mutation R805X in ATM. Subjects 1 and 2 died of gastric and lung cancer, respectively; Cases 3 and 7 died of lung carcinoma, and Subject 5 died of a hematological disease. The proband (Case 4) is alive and has suffered from prostate cancer, cholangiocarcinoma, melanoma, and two lung cancers.

(Sun et al., 2005, 2015), were identified. These are mainly clustered in a region lying between the fifth and sixth zinc-finger domains. It has been reported that the inactivation of ZFHX3 may correlate with tumor aggressiveness, especially in subjects with the deletion of chromosome 16q, which contains this gene (Sun et al., 2005). V274A, also considered pathogenic, is not linked to cancer progression; however, it was predominantly found in breast cancer (Végran et al., 2013).
Lesions in TP53 are associated with more aggressive disease not only in PCa but also in many other solid tumors (Mateo et al., 2020;Vodicka et al., 2021) and our data support these observations.

| Cell growth and invasion
We analyzed mutations in genes associated with cell proliferation and motility, such as COL5A1, PIK3CA, APC, and MED12. Mutations found in PIK3CA, COL5A1, and APC do not have a significant impact on patient outcomes in our cohort. Regarding MED12, it was reported that mutations in this gene are frequent in PCa (Barbieri et al., 2012). We detected variants of MED12 in 7 of 48 patients (14.5%). All pathogenic mutations detected in MED12 lie in the leucine-serine-rich domain except the variant A157T, suggesting that this protein region may be involved in the tumorigenesis of PCa. Indeed, this domain is strongly conserved, and mutations located inside this region are associated with prostate tumors (Barbieri et al., 2012; Kämpjärvi et al., 2016). Interestingly, some studies report that the missense mutation L1224F is a recurrent variant in prostate cancer (Barbieri et al., 2012), while others did not observe this lesion in any of their cases (Stoehr et al., 2013). We found this mutation in only one subject, who had a low tumor stage and no metastasis. Moreover, MED12 mutations found in our cohort do not correlate with cancer progression in most cases, suggesting that MED12 dysfunction may not be associated with tumor metastasis.

| Germline mutations and cancer familiarity
We searched for germline mutations that could be associated with inherited cancer. Ten heterozygous variants, also present in normal tissue, were detected in ATM, KMT2D, TP53, and CDK12. and shows a strong family history of cancer. In particular, the mother and father died of lung and gastric cancer, respectively. Furthermore, four siblings are deceased: two brothers from lung cancer, one sister from leukemia, and the second from a disease not linked to cancer (pedigree in Figure 5). The proband is alive and, in addition to prostate cancer, has been diagnosed with two lung tumors, one cholangiocarcinoma, and one melanoma. Currently, the truncating variant R805X has been described only in breast cancer; however, truncating mutations in ATM, such as stop-gain or frameshift variants, were also found in familial PCa (Karlsson et al., 2021). In addition, germline mutations of ATM are associated with gastric cancer as well as lung carcinoma (Parry et al., 2017). Taken together, these observations suggest that the lesion R805X could be associated with a high risk of developing tumors; moreover, pathogenic germline lesions in ATM could be considered possible markers for familial cancer.
We also found germline mutations in KMT2D; the variants R466C, R5229H, and S5357T are classified as variants of uncertain significance, and none of them is associated with familial cancer. However, patients carrying the R466C and R5229H substitutions developed biochemical recurrence and lung cancer, respectively. Consistently, it is known that KMT2D is among the most highly inactivated epigenetic modifiers in lung cancer (Alam et al., 2020). Interestingly, in a subject with advanced PCa and bone metastasis, we detected the germline mutation R267P in TP53. This variant causes the dysfunction of the TP53 protein and was already detected in both liver and lung carcinoma (Giacomelli et al., 2018). Unfortunately, this patient is deceased, and information about hereditary cancer predisposition is no longer available.
Finally, we identified the germline mutation P1275L of CDK12 in a patient who died of multiple cancers. In addition to PCa, this patient suffered from lung carcinoma and laryngeal cancer; moreover, his mother died of breast cancer. Importantly, the somatic mutation Y163H in TP53, which is associated with lung cancer, was also detected in this patient (Vega et al., 1997). The germline variant P1275L was observed in myeloproliferative neoplasms and in EGFR-mutated tumors (Jiang et al., 2018; Pratz et al., 2016), but its role in both prostate and breast cancer should be further investigated.

| CONCLUSIONS
NGS analysis performed on 48 prostate cancer tissues and the corresponding normal tissues allowed the detection of several lesions in TP53, ATM, FOXA1, and SPOP associated with cancer progression.
Moreover, we describe for the first time hotspot mutations in ZFHX3 and novel mutations in the hotspot region of FOXA1. Furthermore, this study led to the identification of different germline mutations, some of which were found in cases with familial cancer.
Our data indicate that mutations detected mainly in ATM and TP53 could be used as biomarkers for poor prognosis in prostate cancer. Moreover, mutations altering pathways involved in prostate carcinogenesis, including FOXA1-, SPOP-, and ATM-regulated signals, could be useful to discover new therapeutic targets for the treatment of metastatic PCa.

AUTHOR CONTRIBUTIONS
Gianluca Aguiari and Alessandra Mangolini designed the project.

CONFLICTS OF INTEREST
The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available in the supplementary material of this article.