PMS2 or PMS2CL? Characterization of variants detected in the 3′ of the PMS2 gene

Germline pathogenic variants in PMS2 are one of the major causes of Lynch syndrome and constitutional mismatch repair deficiency. Variant identification in the 3′ region of this gene is complicated by the presence of the pseudogene PMS2CL, which shares high sequence homology with PMS2. Consequently, short-fragment screening strategies (NGS, Sanger) may fail to determine whether a variant lies on the gene or on the pseudogene. Using a comprehensive analysis strategy, we assessed 42 NGS-detected variants in 76 patients and found 32 localized on PMS2 and 6 on PMS2CL. Interestingly, four variants were detected on either PMS2 or PMS2CL in different patients. Clinical phenotype correlated well with genotype, making it very helpful in variant assessment. Our findings emphasize the necessity of more specific complementary analyses to confirm the gene origin of each variant in each individual in order to avoid variant misinterpretation. In addition, we characterized two PMS2 genomic alterations involving Alu-mediated tandem duplication and gene conversion. These mechanisms seem to be particularly favored in PMS2 and contribute to frequent genomic rearrangements in the 3′ region of the gene.


| INTRODUCTION
Heterozygous pathogenic variants (PVs) in the DNA mismatch repair (MMR) genes MLH1, MSH2, MSH6, and PMS2 are responsible for Lynch syndrome (LS), an autosomal dominant cancer syndrome. Carriers are at increased risk of developing LS-related cancers, most frequently colorectal and endometrial cancers. 1 In rare cases, biallelic alterations in MMR genes cause constitutional mismatch repair deficiency (CMMRD), a distinct childhood cancer predisposition syndrome characterized by brain tumors, hematological malignancies, and early onset of LS-related tumors. 2 The PMS2 gene is reportedly responsible for about 5%-15% of LS, although PMS2 alterations are estimated to be more prevalent in the population than alterations of the other MMR genes. 3 In fact, because of reduced penetrance, PMS2 germline monoallelic PVs are less frequently identified in families manifesting a typical LS-related phenotype. A number of PV carriers may have few or no clinical features of LS, and the detection of a PMS2 PV may be fortuitous. Meanwhile, germline PMS2 biallelic inactivation is the most common cause of CMMRD. However, molecular screening for PMS2 PVs is complicated by the presence of numerous pseudogenes sharing sequence homology with PMS2, in particular the pseudogene PMS2CL. This pseudogene results from an inverted 16 kb duplication of the 3′ part of the PMS2 sequence and is located on the same chromosome, approximately 700 kb centromeric to PMS2. It shares up to 98% homology with the PMS2 sequences encompassing exons 9 and 11-15, with identical coding sequences for exons 12 and 15. 4 Consequently, short-fragment screening technologies like NGS or Sanger sequencing may fail to align sequences to the correct reference gene, leading to false-negative or false-positive variant calls. In addition, frequent genomic rearrangements between PMS2 and PMS2CL further complicate the screening by generating hybrid alleles. 5 As a result, no variant can be systematically assigned to the PMS2 gene or the PMS2CL pseudogene without confirmation by specific approaches. 7,8 Undoubtedly, it is essential to ensure that the detected variants are truly located on the PMS2 gene for appropriate clinical follow-up.
Here, we report the assessment of 42 unique variants detected in this region using a comprehensive analysis strategy including long-range PCR and transcriptional analysis.
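The alignment ambiguity described above can be illustrated with a minimal Python sketch. It is not part of the analysis strategy used in this study: the sequences are short invented strings rather than real PMS2 or PMS2CL sequence, and the function names `count_mismatches` and `assign_read` are hypothetical. A read is assigned to whichever reference it matches with fewer mismatches, and it is flagged as ambiguous when both references match equally well, as is expected wherever the two paralogs are identical.

```python
def count_mismatches(read: str, reference: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(read) != len(reference):
        raise ValueError("read and reference must have the same length")
    return sum(a != b for a, b in zip(read, reference))


def assign_read(read: str, gene_ref: str, pseudogene_ref: str) -> str:
    """Assign a read to the closer reference, or report it as ambiguous."""
    gene_mismatches = count_mismatches(read, gene_ref)
    pseudogene_mismatches = count_mismatches(read, pseudogene_ref)
    if gene_mismatches < pseudogene_mismatches:
        return "PMS2"
    if pseudogene_mismatches < gene_mismatches:
        return "PMS2CL"
    return "ambiguous"


# Invented toy sequences (assumption: NOT real PMS2/PMS2CL sequence).
IDENTICAL_REGION = "ACGTACGTAC"     # stands in for a region identical in both paralogs
GENE_WITH_PSV = "ACGTACGTAC"        # gene reference at a discriminating position (index 4)
PSEUDOGENE_WITH_PSV = "ACGTTCGTAC"  # pseudogene reference, differs from the gene at index 4

variant_read = "ACGTACGAAC"         # read carrying a variant at index 7

# No discriminating base in the window: the read cannot be placed.
print(assign_read(variant_read, IDENTICAL_REGION, IDENTICAL_REGION))   # ambiguous
# A discriminating base is covered: the read is closer to the gene.
print(assign_read(variant_read, GENE_WITH_PSV, PSEUDOGENE_WITH_PSV))   # PMS2
```

In this toy setting, a short read that does not overlap any discriminating position cannot be placed by sequence alone, which is the motivation for using longer, gene-specific amplicons.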

| Patients and samples
Patients suspected of having cancer syndromes were identified through genetic consultations conducted by clinical geneticists in health care centers throughout France. Genetic variant screening was first carried out by NGS in each local Cancer Genetics Laboratory after informed consent was obtained. Further assessment of variants located in the 3′ region of PMS2 was requested by the laboratory that performed the NGS analysis, and DNA or blood samples were sent to our laboratory with the available clinical and family information.
MSI and immunostaining data were provided when available.

| PMS2 variant assessment
For blood samples, DNA was isolated using standard procedures. 10,11 Using nested PCR, individual exons were amplified from the long-range PCR products for PMS2 and PMS2CL, respectively, and then sequenced by the Sanger method. For large genomic rearrangements, multiplex ligation-dependent probe amplification (MLPA, kit P008, MRC Holland, Amsterdam, The Netherlands) was first carried out following the manufacturer's instructions. If a large deletion was suspected or detected by MLPA, long-range PCR was subsequently applied to determine or confirm whether the deletion originated from the PMS2 gene or the PMS2CL pseudogene.
Variant description was based on the Human Genome Variation Society (HGVS) nomenclature guidelines, using the reference sequence NM_000535.6 (LRG_161t1) with c.1 corresponding to the first nucleotide of the coding sequence (www.hgvs.org/varnomen). Interpretation and classification of the variants were based on the American College of Medical Genetics and Genomics (ACMG) criteria, adapted by the French oncogenetic network with the addition of MSI/loss of expression as a supporting element (https://anpgm.fr/), and on the current working version of the International Society for Gastrointestinal Hereditary Tumors (InSiGHT) criteria (https://www.insight-group.org/criteria/). 12 As for clinical relevance, Lynch syndrome-related cancers refer to colorectal and endometrial cancers and cancers of the ovary, stomach, small intestine, and urinary and biliary tracts, as well as brain tumors and skin tumors in relation to the LS variants CMMRD, Turcot, or Muir-Torre syndromes.
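As an illustration of the variant descriptions used in this report (the prefix c. for positions on the PMS2 coding sequence and n. for positions on PMS2CL), the following Python sketch splits a simple description into its prefix, position range, and change. It is a simplified assumption-laden helper, not an HGVS validator: the name `parse_simple_description` and the `SimpleVariant` type are hypothetical, and only single positions or `start_end` ranges followed by a change string are handled (no intronic offsets, protein-level descriptions, or reference-sequence prefixes).

```python
import re
from typing import NamedTuple


class SimpleVariant(NamedTuple):
    prefix: str   # "c" (coding, used here for PMS2) or "n" (non-coding, used here for PMS2CL)
    start: int    # first affected position
    end: int      # last affected position (equal to start for single positions)
    change: str   # remainder of the description, e.g. "del", "delinsG", "C>T"


# Hypothetical, deliberately narrow pattern: prefix, position or range, then the change.
_PATTERN = re.compile(
    r"^(?P<prefix>[cn])\.(?P<start>\d+)(?:_(?P<end>\d+))?(?P<change>\D.*)$"
)


def parse_simple_description(description: str) -> SimpleVariant:
    """Parse descriptions such as 'c.2404C>T' or 'n.1126_1127del'."""
    compact = description.replace(" ", "")  # tolerate stray spaces, e.g. "C > T"
    match = _PATTERN.match(compact)
    if match is None:
        raise ValueError(f"unsupported description: {description!r}")
    start = int(match["start"])
    end = int(match["end"]) if match["end"] is not None else start
    return SimpleVariant(match["prefix"], start, end, match["change"])


for text in ("c.2192_2196del", "c.2404C > T", "n.1122_1124delinsG"):
    print(parse_simple_description(text))
```

Separating the prefix from the position makes it explicit which reference a description is written against, which matters here because the same sequence change can be reported with a c. or an n. description depending on where it is localized.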

| PMS2 transcriptional analysis
RNA was isolated either directly from blood samples collected in PAXgene Blood RNA Tubes (PreAnalytiX, Switzerland) or from lymphoblastoid cell lines established from the patients. Reverse transcription PCR was performed using primers anchored in exon 10, which is absent from PMS2CL, and the products were subsequently analyzed by agarose gel electrophoresis followed by Sanger sequencing.

| RESULTS
A total of 42 NGS-detected variants from 76 a priori unrelated patients underwent further characterization, since each unique variant could be located either on the PMS2 gene or on the PMS2CL pseudogene in different patients owing to frequent genomic rearrangements in this region. Clinical information on cancer diagnosis was available for the majority of the patients analyzed (n = 72; 95%), and tumor phenotype (MSI and/or immunostaining data) was available for 24% of patients (n = 18). Using long-range PCR for both PMS2 and PMS2CL, complemented by transcriptional analysis in some cases, we found 32 variants on the PMS2 gene in all carriers, six variants only on the pseudogene PMS2CL, and four variants on either PMS2 or PMS2CL in different carriers (Table 1 and Figure 1). Among the variants located on the PMS2 gene, 13 were pathogenic according to the classification criteria (Table 1 and Figure 1). Twenty other variants were considered variants of unknown significance (VUS); these were nucleotide substitutions, a small in-frame deletion, and a truncating variant located in the last exon (Table 1). Of note, for patient S53, the pathogenic variant was found on both PMS2 and PMS2CL, most probably as a consequence of recombination between the two paralogs. Its presence on PMS2 was further confirmed at the transcript level, where the variant transcript co-existed with the normal transcript.

Further characterization was undertaken for two cases. For patient S76, a heterozygous intragenic duplication of exons 12-13-14 was detected by NGS and confirmed by MLPA. To determine whether this duplication was in tandem and thus led to protein truncation, we attempted to identify the breakpoint using a forward primer in intron 14 and a reverse primer in intron 11. Subsequent sequencing identified the junction, characterized by a shared 25-nucleotide motif present in both intron 14 and intron 11 (Figure 2). This motif was part of AluSc/AluSz sequences. Thus, this was an Alu-mediated 8.5 kb genomic duplication in juxtaposition with the intrinsic exon 14, provoking a frameshift alteration.

TABLE 1 Summary of clinical and biological characteristics for variant assessment.

FIGURE 1 Graphical representation of the PMS2 3′ region and PMS2CL pseudogene with the localization of the assessed variants. Pathogenic variants localized on the PMS2 gene are indicated at the top, while those localized on the PMS2CL pseudogene are indicated at the bottom. Variants found on both PMS2 and PMS2CL are shown between the two schematic genes.

Clinical and biological data were correlated with genotype, since PMS2 pathogenic variants were expected to be associated with biological consequences, contrary to those located on the non-functional PMS2CL (Table 1). Although clinical data were not available for all patients, PMS2 PV carriers were clearly diagnosed with LS cancers or CMMRD-related clinical manifestations, closely associated with an MSI phenotype and isolated loss of PMS2 expression. Indeed, MSI and/or loss of PMS2 expression was found in all 14 PV carriers for whom such data were available. One tumor displayed a coupled MLH1/PMS2 loss (patient S61). Conversely, with one exception, no carrier of PMS2CL variants was diagnosed with colorectal or endometrial tumors or with CMMRD clinical features. One patient (S39) was diagnosed with colon cancer at 33 years, with a tumor displaying an isolated loss of PMS2. The PMS2CL variant (n.1126_1127del) co-occurred with a pathogenic variant in the MLH1 gene, which was most likely responsible for the PMS2 loss in the tumor. In fact, previous reports have shown that MLH1 inactivation can cause an isolated loss of PMS2 expression. 13
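The grouping of variants reported above (on PMS2 in all carriers, on PMS2CL in all carriers, or on either one depending on the carrier) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `summarize_localization` is hypothetical, and the variant labels in the example are placeholders, not rows of Table 1.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

VALID_LOCI = {"PMS2", "PMS2CL"}


def summarize_localization(
    observations: Iterable[Tuple[str, str]]
) -> Dict[str, str]:
    """Group per-carrier results (variant, confirmed locus) by variant."""
    loci_by_variant: Dict[str, Set[str]] = defaultdict(set)
    for variant, locus in observations:
        if locus not in VALID_LOCI:
            raise ValueError(f"unknown locus: {locus!r}")
        loci_by_variant[variant].add(locus)

    summary: Dict[str, str] = {}
    for variant, loci in loci_by_variant.items():
        if loci == {"PMS2"}:
            summary[variant] = "PMS2 in all carriers"
        elif loci == {"PMS2CL"}:
            summary[variant] = "PMS2CL in all carriers"
        else:
            summary[variant] = "PMS2 or PMS2CL depending on the carrier"
    return summary


# Placeholder observations (assumption: invented labels, one tuple per carrier).
example = [
    ("variant_A", "PMS2"),
    ("variant_A", "PMS2"),
    ("variant_B", "PMS2CL"),
    ("variant_C", "PMS2"),
    ("variant_C", "PMS2CL"),
]

for variant, status in summarize_localization(example).items():
    print(variant, "->", status)
```

Keeping one observation per carrier, rather than one per variant, reflects the point that a localization established in one patient does not carry over to another patient with the same variant.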

| DISCUSSION
To establish the diagnosis of Lynch syndrome or CMMRD and guide clinical decision making, it is crucial to determine whether "presumed" PMS2 variants truly originate from the PMS2 gene or from PMS2CL. In this study, we report the characterization of 42 variants detected by NGS in 76 patients, either patients suspected of LS or CMMRD or patients suspected of other cancer syndromes in whom PMS2 variants were detected fortuitously. Overall, 32 variants were confirmed to be located on the PMS2 gene in all tested patients, of which 13 were classified as pathogenic in patients suspected of LS or CMMRD who, importantly, displayed consistent clinical features associated with a tumoral dMMR phenotype (MSI and/or loss of PMS2 expression). Indeed, although PMS2 PVs are known to be associated with lower penetrance, a relevant clinical phenotype could still be clearly observed in carriers of pathogenic variants assigned to the PMS2 gene, contrary to PMS2CL variant carriers. Therefore, consistent clinical manifestations and tumor dMMR status, especially an isolated loss of PMS2 expression, constitute strong supporting elements for the determination of authentic PMS2 PVs.
Large genomic rearrangements are frequent causes of PMS2 inactivation; they can account for up to 40% of all PMS2 PVs, 4 and the majority of them involve the 3′ region. Frequent genomic recombination between PMS2 and PMS2CL with consequent hybrid allele formation, together with Alu-mediated recombination as another frequent mechanism, makes specific assessment necessary. 14,15 Moreover, compared with large deletions, which are principally pathogenic as they give rise to a truncated or substantially shortened protein, determining the pathogenicity of genomic duplications may not be straightforward. In this report, we described an Alu-mediated tandem duplication of exons 12-13-14 resulting in a frameshift alteration, which allowed its pathogenicity to be ascertained. 16-18 However, while gene conversion events were reported to involve mainly the region downstream of exon 12, 19 the only two cases described so far that lead to a pathologic consequence were both within exon 11: one was a transfer of PMS2CL sequence estimated to be about 20 nt (c.1718_c.1739) 16 and the other (this report) was a transfer of more than 80 nt (c.1717_c.1798). Since there exist insertion/deletion discriminant variants between the two paralogs in exon 11 and not in the other exons, it can be presumed that sequence transfers occurring in exon 11 are more susceptible to frameshift alterations and thus to pathologic consequences.

Two truncating variants, c.2192_2196del, p.(Leu731Cysfs*3) and c.2404C>T, p.(Arg802*), were found to be on either PMS2 or PMS2CL in different patients. The c.2192_2196del variant was found on the PMS2 gene in one patient (S44) who had developed endometrial cancer displaying an isolated PMS2 loss. Of note, it was reported in other LS patients for whom its PMS2 origin was specifically confirmed. 4 The c.2404C>T variant was detected in two patients, in one on the PMS2 gene and in the other on PMS2CL. The PMS2-variant carrier was diagnosed with endometrial cancer at the age of 50, but unfortunately MSI/IHC testing could not be performed. Consistently, carriers of these two PMS2CL variants had no clinical signs suggestive of LS, although one patient had multiple primary tumors including a colon cancer at the age of 47 with an MSS phenotype (Table 1). In line with this, the missense variant c.2444C>T p.(Ser815Leu), classified as probably pathogenic in the InSiGHT database because it was detected in four individuals with dMMR tumors, was revealed in this study to be on the pseudogene PMS2CL in a patient (S58) whose phenotype was not suggestive of LS. The mechanism underlying this phenomenon remains to be elucidated. Hypotheses include randomly occurring mutations on PMS2 or PMS2CL, or the occurrence of hybrid alleles from a mutant haplotype.

Interestingly, six variants were assigned to PMS2CL in all tested cases. Two of them, n.1122_1124delinsG and n.1126_1127del, appeared to be recurrent. Indeed, they were previously reported as PMS2 pathogenic variants in patients whose clinical phenotype was not suggestive of LS, 6,8,20 which raises the question of whether the variants were really of PMS2 origin in these patients. In our series, all these PMS2CL variants were detected in patients not related to LS. Since n.1122_1124delinsG and n.1126_1127del are not natural paralogous sequence variants (PSVs), such a high prevalence suggests that they are most likely PMS2CL polymorphisms. In fact, the c.2186_2187del variant is recorded by gnomAD (v2.1.1) as a PMS2 polymorphism in the African population (2.5%). Given that these data were established with a conventional NGS-based approach with little capacity to discriminate between PMS2 and PMS2CL, one may reasonably speculate that this polymorphism is in reality located on PMS2CL. Furthermore, the fact that they were invariably detected on PMS2CL in our patients suggests that this region (exon 4 of PMS2CL) is rather stable, with no or rare genomic exchange with PMS2. However, it is worth noting that these presumed PMS2CL polymorphisms were not revealed in the population study of Ganster and colleagues on PMS2/PMS2CL hybrid alleles, which included African-American, Caucasian, and Hispanic populations. 5 This might be explained either by an insufficient number of individuals in the study cohorts or by the possible existence of concentrated variant-carrying haplotypes in France. Further investigations with haplotype analysis would be helpful. Nevertheless, for each variant detected in the PMS2 3′ UTR region by short-read sequencing strategies, it is impossible to affirm the gene location without performing specific analyses. Consistently, it is recommended that such variants be considered variants of unknown significance until their localization is confirmed (InSiGHT).

This study has some limitations, including incomplete clinical information for a substantial number of patients and the unavailability of appropriate samples for RNA analysis for all carriers, in particular the lack of NMD-blocking treatment to minimize the risk of biased RNA assessment.

In conclusion, our study provides further insight into variants detected in the 3′ region of PMS2, demonstrating the necessity of complementary analyses to authenticate each PMS2 variant in individual patients, as the same variant can indeed be found either on PMS2 or on PMS2CL in different patients.

FIGURE 2 Breakpoint identification of the PMS2 exon 12-14 duplication. (A) Schematic representation of the PMS2 normal allele and mutant allele, with the duplicated exons denoted by boxes filled with diagonal lines. The location and orientation of the primers used for the specific nested PCRs, which revealed an overlapping 25-nt sequence between intron 14 and intron 11 involving Alu sequences, are indicated by vertical arrows. (B) Electropherogram with the reverse primer showing the merged 25-nt motif (outlined by a box) between intron 14 and intron 11 of PMS2.

FIGURE 3 PMS2 alteration by gene conversion. The transfer of a PMS2CL fragment delimited by flanking PSVs into the PMS2 gene is schematically shown. All discriminating variants within this fragment are indicated (denoted by c. for PMS2 and n. for PMS2CL) with the corresponding sequencing electropherograms of long-range PCR products specific for PMS2.