Personalized molecular modeling for pinpointing associations of protein dysfunction and variants associated with hereditary cancer syndromes

Abstract Background Although the process of reclassification of a variant of uncertain significance can be complex, they are commonly detected through molecular testing. It often takes years before enough clinical data are acquired, and it can be costly and time‐consuming to perform functional analysis of a single variant. It is important that other tools are developed to aid in clarifying how a specific genetic variant impacts a protein's function, and ultimately the health of the patient. Methods Two more newly characterized, suspected pathogenic variants in NBN and PTEN were analyzed through personalized protein modeling. Comparisons between the wild‐type and altered protein were studied using simulations, genomic exome analysis, and clinic study. Results Modeling of the new NBN and PTEN protein structures suggested loss of essential domains important for normal enzymatic function for these personalized genomic examples which matched the clinical findings. Conclusion The defects detected through modeling were consistent with the expected clinical effect. Personalized protein modeling is another tool for determination of correct variant classification, which can become further useful through construction of deposition archive.


| INTRODUCTION
Of the numerous individuals who complete clinical genetic testing, a significant portion will receive an inconclusive result (Altschul et al., 1997). Variants of uncertain significance are genetic alterations with an unclear impact on gene and downstream protein function. One-third of individuals tested with a 25 gene hereditary cancer panel will receive at least one Variant(s) of Unknown Significance (VUS) (Altschul et al., 1997), and the number of uncertain results discovered generally increases with number of genes tested (Amendola et al., 2016). It is not always clear though whether a specific variant should be classified as (likely) pathogenic, uncertain significance or (likely) benign. Guidelines have been put forward by the American College of Medical Genetics and Genomics and the Association for Molecular Pathology to aid in the classification of variants (Caulfield, 2011), but there can still be disagreement amongst laboratories (Caulfield & Devkota, 2012). It often takes a long period of time for enough clinical data to accumulate to allow for confident reclassification of a VUS to pathogenic or benign (Caulfield, Devkota, & Rollins, 2011;, and functional studies can be costly and time consuming. Computational and predictive data, like protein modeling, can aid geneticists in this task. Personalized molecular modeling and structure-based analytics from comparison of VUS and wild-type represent an expanding toolkit at the physicians practice.

| CASE PRESENTATION
The first variant, previously discussed in a case report (Garcia, 2014), presents as a frameshift variant of NBN (OMIM No. 602667), c.698_701delAACA, which was classified by the original testing laboratory as pathogenic. The patient had a history of contralateral breast cancers, with a diagnosis of left breast ductal carcinoma in situ at age 62 and right breast invasive ductal carcinoma at age 64. Her medical history was also significant for melanoma on the lateral corner of the left eyelid, melanoma on the left posterior calf at age 40 and squamous cell carcinoma of the skin at age 65. Her family history included her mother with breast cancer at age 69, a brother with renal cancer at age 59, a sister with melanoma in her 50's and a maternal half-brother with small bowel lymphoma. While no heterozygous individuals had been reported previously, this NBN variant had been seen previously in homozygous individuals with Nijmegen breakage syndrome 1 (Gass et al., 2017).
The second genetic change studied was a pathogenic deletion of exon 8 in the PTEN gene (OMIM No. 601728). The patient presented with a recent diagnosis of endometrioid cancer of the ovaries with synchronous endometrial intraepithelial neoplasia and an endocervical adenomcarcinoma at age 37. Her past medical history included a partial thyroidectomy for premalignant lesion at age 15, a thyroidectomy for premalignant lesion at age 34, and a premalignant melanoma at age 31. She also had a history of multiple colon polyps, pathology unavailable, discovered around age 22 and 35, as well as, a ganglioneuroma that was discovered at her most recent colonoscopy at age 37. She was adopted, and her family history was unknown. Genetic testing identified a deletion of exon 8 in PTEN which was originally classified as a VUS and then upgraded to pathogenic by the testing laboratory some weeks later. Physical examination identified macrocephaly, consistent with PTEN-related disease. She was referred to dermatology and found to have a sclerotic fibroma/storiform collagenoma. Her son subsequently tested positive for He had no history of cancers or tumors, but did have a history of macrocephaly, premature birth at 29 weeks and early developmental delay that had largely resolved. As this PTEN variant and the pathogenic NBN variant were less well characterized with regards to hereditary cancer syndromes, they were good examples to test with protein modeling.

| MATERIALS AND METHODS
Computer assisted modeling was completed using a sequence of human protein nibrin (NBN) taken from the NCBI Reference Accession Sequence: NP_002476: version NP_002476.2 (GenBank: AF069291.1), which is encoded for a 754 amino acid sequence. The required PTEN sequence was also taken from the NCBI Reference Accession Sequence: NM_000314: version NM_000314.6 (GenBank: AF067844.1), which encodes a 403 amino acid sequence.
The full-length structure for the wild-type (WT) NBN protein was 754 amino acids (11,919 atoms) and 236 amino acids (3,710 atoms) for the K233SIRN(X) variant. For the PTEN protein, dimeric complexes of 161 amino acids (5,017 atoms) of the full-length wild-type model and 91 amino acids (2,970 atoms) for the p.del(D268-K342) variant were reviewed.

| RESULTS
Both of these proteins, NBN and PTEN, form internal domain interactions that are important to function, and the structural modeling drawn from X-ray structural datasets demonstrated a strong effect from the respective variants.
The wild-type was compared to the NBN truncation variant K233SIRN(X), and the stability of the object from energetic calculations for ΔG per amino acid is lower for the K233SIRN(X) variant. WT and K233SIRN(X) were 580.98 and 146.79 kcal/mol*Å2 respectively (Caulfield, 2011;Caulfield & Devkota, 2012;Lopez-Vallejo et al., 2011;Reumers et al., 2005;Schymkowitz et al., 2005). 24 Per amino acid, they measured 0.77 kcal/mol*Å2 for the WT and 0.62 kcal/ mol*Å2 for K233SIRN(X). Interestingly, the K233SIRN (X) protein fragment had better total object stability when not examined at a per amino acid distribution of energies. Nibrin is involved in DNA damage repair and aids with histone-related protein recruitment; better object stability may make the K233SIRN(X) variant better for N-terminus binding over the WT protein. However, the huge deletion of 69.1% of the NBN protein for K233SIRN(X) results in massive defect in the protein's function, which eliminates binding partners. The actual residue mutation, p.K233S, induces a +0.28808 kcal/mol*Å2 change in free energy, which is disruptive to the local region. However, given the additional missense mutations from I234-N236, functional impact is much greater. The molecular model for the full structure and its deletion form are given (Figure 1b-d) (Caulfield, 2011;Caulfield & Devkota, 2012;Lopez-Vallejo et al., 2011). [24][25][26][27][28][29][30][31] Examination of the domain regions lost due to the NBN truncation consists of residues within the SP binding region (residues E111-H238) and mTOR, MAPKAP1, RICTOR binding sites (residues I221-T402) (Figure 1a). Nuclear localization signal is lost (P461-E467) and the EEXXX-DDL motif is gone (E736-L743) (Figure 1a). The electrostatic distribution map (ESM) and portions for the deleted regions were also determined (Supporting information Figure S1).

| DISCUSSION
As it is justifiable to immediately think that frameshift and other truncating variants can cause nonsense mediated decay and no protein product, previous studies have shown that this is not always the case. Tanzarella et al. 2003 in his paper described two NBN variants (835del4 and 900del25), which were translated into a form of Nibrin with smaller molecular weight (Tanzanella, 2003 #2). Using established techniques, the energetics data and modeling reveal conclusive defects in the NBN protein and the PTEN protein (Qiu & Elber, 2006;Rehm et al., 2013;Reumers et al., 2005;Richards et al., 2015;Tung et al., 2016;Varon et al., 1998). The models for the NBN structure and PTEN structure show loss of essential domains interactions for proper function. In both cases, the energetic calculations structure show in great detail how the deletion construct has been perturbed from the wild-type protein in a manner that may put the native protein at a disadvantage when competing for N-terminus binding partners; thus promoting loss of function or decreased activity. The structural models of these two variants concur with previous clinical data. We are continuing to build a database for all NBN and PTEN VUSs for future use to the clinical community as an online publicly available resource, which will become available after we have added sufficient number of cases. Protein modeling should be considered as another tool in the difficult task of variant classification.