EFTUD2 missense variants disrupt protein function and splicing in mandibulofacial dysostosis Guion‐Almeida type

Pathogenic variants in the core spliceosome U5 small nuclear ribonucleoprotein gene EFTUD2/SNU114 cause the craniofacial disorder mandibulofacial dysostosis Guion‐Almeida type (MFDGA). MFDGA‐associated variants in EFTUD2 comprise large deletions encompassing EFTUD2, intragenic deletions and single nucleotide truncating or missense variants. These variants are predicted to result in haploinsufficiency by loss‐of‐function of the variant allele. While the contribution of deletions within EFTUD2 to allele loss‐of‐function is self‐evident, the mechanisms by which missense variants are disease‐causing have not been characterized functionally. Combining bioinformatics software prediction, yeast functional growth assays, and a minigene (MG) splicing assay, we have characterized how MFDGA missense variants result in EFTUD2 loss‐of‐function. Only four of 19 assessed missense variants cause EFTUD2 loss‐of‐function through altered protein function when modeled in yeast. Of the remaining 15 missense variants, five altered the normal splicing pattern of EFTUD2 pre‐messenger RNA predominantly through exon skipping or cryptic splice site activation, leading to the introduction of a premature termination codon. Comparison of bioinformatic predictors for each missense variant revealed a disparity amongst different software packages and, in many cases, an inability to correctly predict changes in splicing subsequently determined by MG interrogation. This study highlights the need for laboratory‐based validation of bioinformatic predictions for EFTUD2 missense variants.


INTRODUCTION
The spliceosome is a large RNA/protein complex that is required for removal of intron regions from pre-messenger RNA (pre-mRNA; Wahl, Will, & Lührmann, 2009). The spliceosome is composed of five small nuclear ribonucleoproteins (snRNPs) that associate with the pre-mRNA and are dynamically remodeled to allow the two transesterification reactions required for intron removal from pre-mRNA (Will & Lührmann, 2011). Genetic variants in spliceosome-associated genes cause a number of human craniofacial disorders.
Variants in the human spliceosome-associated genes TXNL4A, EFTUD2, SF3B4, SNRPB, and EIF4A3 cause craniofacial disorders: Burn-McKeown syndrome, mandibulofacial dysostosis Guion-Almeida type (MFDGA), Nager syndrome/Rodriguez syndrome, cerebrocostomandibular syndrome, and Richieri-Costa-Pereira syndrome, respectively (Lehalle et al., 2015). In most of these craniofacial disorders, the disease variants inactivate one allele and are proposed to cause disease through haploinsufficiency. Patients with MFDGA possess a wide variety of variants within EFTUD2 that potentially inactivate one EFTUD2 allele (Gordon et al., 2012; Huang et al., 2016; Lacour et al., 2019; Lehalle et al., 2014; Lines et al., 2012; Luquetti et al., 2013; Matsuo et al., 2017; Sarkar et al., 2015; Smigiel et al., 2015; Vincent et al., 2016; Voigt et al., 2013). EFTUD2/Snu114 is a GTPase, and a core U5 snRNP protein that is present throughout the splicing cycle and regulates spliceosome remodeling (Frazer, Nancollis, & O'Keefe, 2008). The MFDGA disease-associated variants comprise small and large deletions, splice site variants, and nonsense and missense variants. These MFDGA variants are present in a single allele in trans with a wild type, functional allele, consistent with haploinsufficiency. It is not entirely clear why a reduction in the amount of a core pre-mRNA splicing protein, required for the splicing of all pre-mRNAs, results in such a specific disease phenotype.
Of all the variants in EFTUD2, the missense variants are of particular interest as several of these variants have been suggested to disrupt EFTUD2 protein function (Huang et al., 2016), but have not been tested for their function experimentally. We took advantage of the high conservation between EFTUD2 and its orthologue in yeast, SNU114, to test the function in vivo of 19 EFTUD2 missense variants associated with MFDGA. Functional assays in yeast revealed that only four missense variants in SNU114, modeling EFTUD2 missense variants in MFDGA, disrupted protein function. The viability of many MFDGA related SNU114 missense variants in the yeast functional assay suggested that EFTUD2 missense variants influenced EFTUD2 function in a different way. In fact, by subsequently using a minigene (MG) splicing assay, we determined that five EFTUD2 missense variants influenced the splicing of the EFTUD2 pre-mRNA to inactivate one allele in MFDGA. Thus, we have defined how missense variants in EFTUD2 can influence both EFTUD2 protein function and pre-mRNA splicing to cause MFDGA and provide support for the growing evidence that missense variants can influence splicing and should be routinely tested for splicing defects.

Splicing minigene construction
A 3.8 kb fragment ("Fragment 3") of the pSpliceExpress MG splicing reporter vector (a gift from Stefan Stamm, Addgene 32485) (Kishore, Khanna, & Stamm, 2008) was amplified by polymerase chain reaction (PCR) using Phusion High-Fidelity DNA Polymerase (Thermo Fisher Scientific), or isolated by restriction enzyme digestion with NheI and BamHI. Similarly, EFTUD2 exons and at least 100 bp of flanking 5′ and 3′ intronic sequence were PCR-amplified from control genomic DNA, also using Phusion High-Fidelity DNA Polymerase, employing two pairs of primers to produce two overlapping (10-20 bp) fragments ("Fragment 1" and "Fragment 2"), each with a single portion of overlap (6-10 bp) with the vector fragment ("Fragment 3") on one end. Where necessary, overlapping primer sequences for Fragments 1 and 2 were altered to introduce single-nucleotide variations into the exons corresponding to MFDGA-associated EFTUD2 missense variants (see Table 1 for full list of variants). Fragments 1, 2, and 3 were assembled using the Gibson method (Gibson, 2011) and transformed into competent bacteria. Successfully assembled vectors were isolated from candidate colonies and sequence-verified by direct Sanger sequencing performed by Eurofins Genomics.
Sequences of primers used for Gibson assembly of MG splicing constructs can be found in Table S1.
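The three-fragment assembly described above can be illustrated with a minimal sketch that models only the sequence outcome of joining fragments through shared terminal overlaps. It does not model the enzymatic steps of the Gibson method. All sequences below are short invented placeholders, not the primers in Table S1 or any EFTUD2 or pSpliceExpress sequence, and the function names are hypothetical.

```python
def overlap_length(left: str, right: str, min_overlap: int) -> int:
    """Return the longest suffix of `left` that equals a prefix of `right` (0 if none)."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def assemble_circular(fragments: list[str], min_overlap: int = 6) -> str:
    """Join fragments in order into a circular product, keeping each overlap once."""
    product = []
    for i, fragment in enumerate(fragments):
        following = fragments[(i + 1) % len(fragments)]
        size = overlap_length(fragment, following, min_overlap)
        if size == 0:
            raise ValueError(f"fragment {i + 1} shares no terminal overlap with the next fragment")
        # Drop the overlapping tail; the next fragment contributes it.
        product.append(fragment[:-size])
    return "".join(product)


# Invented placeholder sequences (assumptions for illustration only).
VECTOR_START = "GGATCCAA"        # 8 bp overlap between Fragment 2 and the vector
VECTOR_END = "GCTAGCTT"          # 8 bp overlap between the vector and Fragment 1
INSERT_OVERLAP = "CAGTTCGATGCA"  # 12 bp overlap between Fragments 1 and 2

fragment_3 = VECTOR_START + "TTTTCCCCGGGGAAAATTTT" + VECTOR_END
fragment_1 = VECTOR_END + "ACGTACGTAGCTAGGA" + INSERT_OVERLAP
fragment_2 = INSERT_OVERLAP + "TTGGCCAATCGGAT" + VECTOR_START

wild_type = assemble_circular([fragment_3, fragment_1, fragment_2])

# A single-nucleotide change placed in the Fragment 1/2 overlap must be
# carried by both fragments, mirroring the altered overlapping primers.
variant_overlap = "CAGTTAGATGCA"
fragment_1_var = VECTOR_END + "ACGTACGTAGCTAGGA" + variant_overlap
fragment_2_var = variant_overlap + "TTGGCCAATCGGAT" + VECTOR_START
variant = assemble_circular([fragment_3, fragment_1_var, fragment_2_var])

differences = sum(a != b for a, b in zip(wild_type, variant))
print(len(wild_type), len(variant), differences)
```

In this toy setup the variant construct differs from the wild type construct at exactly one position, because the changed base sits in the overlap shared by Fragments 1 and 2 and is therefore counted only once in the assembled product.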
For the MG splicing assay, HEK293 cells were grown overnight to 40-60% confluency in 3 ml of Dulbecco's modified Eagle's medium high-glucose, DMEM (Sigma-Aldrich), supplemented with 10% fetal bovine serum (Sigma-Aldrich) in tissue-culture treated six-well plates at 37°C and with 5% CO2. Cells were transiently transfected with at least 0.2 μg of MG vector (either wild type or mutant) using Lipofectamine (Thermo Fisher Scientific) and the manufacturer's standard protocol. Following 48 hr incubation at 37°C with 5% CO2, RNA was extracted using TRI Reagent® according to the manufacturer's instructions (Sigma-Aldrich). Extracted RNA was purified further using the RNeasy column clean-up kit (Qiagen), which included a DNase digestion step. cDNA was synthesized from up to 4 μg RNA (using an equal amount of RNA for each sample set) using Superscript Reverse Transcriptase (Thermo Fisher Scientific).
Resulting cDNA was amplified by Phusion High-Fidelity DNA Polymerase (Thermo Fisher Scientific) using "Minigene RT PCR-for" and "Minigene RT PCR-rev" primers (Table S1). Finally, PCR products were run on an agarose gel (1-3%) supplemented with SafeView nucleic acid stain (NBS Biologicals). Gels were visualized under a blue-light transilluminator and, where appropriate, bands of interest were extracted and purified using QIAquick gel extraction kit (Qiagen) followed by direct Sanger sequencing performed by Eurofins Genomics, to confirm splicing products.

Ethical compliance
Institutional ethical review and approval was granted, and informed consent was provided, for all data obtained from patients.
Exon 9 of EFTUD2 contains the highest number (n = 3) of MFDGA-associated missense variants of any exon within the EFTUD2 transcript (Table 1). In addition, the mutation characteristics of the
A single variant (Gln383His) in EFTUD2 exon 13 was analyzed for its influence on splicing compared to the wild type sequence using the MG assay. As expected, the wild type MG construct resulted in the correct splicing of exon 13, producing an RT-PCR product of 408 bp in length (Figure 1d). However, the Gln383His variant sequence altered splicing of the MG transcript. In total, three clear bands were visible (Figure 1d). The smallest product was a skipped-exon product. A middle band, which is larger than the wild type spliced product, could not be sequenced accurately by Sanger sequencing. However, most interestingly, the largest product was approximately 800 bp in length (Figure 1d). Direct sequencing of this approximately 800 bp product revealed that the Gln383His variant sequence leads to a decrease in the strength of the exon 13 donor splice site and retention of the downstream intron. This retained intron transcript is also present in the wild type sample, albeit at a much lower proportion than that seen in the Gln383His variant.
Next, exon 18 variants were investigated including Lys620Asn, its synthetic counterpart, and a nonsense variant associated with MFDGA (Arg578Ter) (Figure 1e). The nonsense variant was included to act as a negative control for the splicing assay and to assess what consequence a premature termination codon (PTC) had on the stability of the transcript produced from the MG assay. As expected, the presence of the nonsense variant had no influence on the normal splicing of exon 18, resulting in similar levels of transcript to that seen for the wild type MG construct. Thus, the inclusion of a PTC, in this instance, does not lead to detectable levels of degradation of the transcript via the NMD pathway. Both Lys620Asn and Lys620Asn-Synth variants resulted in a different splice pattern compared to wild type (Figure 1e). In each case a lower band where exon 18 is entirely skipped, and a larger band, which could not be accurately sequenced, were observed. Since exon 18 is 141 bp in length, its exclusion from any mature mRNA transcripts in vivo would not lead to a frameshift. However, the essential EFG domain 2 of EFTUD2 is located within exons 18 and 19 (Figure S1) and would be removed by the absence of exon 18. In fact, removal of the equivalent amino acids from the yeast Snu114 that are encoded by the human EFTUD2 exon 18 results in a lethal phenotype in yeast (Table 1).
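The frameshift reasoning for exon 18 is simple arithmetic: skipping an exon preserves the downstream reading frame only when the exon length is a multiple of three. The short sketch below makes that check explicit; the function name is hypothetical, and only the 141 bp length comes from the text.

```python
def skipping_consequence(exon_length_bp: int) -> str:
    """Classify the reading-frame effect of skipping an exon of the given length."""
    if exon_length_bp <= 0:
        raise ValueError("exon length must be positive")
    remainder = exon_length_bp % 3
    if remainder == 0:
        # A multiple of three removes whole codons and keeps the frame.
        return f"in-frame deletion of {exon_length_bp // 3} codons"
    return f"frameshift (length leaves remainder {remainder} modulo 3)"


# Exon 18 of EFTUD2 is 141 bp according to the text.
print(skipping_consequence(141))
```

For 141 bp the remainder is zero, so skipping removes the equivalent of 47 codons without shifting the frame, which is why the loss of EFG domain 2 sequence, rather than a PTC, is the relevant consequence here.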

Analysis of EFTUD2 exon 25 variants Ala823Thr and Glu829Lys by MG assay revealed that the wild type construct produced two splice products in roughly equal proportions to influence the splicing patterns of the parent exon. All remaining variants tested (across exons 5, 15, 16, 19, 20, 23, 26, and 27) showed no difference in splicing patterns from their respective wild type constructs (Table 1).

DISCUSSION
Of the EFTUD2 variants associated with the human craniofacial disorder MFDGA to date, over 15% are missense. Many of these missense variants in EFTUD2 are located at highly conserved residues between eukaryotes and have, therefore, previously been assumed to influence EFTUD2 protein function or stability, acting as null alleles.
These missense null alleles would cause haploinsufficiency consistent with the majority of disease-associated EFTUD2 variants that are predicted to result in loss of function (Lines et al., 2012 (Nguyen et al., 2016). Surprisingly, the remaining variant that was found to be lethal in our yeast functional assays (Ala470Arg)  variants are all located at highly conserved residues but were not found to be deleterious in either our yeast growth assay or MG assay.
Of particular interest is the variant Arg262Trp, which is currently the most commonly found MFDGA-associated missense variant and has been identified in three unrelated families (Huang et al., 2016; Lines et al., 2012; Smigiel et al., 2015). With these missense variants where no functional defect has been determined, either the yeast system is not revealing any functional defect for an amino acid change or the cell type used for the MG assay does not express the relevant splicing factors to reveal mis-splicing. Alternatively, the variant sequence may lead to a splicing defect affecting a distal part of the EFTUD2 transcript not included in the MG construct. Finally, missense variants could create a novel binding site for miRNA that could inhibit translation of the mRNA produced from that allele (Brummer & Hausser, 2014; Ni & Leng, 2015).
In conclusion, our results not only reveal that missense mutations lead to splicing errors and erroneous mRNA transcripts, but also highlight how delicately balanced and dynamically regulated the mechanisms of splicing are. Our MG assay has determined that, even in a wild type construct of EFTUD2 exon splicing, more than one splicing outcome may be produced constitutively, as seen for the EFTUD2 exon 25 wild type MG construct (Figure 1f). Additionally, alteration of a single nucleotide, often distal to the wild type splice site, can lead to dramatic changes in the splicing patterns of that transcript. This influence of single nucleotide changes was particularly true for the Lys620Asn synthetic variant produced, which led to a change in the proportions of spliced products compared to its paired MFDGA-associated variant.
Our comparison of bioinformatic results from splicing predictors with our in vitro splicing assay results also demonstrates how varied and inconsistent many software packages are in predicting the splicing outcomes based on sequence information alone. While in silico predictions are useful tools for uncovering potentially pathogenic point mutations or implicating a possible mechanism of action, the importance of in vitro or in vivo lab-based assays should not be underestimated, in particular in a medical setting where a functional diagnosis is often important to furthering our understanding of the underlying disease/disorder. As with a solely bioinformatic approach, both assays utilized here can accommodate a lack of patient material by replicating the genetic variant using a wild type reference, and this is of particular importance for medical conditions (such as MFDGA) that are both vanishingly rare and genetically heterogeneous.

ACKNOWLEDGMENTS
We would like to thank Clair Byrne and Katie Evans who were in-

CONFLICT OF INTERESTS
The authors declare that there are no conflicts of interest.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request. Variant sequence data for novel variants presented in Table 1 and Supporting Notes have been submitted to LOVD3 (https://www.lovd.nl/).