UMD-DYSF, a novel locus specific database for the compilation and interactive analysis of mutations in the dysferlin gene

Authors


  • Communicated by Alastair F. Brown

Abstract

Mutations in the dysferlin gene (DYSF) lead to a complete or partial absence of the dysferlin protein in skeletal muscles and are at the origin of dysferlinopathies, a heterogeneous group of rare autosomal recessive inherited neuromuscular disorders. As a step towards a better understanding of the DYSF mutational spectrum, and towards possible inclusion of patients in future therapeutic clinical trials, we set up the Universal Mutation Database for Dysferlin (UMD-DYSF), a Locus-Specific Database developed with the UMD® software. The main objective of UMD-DYSF is to provide an updated compilation of mutational data and relevant interactive tools for the analysis of DYSF sequence variants, for diagnostic and research purposes. In particular, specific algorithms can facilitate the interpretation of newly identified intronic, missense- or isosemantic-exonic sequence variants, a problem encountered recurrently during genetic diagnosis in dysferlinopathies. UMD-DYSF v1.0 is freely accessible at www.umd.be/DYSF/. It contains a total of 742 mutational entries corresponding to 266 different disease-causing mutations identified in 558 patients worldwide diagnosed with dysferlinopathy. This article presents for the first time a comprehensive analysis of the dysferlin mutational spectrum based on all compiled DYSF disease-causing mutations reported in the literature to date, and using the main bioinformatics tools offered in UMD-DYSF. ©2011 Wiley-Liss, Inc. Hum Mutat 33:E2317–E2331, 2012. © 2012 Wiley Periodicals, Inc.

Introduction

In 1998, the groups of Robert H. Brown Jr. [Liu et al., 1998] and Kate Bushby [Bashir et al., 1998] identified the genetic cause of the autosomal recessive muscle-wasting diseases Miyoshi myopathy (MIM# 254130), and Limb Girdle Muscular Dystrophy type 2B (LGMD2B; MIM# 253601) as resulting from mutations in a novel gene on chromosome 2p13. The encoded protein was named dysferlin (DYSF; MIM# 603009), relating to its involvement in muscular dystrophy, and homology with the C. elegans fer-1 protein. Using membrane repair assays on muscle fibers from dysferlin-deficient mouse models, the groups of Paul McNeil/Kevin Campbell [Bansal et al., 2003] and Robert H. Brown Jr. [Lennon et al., 2003] subsequently demonstrated a central role for dysferlin in sarcolemmal repair after membrane injury. This established Miyoshi myopathy and LGMD2B as the first entities of a new subgroup of muscular dystrophies, due to defective membrane-repair.

From a clinical point of view, numerous reports corroborated the implication of mutated dysferlin in muscular dystrophy, and in particular in a high proportion of LGMD. In the past ten years, regular mutational analysis has allowed for better characterization of the phenotypic manifestations associated with deleterious mutations in the DYSF gene. The main clinical presentations are the distal-onset muscular dystrophy called Miyoshi myopathy and the proximal-onset form, LGMD2B, both characterized by progressive muscle weakness, usually appearing in the second decade, and highly elevated serum creatine kinase (CK) levels. Progressively, the description of different phenotypes caused by DYSF mutations [Illa et al., 2001; Klinge et al., 2008; Nguyen et al., 2005; Nguyen et al., 2007; Okahashi et al., 2008; Paradas et al., 2009; Seror et al., 2008; Spuler et al., 2008; Ueyama et al., 2002; Wenzel et al., 2007], in addition to the “typical” LGMD and Miyoshi phenotypes, unraveled a wide spectrum of phenotypes, ranging from clinically asymptomatic, isolated hyperCKemia to severe and early onset presentations [Bushby, 2000; Laval and Bushby, 2004; Urtizberea et al., 2008]. This wide range of clinical presentations is collectively referred to as dysferlinopathies.

DYSF was initially shown to be expressed in the skeletal and cardiac muscle tissues [Bashir et al., 1998; Liu et al., 1998], in monocytes [Ho et al., 2002], as well as in a variety of tissues, including liver, lung, kidney, pancreas, brain, and placenta [Bashir et al., 1998; Liu et al., 1998]. More recent studies have isolated 14 isoforms that are differentially expressed among tissues. These isoforms originate from the differential use of promoters and alternative exons which have been identified, respectively DYSF (AF075575) [Foxton et al., 2004] and DYSF_v1 [Pramono et al., 2006] promoters, and alternative exons 5a and 40a [Pramono et al., 2009]. Isoform 8, which contains the 55 canonical exons transcribed via the DYSF promoter, constitutes the major DYSF transcript (73%) among all reported isoforms expressed in skeletal muscle, but is not expressed in monocytes where isoform 13 (NM_001130980; containing exon 5a) represents the main dysferlin messenger (44%) [Pramono et al., 2009]. The other isoforms are less expressed in skeletal muscle and blood. In addition to the canonical messenger composed of 55 exons, splice variants lacking exon 17 are expressed at early stages of myogenic cell differentiation and also constitute predominant dysferlin transcripts in mature peripheral nerve tissue [Salani et al., 2004]. To date, the functional role of the different messengers remains unknown.

Due to the large size of the DYSF gene, which spans a genomic locus of approximately 233kbp, mutation screening is challenging on a routine clinical basis. Mutational analysis of DYSF is further complicated by the large mutational spectrum (detailed in this article), and a high proportion of “private” mutations, which leaves molecular geneticists with the recurrent difficulty of interpreting novel DYSF sequence variants, in particular putative splicing and missense variants.

Until now, the Leiden Open (source) Variation Database established in 1998 for dysferlin (LOVD Dysferlin) has been the only Locus Specific Database serving as a public repository for human dysferlin variants (www.lovd.nl/DYSF). Our laboratory, like many others, makes wide use of this valuable resource. On November 18, 2011, LOVD Dysferlin consisted of 424 unique sequence variants, including published or directly submitted (unpublished) variants, being either disease-causing mutations (ca. 300 variants) or polymorphisms (ca. 100 variants). While LOVD is an efficient and convenient tool for gene-centered collection, curation and display of DNA variation, data analysis options are limited. The Universal Mutation Database (UMD®) Locus Specific Databases [Beroud et al., 2000; Beroud et al., 2005] have been developed specifically to allow for the collection of mutational data and provide numerous bioinformatics tools for the interactive analysis of mutational data, including the analysis of novel sequence variants. Moreover, the UMD® software is very flexible for the development of novel tools, based on questions arising in the research field.

In the present article, we describe the online version of UMD-DYSF, freely accessible at www.umd.be/DYSF/. In complement to LOVD Dysferlin, UMD-DYSF not only references all previously published disease-causing mutations identified in the DYSF gene but also includes interactive bioinformatics tools for the analysis of DYSF sequence variants. In particular, UMD-DYSF offers a computational procedure for the analysis of possible deleterious mutations affecting splicing signals in the dysferlin gene, using the Human Splicing Finder (HSF) algorithm [Desmet et al., 2009] and integrates the UMD-Predictor tool for the analysis of missense variants [Frederic et al., 2009]. Furthermore, interactive functions allow for analysis of the full UMD-DYSF dataset, single mutational events or customized subsets of mutations referenced in the database. We previously used an offline version of UMD-DYSF to successfully analyse the mutational spectrum of a large cohort of patients analysed for DYSF mutations in our diagnostic laboratory [Krahn et al., 2009a]. To further illustrate the use of UMD-DYSF, we here report the results of statistical analyses of the DYSF mutational spectrum based for the first time on all compiled DYSF disease-causing mutations reported in the literature to date.

The UMD-DYSF Database

Database Description

The UMD-DYSF database was developed using a software package of specific routines, which allows optimized multicriteria search and sorting of data [Beroud et al., 2000; Beroud et al., 2005]. Mutational data entries are standardized to facilitate mutational analysis, as previously described [Collod-Beroud et al., 2003; Frederic et al., 2008]. Each entry corresponds to one mutation associated with one affected individual, either index patient or affected relative. At the moment, UMD-DYSF exclusively includes DYSF mutations described or predicted in the literature as deleterious. However, in future versions, UMD-DYSF will include unpublished data for disease-causing variants (see the DATABASE UPDATE section). UMD-DYSF is currently not aimed at collecting polymorphism data from patients because current diagnostic screening methods are not homogenized between laboratories, and results of polymorphism patient data are therefore biased. For users interested in known polymorphism data, the UMD-DYSF website links to the UCSC genome browser page for DYSF [Dreszer et al., 2011] (http://genome.ucsc.edu) and each UMD-DYSF mutation description page links to the collection of sequence variations available in LOVD Dysferlin for the corresponding nucleotide position. As dysferlinopathies are autosomal recessive diseases, users should be aware that “polymorphism data” from large-scale “normal” control studies can be “contaminated” with truly pathogenic DYSF sequence variants found in the heterozygous state in healthy carriers. These variants should thus be checked against pathogenicity prediction tools, such as those available in UMD-DYSF, to further evaluate the possibility of a deleterious effect.

The following mutational events can be entered into the database: point mutations, insertions, deletions, and insertions-deletions (intronic and/or exonic); as well as mono- or multi-exonic large-sized deletions or duplications. Several levels of information are provided for each mutation, including the affected exon and codon number, wild-type and mutant codon sequence, type of mutational event, mutation nomenclature, wild-type and mutant amino-acid, affected domain, etc. Whenever available, we also included in the database clinical information; however, in most publications, only the main phenotype data (i.e. LGMD2B or Miyoshi myopathy), but no detailed information, are described.

Mutational events are automatically described using the official nomenclature of the Human Genome Variation Society [den Dunnen and Antonarakis, 2000], and relating to the human DYSF cDNA sequence of reference (isoform 8, GenBank #NM_003494.2), which corresponds to the major DYSF transcript among all reported isoforms expressed in skeletal muscle [Pramono et al., 2009]. DYSF isoform 8 (6243bp) is transcribed under the DYSF promoter and contains the 55 canonical exons, with exclusion of exons 5a and 40a and inclusion of exon 17. The dysferlin protein sequence was annotated for C2 domains, ferlin family domains, DysF domains and the transmembrane (TM) domain based on predictions from Pfam 25.0 [Finn et al., 2010] and SMART 6 [Letunic et al., 2009], and for highly conserved residues expected to be involved in calcium coordination as described by Therrien and colleagues [Therrien et al., 2006]. Users of UMD-DYSF can verify whether exonic mutations affect annotated structural domains or highly conserved residues.
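As an illustration of the nomenclature used throughout this article, the following minimal sketch splits a simple cDNA substitution name into its components. It is not part of the UMD® software: the function name `parse_substitution` and the regular expression are assumptions made for this example, and only plain substitutions (exonic such as c.2875C>T, or intronic such as c.457+2T>G) are handled.

```python
import re

# Hypothetical helper for illustration only; it covers simple cDNA
# substitutions and ignores every other HGVS form (deletions, duplications...).
SUBSTITUTION = re.compile(
    r"^c\.(?P<position>\d+)(?P<offset>[+-]\d+)?(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_substitution(name):
    """Return the parts of a simple substitution name, or None if it does not match."""
    match = SUBSTITUTION.match(name)
    if match is None:
        return None
    offset = match.group("offset")
    return {
        "position": int(match.group("position")),  # nearest exonic nucleotide
        "offset": int(offset) if offset else 0,    # non-zero for intronic positions
        "ref": match.group("ref"),                 # wild-type base
        "alt": match.group("alt"),                 # mutant base
    }
```

For example, `parse_substitution("c.3443-33A>G")` reports position 3443 with an intronic offset of -33, whereas a deletion such as c.2779delG is rejected by this simplified pattern.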

Interactive analysis of DYSF mutational data included in the database was done using previously described sorting- and research-functions [Beroud et al., 2000; Beroud et al., 2005]. In addition, the present version of the UMD-package includes novel routines to assist the design of new therapeutic tools. Analysis tools and functions accessible on the UMD-DYSF website are described in Table 1, and a brief user guide can be downloaded from the website.

Table 1. Complete list of tools and functions available on the UMD-DYSF website (www.umd.be/DYSF/)
Function or tool name | Function or tool description
I found a mutation | Displays a table of the various mutational events registered in UMD-DYSF for a given position.
I want to analyze the impact of a missense variant | Uses the UMD-Predictor® algorithm to predict the pathogenicity of all possible non-synonymous or synonymous mutations from the DYSF gene.
I want to analyze an intronic variant | Uses the Human Splicing Finder tool to evaluate the consequences of substitutions on splicing.
I want to search the database | Allows the selection of a specific subset of the database. Results are displayed as a list on the screen.
Predicted impact of all previously reported missense variations | Uses the UMD-Predictor® tool to predict the pathogenicity of all UMD-DYSF missense variants localized in the coding sequence.
Global analysis | Gives a summary of mutation types.
Position | Studies the distribution of mutations at the nucleotide level to identify preferential mutation sites.
Potential stop codons | Displays all codons from a specific exon that can be mutated into a stop codon by a single substitution.
Mutation map | Displays the distribution of the various mutations along the gene and the protein.
Deletion map | Displays the distribution of the various deletions along the gene and the protein.
Stop codon map | Displays the exon phasing and the position and number of reported nonsense mutations.
Geographic distribution | Displays geographic origin of patients.
Binary comparison | Displays the distribution of the various mutations along the gene for two chosen subsets of the database.
Stat exons | Studies the distribution of mutations in the different exons. It enables detection of a statistically significant difference between observed and expected mutations.
Distribution by exon | Displays the partition of each type of mutation in each exon.
Structure | Studies the distribution of allelic mutations both in the various structural domains of the protein and in the highly conserved residues expected to be implicated in calcium coordination.

Database Entries

The UMD-DYSF v1.0 (April 12, 2011) contains a total of 742 entries corresponding to mutational data from 558 patients diagnosed with primary dysferlinopathy and previously reported in the literature as disease-causing mutations. The total number of patients amounts to 401 index cases (557 mutational entries) and 157 relatives. Among all UMD-DYSF entries, 192 entries from 129 patients correspond to mutations identified in our laboratory [Khadilkar et al., 2008; Krahn et al., 2009a; Krahn et al., 2009b; Krahn et al., 2010; Nguyen et al., 2005; Nguyen et al., 2007; Seror et al., 2008] while the others correspond to mutational data reported in 55 additional publications (see www.umd.be/DYSF/ for a comprehensive list of references). All mutational data can be visualized through the “Search” function described in Table 1 and downloaded from the UMD-DYSF website.

Bioinformatics Tools for the Interpretation of Sequence Variants

A recurrent problem in genetic diagnosis is the interpretation of sequence variants, including the difficulty in predicting the impact of a genomic variation on the pre-mRNA maturation and the mRNA translation mechanisms, and in predicting any deleterious effect on the mRNA and protein stability. The Human Gene Mutation Database (professional release 2010.4) which collects all known gene lesions responsible for human inherited diseases [Stenson et al., 2009], reports a total of 108046 mutational entries, 54% of which are missense mutations, as well as mutations affecting RNA splicing. Interpretation of the effect of DYSF missense variants and identification of DYSF splice variants is facilitated by a number of bioinformatics tools integrated into UMD-DYSF and available online.

The HSF tool is based on UMD algorithms and predicts consequences of mutations affecting existing splice signals (donor and acceptor sites, branchpoints and cis-acting elements such as exonic splicing enhancers and silencers) or possibly creating novel ectopic splicing sequences. These algorithms are integrated into UMD-DYSF to allow for the analysis of sequence variants. Detailed analysis of UMD-DYSF abnormal splicing variants is described below.

To further discriminate between neutral and pathogenic sequence variations, UMD-DYSF also integrates the recently developed UMD-Predictor tool [Frederic et al., 2009]. UMD-Predictor combines data such as localization within the protein, conservation and biochemical properties of the mutant and wild-type residues, as well as results from HSF analysis to calculate a pathogenicity score ranging from 0 to 100 for each missense variant (score >65 indicates a probable or highly likely pathogenicity). Its efficiency for predicting pathogenic missense mutations was demonstrated by a sensitivity of 95.4% and a positive predictive value of 99.5% [Frederic et al., 2009]. The UMD-Predictor score was computed for all UMD-DYSF missense variant entries and can be consulted on the UMD-DYSF website using the “Predicted Impact of all Previously Reported Missense Variations” function. Although all variants predicted or described in the literature to be deleterious were entered into UMD-DYSF, 5% were predicted as probable or likely polymorphisms using UMD-Predictor (pathogenicity score <65). These variants could correspond to true polymorphisms in patients for whom the actual deleterious mutation has been missed during genetic testing (incomplete mutation detection rates of pre-screening techniques such as Single Strand Conformation Polymorphism analysis or Denaturing High Performance Liquid Chromatography; mutations not detected using routine sequencing approaches such as large genomic rearrangements and “deep” intronic mutations; etc.). More likely, these variants point to cases for which the UMD-Predictor algorithm lacked predictive elements to accurately interpret the pathogenic effect of the sequence variant. More generally, for variants of unclear pathogenicity, a definitive conclusion on their possible deleterious effect will only be achievable with integration of novel functional data into the UMD-Predictor algorithm. In particular, bioinformatics predictions can greatly benefit from sequencing data of mutated DYSF RNAs and proteins, and from novel functional elements that would shed light on molecular roles and functions of dysferlin, domain organisation and critical residues of the protein.
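The way the score threshold quoted above is read can be sketched as follows. This is only an illustration of the cut-off at 65 stated in the text, not the UMD-Predictor algorithm itself; the function name `interpret_score` and the label strings are assumptions, and the handling of a score of exactly 65 is left open because the text does not specify it.

```python
PATHOGENICITY_THRESHOLD = 65  # cut-off quoted in the text for the 0-100 score


def interpret_score(score):
    """Map a 0-100 pathogenicity score to the reading given in the text."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie between 0 and 100")
    if score > PATHOGENICITY_THRESHOLD:
        return "probably pathogenic"
    if score < PATHOGENICITY_THRESHOLD:
        return "probable polymorphism"
    # Assumption: the text defines >65 and <65 only, so 65 itself is flagged.
    return "boundary value (not specified)"
```

A variant scored 80 would thus be read as probably pathogenic and one scored 40 as a probable polymorphism, subject to the caveats on predictive power discussed above.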

Bioinformatics Routines to Assist the Design of Therapeutic Strategies

Two interesting tools available on the UMD-DYSF website (Table 1) have been designed to help develop certain types of therapeutic approaches for dysferlinopathies. In particular, several nonsense mutations could be targets for possible therapeutic approaches based on aminoglycoside read through of stop codons [Wang et al., 2010]. The “Potential Stop Codon” function gives the list of codons that can lead to a premature termination codon (PTC) when mutated by a single substitution; along with the number of such mutations reported in UMD-DYSF. This function also provides statistical calculation about the environment of observed PTC compared to potential PTC for which no mutation has ever been reported. The distribution of nonsense mutations reported in the DYSF gene is described below. In addition, the “Stop Codon Map” function is a UMD newly implemented tool that displays the exon phasing and the position and number of reported nonsense mutations. This function has been designed to facilitate envisaging exon skipping strategies [Aartsma-Rus et al., 2010; Levy et al., 2010; Wein et al., 2010].
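The principle behind listing potential stop codons can be sketched in a few lines. The code below is not the UMD implementation: the function names are hypothetical, the input is assumed to be an in-frame coding sequence written with the DNA alphabet, and the statistical comparison of observed versus potential PTC is not reproduced.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"


def stop_substitutions(codon):
    """List (position in codon, mutant base, resulting stop codon) for one codon."""
    codon = codon.upper()
    if codon in STOP_CODONS:
        return []  # already a stop codon, nothing to report
    hits = []
    for index, reference_base in enumerate(codon):
        for mutant_base in BASES:
            if mutant_base == reference_base:
                continue
            mutant = codon[:index] + mutant_base + codon[index + 1:]
            if mutant in STOP_CODONS:
                hits.append((index + 1, mutant_base, mutant))
    return hits


def potential_stop_codons(coding_sequence):
    """Return {codon number: hits} for codons one substitution away from a stop."""
    usable_length = len(coding_sequence) - len(coding_sequence) % 3
    result = {}
    for start in range(0, usable_length, 3):
        hits = stop_substitutions(coding_sequence[start:start + 3])
        if hits:
            result[start // 3 + 1] = hits
    return result
```

For instance, a CGA (Arg) codon can be converted to TGA by a single C>T substitution at its first position, which is the pattern of a nonsense mutation such as p.Arg1905X, while a GCC (Ala) codon cannot be turned into a stop codon by any single substitution.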

Analysis of the DYSF Mutational Spectrum

General statistics

Mutational data from large cohorts of patients repeatedly revealed a large mutational spectrum for the DYSF gene, with a high proportion of missense changes, or frameshifting insertions and/or deletions (for example, [Aoki et al., 2001; Cagliani et al., 2003; De Luna et al., 2007; Guglieri et al., 2008; Klinge et al., 2010; Krahn et al., 2009a; Mahjneh et al., 1996; Nguyen et al., 2005; Tagawa et al., 2003; Takahashi et al., 2003]). Accordingly, most of the UMD-DYSF entries correspond to “private” or rare DYSF disease-causing mutations. In the 401 reported index patients, 266 disease-causing variants were identified along the DYSF coding sequence. Within the index cases population, 379 heterozygous variants and 178 homozygous variants were identified and constitute a set of 735 alleles.
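The allele count given above follows from counting each heterozygous variant once and each homozygous variant twice (the same convention is stated in the footnote of Table 3). A minimal sketch of this arithmetic, with a hypothetical function name, is:

```python
def allele_count(heterozygous_variants, homozygous_variants):
    """Heterozygous variants count once, homozygous variants count twice."""
    return heterozygous_variants + 2 * homozygous_variants


# Figures reported for the index case population
index_case_alleles = allele_count(379, 178)
index_case_entries = 379 + 178  # one database entry per variant and patient
```

With the reported figures this gives 735 alleles for 557 mutational entries in index cases.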

Founder mutations and recurrent mutations

Among DYSF disease-causing mutations, seven different founder mutations have been suggested or demonstrated in patients of various geographic/ethnic origins [Argov et al., 2000; Cagliani et al., 2003; Leshinsky-Silver et al., 2007; Santos et al., 2010; Vernengo et al., 2011; Vilchez et al., 2005; Weiler et al., 1999] (Table 2). In addition, interrogation of the database shows that 51 mutations have been recurrently identified in at least three non-related index patients (see updated list on the UMD-DYSF website). These recurrent mutations are distributed along the coding sequence and canonical splice sites without any apparent mutational “hotspot” (Fig. 1).

Figure 1.

Distribution of exonic disease-causing mutations reported in the dysferlin sequence. Above a scale at the amino acid level, the colored boxes represent the various structural or functional domains annotated for the protein. Above a scale at the nucleotide level, the various white boxes represent the exons of the gene. The middle panel displays the distribution of all exonic mutations identified in patients first diagnosed with LGMD2B (yellow vertical lines) or Miyoshi myopathy (orange vertical lines). The bottom panel displays the number of the various exonic mutational entries found in the index cases population and classified as missense and in-frame insertion or deletion mutations (blue vertical lines) or nonsense and frameshifting mutations (red vertical lines). Mutations below the red horizontal line represent recurrent mutations identified in at least three non-related index patients.

Table 2. List of DYSF founder mutations
Mutation nomenclature on cDNA (RNA, protein) | Geographic/ethnic origin of the population | Evaluated carrier frequency | Reference
c.1180_1180+7delAGTGCGTG (r.1054_1284del, p.Glu353_Leu429del) | Portuguese | Unknown | Vernengo et al., 2011
c.2372C>G (p.Pro791Arg) | Native Canadian | Unknown | Weiler et al., 1999
c.2779delG (p.Ala927LeufsX21) | Caucasian Jewish | 4% | Leshinsky-Silver et al., 2007
c.2875C>T (p.Arg959Trp) | Italian | Unknown | Cagliani et al., 2003
c.4872_4876delinsCCCC (p.Glu1624AspfsX10) | Libyan Jewish | 10% | Argov et al., 2000
c.5492G>A (exon skipping) | Portuguese | Unknown | Santos et al., 2010
c.5713C>T (p.Arg1905X) | Spanish (region of Sueca) | 2% | Vilchez et al., 2005

  1. Mutations are described using the official nomenclature of the Human Genome Variation Society, and relating to the human DYSF cDNA sequence of reference (isoform 8, GenBank #NM_003494.2).

Type of mutational events

Among the 266 different reported mutational events, the following types of mutations were identified: 175 single base substitutions (65.8%), 54 deletions (20.3%), 26 duplications (9.8%), 6 insertions (2.3%) and 5 insertion/deletions (1.9%). Among the total deletion and insertion events, 51.8% of deletions and 68.7% of insertions occurred within a repeated sequence. A total of 220 (82.7%) distinct mutations affect exonic sequences and the remaining 46 (17.3%) mutations involve change of intronic nucleotides. Altogether, among all disease-causing mutations in UMD-DYSF, exonic mutations segregate into missense mutations (33.1%), nonsense mutations (18.0%), frameshifting mutations (27.8%) and in-frame exonic insertions or deletions (3.8%) (Table 3A). The partition of the different mutation types found within the UMD-DYSF allele set is summarized in Table 3B. Moreover, UMD-DYSF reports six large rearrangements found in eight index patients involving deletion or duplication of one or several exons (Table 4). Because such large mutational events are not systematically searched for in genetic testing, this figure is expected to be an underestimate of the real frequency of large rearrangements [Krahn et al., 2009b].
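The proportions quoted in this paragraph can be recomputed from the raw counts. The short sketch below does so for the five types of mutational events; the variable names are chosen for this example only.

```python
# Counts of distinct mutational events as reported in the text
event_counts = {
    "single base substitutions": 175,
    "deletions": 54,
    "duplications": 26,
    "insertions": 6,
    "insertion/deletions": 5,
}

total_events = sum(event_counts.values())

# Share of each event type, in percent, rounded to one decimal place
percentages = {
    name: round(100 * count / total_events, 1)
    for name, count in event_counts.items()
}
```

The counts add up to the 266 distinct mutations, and the rounded shares match the percentages given above.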

Table 3. Types of disease-causing mutations recorded in UMD-DYSF
Type of mutations | A. Number of different mutations | B. Number of alleles from index patients* | C. Number of homozygous alleles from LGMD2B index patients* | D. Number of homozygous alleles from Miyoshi index patients*
Exonic point mutations | 136 (51.1%) | 365 (49.7%) | 70 (46.1%) | 62 (40.3%)
Missenses | 88 (33.1%) | 236 (32.1%) | 56 (36.8%) | 30 (19.5%)
Nonsenses | 48 (18.0%) | 129 (17.6%) | 14 (9.2%) | 32 (20.8%)
Exonic deletions and insertions | 84 (31.6%) | 261 (35.5%) | 54 (35.5%) | 74 (48.1%)
Deletions | 49 (18.4%) | 149 (20.3%) | 28 (18.4%) | 42 (27.3%)
Out of frame deletions | 45 (16.9%) | 139 (18.9%) | 24 (15.8%) | 40 (26.0%)
In frame deletions | 4 (1.5%) | 10 (1.4%) | 4 (2.6%) | 2 (1.3%)
Insertions | 30 (11.3%) | 74 (10.1%) | 14 (9.2%) | 16 (10.4%)
Out of frame insertions | 27 (10.2%) | 71 (9.7%) | 14 (9.2%) | 16 (10.4%)
In frame insertions | 3 (1.1%) | 3 (0.4%) | 0 (0.0%) | 0 (0.0%)
Indels | 5 (1.9%) | 38 (5.2%) | 12 (7.9%) | 16 (10.4%)
Out of frame indels | 2 (0.8%) | 33 (4.5%) | 6 (6.6%) | 0 (0%)
In frame indels | 3 (1.1%) | 5 (0.7%) | 2 (1.3%) | 16 (10.4%)
Intronic mutations | 46 (17.3%) | 109 (14.8%) | 28 (18.4%) | 18 (11.7%)
TOTAL | 266 (100%) | 735 (100%) | 152 (100%) | 154 (100%)

  1. All percentages are calculated with respect to the value in the TOTAL line. *For each patient, heterozygous disease-causing mutations are counted once and homozygous disease-causing mutations are counted twice.

Table 4. List of large rearrangements identified in the DYSF gene
Mutation nomenclature | Duplicated or deleted exons | Number of occurrence in probands
c.89-643_4411-2493del | 2 to 40 | 1
c.343-?_457+?del | 5 | 1
c.2512-?_3174+?del | 25 to 29 | 3
c.3904-?_4333+?dup | 37 to 39 | 1
c.5768-?_5946+?del | 52 | 1
c.6205-?*+?del | 55 | 1

  1. Large deletions can be displayed using the “Deletion Map” function on the UMD-DYSF website. Mutations are described using the official nomenclature of the Human Genome Variation Society, and relating to the human DYSF cDNA sequence of reference (isoform 8, GenBank #NM_003494.2).

Exonic variants

The 220 exonic mutations are distributed along the entire coding sequence, affecting regions of the protein both within and outside of predicted functional domains, and without any defined mutational hotspot (Fig. 1). A total of 122 exonic mutations are predicted to disrupt the open reading frame and/or to lead to a premature stop codon. These mutations can be classified into insertion or deletion events (74 frameshifting mutations) and nonsense mutations (48 mutations) (Table 3A) and are found evenly distributed along the coding sequence (Fig. 1). Overall, the events that presumably lead to the translation of a truncated and unstable dysferlin protein represent 50.6% of the proband allele population (Table 3B). We examined the distribution of missense and in-frame exonic insertion or deletion mutations and compared their proportion either within or outside annotated domains (Fig. 1). We show that mutations recorded in UMD-DYSF affect 3.3% of all amino acids residing outside annotated domains and 5.0% of all amino acids residing within domains. In particular, we confirm the susceptibility of the repeated DysF domain to mutations [Patel et al., 2008] as the UMD-DYSF mutations affect 7.9% of the amino acids within this domain. The “Structure” function summarizes the distribution of small rearrangements in structural domains and in possible calcium binding residues of the dysferlin protein. Within the group of proband alleles, 453 (81.3%) correspond to DYSF variants mutated within regions encoding a predicted structural or functional domain. Overall, C2 domains are the most frequently affected (266 mutational entries), followed by DysF and ferlin domains (126 and 60 mutational entries, respectively), whereas one single index patient was identified with a deleterious mutation (12 bp insertion/deletion) in the region coding for the carboxy-terminal transmembrane domain [Guglieri et al., 2008]. Interestingly, mutations in predicted calcium binding residues of C2 domains were reported for only three patients, within C2B, C2C and C2F domains [De Luna et al., 2007; Nguyen et al., 2005; Walter et al., 2003].

Splice variants

Among the 266 different mutational events reported in UMD-DYSF, 46 splice variants consist of both intronic and exonic mutations associated with a predicted or experimentally described abnormal splicing of the DYSF gene (Table 5). Intronic variants include 31 mutations directly affecting 5′ splice donor-sites, 14 mutations affecting 3′ splice acceptor-sites and one deleterious mutation within a branchpoint signal. In addition, two exonic mutations have been shown to produce aberrantly spliced transcripts by either abolishing the canonical donor splice site (c.5429G>A) [Santos et al., 2010] or by creating a novel ectopic acceptor splice site (c.1555G>A) [De Luna et al., 2007]. Altogether, splice variants constitute 14.8% of the allele population in UMD-DYSF index patients (Table 3B). Using dedicated functions included in UMD and HSF, a pathogenic effect on the splice donor or acceptor sites, or in the branchpoint (c.3443-33A>G), was correctly predicted in all cases, with the exception of one mutation, c.5525+3A>G. This mutation was shown to promote exon 49 skipping [De Luna et al., 2007]. HSF analysis predicts an effect on the splice donor site, but below the threshold of pathogenicity. However, possible effects on exonic splicing enhancer and silencer sites are also predicted, and may cause the experimentally proven exon 49 skipping in this case.

Table 5. List of reported splice mutations within the DYSF gene
| Localisation | Mutation nomenclature | Effect at the RNA | Original description |
| --- | --- | --- | --- |
| IVS3 | c.236+1G>T | r.143_236del (E3S, FS) | Liewluck et al. 2009 |
| IVS5 | c.457+1insG | r.spl? | Nguyen et al. 2005 |
| IVS5 | c.457+2T>G | r.343_457del (E5S, FS) | Cagliani et al. 2005 |
| IVS6 | c.663+1G>C | r.spl? | Saito et al. 2002 |
| IVS6 | c.664-9_667del13 | r.spl? | Klinge et al. 2010 |
| IVS8 | c.855+1delG | r.spl? | Nguyen et al. 2005 |
| IVS10 | c.937+1G>A | r.spl? | Saito et al. 2002 |
| IVS11 | c.1053+5G>A | r.spl? | Klinge et al. 2008 |
| IVS12 | c.1180+2T>C | r.spl? | Cuglieri et al. 2008 |
| IVS12 | c.1181-2A>C | r.1181_1212del (FS) | Cagliani et al. 2005 |
| IVS13 | c.1284+2T>C | r.spl? | Tagawa et al. 2003 |
| IVS13 | c.1285-2A>G | r.spl? | Spuler et al. 2008 |
| IVS14 | c.1353+1G>A | [r.1353+1_1354-1ins; r.1353+1g>a] (I14R, FS) | de Luna et al. 2007 |
| IVS14 | c.1354-1G>A | r.spl? | Klinge et al. 2010 |
| IVS16 | c.1480+1delG | r.1398_1480del (E16S, FS) | Therrien et al. 2006 |
| IVS16 | c.1481-1G>A | r.spl? | Rosales et al. 2010 |
| Exon17 | c.1555G>A | r.1523_1556del (FS) | de Luna et al. 2007 |
| IVS22 | c.2163-1G>T | r.spl? | Klinge et al. 2010 |
| IVS24 | c.2511+1G>A | r.spl? | Nguyen et al. 2005 |
| IVS25 | c.2643+1G>A | r.spl? | Matsuda et al. 2001 |
| IVS25 | c.2643+2T>C | r.spl? | Klinge et al. 2010 |
| IVS25 | c.2643+2T>G | r.2512_2643del (E25S, IF) | Therrien et al. 2006 |
| IVS25 | c.2644-2A>G | r.spl? | Matsuda et al. 2001 |
| IVS26 | c.2810+1G>A | r.spl? | Nguyen et al. 2005 |
| IVS26 | c.2810+1G>C | r.spl? | Cuglieri et al. 2008 |
| IVS28 | c.3031+2T>C | r.spl? | Nguyen et al. 2005 |
| IVS30 | c.3348+1delGTAT | r.spl? | Nguyen et al. 2005 |
| IVS30 | c.3349-2A>G | r.spl? | Klinge et al. 2010 |
| IVS31 | c.3443-33A>G | r.3443_3520del (E32S, IF) | Sinnreich et al. 2006 |
| IVS33 | c.3702+1G>A | r.spl? | Nguyen et al. 2005 |
| IVS33 | c.3703-1G>A | r.spl? | Nguyen et al. 2005 |
| IVS34 | c.3843+1G>A | r.spl? | Nguyen et al. 2005 |
| IVS34 | c.3843+2T>A | r.spl? | Rosales et al. 2010 |
| IVS37 | c.4005+1G>A | r.spl? | Nguyen et al. 2005 |
| IVS38 | c.4167+1G>C | r.spl? | Nguyen et al. 2005 |
| IVS40 | c.4411-5C>G | r.spl? | Klinge et al. 2008 |
| IVS45 | c.5057+4_delCGT | r.?_5057del (FS) | Cagliani et al. 2003 |
| IVS45 | c.5057+5G>A | r.spl | McNally et al. 2000 |
| IVS45 | c.5057+4_5057+5ins23 | r.spl? | Anderson et al. 2000 |
| IVS46 | c.5200+1G>A | r.spl? | Cagliani et al. 2005 |
| IVS47 | c.5341-1G>A | r.spl? | Klinge et al. 2010 |
| Exon48 | c.5429G>A | r.5341_5429del * (E48S, FS) | Santos et al. 2010 |
| IVS48 | c.5430-2A>G | r.spl? | Kesari et al. 2008 |
| IVS49 | c.5525+3A>G | r.5430_5525del (E49S, IF) | de Luna et al. 2007 |
| IVS49 | c.5526-1G>A | r.spl? | Rosales et al. 2010 |
| IVS50 | c.5668-7G>A | [r. 5668-5_5668-1ins;r.5668-7g>a] (FS) | Cagliani et al. 2005 |
| IVS51 | c.5767+1G>A | r.spl? | Nguyen et al. 2005 |
| IVS52 | c.5946+1G>A | r.spl? | Liu et al. 1998 |

Mutations affect canonical intronic splice signals (5′ and 3′ splice sites, branchpoints) or exonic nucleotides. Effect on RNA splicing was either predicted (r.spl?) or experimentally described. Disruption of canonical splice signals or creation of novel splice signals can promote exon skipping (ES), intron retention (IR), or other sequence insertion/deletion in the mRNA. Mutations are predicted to either maintain the reading frame (IF) or introduce a frameshift (FS) leading to the translation of a truncated product and possibly to nonsense-mediated mRNA decay. * Predominant transcript. Mutations are described using the official nomenclature of the Human Genome Variation Society, and relating to the human DYSF cDNA sequence of reference (isoform 8, GenBank #NM_003494.2).
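
As an illustration of how the intronic entries in the table above can be read, the following minimal sketch parses simple intronic substitutions written in the c. notation and reports whether they fall on the canonical splice-site dinucleotides (positions +1/+2 of the donor site, -1/-2 of the acceptor site). The function name, the regular expression and the returned fields are hypothetical and are not part of the UMD software; insertions, deletions and exonic variants are deliberately left unhandled.

```python
import re

# Matches only simple intronic substitutions such as "c.236+1G>T" or "c.1181-2A>C".
# Hypothetical helper for illustration; not the UMD-DYSF implementation.
INTRONIC_SUBSTITUTION = re.compile(
    r"^c\.(?P<anchor>\d+)(?P<sign>[+-])(?P<offset>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def classify_intronic_substitution(variant):
    """Return a small description of an intronic substitution, or None if unsupported."""
    match = INTRONIC_SUBSTITUTION.match(variant)
    if match is None:
        return None  # insertion, deletion, exonic variant, or malformed string
    offset = int(match.group("offset"))
    # "+" counts from the last exonic nucleotide (donor side),
    # "-" counts back from the first nucleotide of the next exon (acceptor side).
    site = "donor" if match.group("sign") == "+" else "acceptor"
    return {
        "anchor": int(match.group("anchor")),
        "offset": offset,
        "site": site,
        # Simplification: only the invariant dinucleotides are called canonical here.
        "canonical_dinucleotide": offset <= 2,
    }


if __name__ == "__main__":
    for name in ("c.236+1G>T", "c.1181-2A>C", "c.3443-33A>G", "c.457+1insG"):
        print(name, classify_intronic_substitution(name))
```

A variant such as c.3443-33A>G is reported as lying on the acceptor side but outside the canonical dinucleotide, which is consistent with deep intronic changes requiring experimental confirmation of their effect on splicing.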

Mutation status

Altogether, 280 patients carry at least one homozygous mutation. Among them, two patients carry two or three homozygous mutations (F1-47-1-2 and F1-18-1-2) and three patients carry one homozygous mutation and one heterozygous mutation (UK2-29-1-0, UK2-47-1-0 and UK2-49-1-0). A total of 176 patients carry at least two compound heterozygous mutations, including two patients carrying three heterozygous mutations (F1-65-1-2 and UK2-35-1-0). The identification of more than two distinct possibly disease-causing mutations in a patient may be related to the existence of hypomorphic sequence variants or complex alleles. For 102 patients, only one heterozygous disease-causing mutation was identified. Among these are two symptomatic dysferlin mutation carriers described by Illa and colleagues [Illa et al., 2007]. Overall, both disease-causing alleles were identified in 323 index patients (80.5%), whereas only one disease-causing allele was identified in the other 78 index cases (19.5%), which underlines the incomplete sensitivity of the currently used mutation detection techniques. However, these figures do not reflect the overall detection rate of dysferlin mutation screening procedures: in an estimated 10% of patients with a clinical diagnosis of dysferlinopathy, mutational analyses did not identify any disease-causing mutation in the dysferlin gene, and these patients are therefore not recorded in UMD-DYSF (the inclusion criterion being the identification of at least one deleterious mutation).
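
The counts quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below assumes that the three patient groups (homozygous, compound heterozygous, single heterozygous) do not overlap; the variable names are illustrative only.

```python
# Patient-level counts quoted in the text (assumed to be non-overlapping groups).
homozygous = 280
compound_heterozygous = 176
single_heterozygous = 102
total_patients = homozygous + compound_heterozygous + single_heterozygous

# Index-patient counts quoted in the text.
both_alleles = 323
one_allele = 78
index_patients = both_alleles + one_allele


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(total_patients)                         # 558
print(index_patients)                         # 401
print(percent(both_alleles, index_patients))  # 80.5
print(percent(one_allele, index_patients))    # 19.5
```

Under that assumption, the three groups add up to the 558 patients reported for UMD-DYSF v1.0, and the two percentages correspond to 401 index patients.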

Comparison of mutational profiles of the LGMD2B and Miyoshi myopathy phenotypes

Dysferlinopathies are characterized by two main clinical phenotypes, LGMD2B and Miyoshi myopathy, and by additional clinical variants, and thus present a broad range of symptoms and ages of onset. The genotype-phenotype relationship has remained difficult to define. In UMD-DYSF, 88% of patients present with either an LGMD2B or a Miyoshi myopathy phenotype, as described in the original publications. We compared the distribution of the mutations along the DYSF gene (Fig. 1) and the type of mutations between the two main clinical groups (Table 3C and D, restricted to patients with one homozygous mutation), and no significant difference was observed between them (chi-square test, p>0.01). Therefore, available mutational data do not reveal any genotype-phenotype correlation for dysferlin mutations with regard to the two main clinical presentations, LGMD2B and Miyoshi myopathy. It can be speculated that the observed clinical heterogeneity in dysferlinopathies is instead related to genetic or environmental modifiers.
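
To make the kind of comparison described above concrete, the following minimal sketch computes a Pearson chi-square statistic for a phenotype-by-mutation-type contingency table. The counts are HYPOTHETICAL placeholders, not the values of Table 3C and D, and the function names are illustrative; the closed-form p-value used here is valid only for two degrees of freedom (a 2 × 3 table).

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of phenotype and mutation type.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def p_value_two_df(statistic):
    """Upper-tail p-value of the chi-square distribution, valid for df == 2 only."""
    return math.exp(-statistic / 2.0)


# HYPOTHETICAL counts for illustration only (columns: three mutation types).
hypothetical_counts = [
    [30, 25, 20],  # LGMD2B
    [28, 27, 20],  # Miyoshi myopathy
]

statistic, df = chi_square_statistic(hypothetical_counts)
if df == 2:
    print(statistic, df, p_value_two_df(statistic))
```

With real data, a p-value above the chosen threshold would be read, as in the text, as the absence of a detectable difference between the two phenotypic groups rather than as proof that no difference exists.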

Database Update

The UMD-DYSF v1.0 database and subsequent updated versions are available at www.umd.be/DYSF/. Curation of the UMD-DYSF database by a dedicated curator will allow continuous updating. Clinicians and researchers are encouraged to submit unpublished variants by contacting the curator of the database. Notification of omissions and errors in the current version, as well as specific phenotypic data, would be gratefully received by the curator. The software package is available on a collaborative basis and will be expanded as the database grows, with the implementation of new specific functions according to the requirements of its users. In referring to UMD-DYSF, we kindly ask all users of the database to cite this article.

Acknowledgements

We sincerely thank Kate Bushby, Brigitta von Rekowski and Hanns Lochmüller for helpful advice on the UMD-DYSF website, Andrew Phillips for his help with the HGMD statistics, and Bruno Eymard, Jean Pouget, Shahram Attarian and Emmanuelle Campana-Salort for helpful discussions.
