Transcriptional and Translational Effects of Intronic CAPN3 Gene Mutations

Variants of unknown significance in the CAPN3 gene constitute a significant challenge for genetic counselling. Despite the frequency of intronic nucleotide changes in this gene (15–25% of all mutations), their pathogenicity has so far only been inferred by in-silico analysis and, occasionally, proven by RNA analysis. In this study, 5 different intronic variants (one novel) that bioinformatic tools predicted would affect RNA splicing underwent comprehensive studies designed to prove that they are disease-causing. Muscle mRNA from 15 calpainopathy patients was analyzed by RT-PCR and splicing-specific-PCR tests. We established the previously unrecognized pathogenicity of these mutations, which caused aberrant splicing, most frequently by the activation of cryptic splicing sites or, occasionally, by exon skipping. The absence or severe reduction of protein demonstrated their deleterious effect at translational level. We concluded that bioinformatic tools are valuable to suggest the potential effects of intronic variants; however, experimental demonstration of pathogenicity is not always straightforward even when using RNA analysis (low abundance, degradation mechanisms), and it might not be successful unless splicing-specific-PCR tests are used. A comprehensive approach is therefore recommended to identify and describe unclassified variants in order to offer essential data for basic and clinical geneticists. ©2010 Wiley-Liss, Inc.


INTRODUCTION
Autosomal recessive limb girdle muscular dystrophies (LGMD) are a group of disorders characterized by progressive involvement of proximal limb girdle muscles, including at least 14 different genetic entities.
LGMD2A [MIM# 253600] is the most prevalent form of LGMD in many countries, and is caused by mutations in the CAPN3 gene [MIM# 114240] encoding calpain-3 protein. About 350 different CAPN3 gene mutations have so far been reported on the Leiden Muscular Dystrophy Database, most of which are private and distributed throughout the gene. Approximately 70% of mutant alleles are of missense type; the remainder are null mutations (deletion/insertion causing frame shifting, nonsense, and splice site mutations), large genomic rearrangements, and synonymous or intronic changes causing aberrant splicing.
While the truncating mutations are considered to be causative, the pathological significance of nucleotide changes localized in intronic regions is very difficult to predict, which makes distinguishing between benign and pathogenetic variants a challenge and compromises conclusive genetic counselling. Indeed, many intronic changes remain of uncertain clinical significance (UVs), thus partially explaining the observation that, although a severe calpain-3 protein defect in muscle usually corresponds to primary LGMD2A, about 20% of biochemical defects remain without any molecular proof [Fanin et al., 2009a]. Many CAPN3 intronic variants have been identified during diagnostic screening: they account for about 15% of the total variants listed at the Leiden Database (28% in splice sites, 72% in deep intronic position), and for about 25% of the mutations reported in other studies [Blazquez et al., 2008; Krahn et al., 2006]. However, it is conceivable that the true frequency of intronic mutations has largely been underestimated, because deep intronic sequences are not conventionally sequenced and most aberrant transcripts are expressed at low abundance and are usually prone to degradation by the nonsense-mediated mRNA decay (NMD) mechanism [Maquat, 2004]. For the majority of intronic variants the consequences on mRNA splicing have only been inferred by in-silico analysis [Duno et al., 2008], whereas experimental demonstration of their pathogenicity has been obtained by mRNA studies for only 1% of them [Leiden Database; Haffner et al., 1998; Krahn et al., 2007; Blazquez et al., 2008]. In this study we have exploited the availability of diagnostic muscle biopsies to demonstrate previously unrecognized pathogenetic effects, at both transcriptional and translational levels, of 5 different intronic mutations in the CAPN3 gene.

Selection of patients and muscle biopsies
From a population of over 100 LGMD2A patients who had received a biochemical and genetic characterization in our Centre, 16 cases were selected because of the presence of one intronic variant in the CAPN3 gene with an unknown pathogenetic effect. In all cases except one, the second mutant allele has been identified. Patients underwent diagnostic open muscle biopsies after written consent had been obtained. As normal controls, we used genomic DNA, cDNA from muscle tissue and calpain-3 protein from subjects who were free of any neuromuscular disorder.

Protein analysis by western blotting
Semi-quantitative analysis of calpain-3 protein in muscle was conducted as reported [Fanin et al., 2009b]. The quantity of immunoreactive bands at 94 kDa (full-length protein) was determined by densitometry and expressed as a percentage of control.
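The quantification described above reduces to a ratio of ratios. The following is a minimal sketch in Python, assuming that each lane yields one band intensity for calpain-3 (94 kDa) and one for a loading reference (myosin normalization is mentioned in the figure legends). The function name and the intensity values are illustrative placeholders, not measurements from this study.

```python
def percent_of_control(sample_band, sample_ref, control_band, control_ref):
    """Return the reference-normalized band intensity as a percentage of control."""
    if min(sample_ref, control_band, control_ref) <= 0:
        raise ValueError("reference and control intensities must be positive")
    sample_ratio = sample_band / sample_ref      # calpain-3 / myosin in the patient lane
    control_ratio = control_band / control_ref   # calpain-3 / myosin in the control lane
    return 100.0 * sample_ratio / control_ratio


# Placeholder densitometry readings (arbitrary units), NOT data from the study.
example = percent_of_control(sample_band=120.0, sample_ref=800.0,
                             control_band=480.0, control_ref=1600.0)
print(f"{example:.0f}% of control")
```

Normalizing each lane to its own reference band before comparing with the control keeps differences in loading from being read as differences in calpain-3 quantity.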

In-silico analysis of unclassified intronic variants (UVs)
The potential effect of intronic variants was predicted with the following splice-site prediction programs (SSPPs): Human Splicing Finder (HSF V2.4 at http://www.umd.be/HSF) [Desmet et al., 2009], Splice Site prediction by Neural Network (NNSPLICE V0.9 at www.fruitfly.org/seq_tools/splice.html), Splice View (http://bioinfo.itb.cnr.it/oriel/splice-view.html) and NetGene2 (at www.cbs.dtu.dk/services/NetGene2). Score variation in canonical splice-site use and potential activation of cryptic splice sites in the presence of mutations were evaluated by analysing the exon and intron sequences in proximity to the mutations using default settings (Supp. Table S1).
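The comparison of wild-type and mutant scores across several programs can be sketched as follows. This is only an illustration in Python: the tool labels, the scores and the 10% cut-off are hypothetical placeholders chosen for the example, not values produced by the programs listed above or thresholds used in this study.

```python
def relative_change(wild_type_score, mutant_score):
    """Percent change of the mutant splice-site score relative to wild type."""
    if wild_type_score == 0:
        raise ValueError("wild-type score must be non-zero")
    return 100.0 * (mutant_score - wild_type_score) / wild_type_score


# Hypothetical (wild-type, mutant) scores for one canonical splice site.
# Each program uses its own scale, so only relative changes are compared.
scores = {
    "tool_a": (90.0, 72.0),
    "tool_b": (0.95, 0.40),
    "tool_c": (80.0, 80.0),
}

THRESHOLD = -10.0  # assumed cut-off (percent) for calling a site "weakened"

changes = {tool: relative_change(wt, mut) for tool, (wt, mut) in scores.items()}
weakened = [tool for tool, delta in changes.items() if delta <= THRESHOLD]
print(weakened)
```

Expressing each result as a relative change makes predictions from programs with different score scales comparable, which helps when the programs disagree.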

RNA analysis
Total RNA was isolated from muscle biopsies using the SV Total RNA Isolation System kit (Promega, Madison, WI), including treatment with DNase I, and reverse transcribed to cDNA with Superscript III reverse transcriptase (Invitrogen, San Diego, CA) and random hexamers. In one patient, total RNA was also isolated from blood using the PAXGene blood RNA System (PreAnalytiX, Qiagen, Hilden, Germany). cDNA was amplified with specific calpain-3 primers designed on the published human calpain-3 mRNA sequence (GenBank accession number: NM_000070). PCR products were analyzed by gel electrophoresis: aberrant bands were manually excised and processed using the QIAquick Gel Extraction kit (Qiagen) for sequencing. Furthermore, we designed splicing-specific and skipping-specific primers (sequences available on request) to selectively amplify the abnormal transcripts predicted by bioinformatic tools, even when these were not detectable by RT-PCR. Each splicing-specific PCR was tested on both patients and normal controls, and the products were directly sequenced. Details of in-silico predictions of splicing are reported in Supp. Table S1.
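One common way to make a primer selective for an aberrant transcript is to place it across a junction that exists only in that transcript. The sketch below illustrates the idea in Python under simplifying assumptions: the sequences are short hypothetical placeholders (not CAPN3 sequence and not the primers used in this study), and "annealing" is approximated by an exact match on the sense strand.

```python
def junction_primer(upstream, downstream, left=12, right=8):
    """Build a forward primer spanning the upstream/downstream junction."""
    if len(upstream) < left or len(downstream) < right:
        raise ValueError("flanking sequences are shorter than the requested overhangs")
    return upstream[-left:] + downstream[:right]


# Hypothetical placeholder sequences, for illustration only.
exon_a = "ATGGCTGACCTGAAGGTCATCGGA"
intron_start = "GTAAGTCCTGAGGCTTAACG"
exon_b = "CTGGAGAACATCTCCAAGGTG"

normal = exon_a + exon_b                        # correctly spliced transcript
aberrant = exon_a + intron_start[:10] + exon_b  # cryptic donor retains 10 intronic nt

primer = junction_primer(exon_a, intron_start)

# Exact-match check: the primer sequence is present only in the aberrant transcript.
print(primer in aberrant, primer in normal)
```

Because the 3' end of the primer lies in the retained intronic stretch, the correctly spliced transcript offers no full match, which is what allows low-abundance aberrant products to be amplified selectively. A real design would also have to consider melting temperature and mismatch tolerance, which this sketch ignores.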

RESULTS
We investigated the pathogenetic effect in muscle of 5 different intronic variants (one novel) in the CAPN3 gene, including 3 nucleotide substitutions in the canonical donor or acceptor splice sites and 2 more deep intronic variations, and demonstrated that they all caused abnormal mRNA splicing and abolished or impaired protein synthesis (Table 1, Table 2, Supp. Table S1). These intronic mutations accounted for 9% of the total mutant alleles and 7% of all the mutations found in our population of over 100 LGMD2A patients.

Variation c.1030-1G>A (intron 7)
This was the first time this mutation had been identified in an LGMD2A patient. The patient was a compound heterozygote for a null mutant allele (c.550delA). This mutation abolishes the canonical acceptor splice-site, but the expected skipping of exon 8 was not detected. From SSPPs analysis, this intronic variant was expected to create a novel acceptor splice site (A1) with consequent deletion of the first nucleotide of exon 8 (r.1030delG). cDNA sequencing confirmed this deletion, which resulted in the creation of a downstream premature stop codon (p.V344SfsX8). Calpain-3 protein in muscle was absent, demonstrating the deleterious effect of this mutation at translational level.
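The way a single-nucleotide deletion at the start of an exon shifts the reading frame and introduces a premature stop codon can be illustrated with a toy example. The Python sketch below uses the standard genetic code and a short hypothetical coding sequence (not the CAPN3 sequence); the names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon ('*')."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# Hypothetical two-exon coding sequence; the second exon starts with G.
exon_1 = "ATGGCC"
exon_2 = "GTGCATGACCTGAAGTAA"

wild_type = exon_1 + exon_2
mutant = exon_1 + exon_2[1:]   # loss of the first nucleotide of the second exon

print(translate(wild_type))
print(translate(mutant))
```

In this toy sequence the shifted frame runs into a stop codon a few codons downstream of the deletion, which mirrors the type of consequence described for r.1030delG.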

Variation c.1524+1G>C (intron 11)
This mutation was identified in one LGMD2A patient in our series, who was a compound heterozygote for a missense mutant allele (p.M252T) and showed a severely reduced calpain-3 protein level in muscle (5% of control), revealing its deleterious effect at translational level. This mutation abolishes the canonical donor splice site, but a transcript carrying the skipping of exon 11 was not detected, even when using skipping-specific primers. Following SSPPs analysis, this variant was expected to produce two different aberrant transcripts (Figure 1): one originating from the use of a cryptic donor site (D1), leading to a deletion of the last 99 basepairs in exon 11 (r.1425_1524del), and another (D2) resulting from the exonization (intron retention) of 212 basepairs in intron 11 (r.1524_1525ins1524+1_1524+31;1524+1g>c; p.E508_V509ins 508+1_508+70). The D1 transcript was characterized by cDNA amplification, which showed a shortened product corresponding to the expected deletion, as identified by sequencing (p.V476_E508del). Further confirmation was obtained by PCR with D1 splicing-specific primers. While this transcript was sufficiently expressed and stable enough to be easily detected by cDNA analysis in muscle, the D2 transcript was detected only by splicing-specific PCR.

Variation c.1992+1G>T (intron 17)
This mutation was identified in 3 unrelated LGMD2A patients in our series; its deleterious effect at translational level was demonstrated by the observation that it was associated with absent or virtually absent protein when found in a compound heterozygote state with different mutant alleles. The mutation abolishes the canonical donor splice site, and we demonstrated that it caused the skipping of exon 17 (r.1915_1992del; p.P639_D664del) in muscle. This result was also confirmed in blood mRNA from one patient. The resulting transcript (WTD) is probably unstable and poorly expressed, since it was detectable only with skipping-specific primers. At least 3 cryptic splicing sites in the region surrounding this mutation obtained a high score from SSPPs analysis, but only one of them (D1) was shown by splicing-specific primers to be effectively activated in muscle and blood cDNA. This cryptic splice site generated an aberrant transcript carrying the insertion of 31 basepairs in intron 17 (r.1992_1993ins1992+1_1992+31;1992+6g>u; p.D664_D665ins 664+1_664+10) (Figure 2).
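The correspondence between the RNA-level and protein-level descriptions of the exon 17 skipping can be checked with simple coordinate arithmetic. The Python sketch below takes the coordinates r.1915_1992del from the text; the helper names are illustrative, and the calculation assumes that numbering starts at the first nucleotide of the start codon.

```python
def codon_of(nt_pos):
    """Codon number containing a coding-sequence nucleotide (1-based numbering)."""
    return (nt_pos - 1) // 3 + 1


def describe_deletion(first_nt, last_nt):
    """Summarize a coding-sequence deletion given its first and last nucleotide."""
    length = last_nt - first_nt + 1
    return {
        "length_nt": length,
        "in_frame": length % 3 == 0,        # reading frame preserved downstream
        "first_codon": codon_of(first_nt),
        "last_codon": codon_of(last_nt),
    }


exon17_skip = describe_deletion(1915, 1992)
print(exon17_skip)
```

The deletion spans 78 nucleotides, a multiple of three, and covers codons 639 to 664, in agreement with the protein-level description p.P639_D664del.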

Variation c.1193+6T>A (intron 9)
This mutation was identified in 6 unrelated LGMD2A patients in our series, all from the same administrative district of the Veneto Region (a possible founder effect followed by genetic isolation might have occurred). Among the 6 patients with this mutation, one was the object of a family study (Figure 3): we conducted both a segregation analysis of the mutant allele and a study of the effect of this mutation at calpain-3 protein level in 2 different family members (a muscle biopsy was obtained from both an affected girl and her heterozygote father, who reported hyperCKemia before the diagnosis was obtained in his daughter).
The deleterious effect of this mutation at translational level was demonstrated by three observations: 1) when it was associated with a null mutant allele, it produced absent or very reduced amounts of protein; 2) when it was expressed in heterozygote state, it produced one half of the amount of protein; 3) when it was associated with a missense mutant allele which, as we had previously reported, caused the loss of functional autolytic activity without any quantitative defect, it produced normal protein quantities. This variant was not predicted to cause the loss of the canonical donor splice site, but the score for its use was reduced by all the SSPPs algorithms used. The generation of a correctly spliced transcript would therefore still be possible in vivo, but our protein data from patients carrying a null second allele suggest that, if this is the case, it takes place only to a very limited extent. Furthermore, following SSPPs analysis, this variation was expected to affect correct splicing by use of an alternative cryptic donor splice site (D1), resulting in the insertion of 31 basepairs at the beginning of intron 9 (r.1193_1194ins1193+1_1193+31; 1193+6u>a; p.M399X). The D1 transcript was sufficiently expressed to be detected by RT-PCR analysis: it produced an extra band which was detectable after gel electrophoresis and which was excised and sequenced (Figure 3). No previous functional characterization of this mutation has been provided.
Figure 3. RT-PCR analysis of a normal control (C) and a heterozygous mutant patient (II-1), who shows the WT product and one additional low-abundance product corresponding to the alternatively spliced mRNA (D1). Family pedigree (case n. 7652) and western blot show that this intronic mutation (filled symbol) produced a reduction of calpain-3 protein to about 50% of control (C) after myosin normalization, as assessed in the muscle biopsy from both the heterozygous father (I-1) and his affected daughter (II-1), who was a compound heterozygote for a second missense mutation (p.E435K, dashed symbol).

Variation c.1746-20C>G (intron 13)
Conversely, in our study, the deleterious effect of this mutation has definitely been demonstrated at both translational (when this mutation was associated with another mutant allele, it always resulted in severely reduced protein quantity; Figure 4) and transcriptional levels. Indeed, this mutation had been predicted by the HSF algorithm to create a new acceptor splice site (A1), which was detected in muscle mRNA by splicing-specific PCR. Furthermore, numerous cryptic splicing sites in the region surrounding this mutation obtained a high score by SSPPs analysis and were expected to cause the insertion of different portions of intron 13. Three such transcripts (A2, A3, A4) were detected in our patients by cDNA analysis. However, even though none of the algorithms used predicted the loss of the canonical acceptor splice site (and only a slight decrease in its score was obtained for two of them), we identified an aberrant transcript (WTA) carrying the insertion of the entire intron 13. All these transcripts were variably expressed and detected in the patients, but while 4 aberrant transcripts (A1, A2, A3, A4) were detected only in mutant patients, suggesting their pathogenetic effect, the variant WTA was also expressed in 10 normal controls, indicating its non-pathogenicity. We attribute this latter transcript to an alternative splicing event that takes place in normal tissues, as previously reported [Kawabata et al., 2003; De Tullio et al., 2003].
Figure 4. Family pedigree (case n. 6211) and western blot showing that this intronic mutation (filled symbol) produced a deleterious effect at protein level, as demonstrated by the complete loss of calpain-3 protein in the muscle from one affected patient (II-1) who was a compound heterozygote for a second missense mutation that has a deleterious effect as well (p.L204V, dashed symbol).
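The reasoning applied to the intron 13 transcripts (only isoforms never seen in normal controls are retained as candidate pathogenic products) amounts to a set difference. The Python sketch below uses the transcript labels from the text; the function name is illustrative.

```python
def classify_transcripts(in_patients, in_controls):
    """Split aberrant transcripts into patient-only and control-shared groups."""
    patient_only = sorted(in_patients - in_controls)   # candidate pathogenic products
    shared = sorted(in_patients & in_controls)         # also present in normal tissue
    return patient_only, shared


detected_in_patients = {"A1", "A2", "A3", "A4", "WTA"}
detected_in_controls = {"WTA"}

patient_only, shared = classify_transcripts(detected_in_patients, detected_in_controls)
print(patient_only, shared)
```

Testing every splicing-specific PCR on normal controls as well as patients is what makes this distinction possible: a transcript such as WTA would otherwise be mistaken for a consequence of the mutation.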

DISCUSSION
Despite the frequent identification of intronic changes in the CAPN3 gene, the demonstration of their pathogenetic effect has only occasionally been sought or achieved, both because of the laborious investigations involved and because mRNA/protein studies in LGMD2A are often limited by the unavailability of a muscle biopsy. Although assessment of the pathogenetic effects of such variants is challenging, the effort required is justified because successful results are crucial for offering definitive diagnoses and conclusive genetic counselling, for establishing genotype-phenotype correlations, and in view of future therapies.
Using in-silico predictions combined with RT-PCR analyses on muscle mRNA, we demonstrated the pathological significance of 5 UVs, showing their effects on splicing and protein translation.
We showed that CAPN3 mRNA degradation by NMD is not a drawback in muscle RT-PCR analyses, provided that these are combined with SSPPs analysis and the use of splicing-specific PCR tests. SSPPs are valuable tools to select variants that are predicted to impair correct RNA splicing and to direct the subsequent variant-specific analyses at RNA level. However, they may provide contradictory or negative results, requiring the use of multiple programs to resolve the discrepancies. Among the 4 programs we used in this study, the most reliable and informative was HSF, which also contains matrices for the prediction of effects on putative enhancer/silencer sequences.
Previous studies have reported different results for 3 of the 5 variants we analyzed (c.1992+1G>T, c.1524+1G>C and c.1746-20C>G). In particular, one study on blood mRNA identified only one aberrant transcript (D1) for the mutation c.1992+1G>T and failed to detect any mRNA for the mutation c.1524+1G>C [Blazquez et al., 2008]. We suggest that this discrepancy could be due to the different approach used. Indeed, using splicing-specific PCR combined with in-silico predictions, we demonstrated that all the aberrant transcripts for the c.1992+1G>T mutation identified in muscle were also expressed in blood. Another possible explanation may be the use of a tissue other than muscle. Earlier studies reported the unreliability of RNA splicing analysis from blood samples, both because of its different splicing pattern and because degenerate or illegitimate splicing may occur under diverse storage conditions [Wimmer et al., 2000]. These observations highlight how critical the choice of the tissue to be studied is. Furthermore, because of the lack of molecular proof, some intronic variants have been reported either as polymorphisms or as "possibly pathogenetic", thus generating confusion and compromising conclusive genetic counselling. This is the case of the variation c.1746-20C>G in intron 13 [Hermanova et al., 2006; Stehlikova et al., 2007; Krahn et al., 2007; Groen et al., 2007], for which we have provided a definite demonstration of the pathogenetic effect.
In our study, the 3 mutations localized at conserved canonical donor/acceptor splice sites clearly showed their pathogenicity at both RNA and protein level, causing the loss of the splice site and resulting in the absence of protein.
For the 2 mutations localized in less conserved regions near the exon/intron boundaries, SSPPs did not predict the loss of canonical splice sites, but only provided a decreased score, indicating a higher probability of use of nearby cryptic splicing sites, which we indeed demonstrated experimentally. Patients carrying these 2 latter mutations and heterozygous for a different frame-shifting mutation showed either absent or severely reduced calpain-3 protein, indicating that, if correct splicing were still possible, it might occur only to a very limited extent.
The majority of the aberrant transcripts characterized in this study contained a frameshift and were either expressed at very low abundance or detectable only by splicing-specific-PCR. This result agrees with the trace amounts of the corresponding transcript identified in LGMD2A patients who were compound heterozygous for 2 null mutations [Stehlikova et al., 2007]. Conversely, the transcripts originating from the c.1524+1G>C and c.1992+1G>T variants contained in-frame insertions/deletions but were nevertheless expressed at low abundance, possibly because alternative splicing may limit the export of mRNA, making it a target for degradation. These transcripts could potentially have generated longer or truncated protein products, which, however, were not detected.
One conclusion from our study is that, although SSPPs analysis helps in addressing the potential effects of intronic variations, experimental demonstration of these effects by mRNA analysis is not always straightforward (low abundance of transcripts, NMD mechanism) and might prove unsuccessful unless the individual transcripts are identified by both splicing-specific-PCR tests and sequencing. Furthermore, SSPPs analysis proved to be very valuable and reliable in predicting aberrant splicing for variants localized in the intronic 5' and 3' splice-site regions, whereas it was neither conclusive nor always successful in predicting the consequences of the loss of a splice site. For this purpose, however, it was useful to check the regions surrounding the nucleotide variant for the presence of potential cryptic splicing sites and for the loss/gain of splicing regulatory elements [Faustino and Cooper, 2003; Shapiro and Senapathy, 1987].
We verified that a common effect of intronic variations in the CAPN3 gene, occurring in all 5 mutations studied, is the aberrant splicing caused by the activation of a series of cryptic splicing sites near the mutant nucleotide, and the creation/disruption of potential silencer/enhancer motifs. Mutation analysis by SSPPs should therefore involve the sequence context, in order to obtain more informative predictions. Conversely, exon skipping was observed in only one case in our series, suggesting either that this is a less frequent mechanism for the variants analysed (possibly because of the presence of a nearby cryptic splice site) or that transcripts carrying exon skipping are more unstable and prone to degradation by NMD.
Although experimental demonstration of the pathogenetic effects of intronic mutations is difficult and laborious, we found that it is often successful when this effort is based on a preliminary SSPPs analysis. This aim should be pursued more frequently because of its important consequences for clinical and genetic counselling, for establishing genotype-phenotype correlations and providing novel insights into the complex mechanism of splicing.

ACKNOWLEDGMENTS
We are very grateful to patients for their generous cooperation. This work was supported by the "Association Française contre les Myopathies", the Italian Telethon and the EuroBioBank Network.