Alternative splicing: role of pseudoexons in human disease and potential therapeutic strategies


E. Buratti, Padriciano 99, 34012 Trieste, Italy
Fax: +39 040 226555
Tel: +39 040 3757316


What makes a nucleotide sequence an exon (or an intron) is a question that still lacks a satisfactory answer. Indeed, most eukaryotic genes are full of sequences that look like perfect exons, but which are nonetheless ignored by the splicing machinery (hence the name ‘pseudoexons’). The existence of these pseudoexons has been known since the earliest days of splicing research, but until recently the tendency has been to view them as an interesting, but rather rare, curiosity. In recent years, however, the importance of pseudoexons in regulating splicing processes has been steadily revalued. Even more importantly, clinically oriented screening studies that search for splicing mutations are beginning to uncover a situation where aberrant pseudoexon inclusion as a cause of human disease is more frequent than previously thought. Here we aim to provide a review of the mechanisms that lead to pseudoexon activation in human genes and how the various cis- and trans-acting cellular factors regulate their inclusion. Moreover, we list the potential therapeutic approaches that are being tested with the aim of inhibiting their inclusion in the final mRNA molecules.


3′ss, 3′ splice site

5′ss, 5′ splice site

AON, antisense oligonucleotide

LINE, long interspersed elements

NMD, nonsense-mediated decay

PTB, polypyrimidine tract binding protein

SINE, short interspersed elements


Towards the end of the 1970s, at the beginning of pre-mRNA splicing research [1,2], defining exons and introns was essentially based on observing the final composition of the mature mRNA molecule. In 1978, any sequence that was included in a mature mRNA became tagged as an ‘exon’, whereas all the intervening genomic sequences that were left out during the splicing process became defined as ‘introns’ [3]. However, this way of thinking did not explain what makes an exon an exon or an intron an intron. The discovery of the basic splice site consensus sequences during the same years [4,5], and later on of enhancer and repressor elements, has taken us a long way in the direction of discovering exon- and intron-definition complexes [6–8]. Nowadays, the identification of the splicing signals that define exons/introns has been greatly aided by basic research, bioinformatic approaches and advanced sequencing tools [9,10]. In this regard, we certainly know much more about splicing regulation than we did 20 years ago. Considering that several reviews have been written recently on the subject, the reader is referred to them for further information on the latest discoveries [11–14]. Most important, in this respect, have been the initial observations that in alternative splicing processes the same nucleotide sequence could be defined by the spliceosome as an intron or an exon in response to specific signals [15,16]. It is now clear that these kinds of decision (What is an exon? What is an intron?) are of paramount importance in explaining genome complexity and evolutionary pathways [17–20]. However, the sum of this new knowledge does not necessarily mean that we are near the goal of understanding most splicing decisions. Indeed, even the latest attempts at ‘designing’ exons based on current state-of-the-art knowledge have basically demonstrated that there is still a long way to go before we can become as good as the spliceosome in deciding what is an exon and what is an intron [21].

Where do pseudoexon sequences come into the story?

Central to the issue of deciding what is an exon and what is an intron is the question of their origin, a very much debated field to this day that basically deals with deciding the order of appearance of introns during evolution, whether first, early or late [22]. Whatever the answer to this question turns out to be, it is now clear that many of the ‘new’ exons in our genome originate from the insertion of transposable sequence elements belonging to the SINE and LINE classes in the eukaryotic genome [23–25]. In particular, exonization of Alu elements (which are primate specific and represent the most abundant mobile elements in the human genome) through retrotransposition–mutation events is a prominent source of new exons in the eukaryotic transcriptome, as schematically depicted in Fig. 1 [26,27].

Figure 1.

 The left panel shows a schematic model of Alu element exonization. The element (Alu) is inserted by retrotransposition and during the course of evolution mutations within this sequence create viable splicing sequences. The middle panel shows the effect of the inclusion of a nonsense exon sequence (NE) in a transcript. When this nonsense exon sequence is included, the resulting transcript is degraded by NMD (lower diagram). The right panel shows the classical pathway of pseudoexon (PE) inclusion in human disease. In this case, a nucleotide sequence on the brink of becoming an exon becomes activated following a number of different mutational events.

However, even if we ignore this particular class of exonization event, every in silico analysis shows that ‘false exons’ are very abundant in the intronic sequences of most genes [with this term we refer to any nucleotide sequence between 50 and 200–300 nucleotides in length with apparently viable 5′ and 3′ splice sites (5′ss and 3′ss) at either end]. Presently, there is evidence that inclusion of many of these sequences is actively inhibited due to the presence of intrinsic defects [28], the presence of silencer elements [29–31] or the formation of inhibiting RNA secondary structures [32]. Even if a combination of all these elements succeeds in repressing the use of many of these pseudoexon sequences, we have to consider the possibility that there must be many exceptions to this rule.
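The working definition of a ‘false exon’ given above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that an ‘apparently viable’ 3′ss is any AG dinucleotide and an ‘apparently viable’ 5′ss is any GT dinucleotide; real in silico analyses use much richer splice site models. The function name `find_candidate_pseudoexons` and the toy sequence are hypothetical.

```python
import re


def find_candidate_pseudoexons(intron, min_len=50, max_len=300):
    """List (start, end, sequence) for stretches flanked by AG upstream and GT downstream.

    Assumption: AG/GT dinucleotides stand in for 'apparently viable'
    3'ss and 5'ss; this is far cruder than any real prediction program.
    """
    intron = intron.upper()
    # Index of the first base after each AG dinucleotide (candidate 3'ss).
    starts = [m.start() + 2 for m in re.finditer(r"(?=AG)", intron)]
    # Index of the G of each GT dinucleotide (candidate 5'ss).
    ends = [m.start() for m in re.finditer(r"(?=GT)", intron)]
    return [
        (s, e, intron[s:e])
        for s in starts
        for e in ends
        if min_len <= e - s <= max_len
    ]


# Toy intron: one AG, 60 filler bases, one GT.
toy_intron = "TTTTAG" + "C" * 60 + "GTCCCC"
for start, end, seq in find_candidate_pseudoexons(toy_intron):
    print(start, end, len(seq))
```

Because AG and GT dinucleotides are so common, even this crude filter returns very large numbers of candidates on real intronic sequence, which is consistent with the abundance of false exons noted above.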

First, it is probable that several of these pseudoexons are actually recognized only in particular circumstances, such as in response to particular external stimuli [33,34] or in a given tissue or developmental stage. Support for this possibility comes from the observation that ‘novel’ exons keep being identified even in well-known and well-studied genes, such as the DMD gene [35].

Second, our failure to observe their use in normal conditions may also be due to the fact that their inclusion can intentionally lead to premature insertion of a termination codon in the mature mRNA and the consequent rapid degradation by nonsense-mediated decay (NMD) pathways [36] (Fig. 1). Such an occurrence has been described in the rat α-tropomyosin gene with a putative pseudoexon sequence localized downstream of two mutually exclusive exons: an upstream exon that is included only in smooth muscle tissue and a downstream exon that is included in most cell types [37]. Experimental analysis has shown that, when this pseudoexon is included in the mRNA molecule together with the ubiquitously expressed downstream exon, the formation of a stop codon causes activation of the NMD pathway. On the other hand, when inclusion of this pseudoexon occurs with the upstream smooth muscle tissue-specific exon, then it can still be removed through a resplicing pathway (and a normally processed mRNA molecule can be generated). For this reason, the term ‘nonsense’ exon is now preferred to define these kinds of sequence, which according to bioinformatic analyses may be more prevalent in human genes than previously thought [37].
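The consequence of including such a sequence can be sketched in Python as follows. The example only checks whether an inserted pseudoexon shifts the reading frame or introduces a stop codon ahead of the natural one; it does not model the NMD pathway itself. The function names and toy sequences are hypothetical, and the sketch assumes that the natural stop codon lies in the downstream coding sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_in_frame(cds):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


def pseudoexon_effect(upstream_cds, pseudoexon, downstream_cds):
    """Compare the reading frame with and without the pseudoexon.

    Assumption: the natural stop codon is located in downstream_cds.
    """
    normal_stop = first_stop_in_frame(upstream_cds + downstream_cds)
    mutant_stop = first_stop_in_frame(upstream_cds + pseudoexon + downstream_cds)
    # Where the natural stop would sit if the insertion kept the frame intact.
    expected = None if normal_stop is None else normal_stop + len(pseudoexon)
    premature = mutant_stop is not None and (expected is None or mutant_stop < expected)
    return {
        "frameshift": len(pseudoexon) % 3 != 0,
        "normal_stop": normal_stop,
        "mutant_stop": mutant_stop,
        "premature_stop": premature,
    }


# Toy example: a 6-nucleotide insertion carrying an in-frame TGA codon.
print(pseudoexon_effect("ATGGCC", "CCTTGA", "GGTTAA"))
```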

Nonetheless, from a human disease point of view, many pseudoexon intronic sequences seem poised on the brink of becoming exons (Fig. 1) and a comprehensive list of more than 60 published pathological pseudoexon events is presented in Table 1. Although this topic has been briefly reviewed elsewhere [38], the recent advances in pseudoexon research warrant a second look at several pseudoexon-related issues, especially with regard to novel therapeutic approaches.

Table 1.   Pathological pseudoexon inclusion events in human disease. NA, not available; SRE, splicing regulatory element.
Gene | Size (bp) | Activating mutation | Reference | DBASS3/DBASS5 reference
α-Gal A | 57 | SRE creation | [78] |
ATM | 65 | SRE deletion | [56] |
ATM | 137 | 5′ss creation | [79] |
β-globin | 165 | 5′ss creation | [80] |
β-globin | 126 | 5′ss creation | [81] |
β-globin | 73 | 5′ss creation | [82] |
BRCA1 | 66 | 3′ss creation | [83] | NA
BRCA2 | 93 | Downstream 3′ss deletion | [46] | NA
CD40L | 59 | 5′ss creation | [84] |
CEP290 | 128 | 5′ss creation | [85] |
CFTR | 49 | 5′ss creation | [86] |
CFTR | 84 | 5′ss creation | [87] |
CFTR | 101 | SRE creation | [88] | NA
CFTR | 184 | Downstream 3′ss deletion | [47] |
CFTR | 214 | 5′ss creation | [89] |
CHM | 98 | 3′ss creation | [90] | NA
COL4A3 (a) | 74 | 3′ss creation | [91] | NA
COL4A5 | 30 | 3′ss creation | [92] |
COL4A5 | 147 | SRE creation | [92] | NA
CTDP1 | 95 | 5′ss creation | [93] |
CYBB | 56 | 5′ss creation | [94] | NA
CYBB | 61 | 5′ss creation | [95] |
DHPR/QDPR | 152 | 5′ss creation | [96] |
DMD | 98 | 28 kb gene inversion | [41] | NA
DMD | 108 | 28 kb gene inversion | [41] | NA
DMD | 125 | 28 kb gene inversion | [41] | NA
DMD | 149 | 28 kb gene inversion | [41] | NA
DMD | 160 | 28 kb gene inversion | [41] | NA
DMD | 180 | 28 kb gene inversion | [41] | NA
DMD | 58 | 5′ss creation | [97] |
DMD | 67 | 5′ss creation | [98] | NA
DMD | 89 | 5′ss creation | [98] |
DMD | 90 | 5′ss creation | [98] |
DMD | 95 | 3′ss creation | [97] |
DMD | 147 | 5′ss creation | [99] |
DMD | 149 | 3′ss creation | [98] |
DMD | 172/202 | 5′ss creation | [100] |
DMD | 46/132 | 3′ss creation | [101] |
FBN1 | 93 | 5′ss creation | [102] | NA
FGB | 50 | SRE creation | [63] | NA
FGG | 75 | 5′ss creation | [103] | NA
FVIII | 191 | 5′ss creation | [104] |
GHER | 69 | 5′ss creation | [106] | NA
GHR | 102 | SRE deletion | [57,107] |
GUSB (a) | 68 | 5′ss creation | [108] |
HADHB | 56/106 | 5′ss creation | [109] | NA
HSPG2 | 130 | 5′ss creation | [110] |
IDS | 78 | 5′ss creation | [111] |
IDS | 103 | Upstream 5′ss deletion | [42,43] |
INI1/SNF5 | 72 | 5′ss creation | [112] |
ISCU | 86/100 | 3′ss creation | [113–115] | NA
JK | 136 | Internal 7 kb deletion | [40] |
MCBB | 64 | SRE deletion | [116] | NA
MYO6 | 108 | 5′ss creation | [117] | NA
MUT | 76 | 5′ss creation or upstream 5′ss deletion | [45,68] |
NDUFS7 | 122 | 5′ss creation | [118] |
NF-1 | 70 | 5′ss creation | [70] | NA
NF-1 | 107 | 5′ss creation | [70] | NA
NF-1 | 172 | 3′ss creation | [119] |
NF-1 | 58 | 3′ss creation | [120] | NA
NF-1 | 76 | 5′ss creation | [120] | NA
NF-1 | 54 | 5′ss creation | [121] |
NF-1 | 177 | 5′ss creation | [70,122,123] |
NF-2 | 106 | Branch-point creation | [124] | NA
NPC1 | 194 | 5′ss creation | [125] | NA
OA1/GPR143 | 165 | 3′ss creation | [69] |
OAT (a) | 142 | 5′ss creation | [126] |
OTC | 135 | 3′ss creation | [127] | NA
PCCA | 84 | SRE creation | [68] | NA
PCCB | 72 | 5′ss creation | [68] |
PHEX | 50/100/170 | 5′ss creation | [128] |
PKHD1 | 116 | 5′ss creation | [129] | NA
PMM2 | 66 | 3′ss creation | [130] | NA
PMM2 | 123 | 5′ss creation | [130,131] | NA
PRPF31 | 175 | 5′ss creation | [132] | NA
PTS (a) | 45 | Branch-point optimization | [133] | NA
PTS (b) | 79 | Py-tract optimization | [133] | NA
RB1 | 103 | 3′ss creation | [134] | NA
RYR1 | 119 | 5′ss creation | [135] |
SOD-1 | 43 | 5′ss creation | [136] | NA
TSC2 | 89 | 5′ss creation | [137] |
(a) Alu-derived pseudoexons. (b) LINE-2-derived pseudoexons.

Cis-acting sequences in pseudoexon inclusion

As previously mentioned, most pathological pseudoexon inclusion events originate from the creation of new donor or acceptor splice sites within an intronic sequence, followed by the subsequent selection of weaker ‘opportunistic’ acceptor or donor site sequences (Fig. 2A). A preliminary analysis of the strength of donor sites activated in pseudoexon inclusion events has highlighted their relatively high strength (according to in silico prediction programs) with respect to normally processed exons and to cryptic donor sites activated following normal donor site inactivation [39]. In a slightly lower number of cases, pseudoexon activation has been observed following the creation of de novo acceptor sites (Table 1), whereas branch-point creation still represents a minority (probably owing to the fact that a new branch point needs to find both a viable acceptor and donor site nearby, rather than just one of them).
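To give a feel for what ‘donor site strength’ means, the following Python sketch counts how many positions of a nine-nucleotide candidate 5′ss match the commonly cited consensus MAG|GTRAGT (last three exonic and first six intronic nucleotides). This is a deliberately crude stand-in, not the scoring method of the prediction programs used in [39], which rely on statistical models; the function name and example sequences are hypothetical.

```python
# IUPAC codes used in the consensus: M = A or C, R = A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "R": "AG"}
DONOR_CONSENSUS = "MAGGTRAGT"  # 3 exonic + 6 intronic positions


def donor_match_score(site):
    """Count positions (0-9) at which a 9-nt candidate 5'ss matches the consensus."""
    site = site.upper().replace("U", "T")
    if len(site) != len(DONOR_CONSENSUS):
        raise ValueError("expected a 9-nucleotide sequence")
    return sum(base in IUPAC[code] for base, code in zip(site, DONOR_CONSENSUS))


# A hypothetical intronic C>T change that creates the GT dinucleotide.
before = "AAGGCAAGT"
after = "AAGGTAAGT"
print(donor_match_score(before), donor_match_score(after))
```

A single point mutation can thus turn a near-consensus intronic sequence into a full consensus match, which is the situation depicted for 5′ss creation events.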

Figure 2.

 The mutational events that determine pathological pseudoexon inclusion. The most frequent is represented by the creation of de novo functional splice sites or branch-point elements through a single or a few point mutations (A). Other mechanisms include the creation or deletion of splicing regulatory elements (B), genomic rearrangements (C, D) and inactivation of upstream or downstream splice sites (E).

In addition to de novo creation of strong donor, acceptor and branch site sequences, the other most frequent mechanism that may lead to pseudoexon activation involves the creation/deletion of splicing regulatory sequences, which will be discussed in more detail below (Fig. 2B). Finally, in two individual cases, the rearrangement of genomic regions through gross deletions (Fig. 2C) [40] or genomic inversions (Fig. 2D) [41] has also been described to give rise to pseudoexon inclusion events. This has come about either by bringing together viable splice sites that would normally be too far away from each other on the gene sequence or by activating exons in what would normally have been the antisense genomic strand.

In a few genes, a particularly interesting mode of pseudoexon activation has also been observed following the inactivation of naturally occurring upstream 5′ss (FAA, IDS, MUT) [42–45] or downstream 3′ss (BRCA2, CFTR) [46,47] (Fig. 2E). These findings suggest that the processivity of these mRNA transcripts probably represents an element capable of determining pseudoexon repression, in addition to influencing normal splicing levels [48].

On a more general note, a still underappreciated aspect of pseudoexon recognition that concerns the effect of cis-acting sequences is the potential influence of RNA secondary structure on splicing efficiency [49]. Recently, it has been shown that donor site usage in the inclusion of two pseudoexon sequences in the ATM and CFTR genes is strongly dependent on their availability in a single-stranded region [50]. Interestingly, the same conclusion was reached in a recent study by Schwartz et al. [51] analysing the differences between exonized and nonexonized Alu elements. In this work, it was found that one of the major discriminating factors between these two classes of Alu elements was the potential availability of 5′ss sequences in an unstructured conformation.

Trans-acting factors in pseudoexon inclusion

Not many studies have focused on identifying the role played by trans-acting factors in pseudoexon inclusion. However, because of its significance, this is an area of research that would probably benefit from increased attention by researchers in the future.

In the case of nonpathologically related pseudoexons carrying nonsense codons, the presence of splicing regulatory elements may well provide a clue with regard to the possible roles played by these sequences. For example, in the case of the previously described tropomyosin pseudoexon [37], the specific binding of hnRNP H/F proteins has been described as a potential key modifier of this pseudoexon inclusion event [52]. The fact that these proteins are particularly downregulated in cardiomyocytes may explain the cell-specific repression of the downstream ‘normal’ exon 3 that is otherwise present in all cell types (Fig. 3A).

Figure 3.

 (A) A schematic diagram of the tropomyosin gene with exons 2 and 3, which are mutually exclusive (exon 3 is the predominant form in most cell types), and the nonsense exon (NE), which causes transcript degradation following its joining to exon 3 (but not exon 2). The levels of hnRNP H/F proteins can regulate the extent of NE inclusion. (B) shows that in NF-1 intron 30, pseudoexon inclusion levels are regulated by silencer elements in UCUU-rich motifs that bind the PTB (hnRNP I) splicing regulator. In the ATM gene, a four nucleotide deletion (GUAA) in the intronic region between exons 20 and 21 causes the insertion of a 65 nucleotide long pseudoexon (C). Functional analysis has demonstrated that this deletion abolished binding of a U1snRNP molecule in this position and activated a 3′ss lying 12 nucleotides upstream of this element. In the last case, binding of hnRNP E1 and U1snRNP to a silencer motif near a weak 5′ss efficiently silences pseudoexon inclusion in the GHR gene, preventing the development of Laron syndrome (D).

Interestingly, repression of the tropomyosin nonsense exon was also observed following PTB overexpression. PTB is a well-known and powerful splicing modifier that plays a major role in alternative splicing regulation [8,53]. Recently, this protein has been reported to also downregulate the inclusion efficiency of a pathological pseudoexon in NF-1 intron 31 independently of the activating mutation that creates a very strong splicing acceptor site [54] (Fig. 3B). This finding suggests that silencer binding sites may be actively used by evolutionary mechanisms to decrease the probability that random activating mutations may determine the constitutive inclusion of pseudoexon sequences.

In this respect, one interesting molecular complex is U1snRNP, a ribonucleoprotein complex normally associated with 5′ss recognition in the normal splicing process [55]. First, U1snRNP binding to an intronic splicing processing element has been found to inhibit pathological pseudoexon inclusion in intron 20 of the ATM gene (Fig. 3C). Inactivation of this element through a four nucleotide deletion causes pseudoexon inclusion and occurrence of ataxia telangiectasia in a patient [56]. In a second case, binding of hnRNP E1 and U1snRNP to a weak 5′ss efficiently silences pseudoexon inclusion in the GHR gene [57], preventing the development of Laron syndrome (Fig. 3D).

Finally, it should also be noted that in a variety of pseudoexon inclusion events, the activating mutations potentially created new splicing enhancer sequences (Table 1). Although trans-acting factors binding to these elements were identified in very few of these cases, in silico and experimental analyses have shown that several of the newly created enhancer sequences strongly correlate with potential binding to the SR protein class of splicing regulators.

Therapeutic strategies aimed at correcting pseudoexon inclusion in genetic diseases

Therapeutic strategies based on antisense oligonucleotide (AON) chemistry, which uses base pairing to target specific sequences in RNAs, have been extensively employed to correct splicing disorders in human genes [58,59]. Interestingly, apart from these therapeutic applications, short nuclear RNAs may also play a similar functional role to physiologically regulate exon inclusion, such as the case of snoRNA HBII-52 in the regulation of exon Vb inclusion in the serotonin receptor 2C [60]. AONs are thought to modulate the splicing pattern by steric hindrance of the recruitment of the splicing factors to the targeted splicing competent cis-elements, thus forcing the machinery to use the natural sites. Dominski and Kole [61] pioneered the antisense-mediated modulation of pre-mRNA splicing. In the earliest examples, AONs were aimed at activated cryptic splice sites in the β-globin and CFTR genes in order to restore normal splicing in β-thalassaemia and cystic fibrosis patients [61,62]. More recently, AON strategies have been used successfully to restore normal splicing in several disease models.
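The base-pairing principle behind AON targeting can be sketched in a few lines of Python: the oligonucleotide sequence is simply the reverse complement of the chosen pre-mRNA target window. The function name, the default length of 20 nucleotides and the toy sequence are assumptions made for illustration only; real AON design also has to consider chemistry, target accessibility and off-target binding, none of which is modelled here.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def design_aon(pre_mrna, target_start, length=20):
    """Return the antisense sequence (5'->3') for a window of the pre-mRNA.

    target_start is a 0-based index into pre_mrna (hypothetical convention).
    """
    pre_mrna = pre_mrna.upper().replace("T", "U")
    target = pre_mrna[target_start:target_start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    # Complement each base, then reverse to read the oligo 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


# Toy example: mask a 6-nt window spanning an aberrant GU donor dinucleotide.
print(design_aon("AAGGUAAGUCC", target_start=2, length=6))
```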

Afibrinogenemia is caused by genetic abnormalities within any of the three genes that encode the fibrinogen molecule: FGA, FGB, FGG. Recently, Davis et al. [63] showed that a homozygous c.115–600A>G point mutation located deep within intron 1 of FGB causes pseudoexon inclusion. In this study, pseudoexon inclusion was corrected by targeting this mutation with an antisense phosphorodiamidate morpholino oligonucleotide.
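For readers less familiar with the notation used for deep intronic mutations, the Python sketch below splits a coding-DNA variant name into its parts. In this nomenclature, c.115-600 denotes an intronic position 600 nucleotides upstream of coding nucleotide 115, whereas a ‘+’ offset counts downstream from the last nucleotide of the preceding exon. The regular expression covers only simple intronic substitutions and is an illustrative assumption, not a full parser; the function name is hypothetical.

```python
import re

# Matches simple intronic substitutions such as c.115-600A>G.
INTRONIC_SUBSTITUTION = re.compile(r"^c\.(\d+)([+-])(\d+)([ACGT])>([ACGT])$")


def parse_intronic_variant(name):
    """Split a variant name into anchor position, signed intronic offset, and bases."""
    # Normalize typographic en dashes and stray spaces before matching.
    cleaned = name.replace("\u2013", "-").replace(" ", "")
    match = INTRONIC_SUBSTITUTION.match(cleaned)
    if match is None:
        raise ValueError(f"not a simple intronic substitution: {name!r}")
    anchor, sign, offset, ref, alt = match.groups()
    signed_offset = int(offset) if sign == "+" else -int(offset)
    return {"anchor": int(anchor), "offset": signed_offset, "ref": ref, "alt": alt}


print(parse_intronic_variant("c.115-600A>G"))
```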

In several forms of β-thalassaemia, two single nucleotide mutations (IVS2-705 and IVS2-654) in the β-globin gene have been reported to cause pathological pseudoexon insertion. In 1993, Dominski and Kole [61] successfully tested 2′-O-methylribose AONs to restore correct splicing. Later, Sierakowska et al. [64] also restored correct splicing and β-globin polypeptide production using a phosphorothioate 2′-O-methyl-oligoribonucleotide targeted to the aberrant 3′ss. More recently, Gorman et al. [65,66] engineered the U7 snRNA gene to correct pre-mRNA splicing by replacing the antihistone sequence with sequences targeting β-globin aberrant splice sites (Fig. 4A).

Figure 4.

 A schematic representation of three different 5′ss activating mutations in various disease-causing genes that activate pseudoexon inclusion where therapeutic correction has been attempted with an antisense approach. (A) represents the IVS2-705 T>G splicing mutation that activates a 126 nucleotide pseudoexon in intron 2 of the β-globin gene. In this case, 2′-O-methyl ribose AONs and functionally modified U7 snRNA were employed to block the acceptor and donor splice sites. In (B), the 3849+10kbC>T splicing mutation activates an 84 nucleotide pseudoexon in intron 19 of the CFTR gene. Three 2′-O-methyl phosphorothioate oligoribonucleotides were targeted against the splice sites and against the pseudoexonic premature stop codon sequence to rescue normal splicing. Finally, (C) shows the c.6614+3310G>T splicing mutation that activates a 137 nucleotide pseudoexon in intron 45 of the DMD gene. To restore normal splicing, 2′-O-methyl ribose AONs were also targeted against the donor splice site and a predicted cluster of exonic splicing enhancer sequences within the pseudoexon.

The congenital disorders of glycosylation are caused by defects in the PMM2 gene. Recently, Vega et al. [130] studied a c.640–15479C>T deep intronic mutation that creates a new aberrant 5′ss in intron 7 and causes pseudoexon activation. Antisense morpholino oligonucleotides that targeted the aberrant 5′ss and 3′ss sites achieved 100% restoration of correctly spliced mRNA.

Pseudoexon-activating mutation 3849 + 10 kb C > T in intron 19 of the CFTR gene has been reported to frequently cause cystic fibrosis. In their study, Friedman et al. [62] reported that a cocktail of 2′-O-methyl phosphorothioate oligoribonucleotides against different regions of this pseudoexon abolished pseudoexon inclusion and partially restored production of normal mRNA and CFTR processed protein (Fig. 4B).

Mutations in the DMD gene are known to cause Duchenne and Becker muscular dystrophies. Recently, Gurvich et al. [67] demonstrated that 2′-O-methyl ribose phosphorothioate AONs restored normal splicing in primary myoblast cultures established from two individual patients carrying out-of-frame pseudoexon insertion mutations (Fig. 4C).

Methylmalonic acidaemia and propionic acidaemia are caused by different gene defects in the MUT, PCCA and PCCB genes. Ugarte et al. [68] recently reported the identification of three novel deep intronic mutations in each of these genes that potentially lead to pseudoexon activation through diverse mechanisms. Treatment with antisense morpholino oligomers restored almost completely normal splicing, and the correctly spliced transcripts were effectively translated.

Ocular albinism type 1 involves mutations in the OA1 gene. Vetrini et al. [69] identified a deep intronic point mutation g.25288G>A that created a new acceptor splice site in intron 7 of this gene and resulted in pseudoexon inclusion. Treatment of a patient’s melanocytes with antisense morpholino AONs complementary to the mutant sequence rescued mRNA and protein expression levels.

Mutations in the NF-1 gene cause neurofibromatosis type 1. Recently, Pros et al. [70] identified six neurofibromatosis type 1 patients carrying three different deep intronic mutations that create new 5′ss leading to the activation of the pseudoexon in the mature mRNA. In this study, antisense morpholino oligonucleotides were targeted against these newly created 5′ss, effectively restoring normal NF-1 splicing.

All of these different therapeutic strategies are summarized in Table 2.

Table 2.   Therapeutic approaches. PMO, phosphorodiamidate morpholino oligonucleotide.
Therapeutic approaches | Gene and diseases caused by pseudoexon inclusion | Mode of action
Antisense PMOs | FGB, afibrinogenaemia | Antisense PMOs specifically target the predicted ESE motif created by the mutation
Engineered U7 snRNA | HBB, β-thalassaemia | Use of modified U7 snRNA to target aberrant splice sites for long-term restoration of correct splicing
Antisense morpholino oligonucleotides | PMM2, congenital disorders of glycosylation; MUT, PCCA and PCCB, methylmalonic acidaemia and propionic acidaemia; OA1, ocular albinism type 1; NF-1, neurofibromatosis type 1 | Targeting aberrant splice sites activated due to mutations. AONs block access of the splicing machinery to the pseudoexonic regions in the pre-mRNAs
Antisense 2′-O-methyl ribose phosphorothioate oligonucleotides; 2′-O-methylribo-oligonucleotides | CFTR, cystic fibrosis; HBB, β-thalassaemia; DMD, Duchenne and Becker muscular dystrophies | Targeting of antisense against aberrant splice sites, in-frame stop codon and predicted exonic splicing enhancers within pseudoexons

Concluding remarks

This review is part of a miniseries co-ordinated by Diana Baralle [71] to look at emerging topics in splicing research, such as the correct assessment of sequence variants as pathogenic mutations [72]; the development of novel splicing-based therapeutic agents to treat HIV-1 infections [73]; and new methods in the global analysis of alternative splicing profiles [74]. We decided to examine the role of pseudoexons in recent research, as no specialized reviews have appeared in the past dealing with this particular kind of event.

From a basic science point of view, the possibility for researchers to look at the splicing process on a much more global scale than the single exon or the individual gene will clarify the issues examined in this review by helping to distinguish clearly between exons and pseudoexons [19,75,76]. In turn, this will provide a better appreciation regarding how the splicing process has evolved to define ‘exons’, how it distinguishes them from similar potentially pathological sequences (pseudoexons) and what is the preferential way it has chosen to repress their recognition. In this respect, pseudoexon research will also provide us with an unparalleled opportunity to understand evolutionary mechanisms that cause some of these sequences to become exons and, of course, vice versa.

Considering that aberrant pseudoexon inclusion events are increasingly being linked with disease, just the simple characterization of these sequences may have some very practical consequences. The studies reported in this review clearly highlight the feasibility of using AONs to correct these types of splicing defect (even in the absence of a complete or even partial understanding of the ‘basic science’ explaining their occurrence). From a therapeutic point of view, the major advantage of targeting pseudoexon inclusion events is provided by the supposition that AONs targeted against what would normally be intronic sequences would not be expected to remain bound to the mature mRNA (and thus interfere with later stages of RNA processing, such as export/translation). However, several factors will still need to be improved before human application becomes a reality. These range from basic studies aimed at optimizing gene/exon specificity (which will necessarily have to be made on an individual gene-specific basis) to the development of appropriate carrier systems. These systems will be absolutely necessary to achieve successful delivery, low toxicity and avoidance of undesired immune responses. Furthermore, even after achieving all of these aims, there will still remain the need to optimize recurrent administration protocols (this is an often overlooked consideration, as none of these methods will cause a permanent correction of mRNA splicing defects), and to determine their clearance/accumulation in human organs/tissues. However, notwithstanding all of these difficulties, AON technology [59,77] has already entered the clinical trial stage for diseases such as Duchenne muscular dystrophy, and this represents a bright hope for the not too distant future.


This work was supported by Telethon Onlus Foundation (Italy) (grant no. GGP06147) and by a European community grant (EURASNET-LSHG-CT-2005-518238). We thank Professor F. E. Baralle for helpful discussion.