SMG1 is an ancient nonsense-mediated mRNA decay effector



Nonsense-mediated mRNA decay (NMD) is a eukaryotic process that targets selected mRNAs for destruction, for both quality control and gene regulatory purposes. SMG1, the core kinase of the NMD machinery in animals, phosphorylates the highly conserved UPF1 effector protein to activate NMD. However, SMG1 is missing from the genomes of fungi and the model flowering plant Arabidopsis thaliana, leading to the conclusion that SMG1 is animal-specific and questioning the mechanistic conservation of the pathway. Here we show that SMG1 is not animal-specific, by identifying SMG1 in a range of eukaryotes, including all examined green plants with the exception of A. thaliana. Knockout of SMG1 by homologous recombination in the basal land plant Physcomitrella patens reveals that SMG1 has a conserved role in the NMD pathway across kingdoms. SMG1 has been lost at various points during the evolution of eukaryotes from multiple lineages, including an early loss in the fungal lineage and a very recent observable gene loss in A. thaliana. These findings suggest that the SMG1 kinase functioned in the NMD pathway of the last common eukaryotic ancestor.


Eukaryotes selectively degrade transcripts to regulate gene expression, through processes such as the RNA silencing/interference pathways (Belostotsky and Sieburth, 2009; Chen, 2009; Huntzinger and Izaurralde, 2011). The nonsense-mediated mRNA decay (NMD) pathway was originally identified as a quality-control mechanism that degrades aberrant transcripts with premature termination codons (PTCs) generated by mutation or transcription/processing errors (Mühlemann et al., 2008). However, more recently, it has become clear that NMD has a function beyond its role in transcript quality control. NMD regulates 1–10% of the transcriptomes of all examined eukaryotes (He et al., 2003; Mendell et al., 2004; Rehwinkel et al., 2005; Guan et al., 2006; Ramani et al., 2009; Rayson et al., 2012a) by degrading selected, non-aberrant transcripts (Mühlemann et al., 2008). Stop codons located in specific contexts may trigger NMD to destroy their mRNAs. These contexts include stop codons positioned upstream of a long 3′ UTR, or the site of a splicing event, marked by an exon-junction complex (Mühlemann et al., 2008). In some instances, an upstream open reading frame (uORF) may target a transcript for NMD, by making the transcript appear to have a long 3′ UTR or downstream exon-junction complex (Mühlemann et al., 2008).

Post-transcriptional regulation of gene expression by NMD has been implicated in diverse biological processes in various organisms. For example, a microRNA-induced reduction in NMD activity is essential for the correct expression of neural-related transcripts and the differentiation of neurons from stem cells during mammalian brain development (Bruno et al., 2011). In plants, NMD has been shown to repress production of the defence hormone salicylic acid, and Arabidopsis thaliana NMD mutants display partial resistance to the bacterial pathogen Pseudomonas syringae DC3000 (Jeong et al., 2011; Rayson et al., 2012a,b; Riehs-Kearnan et al., 2012). Such biological functions imply that transcripts may be conditionally targeted to NMD. Suitable mechanisms for such conditional targeting of transcripts to NMD have been described. For example, alternative splicing (AS) may bring transcripts under the influence of NMD by switching between non-PTC-containing and PTC-containing splice isoforms in a process known as AS-coupled NMD (Saltzman et al., 2008; McIlwain et al., 2010). In addition, upstream ORFs may conditionally target transcripts to NMD, as has been reported for the CPA transcript in yeast (Gaba et al., 2005). Identification of NMD targets in A. thaliana demonstrated an enrichment of conserved peptide upstream ORFs (CPuORFs) (Nyikó et al., 2009; Rayson et al., 2012a,b). CPuORFs are uORFs with evolutionarily conserved peptide sequences, several of which have previously been shown to affect gene expression at the post-transcriptional level (Hayden and Jorgensen, 2007; Tran et al., 2008; Jorgensen and Dorantes-Acosta, 2012; Takahashi et al., 2012). Although many A. thaliana transcripts contain uORFs (>20% of transcripts) (Kawaguchi and Bailey-Serres, 2005), there is no significant enrichment of these amongst the transcripts that are up-regulated in NMD mutants (Rayson et al., 2012a). In contrast, 49% of the 82 A. thaliana CPuORF-containing transcripts were increased in at least one of three NMD mutants tested under standard conditions (upf1-5, upf3-1 and smg7-1) (Rayson et al., 2012a,b).

Despite the importance of NMD in plants, an understanding of its mechanism remains incomplete. In animals, following recognition of the PTC, the core NMD effector UPF1 is phosphorylated by the kinase SMG1 at TQ and SQ dipeptides that are present at both the N- and C-termini of UPF1 (Okada-Katsuhata et al., 2012). In turn, SMG5–7 are recruited to the transcript by binding to the phosphorylated sites of UPF1 through their 14-3-3-like domains (Fukuhara et al., 2005; Okada-Katsuhata et al., 2012; Jonas et al., 2013). SMG6 is an endonuclease that cuts the targeted transcript near the PTC (Glavan et al., 2006; Huntzinger et al., 2008; Eberle et al., 2009), while the SMG7:SMG5 complex recruits both the XRN1 exoribonuclease and the exosome complex, to degrade the transcript in the 5′→3′ and 3′→5′ directions, respectively (Unterholzner and Izaurralde, 2004; Jonas et al., 2013). In A. thaliana, UPF1 is present and required for NMD (Arciga-Reyes et al., 2006; Yoine et al., 2006; Kerényi et al., 2008; Mérai et al., 2013). In contrast, the SMG1 kinase has not been identified (Grimson et al., 2004) and only one NMD-active SMG7, rather than the full complement of SMG5–7, is present (Kerényi et al., 2008; Riehs et al., 2008). Despite this, plant UPF1 is phosphorylated at the N- and C-termini (Mérai et al., 2013), and SMG7 is essential for the decay of the targeted transcript (Riese et al., 2007; Kerényi et al., 2008; Rayson et al., 2012a; Mérai et al., 2013). To date, a kinase acting in the NMD pathway has not been discovered outside the animal kingdom. The absence of SMG1 from the genomes of yeasts and A. thaliana (Figure 1) (Grimson et al., 2004) has led to the suggestion that SMG1 is an animal-specific component of the NMD pathway (Izumi et al., 2010; Kalyna et al., 2012).

Figure 1.

Conservation of NMD effectors across the eukaryotic domain. (a) Tree of selected eukaryotes rooted based on the work by Richards and Cavalier-Smith (2005) and Derelle and Lang (2012) indicating the ancient divergence between animals and plants, showing conservation of the NMD effectors UPF1, SMG5–7 (EBS1) and SMG1 (indicated by symbols), as assessed by homology searches and phylogenetic analysis. Note independent losses of the ancestral SMG1 (red circle) in fungi, A. thaliana, the red alga Cyanidioschyzon merolae, the brown alga Ectocarpus siliculosus and the excavates Trypanosoma brucei and Giardia lamblia, and its presence in the last eukaryotic common ancestor (LECA). (b) Domain structure of PIKKs. The kinase domain (KD) of SMG1 is centrally located, distant from the C-terminally located FATC domain in both animals and plants. The kinase domain of other PIKKs (ATM, ATR, TOR, TRRAP and DNA-PKcs) is directly adjacent to the FATC domain at the C-terminus. (c) The Arabidopsis syntenic region that contains SMG1 in A. lyrata has no SMG1 in A. thaliana. The expected position of SMG1 contains other genes, including two unrelated transposable elements (indicated by asterisks). Synteny was examined using the Plant Genome Duplication Database.

In this study, we have analysed the phosphatidylinositol 3-kinase-related kinase (PIKK) family, of which SMG1 is a member, to identify a kinase involved in NMD outside the animal kingdom. We show that SMG1 is actually conserved between animals and plants, but confirm that it is missing from the genomes of fungi and A. thaliana. Despite its competence for NMD, A. thaliana is unusual in the plant kingdom in having recently lost its copy of SMG1, making it an unsuitable model system to determine the involvement of plant SMG1 in NMD. We therefore used homologous recombination in the moss Physcomitrella patens, a basal land plant, to show that plant SMG1 is active in NMD and is required for normal development. The presence of an NMD-active SMG1 kinase in both animals and plants suggests that NMD relied on phosphorylation of UPF1 by SMG1 in the last eukaryotic common ancestor over 2 billion years ago (Brocks et al., 1999).


SMG1 is conserved across eukaryotes but has been lost from multiple lineages

The absence of an SMG1 NMD-associated kinase from the genomes of yeasts and A. thaliana (Grimson et al., 2004) raised the question of the origins of the SMG1 kinase in eukaryotes. To address this, the genomes of a range of eukaryotes, including animals, plants and fungi, were analysed to identify kinases belonging to the PIKK family. To distinguish between SMG1 kinases and other PIKK sub-family members, namely ataxia telangiectasia mutated (ATM), ataxia telangiectasia and Rad3-related (ATR), target of rapamycin (TOR), transformation/transcription domain-associated protein (TRRAP), and DNA-dependent protein kinase, catalytic subunit (DNA-PKcs), a phylogenetic tree was constructed using aligned kinase domain sequences from PIKKs (Figure S1). Surprisingly, this analysis identified multiple SMG1 orthologues from diverse eukaryotes including plants and oomycetes (Figure 1a). Recent work has suggested that the clade including plants/oomycetes and that including animals/fungi are on opposing sides of the root of eukaryotes (Richards and Cavalier-Smith, 2005; Derelle and Lang, 2012). The presence of SMG1 in both clades therefore suggests that SMG1 was present in the last eukaryotic common ancestor (Figure 1a). In agreement with previous reports, no fungal SMG1 kinase was identified, indicating that the SMG1 kinase has been lost from this eukaryotic lineage (Figure 1a). Further confirmation of the phylogenetic analysis of PIKKs was obtained by examining the PIKK domain structure. All PIKKs, with the exception of SMG1, have a kinase domain adjacent to the FRAP (FKBP-rapamycin-associated protein)/ATM/TRRAP/C-terminus (FATC) domain at the C-terminus (Lempiäinen and Halazonetis, 2009) (Figure 1b). All SMG1 proteins identified phylogenetically in this study, from both animals and plants, have a centrally located kinase domain, separated from the FATC domain by a large middle region (Figure 1b).
The identification of SMG1 outside the animal kingdom suggests that the mechanism used to phosphorylate the NMD effector UPF1 is more widely conserved than previously thought. The presence of other NMD effectors in moss was investigated by reciprocal BLASTp searches (Altschul et al., 1990), and P. patens was found to have homologues of the core NMD effectors UPF1–3 and SMG7; however, unlike PpSMG1, the majority of these are present in multiple copies in the moss genome (Table S1).
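
The reciprocal-search criterion mentioned above can be illustrated with a short sketch. It assumes that BLASTp hits have already been parsed into (query, subject, bit score) tuples; the function names and the toy identifiers and scores are hypothetical and are not taken from this study.

```python
from typing import Dict, Iterable, Tuple

Hit = Tuple[str, str, float]  # (query ID, subject ID, bit score)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Return the highest-scoring subject for each query."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit], reverse: Iterable[Hit]) -> Dict[str, str]:
    """Keep query/subject pairs that are each other's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return {q: s for q, s in fwd.items() if rev.get(s) == q}


# Made-up hits: reference effectors searched against a moss proteome (forward)
# and the moss hits searched back against the reference proteome (reverse).
forward = [("UPF1_ref", "moss_A", 950.0), ("UPF1_ref", "moss_B", 310.0),
           ("SMG1_ref", "moss_C", 1200.0), ("SMG5_ref", "moss_D", 80.0)]
reverse = [("moss_A", "UPF1_ref", 940.0), ("moss_C", "SMG1_ref", 1150.0),
           ("moss_D", "SMG7_ref", 95.0), ("moss_D", "SMG5_ref", 78.0)]

pairs = reciprocal_best_hits(forward, reverse)
print(pairs)
```

In this toy example the SMG5 query is not retained, because its best moss hit maps back to SMG7 rather than to SMG5. A single-best-hit rule of this kind reports at most one partner per query, so gene families present in multiple copies would need each candidate to be checked in turn.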

Arabidopsis thaliana is the only green plant examined that does not contain an SMG1 orthologue (Figure 1a and Figure S1) (Grimson et al., 2004). To explore this further, the expected location of SMG1 was identified in the A. thaliana genome using synteny with the genome of its close relative, Arabidopsis lyrata, which contains SMG1 (Figure 1c). Comparison of the gene content in this syntenic region confirms the absence of SMG1 from the A. thaliana genome. At the expected position of SMG1, the A. thaliana genome instead contains two unrelated transposable elements bracketing a short predicted ORF of 59 amino acids (ORF59, Figure 1c). This ORF displays weak homology to SMG1 in its reverse strand, revealing the remains of the kinase-encoding locus that was lost within the last 5–10 million years (Hu et al., 2011). These results show that orthologues of SMG1 are present in diverse eukaryotes, including green algae and land plants. However, they also confirm the absence of SMG1 from fungi, and demonstrate that the SMG1 gene has been recently lost in the A. thaliana lineage, despite A. thaliana retaining a competent NMD pathway (Arciga-Reyes et al., 2006; Riehs et al., 2008; Rayson et al., 2012a).
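
The logic of locating an expected gene position by synteny can be sketched as follows. This is an illustration under simple assumptions (a linear gene order and a precomputed one-to-one orthologue map), not the procedure implemented by the Plant Genome Duplication Database; all gene identifiers and function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def flanking_orthologues(order: List[str], target: str,
                         orthologue: Dict[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return the orthologues of the nearest mapped genes on each side of target."""
    i = order.index(target)
    left = next((orthologue[g] for g in reversed(order[:i]) if g in orthologue), None)
    right = next((orthologue[g] for g in order[i + 1:] if g in orthologue), None)
    return left, right


def genes_between(order: List[str], left: str, right: str) -> List[str]:
    """List the genes lying strictly between two anchor genes, in either orientation."""
    a, b = sorted((order.index(left), order.index(right)))
    return order[a + 1:b]


# Made-up gene orders: the target gene in one genome is flanked by genes
# whose orthologues are used as anchors in the second genome.
lyrata_order = ["Al_g1", "Al_g2", "Al_SMG1", "Al_g3", "Al_g4"]
orthologue = {"Al_g1": "At_g1", "Al_g2": "At_g2", "Al_g3": "At_g3", "Al_g4": "At_g4"}
thaliana_order = ["At_g1", "At_g2", "At_TE1", "At_ORF", "At_TE2", "At_g3", "At_g4"]

left, right = flanking_orthologues(lyrata_order, "Al_SMG1", orthologue)
interval = genes_between(thaliana_order, left, right)
print(left, right, interval)
```

The features returned for the interval are then the candidates to inspect for remnants of the missing gene.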

SMG1 functions in the NMD pathway of P. patens

Although the phylogeny reveals the presence of SMG1 orthologues outside the animal kingdom, it remained to be seen whether they act in NMD in any other kingdom. Due to the unusual absence of SMG1 from A. thaliana, an alternative model plant was necessary to determine the involvement of SMG1 in plant NMD. We identified a single copy of the SMG1 gene in a basal land plant, the moss P. patens (PpSMG1, Figure 2a and Figure S1), and used homologous recombination to replace the 15 kb SMG1 locus (Pp1s51_180U2__zimmer.1) with a kanamycin resistance cassette, generating smg1 null alleles (Figure 2). Four independent lines with a deleted SMG1 (smg1Δ) were identified (Figure 2 and Figure S2). Expression of SMG1 was eliminated in the four smg1Δ lines, but not in a transgenic line that retained an intact genomic copy of the SMG1 gene (SMG1WT line 1; Figure 2b).

Figure 2.

Targeted disruption of SMG1 in moss. (a) The structure of the PpSMG1 gene in moss, with coding exons in black boxes and UTRs in grey boxes. The whole coding region (Pp1s51_180U2__zimmer.1) was replaced by the P35S-nptII-g6term selection cassette, shown as a white box (KAN; kanamycin resistance gene), with regions of homology used in gene targeting on either side. (b) Semi-quantitative RT-PCR expression analysis in WT, four mutant lines with 5′ and 3′ gene targeting events and no genomic PpSMG1 (smg1Δ lines 1–4), and one mutant line with gene targeting and a genomic PpSMG1 copy (SMG1WT line 1).

To assess whether smg1Δ lines are compromised in NMD, we first identified a series of moss NMD targets, representing a range of NMD-targeting features including a long 3′ UTR, uORF and AS-coupled NMD. Transcripts targeted by NMD should be up-regulated in the smg1 knockout lines if SMG1 is required for NMD in moss. The A. thaliana splicing factor AtPTB3 (AT1G43190), of the polypyrimidine tract-binding protein (PTB) family, undergoes exon skipping to introduce a PTC into one splice variant (Stauffer et al., 2010). We identified the moss homologue of this gene (PpPTB3; Pp1s48_128V6) and showed by semi-quantitative RT-PCR that this also undergoes exon skipping and thereby introduces a PTC into one splice variant (Figure 3a). As predicted, expression of the PTC+ splice variant is elevated in smg1Δ lines when compared to wild-type (WT) or SMG1WT line 1. The serine/arginine-rich (SR) protein-encoding transcript PpRS2Z37 (Pp1s69_23V6) also undergoes AS at an alternative 3′ splice site to produce a PTC+ variant. Semi-quantitative RT-PCR and quantitative RT-PCR show that this PTC+ splice variant is up-regulated 3–6-fold in smg1Δ lines when compared to WT or control SMG1WT line 1 (Figure 3a,b). Further examples of AS-coupled NMD in moss include the SR protein-encoding transcripts PpRS2Z38 and Pp108464. PpRS2Z38 undergoes alternative 3′ splice site selection, and Pp108464 undergoes inclusion of an exon cassette, generating PTC+ variants in both cases. The PTC+ variants of these transcripts are over-expressed in smg1Δ lines in comparison with the expression in the WT and SMG1WT line 1 (Figure 3c,d). SMG7, a conserved NMD effector-encoding gene, has previously been shown to be a direct target of NMD in flowering plants, due to the presence of a long 3′ UTR containing introns (Kerényi et al., 2008; Nyikó et al., 2013). Regulation of SMG7 by NMD creates an autoregulatory loop to control the level of NMD activity (Kerényi et al., 2008; Benkovics et al., 2011; Rayson et al., 2012a).
This is a common feature in the NMD pathway of animals (Rehwinkel et al., 2005; Huang et al., 2011; Yepiskoposyan et al., 2011). We therefore tested the expression of PpSMG7-2 (Pp1s311_73V6) and found that it was also up-regulated in smg1Δ lines (Figure 3e), suggesting that this autoregulatory feedback loop is conserved across land plants. uORF-containing transcripts have been characterized as direct targets of NMD in animals, flowering plants and budding yeast. However, only a subset of uORF-containing transcripts appear to be targeted for NMD (Rayson et al., 2012a). Recent work in A. thaliana highlighted a strong correlation between the presence of CPuORFs in transcripts and their targeting by NMD in plants (Nyikó et al., 2009; Rayson et al., 2012a,b). Searching the moss genome for predicted transcripts that show sequence homology to both a CPuORF and its associated downstream major ORFs (mORF) from the list of NMD targets in A. thaliana identified two moss eIF5-related genes: PpeIF5-like1 (PpeIF5L1; Pp1s626_4V6) and PpeIF5-like2 (PpeIF5L2; Pp1s93_126V6). The uORFs of these moss genes show homology to the CPuORF upstream of the A. thaliana eIF5-related gene AT1G36730 (Figure S3), expression of which is elevated in the three NMD mutants upf1-5, upf3-1 and smg7-1 (Rayson et al., 2012a). The conserved association between the CPuORF and mORF in this eIF5-related transcript, from bryophyte to angiosperm, strongly implies a functional dependence between the CPuORF and mORF across all land plants. Both PpeIF5L1 and PpeIF5L2 are over-expressed in smg1Δ lines (Figure 3f,g), in a similar fashion to the A. thaliana eIF5-related transcript, which is over-expressed in NMD mutants (Rayson et al., 2012a). Additionally, we have identified two further uORF-containing transcripts (Pp173_136V6.1 and Pp1s60_199V6.2) that are targeted to NMD in moss (Figure 3h,i). 
Pp173_136V6.1 encodes a kinase, has four uORFs and is related to the uORF-containing AT5G45430, which is up-regulated in A. thaliana NMD mutants (Rayson et al., 2012a). Pp1s60_199V6.2 encodes a magnesium/proton exchanger (MHX) like protein, has a single uORF and is related to the uORF-containing A. thaliana NMD target AtMHX (AT2G47600) (Saul et al., 2009). Taken together, the finding that a series of NMD-targeted moss transcripts show elevated expression in independent smg1 mutant lines indicates that SMG1 functions in the NMD pathway of plants as it does in animals.

Figure 3.

Targets of NMD are over-expressed in moss smg1Δ lines. (a) Semi-quantitative RT-PCR analysis of two alternatively spliced targets of NMD. PpPTB3 (Pp1s48_128V6) produces two splice variants by exon skipping: the shorter PTC+ variant and the longer PTC– variant. The alternative 3′ splice site (A3′SS) of PpRS2Z37 (Pp1s69_23V6) produces a longer PTC+ variant compared to the shorter PTC– variant. The PpEF1α level is shown as a control for RNA loading. PTCs are indicated by a vertical black line. Constitutive exon sequences are shown in blue, and alternative exon sequences are shown in orange. (b) Quantitative RT-PCR analysis of the PpRS2Z37 (Pp1s69_23V6) PTC+ variant. (c) Quantitative RT-PCR analysis of the PpRS2Z38 (Pp1s65_286V6) PTC+ variant. (d) Quantitative RT-PCR analysis of the Pp108464 (Pp1s270_54V6) PTC+ variant. This variant is generated by inclusion of a poison cassette exon (CE). (e) Quantitative RT-PCR analysis of PpSMG7-2 (Pp1s311_73V6). (f) Quantitative RT-PCR analysis of PpeIF5L1 (Pp1s626_4V6). (g) Quantitative RT-PCR analysis of PpeIF5L2 (Pp1s93_126V6). (h) Quantitative RT-PCR analysis of the kinase-encoding uORF-containing Pp173_136V6.1. (i) Quantitative RT-PCR analysis of the MHX-like protein-encoding uORF-containing Pp1s60_199V6.2. (b–i) The fold change indicates the amount of target expression normalized to that of PpEF1α and relative to WT levels. Error bars represent the standard error of the mean from three biological replicates. Asterisks indicate lines with a statistically significant difference from WT using an unpaired t test (P < 0.05); NS, not significantly different.
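
For readers unfamiliar with how such fold changes are derived, the following sketch shows one common way to normalize a target to a reference gene and express it relative to WT. It assumes the 2^-ΔΔCt approach with 100% amplification efficiency and a pooled-variance Student's t test, neither of which is stated in the text; the Ct values are invented and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference):
    """Per-replicate target level normalized to the reference gene (2^-dCt)."""
    return [2.0 ** -(t - r) for t, r in zip(ct_target, ct_reference)]


def sem(values):
    """Standard error of the mean."""
    return stdev(values) / math.sqrt(len(values))


def unpaired_t(a, b):
    """Student's t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb)), na + nb - 2


# Made-up Ct values for three biological replicates (not data from this study).
wt_rel = relative_expression([25.0, 25.2, 24.8], [18.0, 18.1, 17.9])
mut_rel = relative_expression([23.0, 23.1, 22.9], [18.0, 18.0, 18.0])

# Express every replicate relative to the mean WT level.
wt_fold = [x / mean(wt_rel) for x in wt_rel]
mut_fold = [x / mean(wt_rel) for x in mut_rel]

t_stat, df = unpaired_t(mut_fold, wt_fold)
T_CRIT_DF4 = 2.776  # two-tailed critical value for P = 0.05 with 4 degrees of freedom
print(f"fold change = {mean(mut_fold):.2f} +/- {sem(mut_fold):.2f} (SEM)")
print("significant at P < 0.05:", abs(t_stat) > T_CRIT_DF4)
```

With three replicates per group the test has four degrees of freedom, which is why the tabulated critical value is hard-coded here; a statistics library would normally be used to obtain an exact P value.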

To demonstrate further that the observed over-expression of these genes in PpSMG1 knockout lines is attributable to a reduction in NMD, we exposed moss to cycloheximide (CHX). CHX is an inhibitor of translation, and, as NMD is a translation-dependent decay mechanism, CHX has previously been used to identify targets of NMD (Kalyna et al., 2012). We chose three of our putative NMD targets, each representative of a different class of NMD-targeting feature: AS-coupled NMD (PpRS2Z37), a uORF-containing transcript (PpeIF5L1) and a long 3′ UTR (PpSMG7-2). A 6 h incubation with CHX caused over-expression of each of these transcripts, consistent with their status as targets of NMD (Figure 4).

Figure 4.

Targets of NMD are over-expressed when moss is exposed to CHX. (a) Quantitative RT-PCR analysis of the PpRS2Z37 PTC+ variant. (b) Quantitative RT-PCR analysis of PpSMG7-2. (c) Quantitative RT-PCR analysis of PpeIF5L1. (a–c) The fold change indicates the amount of target expression normalized to that of the reference gene Pp1s54_156V6.1 and relative to WT levels in the absence of CHX. Error bars represent the standard error of the mean from three biological replicates. Asterisks indicate lines with a statistically significant difference from WT levels in the absence of CHX (P < 0.05, unpaired t test).

SMG1 is important for normal moss growth

A compromised NMD pathway has previously been shown to have an adverse effect on development in animals and plants (Arciga-Reyes et al., 2006; Riehs et al., 2008; McIlwain et al., 2010; Rayson et al., 2012a; Riehs-Kearnan et al., 2012). Whilst mice lacking SMG1 exhibit embryonic lethality (McIlwain et al., 2010), SMG1 appears to be dispensable for development in Drosophila and Caenorhabditis elegans (Pulak and Anderson, 1993; Metzstein and Krasnow, 2006), implying a differential requirement for SMG1 in NMD and/or development between organisms. Phenotypic analyses of the moss smg1Δ lines showed that the mutants are viable, but produce fewer leafy structures (gametophores) than WT (Figure 5a–d). The few gametophores produced by smg1Δ lines predominantly grow downwards into the agar, rather than equally upwards out of the agar and downwards into the agar as in WT plants (Figure 5e). As the moss NMD mutants are SMG1 null mutants, this result shows that, although SMG1 is needed for normal moss growth, it is not essential under standard laboratory conditions.

Figure 5.

smg1Δ lines produce fewer leafy structures, which mainly grow into the medium, not into the air. (a) Three-week-old wild-type colony (WT). (b) Three-week-old smg1Δ line 1. (c) Three-week-old smg1Δ line 2. (d) Total number of gametophores per colony after 3 weeks. (e) The percentage of gametophores not submerged in agar after 3 weeks. In (a–c), scale bars = 1 mm. In (d,e), n = 18. Error bars represent the standard error of the mean. Asterisks indicate knockout lines with a statistically significant difference from WT using an unpaired t test (P < 0.05).


In this study, we have identified the SMG1 kinase in multiple plant species and demonstrated its involvement in NMD in the moss P. patens, despite it previously being reported as an animal-specific component of the NMD pathway (Grimson et al., 2004; Izumi et al., 2010; Kalyna et al., 2012) due to its absence from A. thaliana (Figure 1). We propose that SMG1 in plants functions as it does in animals by phosphorylating UPF1, although it remains possible that plant SMG1 exerts its influence on plant NMD in some other way. The involvement of SMG1 in the NMD pathways of both animals and plants is consistent with SMG1 functioning in NMD in the last eukaryotic common ancestor (Figure 1). This suggests that the origins of SMG1 are very ancient: 2–3 billion years ago (Brocks et al., 1999).

Targets of NMD are conserved across eukaryotic kingdoms

In this study, we have identified a range of NMD targets in moss, including transcripts encoding splicing factors, which often undergo AS-coupled NMD as part of a feedback loop controlling splicing factor levels (Lewis et al., 2003; Palusa and Reddy, 2010; Stauffer et al., 2010). AS-coupled NMD in P. patens affects the SR protein-encoding transcripts PpRS2Z37, PpRS2Z38 and Pp108464 and the PTB-encoding transcript PpPTB3 (Figures 3 and 4), suggesting that this mechanism of regulation is conserved across the plant kingdom (Palusa and Reddy, 2010; Stauffer et al., 2010). Transcripts encoding NMD effectors are common targets of NMD in both animals and plants (Huang et al., 2011; Yepiskoposyan et al., 2011). In flowering plants, SMG7 is a direct target of NMD due to its long 3′ UTR containing two introns, a feature that is conserved between monocots and eudicots (Kerényi et al., 2008; Benkovics et al., 2011; Rayson et al., 2012a; Nyikó et al., 2013). Here we show that moss PpSMG7-2 is an NMD target (Figures 3 and 4), indicating that this autoregulatory circuit is also conserved across land plants. Finally, we have shown that uORF-containing transcripts, including CPuORF-containing transcripts, are targeted by NMD in moss as they are in A. thaliana (Figures 3 and 4). The CPuORF of the eIF5-related protein-encoding transcript AT1G36730 is conserved in transcripts of two moss genes, which also encode eIF5-related proteins (Figure S3). Expression of AT1G36730 is elevated in A. thaliana NMD mutants (Rayson et al., 2012a), and expression of the two moss homologues is elevated in smg1Δ lines, indicating that CPuORFs target transcripts to NMD across the plant kingdom. These findings demonstrate the conservation of NMD targets across a large evolutionary span. The effect of NMD on these conserved targets is likely to have important physiological outcomes, for example in regulating the activity of the NMD pathway itself.

SMG1 is found in many but not all eukaryotes

We have identified and characterized the SMG1 kinase as a component of the NMD pathway outside the animal kingdom. Despite its conserved role in NMD in plants and animals, SMG1 has been independently lost in multiple eukaryotic lineages (Figure 1). In yeast, the lack of an NMD-associated kinase, through early evolutionary loss of SMG1 in the fungal lineage (Figure 1), led to the suggestion that yeast NMD does not rely on UPF1 phosphorylation (Figure 6d) (Gatfield et al., 2003). Nevertheless, yeast UPF1 is phosphorylated (Wang et al., 2006), and loss of the yeast 14-3-3-like protein SMG7/EBS1, which is predicted to bind phosphorylated UPF1, results in a partially compromised NMD pathway (Luke et al., 2007). These data indicate that phosphorylation of UPF1 may be important for NMD in fungi, and that, in the absence of an SMG1 kinase, an alternative kinase acts in NMD (Figure 6b). A more recent loss of SMG1 has occurred within the Arabidopsis genus within the last 5–10 million years (Hu et al., 2011). Even though SMG1 is not present in the A. thaliana genome (Figure 1) (Grimson et al., 2004), the A. thaliana UPF1 may be phosphorylated in a plant system (Mérai et al., 2013) and SMG7 is required for NMD (Riehs et al., 2008), suggesting that phosphorylation of UPF1 remains necessary for NMD, again implicating an alternative kinase (Figure 6b).

Figure 6.

Proposed models for UPF1 activation in NMD. (a) The NMD pathway of organisms reliant on SMG1-dependent phosphorylation of UPF1 (e.g. C. elegans). (b) The NMD pathway of organisms that have replaced SMG1 with an as yet undiscovered kinase, such as fungi and A. thaliana. (c) The proposed NMD pathway of organisms with two kinases. These may include the NMD pathways of Drosophila, zebrafish and the last eukaryotic common ancestor. (d) The proposed NMD pathway of organisms that do not require UPF1 phosphorylation. These may include the NMD pathways of yeasts and excavates.

Both SMG1 and SMG7 have been lost in the analysed members of the excavate group, Giardia lamblia and Trypanosoma brucei, and in the red alga Cyanidioschyzon merolae (Figure 1a). The presence of UPF1 in these organisms and the absence of both its kinase and the protein that recognizes its phosphorylated form may suggest that they no longer require the UPF1 phosphorylation/dephosphorylation cycle that is necessary for mammalian NMD (Figure 6d) (Chen et al., 2008; Delhi et al., 2011). Alternatively, these eukaryotes may have independently acquired replacements for both SMG1 and SMG7 (Figure 6b). An investigation of the phosphorylation status of UPF1 in these species will help to discriminate between these possibilities. It is interesting to consider the diversity of NMD pathway arrangements that exist in different organisms (Figure 6). Evidence suggests that there are distinct ‘branches’ of the NMD pathway where different substrates have specific effector requirements (Gehring et al., 2005; Chan et al., 2007; Huang et al., 2011). There is a possibility that currently unidentified NMD effectors may be involved in these branches of the NMD pathway in different eukaryotes. Proline-rich nuclear receptor coregulatory protein 2 (PNRC2) has recently been identified as a vertebrate-specific protein that interacts with SMG5 to link UPF1 phosphorylation status to the RNA decay machinery (Cho et al., 2009, 2013; Lai et al., 2012).

Multiple independent losses of SMG1 in eukaryotes that have conserved SMG7 and phosphorylation of UPF1 (Wang et al., 2006; Luke et al., 2007; Riehs et al., 2008; Mérai et al., 2013) suggest that an alternative kinase is capable of activating NMD in many organisms (Figure 6b,c). One possibility is that, at each loss event, SMG1 was replaced by an independent kinase (Figure 6b). A more appealing explanation is that redundancy between two or more kinases in a common ancestor allowed the independent losses of SMG1 to be compensated for by a pre-existing alternative kinase(s), which may be as ancient as SMG1. Loss of SMG1 in Drosophila and zebrafish has been shown to have little or no effect on NMD and/or development (Chen et al., 2005; Metzstein and Krasnow, 2006; Wittkopp et al., 2009), suggesting redundancy between SMG1 and another kinase(s) (Figure 6c). Redundancy may also explain the differential requirement for SMG1 for organismal survival. While UPF1 and UPF2 are required for Drosophila and zebrafish development, SMG1 is dispensable, possibly due to the presence of another NMD kinase (Figure 6c) (Chen et al., 2005; Metzstein and Krasnow, 2006). Similarly, the plant NMD pathway may include an alternative kinase, providing an explanation for the relatively mild phenotype observed in moss smg1Δ lines compared to the NMD-compromised phenotypes in A. thaliana and in non-plant species. We attempted to knock out PpUPF2 (Pp1s123_14V6.1) as it is the only other single-copy NMD effector-encoding gene in moss (Table S2) and is therefore amenable to reverse genetics without multiple transformations. However, no transformants lacking a copy of PpUPF2 were identified. One explanation is that the core NMD machinery such as UPF2 is essential for moss survival while SMG1 is not, supporting the existence of an alternative kinase; however, more work is needed to confirm this.

If it is indeed the case that many eukaryotes carry an alternative NMD kinase, it will be interesting to examine the potential involvement of other PIKK family members in NMD. SMG1 is related to the DNA damage-activated kinases ATM and ATR (Waterworth et al., 2011) and TOR, a regulator of translation (Deprost et al., 2007). Although a link to NMD has not been established, ATM and ATR phosphorylate UPF1 in mammals. ATM phosphorylates UPF1 after DNA damage, but knockdown of ATM had no effect on NMD (Brumbaugh et al., 2004), suggesting that UPF1 is involved in DNA damage repair, independently of its role in NMD (Brumbaugh et al., 2004). ATR also phosphorylates UPF1, regulating genome stability (Azzalin and Lingner, 2006). The lack of an NMD phenotype in PIKK mutants, other than smg1, may indicate that these kinases are unable to substitute for SMG1, a view supported by the difference in the domain structure of SMG1 proteins in animals and plants compared to other PIKKs (Figure 1). Alternatively, it is possible that these PIKKs may act redundantly in some species, and that this redundancy may mask their NMD effects.

In summary, we have shown that SMG1 is not an animal-specific component of the NMD machinery, by demonstrating that it plays a conserved role in NMD between animals and plants, thus indicating a very early evolutionary origin for this NMD-associated kinase. However, the multiple independent losses of SMG1, together with its varied influence on NMD across the eukaryotic kingdoms, indicate that alternative means of NMD activation exist and may have existed since the last eukaryotic common ancestor. Given the possibility that this involves an alternative NMD kinase, it will be interesting to identify kinase(s) capable of phosphorylating UPF1 in organisms lacking SMG1, such as A. thaliana, and then to test the extent to which that capacity is conserved in other eukaryotes.

Experimental Procedures

P. patens growth conditions

Physcomitrella patens ssp. patens (Hedwig) ecotype ‘Gransden 2004’ (Ashton and Cove, 1977; Rensing et al., 2008) was cultured on BCDAT medium at 25°C under continuous light (Nishiyama et al., 2000). Lawns of protonemal filaments were grown on BCDAT medium overlaid with a cellophane disc, and sub-cultured by homogenization at 7-day intervals. Individual plants for phenotyping were picked from 5-day-old homogenized lawns grown on BCDAT medium with cellophane and cultured as spot inoculates on BCDAT medium without cellophane. For CHX treatment, 100 mg of 6-day-old homogenized lawns grown on BCDAT medium with cellophane was incubated for 6 h on a BCDAT medium plate supplemented with 20 μM CHX dissolved in dimethylsulfoxide (Sigma) or on a dimethylsulfoxide control plate.

Phylogenetic and bioinformatic analysis

Protein sequences of kinases were obtained from the National Center for Biotechnology Information, Phytozome and COSMOSS databases using BLAST (Altschul et al., 1990) with experimentally verified kinase sequences submitted as a query (for example, human SMG1). Predicted protein sequences were run through the SMART protein domain identification tool (Letunic et al., 2012) to identify and visualize the location of the kinase domain within the whole molecule. The kinase domain sequences were aligned using ClustalX, and a neighbour-joining tree was generated with bootstrap values (Larkin et al., 2007). Bootstrapping was performed using 1000 replicates. The tree was rooted by mid-point and visualized using FigTree (Figure S1). For synteny analysis, the A. lyrata SMG1 was identified in the Plant Genome Duplication Database, and close genes with synteny to other plant SMG1 genes were used to find partners in A. thaliana (Figure 1). Alignment of CPuORF sequences was performed using ClustalX and visualized using Jalview (Waterhouse et al., 2009).
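
The bootstrap step can be illustrated with a minimal Python sketch. It is not the ClustalX implementation: it only shows column-wise resampling of an alignment, and it uses "how often is a given pair the closest pair by p-distance" as a crude stand-in for clade support on a neighbour-joining tree. The function names and the toy sequences are hypothetical and are not data from this study.

```python
import random

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, skipping gap columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement; every row uses the same columns."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

def closest_pair_support(alignment, pair, replicates=1000, seed=0):
    """Fraction of bootstrap replicates in which `pair` is the closest pair.

    Assumption: this is a simplified proxy for bootstrap support, not a
    neighbour-joining tree reconstruction.
    """
    rng = random.Random(seed)
    names = sorted(alignment)
    candidates = [(m, n) for i, m in enumerate(names) for n in names[i + 1:]]
    target = frozenset(pair)
    hits = 0
    for _ in range(replicates):
        rep = bootstrap_replicate(alignment, rng)
        best = min(candidates, key=lambda mn: p_distance(rep[mn[0]], rep[mn[1]]))
        hits += frozenset(best) == target
    return hits / replicates

# Hypothetical toy alignment of kinase-domain fragments (illustration only).
toy = {
    "kinaseA": "MKVLAGDTWE",
    "kinaseB": "MKVLAGDTWE",
    "kinaseC": "GRIPSNEQFD",
}
support = closest_pair_support(toy, ("kinaseA", "kinaseB"), replicates=1000)
```

In a real analysis each replicate alignment would be passed to a tree-building routine, and support for a clade would be the fraction of replicate trees that contain it.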

Generation of smg1Δ mutants

Genomic sequences of SMG1 in moss (PpSMG1) were identified in version 1.6 of the moss genome by BLASTp searches. In version 1.6, PpSMG1 was split into two adjacent gene models (Pp1s51_180V6.1 and Pp1s51_182V6.1; Figure 2). A gene model (Pp1s51_180U2__zimmer.1; Figure 2) correctly unifying the two gene models was generated (D. Lang and A. Zimmer, University of Freiburg, Germany, personal communication). Approximately 1 kb of genomic DNA upstream and downstream of PpSMG1 (Pp1s51_180U2__zimmer.1) was amplified by PCR and cloned into pDONR221 P1–P4 and pDONR221 P3–P2, respectively. The P35S-nptII-g6term selection cassette was cloned from pMBL6aDL, provided by Y. Kamisugi and A. Cuming (University of Leeds, UK), into pDONR221 P4r–P3r. Multisite Gateway™ (Magnani et al., 2006) was used to recombine all three entry clones into a destination vector (pDEST22) to produce pKO SMG1. pKO SMG1 was used as a PCR template to produce the linear transforming DNA used to replace the approximately 15 kb PpSMG1 locus. Transformation and selection of knockout lines were performed as described in Kamisugi et al. (2005). All primers are listed in Table S2.

Gene expression analysis

Transcript abundance in WT and transgenic lines was assessed by RT-PCR. Total RNA was collected from 100 mg of protonema tissue 5 days post homogenization using an RNeasy plant mini kit with on-column DNase I treatment (Qiagen). cDNA for semi-quantitative RT-PCR was synthesized using 2 μg total RNA as template with SuperScript™ II reverse transcriptase (Invitrogen) and oligo(dT). This was diluted 30-fold, and 5 μl was used in a 20 μl PCR using Phusion™ (NEB) for semi-quantitative RT-PCR. PpEF1α was amplified as a reference as previously described (Khraiwesh et al., 2010), except when CHX was used, in which case Pp1s54_156V6.1 (the clathrin adapter complex subunit) was used as a reference gene (Kamisugi et al., 2012). Semi-quantitative RT-PCR products were directly sequenced after gel extraction. The template for quantitative RT-PCR was produced using 1 μg total RNA and an iScript cDNA synthesis kit (Bio-Rad). Quantitative RT-PCR was performed using a CFX96 real-time system with SsoFast EvaGreen Supermix (Bio-Rad). All primers are listed in Table S2.
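
The text does not state how the quantitative RT-PCR data were normalized. As an illustration only, the sketch below shows one widely used option, the 2^-ΔΔCt calculation, which normalizes a target transcript to a reference gene and then to a control sample. It assumes roughly 100% amplification efficiency for both primer pairs; the function name and the Ct values are hypothetical and are not measurements from this study.

```python
from statistics import mean

def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript in a sample relative to a control.

    Each argument is a list of technical-replicate Ct values.
    Assumption: both amplicons double every cycle (efficiency ~100%).
    """
    # Delta Ct: target normalized to the reference gene within each condition.
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    # Delta-delta Ct: sample normalized to the control condition.
    delta_delta = delta_sample - delta_control
    return 2 ** (-delta_delta)

# Hypothetical Ct values: a transcript in a mutant line versus WT,
# with a reference gene measured in the same cDNA preparations.
fold = relative_expression(
    ct_target_sample=[22.0, 22.0, 22.0],
    ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_control=[24.0, 24.0, 24.0],
    ct_ref_control=[18.0, 18.0, 18.0],
)
```

With these invented values the target crosses threshold two cycles earlier in the sample than in the control after normalization, which corresponds to a fourfold higher relative abundance. If primer efficiencies differ appreciably from 100%, an efficiency-corrected model would be more appropriate.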

Acknowledgements

We thank Samantha Rayson and Barry Causier for discussions and technical assistance, Yasuko Kamisugi and Andrew Cuming for donating moss, assistance with moss culturing and transformations, Mary Ashworth for assistance sub-culturing moss, and Daniel Lang and Andreas Zimmer (University of Freiburg, Germany) for help with gene model predictions for PpSMG1 and identification of uORF-containing genes. This work was supported by the Gatsby Charitable Foundation.