The catalytic domain CysPc of the DEK1 calpain is functionally conserved in land plants




DEK1, the single calpain of land plants, is a member of the ancient membrane-bound TML–CysPc–C2L calpain family that dates back 1.5 billion years. Here we show that the CysPc–C2L domains of land plant calpains form a separate sub-clade in the DEK1 clade of the phylogenetic tree of plants. The charophycean alga Mesostigma viride DEK1-like gene is clearly divergent from those in land plants, suggesting that a major evolutionary shift in DEK1 occurred during the transition to land plants. Based on genetic complementation of the Arabidopsis thaliana dek1-3 mutant using CysPc–C2L domains of various origins, we show that these two domains have been functionally conserved within land plants for at least 450 million years. This conclusion is based on the observation that the CysPc–C2L domains of DEK1 from the moss Physcomitrella patens complement the A. thaliana dek1-3 mutant phenotype. In contrast, neither the CysPc–C2L domains from M. viride nor chimeric animal–plant calpains complement this mutant. Co-evolution analysis identified differences in the interactions between the CysPc–C2L residues of DEK1 and classical calpains, supporting the view that the two enzymes are regulated by fundamentally different mechanisms. Using the A. thaliana dek1-3 complementation assay, we show that four conserved amino acid residues of two Ca2+-binding sites in the CysPc domain of classical calpains are conserved in land plants and functionally essential in A. thaliana DEK1.


Land plants evolved from green algae related to extant charophyte groups during the last 450–470 million years after the split of the green plant ancestral lineage to Chlorophyta and Streptophyta (Becker and Marin, 2009). Recent data from sequenced genomes of members of the plant lineage have identified a basal green plant gene set consisting of almost 4000 families, which represent the minimum set of genes likely to have been present in the common ancestor of all green plants (Banks et al., 2011). During the transition from single-celled green algae to land plants, the number of common genes almost doubled, as deduced from the genome of the moss P. patens. Together, they represent the set of basal land plant genes (Banks et al., 2011). New data from genome sequencing projects provide insight into the subset of genes that control key morphogenetic traits that define the evolution of land plants, including development of multicellular gametophytes and sporophytes, cell–cell communication via plasmodesmata, 3D body patterning and hormonal signaling (Graham et al., 2000; Pires and Dolan, 2012). A number of such genes, mostly transcription factors, including members of the MADS box, C3/4HDZ and WOX homeodomain gene families, were already present in the earliest land plants and some of the charophyte algae, but not chlorophytes, indicating that the common ancestor of land plants possessed a subset of gene families that are currently known to direct angiosperm development (Banks et al., 2011; Bowman, 2013). The DEK1 gene represents an example of an ancient gene that was recruited to serve a new function in land plants, namely the ability to sense and/or to transmit positional information, an ability that we hypothesize was essential for evolution of the 3D body patterning that allowed plants to conquer new habitats (Lid et al., 2002; Tian et al., 2007; Johnson et al., 2008). 
Recently, we showed that the DEK1 gene is a representative of the monophyletic TML calpains (calpains with a large transmembrane domain) dating back 1.5 billion years (Zhao et al., 2012). Descendants of TML calpains were lost in some groups of eukaryotes and retained in others, together with progenitors of three other ancestral calpains (Zhao et al., 2012). Approximately 800 million years ago, the last common progenitor of Chlorophyta and Charophyta possessed both TML calpains and cytosolic calpain derivatives of one or more of the three ancestral calpains. During the next few hundred million years, the Chlorophyta lineage appears to have lost TML calpains and retained non-TML calpains. So far, no functional information on these calpains is available. Land plant species sequenced to date, including mosses and angiosperms, possess only one calpain, namely DEK1 (Tian et al., 2007; Zhao et al., 2012).

The DEK1 protein consists of 21 transmembrane segments, a cytosolic linker segment (Arm), a CysPc domain, the signature domain of all calpains, and a C2-like domain (C2L), which is shared with animal classical calpains (Croall and Ersfeld, 2007; Ono and Sorimachi, 2012). Detailed information on the structure–function relationship of the CysPc and C2L domains exists only for animal calpains. The CysPc of classical calpains carries the active-site residues cysteine (Cys) on sub-domain PC1, and histidine (His) and asparagine (Asn) on sub-domain PC2, and two calcium-binding sites that are essential for Ca2+ activation (Strobl et al., 2000; Moldoveanu et al., 2002; Hanna et al., 2008). It is currently unknown to what extent the Ca2+-binding features apply to DEK1. Homology modeling of the DEK1 calpain from maize (Zea mays) based on the 3D structure of classical calpains has shown that, although the amino acid identity between the CysPc–C2L domains of classical calpains and maize DEK1 is only 30–40%, the two proteins have overall structural features in common, including the three active-site residues that have been shown to be essential for DEK1 in vitro and in vivo activity (Wang et al., 2003; Johnson et al., 2008).

The first indication that DEK1 from higher plants is fundamentally different from classical calpains was the observation that maize recombinant DEK1 CysPc–C2L calpain does not depend on Ca2+ for in vitro proteolytic activity (Wang et al., 2003). In classical calpains, the C2L domain plays an essential role in regulating CysPc activity via electrostatic interactions (Fernandez-Montalvan et al., 2004). In the absence of Ca2+, the C2L domain interacts with the PC2 sub-domain via an acidic residue-rich loop, creating a repulsive force that keeps PC2 distanced from PC1. Upon Ca2+ binding, the electrostatic interaction between the C2L–acidic loop and PC2 is disrupted, enabling formation of active calpain (Fernandez-Montalvan et al., 2004; Hanna et al., 2008). DEK1 C2L domains lack these acidic residues and the corresponding basic residues in the CysPc domain. Nevertheless, the C2L domain is still essential for DEK1 CysPc activity in vitro (Wang et al., 2003). Additional evidence for a specific function for the C2L domain was obtained by Roeder et al. (2012), who identified a new dek1-4 allele in Arabidopsis thaliana with a single point mutation in a highly conserved residue in the DEK1 C2L domain. A lack of sepal giant cells in this mutant indicates that the C2L domain plays a regulatory role in a specific cellular context within the epidermis. Finally, the ability of the DEK1 CysPc–C2L domains to complement the lethal phenotype of the A. thaliana dek1-3 mutant suggests that release from the membrane anchor is a key step in activation of the CysPc–C2L domains (Johnson et al., 2008).

In this study, we first sequenced the DEK1 CysPc–C2L domains of Mesostigma viride, a representative of basal unicellular charophytes. We then reconstructed the phylogenetic history for all currently available DEK1 genes within the Streptophyta. Next, we compared the co-evolution patterns for the CysPc–C2L domains of DEK1 and classical calpains. We then investigated the functional conservation of CysPc–C2L domains from various sources in a genetic complementation assay using the A. thaliana dek1-3 mutant. Finally, we used the same complementation assay to show that four CysPc amino acids that are essential for Ca2+ binding in classical calpains are also essential for DEK1 calpain activity in vivo.


The CysPc–C2L domains of land plants and the charophycean alga Mesostigma viride define two separate sub-clades of DEK1 TML calpains

In order to investigate whether basal charophyte algae contain TML calpains and/or cytosolic calpains, we screened a genomic BAC library from M. viride. Analyses of hybridization-positive sequences (GenBank accession numbers JQ309842–JQ309849) showed that the predicted protein contains TML, CysPc and C2L domains and the active-site amino acid residues of CysPc (Table S1). Rapid amplification of cDNA ends (RACE) using primers designed from M. viride BAC sequences successfully produced an 1827 bp 5' cDNA clone (GenBank accession number JQ248594.1) and a 1296 bp 3' cDNA clone (GenBank accession number JQ248595.1), confirming that a DEK1-like gene is actively expressed. Comparison of the cDNA and the genomic DNA identified in the hybridization-positive BAC clone indicates that two DEK1-like genes are present in M. viride. Despite several sequencing efforts, we have not been able to assemble the complete DEK1 sequence in this BAC. However, analyses of the cDNA and BAC sequences show that the identified calpain gene belongs to the TML calpain class (Table S1).

From current databases, we sampled 32 TML calpain sequences from land plants (Table S1), all of which are structurally similar to DEK1 (Figure S1). We also included additional DEK1 sequences from Marchantia polymorpha (provided by Katsuyuki T. Yamato, Graduate School of Biostudies, Kyoto University) (Data S1) and Ceratodon purpureus (provided by the US Department of Energy Joint Genome Institute). Importantly, TML calpains are present in every land plant genome that has been sequenced, from the liverwort M. polymorpha to the angiosperm A. thaliana. The DEK1 gene is mostly present as a single copy, with the exception of two copies in M. polymorpha, Selaginella moellendorffii, Populus trichocarpa, Glycine max and Mimulus guttatus. That the two identified DEK1 sequences in S. moellendorffii represent two different genes, and not two different alleles of DEK1, is evident from the observation that they have different adjacent genes. The evolutionary relationship between the CysPc–C2L domains of all identified TML calpains was analyzed in more detail using neighbor-joining and maximum-likelihood phylogenetic trees, with animal classical calpains as an out-group (Figure 1). The M. viride branch is longer than the land plant branches, which are short and highly similar in length, indicating a large degree of sequence divergence between the DEK1 protein of M. viride and those of the land plants. We also performed phylogenetic analyses of available TML calpains using CysPc alignments, showing that TML calpains are divided into a Streptophyta clade (comprising charophytes and embryophytes), referred to as the DEK1 clade, and a non-land plant clade (Figure S2). Within the DEK1 clade, the M. viride calpain forms a separate sub-clade.
A heatmap representation of the level of amino acid sequence conservation within the land plant DEK1 sub-clade proteins illustrates the high degree of conservation, at least 80% amino acid identity for CysPc and 60% for C2L. In contrast, the identity between A. thaliana and M. viride is approximately 50% for both domains (Figure 1). One measure of the selection pressure acting on the DEK1 protein is the ratio of non-synonymous (dN) versus synonymous (dS) nucleotide substitution sites (dN/dS) in the full-length DEK1 sequence and in the CysPc–C2L domains. In this analysis, a dN/dS value of 1.0 implies neutral selection on the coding sequence, and values that are significantly less or greater than 1.0 indicate purifying and positive selection, respectively (Hughes and Nei, 1988). First, we identified the best fitting model 010232 by HyPhy (Pond et al., 2005). Under this model, dN/dS values of 0.127 and 0.090 for the full-length DEK1 and CysPc–C2L sequences, respectively, were estimated. Furthermore, we evaluated the level of site-specific selection using the QuickSelectionDetection method (Pond et al., 2005). At the individual codon level, we identified 1354 residues under purifying selection without any single positively selected residue (Figure S3). These data demonstrate that the DEK1 protein is highly conserved within the land plant DEK1 sub-clade, consistent with the hypothesis that a major shift in DEK1 function occurred in the transition from single-celled charophytes to land plants.
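
The interpretation of dN/dS described above can be sketched in a few lines of Python. This is only an illustration of the decision rule, not the HyPhy analysis itself: the function name `classify_selection` and the `tolerance` parameter are hypothetical, and the fixed tolerance is a crude stand-in for the significance testing that a real analysis would perform.

```python
def classify_selection(omega, tolerance=0.05):
    """Label a dN/dS ratio (omega) as purifying, neutral or positive selection.

    `tolerance` is a hypothetical band around 1.0 standing in for a proper
    statistical test of whether omega differs significantly from 1.0.
    """
    if omega < 0:
        raise ValueError("dN/dS cannot be negative")
    if omega < 1.0 - tolerance:
        return "purifying"
    if omega > 1.0 + tolerance:
        return "positive"
    return "neutral"


# Values reported in the text for land plant DEK1.
ESTIMATES = {"full-length DEK1": 0.127, "CysPc-C2L": 0.090}

if __name__ == "__main__":
    for region, omega in ESTIMATES.items():
        print(region, omega, classify_selection(omega))
```

Under this rule, both reported estimates fall well below 1.0 and are therefore labeled as purifying selection, in line with the conclusion drawn in the text.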

Figure 1.

Phylogenetic analyses and sequence conservation of TML calpain CysPc–C2L sequences of streptophytes.

For each node, the level of statistical support obtained by the neighbor-joining method using the Jones–Taylor–Thornton (JTT) model with discrete Gamma distribution (+G), and by maximum-likelihood bootstraps inferred using the PROTGAMMALG model (for model selection see Zhao et al., 2012), is marked by a filled circle if all values are above 75% bootstrap support or an open circle if values are higher than 50% bootstrap support. Dashes show bootstrap support values < 50%. On the right, a heatmap shows the degree of amino acid sequence identity between the various DEK1 CysPc–C2L domains and that of Arabidopsis thaliana. The protein alignment used to construct the phylogenetic tree is provided in Data S3.

Plant and animal calpains exhibit different patterns of amino acid co-evolution within the CysPc–C2L domains

To identify the evolutionary constraints that shaped land plant calpains, and to compare these constraints between DEK1 and classical calpains, we used the CAPS program to analyze the amino acid inter-dependent evolution pattern within the CysPc–C2L domains of DEK1 and the animal orthologs of the classical calpain CAPN2. For DEK1, we identified 59 groups of co-evolving residues (Table 1). A co-evolution group was defined as including all residues that show inter-dependent covariance (co-evolution). For example, if residue A co-evolves with B, B with C and A with C, then all three residues belong to one co-evolution group (see 'Experimental procedures'). Among the 59 co-evolution groups in DEK1, 37 groups represent inter-domain co-evolution between CysPc and C2L. A number of compensatory mutations correlated with physico-chemical properties of the residues. Thus, 52 co-evolving groups exhibited correlated hydrophobicity, 51 groups correlated in terms of molecular weight, and 47 groups showed correlation in both hydrophobicity and molecular weight (Table 1). A network of DEK1 CysPc–C2L co-evolving residues is shown in Figure S4(a). The complexity of co-evolution for the CysPc–C2L domains differs widely in the classical calpains (Table 1 and Figure S4b). For CAPN2 orthologs, 141 co-evolution groups were identified, among which 84 are specifically involved in inter-domain co-evolution between CysPc and C2L, 105 exhibited correlated hydrophobicity, 103 correlated by molecular weight, and 77 showed correlation in both of these physico-chemical parameters (Table 1). These analyses indicate strong co-evolution between the CysPc and C2L domains in both DEK1 and CAPN2 orthologs. Further examination of the relative positions of all co-evolved residues in both DEK1 and CAPN2 CysPc–C2L domains showed that none of the active-site residues or Ca2+-binding sites has been subjected to co-evolution (Data S2). 
Moreover, no co-evolving amino acid pairs are shared between DEK1 and CAPN2 CysPc–C2L domains.

Table 1. The number of co-evolving groups under different correlated types in DEK1 and CAPN2 CysPc–C2L domains
                                        DEK1    CAPN2
Co-evolved groups                       59      141
Hydrophobicity                          52      105
Molecular weight                        51      103
Hydrophobicity and molecular weight     47      77
Co-evolved between CysPc and C2L        37      84

These results show that amino acid residue co-evolution within the CysPc–C2L domains differs widely between plant and animal calpains despite the predicted structural similarity of these domains in both groups of organisms. Thus, the strong purifying selection acting on DEK1-like proteins, together with a lower complexity of CysPc–C2L residue co-evolution, suggests high functional conservation of DEK1 calpains in land plants and more flexible functional divergence within CAPN2 orthologs.

DEK1 CysPc–C2L function is conserved within land plants

The phylogenetic analysis described above grouped all land plants in a DEK1 sub-clade with high identity for the CysPc–C2L domains. In order to investigate whether this sequence conservation reflects functional conservation, we performed a series of genetic complementation studies in A. thaliana taking advantage of the observation that ectopic expression of the AtDEK1 CysPc–C2L domains under the control of the RPS5A promoter complements the embryo-lethal phenotype of the dek1-3 mutant (Johnson et al., 2008). First, in a control experiment, we showed that the construct pRPS5A:AtCysPc–C2L–GFP rescued dek1-3/dek1-3 plants in progeny from transformed dek1-3/+ plants (Figure 2a,b and Table 2). Immunoprecipitation of protein extracts, followed by Western blot analysis, showed the presence of the recombinant protein (Figure 2c). We then transformed heterozygous dek1-3/+ plants with constructs containing the CysPc–C2L sequence from Z. mays (Zm), P. patens (Pp) and M. viride (Mv) fused to GFP under the control of the RPS5A promoter. Genotyping confirmed the identity of complemented homozygous dek1-3/dek1-3 lines showing wild-type phenotypes for the Z. mays and the P. patens constructs (Figure 2a,b and Table 2). RT-PCR using primers to detect AtCysPcC2L transcripts showed that endogenous A. thaliana DEK1 transcript was not detectable in the complemented PpCysPc–C2L–GFP and ZmCysPc–C2L–GFP lines (Figure 2d). The presence of the recombinant proteins was confirmed by immunoprecipitation analysis (Figure 2c). Microscopic observation of the complemented AtCysPc–C2L–GFP, PpCysPc–C2L–GFP and ZmCysPc–C2L–GFP dek1-3/dek1-3 plants showed that the pavement cells of the complemented lines have a wild-type appearance and are organized in a continuous layer (Figure 2e). 
We also performed detailed examination of the T2 seeds from individual siliques harvested from each complemented T1 line, and found that the seeds appeared normal and well-filled, with no collapsed dek1-3 mutant seeds present (Figure 2f). These results show that the Z. mays and P. patens CysPc–C2L proteins complement the embryo-lethal phenotype of the A. thaliana dek1-3 mutant.

Table 2. Number and genotype of transgenic Arabidopsis thaliana plants obtained in the complementation assay
Columns: Constructs; Number of each genotype (T1 plants; T2 plants)
Row group: Animal–plant chimera

At, Arabidopsis thaliana; Zm, Zea mays; Pp, Physcomitrella patens; Mv, Mesostigma viride; Rn, Rattus norvegicus; Hs, Homo sapiens.
a T2 plants for genotyping were harvested from one individual T1 dek1-3/dek1-3 plant.
b T2 plants for genotyping were harvested from one individual T1 dek1-3/+ plant.

Figure 2.

Cross-species genetic complementation of Arabidopsis thaliana mutant dek1-3.

(a) Phenotype of homozygous dek1-3 lines complemented with constructs containing the CysPc–C2L domains from A. thaliana (pRPS5A:AtCysPc–C2L–GFP; line At43), Z. mays (pRPS5A:ZmCysPc–C2L–GFP; line Zm21) and P. patens (pRPS5A:PpCysPc–C2L–GFP; line Pp55). The images show representative 1-month-old seedlings.

(b) PCR genotyping showing absence of the wild-type DEK1 allele in genomic DNA extracted from lines At43, Zm21 and Pp55 (N, negative control lacking template).

(c) Western blot analysis of immunoprecipitated GFP fusion proteins extracted from lines At43, Zm21 and Pp55, showing the predicted recombinant protein of approximately 80 kDa.

(d) RT-PCR showing that endogenous A. thaliana DEK1 transcripts are not detected in Zm21 and Pp55 lines (upper panel). Actin was used as a control (lower panel); N, negative control lacking template.

(e) Scanning electron micrographs of the adaxial epidermis of 4th leaf of plants from the At43, Zm21 and Pp55 lines and a wild-type plant. Scale bar = 100 μm.

(f) Seeds from individual siliques from dek1-3/+ plants and the complemented At43, Zm21 and Pp55 lines. The arrowheads indicate collapsed dek1-3 mutant seeds. Scale bar = 1 mm.

In contrast, no complemented dek1-3/dek1-3 plants were recovered when we used the M. viride CysPc–C2L construct to transform dek1-3/+ plants (Table 2). Immunoprecipitation experiments were performed to show the presence of the recombinant protein (Figure S5a). In addition, we also tested whether the CysPc domain of animal calpains functions in plants. We created chimeric CysPc–C2L constructs in which we replaced the CysPc domain of A. thaliana with the CysPc domains of classical rat μ- and m-calpains (CAPN1 and CAPN2, respectively) and the human p94 calpain (CAPN3). Two versions of each chimeric construct were used, differing in the start point of the sequence encoding the N-terminal region of CysPc (Figure S6). We used the RPS5A promoter to drive expression and the GFP tag for detection of recombinant protein. Although many heterozygous GFP-positive lines were generated, no homozygous dek1-3/dek1-3 plants were recovered (Table 2). We also genotyped T2 progeny from individual transgenic heterozygote dek1-3 lines, but no homozygous complemented lines were identified (Table 2). Immunoprecipitation experiments showed that the GFP fusion proteins were expressed (Figure S5b). From these results, we conclude that the catalytic domains of CAPN1, 2 and 3 do not functionally substitute for the A. thaliana DEK1 CysPc domain in the dek1-3 mutant background.

Conserved residues coordinating Ca2+ binding in animal calpains are essential for in vivo CysPc–C2L function in Arabidopsis thaliana

Previous modeling of the 3D structure of maize DEK1 CysPc–C2L based on the crystal structure of the classical calpain CAPN2 identified conservation of important structural elements of classical calpains (Wang et al., 2003). Three-dimensional modeling of A. thaliana DEK1 CysPc–C2L using rat calpain as a model confirms these findings, including conservation of the PC1 and PC2 sub-domains, the catalytic site residues Cys, His and Asn, as well as 100% conservation of two glycine residues (Gly209 and Gly210) that function as a hinge between the two sub-domains PC1 and PC2 in classical calpains (Strobl et al., 2000) (Figure 3a,b). A major question regarding DEK1 regulation is whether Ca2+ is an activator as for classical calpains, especially in light of the observation that maize CysPc–C2L proteolytic activity does not depend on Ca2+ in vitro (Wang et al., 2003). Animal calpains contain two conserved Ca2+-binding sites in the CysPc domain, each of which consists of multiple amino acids (Moldoveanu et al., 2002). Sequence alignment between the land plant DEK1 calpains and classical calpains shows a near-perfect conservation of these residues (Figure 3a). Residues of the first binding site, including I99, G101, D106 and E185 (Figure 3c) (numbering relative to rat m-calpain, NP_058812.1) are all conserved in DEK1 (Figure 3a,b). The amino acids D106 and E185, which bind Ca2+ with two side-chain oxygen atoms in animal calpains, are 100% conserved in plant DEK1 sequences (corresponding to D1752 and E1828 in the A. thaliana DEK1 protein) (Figure 3c). The second Ca2+-binding site in m-calpain includes the amino acids E302, D309, Q329, D331 and E333 (Figure 3c). The amino acids that contribute two Ca2+-coordinating side-chain oxygen atoms in animal calpains, E302 and D309, are also 100% conserved in the DEK1 calpains (E1946 and D1953 in A. thaliana DEK1) (Figure 3c).
However, the amino acid D331 is less conserved in plant DEK1 sequences, and two additional amino acids (E333 and R104 in m-calpain) that form a double salt bridge between the two CysPc Ca2+-binding sites are conserved in most, but not all, animal calpains and are absent from land plant DEK1 proteins (Figure 3a). The calculated dN/dS ratios for the Ca2+-binding sites, the active-site residues (Cys, His and Asn) and the hinge glycines of DEK1 indicate strong purifying selection (Figure 3a). In order to experimentally investigate the functional significance of the four conserved DEK1 residues corresponding to residues that coordinate Ca2+ in animal CysPc, we performed in vitro mutagenesis and tested their functionality in the A. thaliana dek1-3 complementation test. In these experiments, we created the four single mutation variants D106N, E185Q, E302Q and D309N (corresponding to amino acids D1752, E1828, E1946 and D1953, respectively, in the A. thaliana DEK1 protein). We also created two double mutation variants, D106N/E185Q and E302Q/D309N, and one quadruple mutation variant, D106N/E185Q/E302Q/D309N. These pRPS5A:AtCysPc–C2L–GFP mutant variants were transformed into segregating heterozygous dek1-3 plants as described above. A large number of independent transgenic plants were generated for all constructs. Genotyping performed as described above showed that none of the recovered transformed plants were homozygous for the dek1-3 allele (Table 3). Fisher's exact test shows that the total absence of transgenic dek1-3 mutant lines was statistically significant based on the number of transformed heterozygous dek1-3 lines obtained (Table 3). Transgene transcript accumulation was confirmed by RT-PCR for all mutant lines (Figure S7a), and GFP expression of the quadruple mutant fusion protein was verified by confocal microscopy (Figure S7b).
In addition, immunoprecipitation experiments showed that the recombinant AtCysPc–C2L–GFP fusion proteins harboring the single mutations and the double mutation E302Q/D309N were expressed in A. thaliana (Figure S5c). As we were unable to detect complemented T1 lines, we investigated the T2 progeny from heterozygous dek1-3 lines transformed with the quadruple, double (D106N/E185Q) and single mutant versions of the AtCysPc–C2L–GFP constructs. Microscopic evaluation of the seeds in individual siliques from these plants showed that the transgenic dek1-3/+ plants segregate for well-filled viable seeds and shrunken non-viable seeds (Figure S7c). We also genotyped 50 T2 seedlings from each of three individual lines transformed with the quadruple and the two double mutation constructs, but did not detect complemented dek1-3/dek1-3 T2 individuals. These results show that AtCysPc–C2L–GFP proteins carrying mutations in amino acids corresponding to Ca2+-coordinating residues in classical calpains are unable to complement the embryo-lethal dek1-3 mutant phenotype. We therefore conclude that the amino acids D1752, E1828, E1946 and D1953 of A. thaliana DEK1 CysPc have essential in vivo functions.
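
Because the variants are named using rat m-calpain numbering, it can help to translate them into A. thaliana DEK1 numbering. The short sketch below does this using only the four residue correspondences stated in the text; the names `RAT_TO_AT`, `to_at_numbering` and `VARIANTS` are illustrative and are not taken from the original work.

```python
# Residue correspondence stated in the text: rat m-calpain -> A. thaliana DEK1.
RAT_TO_AT = {"D106": "D1752", "E185": "E1828", "E302": "E1946", "D309": "D1953"}

# The seven constructs described in the text, in rat m-calpain numbering.
VARIANTS = [
    "D106N",
    "E185Q",
    "E302Q",
    "D309N",
    "D106N/E185Q",
    "E302Q/D309N",
    "D106N/E185Q/E302Q/D309N",
]


def to_at_numbering(variant):
    """Translate a variant such as 'D106N/E185Q' into A. thaliana DEK1 numbering."""
    translated = []
    for mutation in variant.split("/"):
        # Split 'D106N' into the wild-type site 'D106' and the new residue 'N'.
        site, new_residue = mutation[:-1], mutation[-1]
        translated.append(RAT_TO_AT[site] + new_residue)
    return "/".join(translated)


if __name__ == "__main__":
    for variant in VARIANTS:
        print(variant, "->", to_at_numbering(variant))
```

For example, the double variant D106N/E185Q corresponds to D1752N/E1828Q in the A. thaliana DEK1 protein.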

Table 3. Number and genotype of transgenic Arabidopsis thaliana plants (T1) transformed with the mutated versions of the pRPS5A:AtCysPc–C2L–GFP construct
Constructs                        Number of each genotype    P value (a)
Quadruple mutation
  D106N, E185Q, E302Q, D309N      29    14    0              0.0407
Double mutations
  D106N, E185Q                    18    22    0              0.0089
  E302Q, D309N                    46    28    0              0.0018
Single mutations

(a) Fisher's exact test.

Figure 3.

Identification of conserved amino acid residues between the CysPc–C2L domains of streptophytes, rats and humans.

(a) Site alignment of functional amino acid residues in classical rat and human calpains, and the conservation of these in DEK1 proteins of streptophytes. The function of each amino acid is indicated below the alignment using colored circles. Residues with 100% conservation in both rat/human and calpain sequences from streptophytes, and in rat/human calpains alone, are shown in red and blue, respectively. Other residues that are highly conserved in DEK1 proteins are shown in green. The estimated dN/dS ratio of selected amino acid codons is shown below the alignment. The numbers above the alignment correspond to the numbering of the amino acids in Arabidopsis thaliana DEK1. The site alignment was generated using the MAFFT version 6 G-INS-i strategy (Katoh and Toh, 2008).

(b) Upper panel: modeled structure of the A. thaliana CysPc domain showing the PC1 and PC2 sub-domains in various shades of blue, the active-site cleft (arrow) and the two putative Ca2+-binding sites (green spheres). Lower panel: 3BOW model of the rat m-calpain CysPc (red), C2L and PEF (both in orange) domains.

(c) Enlargement of the A. thaliana amino acids (upper panel) corresponding to Ca2+-binding sites 1 and 2 in rat m-calpain (lower panel). For orientations, see green dots in (b). The green and red spheres are Ca2+ atoms and H2O molecules, respectively.


Previously, we showed that the DEK1 protein belongs to an ancient group of TML calpains formed early in the evolution of eukaryotes (Zhao et al., 2012). Here, we provide evidence that DEK1s of land plants form a separate TML calpain sub-clade with high identity in both the CysPc and C2L domains among species representing an evolutionary time span of approximately 450 million years. Analysis of sequence constraints confirms that these domains have been under strong purifying selection in land plants. Using an A. thaliana dek1-3 genetic complementation assay, we show that the CysPc–C2L domains of DEK1 are also functionally conserved in land plants, as demonstrated by the ability of the CysPc–C2L of P. patens to complement the A. thaliana dek1-3 mutant. This functional conservation over at least 450 million years supports the hypothesis that DEK1 assumed a new essential role during the transition from algae to land plants. As shown here, M. viride, the basal line of land plants (Leliaert et al., 2012), harbors a DEK1 gene that forms a separate sub-clade of TML calpains, with CysPc–C2L domains that are unable to functionally substitute for their land plant counterparts. Although yet to be identified, the sequence differences between DEK1 of M. viride and land plants are assumed to hold the key to identifying the changes that allowed DEK1 to assume a new role in land plants. To determine whether these changes occurred with the evolution of multicellularity or later when conquering the land, it will be important to investigate whether DEK1 sequences from basal multicellular charophytes such as Coleochaete orbicularis and Chara braunii also complement the A. thaliana dek1-3 mutant as soon as genome data from these species become available. The new function for DEK1 is also likely to involve novel protein–protein interactions, and searching for the DEK1 substrate(s) is an important area for future research.
Data are also lacking to enable us to conclude whether the TML calpain of M. viride is the only calpain variant in this species, an important question as it may help to determine at what stage during land plant evolution cytosolic calpains were lost.

What is the proposed novel role of DEK1? One trait that was added to the repertoire of plant development during the divergence of multicellular streptophytes from basal charophycean algae such as M. viride is phragmoplast-based cytokinesis, paving the way for the development of more complex plant organs through control of the cell division plane orientation (Leliaert et al., 2012). Another requirement for constructing 3D architectures is the ability to identify the relative division plane orientation in complex organs such as meristems, in which the L1 layer plays an essential role. Based on the effect of mutations or perturbation of DEK1 using RNAi in higher plants, including developmental defects in embryo, endosperm, meristems and leaves, we propose that DEK1 provides positional information to developing epidermal cells in these organs (Lid et al., 2002, 2005; Tian et al., 2007). One of the critical features that evolved in the earliest diverging land plants such as mosses was a histogenetic apical meristem, characterized by cells with the ability to divide in multiple directions (Nedelcu et al., 2006). The fact that CysPc–C2L of the moss P. patens rescues the A. thaliana dek1-3 embryo-lethal phenotype provides evidence that the fundamental role of DEK1 is linked to the early morphogenetic events that control 3D body patterning, and that this function has been conserved for over 450 million years. The substrate of the DEK1 calpain has yet to be identified, but the functional conservation of the catalytic domain CysPc shown here suggests that the substrate may also be conserved. Our current knowledge of the cellular targets of calpains comes exclusively from animal calpain research. Mammalian classical calpains cleave various cellular substrates, among them cytoskeletal proteins, including actin-binding proteins and microtubule-associated proteins, receptors and cellular transporters (Fischer et al., 1991; Franco et al., 2004; Gomes et al., 2011).
Recently, it has been shown that microtubule-associated proteins belonging to the MAP65 and CLASP families control division plane rotation and epidermal cell fate in A. thaliana roots (Dhonukshe et al., 2012). Based on the dek1 mutant phenotypes in angiosperms, including aberrant cell division plane positioning in dek1 embryos (Lid et al., 2005; Johnson et al., 2005) and loss of epidermal cell fate in leaves and endosperms (Lid et al., 2002; Tian et al., 2007), it is tempting to hypothesize that DEK1 calpains are involved in the processing of cellular components that control oriented cell division and cell differentiation.

Classical CysPc calpain activity is under strict regulation by multiple control mechanisms, including Ca2+ and interaction with C2L and the PEF domain of the large subunit and a small regulatory subunit, as well as the specific inhibitor calpastatin (Fernandez-Montalvan et al., 2004; Wendt et al., 2004). For DEK1, the proposed mechanism for regulating calpain activity involves the membrane domain, which senses and/or transmits the cell surface position, thereby activating CysPc–C2L, probably via autocatalytic cleavage (Tian et al., 2007; Johnson et al., 2008). Assuming that the in vitro results for maize DEK1 are representative of the in vivo regulation of DEK1, our finding that the four amino acid residues of DEK1 CysPc that correspond to Ca2+-binding sites are all essential for in vivo function in A. thaliana may appear surprising. Two alternative explanations may be offered for the inactivity of the mutagenized CysPc constructs in A. thaliana. The first is that the enzyme is strictly dependent on Ca2+ for activity, despite the data cited above showing that Ca2+ is not an essential in vitro DEK1 activator. If this is the case, it is surprising that we do not observe a quantitative effect of replacing each of the four amino acid residues in the complementation test. The second scenario is that A. thaliana DEK1 is not activated by Ca2+, but that the amino acids that bind Ca2+ in classical calpains are essential for the overall structure of the enzyme, and changes in these residues disrupt enzyme activity. To minimize the possibility of folding and charge effects, we chose D→N and E→Q substitutions, which are conservative in shape and functionality, for our site-directed mutagenesis. However, disruption of the fine structure cannot be totally ruled out.
Although the CysPc domain of DEK1 represents a bona fide calpain catalytic core domain, we show that the pattern of co-evolution of the CysPc–C2L domains differs significantly between plants and animals, indicating that different mechanisms control the activity of these calpains. Clearly, the questions raised here regarding DEK1 CysPc and C2L structure–function relationships can only be answered by resolving the crystal structure in the presence and absence of Ca2+, as for classical calpains (Strobl et al., 2000; Moldoveanu et al., 2002; Hanna et al., 2008). One of the proposed mechanisms that controls calpain activity independently of calcium is phosphorylation of the CysPc domain (Zadran et al., 2010). Whether this modification controls DEK1 activity and which kinases/phosphatases are involved will be the subject of further investigation.

In conclusion, we show here that DEK1s of land plants form a sub-clade of the TML calpains, with functional conservation of the CysPc–C2L domains across mosses and angiosperms. From this result, we propose that DEK1 evolved a novel function during the transition from single-celled green algae to land plants. The mechanisms involved in this transition are yet to be determined, but we speculate that the novel function involved the ability to provide positional information to complex organs. In our co-evolution analysis, we further identified major differences in the interaction between the CysPc–C2L residues of DEK1 and classical calpains, suggesting that the two enzymes are regulated by different mechanisms. Finally, we demonstrate that four amino acid residues involved in Ca2+ binding in classical calpains are essential for DEK1 in vivo function, suggesting that they have essential structural roles that are not necessarily related to Ca2+ activation.

Experimental procedures

Identification and cloning of DEK1-like sequences from Mesostigma viride

The M. viride BAC library Mv_Bb (Arizona Genomics Institute, Tucson, AZ) was hybridized with a DEK1-like probe isolated from Coleochaete orbicularis (strain LB 2651, The University of Texas at Austin, Austin, TX) using primers designed from a C. orbicularis EST sequence (clone corb-UMD_Coleochaete_c12129_c_s, kindly provided by Ruth Timme and Charles Delwiche, Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD) encoding a DEK1-like calpain. The two most strongly hybridizing clones, 47K6 and 46G6, were subjected to pyrosequencing using the 454 Flx platform (454 Life Sciences). The sequence reads were assembled independently using several programs, including Newbler (454 Life Sciences), Celera (Celera Corporation) and CLC Genomic Workbench (CLC bio). Assembled contigs were verified by PCR. Methods S1 provides details of the hybridization conditions.

To clone DEK1-like encoding transcripts from M. viride, 5' and 3' RACE was performed using the FirstChoice RLM-RACE kit (Ambion) according to the manufacturer's instructions. All gene-specific primers were designed based on information from the M. viride BAC sequence. Methods S1 provides further information. The sequences of all primers used are listed in Table S2.

Identifying transmembrane calpains from publicly available databases

BLASTp searches were performed using the A. thaliana DEK1 sequence (GenBank accession NP_175932.2) as query to identify all calpain homologs in Phytozome (Altschul et al., 1990), the Origins of Multicellularity project database and the National Center for Biotechnology Information database. Each retrieved calpain-encoding sequence was then submitted to the TMHMM server (version 2.0; Krogh et al., 2001) to identify proteins harboring large transmembrane domains. All TML calpains were submitted to the Pfam (Punta et al., 2012), SMART (Letunic et al., 2009) and Conserved Domain (Marchler-Bauer et al., 2003) databases to identify and classify protein domains. All of the sequences used in this study are listed in Table S1.
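The transmembrane filtering step can be sketched as follows. This is a minimal illustration rather than the procedure actually used: it assumes that TMHMM 2.0 predictions have been saved in the program's short output format (one whitespace-separated line per protein, including a `PredHel=` field), and the cut-off `min_helices=15`, the function names and the example records are all hypothetical.

```python
def parse_tmhmm_short(lines):
    """Map protein ID -> number of predicted TM helices (PredHel field)."""
    counts = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        protein_id = fields[0]
        for field in fields[1:]:
            if field.startswith("PredHel="):
                counts[protein_id] = int(field.split("=", 1)[1])
    return counts


def select_tml_candidates(counts, min_helices=15):
    """Keep proteins with at least `min_helices` predicted helices.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return sorted(pid for pid, n in counts.items() if n >= min_helices)


# Invented example records in TMHMM short format (not real predictions).
example_output = [
    "toy_membrane_calpain\tlen=2151\tExpAA=480.10\tFirst60=20.50\tPredHel=21\tTopology=o10-32i",
    "toy_cytosolic_calpain\tlen=700\tExpAA=0.05\tFirst60=0.00\tPredHel=0\tTopology=o",
]

counts = parse_tmhmm_short(example_output)
candidates = select_tml_candidates(counts)
print(candidates)
```

Candidates passing such a filter would still need the domain classification described above before being treated as TML calpains.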

Phylogenetic analyses of transmembrane calpains

To investigate the phylogenetic relationship between TML calpains of streptophytes, we used the calpain domains (CysPc–C2L) to reconstruct maximum-likelihood (ML) and neighbor-joining (NJ) trees. The ML tree was constructed as described previously (Zhao et al., 2012). The NJ tree was constructed using MEGA5 (Tamura et al., 2011) with 1000 bootstrap replicates using the next-best ProtTest model JTT+G+I because LG is not available in the program. We used the same method to reconstruct the phylogeny of the CysPc sequences of all identified TML calpains.
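The bootstrap step used to assess branch support can be illustrated with a short sketch. MEGA5 performs this resampling internally; the code below only shows the underlying idea of drawing alignment columns with replacement, and does not build trees. The toy alignment, the function names and the seed are assumptions made for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate: columns resampled with replacement."""
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(row[c] for c in cols) for row in alignment]


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Generate `n_replicates` resampled alignments (1000 as in the text)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy three-sequence protein alignment (invented, for illustration only).
toy_alignment = ["MKCDE", "MKCDQ", "MRCNE"]
replicates = bootstrap_replicates(toy_alignment)
print(len(replicates), replicates[0])
```

In a full analysis, a tree would be inferred from each replicate, and the support for a branch would be the fraction of replicate trees that contain it.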

Selection analysis

Sequences used for selection analysis are listed in Table S1. DEK1 CysPc–C2L codon alignment was performed using Mafft version 6 (Katoh and Toh, 2008) with the L-INS-i algorithm, and then loaded into HyPhy version 2.0 (Pond et al., 2005) along with a corresponding NJ phylogenetic tree. A HyPhy batch file with a model rejection level of 0.0002 was used to establish the best fit among 203 general time-reversible (GTR) models of nucleotide substitution (Kosakovsky et al., 2009). A HyPhy batch file was also used to estimate site-by-site variation in rates.

CysPc and C2L domain co-evolution analysis

To identify co-evolution patterns in CysPc–C2L domains of the land plant DEK1 and animal m-calpains, we used the parametric method based on correlated evolutionary patterns among amino acid sites implemented in CAPS (version 1.0) (Fares and McNally, 2006). The probabilities and significance of the correlated evolutionary patterns among amino acid sites were estimated using a large number (10 000) of random samplings and a small α value (0.001) to minimize the false-positive rate (type I error). The scores for the amino acid substitutions were obtained using the appropriate block substitution matrix (BLOSUM80) (Henikoff and Henikoff, 1992), depending on the similarity of the protein sequences. Cytoscape (Smoot et al., 2011) was used to visualize the co-evolutionary networks identified by CAPS, and to generate the networks of correlation between co-evolving amino acids. All of the sequences used in this study are listed in Data S2 and Table S1.
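The logic of testing correlated variation between two alignment sites by random sampling can be sketched as follows. This is a deliberately simplified stand-in and not the CAPS algorithm: it scores each pair of sequences by simple residue identity instead of BLOSUM-based substitution scores, and it does not correct for sequence divergence. All names and the toy columns are hypothetical.

```python
import itertools
import math
import random


def pair_scores(column):
    """Score every sequence pair at one site: 1.0 if identical, else 0.0."""
    return [1.0 if a == b else 0.0
            for a, b in itertools.combinations(column, 2)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def site_correlation(col_i, col_j):
    """Correlation of pairwise variability between two alignment columns."""
    return pearson(pair_scores(col_i), pair_scores(col_j))


def permutation_p_value(col_i, col_j, n_samples=10000, seed=0):
    """Fraction of random shufflings at least as correlated as observed."""
    rng = random.Random(seed)
    observed = abs(site_correlation(col_i, col_j))
    shuffled = list(col_j)
    hits = 0
    for _ in range(n_samples):
        rng.shuffle(shuffled)  # break the link between the two sites
        if abs(site_correlation(col_i, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_samples + 1)


# Two toy columns (one residue per sequence) that vary together.
site_a = "DDDDEEEE"
site_b = "KKKKRRRR"
print(site_correlation(site_a, site_b))
print(permutation_p_value(site_a, site_b, n_samples=2000))
```

With only a handful of sequences, a permutation test of this kind cannot reach a threshold as stringent as α = 0.001, which illustrates why a broad sequence sample is needed for such an analysis.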

Homology modeling of the Arabidopsis thaliana DEK1 CysPc domain

A homology model of the A. thaliana DEK1 CysPc domain was created using the Rattus norvegicus m-calpain structure (PDB entry 3BOW) as a template (Hanna et al., 2008). The model was built using the Prime module of the Schrödinger suite (Schrödinger Inc.). Loops were refined and the two Ca2+ ions were modeled in based on the template. Finally, the model was energy-minimized using the Schrödinger Macromodel module. The final A. thaliana CysPc model consists of 256 residues, including the active-site residues Cys1761, His1919 and Asn1939 and the two putative Ca2+-binding sites. We evaluated the model for appropriate stereochemical quality using the SAVES server, which hosts five quality checking tools (PROCHECK, WHAT_CHECK, ERRAT, VERIFY_3D and PROVE).

Construction of binary vectors for genetic complementation studies in Arabidopsis thaliana

Constructs for genetic transformation of A. thaliana were prepared in the binary vector pMDC107 (CD3-748) (Curtis and Grossniklaus, 2003). The A. thaliana and P. patens DEK1 CysPc–C2L coding sequences corresponding to amino acids P1665RMETQE to LEAL2151 (GenBank accession number NP_175932.2) and P1686KIET to LEPL2173 (GenBank accession number XP_001774206.1), respectively, were amplified from cDNA and cloned into the vector pMDC107 producing a GFP fusion at the C-terminus under the control of the ribosomal protein 5a (RPS5A) promoter (Weijers et al., 2001), resulting in plasmids pRPS5A:AtCysPc–C2L–GFP and pRPS5A:PpCysPc–C2L–GFP, respectively. The native Z. mays DEK1 CysPc–C2L coding sequence (corresponding to amino acids P1673RFET to RLEAV2159; GenBank accession number NP_001105528) and an A. thaliana codon-optimized M. viride DEK1-like CysPc–C2L coding sequence (Figure S8) were synthesized by GenScript and cloned into the binary vector, resulting in plasmids pRPS5A:ZmCysPc–C2L–GFP and pRPS5A:MvCysPc–C2L–GFP, respectively. The chimera sequences, comprising the CysPc coding sequence from either rat μ-calpain (Capn1; GenBank accession number NM_019152.2), rat m-calpain (Capn2; GenBank accession number NM_017116.2) or human p94 (Capn3; GenBank accession number NM_000070.2) and the A. thaliana DEK1 C2L coding sequence, were synthesized by GenScript and cloned into vector pMDC107 harboring the RPS5A promoter sequence. Detailed illustrations of the chimera junction position are shown in Figure S6. Methods S1 provides more information about the cloning strategy.
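The residue ranges quoted above can be turned into fragment lengths with a short sketch. It assumes that each number marks the position of the first or last residue of the stated motif in the full-length protein (1-based, inclusive); the function names are hypothetical, and the full-length sequences themselves are not reproduced here.

```python
def region_length(start, end):
    """Length of a 1-based, inclusive residue range."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def extract_region(protein_seq, start, end):
    """Slice a protein sequence using 1-based, inclusive coordinates."""
    if not 1 <= start <= end <= len(protein_seq):
        raise ValueError("coordinates fall outside the sequence")
    return protein_seq[start - 1:end]


# CysPc-C2L boundaries as quoted in the text (accession: (start, end)).
REGIONS = {
    "A. thaliana NP_175932.2": (1665, 2151),
    "P. patens XP_001774206.1": (1686, 2173),
    "Z. mays NP_001105528": (1673, 2159),
}

lengths = {name: region_length(*coords) for name, coords in REGIONS.items()}
for name, n_residues in lengths.items():
    print(name, n_residues, "residues,", 3 * n_residues, "coding nucleotides")
```

Under this reading of the coordinates, the three plant fragments are of nearly identical length, as expected for orthologous domains.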

A plasmid containing the AtCysPc–C2L sequence was used as template to generate mutations of putative Ca2+-binding sites in the A. thaliana DEK1 CysPc sequence. Mutant constructs were generated using either the GeneTailor site-directed mutagenesis system (Invitrogen; single mutations) or the QuikChange multi site-directed mutagenesis kit (Agilent Technologies; double and quadruple mutations). The sequences of the mutant oligonucleotides are shown in Table S2, and successful incorporation of the mutations was confirmed by sequencing. The mutant sequences were cloned into pRPS5A:AtCysPc–C2L–GFP, replacing the native AtCysPc–C2L sequence, to generate the mutated versions of the pRPS5A:AtCysPc–C2L–GFP constructs.
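The conservative substitution scheme (D→N and E→Q) can be expressed in silico as a small sketch, for example to generate the expected mutant protein sequences against which sequencing results are checked. The toy peptide and the positions used below are invented for illustration and do not correspond to the actual DEK1 residues; the function names are hypothetical.

```python
# Conservative replacements used for the Ca2+-binding-site residues.
CONSERVATIVE = {"D": "N", "E": "Q"}


def mutation_label(protein_seq, pos):
    """Return a label such as 'D3N' for a 1-based position."""
    original = protein_seq[pos - 1]
    if original not in CONSERVATIVE:
        raise ValueError(f"position {pos} is {original}, expected D or E")
    return f"{original}{pos}{CONSERVATIVE[original]}"


def apply_substitutions(protein_seq, positions):
    """Apply D->N / E->Q at each 1-based position; other residues are kept."""
    residues = list(protein_seq)
    for pos in positions:
        original = residues[pos - 1]
        if original not in CONSERVATIVE:
            raise ValueError(f"position {pos} is {original}, expected D or E")
        residues[pos - 1] = CONSERVATIVE[original]
    return "".join(residues)


# Toy peptide with an Asp at position 3 and a Glu at position 5.
toy_peptide = "MADKEL"
single = apply_substitutions(toy_peptide, [3])
double = apply_substitutions(toy_peptide, [3, 5])
print(mutation_label(toy_peptide, 3), single)
print(mutation_label(toy_peptide, 5), double)
```

Checking the wild-type residue before substituting guards against off-by-one errors in position numbering, which are easy to make when coordinates refer to the full-length protein rather than the cloned fragment.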

Agrobacterium transformation, generation of transgenic lines and Arabidopsis thaliana growth

The binary vectors were transformed into Agrobacterium tumefaciens C58 (pGV2330) by electroporation as previously described (Shen and Forde, 1989). Wild-type A. thaliana ecotype Col-0 or heterozygous dek1-3 (SAIL_384_F07; N817685, Nottingham Arabidopsis Stock Centre) plants were transformed by the floral-dip method (Clough and Bent, 1998). Transformants were selected on 1× Murashige & Skoog (MS) medium, pH 5.7, supplemented with 1% w/v sucrose and 20 μg/ml hygromycin B (Sigma-Aldrich). Heterozygous dek1-3 plants for transformations were selected on MS/agar plates containing 10 μg/ml glufosinate ammonium (Fluka). All plants were cultivated in growth chambers at 22°C and 70% humidity, under cool white fluorescent lights (100 μmol m−2 sec−1) for 8 h dark/16 h light.

Genotyping of transformed Arabidopsis thaliana

PCR was used to identify the genotype of transformed hygromycin-resistant plants. The primer combinations SP/1-3 + LB1 and SP/1-3 + ASP/1-3 (Table S2) were used to detect the dek1-3 allele and the wild-type allele, respectively. PCR genotyping was performed using the dilution protocol of the Phire Plant Direct PCR kit (NEB) according to the manufacturer's instructions.
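The interpretation of the two genotyping reactions can be summarized in a short sketch. It assumes that each reaction is scored simply as band present or absent, and that the wild-type primer pair does not amplify the transgene; the function name and genotype labels are hypothetical.

```python
def call_genotype(mutant_band, wild_type_band):
    """Infer the DEK1 genotype from two PCR reactions.

    mutant_band:    product from SP/1-3 + LB1 (dek1-3 allele)
    wild_type_band: product from SP/1-3 + ASP/1-3 (wild-type allele)
    """
    if mutant_band and wild_type_band:
        return "dek1-3/+"       # heterozygous
    if mutant_band:
        return "dek1-3/dek1-3"  # homozygous mutant
    if wild_type_band:
        return "+/+"            # wild type
    return "no call"            # both reactions failed; repeat the PCR


# All four possible outcomes for a hygromycin-resistant plant.
for mutant, wild_type in [(True, True), (True, False),
                          (False, True), (False, False)]:
    print(mutant, wild_type, call_genotype(mutant, wild_type))
```

Because dek1-3 is embryo-lethal, a viable hygromycin-resistant plant scored as homozygous for dek1-3 would be consistent with complementation by the transgene.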


Total RNA was isolated from rosette leaves of 5-week-old transgenic A. thaliana lines using the RNeasy plant mini kit (Invitrogen). After DNase I treatment of RNA samples, 800 ng total RNA was reverse-transcribed using 200 units of Superscript III reverse transcriptase (Invitrogen), primed with either 500 ng oligo(dT)12-18 or 250 ng random primers. PCR products were amplified using 1 μl 1:5 diluted cDNA sample as template with 0.1 units of HotStartTaq DNA polymerase (Qiagen), using the antisense primer ASP/RT PCR in combination with one of the following sense primers: SP/At_RT-PCR (AtCysPc–C2L–GFP and animal CysPc/AtC2L–GFP), SP/Zm_RT-PCR (ZmCysPc–C2L–GFP), SP/Pp_RT-PCR (PpCysPc–C2L–GFP) or SP/Mv_RT-PCR (MvCysPc–C2L–GFP). Endogenous A. thaliana DEK1 transcripts were detected using the primers SP/AtDEK1 and ASP/AtDEK1 (Table S2).

Analysis of protein extracts

Total proteins were extracted from fresh leaves of 6-week-old plants and ground in liquid nitrogen. The tissue powder was used for immunoprecipitation followed by Western blot detection. The GFP fusion proteins were immunoprecipitated using the μMACS epitope tag protein isolation kit (Miltenyi Biotec). Eluted immunoprecipitate was loaded on 4–15% gradient gels (Bio-Rad) and separated by SDS–PAGE at 4°C. Proteins were rapidly transferred to nitrocellulose membrane using a pre-chilled semi-dry blotting system (Bio-Rad Laboratories). GFP-fused proteins were detected using anti-GFP N-terminal primary antibodies and horseradish peroxidase-conjugated secondary anti-rabbit antibodies (Sigma-Aldrich).


Samples for scanning electron microscopy were fixed in 2% para-formaldehyde/1.25% glutaraldehyde in 0.1 M PIPES buffer. After overnight fixation at 4°C, the samples were washed in 0.1 M PIPES buffer, dried to the critical point, and gold-coated. Samples were examined using a Zeiss EVO-50 scanning electron microscope. Confocal microscopy was performed using a Zeiss LSM 510 laser scanning confocal microscope.


We are grateful to Peter L. Davies (Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Ontario, Canada) for valuable help with calpain protein alignments. We also thank Ruth Timme and Charles Delwiche (Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD) for performing BLAST searches and sharing the Coleochaete orbicularis DEK1-like EST sequence with us, and Katsuyuki T. Yamato and Takayuki Kohchi (Graduate School of Biostudies, Kyoto University, Japan) for generously providing us with Marchantia polymorpha genomic sequences harboring DEK1 orthologs. We also thank Lex Nederbragt (Department of Bioscience, University of Oslo, Norway) for his help with the Newbler assembly, and Hanna-Kirsti S. Leiros (The Norwegian Structural Biology Centre, University of Tromsø, Norway) for technical assistance with homology modeling of the DEK1 CysPc domain. This work was supported by research grants from the Norwegian Research Council to the Norwegian University of Life Sciences (to O.-A.O.) and Hedmark University College (to O.-A.O.), NSF grant MCB1157824 to MSO and a fellowship from the Norwegian University of Life Sciences to Z.L.