Discovery of rice essential genes by characterizing a CRISPR-edited mutation of closely related rice MAP kinase genes

Authors

  • Bastian Minkenberg,

    1. Intercollege Graduate Degree Program in Plant Biology, The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
    2. Department of Plant Pathology and Environmental Microbiology, The Pennsylvania State University, University Park, PA, USA
  • Kabin Xie,

    1. Department of Plant Pathology and Environmental Microbiology, The Pennsylvania State University, University Park, PA, USA
    Current affiliation:
    1. College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, China
  • Yinong Yang

    Corresponding author
    1. Intercollege Graduate Degree Program in Plant Biology, The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
    2. Department of Plant Pathology and Environmental Microbiology, The Pennsylvania State University, University Park, PA, USA

Summary

The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 nuclease (Cas9) system depends on a guide RNA (gRNA) to specify its target. By efficiently co-expressing multiple gRNAs that target different genomic sites, the polycistronic tRNA-gRNA gene (PTG) strategy enables multiplex gene editing in the family of closely related mitogen-activated protein kinase (MPK) genes in Oryza sativa (rice). In this study, we identified MPK1 and MPK6 (Arabidopsis AtMPK6 and AtMPK4 orthologs, respectively) as essential genes for rice development, based on the preservation of functional MPK alleles and normal phenotypes in CRISPR-edited mutants. The true knock-out mutants of MPK1 were severely dwarfed and sterile, and homozygous mpk1 seeds from heterozygous parents were defective in embryo development. By contrast, heterozygous mpk6 mutant plants completely failed to produce homozygous mpk6 seeds. In addition, the functional importance of specific MPK features could be evaluated by characterizing CRISPR-induced allelic variation in the conserved kinase domain of MPK6. By simultaneously targeting between two and eight genomic sites in the closely related MPK genes, we demonstrated 45–86% frequency of biallelic mutations and the successful creation of single, double and quadruple gene mutants. Indels and fragment deletions were both stably inherited by subsequent generations, and transgene-free mutants of rice MPK genes were readily obtained via genetic segregation, thereby eliminating any positional effects of transgene insertions. Taken together, our study reveals the essentiality of MPK1 and MPK6 in rice development, and enables the functional discovery of previously inaccessible genes or domains with phenotypes masked by lethality or redundancy.

Introduction

New genome-editing tools achieve the mutation of genes by inducing a targeted double-strand break (DSB) that is repaired by the cell. The imperfect non-homologous end-joining (NHEJ) pathway is the prevalent DSB repair mechanism and leads to insertions or deletions (indels) of nucleotides at the target site. These indels can cause a gene knock-out if the mutations prevent or alter the transcription or translation of the gene. Precise repair via homologous recombination is possible if a donor template with homology to the target site is present, but higher plants have a low intrinsic homologous recombination rate (Voytas, 2013). The predominant tools to achieve site-directed DSBs in a genome are zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and, more recently, the bacterial clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 nuclease (Cas9) system. The targeting ability of ZFNs and TALENs relies on protein–DNA interaction based on specific binding domains newly designed for each target (Voytas, 2013). By comparison, CRISPR/Cas9 is much easier to program because the Cas9 endonuclease can be reused to cut a different target site based on the specific base pairing of the so-called single-guide RNA (gRNA) with the target DNA (Jinek et al., 2012). The 20 bases at the 5′ end of the gRNA (spacer) match the genomic target DNA (protospacer) next to the protospacer adjacent motif (PAM, 5′-NGG-3′ in the case of the Streptococcus pyogenes Cas9), and bind to the complementary strand (Jinek et al., 2012; Cong et al., 2013; Mali et al., 2013). CRISPR/Cas9 has been used for targeted mutagenesis in many plants, such as Arabidopsis, Nicotiana tabacum (tobacco), Nicotiana benthamiana, Solanum tuberosum (potato), Solanum lycopersicum (tomato), Glycine max (soybean), Citrus sinensis (sweet orange), Marchantia polymorpha (liverwort), Zea mays (maize), Sorghum, Triticum spp. (wheat), and Oryza sativa (rice) (Jiang et al., 2013; Li et al., 2013; Miao et al., 2013; Shan et al., 2013; Xie and Yang, 2013; Brooks et al., 2014; Jia and Wang, 2014; Sugano et al., 2014; Cai et al., 2015). In addition to its easy design, CRISPR/Cas9 can target most plant genes with high specificity. With SpCas9 alone, 97.1% of all transcription units in Arabidopsis and 89.6% of the transcription units in rice can be targeted with highly specific gRNA spacers (Xie et al., 2014a). The remaining genes could still be targeted with moderate specificity or by Cas9 variants with different PAM requirements.
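The targeting rule described above (a 20-nt protospacer immediately 5′ of an NGG PAM for SpCas9, with the predicted break 3 bp upstream of the PAM) can be illustrated with a minimal Python sketch. The function name, the example sequence and the restriction to a single strand are assumptions made for illustration; the sketch does not score specificity and is not the gRNA design procedure used in this study.

```python
import re

def find_protospacers(seq, spacer_len=20):
    """List candidate SpCas9 sites (protospacer followed by NGG) on one strand."""
    seq = seq.upper()
    # A lookahead keeps overlapping candidates from hiding each other.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    sites = []
    for match in pattern.finditer(seq):
        start = match.start()
        sites.append({
            "start": start,                       # 0-based index of the protospacer
            "protospacer": match.group(1),
            "pam": match.group(2),
            "cut_index": start + spacer_len - 3,  # predicted break lies just before this index
        })
    return sites

# Hypothetical 24-nt sequence: 20-nt protospacer + AGG PAM + one extra base.
example = "ACGT" * 5 + "AGG" + "C"
for site in find_protospacers(example):
    print(site)
```

A real design workflow would also scan the reverse complement and rank candidates by their similarity to other genomic sites.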

One of the biggest advantages of CRISPR/Cas9, however, is its multiplexing ability. Cas9 can simultaneously target multiple sites if several gRNAs are provided or co-expressed. This enables researchers to target multiple genes for mutagenesis and to create double, triple or even decuple mutants in a single step. Most multiplex editing vectors, however, use several gRNA expression cassettes that express each gRNA from its own PolIII promoter (Li et al., 2013; Lowder et al., 2015; Ma et al., 2015a; Zhang et al., 2015b). A single gRNA expression cassette has a length of 500–800 bp, which still makes multiplex editing in plants a challenge, because the number of simultaneously expressible gRNAs is limited by vector capacity and cloning efficiency. Expressing multiple gRNAs from a single transcript instead of multiple individual cassettes is a promising alternative to overcome this limitation.

The polycistronic tRNA-gRNA (PTG) gene system is a new strategy to provide stable and efficient expression of multiple gRNAs from a single transcript (Xie et al., 2015). This synthetic gene consists of tandemly arrayed tRNA-gRNA sequences driven by a PolIII promoter (Figure 1). Every repeat is less than 180 bp long, in comparison with the at least 500 bp needed for an individual cassette. Each gRNA may contain a specific spacer that recognizes a different target, enabling the simultaneous targeting of multiple genomic sites in one transformation. The endogenous RNaseP and RNaseZ recognize the secondary structure of tRNA (Schiffer et al., 2002; Barbezier et al., 2009; Canino et al., 2009; Phizicky and Hopper, 2010; Gutmann et al., 2012) and specifically cut the tRNAs at the 5′ and 3′ sites, respectively, to release the mature gRNAs that direct Cas9 to multiple targets (Figure 1). In addition, the expression of gRNAs with a PTG is up to 30 times higher than expression with a simple PolIII promoter, probably because the A- and B-box of the tRNA can enhance the transcription of the PTG primary transcript (Xie et al., 2015). The ability to simultaneously mutate multiple genes is especially useful to analyze closely related genes, which tend to have overlapping functions. Currently, researchers need to find reliable mutant lines and cross them to obtain double or higher order mutants. The PTG/Cas9 technology significantly improves the analysis of closely related genes because multiple genes can be efficiently targeted in a single transformation.
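The size advantage described above can be made concrete with a short Python sketch. It uses only the two bounds given in the text (less than 180 bp per tRNA-gRNA repeat, at least 500 bp per individual cassette). The function names and placeholder tokens are hypothetical, and the PTG bound ignores the single shared promoter and terminator, whereas each individual cassette carries its own promoter.

```python
PTG_REPEAT_MAX_BP = 180   # "less than 180 bp" per tRNA-gRNA repeat (from the text)
CASSETTE_MIN_BP = 500     # "at least 500 bp" per individual gRNA cassette (from the text)

def assemble_ptg(trna, scaffold, spacers):
    """Concatenate tRNA-spacer-scaffold repeats in the order of the given spacers."""
    return "".join(trna + spacer + scaffold for spacer in spacers)

def size_bounds(n_grnas):
    """Upper bound for a PTG array and lower bound for separate cassettes, in bp."""
    return n_grnas * PTG_REPEAT_MAX_BP, n_grnas * CASSETTE_MIN_BP

for n in (2, 4, 8):
    ptg_max, cassettes_min = size_bounds(n)
    print(f"{n} gRNAs: PTG array < {ptg_max} bp; separate cassettes >= {cassettes_min} bp")

# Placeholder tokens stand in for real tRNA, spacer and scaffold sequences.
print(assemble_ptg("[tRNA]", "[scaffold]", ["[spacer1]", "[spacer2]"]))
```

For eight gRNAs, the bounds differ by more than 2.5 kb, which is the vector-capacity argument made in the paragraph above.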

Figure 1.

The polycistronic tRNA-gRNA gene (PTG) processing system and design of PTGs to target four MPKs. (a) A PTG consists of tRNA-gRNA repeats transcribed by a PolIII promoter. Endogenous RNaseP and RNaseZ recognize the tRNA secondary structure within the primary transcript and cut and release multiple gRNAs. The processed gRNAs complex with Cas9 to target multiple sites. (b) Eight designed gRNAs target four MPK genes. Chromosomal deletion could occur because each gene is targeted by two gRNAs. (c) PTGs co-expressing up to eight gRNAs. PTG2 and PTG6 targeted MPK5. PTG3, PTG4 and PTG5 targeted MPK1, MPK2 and MPK6, respectively. PTG7 and PTG8 targeted MPK1/MPK5 and MPK2/MPK6, respectively, to create double mutants. PTG9 targeted all four MPK genes.

Although the ability of CRISPR/Cas9 to knock out genes in plants and the outcome of NHEJ-mediated repair of non-essential genes are well described (Feng et al., 2014; Jiang et al., 2014; Xu et al., 2015; Zhang et al., 2015a), little is known about the outcomes when essential genes with lethal phenotypes are targeted. Only a few studies in human cancer cell lines have attempted to identify essential genes by introducing libraries of gRNAs (Shi et al., 2015; Wang et al., 2015). The main goal of these studies was merely to identify essential genes that could be targeted for cancer treatment; however, it would be more interesting for the research community to create new allelic variation in single essential genes to study their function, rather than simply identifying them.

In this study, we tested the ability of PTG/Cas9 to create mutant resources for studying essential genes and gene families by successfully mutating four members of the rice mitogen-activated protein kinase (MPK) gene family with up to eight co-expressed gRNAs. MPKs are important signal transducers with partly overlapping functions in Arabidopsis (Asai et al., 2002; Wang et al., 2007, 2008; Beckers et al., 2009; Pitzschke et al., 2009). We have shown that a variety of indels, as well as chromosomal deletions, could be generated and reliably inherited into subsequent generations. We found in the example of rice MPK1 (AtMPK6 ortholog) and MPK6 (AtMPK4 ortholog) genes that indels in their mutants favor the preservation of the open reading frame (ORF), and can create new alleles with deletions of between one and seven amino acids. Although a complete knock-out of MPK1 or MPK6 is lethal, we were able to obtain viable and fertile heterozygous T0 mutants that could facilitate the analysis of an essential gene. This study further demonstrates that closely related genes could be simultaneously edited with the multiplexed PTG/Cas9 technology, resulting in biallelic mutations at all targeted genes in 45–86% of the cases. Transgene-free mutant progeny carrying up to eight mutations could be obtained by genetic segregation and the removal of the PTG/Cas9 transgene, thereby eliminating any positional effects of T-DNA insertions. Hence, the PTG/Cas9 multiplex editing approach can reliably create heritable mutations in essential and closely related genes, and facilitates the functional genetic analysis of essential genes, redundant genes, multi-gene families and complex gene networks.

Results

Highly efficient targeting of four closely related rice MPK genes with PTG/Cas9

We designed constructs for Agrobacterium-mediated transformation to mutate single and multiple stress-responsive rice MPKs and to explore whether PTGs can reliably produce stable mutants to facilitate the analysis of multi-gene families. Closely related MPK genes in Arabidopsis overlap in their function and partly compensate for each other in single mutants. In Atmpk3 or Atmpk6 knock-out lines, the expression, protein, and activity levels of the other kinase are elevated when compared with the wild type (Beckers et al., 2009). The PTG/Cas9 constructs were designed to target single rice MPK genes or combinations of them to overcome possible functional redundancy (Table 1). PTGb3, PTGb4, PTGb5 and PTGb6 carried Cas9 driven by a rice ubiquitin promoter (UBIp:Cas9) and PTGs driven by a rice U3 promoter (U3p:PTG) to target MPK1 (AtMPK6 ortholog; Rohila and Yang, 2007), MPK2, MPK6 (both AtMPK4 orthologs) and MPK5 (AtMPK3 ortholog), respectively, for the creation of single mutants (Table 1). Each of these PTGs encoded two gRNAs targeting distinct sites in the corresponding MPK gene. An additional construct, PTGb2, targeted MPK5 with only one gRNA (PS2; Figure S1). PTGb7 and PTGb8 encoded four gRNAs for targeting MPK5/MPK1 and MPK6/MPK2, respectively, to create double mutants (Table 1). We chose these pairs of MPK genes based on their close phylogenetic relationship (Reyna and Yang, 2006; Rohila and Yang, 2007). Creating double mutants of rice MPK genes might help to discover redundant functionality, as previously observed in Arabidopsis. PTGb9 contained eight gRNAs to mutate all four closely related rice MPK genes in a single transformation event.

Table 1. Design and efficiency of PTG/Cas9 binary vectors used to target four closely related MPK genes

Binary vector   No. of gRNAs   Target genes   Efficiency (b)
PTGb3           2              MPK1           100% (4)
PTGb4           2              MPK2           100% (13)
PTGb5           2              MPK6           100% (2)
PTGb6 (a)       2              MPK5           86% (17)
PTGb2           1              MPK5           100% (9)
PTGb7 (a)       4              MPK5/1         100%/89% (17)
PTGb8           4              MPK6/2         83%/67% (12)
PTGb9 (a)       8              MPK5/1/6/2     86%/86%/86%/86% (14)

(a) Previously described in Xie et al. (2015).
(b) Percentage of genome-edited T0 lines over total number of tested T0 lines (total number of tested lines in parentheses).

Our previous study showed the functionality of all used gRNAs in rice protoplasts and demonstrated high mutation efficiency (86–100%) for PTGb6 (MPK5), PTGb7 (MPK5/1), and PTGb9 (MPK5/1/6/2) in transgenic rice (Xie et al., 2015). In this study, we also targeted MPK1, MPK2, MPK6 and MPK6/2 to investigate whether we could mutate these MPK genes with similarly high efficiencies, and to produce a more complete set of rice MPK mutants. Transformation with PTGb3 (MPK1), PTGb4 (MPK2) and PTGb5 (MPK6) each yielded 100% mutation efficiency (Figures S2 and S3; Table 1). We observed slightly lower efficiencies when simultaneously targeting two MPKs. PTGb8 targeting MPK6 and MPK2 produced 83% efficiency at the MPK6 locus and 67% efficiency at the MPK2 locus (Figure S4); however, only one protospacer per gene provided a convenient restriction enzyme (RE) site for the PCR-RE assay (Figure S1). Therefore, the actual editing efficiency for PTGb8 could be higher than the measured 83 and 67% if both target sites for each gene are considered. The main objective was to create mpk6 mpk2 double mutations, and 67% of the lines showed genome editing at both genes, which yielded putative double mutants (Figure S4, green boxes). Of these putative double mutant lines, 75% had biallelic mutations in both MPK6 and MPK2.

Whereas transformation with PTGb2, PTGb4, PTGb6, PTGb7, PTGb8 and PTGb9 produced an abundance of hygromycin-resistant and putatively edited rice calli and plants, efforts to mutate only MPK1 with PTGb3 or MPK6 with PTGb5 were less successful. We only recovered four and two hygromycin-resistant calli for PTGb3 and PTGb5, respectively, out of three independent repeats of 600 calli in total for each construct. Nevertheless, we showed that the small percentage of recovered lines was mutated with an efficiency of 100% (Figure S2; Table 1). Therefore, PTG/Cas9 can induce single or multiple gene mutations in the closely related members of a gene family with efficiencies of 67–100%.

Editing of essential genes produces viable but heterozygous T0 mutants

We encountered great difficulty in mutating rice calli with PTGb3 and PTGb5 (targeting MPK1 or MPK6, respectively), which led to the hypothesis that MPK1 and MPK6 loss of function is lethal. Whereas PTGb3-1 and PTGb3-2 lines produced normal plantlets, rice calli of lines PTGb3-3 and PTGb3-4 turned black on the medium and mostly died. PTGb3-3 was not able to regenerate any plantlets. Interestingly, even though the callus of line PTGb3-4 carried a homozygous fragment deletion on the MPK1 gene (Figure S2a), we were able to recover two plantlets from the regeneration medium. The recovered plantlets remained severely dwarfed, however, and only produced one sterile panicle (Figure 2a), supporting the hypothesis that the knock-out of MPK1 causes detrimental defects. Similarly to PTGb3-1 and PTGb3-2, the two resistant rice calli obtained from lines PTGb5-1 and PTGb5-2 produced normal plantlets.

Figure 2.

Phenotypes of true mpk1 knock-out plants and seeds from heterozygous parents: (a) Comparison of T0 control plant with T0 PTGb3 plants with the chromosomal deletion of MPK1. Mature plants 4-A and 4-B stayed severely dwarfed and sterile compared with the control; (b) 110 screened seeds from heterozygous MPK1 mutants showed either normal embryos (84 seeds) or abnormally small embryos (26 seeds), marked red.

As we did not observe any abnormal phenotype for the lines of PTGb3-1, PTGb3-2, PTGb5-1 and PTGb5-2, we genotyped their T0 generation plants. All lines carried two different mutations on each allele, with one allele carrying a mutation of either −3 bp or −6 bp (Figure S5). Such mutations are likely to preserve the ORF of the coding sequence because three nucleotides constitute a codon for one amino acid. Therefore, lines PTGb3-1, PTGb3-2, PTGb5-1 and PTGb5-2 are likely to carry a mutated but still functional MPK1 or MPK6 protein that protects the plantlets from the observed detrimental phenotypes of lines PTGb3-3 and PTGb3-4. Taken together, the editing of MPK1 or MPK6 could produce viable T0 generation plants only if one allele carried a mutation that putatively preserved protein function. Although it was impossible to obtain viable homozygous knock-out mutant plants of these essential genes, we show here that at least heterozygous T0 generation mutants of essential genes can be created by exploiting NHEJ-mediated DNA repair. In addition, these heterozygous lines carry new and previously undescribed functional alleles with deletions of some amino acids.
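The codon argument above reduces to a divisibility check on the net indel length. The following Python sketch illustrates that check and the resulting viability rule (at least one allele without a frameshift). The function names are hypothetical. The check is only a first approximation: it applies to indels within coding exons, and an in-frame indel can still remove essential residues or create a stop codon.

```python
def is_frameshift(indel_bp):
    """True if the net indel length (negative for deletions) shifts the reading frame."""
    return indel_bp % 3 != 0

def has_putatively_functional_allele(allele_a_bp, allele_b_bp):
    """True if at least one allele keeps the reading frame (0 means unedited)."""
    return not is_frameshift(allele_a_bp) or not is_frameshift(allele_b_bp)

# Deletions of 3 or 6 bp keep the frame; a 4-bp deletion or 1-bp insertion does not.
for indel in (-3, -6, -4, 1):
    print(indel, "frameshift" if is_frameshift(indel) else "frame preserved")
```

Python's modulo returns a non-negative result for negative operands, so deletions written as negative lengths are handled without taking an absolute value.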

Rice MPK1 and MPK6 loss-of-function phenotypes are caused by different lethal defects

To validate the hypothesis that MPK1 and MPK6 knock-out leads to a detrimental or lethal phenotype, we then tried to germinate seeds of the self-pollinated lines PTGb3-1, PTGb3-2, PTGb5-1 and PTGb5-2 to investigate the phenotype of homozygous progeny for MPK1 or MPK6 loss-of-function mutation. Whereas all seeds of the wild-type control germinated (100%), the germination rates of PTGb3-1 and PTGb3-2 seeds were 70 and 73.8%, respectively (Table 2). The 3:1 Mendelian inheritance of the lethal mpk1 mutation phenotype in T1 progeny was supported by statistical analysis. Therefore, the non-germinating seeds of the mpk1 lines most likely consist of homozygous knock-out mutants with the embryo-lethal loss of MPK1.

Table 2. Effect of mpk1 and mpk6 mutations on the germination rates of seeds from PTGb3 and PTGb5 lines

Line        No. of total seeds   No. of germinated   % germinated   % not germinated   P (a)
PTGb3-1     30                   21                  70             30                 0.53
PTGb3-2     42                   31                  73.8           26.2               0.86
PTGb5-1     30                   29                  96.7           3.3                <0.01
PTGb5-2     31                   28                  90.3           9.7                <0.05
Wild type   38                   38                  100

(a) Derived from χ2 statistics with the expectation of Mendelian inheritance, and that homozygous knock-out seeds will not germinate because of embryo lethality.
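The P values in Table 2 can be approximated with a goodness-of-fit test against a 3:1 ratio of germinating to non-germinating seeds. The Python sketch below assumes an uncorrected χ2 test with one degree of freedom; the exact procedure used by the authors is not stated, so this is an illustration rather than a reconstruction of their analysis. The function name is hypothetical.

```python
import math

def chi_square_3_to_1(germinated, not_germinated):
    """Chi-square statistic and P value against a 3:1 germinated:non-germinated ratio."""
    total = germinated + not_germinated
    observed = (germinated, not_germinated)
    expected = (0.75 * total, 0.25 * total)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With one degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts taken from Table 2 (germinated, not germinated).
for line, counts in {"PTGb3-1": (21, 9), "PTGb3-2": (31, 11),
                     "PTGb5-1": (29, 1), "PTGb5-2": (28, 3)}.items():
    chi2, p = chi_square_3_to_1(*counts)
    print(f"{line}: chi2 = {chi2:.3f}, P = {p:.3f}")
```

Under these assumptions, the two PTGb3 lines do not deviate from 3:1, whereas both PTGb5 lines do, in line with the interpretation that homozygous mpk6 seeds are not formed rather than failing to germinate.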

During the preparation of this manuscript, a study further confirmed that MPK1 loss of function is embryo-lethal by analyzing heterozygous T-DNA insertion mutants (there referred to as OsMPK6 with the use of a different nomenclature system; Yi et al., 2016). The authors found that mpk1 embryos were affected in cell differentiation, which arrested the embryonic development and led to unviable and abnormally small embryos (Yi et al., 2016). We therefore examined the seeds of our heterozygous mpk1 and mpk6 mutants to compare the embryo phenotypes (Figure 2b; Table S1). Of the 110 seeds analyzed from the heterozygous mpk1 mutant lines, 26 showed an abnormally small embryo (Figure 2b), and statistical analysis supported the hypothesis that homozygous mpk1 seeds have defective embryos (Table S1).

When analyzing seeds of heterozygous mpk6 mutants, only a small subset showed an abnormal embryo (Table S1). In addition, the germination assay showed that seeds from PTGb5-1 and PTGb5-2 germinate similarly to seeds of the wild type, with 96.7 and 90.3% germinated seeds, respectively (Table 2). Based on these results, homozygous mpk6 mutation may completely abolish seed development instead of only arresting embryo development. Indeed, DNA sequencing of 17 randomly chosen germinated seedlings from the PTGb5-1 T1 generation confirmed that no homozygous mpk6 knock-out seeds could be obtained from heterozygous parents (Table S2). To validate the hypothesis that MPK6 loss of function abolishes seed development, we investigated the seed-setting rates of wild-type plants versus homozygous, ORF-preserving mpk6 plants (−3 bp mutation in both alleles) versus heterozygous mpk6 plants (−3 bp in allele a and −4 bp in allele b). The seed-setting rate was defined as number of seeded spikelets over the number of total spikelets per panicle. As expected, wild-type plants and homozygous mpk6 plants with ORF-preserving mutations displayed similar seed-setting rates of 92.5 and 94.0%, respectively (Figure S6). Heterozygous mpk6 mutants, on the other hand, displayed a reduced seed-setting rate of 57.7%, indicating that either pollen carrying the −4 bp allele or egg cells with the same −4 bp allele are infertile. We therefore conclude that the lethality of MPK1 and MPK6 loss of function is caused by different developmental defects preventing the development of embryos or seed, respectively.

Mutations in essential and non-essential genes were transmitted to T1 generations

We next investigated inheritance for mutations in essential and non-essential genes by genetically characterizing the progeny of our multiple mpk mutant plants. We chose to analyze the progeny of self-pollinated T0 PTGb9 plants that carry biallelic mutations at eight targeted sites. The T1 generation of these biallelic mutants was easy to genotype because five of the eight targeted genomic sites encompass RE-sites that were destroyed at both alleles by the targeted mutations (Figures 3 and S1). We also screened the T1 plants obtained to find transgene-free progeny. After self-pollination, transgene-free mutants could be obtained through genetic segregation and removal of the T-DNA fragment containing PTG/Cas9. A transgene-free mutant plant should not contain any DNA encoding either the PTG or the Cas9 protein. PCRs with primer pairs amplifying the 1.9-kb U3p:PTG9 cassette or a 1-kb fragment of the Cas9 gene confirmed that plants 2-1, 3-2 and 4-2 were transgene free (Figure S7, blue arrows). In contrast, the transgenic parent plants (T0) and plant 1-2 showed the presence of U3p:PTG9 and Cas9 in their genomic DNAs (Figure S7). Primers amplifying an endogenous rice gene confirmed that the DNA in all samples was of sufficiently good quality for PCR (Figure S7, control). Any mutations detected in these transgene-free plants should be the result of inheritance because PTG/Cas9 is no longer present to induce any new mutations.

Figure 3.

T1 rice plants of four PTGb9 lines carrying eight biallelic mutations in four genes. (a) PCR-RE assay. The PCR product of mutated DNA cannot be digested by the restriction enzyme (RE; highlighted in blue). PCR products of the corresponding genes in all four T1 plants are resistant to digestion (red arrow), indicating indel mutations at these sites. MPK2 possesses two EcoNI sites, and digestion of the PCR product from mutated DNA yields two bands: CK+, digested wild-type control; CK−, undigested wild-type control. (b) Sequences were obtained by direct sequencing of the PCR products or cloned fragments. Lower case letters (a/b) behind the plant numbers indicate different alleles. The scissors indicate the predicted double-strand break site 3 bp upstream of the PAM (red letters). Protospacer sequences are highlighted in green. Dash: deletion. Lower case letter: insertion or substitution.

To genotype the mutations in the T1 plants, we amplified the targeted regions of the four genes for PCR-RE assays. The PCR product from wild-type DNA could be digested, whereas the PCR product from T1 mutant plant DNAs was indigestible, indicating that the mutations at the protospacers PS3, PS5, PS2, PS1 and PS7 (Figure S1) were stably transmitted to the T1 generation (Figure 3a, red arrows). We further tested whether mutations could also be found at the remaining three sites. Because it is not possible to detect mutations at PS4, PS6 and PS8 with a PCR-RE assay, and a T7 endonuclease I assay is not suitable to detect homozygous mutations (Xie et al., 2014b), we decided to directly sequence the PCR products. The results from the direct sequencing of PCR products also reveal the zygosity of the T1 mutant plants. The sequencing result consists of distinct single peaks if both alleles carry the same mutation and are homozygous. If the gene carries a different mutation on each allele and is heterozygous, the result will consist of ambiguous double peaks, usually starting from the Cas9 targeting site (Figure S5, black arrow). The sequencing results confirmed mutations at all eight genomic target sites of the four MPK genes in T1 generation plants of PTGb9. We conclude that all eight mutations were faithfully inherited from the T0 into the T1 generation (Figure 3b); however, each plant had a different degree of heterozygosity. Plant 2-1 carried only homozygous mutations. Plant 3-2 was homozygous for mutations in MPK2, MPK5, and MPK6, but heterozygous for mutations in MPK1. Plant 1-2 was homozygous for MPK2 and MPK5 mutation sites, but heterozygous at MPK1 and MPK6. Plant 4-2 carried homozygous mutations in MPK1 and MPK5, but was heterozygous for mutations in MPK2 and MPK6 (Figure 3b). Even though the zygosity varied, all lines carried biallelic mutations. The progeny inherited all mutations regardless of the essential functions of MPK1 and MPK6, but we went on to analyze whether the inherited mutations would preserve allele functionality in these cases.
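The two genotyping readouts described above, loss of a restriction site in the PCR-RE assay and zygosity from allele sequences, can be sketched in Python as follows. The recognition site (GAATTC, the EcoRI site) and the toy sequences are placeholders chosen for illustration and are not the enzymes or amplicons used in this study; the function names are hypothetical.

```python
def is_digestible(amplicon, recognition_site="GAATTC"):
    """PCR-RE readout: the product is cut only if the recognition site is intact."""
    return recognition_site in amplicon.upper()

def classify_site(wild_type, allele_a, allele_b):
    """Classify a target site from the sequences of its two alleles."""
    a_mutated = allele_a != wild_type
    b_mutated = allele_b != wild_type
    if not a_mutated and not b_mutated:
        return "wild type"
    if a_mutated and b_mutated:
        # Same mutation on both alleles gives single sequencing peaks.
        return "biallelic, homozygous" if allele_a == allele_b else "biallelic, heterozygous"
    return "monoallelic"

wild_type = "AAGAATTCGG"        # toy amplicon containing the placeholder site
deletion = "AAGATTCGG"          # 1-bp deletion that destroys the site
insertion = "AAGAAATTCGG"       # 1-bp insertion that destroys the site

print(is_digestible(wild_type), is_digestible(deletion))
print(classify_site(wild_type, deletion, deletion))
print(classify_site(wild_type, deletion, insertion))
```

Note that a mutation outside the recognition site leaves the product digestible, which is why sites without a suitable restriction site were sequenced directly.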

Inherited mutations in essential genes favor the preservation of ORFs and functional alleles

In agreement with our previous results that loss-of-function mutations in MPK1 and MPK6 are unfavorable, we detected an enrichment of inherited ORF-preserving mutations for the MPK1 and MPK6 genes in T1 plants of the PTGb9 lines (Figure 4): 66% of the MPK1 mutations affected three or multiples of three nucleotides (Figure 4a), and 75% of such mutations were found in MPK6 (Figure 4a). When only considering mutations in exons, 100% of the mutations in MPK1 preserved the ORF, whereas only 66.7% preserved the ORF in MPK6 (Figure 4b). The mutations that contributed to the 33.3% of non-preserving mutations in MPK6 exons were heterozygous mutations, however, and were always paired with one ORF-preserving mutation (Figure 4b,d). By contrast, mutations in the non-essential genes MPK2 and MPK5 generally altered and interrupted the ORF of the sequence (Figure 4a,b).

Figure 4.

Mutations in the MPK1 and MPK6 loci preserve the open reading frame (ORF). (a) Frequency of ORF-preserving mutations versus non-preserving mutations in the four targeted genes of PTGb9 T1 plants. (b) Frequency of ORF-preserving mutations only considering exons (*33.3% of non-preserving alleles in MPK6 exons were always paired with one ORF-preserving mutation in heterozygous plants). (c) Predicted MPK1 protein sequence of the mutant alleles compared with the wild-type sequence. The mutations in MPK1 resulted in a deletion of between one and five amino acids, but kept the predicted protein kinase domains intact, which start at amino acid 67 in the wild-type protein. (d) Predicted MPK6 protein sequence of the mutant alleles compared with the wild-type sequence. Note that sequences of allele 1-2a and 4-2b produced a premature stop, but both plants possessed a second allele, 1-2b and 4-2a, respectively, which kept most of the predicted protein kinase domain starting from amino acid 46 intact. The start of the conserved MPK signature site is marked with a black box. (e) Predicted MPK5 protein sequence of the mutant alleles compared with the wild-type sequence. Mutations in all alleles produce a premature stop of the protein sequence before the predicted protein kinase domains of the wild-type protein, starting from amino acid 35.

To predict the effect of the mutations in these new MPK1 and MPK6 alleles, we translated the coding sequence of the mutants into a protein and compared the first 60 and 120 amino acids, respectively, with the wild-type sequence (Figure 4c, d). For MPK1 protein sequences, plant 1-2 possessed two different mutant alleles with one translating into a protein shortened by five amino acids (QATLS; 1-2a) and another into a protein shortened by one amino acid (S; 1-2b). Plants 2-1, 3-2 and 4-2 carried a different mutation from 1-2b, but translated into the same protein sequence that is shortened by one amino acid (Figure 4c). An analysis with InterProScan revealed that all mutations left the predicted protein kinase domain intact, which starts at amino acid 67 in the wild-type protein. Interestingly, for MPK6 alleles, we were able to detect sequences that result in premature stops for plant 1-2 and plant 4-2 (1-2a allele and 4-2b allele; Figure 4d). These mutations are likely to interrupt MPK6 protein function; however, each of these alleles was paired with another allele (1-2b and 4-2a) that preserved the ORF and most of the protein sequences (Figure 4d). Plants 1-2 and 4-2 might possess at least one functional MPK6 allele. Plants 2-1 and 3-2 carried mutations, which resulted in the deletion of one amino acid (D; 2-1/3-2/4-2a). In contrast to the results from MPK1 and MPK6 alleles, all new MPK5 alleles detected in the PTGb9 lines resulted in a premature stop of the MPK5 protein (Figure 4e). The inheritance of mutations in MPK1 and MPK6 was strongly biased towards ORF-preserving mutations, probably because of the lethality of MPK1 or MPK6 knock-out; however, the mutations present new, undescribed and functional alleles of these genes.
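The comparison of predicted protein sequences can be illustrated with a short Python sketch that translates a coding sequence with the standard genetic code and stops at the first stop codon. The toy sequences below are invented for illustration and are unrelated to the MPK genes; they only show how an in-frame deletion removes an amino acid whereas a 1-bp deletion can create a premature stop.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(cds):
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

wild_type = "ATGGCCTTAGCAAAATGA"             # toy ORF
in_frame = wild_type[:6] + wild_type[9:]     # 3-bp deletion removes one codon
frameshift = wild_type[:5] + wild_type[6:]   # 1-bp deletion shifts the frame

print(translate(wild_type))
print(translate(in_frame))
print(translate(frameshift))
```

Comparing the three outputs position by position mirrors the alignment of mutant and wild-type sequences shown in Figure 4c–e.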

Parts of the MAP kinase conserved site are unimportant for MPK6 function

Although all mutations in the MPK1 mutant alleles occurred before the predicted protein kinase domains, one target site for MPK6 was placed inside the conserved domain of MAP kinases (PS01351). The start of this conserved region is important for proper MPK structure and function in humans (Cobb and Goldsmith, 1995; Dorin et al., 1999; Rodriguez-Viciana et al., 2006). Interestingly, a seven amino acid deletion in allele 1-2b destroyed the first few amino acids (F-DNHIDA) of this important MPK signature pattern (Figure 4d). The second allele of the same plant (1-2a) possessed a different mutation that disrupted the ORF and produced a premature stop (Figure 4d). This suggests that the mutated MPK6 protein with partial deletion of the conserved MPK signature is still functional in protecting the plant from lethality. Therefore, these seven amino acids, even though highly conserved among MPKs, are not important for at least partial functions of the MPK6 protein.

PTG/Cas9-mediated editing induces a variety of mutations

We noticed a general tendency of our PTG/Cas9-induced mutations to affect multiple nucleotides. Most previous studies using CRISPR/Cas9 in Arabidopsis and rice reported a strong tendency to 1-bp indels in the targets of transformed plants (Endo et al., 2014; Feng et al., 2014; Hyun et al., 2014; Zhang et al., 2014; Mikami et al., 2015). In contrast, other studies identified longer deletions (≥3 bp) as the main type (Zhou et al., 2014; Xu et al., 2015). We analyzed the sequencing results of edited sites in T0 plants for all constructs to investigate the type of indels produced by PTG/Cas9. The PTG/Cas9-induced mutations in a total of 54 independent sites were slightly biased towards 1-bp insertions (29.6%; Figure 5), but to a much lower degree than in previous reports (37–54%; Feng et al., 2014; Zhang et al., 2014).

Figure 5.

Frequency of PTG/Cas9-induced indels in transgenic rice lines. The detected indel mutations and their frequencies were calculated based on a sample of 54 sequences derived from independent mutational events. The calculated frequencies of all insertions and deletions are shown in the box.

A recent study found that TALENs produce 69.9% deletions, with 81.1% of them affecting multiple nucleotides in rice (Zhang et al., 2015a); however, the overall efficiency reached only 25%, even though the codons were optimized for TALEN expression in rice (Zhang et al., 2015a). In our PTG/Cas9 system with efficiencies of 67–100%, deletions accounted for 64.8% of the mutations (Figure 5), with 59.3% of all mutations affecting multiple nucleotides. This result shows that PTG/Cas9 produces a variety of indels similar to that of TALENs, but advantageously exhibits a much higher targeting efficiency, even at multiple sites. Deletions were the most frequent and the largest mutations, reaching up to 74 bp (Table S3). In contrast, insertions and conversions were observed only for up to 4 or 3 bp, respectively. The observed variety of indels was greater for the MPK5 and MPK6 loci (up to 74 and up to 48 bp, respectively), compared with MPK1 and MPK2 (up to 15 and up to 11 bp, respectively), suggesting that the target region may influence the mutational variety (Table S3).
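As a worked check of the frequencies quoted from the 54 independent mutational events, the short sketch below converts event counts into percentages. The counts are back-calculated here from the reported percentages (an assumption; the per-class counts are not stated in the text), and the names are illustrative.

```python
TOTAL_EVENTS = 54  # independent mutational events analyzed (from the text)

# Counts inferred from the reported percentages; not taken from the paper.
inferred_counts = {
    "1-bp insertions": 16,             # reported as 29.6%
    "deletions": 35,                   # reported as 64.8%
    "multi-nucleotide mutations": 32,  # reported as 59.3%
}


def percent(count: int, total: int = TOTAL_EVENTS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


for label, count in inferred_counts.items():
    print(f"{label}: {count}/{TOTAL_EVENTS} = {percent(count)}%")
```

The categories overlap (a deletion can also be a multi-nucleotide mutation), so the percentages are not expected to sum to 100.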

PTG/Cas9 allows precise and inheritable chromosomal deletions

Polycistronic tRNA-gRNAs (PTGs) provide an efficient way to express multiple gRNAs simultaneously. Targeting a coherent chromosomal region with two gRNAs can result in the deletion of the fragment between both sites. We then asked whether these chromosomal deletions are inherited according to Mendelian rules. To address this question, we analyzed the T1 progeny of PTGb9 lines that were heterozygous for a fragment deletion in MPK5. We assumed a full-length allele if the PCR product amplified by the primers flanking both target sites was the same size as predicted from the wild-type genomic sequence. On the other hand, a PCR product that was 727 bp smaller would indicate a deletion. The simultaneous occurrence of both the full-length and the truncated fragments was interpreted as heterozygosity. In the PTGb9 T0 generation, lines 4, 5 and 6 carried a full-length MPK5 gene on one allele and a 727-bp deletion on the other allele, as confirmed by sequencing (Figures S8a and S9). The deletion occurred at the precise joining points of the breaking sites, 3 bp upstream of the PAMs (Figure S9). The initial PCR genotyping of two T1 plants from each line revealed that the chromosomal deletion could be stably inherited (Figure S8a). T1 plants 4-2 and 6-1 were homozygous for the full-length and truncated alleles, respectively, demonstrating that identical alleles can be fixed in the genome by chromosome reassortment.
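The genotype-calling rule described above (full-length band only, truncated band only, or both) can be written as a small decision function. This is a sketch under stated assumptions: the wild-type amplicon size of 1500 bp and the 30-bp sizing tolerance are hypothetical values chosen for illustration, and only the 727-bp deletion size comes from the text.

```python
def call_genotype(bands_bp, full_length_bp, deletion_bp=727, tolerance_bp=30):
    """Call the MPK5 deletion genotype from PCR band sizes estimated on a gel.

    bands_bp: observed band sizes in bp for one plant.
    full_length_bp: amplicon size predicted from the wild-type sequence
        (hypothetical in the example below).
    tolerance_bp: allowed sizing error when matching a band (assumption).
    """
    truncated_bp = full_length_bp - deletion_bp
    has_full = any(abs(b - full_length_bp) <= tolerance_bp for b in bands_bp)
    has_truncated = any(abs(b - truncated_bp) <= tolerance_bp for b in bands_bp)
    if has_full and has_truncated:
        return "heterozygous"
    if has_full:
        return "homozygous full-length"
    if has_truncated:
        return "homozygous deletion"
    return "undetermined"


# Hypothetical wild-type amplicon of 1500 bp; the deletion allele is then 773 bp.
print(call_genotype([1500, 773], full_length_bp=1500))
print(call_genotype([770], full_length_bp=1500))
```

A band-size call of this kind only suggests the genotype; in the study the deletion junctions were confirmed by sequencing.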

We further analyzed the progeny of lines 4, 5 and 6 for the 1:2:1 segregation ratio expected in the T1 progeny of heterozygous T0 plants. We tested a subset of eight T1 plants from line 4, ten T1 plants from line 5 and nine T1 plants from line 6. The results from all three tested lines supported the assumption that the inheritance of the deletion and full-length alleles followed Mendelian segregation (Figure S10). Additionally, T1 plant 6-1, which was homozygous for the deletion, produced T2 progeny containing only the chromosomal deletion fragment (Figure S8c). These results underline that PTG/Cas9-induced chromosomal deletions are normally inherited in a Mendelian fashion. If a plant is heterozygous for a deletion, homozygous progeny can be obtained by self-pollination and maintained in future generations.
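One common way to compare observed genotype counts with a 1:2:1 expectation is a chi-square goodness-of-fit test, sketched below. The text does not state which test was applied, and the genotype counts used here are hypothetical (only the sample size of ten T1 plants for line 5 comes from the text). With two degrees of freedom the chi-square survival function has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def chi_square_1_2_1(n_full, n_het, n_del):
    """Chi-square goodness-of-fit test against a 1:2:1 segregation ratio.

    Returns (chi2, p_value). With three genotype classes there are two
    degrees of freedom, for which the p-value is exactly exp(-chi2 / 2).
    """
    total = n_full + n_het + n_del
    observed = (n_full, n_het, n_del)
    expected = (total / 4, total / 2, total / 4)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


# Hypothetical counts for ten T1 plants:
# full-length homozygotes, heterozygotes, deletion homozygotes.
chi2, p = chi_square_1_2_1(3, 5, 2)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

A large p-value means the counts are compatible with 1:2:1 segregation. With samples as small as eight to ten plants the chi-square approximation is rough, so an exact multinomial test may be preferable.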

Discussion

A key difference between nuclease-induced mutation and insertional mutation via T-DNA or the Tos17 retrotransposon in rice is that nucleases like Cas9 merely cut the DNA, and the resulting mutation depends on the NHEJ repair mechanism of the cell. We demonstrated that NHEJ-induced mutations preserving the functional ORF were selected and maintained in the resulting mutants of essential genes. Shi et al. (2015) made similar observations while identifying essential genes in cancer cells by negative selection via CRISPR/Cas9. Mutations in the 5′ coding exons of essential genes preserved the existing ORF, and the variants remained functional (Shi et al., 2015). It is therefore possible to judge the essentiality of genes based on the observed mutational types. If ORF-preserving mutations dominate, one might hypothesize that the function of this gene is essential. Interestingly, new functional alleles with deletions of amino acids can be obtained with this approach, as shown with our MPK1 and MPK6 lines.

The enrichment for ORF-preserving mutations in MPK1 and MPK6 led to our discovery of their essential roles in rice development. We demonstrated the embryo lethality of MPK1 loss of function based on abnormally small embryos and reduced germination rate (Figure 2; Tables 2 and S1). While this manuscript was in preparation, a study by Yi et al. (2016) further confirmed this embryo lethality by analyzing heterozygous T-DNA mutants (there referred to as OsMPK6). The mpk1 mutant exhibited defects in cell differentiation and arrest of embryonic development at the globular stage (Yi et al., 2016). In agreement, previous studies could only analyze cell lines with the mpk1 mutation, and failed to report viable mpk1 mutant plants (Kurusu et al., 2005; Kishi-Kaboshi et al., 2010); however, knock-out mutants of AtMPK6, the Arabidopsis ortholog of MPK1, show no obvious growth or developmental defects (Wang et al., 2007). Only an Atmpk6 Atmpk3 double mutation is lethal, and in an Atmpk6 background, AtMPK3 is haploinsufficient, causing defective ovule development and female sterility (Wang et al., 2007, 2008). The knock-out of either AtMPK6 or AtMPK3 in Arabidopsis can be at least partially compensated for by the remaining MPK, whereas we showed that the mere knock-out of rice MPK1 already caused detrimental defects.

Whereas previous studies already hinted at the potential lethality of MPK1 knock-out in rice, no such data were available for MPK6. We discovered the essentiality of MPK6 based on ORF-preserving mutations and the fact that heterozygous parents failed to produce homozygous knock-out seeds (Figure 4; Table S2). The cause of lethality is different for MPK6 compared with MPK1 knock-out because an abnormal embryo was not found in MPK6 mutant seeds. The MPK6 loss of function may cause no seed development at all, rather than just arresting embryo development, as supported by the reduced seed-setting of heterozygous MPK6 plants (Figure S6). Importantly, based on a seven amino acid deletion at the MAP kinase site of MPK6 (Figure 4d), we show that this conserved domain can be disturbed without a complete loss of MPK6 function. This region was previously hypothesized to be important for proper MPK protein structure and function (Cobb and Goldsmith, 1995; Dorin et al., 1999). A mutation in the same site was associated with disease in humans (Rodriguez-Viciana et al., 2006).

Similar to MPK1, the essentiality of MPK6 was unexpected, based on research in different species. In Arabidopsis, the homozygous knock-out mutant of AtMPK4 (the ortholog of rice MPK6) exhibits a severely dwarfed phenotype but remains viable (Petersen et al., 2000; Brodersen et al., 2006). This example indicates that discoveries from Arabidopsis are sometimes non-transferable to monocot crops. By characterizing genome-edited mutants, our study reveals that rice MPK1 and MPK6 have essential and non-redundant functions in plant growth and development, which is different from Arabidopsis where single MPK genes are dispensable. In addition, gene editing provides new alleles with the deletion of several amino acids, which can be used to further dissect the function of essential genes. The creation of new allelic variants was previously not possible with insertional mutants, like T-DNA lines.

Based on our study, we suggest the following strategy to study essential genes: (i) Use new genome-editing technologies because the NHEJ-mediated mutations can produce heterozygous mutants with new alleles that allow the analysis of at least parts of the function of the gene. (ii) Target conserved protein sites and domains to induce the deletion of putatively important amino acids. If the site-defining amino acids are deleted, but the protein is still functional and the allele is inherited into homozygous mutants, one can conclude that the targeted site or domain is not critical for the studied phenomena. By contrast, if a new allele cannot be inherited into future generations, then the targeted protein site or domain is essential for proper protein function. (iii) Use highly efficient genome-editing techniques like PTGs to increase the chance of multiple bp deletions and the excision of multiple amino acids. In some cases, however, it may be necessary to complement CRISPR/Cas9-mediated mutagenesis with the methods of gene suppression (e.g. RNA interference, artificial miRNAs or CRISPR interference). One limitation of only analyzing CRISPR-edited mutations is that heterozygous mutants might lack a phenotype, even if the functional allele has amino acid deletions. Gene suppression might help uncover a phenotype without causing lethality by significantly reducing the level of gene transcripts without completely eliminating it.

The multiplexed PTG/Cas9 genome-editing technology shows promise for analyzing not only single genes but also multi-gene families. The analysis of PTGb8 plants showed that 67% of all obtained lines were putative double mutants for MPK6 and MPK2 (Figure S4). This high frequency of biallelic mutations at multiple target sites minimizes the effort to screen for multi-gene knock-outs. Furthermore, it is possible to program the PTG/Cas9 system to delete chromosomal fragments (Figure S8). This enables researchers to remove whole genes or regulatory elements from the genome. In addition, mutants could be self-pollinated to remove the genome-editing device and to produce transgene-free mutants (Figure S7), eliminating possible positional effects of random T-DNA integration. Our study also demonstrated that chromosomal deletions were inherited and segregated in a Mendelian manner (Figure S10), an analysis that was missing in some previous reports (Zhou et al., 2014; Gao et al., 2015; Xie et al., 2015).

It is a matter of debate whether TALEN and CRISPR/Cas9 create different types of indel mutations in Agrobacterium-transformed plants. Zhang et al. (2015a) showed that TALEN creates a more diverse array of indels than CRISPR/Cas9 because of their differences in inducing the DSB. We found that boosting editing efficiency could also increase the diversity of indels created by CRISPR/Cas9 (Figure 5; Table S3). The PTG system drastically increases gRNA expression compared with the conventional method (Xie et al., 2015). Our results indicate that increased editing efficiency enriches multiple base-pair mutations and the diversity of indels. Recent studies showed that Cas9 can cleave target sites with 1-bp indels in the target region when either the target DNA or RNA spacer creates a bulge to realign the RNA:DNA heteroduplex (Lin et al., 2014; O'Geen et al., 2015). The high gRNA expression and editing efficiency of PTG/Cas9 might promote the additional cleavage of mutated target sites with 1-bp indels, and therefore cause the shift towards multiple bp deletions (Figure 3). We found this to be true regardless of the number of co-expressed gRNAs, as we used data from all constructs with between one and eight co-expressed gRNAs.

In conclusion, our study demonstrates that NHEJ-mediated mutation created by CRISPR/Cas9 genome editing can facilitate functional analysis of both essential and non-essential genes in plants. Based on the analysis of genome-edited mutations, including ORF-preserving frequency and new alleles, we propose specific recommendations for studying essential genes and important protein domains. We discovered essential and non-redundant functions of MPK1 and MPK6 in rice development, which was an unexpected finding as single knock-outs of their orthologs in Arabidopsis are viable. We further demonstrated the high effectiveness of PTG/Cas9 to create a variety of heritable mutations at multiple sites, which facilitates the creation of different allelic variants and the functional discovery of essential genes or protein domains.

Experimental procedures

Plant materials and growth conditions

Rice cultivar Kitaake (O. sativa ssp. japonica) was used in this study. Seeds were dried for 36–48 h at 45°C to break dormancy before germination in warm water (37°C) for 2 days. The germinated seeds were planted into METROMIX 360 Soil (SunGro Horticulture, http://www.sungro.com) and grown in a glasshouse with 12 h of supplemental light per day at 28°C day/23°C night. Plants were fertilized with 0.25% urea and 0.1% Sprint iron solution after the first week, and fertilized with 0.25% urea in subsequent weeks until flowering.

PTG/Cas9 gene constructs

The PTG constructs PTGb6, PTGb7 and PTGb9 have been previously described (Xie et al., 2015). PTGb3, PTGb4, PTGb5, PTGb2 and PTGb8 were constructed by inserting the previously described and assembled PTGs PTG3, PTG4, PTG5, PTG2 and PTG8 (Xie et al., 2015) into the BsaI-digested pRGEB32 binary vector (Addgene Plasmid #63142; http://www.addgene.org).

Agrobacterium-mediated rice transformation

Binary vectors were transformed via electroporation into the Agrobacterium tumefaciens strain EHA105. Rice calli derived from mature seeds of the cultivar Kitaake were transformed with the Agrobacterium-mediated method, according to a previously described protocol (Hiei and Komari, 2008).

Genomic DNA extraction

DNA was extracted from 100 to 200 μl of liquid nitrogen-ground leaf material by adding 0.9 ml of prewarmed extraction buffer (140 mM sorbitol, 220 mM Tris-hydrochloride, pH 8.0, 22 mM ethylenediaminetetraacetic acid, 800 mM sodium chloride, 34 mM sarkosyl, and 22 mM cetyl trimethylammonium bromide), with incubation at 65°C for 1 h in a 1.5-ml reaction tube. After adding 400 μl of chloroform:isoamyl alcohol (24:1), the sample was mixed on a rotator for 20 min at room temperature (around 22°C) before centrifugation at 13 400 g for 15 min. A 2/3 volume of isopropanol was added to the upper aqueous phase before incubation at −20°C for 30 min to precipitate DNA. The DNA was pelleted by centrifugation at 13 400 g for between 30 sec and 5 min, and the pellet was washed with 70% ethanol before incubation with TE buffer (10 mM Tris-hydrochloride, pH 8.0, and 1 mM ethylenediaminetetraacetic acid) containing 0.1 mg ml−1 RNase A at 37°C for 30 min. The DNA was precipitated by adding 1/10 volume of 3 M sodium acetate, pH 5.2, and 2.0–2.5 volumes of absolute ethanol with incubation at −20°C overnight. The DNA was pelleted and washed with 70% ethanol before dissolving the dried pellet in an appropriate volume of TE buffer. The concentration was measured by a spectrophotometer.

Genotyping of genome-edited progeny

Polymerase chain reactions (PCRs) were performed with GoTaq DNA Polymerase in GoTaq Reaction Buffer (Promega, http://www.promega.com). DNA fragments were analyzed by electrophoresis with a 1% agarose gel and stained with ethidium bromide. The primers used can be found in Table S4. The genome-editing device in the progeny was detected with primers specific for the 1.9-kb U3p:PTG9 cassette and a 1-kb Cas9 fragment. Chromosomal deletions were detected by PCR with primers flanking the two target sites of each gene. Indels on targets with restriction enzyme (RE) sites were detected by PCR-RE assay: PCR products encompassing the target were digested with the appropriate RE for 2–3 h. Selected PCR products were sequenced to determine the specific mutation. Double peaks were resolved using degenerate sequence decoding (Ma et al., 2015b). If the double peaks could not be decoded, the PCR product was cloned into pGEM-T Easy vector (Promega) by TA cloning.

Germination assay, embryo phenotyping and seed-setting rate

Germinated seeds were counted after 2 days and the resulting data were analyzed statistically. For embryo phenotyping, dried seeds of the mutants and wild-type were dehusked and examined under a stereomicroscope. The seeds were categorized as either having a normal embryo or an abnormal embryo (Figure 2b). The seed-setting rate was measured by counting the number of seeded spikelets over the number of total spikelets per panicle of mature plants.
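The seed-setting rate defined above is a simple ratio per panicle, which can be computed as in the sketch below. The spikelet counts are hypothetical and the function name is illustrative.

```python
def seed_setting_rate(seeded_spikelets: int, total_spikelets: int) -> float:
    """Fraction of spikelets on a panicle that set seed."""
    if total_spikelets <= 0:
        raise ValueError("total_spikelets must be positive")
    if not 0 <= seeded_spikelets <= total_spikelets:
        raise ValueError("seeded_spikelets must be between 0 and total_spikelets")
    return seeded_spikelets / total_spikelets


# Hypothetical counts (seeded, total) for three panicles of one plant.
panicles = [(62, 80), (55, 75), (70, 90)]
rates = [seed_setting_rate(seeded, total) for seeded, total in panicles]
mean_rate = sum(rates) / len(rates)
print(f"mean seed-setting rate: {mean_rate:.1%}")
```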

Alignment of protein sequences and prediction of functional domains

The predicted protein sequences of the MPK1, MPK6 and MPK5 mutant alleles were aligned with their wild-type sequence (MSU RGAP 7) by the Clustal Omega multiple sequence alignment tool (Li et al., 2015). The functional protein domains of the sequences were predicted with the web-based InterProScan tool (Jones et al., 2014).

Gene accession numbers

The GenBank RefSeq accession numbers of the targeted genes are Os06g0154500 (MPK1), Os08g0157000 (MPK2), Os03g0285800 (MPK5) and Os10g0533600 (MPK6).

Acknowledgements

This work was supported by Monsanto's Beachell-Borlaug International Scholars Program and Penn State's J. Lloyd Huck Dissertation Research Grant to B.M. This work was also supported by an Agriculture and Food Research Initiative Competitive Grant no. 2013-68004-20378 from the USDA National Institute of Food and Agriculture. The authors declare no conflicts of interest.
