The recombinant clustered regularly interspaced short palindromic repeats (CRISPR)/Cas system has opened a new era for mammalian genome editing. Here, we constructed pX330 plasmids expressing humanized Cas9 (hCas9) and single guide RNAs (sgRNAs) against mouse genes and validated them both in vitro and in vivo. When we randomly chose 291 target sequences within protein coding regions of 73 genes, the average number of off-target candidates (exact match of 13 nucleotides from the 3′ end of the target plus NGG) found by Bowtie software was 9.2 ± 21.0 (~1.8 times more than the estimated value, 5.2). We next validated their activity by observing green fluorescence reconstituted by homology dependent repair (HDR) of an EGFP expression cassette in HEK293T cells. Of the pX330 plasmids tested, 81.8% (238/291) were found to be functional in vitro. We finally injected the validated pX330 plasmids in their circular form into mouse zygotes against 32 genes (including two genes previously tested) and obtained mutant mice at a 52.9 ± 22.3% (100/196) mutation frequency. Among the pups carrying mutations on the autosomes, 43.6% (47/96) carried the mutations in both alleles. When off-target candidate sites were examined in 63 mutant mice, 0.8% (3/382) were mutated. We conclude that our method provides a simple, efficient, and cost-effective way for mammalian gene editing that is applicable to large scale mutagenesis in mammals.
Gene knockout organisms are excellent models for studying human genetic diseases, as well as gene functions involved in behavior, physiology and development (Guan et al. 2010). Conventional designed mutagenesis usually involves introducing a drug resistance gene via homologous recombination in embryonic stem (ES) cells, culturing the ES cells, aggregating them with preimplantation embryos, transplanting the embryos into a pseudopregnant female, and then confirming germ line transmission by mating the chimeric offspring (Skarnes et al. 2011). Although this approach is commonly used, it remains arduous, expensive and time consuming, and it requires a highly trained researcher to master all of the techniques necessary for successful mutagenesis.
The next generation of targeted mutagenesis arguably began with the emergence of zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) (Gaj et al. 2013). ZFNs and TALENs are artificially generated by fusing a FokI endonuclease with a DNA recognition motif. The resulting enzymes recognize target DNA via peptide-DNA affinity. The FokI endonuclease component of the ZFN/TALEN then generates a double strand break (DSB), which is subsequently repaired through non-homologous end joining (NHEJ). Because NHEJ is highly error prone, the repair often produces small indels (Gaj et al. 2013). Targeted mutations can also be achieved by co-introduction of a reference single stranded DNA (ssDNA) or double stranded DNA (dsDNA). This occurs via homology dependent repair (HDR) or high-fidelity homologous recombination (HFHR) following the generation of the DSB, resulting in designed mutations at a specific genomic locus (Menke 2013). Owing to the efficiency of DSB mediated mutation, which allows the generation of gene disrupted mice in essentially a single step, several research groups have recently reported gene targeted mice and rats generated by injecting mRNA encoding ZFNs/TALENs directly into the zygote (Mashimo et al. 2013; Sung et al. 2013). However, the design and preparation of these enzymes have proven extremely difficult, which has hindered the spread and use of this technique.
A genome editing system adapted from the bacterial type II CRISPR system has been shown to have a similar function in the mammalian genome (Cong et al. 2013; Mali et al. 2013). CRISPRs work in tandem with CRISPR-associated (Cas) proteins (Gasiunas et al. 2012), of which Cas9 is of particular interest. The use of the CRISPR/Cas system to generate gene disrupted mice has developed in the past 2 years to the point where specific gene disruption has been achieved with high efficiency (Wang et al. 2013). Cas9 proteins are endonucleases that can be programmed to produce sequence specific DSBs in vitro. In the original bacterial cells, this system targets foreign nucleic acids, such as those from infecting phages (Garneau et al. 2010; Makarova et al. 2011). The breaks occur when CRISPR RNAs (crRNAs), Cas proteins and trans-activating crRNAs (tracrRNAs) combine to form ribonucleoprotein complexes that then target and degrade the foreign nucleic acids (Cong et al. 2013; Mali et al. 2013). CRISPR/Cas9 has been adapted to work in mammalian cells by the production of a vector containing the CRISPR seed sequence (also referred to as single guide RNA or sgRNA) followed immediately by a protospacer adjacent motif (PAM) sequence, and the humanized Cas9 (hCas9) endonuclease (pX330 vector; Cong et al. 2013). Upon injection of the vector into one of the pronuclei of fertilized oocytes, hCas9 is guided to the target site and cleaves the DNA, resulting in a DSB (Mashiko et al. 2013).
In this report, we examined the feasibility of CRISPR/Cas plasmid injection into zygotes for large scale gene mutagenesis in mice. We used a simple validation system for gene targeted DSBs based on observation of green fluorescence reconstituted by HDR of an enhanced green fluorescent protein (EGFP) expression cassette (Fig. 1a; Mashiko et al. 2013). We also quantified the ability of this assay to differentiate between the capabilities of various CRISPR seed sequences to generate targeted gene disruptions in vivo, as assessed by the rate of mutation as well as the size and type of indel produced. With regard to the CRISPR system itself, we investigated the various PAM sequences to ascertain which, if any, are preferable when designing seed sequences. Another consideration mentioned in the literature is the frequency of off-target recognition and cleavage (Fu et al. 2013). This report compares the off-target cleavages observed in CRISPR produced mice with the candidates predicted by off-target search software, and discusses both in relation to the theoretically expected number of off-target recognition sites.
Materials and methods
All animal experiments were approved by the Animal Care and Use committee of the Research Institute for Microbial Diseases, Osaka University.
Plasmid and mRNA preparation
The plasmids expressing hCas9 and sgRNA were prepared by ligating oligos into the BbsI site of pX330 (http://www.addgene.org/42230/; Cong et al. 2013). The pCAG-EGxxFP target plasmid was described previously (Mashiko et al. 2013). In brief, N-terminal and C-terminal EGFP coding regions overlapping 482 bp were polymerase chain reaction (PCR) amplified and placed under a ubiquitous CAG promoter (the chicken beta-actin promoter with the cytomegalovirus early enhancer element and the rabbit beta-globin splice acceptor and polyadenylation signal; Okabe et al. 1997; Niwa et al. 1991). The approximately 600 bp genomic fragments containing sgRNA target sequence were PCR amplified and placed into the multi-cloning sites (BamHI, NheI, PstI, SalI, EcoRI, and EcoRV) flanked by the EGFP fragments.
HEK293T cells transfection
Five hundred nanograms of pCAG-EGxxFP-target was mixed with 500 ng of pX330 with/without sgRNA sequences and then introduced into 4 × 10^5 HEK293T cells by the conventional calcium phosphate transfection method. The EGFP fluorescence was observed under a fluorescence microscope at 48 h after transfection.
B6D2F1 female mice were superovulated and mated with B6D2F1 males, and fertilized eggs were collected from the oviduct. The pX330 plasmids were injected into one of the pronuclei at 5 ng/μL. The injected eggs were cultivated in potassium simplex optimization medium (KSOM) overnight then two-cell stage embryos were transferred into the oviducts of pseudopregnant ICR females. The pups were genotyped by PCR and subsequent sequence analysis.
Potential off-target sites were found using free software, Bowtie (http://bowtie-bio.sourceforge.net/index.shtml) with rules outlined previously (Mali et al. 2013; Wang et al. 2013; Yang et al. 2004). Twelve to 14 bases preceding the PAM sequence with AGG, GGG, CGG, and TGG were aligned with the mouse genome (mm9). The 0.5–1 kb genomic fragments containing the off-target in the centre were PCR amplified and sequenced.
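The search rule described above (an exact match of the 3′-most 12 to 14 seed bases followed by an NGG PAM, on either strand) can be illustrated with a minimal sketch. This is not Bowtie and does not reproduce its index or options; it is a plain regular-expression scan over a sequence held in memory, and the function names and the toy sequence used for testing are hypothetical. Note that the on-target site itself is also reported, so it has to be subtracted when counting off-target candidates.

```python
import re


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_candidate_sites(genome: str, seed20: str, match_len: int = 13):
    """List (position, strand) of exact matches of the last `match_len`
    seed bases followed by NGG. Positions are 0-based on the forward strand.
    Assumption: `genome` is a single sequence small enough to keep in memory.
    """
    genome = genome.upper()
    core = seed20.upper()[-match_len:]
    # Forward strand: core + NGG. Lookahead allows overlapping hits.
    fwd = re.compile(f"(?=({core}[ACGT]GG))")
    # Reverse strand: the same site appears as CCN + revcomp(core).
    rev = re.compile(f"(?=(CC[ACGT]{reverse_complement(core)}))")
    hits = [(m.start(), "+") for m in fwd.finditer(genome)]
    hits += [(m.start(), "-") for m in rev.finditer(genome)]
    return sorted(hits)
```

For a whole genome such as mm9, an indexed aligner remains the practical choice; the sketch only makes the matching rule explicit.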
All the values were the means ± standard deviation (SD) of at least three independent experiments. Statistical analyses were performed using Student's t-test. Differences were considered significant at P < 0.05.
Design and construction of CRISPR/Cas plasmids with pX330
To inactivate genes, we first searched for the nucleotide sequence N20(NGG) downstream of a translational start site (ATG) as the Cas9 target site. The last NGG is the PAM sequence, thus we used the first N20 as the seed sequence for the sgRNA. If the first nucleotide was not G, we added an extra G at the 5′ end, as the U6 promoter prefers a G for transcriptional initiation. If there were any in-frame ATGs within 60 nt, we preferred to select seed sequences downstream of the last ATG. When we aligned 12, 13, and 14 nt seed sequences plus NGG against the mouse genome (mm9) using the free software Bowtie (http://bowtie-bio.sourceforge.net/index.shtml), the average numbers of off-target sequences were found to be around 1.5, 1.8, and 2.4 times more than the estimated values (Table 1 and Table S1). Target sequences with AGG as the PAM sequence tended to have more off-target candidates, but a significant difference was observed only between 13 nt with AGG and CGG (P = 0.036).
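The design rules in this paragraph can be sketched as follows. The sketch is a simplification under stated assumptions: it scans only the forward strand of a coding sequence supplied by the user, it does not implement the preference for sites downstream of in-frame ATGs, and it additionally skips seeds containing TTTTT, a filter suggested by the validation results rather than by the design step itself. The function name and the returned field names are hypothetical.

```python
import re


def design_seeds(cds: str):
    """Enumerate N20(NGG) sites on the forward strand of `cds`.

    Returns one dict per site with the 0-based start, the 20 nt seed,
    the PAM, and the guide to clone (an extra 5' G is added when the
    seed does not already start with G, for U6 transcription).
    """
    cds = cds.upper()
    sites = []
    # Lookahead so that overlapping N20(NGG) windows are all reported.
    for m in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", cds):
        seed, pam = m.group(1), m.group(2)
        if "TTTTT" in seed:
            # A run of five T acts as a pol III terminator; skip the seed.
            continue
        guide = seed if seed.startswith("G") else "G" + seed
        sites.append({"start": m.start(), "seed": seed, "pam": pam, "guide": guide})
    return sites
```

Each returned seed would then be checked for off-target candidates before a construct is chosen.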
Table 1. Number of off-target candidates found with Bowtie
Off-target candidates were searched with Bowtie software against the mouse genome (mm9). *P = 0.036. †The total number of nucleotides in the mm9 database is 2.78 × 10^9 bases. The values were calculated as 5.56 × 10^9/4^14, 4^15 and 4^16, respectively.
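The estimated values follow from a simple random-sequence argument: the number of positions on both strands (twice the 2.78 × 10^9 bases of mm9) divided by 4 raised to the number of fixed bases, which is the seed length plus the two G of the NGG PAM. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the calculation assumes equal base frequencies and independence between positions.

```python
def expected_candidate_sites(seed_bases: int, genome_bases: float = 2.78e9) -> float:
    """Expected number of exact matches to `seed_bases` + NGG in a random genome.

    Both strands are searched, hence the factor of 2. The N of NGG is free,
    so only the seed bases and the two G contribute fixed positions.
    """
    fixed_positions = seed_bases + 2
    return 2 * genome_bases / 4 ** fixed_positions


for n in (12, 13, 14):
    print(n, round(expected_candidate_sites(n), 1))
# prints:
# 12 20.7
# 13 5.2
# 14 1.3
```

The value for 13 nt (5.2) is the estimate against which the observed average of 9.2 candidates is compared.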
For the construction of CRISPR/Cas plasmids, we used a pX330 plasmid that simultaneously expresses hCas9 and sgRNA under chicken beta-actin hybrid (CBh) and human U6 promoters, respectively (Wang et al. 2013; Fig. 1b). The N20 seed sequences with or without the extra 5′ G were directionally cloned into pX330, taking advantage of BbsI as it generates the desired 4 base 5′-overhangs. As for the validation plasmid, we used a pCAG-EGxxFP plasmid containing 5′ and 3′ EGFP fragments that share 482 bp under a ubiquitous CAG promoter (Mashiko et al. 2013; Fig. 1b). An approximately 600 bp genomic region (596 ± 151 bp, n = 73 target genes) containing the target site in the centre was PCR amplified and inserted between the EGFP fragments.
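Directional cloning into the BbsI site requires a pair of complementary oligos whose annealed ends match the two 4 base 5′-overhangs of the digested vector. The sketch below shows one way to derive such a pair from a guide sequence. The overhang sequences CACC (top strand) and AAAC (bottom strand) are an assumption taken from the commonly distributed pX330 cloning protocol and are not stated in this report, so they are exposed as parameters; the function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_bbsi_oligos(guide: str, top_overhang: str = "CACC", bottom_overhang: str = "AAAC"):
    """Return (top, bottom) oligos, both written 5' to 3'.

    `guide` is the sequence to be transcribed, including the extra 5' G
    when one is added. Overhang defaults are an assumption (see lead-in).
    """
    guide = guide.upper()
    top = top_overhang + guide
    bottom = bottom_overhang + reverse_complement(guide)
    return top, bottom
```

After annealing, the guide portions of the two oligos pair with each other and the two 4 base tails remain single stranded for ligation.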
Validation of pX330 in vitro
To determine which sgRNA sequences worked best, we co-transfected the pCAG-EGxxFP-target and pX330-sgRNA plasmids into HEK293T cells. The reconstituted EGFP fluorescence was observed under a fluorescence microscope at 48 h after transfection. We used pCAG-EGxxFP-Cetn1 and pX330-Cetn1/sgRNA1 as the positive control, as this pair has been shown to work well both in vitro and in vivo in our previous study (Mashiko et al. 2013). The fluorescence intensity was classified into four groups (4, brighter than control; 3, same as control; 2, darker than control; 1, very dark; Fig. 1c). Among the 291 pX330 plasmids examined, 81, 89, 68, and 53 were classified into groups with scores 4, 3, 2, and 1, respectively (Table 2 and Table S1). As expected, the sgRNAs carrying a poly-T polIII terminator sequence showed lower activity (score 1 for all six sgRNAs carrying TTTTT), thus we excluded these six cases from the rest of the study. Although there was no significant difference between sgRNAs with or without an initial G (2.7 ± 1.0, n = 266 and 2.8 ± 1.1, n = 19, respectively), the addition of an extra 5′ G influenced the sgRNA activity in some cases: improvement and deterioration were observed in 3 and 1 out of 8 sgRNAs examined, respectively (Fig. 2). The GC contents of the seed sequences were 49.4 ± 13.6%, 55.5 ± 14.0%, 54.4 ± 13.0%, and 52.9 ± 10.2% in score 1, 2, 3, and 4 sgRNAs, respectively (Table S1). The difference was significant between score 1 and 2 (P = 0.02) and between score 1 and 3 (P = 0.04), but not between score 1 and 4 (P = 0.12).
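The score tallies and the GC content measure used above can be reproduced with a few lines. The counts are those reported in the text; treating scores 2 to 4 as functional is an inference from the fact that their sum equals the 238 functional plasmids reported, not a definition given by the authors. The helper name gc_content is hypothetical.

```python
# Number of pX330 plasmids per fluorescence score, as reported in the text.
score_counts = {4: 81, 3: 89, 2: 68, 1: 53}

total = sum(score_counts.values())
# Assumption: "functional" means a score of 2 or higher (see lead-in).
functional = sum(n for score, n in score_counts.items() if score >= 2)
functional_pct = round(100 * functional / total, 1)

print(total, functional, functional_pct)  # prints: 291 238 81.8


def gc_content(seed: str) -> float:
    """Percentage of G and C bases in a seed sequence."""
    seed = seed.upper()
    return 100 * sum(base in "GC" for base in seed) / len(seed)
```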
Table 2. Validation of pX330 plasmids in vitro
pCAG-EGxxFP-target plasmids and pX330-sgRNA plasmids were co-transfected into HEK293T cells and fluorescence was observed and scored under a fluorescence microscope at 48 h after transfection. pCAG-EGxxFP-Cetn1 and pX330-Cetn1/sgRNA1 were used as positive controls (score 3). The scores of the six sgRNAs containing TTTTT were not included.
2.67 ± 1.09
2.80 ± 1.02
2.70 ± 1.01
2.73 ± 1.05
Generation of mutant mice with pX330
We next injected the validated pX330 plasmids with a score of 3 or 4 into one of the pronuclei of fertilized eggs at 5 ng/μL. To reduce the risk of integration of the plasmid DNA into the host genome, we injected the plasmid in its circular form. The pups that developed from these eggs were genotyped by PCR and subsequent sequence analysis (Table 3). A total of 100 out of 196 pups (52.9 ± 22.3%) were identified as gene modified organisms (GMOs), and three of them were found to carry the hCas9 transgene. It is noteworthy that the two genes on the Y chromosome were successfully mutated in four mutant pups. Among the remaining 96 mutant pups, 47 carried mutations in both alleles, and in 17 of them the mutations in the two alleles were identical. Whereas the majority of the indels caused by the hCas9/sgRNA complex were less than 10 bp (78 out of 114 fully identified mutations; 68.4%), we sometimes obtained indels of over 100 bp (13/114; 11.4%). Successful germ line transmission was observed with at least 10 mutations, and the phenotypes of the mutant mice are now under investigation.
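For reference, the pooled proportions implied by the raw counts in this paragraph can be computed directly. These pooled ratios need not coincide with the values quoted as mean ± SD (for example 52.9 ± 22.3%), since, as stated in Materials and methods, the reported values are means rather than pooled ratios; how the averaging was grouped is not specified here, so the sketch makes no attempt to reproduce it. The helper name pct is hypothetical.

```python
def pct(numerator: int, denominator: int) -> float:
    """Pooled percentage, rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


pooled = {
    "mutant pups / pups born": pct(100, 196),
    "biallelic / autosomal mutant pups": pct(47, 96),
    "indels under 10 bp / identified mutations": pct(78, 114),
    "indels over 100 bp / identified mutations": pct(13, 114),
}

for label, value in pooled.items():
    print(f"{label}: {value}%")
```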
Table 3. Generation of mutant mice via pX330 plasmid injection
pX330 plasmids were injected into mouse zygotes at 5 ng/μL. The injected eggs were transferred into pseudopregnant females. The mutations were identified by sequencing polymerase chain reaction (PCR) amplified genomic fragments containing the target in the centre. GMO, gene modified organism. *The four pups carrying mutations on Y chromosome were excluded. †Data from previous research (Mashiko et al. 2013).
Off-target analysis in CRISPR/Cas9 mediated mutants
To analyze off-target cleavages, we chose candidates that matched the 13 nt seed sequence with an NGG PAM sequence. We amplified and sequenced the approximately 600 bp genomic regions surrounding the potential off-target sites. A total of 382 sites in 63 mutant mice were examined and three sites were mutated (Fig. S1).
We have previously reported that injection of the pX330 plasmid in its circular form into the zygote results in hCas9/sgRNA targeted indels as efficiently as the injection of hCas9 mRNA and sgRNA (Mashiko et al. 2013). In the present study, we increased the number of target genes and examined the feasibility of large scale mutagenesis in mice. We first cloned N20 preceding NGG as seed sequences, with or without an extra 5′ G, into a pX330 vector and transfected HEK293T cells together with the pCAG-EGxxFP-target plasmid. When we used the previously described pX330-Cetn1/sgRNA1 as the positive control, 238 out of 291 (81.8%) pX330 vectors showed reconstituted green fluorescence, indicating the simplicity, ease, and efficiency of pX330 mediated gene modification. It should also be noted that the cloning of oligos into pX330 and the subsequent validation in HEK293T cells could be performed within a week.
When we scored pX330 activity, we did not find any significant differences between the PAM sequences (Table 2). However, further studies testing the same seed sequence against different PAM sequences are needed to conclude definitively which PAM sequence is the better option for targeting. In the present study, we used the human U6 RNA polymerase III promoter, which prefers a G for transcription initiation and uses a poly-T stretch for termination. There was no significant difference between the average scores obtained with or without G. However, if the sgRNA initiates with a nucleotide other than G, the addition of an extra G increases the chances of designing superior constructs (Fig. 2). This information is useful for site directed mutagenesis such as nucleotide substitution, because the cleavage site and modification site should both be covered by the approximately 130 nt reference ssDNA. Although some of the sgRNAs containing TTTT worked (2.33 ± 0.98, n = 12), none of the six sgRNAs containing TTTTT worked. Therefore, a poly-T stretch of five or more nucleotides should be avoided when designing the sgRNA. Although score 1 sgRNAs tended to have lower GC contents than score 2 and 3 sgRNAs, the difference from score 4 sgRNAs was not significant. In addition, the GC contents of score 1 and score 4 sgRNAs varied from 25.0% to 90.5% and from 23.8% to 80.0%, respectively, indicating that the GC content may be less important for sgRNA activity.
After the validation in vitro, the pX330 plasmids with scores of 3 or 4 were injected into mouse zygotes and mutant mice were obtained for all 32 genes tested (including the data from our previous report) (Mashiko et al. 2013). An average of 8.2% (196/2397) of treated embryos developed to term and 52.9 ± 22.3% (100/196) of these pups were mutated (Table 3). Out of these mutant pups, 3.0% (3/100) carried the hCas9 transgene, and this transgenic efficiency is lower than our average with linearized DNA, 33.4 ± 23.0% (173/684, n = 26 constructs). Thus, we conclude that the pronuclear injection of pX330 is a simple, easy, fast and viable approach to make targeted gene knockout mice.
More significantly, we found that 47 out of 96 mutants carried biallelic mutations (Table 3). The remaining four mutants carried mutations in Y chromosome specific genes. Hence, mice carrying homozygous mutations can be generated within 4 weeks. Since conventional gene targeting strategies require approximately 8 months to obtain homozygous mutants (2 weeks for vector construction, 1 month for ES cell screening, 1 month for chimera production, 2 months for chimera maturation, 1 month for germ line transmission, 2 months for heterozygote maturation, 1 month for homozygous fetal development), our technique greatly accelerates the generation of mutant mice (Fig. 3). This also saves animal lives and reduces labor and expense.
As for off-target cleavages, although a single nucleotide exchange within the 10 seed bases of the sgRNA can severely diminish its target specificity (Cong et al. 2013), the risk of off-target mutation remains in Cas9/sgRNA mediated mutant animals. In the present study, we searched for off-target sites that exactly matched at least 13 bases at the 3′ end of the seed sequence plus the NGG PAM sequence (N can be A, G, C, or T), and found only three off-target cleavages among the total of 382 sites examined in 63 mutant mice (Fig. S1). Although there remains a risk of cleavages depending on overall homology, as indicated in recent reports (Fu et al. 2013; Pattanayak et al. 2013), our data coupled with published data indicate that off-target cleavages are infrequent, at least with the transient expression of Cas9/sgRNA in the oocyte (Wang et al. 2013; Yang et al. 2004). To reduce the risk of off-target cleavages, the seed sequences should be screened with software such as Bowtie, and the sequence with the fewest off-target candidates should be selected for use. Generating mutant mice with different sgRNA sequences also decreases the risk of misinterpreting the phenotype.
Currently, the International Knockout Mouse Consortium (IKMC) has generated protein coding gene locus targeted ES cell clones of approximately 90% of the mouse gene loci (website; www.knockoutmouse.org). However, there is not an equivalent number of gene disrupted mice – only 13% according to the latest data (accessed Oct/2013). This difference can be attributed to several factors, such as trends in research, difficulty in vector construction and subsequent difficulties with homologous recombination, embryonic lethality requiring conditional knockout vectors, the expertise required to generate a knockout mouse as well as the cost and the need for an appropriate standard of animal facilities (Muller 1999; Austin et al. 2004). Here, we conclude that pronuclear injection of a circular plasmid expressing Cas9/gRNA complex is a rapid, simple, and reproducible method for the targeted mutagenesis in mice and may supply the answer to several of the problems currently plaguing ‘traditional’ mutagenesis techniques, helping the IKMC project to achieve its goal of a KO mouse for every gene locus.
We thank F. Zhang for pX330 plasmid, Y. Esaki and S. Nishioka for technical assistance in generating mutant mice. This work was supported in part by the MEXT of Japan.