Genetic mutations in humans are associated with congenital disorders and phenotypic traits. Gene therapy holds the promise to cure such genetic disorders, although it has suffered from several technical limitations for decades. Recent progress in gene editing technology using tailor-made nucleases, such as meganucleases (MNs), zinc finger nucleases (ZFNs), TAL effector nucleases (TALENs) and, more recently, CRISPR/Cas9, has significantly broadened our ability to precisely modify target sites in the human genome. In this review, we summarize recent progress in gene correction approaches of the human genome, with a particular emphasis on the clinical applications of gene therapy.
The human genome is complex and variable among individuals. A recent study comparing the genomic sequences of nearly 1000 human genomes worldwide identified more than 38 million single nucleotide polymorphisms (SNPs), 1.4 million short (1–50 bp) insertions and deletions (indels) and 14 000 larger deletions or structural variations (1000 Genomes Project Consortium et al. 2012). Such large genetic variations account for personal characteristics and the evolution of mankind; however, particular mutations in essential genes sometimes cause genetic disorders.
According to the Human Gene Mutation Database (HGMD, http://www.hgmd.org/) and the Online Mendelian Inheritance in Man database (OMIM, http://omim.org/), at least 3000 to 4000 genes are known to be associated with phenotypic traits or genetic disorders. These mutations vary in size from the single nucleotide level to the microscopic level of chromosomal abnormalities.
Single-nucleotide variations (SNVs) are the most common variations among individual genomes and can be associated with amino acid alterations (non-synonymous mutations), gain of de novo stop codons (nonsense mutations) and changes in splicing patterns. The vast majority of SNPs are found in intergenic regions, where they can affect transcription factor binding and gene regulation (Schaub et al. 2012). Small indels within the protein coding region are likely to induce frameshift mutations. Relatively larger deletions or duplications of chromosomal segments, so-called CNVs (copy number variations), are associated with many genetic diseases. Translocation between two repeated regions (such as the Alu element) and the de novo insertion of retrotransposon/retroviral elements are classified into this category (Hastings et al. 2009). Chromosomal abnormalities, such as translocations and aneuploidy, affect several gene functions that are often linked to the development of severe complex syndromes or cancer. In this article, we review the recent progress in new gene correction technologies, and summarize the challenges for developing novel gene therapy approaches.
The concept of gene therapy
Gene therapy is based on the principle of the genetic modification of living cells for use in treating various disorders. The final goal of gene therapy is to cure patients who suffer from genetic disorders, including congenital diseases, infectious diseases, and cancer. This concept has existed for several decades; however, in practice, “genetic modification” exclusively refers to the transfer of therapeutic genes into living cells or the human body.
Multigenic disorders are too complex to tackle; however, monogenic disorders, particularly those caused by small mutations, are likely to be cured by the supplementation or augmentation of non-mutated genes. Traditional gene therapy approaches for treating congenital disorders primarily rely on gene transfer, which introduces one or, at most, a few transgenes to compensate for the gene function. Therefore, successful gene therapy depends on the use of efficient transduction techniques in human cells. In fact, the potential application of gene therapy became realistic after the development of viral vectors in the 1980s, and the initial clinical trials conducted in the early 1990s achieved successful gene transfer using retroviral vectors. Since then, several different classes of viruses, such as retroviruses, lentiviruses, adenoviruses and adeno-associated viruses (AAVs), have been engineered to remove pathogenicity and used to deliver therapeutic transgenes in the clinical setting. Several promising results have been reported in recent gene therapy clinical trials, including the retroviral transduction of CD34+ bone marrow cells to treat adenosine deaminase (ADA)-severe combined immunodeficiency (SCID) (Aiuti et al. 2009) and X-linked severe combined immunodeficiency (SCID-X1) (Hacein-Bey-Abina et al. 2010), the lentiviral transduction of CD34+ bone marrow cells to treat β-thalassemia (Cavazzana-Calvo et al. 2010) and the intravenous injection of AAV vectors to treat hemophilia B (Nathwani et al. 2011).
Apart from the effectiveness of gene delivery, there remain limitations to gene supplementation strategies. First, the disease-causing mutations persist following gene augmentation. If the mutated gene produces an abnormal gene product that exerts a dominant negative effect, the mutation must be removed. For example, osteogenesis imperfecta patients suffer from fragile bones caused by dominant-negative mutations in the COL1A1 collagen gene. Augmentation of the wild-type COL1A1 gene does not rescue the phenotype; therefore, the targeted disruption of the mutant gene is required. In this regard, AAV vector-mediated gene targeting technology has been applied to patient-derived MSCs in order to disrupt the mutant COL1A1 gene (Chamberlain et al. 2004). Second, the transgene expression should be regulated via cis-regulatory elements within the delivery vector, so that the expression level is neither so high that it causes overcompensation nor too low to achieve a therapeutic outcome. Transgene silencing is a well-known and important issue for transgene-based approaches, especially when stem cells are targeted (Hotta & Ellis 2008). Third, if an integrating vector is used to deliver the transgene, it is necessary to pay extra attention to insertional mutagenesis, which can disrupt essential genes or activate neighboring genes at the integration site (Hacein-Bey-Abina et al. 2003). Fourth, due to limitations in the delivery size of the transgene, current gene therapy is not amenable to treating multigenic diseases or chromosomal abnormalities. In order to overcome these issues, the development of new strategies and new techniques is required.
Targeted genome editing techniques
Gene targeting is a well-established technique used in model organisms to manipulate the genomic sequence by using the host homologous recombination system. Since gene targeting mediates target sequence-specific recombination events through homology arms, it is an ideal technique for correcting certain genetic mutations for gene therapy. In principle, gene targeting is mediated by the introduction of a donor vector containing homology sequence arms in order to mediate site-specific strand exchange. Model organisms, such as bacteria, and certain cell types, such as chicken DT40 cells or mouse ES cells, exhibit a relatively high homologous recombination efficiency that is practical enough for laboratory use, although most human primary cells do not. In order to facilitate the delivery of the donor vector into human cells, several methods mediated by viral vectors have been developed historically; for example, integration-defective retroviral vectors (Ellis & Bernstein 1989), adenoviral vectors (Mitani et al. 1995) and AAV vectors (Russell & Hirata 1998). Viral vector-mediated gene targeting approaches, particularly the use of adenovirus or AAV vectors, have been proven to be effective in many human cell types and are promising techniques for application in gene therapy clinical trials.
Recent progress in tailor-made nucleases
To further facilitate the occurrence of gene targeting events at the desired site, the host DNA repair machinery can be activated by inducing site-specific DNA damage. Initial attempts were based on the use of natural restriction enzymes with a large DNA recognition domain (i.e. homing endonucleases); however, the technique was quickly expanded to intentionally engineer DNA binding motifs in order to design the target site. Depending on the DNA recognition motif, four distinct platforms of engineered nucleases have been developed in this context: meganucleases (MNs) (Arnould et al. 2011), zinc finger nucleases (ZFNs) (Palpant & Dudzinski 2013), TALENs (Joung & Sander 2013) and the CRISPR/Cas9 system (Gaj et al. 2013) (Table 1). Recent advances in these nucleases have broadened our ability to manipulate human genome sequences more easily and effectively. The ability to edit a specific location of the genome has a major impact on basic research aimed at understanding gene functions and regulatory mechanisms. Moreover, the development of new technology has opened up new and exciting applications of gene therapy (Pâques & Duchateau 2007).
Table 1. Various engineered nucleases
Meganuclease (MN)
    Construction of custom nuclease: Randomized mutagenesis of the DNA recognition residues and screening of a combinatorial library
Zinc finger nuclease (ZFN)
    Target DNA recognition: Zinc finger domains
    DNA cleavage: FokI nuclease domain
    Construction of custom nuclease: Assembly of 3–4 zinc finger domains and screening for context-dependent binding specificity
    Target site recognition size: (9 or 12 bp) × 2
TALEN
    Target DNA recognition: RVD (repeat variable diresidue) repeats
    DNA cleavage: FokI nuclease domain
    Construction of custom nuclease: Assembly of 8–31 RVD repeats
    Target site recognition size: (8–31 bp) × 2
CRISPR/Cas9
    Target DNA recognition: CRISPR RNA (crRNA) or guide RNA (gRNA)
    Construction of custom nuclease: Oligonucleotide synthesis of gRNA and molecular cloning (or RNA synthesis)
    Target site recognition size: 20 bp + “NGG” (*1)
*1: Streptococcus pyogenes based Cas9.
Engineered nucleases recognize a designed sequence, normally 18–40 bp in total size, and cleave genomic DNA. The host cell activates the DNA damage response and induces either of the two DNA repair pathways: non-homologous end joining (NHEJ) or homology-directed recombination (HDR). The former pathway induces small (typically 1–50 bp) deletions and/or sometimes minor insertions, whereas the latter pathway stimulates the substitution of nucleotide sequences or template insertions, depending on the supplemented donor DNA template (Fig. 1). Each repair pathway results in a different form of genomic sequence alterations; therefore, the target site of the engineered nuclease and the design of the donor template DNA must be carefully considered depending on the application of gene editing. In this review article, we discuss gene therapy applications, with a particular focus on providing a technical overview of how to correct mutated genes with different types of mutations.
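As a rough illustration of what choosing a target site involves, the following Python sketch scans a DNA sequence for candidate Cas9 sites in the 20 bp + “NGG” format listed in Table 1 for Streptococcus pyogenes Cas9. The function names and the input sequence are hypothetical, and the sketch only enumerates sequence matches on both strands; it does not attempt to predict cleavage efficiency or off-target activity.

```python
import re


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def find_cas9_sites(seq, protospacer_len=20):
    """List candidate sites of the form <protospacer_len nt> + NGG.

    Returns tuples (strand, start, protospacer, pam). For the "-" strand,
    start is an index into the reverse complement of seq, not into seq.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", revcomp(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


if __name__ == "__main__":
    # Hypothetical toy input: 20 A's followed by a TGG PAM.
    for site in find_cas9_sites("A" * 20 + "TGG"):
        print(site)
```

In practice, the candidate list would then be filtered according to the position of the mutation to be corrected and the intended repair pathway (NHEJ or HDR).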
Examples of gene therapy using engineered nucleases
In 2005, the first demonstration was published showing that ZFNs can be used in human cells to restore a mutation of X-linked SCID in the IL2Rγ gene (Urnov et al. 2005; Lombardo et al. 2007). Since then, as shown in Table 2, several researchers have demonstrated the feasibility of treating several disorders using gene correction strategies with engineered nucleases. Gene correction strategies can be largely divided into four categories: disruption of endogenous genes, frameshift induction in order to restore the protein reading frame, insertion of a foreign sequence using target-specific knock-in and substitution of a mutated sequence.
Table 2. Correction of human genetic diseases using engineered nucleases
Many of these strategies remain in the proof-of-concept stage; however, of note, there is already one example of a clinical trial to prevent HIV (human immunodeficiency virus) infection for AIDS (acquired immunodeficiency syndrome) therapy using a gene knock-out strategy.
CCR5 disruption as AIDS therapy
Human immunodeficiency virus is a lentivirus of the family Retroviridae and has the ability to integrate its genome sequence into human host chromosomes upon infection. Therefore, AIDS can be regarded as a genetic disease caused by infection. Massive expansion of HIV decreases the number of CD4+ T cells and weakens the immune function, leading to the onset of AIDS. Currently, combination treatment with anti-HIV drugs, known as highly active antiretroviral therapy (HAART), can be used to reduce the viral load and stimulate the host immune system; however, affected patients require continuous, life-long drug treatment due to the latency of HIV (Tyagi & Bukrinsky 2012). To invade macrophages and T-helper lymphocytes, HIV requires the CD4 antigen and a chemokine receptor, such as CCR5 or CXCR4. Interestingly, a 32 bp deletion between the transmembrane domains of CCR5 (the CCR5Δ32 mutation) results in a frameshift mutation, and individuals homozygous for this deletion display high resistance to HIV-1 infection (Samson et al. 1996).
Based on this finding, gene engineering technology has been applied to mimic the CCR5 mutation in order to cure HIV-1 infection in clinical trials (Fig. 2). The first proof-of-principle was demonstrated by delivering an Ad5/35 adenoviral vector encoding a pair of ZFNs into primary human CD4+ T cells (Perez et al. 2008) and human CD34+ hematopoietic stem/progenitor cells (Holt et al. 2010). Small indels (from 8 bp insertion to 43 bp deletion) were introduced into the transmembrane domain-1 (TM1) of the CCR5 gene, which is located 381 bp upstream from the natural CCR5Δ32 mutation. The transient expression of CCR5 ZFNs specifically disrupted 40–60% of CCR5 alleles in human CD4+ T cells and protected against HIV-1 infection in both in vitro HIV-1 challenges and an in vivo NOG mice infection model. Phase I/II clinical trials headed by Sangamo Biosciences are currently ongoing to evaluate the curative effects of CCR5-disrupted CD4+ T cell transplantation in HIV-1-infected subjects (Maier et al. 2013).
In addition to CCR5 disruption strategies, artificial nucleases can be used to target the long terminal repeat (LTR) promoter of HIV, which is critical for viral genomic RNA transcription. As a proof-of-principle study, Ebina et al. demonstrated disruption of the LTR region using CRISPR/gRNA in cultured cell lines (Ebina et al. 2013). This approach has the potential to target the latently infected HIV provirus and eliminate the HIV provirus sequence from the host genome. Further studies are required to ensure efficient HIV disruption in vivo; however, the use of a combination of different strategies is worthwhile to combat quickly mutating HIV.
Hemophilia B gene therapy using targeted integration
Target sequence specific nucleases can direct the insertion of foreign DNA with appropriate homology arms into a desired location. To reduce the risk of endogenous gene disruption and protect against silencing due to chromosomal positional effects, the targeted location should ideally be free from essential genes and nearby oncogenes and be permissive of the transgene expression, such as that of ROSA26 (Irion et al. 2007) or AAVS1 loci (Smith et al. 2008). Moreover, the targeted integration approach is applicable to semi-tailor-made gene correction, which can be used to cover diverse genetic mutations (i.e. size and location differences among individuals) in a single gene. For example, hemophilia B is a blood coagulation disorder caused by a mutation in the blood coagulation Factor IX (F9) gene, and most of the known F9 mutations exist across the coding sequences of exons 2-8. Therefore, the insertion of intact cDNA with exon 2-8 into the first intron of the F9 gene allows for the expression of functional F9 proteins under the control of the exogenous promoter (Fig. 3). As a proof-of-concept study, Katherine High's group used this strategy to treat hemophilia B mice in vivo (Li et al. 2011) by using a model mouse that lacks the murine F9 gene and contains a mutated (Y155 stop) human F9 mini-gene with the first intron. AAV vectors (serotype 8) encoding ZFNs targeting the first intron of hF9 and a donor template that contained exons 2-8 were injected into the peritoneum of the hemophilia B newborn mice. The injected mice showed significantly higher levels of hF9 secretion in the plasma and improved blood coagulation activity, as measured using an APTT (activated partial thromboplastin time) assay. Even though the efficiency of genome editing in vivo remains low (1–3% following intraperitoneal AAV8-ZFP injection), which is currently not practical for human trials, the results are promising for the future of in vivo gene correction approaches.
In relation to hemophilia B, hemophilia A is a more common blood coagulation disorder and is caused by a mutation in coagulation Factor VIII (F8). Due to the large size of the F8 cDNA (7 kb) and the complexity of the disease-causing mutation, performing gene correction for hemophilia A is more challenging. The F8 gene contains an intron-less gene called the F8 associated 1 (F8A1) gene within intron 22, and two other copies of F8A1 genes, named F8A2 and F8A3, are located approximately 500 kb and 580 kb upstream from the first exon of F8, respectively. These three pseudogene regions are almost 10 kb in size and share very high sequence similarities (>99.5%) to each other. Large inversions (500–580 kb) mediated by two pseudogene regions are observed in nearly half of severe hemophilia A patients (Graw et al. 2005).
Of note, Jin-Soo Kim's group recently demonstrated that treatment with a single pair of ZFNs targeted to two repeated regions separated by 140 kb can induce inversion between the two regions in a human cell line, although the frequency was very low (0.1–0.4%) (Lee et al. 2012). Further improvement of the technique is required for practical use, including the addition of a donor template spanning the inverted region, the coordination of two genomic regions spatially and the incorporation of a recombinase system, such as Cre-loxP or Flp-FRT.
DMD gene therapy by frameshift induction
Most gene correction strategies rely on the inclusion of a donor template to restore a particular mutation. However, in certain types of genes and diseases, treatment with target-specific nucleases alone without a donor template can restore the gene function. One such example is the dystrophin mutation in Duchenne muscular dystrophy, a severe disease involving muscle degeneration. The dystrophin protein is an anchor protein that links the subsarcolemmal cytoskeleton and extracellular matrix in skeletal muscle cells (Blake et al. 2002). Nonsense mutations and frameshift deletions due to the gross deletion of multiple exons, primarily clustering from exon 42 to exon 55, are the major causes of the disease. On the other hand, in-frame mutations of the dystrophin gene often result in a much milder phenotype, called Becker muscular dystrophy, suggesting tolerance to small indels or mutations within the anchor region. Applying gene therapy for Duchenne muscular dystrophy using the gene augmentation approach is challenging due to the size of cDNA (11 kb). Currently, several clinical trials are underway to restore the reading frame of the dystrophin gene by skipping particular exons in order to adjust the normal reading frame following the administration of antisense RNA analogues or morpholino oligomers (Verhaart & Aartsma-Rus 2012; Aoki et al. 2013). The exon skipping approach is promising; however, the therapeutic effect is transient, as the antisense reagents last for only a few months in vivo (Cirak et al. 2011; Goemans et al. 2011).
With respect to mimicking the exon skipping of dystrophin by antisense oligos, engineered nucleases offer permanent genetic manipulation (Fig. 4), which lasts longer than manipulation of the post-transcriptional step (Foster et al. 2012). Jacques Tremblay's group first demonstrated that the restoration of frameshift mutations could be induced into micro-dystrophin transgenes in mice using MN treatment (Chapdelaine et al. 2010) and subsequently extended their study to construct several MNs (targeting introns 38, 42 and 44) and ZFNs (targeting exon 50) in order to introduce small indels at certain points in dystrophin and restore the normal reading frame (Rousseau et al. 2011). Compared with MN or ZFN, TALEN and CRISPR/gRNA offer more flexibility in target site design, which allows researchers to target more precise locations depending on the site and type of mutation in the patient. Ousterout et al. introduced TALENs that target exon 55 to restore dystrophin (lacking exons 48–50 or 46–50) by inducing the development of small indels in patient-derived myoblasts immortalized via the retroviral transduction of hTERT and CDK4 transforming genes (Ousterout et al. 2013). Successful in-frame correction was demonstrated following the detection of dystrophin proteins. Both NHEJ-mediated small indels induction and HDR-mediated donor knock-in approaches can be used to achieve in-frame restoration. Popplewell et al. produced an integration-competent lentiviral vector expressing a MN targeting intron 44 and inserted template cDNA spanning exons 45–52 of the dystrophin gene using an integration-deficient lentiviral vector. Following the transduction of myoblasts carrying a deletion of exons 45–52 that was immortalized by the transduction of retroviral hTERT and lentiviral MyoD, the authors detected the corresponding mRNA transcript (Popplewell et al. 2013). 
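The reading-frame logic behind this strategy reduces to arithmetic modulo 3: a net length change that is not a multiple of 3 bp shifts the frame, and a second indel restores the frame when the combined change becomes a multiple of 3 again. The Python sketch below illustrates this under simplifying assumptions; the function names and the 200 bp deletion are hypothetical, and only length is considered, so an in-frame result could still contain a new stop codon or a disrupted splice site.

```python
def is_frameshift(net_change_bp):
    """True if a net length change (negative = deletion) shifts the reading frame."""
    return net_change_bp % 3 != 0


def frame_restoring_indels(deleted_bp, candidate_indels):
    """Return the indel sizes that bring an existing coding deletion back in frame.

    deleted_bp: number of coding bp already missing in the mutant allele.
    candidate_indels: iterable of indel sizes (negative = deletion, positive = insertion).
    """
    return [
        indel for indel in candidate_indels
        if not is_frameshift(indel - deleted_bp)
    ]


if __name__ == "__main__":
    # The 32 bp CCR5 deletion described earlier in the text is not a multiple of 3.
    print(is_frameshift(-32))
    # Hypothetical 200 bp out-of-frame exon deletion and a range of small indels.
    print(frame_restoring_indels(200, range(-4, 4)))
```

Because NHEJ produces a mixture of indel sizes, only the fraction of edited alleles with a suitable size would be expected to regain the reading frame.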
These studies clearly demonstrate the proof-of-principle that frameshift mutations of the dystrophin gene are amenable to genomic manipulation with tailor-made nucleases. Unlike antisense exon skipping approaches or the adenoviral delivery of the micro-dystrophin gene, these genetic modifications are permanent, and the therapeutic effects should be retained for a longer period. Unfortunately, primary myoblasts derived from patients are less proliferative; therefore, performing clonal expansion is not feasible unless the cells are transformed. Bulk cell analyses may mask the presence of rare off-targeted cells, and mixing corrected and non-corrected gene products may result in other issues, such as dominant negative effects or immunological responses. In this regard, the development of a better human cell model that is proliferative and allows for clonal expansion, yet retains a normal karyotype, is required.
iPS cells as a target for ex vivo gene therapy
iPS cells constitute a unique cell type that can be derived directly from the patient and acquires an unlimited self-renewal capacity and ability for multipotential differentiation, yet retains a normal karyotype (Takahashi et al. 2007). These properties make iPS cells a good model for studying human disease in vitro and testing various gene therapy approaches. In addition, the transplantation of iPS cell derivatives holds promise for future cell therapy applications applicable for ex vivo gene therapy approaches. Thus far, several studies have demonstrated that many gene editing methods are applicable to human iPSCs, including plasmid vectors (Zwaka & Thomson 2003; Irion et al. 2007), ZFNs (Lombardo et al. 2007; Hockemeyer et al. 2009), AAV vectors (Khan et al. 2010), adenoviral vectors (Suzuki et al. 2008; Aizawa et al. 2012), BAC vectors (Song et al. 2010; Howden et al. 2011), TALENs (Hockemeyer et al. 2011; Ding et al. 2013a) and CRISPR/Cas9 (Ding et al. 2013b). Viral vector-mediated gene targeting, particularly that involving adenovirus and AAV vectors, appears to be effective in terms of the efficiency of transduction and homologous recombination; however, preparing these viral vectors requires certain expertise and an adequate biosafety facility. The appropriate gene targeting technique should be selected depending on the gene correction strategy and mutation type of the disease.
Single amino acid substitution for sickle cell anemia
Sickle cell anemia is often caused by a single amino acid substitution (Glu6Val) in exon 1 of the β-hemoglobin (HBB) gene. The mutation induces the polymerization of hemoglobin and disturbs the shape and function of red blood cells. Treatment of sickle cell anemia patient-derived iPSCs with ZFNs and donor vectors has been shown to induce successful recombination of exon 1 (Sebastiano et al. 2011; Zou et al. 2011). The corrected iPSCs were differentiated into erythroid cells, and restored HBB transcripts were detected.
Removal of a premature stop codon for epidermolysis bullosa
Recessive dystrophic epidermolysis bullosa is caused by genetic defects in the type VII collagen (COL7A1) gene. Patient-derived iPS cells have been treated with TALENs and a donor vector containing selection cassettes flanked by loxP (Osborn et al. 2013). The use of selection approaches with antibiotic-resistant genes ensures efficient targeting, and the subsequent removal of the selection cassette eliminates undesired activation of endogenous genes by the promoter of the selection cassette. Although the exonic coding region can be precisely corrected, a single loxP sequence will remain in the intronic region, even after the removal of the selection cassette.
Footprint-free gene correction of α1-antitrypsin
Yusa et al. (2011) applied an elegant approach using patient-derived iPS cells and demonstrated a proof-of-concept that footprint-free genomic correction can be achieved using ZFNs and piggyBac DNA transposon as a removable selection cassette. iPS cells derived from an α1-antitrypsin deficiency patient with the Z mutation (Glu342Lys) were first genome edited using ZFNs and a selectable cassette flanked by terminal repeats of the piggyBac transposon. The piggyBac selection cassette was subsequently removed from the targeted locus using the transient expression of piggyBac transposase, taking advantage of the fact that the piggyBac transposon leaves no footprint in the host genome. A similar approach has also been reported (Choi et al. 2013) using TALENs.
Selection-free gene correction for Parkinson's disease
Parkinson's disease is a late-onset neurodegenerative disorder that is primarily characterized by the loss of dopaminergic neurons. The genetic association is weak, as only 15% of patients are familial cases, and mutated genes are diverse. One such example is the mutation in the α-synuclein (SNCA) gene, and some mutations (A53T, E46K and A30P) are known to cause Parkinson's disease. Soldner et al. applied a donor template either as plasmid DNA or as single-strand oligo deoxynucleotide (ssODN) to correct the mutations without the use of a conventional selection cassette. Although the efficiency of ssODN-mediated gene correction is low (approximately 0.4%), this approach allows for the simple construction of a donor template and eliminates the step of removing the selection cassette (Soldner et al. 2011).
Chromosomal modification for treating Down syndrome
The ability of artificial nucleases to insert a foreign DNA sequence into a desired site allows for the manipulation of chromosomes (Fig. 5). For example, after the insertion of a negative selection marker, the HSV thymidine kinase (TK) gene, into a trisomic chromosome 21 using AAV vector-mediated knock-in, cells that have spontaneously lost the additional chromosome 21 can be obtained from iPSCs derived from Down syndrome patients by selecting against the TK expression (Li et al. 2012). Another potential approach of gene therapy for Down syndrome was demonstrated by the epigenetic suppression of one trisomic chromosome 21. In females, one of the two copies of the X chromosome is epigenetically suppressed in order to compensate for the gene dosage via the allele-specific expression of non-coding RNA called Xist. Interestingly, the insertion of the Xist gene into chromosome 21 using ZFNs induces the heterochromatinization of the targeted chromosome (Jiang et al. 2013a). Although this approach is still premature for use in the clinical setting, it demonstrates the feasibility of using gene editing technology to manipulate chromosomal abnormalities, such as Down syndrome.
Reducing immunity using HLA editing
The gene editing approach is potentially applicable for reducing immunogenic reactions. Human leukocyte antigen (HLA) genes are essential for forming the major histocompatibility complex (MHC) and determining the compatibility of donors for cell transplantation. In particular, MHC class I molecules, such as HLA-A, B and C, are ubiquitously expressed on various cell surfaces and therefore play a central role in distinguishing self-cells from non-self-cells. Importantly, the β2-microglobulin (B2M) gene is a common subunit for all HLA class I antigens and is essential for the surface expression. Therefore, the homozygous deletion of the B2M gene diminishes the surface presentation of class I HLA molecules and reduces immunogenicity. Riolobos et al. demonstrated the successful disruption of the B2M gene using AAV targeting in hESCs and showed a lack of the CD8+ T cell response, as measured according to the interferon-γ expression (Riolobos et al. 2013). A similar approach has also been demonstrated using TALENs (Lu et al. 2013). In knockout mouse studies, it has been reported that hematopoietic stem cells lacking the B2M gene are eliminated by natural killer cells. Therefore, this approach is not applicable for hematopoietic differentiation, although it may be applicable for solid organ transplantation to provide universal donor cells.
Challenges in applying engineered nucleases in the clinical setting
Immunogenicity of transduced proteins
The human immunological system is precise; therefore, any foreign protein can induce a host immunoreaction. Engineered nucleases are not an exception. For example, the DNA binding domain of TALEN consists of a TAL effector domain derived from Xanthomonas, and the DNA cleavage domain of FokI is derived from Flavobacterium okeanokoites. The CRISPR/Cas9 system is adapted from Streptococcus pyogenes or other bacteria. The ectopic expression of such protein components can induce host immunoreactions. Repeated exposure to engineered nucleases requires careful attention. Furthermore, the RNA component of the CRISPR system has the potential to induce innate immunity mediated by Toll-like receptors. In addition, the precisely corrected gene product itself can be a potential trigger of immunoreactions, especially when the patient lacks the correct gene product from birth, and such adverse effects may eliminate the gene-corrected cells. The use of immunosuppressive drugs to control such immune responses is likely worthwhile.
How to deliver the genes into the affected tissue/cell types?
In the gene therapy field, the process of achieving gene delivery into the desired cell type remains one of the biggest hurdles to overcome. In order to maximize therapeutic efficacy, engineered nucleases must be delivered effectively in less toxic and invasive ways. The delivery method varies depending on the target cell or tissue type and therapeutic strategy. For the in vitro gene therapy approach, general transfection methods, such as electroporation or lipofection are applicable. For the in vivo gene therapy approach, viral vectors are currently most effective. MNs (Rousseau et al. 2011; Popplewell et al. 2013), ZFNs (Lombardo et al. 2007) and likely CRISPR/Cas9 can be delivered using retroviral or lentiviral vectors; however, TALENs are too repetitive in their DNA sequences and there is a high risk of rearrangement during virus packaging (Holkers et al. 2013). The permanent integration of nucleases into the host genome raises a concern regarding increases in the risk of off-target mutagenesis. Depending on its serotype, AAV has very high infectivity for many tissues and is the preferred vector in current gene therapy trials. Indeed, ZFNs have been packaged in AAV vectors (Händel et al. 2012; Ellis et al. 2013), and in vivo gene correction using such vectors has been demonstrated (Li et al. 2011). However, the smaller packaging size limit (4.7 kb, including the promoter and other expression components) can hinder the viral production of large nucleases, such as TALEN (3–4 kb) and CRISPR/Cas9 (4 kb). Adenoviruses have a larger packaging size (maximum 30 kb) and exhibit reasonable infectivity in vivo, demonstrating the ability to deliver TALENs (Holkers et al. 2013), although they possess relatively high immunogenicity and cell toxicity, especially at high doses. For cell transduction of ex vivo gene therapy, electroporation- or lipofection-based DNA transfection methods are applicable. 
Depending on the transduction efficiency of the target cells, non-DNA methods, such as mRNA or protein transduction approaches, may be feasible for delivering engineered nucleases. It is important to determine which transfection method is preferable depending on the engineered nuclease and target cell type.
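The packaging constraint described above is simple arithmetic, and a minimal sketch can make it concrete. The capacities below are the figures quoted in the text; the function name and the 1.0 kb allowance for a promoter and other expression components are hypothetical values chosen only for illustration.

```python
# Approximate packaging limits in kb, as quoted in the text.
# Real limits depend on the vector design, so treat these as assumptions.
CAPACITY_KB = {"AAV": 4.7, "adenovirus": 30.0}


def fits_in_vector(vector: str, nuclease_kb: float, other_elements_kb: float) -> bool:
    """Return True if the nuclease coding sequence plus the promoter and
    other expression components stays within the vector's packaging limit."""
    return nuclease_kb + other_elements_kb <= CAPACITY_KB[vector]


# Hypothetical 1.0 kb allowance for promoter, polyA signal and so on.
print(fits_in_vector("AAV", 4.0, 1.0))         # Cas9-sized insert: False
print(fits_in_vector("adenovirus", 4.0, 1.0))  # same insert: True
```

With these assumed numbers, a 4 kb nuclease leaves AAV with only about 0.7 kb for all remaining expression elements, which is why the text treats the size limit as a practical obstacle.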
Side-effects of off-target mutagenesis
Currently, the largest concern regarding the clinical application of genome editing is safety with respect to off-target mutagenesis. Engineered nucleases can cause target-site dependent and independent cleavage, resulting in small indels. Such indels may damage tumor suppressor genes and confer a growth advantage that ultimately leads to tumorigenicity. The accidental disruption of a gene important for cellular activity is also a concern; however, such cells are likely to be eliminated by cellular homeostasis. Random integration of donor DNA is another major concern, as HDR-mediated targeting is not 100% efficient. In this regard, a negative-selection system or a clonal isolation method for selecting correctly targeted clones is required.
To date, several methods have been developed to assess the binding specificity of nucleases and to search for potential off-target sites. In order to determine the affinity and specificity of the DNA binding domain in vitro, the SELEX (systematic evolution of ligands by exponential enrichment) assay (Tuerk & Gold 1990) is often used. This assay uses the DNA binding domain of the nuclease under study to pull down DNA ligands from a randomized sequence pool, and the bound DNA fragments are sequenced to determine the preferred binding sequence in an in vitro context (Perez et al. 2008; Hockemeyer et al. 2009, 2011). This assay is particularly relevant for ZFNs, as the sequence specificity of ZFNs depends on the context of the zinc finger arrays. Alternatively, molecular modeling based on the protein structures of engineered nucleases allows protein-DNA interaction affinities to be calculated in order to estimate DNA binding profiles (Yanover & Bradley 2011).
Currently, the most popular off-target analysis for predicted sites uses mismatch-specific DNA nucleases, such as CEL-I nuclease (also known as SURVEYOR nuclease) or T7 endonuclease I (T7EI). Following the amplification of the candidate off-target sites, the dsDNA fragments are denatured and re-annealed so that the mutated and non-mutated DNA strands form a heteroduplex. The mismatch-specific DNA nucleases digest only the heteroduplex DNA fragments, producing smaller DNA fragments that are detected by agarose gel electrophoresis. Massively parallel sequencing technology is also used to detect mutations at candidate off-target sites (Meng et al. 2008; Rousseau et al. 2011; Jiang et al. 2013b; Mali et al. 2013b) and offers highly sensitive detection of mutagenesis information, including the type of indels, size distribution and junction sequence.
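The gel readout of a mismatch-cleavage assay is often converted into an approximate mutation frequency. The sketch below uses a commonly applied estimate that is not given in this review: it assumes that strands re-anneal at random, so that a mutant allele fraction p yields a cleavable heteroduplex fraction of 1 − (1 − p)², and it therefore inverts the measured cleaved fraction. The function name and the band intensities are hypothetical.

```python
import math


def estimate_indel_fraction(uncut: float, cut_a: float, cut_b: float) -> float:
    """Estimate the mutant allele fraction from gel band intensities.

    Assumes random re-annealing of strands, so that the cleaved fraction
    f_cut equals 1 - (1 - p)**2 for a mutant fraction p.
    """
    total = uncut + cut_a + cut_b
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    f_cut = (cut_a + cut_b) / total
    return 1.0 - math.sqrt(1.0 - f_cut)


# Hypothetical densitometry values: 64 units uncut, 18 + 18 units cleaved.
print(f"{estimate_indel_fraction(64, 18, 18):.2f}")  # 0.20
```

The estimate is only as good as the random re-annealing assumption and the linearity of the gel quantification, which is one reason sequencing-based measurements are considered more sensitive.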
Taking advantage of the fact that DNA double strand breaks enhance the integration or ligation of donor templates by NHEJ, sites cleaved by engineered nucleases in vivo can be marked by incorporating appropriate tags, such as AAV vectors (Petek et al. 2010), integration-deficient lentiviral vectors (IDLVs) (Gabriel et al. 2011; Osborn et al. 2013) or adapter sequences (Pattanayak et al. 2011). When testing nuclease specificity for clinical use, it is important to combine more than one method so that the methods compensate for each other's limitations.
In the case of ZFNs targeting CCR5 to prevent HIV infection, researchers performed a SELEX assay to determine the DNA sequence preferences of ZFN binding in vitro. The human genome was then searched for sequences matching these preferences, and the top 15 potential off-target sites were listed using bioinformatics. Ultra deep pyrosequencing (454 system) of these potential off-target sites identified lower-efficiency yet detectable mutagenesis in the CCR2 gene (Perez et al. 2008). CCR2 is a member of the same gene family as CCR5, with a high sequence similarity (85%), and is located just 10 kb upstream from CCR5 on chromosome 3. In fact, CCR2 has a potential ZFN binding site that differs by only 1 bp in each of the left and right ZFN binding sites. It has been reported that the CCR2-V64I mutation delays HIV-1 progression, although CCR2 is rarely used as a co-receptor of HIV-1 in vitro (Kostrikis et al. 1998). Further careful investigation is required in ongoing trials.
CRISPR/gRNA recognizes approximately 23 bp of the target site, which is shorter than the sequence recognized by TALENs. In addition, recent papers have demonstrated that the RNA-mediated CRISPR/Cas9 system possesses high tolerance to a few base pair mismatches towards the 5′ half of the target site and a high potential for off-target risk in human cell lines (Fu et al. 2013; Hsu et al. 2013). In fact, similar to ZFNs, CRISPR/gRNA targeting the CCR5 gene also targets the CCR2 gene (Cradick et al. 2013). It is worth noting that the fidelity of the DNA repair mechanism varies according to the cell type and that, in general, transformed cell lines and cancer cells exhibit more tolerance to DNA mutations. A comparison of off-target frequencies among different nucleases and different cell types is therefore required.
CRISPR/Cas9 normally induces double strand breaks; however, a single amino acid mutation (Asp10Ala or D10A) converts Cas9 into a “nickase” that cleaves only one strand of the double-stranded DNA (Jinek et al. 2012). Compared with that of double strand breaks, single strand break-mediated HDR efficiency is low, although recent reports have demonstrated that the combination of two nickases allows for the induction of offset double strand breaks with 5′ overhangs and the stimulation of template-mediated HDR events (Mali et al. 2013a; Ran et al. 2013). Because a single nickase has low mutagenic activity, the “double-nicking” approach can be used to reduce the risk of off-target mutagenesis.
We appreciate that, similar to the “star activity” of restriction enzymes, any nuclease has the potential to digest off-target sites. Now, the question is how to estimate and minimize the side-effects associated with off-target mutagenesis. Regardless of the type of engineered nuclease and gene editing approach, it is critically important to design and select unique target sequences in the genome, so that no other sites share a similar DNA sequence in order to reduce the risk of target-site dependent off-target mutagenesis.
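The idea of checking that no other site shares a similar sequence can be illustrated with a brute-force scan. This is a minimal sketch on a toy sequence, not a description of any published tool: it counts substitutions only (no indels), the function names are hypothetical, and a real genome-scale search would need an indexed aligner rather than a linear scan.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def near_matches(genome: str, target: str, max_mismatches: int):
    """List (position, strand, mismatches) for every window of the genome
    that differs from the target by at most max_mismatches substitutions.
    Both strands are scanned; positions refer to the forward strand.
    A palindromic target is reported once per strand."""
    hits = []
    n = len(target)
    for strand, query in (("+", target), ("-", reverse_complement(target))):
        for i in range(len(genome) - n + 1):
            window = genome[i:i + n]
            mismatches = sum(1 for a, b in zip(window, query) if a != b)
            if mismatches <= max_mismatches:
                hits.append((i, strand, mismatches))
    return sorted(hits)


# Toy sequence: an exact site, a one-mismatch site, and a reverse-strand site.
toy_genome = "CC" + "GATTACAGT" + "CC" + "GATTCCAGT" + "CC" + "ACTGTAATC" + "GG"
for hit in near_matches(toy_genome, "GATTACAGT", max_mismatches=1):
    print(hit)
```

A target would be considered a good candidate under this simplified criterion when the only hit within the chosen mismatch threshold is the intended site itself.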
Massively parallel sequencing technologies for detecting mutations
Target sequence-dependent off-target mutagenesis is relatively easy to evaluate compared with target sequence-independent mutagenesis. However, engineered nucleases may induce small indels randomly, regardless of their DNA binding sequences. Such off-target mutagenesis would be unpredictable, and the only way to fully address this issue is to conduct an unbiased whole genome analysis. Whole genome sequencing using massively parallel sequencers is now feasible for detecting small indels and nucleotide substitutions in fine detail; however, due to the high cost and complexity of data analyses, sequence capture of exon regions has been attempted more broadly (Yusa et al. 2011; Ding et al. 2013a; Ousterout et al. 2013). This is reasonable in that essentially all gene products are derived from exon sequences, although current exome capture kits cover only approximately 2% of the human genome (approximately 50 Mb in total). The choice of sequencing platform is also very important. Pyrosequencing-based platforms, such as the 454 GS FLX or 454 GS Junior system, can read relatively long (500–1000 bp) fragments; however, they are error-prone in homopolymeric tracts (such as poly A sequences) and tend to underestimate the homopolymer length (Loman et al. 2012). Sequencing-by-synthesis-based short read (150–250 bp) sequencers, such as the HiSeq 2500 or MiSeq, are more accurate in base calls, although detecting CNVs is challenging due to their short read length. In order to detect CNVs, a combination with other deep-sequencing analyses or karyotype analyses (such as G-banding, mFISH or CGH arrays) is required. A more recent single-molecule sequencing platform, PacBio RS, can read unprecedented lengths of up to 23 000 bp (median ~2000 bp); however, the current system suffers from a high error rate in base calls (15–18%). 
Regardless, it is still challenging to detect off-target mutagenesis induced by engineered nucleases, as engineered nucleases typically induce small indels that are difficult to distinguish from natural variations in the human genome.
Limitations in the design of the nuclease target site
Currently, due to the binding preferences of each nuclease, not all desired sequences can be targeted. For example, ZFNs have some design restrictions due to the DNA binding preference of the zinc finger domains. The N-terminal domain of TALENs recognizes a T residue; therefore, it is preferable for the TALEN target sequence to start with a “T” nucleotide. A recent structure-based protein evolution approach demonstrated that a modified TALE domain can recognize nucleotide bases other than “T,” thus suggesting further flexibility in designing target sites (Lamb et al. 2013). The CRISPR/Cas9 system requires a protospacer adjacent motif (PAM) sequence for sequence targeting, including “NGG” for S. pyogenes, “NNAGAAW” for S. thermophilus and “NNNNGATT” for Neisseria meningitidis (Hou et al. 2013). The PAM sequence can be a limitation in selecting the target site for the CRISPR/Cas9 system; however, using Cas proteins obtained from different species will expand the range of targetable sequences. Further improvements in engineered nucleases will allow researchers to design target sites more freely.
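The PAM constraint can be sketched as a simple sequence scan. The example below is an illustration under stated assumptions: it searches the forward strand only, assumes a 20-nt protospacer immediately 5′ of the PAM (consistent with the approximately 23 bp quoted above for “NGG”), and supports only the IUPAC codes that appear in the PAMs listed in the text. The function names are hypothetical.

```python
import re

# IUPAC codes needed for the PAMs quoted in the text (N = any base, W = A or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "W": "[AT]"}


def pam_to_regex(pam: str) -> str:
    """Translate a PAM written in IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in pam)


def find_target_sites(seq: str, pam: str = "NGG", spacer_len: int = 20):
    """Return (start, protospacer, pam) for each forward-strand site where a
    protospacer of spacer_len bases is immediately followed by the PAM.
    A lookahead is used so that overlapping sites are all reported."""
    pattern = re.compile(
        f"(?=([ACGT]{{{spacer_len}}})({pam_to_regex(pam)}))"
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence containing a single NGG site.
toy = "A" * 20 + "TGG" + "C" * 5
print(find_target_sites(toy))
```

Passing a different PAM string, for example `find_target_sites(seq, pam="NNAGAAW")`, shows how switching to a Cas protein from another species changes which positions are targetable; a complete search would also scan the reverse complement.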
Targeting multigenic disorders remains challenging; however, the use of simultaneous gene editing becomes more realistic with the CRISPR/Cas9 system. Rudolf Jaenisch's group demonstrated that up to five genes can be disrupted in murine ES cells (Wang et al. 2013).
In addition, most engineered nucleases are targeted to the nucleus through the incorporation of nuclear localization signals. Therefore, in order to target mitochondrial DNA (mtDNA), the localization signal must be replaced accordingly. Moreover, each cell contains several hundred mtDNA copies, only a fraction of which carry the disease-causing mutation. Therefore, the disease-targeting nuclease must be specific for the disease-causing mutation. Bacman et al. designed the target site of TALENs to match the mutated sequence only and demonstrated the loss of mutated mtDNA (Bacman et al. 2013).
The ethical issues associated with engineered nucleases are similar to those associated with gene therapy. The risks of gene editing include tumorigenicity, undesired integration of nucleases or donor templates and the germline transmission of the modified genome. Considering the patients who suffer from incurable and intractable diseases, and taking into account the evidence that engineered nucleases can restore mutated genes in basic research settings, examining the cost-benefit ratio could lead to the further development of techniques.
The technology of gene editing using engineered nucleases is evolving rapidly. Although several technical challenges and uncertainties remain, the promise of using engineered nucleases to edit the human genome is tremendous. Further advances in understanding and improvements in technology will open the next era of gene therapy.
A.H. is supported by JST, PRESTO. H.L.L. is a recipient of a JSPS Research Fellowship for Young Scientists.