Optimization of CRISPR–Cas system for clinical cancer therapy

Abstract Cancer is a genetic disease caused by alterations in the genome and epigenome and is one of the leading causes of death worldwide. The exploration of disease development and therapeutic strategies at the genetic level has become key to the treatment of cancer and other genetic diseases. However, the functional analysis of genes and mutations has been slow and laborious. Therefore, there is an urgent need for alternative approaches to improve the current status of cancer research. Gene editing technologies, in particular clustered regularly interspaced short palindromic repeats (CRISPR)–Cas systems, provide technical support for efficient gene disruption and modification in vivo and in vitro. Currently, the applications of CRISPR–Cas systems in cancer rely on different Cas effector proteins and the design of guide RNAs. Furthermore, effective vector delivery must be achieved before the CRISPR–Cas systems can enter human clinical trials. In this review article, we describe the mechanism of the CRISPR–Cas systems and highlight the applications of class II Cas effector proteins. We also propose a synthetic biology approach to modify the CRISPR–Cas systems, and summarize various delivery approaches facilitating the clinical application of the CRISPR–Cas systems. By modifying the CRISPR–Cas system and optimizing its in vivo delivery, promising and effective treatments for cancers using the CRISPR–Cas system are emerging.

targets. 10 Importantly, the advent of multiple toolkits further extended the application of the CRISPR-Cas systems in cancer.
In this review, we conducted an in-depth analysis of the mechanism of action of the CRISPR-Cas systems, and the research progress of newly emerging toolboxes to modify and deliver CRISPR systems.
We also highlighted the potential added effects of incorporating synthetic biology into the CRISPR-Cas systems for cancer treatment. We further explored the delivery issues currently faced by the CRISPR-Cas system for clinical studies, with the hope of further refining the CRISPR-Cas systems to facilitate clinical application.

| THE COMPONENTS AND MECHANISMS OF CRISPR-Cas SYSTEMS
The CRISPR-Cas systems, consisting of CRISPR arrays and highly diverse Cas genes, are adaptive immune systems that bacteria and archaea have evolved against invading phages and foreign plasmid DNA 11,12 (Figure 1). Structurally, CRISPR arrays contain a leader (adjoining the first repeat of the CRISPR locus and considered the promoter of the CRISPR array), short direct repeats (forming hairpin structures that stabilize the secondary structure of the RNA), and nonrepetitive spacers (captured exogenous DNA sequences) 13,14 (Figure 2a). These arrays can be transcribed and processed into CRISPR RNAs (crRNAs), which direct Cas nucleases to cleave complementary exogenous DNA sequences. 15 At least 45 natural Cas proteins have been identified in different bacteria; for example, the well-known Streptococcus thermophilus (St1) has three Cas genes: Cas9, Cas1, and Cas2. 16 The domain organization of SpCas9 consists of a NUC lobe and a REC lobe (Figure 2b). Depending on the architecture of the CRISPR array and the signature interference effector, CRISPR-Cas systems can be classified into two classes [containing six types (I-VI) and 33 subtypes]. 17,18 Class I systems (including types I, III, and IV) encompass multisubunit Cas effector proteins, which bind to crRNA and generate target interference. In contrast, class II systems (including types II, V, and VI) require only a single, multidomain large Cas effector protein to form a complex with crRNA in the interference process. 19 Accordingly, the class II systems represented by CRISPR-Cas9 require only one Cas effector protein to perform cleavage, 20 while the class I systems demand multiple Cas effector proteins, limiting their applications. 21 Hence, class II systems show tremendous promise for genome engineering through cleavage of target DNA and RNA.

FIGURE 2 The structure of the Class II CRISPR-Cas systems and gene editing mechanism (with CRISPR-Cas9 as an example). (a) Typical structure of a CRISPR locus. The CRISPR gene sequence is mainly composed of the leader, repeats, and spacers. The leader sequence is located upstream of the CRISPR gene and is considered the promoter of the CRISPR sequence. The repeats are about 20-50 bp long and their transcription products can form hairpin structures. The spacers are exogenous DNA sequences that are captured by the bacteria. (b) The domain organization of SpCas9 consists of a NUC lobe and a REC lobe. BH, Bridge helix. (c) Schematic representation of the sgRNA:target DNA complex. Artificially designed sgRNAs carrying the target sequence function as crRNA-tracrRNA complexes, which can direct Cas9 proteins to specifically cleave target genes. (d) Schematic representation of representative Cas proteins from different families (shown are Cas9, Cas12a, and Cas13a). In CRISPR-Cas9, the sgRNA-encoded spacer binds to the target dsDNA near the PAM. Base pairing activates the HNH and RuvC nuclease domains, which cleave the two strands. In CRISPR-Cas12a, the crRNA-encoded spacer binds to the target DNA and activates the RuvC nuclease, cleaving both strands with multiple-turnover general ssDNase activity (arrow). In CRISPR-Cas13a, the target sequence is RNA. Correct base pairing activates the general ssRNase activity of the HEPN nuclease (arrow). (e) Genome editing using CRISPR-Cas9. The Cas9 nuclease binds to the sgRNA, which in turn is directed to the target DNA by complementary base pairing. A PAM sequence (NGG, NAG) must be present adjacent to the target sequence. The subsequent cleavage of double-stranded DNA (dsDNA) triggers the error-prone nonhomologous end joining (NHEJ) or homology-directed repair (HDR) mechanism.
Simply put, bacteria and archaea store a small segment of a viral genome (named a spacer) in the CRISPR array when they are first invaded by a virus. When the same virus invades again, the bacteria recognize the virus based on the spacer and disable it by cutting its DNA. The specific process involves three major steps (using CRISPR-Cas9 as an example): adaptation, expression, and interference. The adaptation stage is spacer acquisition, which forms a memory of previous infections and is what makes CRISPR-Cas immunization adaptive and heritable. Spacer acquisition relies on Cas1 and Cas2, which are present in almost all CRISPR-Cas systems. 18 Cas1 is catalytic and Cas2 has a structural function. 22 Adaptation mechanisms show a preference for foreign DNA over self-DNA, which is key to avoiding autoimmunity. For example, the RecBCD repair complex present in Escherichia coli is able to degrade larger portions of the foreign genome and serves as the basis for preferential access to nonself DNA. 23 Once the exogenous DNA is injected into the host, the proteins encoded by Cas1 and Cas2 recognize the protospacer adjacent motif (PAM) in the exogenous DNA sequence and take the DNA sequence adjacent to the PAM as a protospacer. 24 Next, the Cas1/2 protein complex excises the protospacer from the exogenous DNA to form a spacer, which is inserted between two repeats at the 5′-end of the CRISPR array with the assistance of enzymes. 25,26 At the expression stage, the CRISPR array (repeats and spacers) is transcribed to generate pre-crRNA (crRNA precursor) and tracrRNA. 27 The effector complex composed of Cas9 and the tracrRNA:crRNA duplex exerts interference after a second cleavage by an unknown RNase, which removes the 5′ repeat-derived tag. 28 The spacer is in a free state. This complex monitors exogenous DNA sequences at all times. 29
During the interference stage, the Cas9 effector protein is already bound to the guide RNA prior to target selection and cleavage, thus participating in crRNA maturation. The spacer recognizes the complementary sequence in the exogenous DNA. The entire complex is localized to a specific PAM, and the DNA double strand is then unwound. The spacer sequence hybridizes to the complementary strand, while the other strand remains free. Subsequently, Cas9 protein is localized to the correct PAM sequence. 30 Base pairing of crRNA with the target strand induces an R-loop structure that eventually triggers cleavage of the target and nontarget strands by the HNH and RuvC domains of Cas9 protein, respectively, resulting in blunt-end cleavage three nucleotides upstream of the PAM. 28 This eventually results in DNA double-strand breaks (DSBs), silencing of exogenous DNA expression, and successful immunization. 31 Thus, interference protects hosts from invasion by exogenous genomes and also provides an opportunity for gene editing, since the typical CRISPR-Cas9 system introduces a break in double-stranded DNA.
It can be seen that the CRISPR array is used to identify exogenous DNA sequences, while Cas9 protein acts as scissors for cleavage. The tracrRNA-crRNA duplex serves as the navigator of the system and guides Cas9 for precise targeting. Importantly, in 2012, Jinek et al. 32 designed a single guide RNA (sgRNA) that replaced the crRNA-tracrRNA complex and can direct the Cas9 protein to specifically cleave the target gene (Figure 2c). This success confirmed the feasibility of artificially designing sgRNAs against target sequences by means of synthetic biology, thus enabling gene repair and the modeling of mutations, knock-outs, knock-ins, and fusions. 33 In recent years in particular, the rise of multiple Cas proteins has once again brought the advantages of CRISPR-Cas systems to the forefront with remarkable achievements.
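As an illustration of the targeting rule described above (a spacer-matched sequence followed by an NGG PAM, with blunt cleavage three nucleotides upstream of the PAM), the following minimal Python sketch scans one DNA strand for candidate SpCas9 sites. The function name `find_spcas9_sites`, the 20-nt spacer length, and the demo sequence are assumptions made for illustration; the sketch ignores the reverse strand, off-target scoring, and all other design criteria.

```python
import re


def find_spcas9_sites(seq, spacer_len=20):
    """Return (protospacer, pam, cut_index) for every NGG PAM on the given strand.

    cut_index marks a blunt cut between seq[cut_index - 1] and seq[cut_index],
    i.e. three nucleotides upstream of the PAM.
    """
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping PAM candidates are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough upstream sequence for a full protospacer
        protospacer = seq[pam_start - spacer_len:pam_start]
        sites.append((protospacer, match.group(1), pam_start - 3))
    return sites


if __name__ == "__main__":
    # Hypothetical demo sequence, not taken from any real gene.
    for site in find_spcas9_sites("GATTACAGATTACAGATTACATGGCC"):
        print(site)
```

In practice a guide design tool would also scan the reverse complement and filter candidates for predicted off-target sites; this sketch only shows the PAM-anchored geometry.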

| COMMONLY USED CAS PROTEINS AND THEIR NOVEL DERIVATIVES
Although many CRISPR-Cas systems have been identified, only a few of them have been used as research tools. Among them, class II systems relying on a single-effector Cas protein are widely used for gene editing in mammals. Specifically, the representative effector proteins in class II are Cas9 (type II), Cas12 (type V), and Cas13 (type VI) 34-36 (Figure 2d). The characteristics of the different Cas proteins are summarized in Table 1.

| Cas9
Cas9, the most widely used effector protein, has two structural domains with cleavage activity: the HNH domain (responsible for cleaving the DNA strand complementary to the crRNA) and the RuvC domain (responsible for cleaving the noncomplementary DNA strand). 37 The commonly used Cas9 protein is derived from Streptococcus pyogenes (SpCas9), whose PAM sequence is NGG (N is any nucleotide). To further extend the diversity of PAM sequences, over 10 different Cas proteins have been identified in the last few years.
For example, the smallest Cas9 nuclease is from Campylobacter jejuni (CjCas9), which has only 984 amino acids and a PAM sequence of NNNNNACAC. The small size of CjCas9 allows for good intracellular delivery, but its targeting range and flexibility are relatively limited. 38 Cas9 nuclease cleaves DNA and produces genome editing effects via nonhomologous end joining (NHEJ) or homology-directed repair  39 This form of mutation produces single-stranded gaps rather than double-stranded breaks, and allows gene editing via the HDR pathway. If double-stranded DNA needs to be cleaved, two gRNAs are designed on opposite DNA strands and in close proximity (sequences no more than 20 bp apart), thus effectively introducing a DSB. Ultimately, on-target stringency can be increased while off-target mutations are minimized with Cas9n. 40 Further, by mutating both nuclease active sites of Cas9, a dead Cas9 (dCas9) that retains sgRNA-guided target recognition but has no nuclease activity was generated. 41 dCas9 is mainly fused with transcriptional regulatory elements or chromatin modification elements to build new tools for transcriptional regulation and epigenetic modification, such as CRISPRa (transcriptional activation), CRISPRi (transcriptional interference), and CRISPRoff (controlling gene expression with high specificity while leaving the DNA unchanged). 42-44 For example, dCas9 promotes or represses the transcription of target genes when fused to activation domains (VP16, VP64, NF-κB) or repression domains (KRAB, Mxi1).
To further overcome the targeting limitations of the PAM, Walton et al. 45 designed new Cas9 variants, named SpG and SpRY, that bind and cleave DNA with relaxed or nearly absent PAM requirements and are capable of targeting the majority of the human genome with single-base-pair precision. It is thus clear that the off-target and PAM-restriction shortcomings of the CRISPR-Cas9 system can be mitigated, leaving efficient delivery as the remaining obstacle for in vivo application of CRISPR-Cas9 systems, which will be addressed in a subsequent section. Overall, an optimized CRISPR-Cas9 system can be applied to a wider range of fields, including gene therapy for cancer.
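The paired-nickase rule mentioned above (two gRNAs on opposite strands, no more than 20 bp apart) can be expressed as a simple filter over candidate guide sites. The sketch below is illustrative only: the `GuideSite` record, its coordinate convention (0-based, end-exclusive), and the example coordinates are assumptions, and real designs would also consider guide orientation relative to the PAM and off-target scores.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class GuideSite:
    """Hypothetical record for one candidate gRNA binding site."""
    name: str
    strand: str  # "+" or "-"
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive


def gap_between(a, b):
    """Number of bases separating two sites (0 if they touch or overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def nickase_pairs(sites, max_gap=20):
    """Return name pairs lying on opposite strands and at most max_gap bp apart."""
    return [
        (a.name, b.name)
        for a, b in combinations(sites, 2)
        if a.strand != b.strand and gap_between(a, b) <= max_gap
    ]


if __name__ == "__main__":
    candidates = [
        GuideSite("g1", "+", 100, 120),
        GuideSite("g2", "-", 130, 150),
        GuideSite("g3", "+", 160, 180),
        GuideSite("g4", "-", 205, 225),
    ]
    print(nickase_pairs(candidates))
```

Here `max_gap=20` simply mirrors the 20 bp figure quoted in the text; it is exposed as a parameter because the tolerated spacing is a design choice rather than a fixed constant.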

| Cas12
Unlike Cas9, Cas12 nuclease contains only a RuvC-like domain that cleaves both strands to induce a DSB. 46 Because it possesses both RNase and DNase activity, the Cas12 nuclease relies on a single crRNA guide for DNA localization and cleaves at the PAM-distal end to produce 5-nt sticky ends, in contrast to Cas9, which normally cleaves near the PAM to produce blunt ends. 47 The widely used Cas12a proteins (also known as Cpf1) are from Acidaminococcus spp. (AsCas12a) and Lachnospiraceae spp. (LbCas12a), with a small size of 1200 to 1300 amino acids. 48 On the one hand, unlike the G-rich PAMs required for Cas9, Cas12a can recognize T-rich PAMs, thus further increasing the number of potential target sites. On the other hand, Cas12a can follow its own cleavage pattern and PAM sequences to generate staggered ends, facilitating precisely targeted integration of DNA. However, the restricted recognition of PAM (5′-TTTN-3′) by AsCas12a and LbCas12a limits their application in the field of gene editing. 49 An enhanced AsCas12a variant (enAsCas12a) has been designed to improve genome editing activity. Meanwhile, the targeting range of Cas12a has been greatly expanded by newly engineered AsCas12a variants that recognize the PAMs 5′-TYCV and 5′-TATV, or the PAMs 5′-VTTV, 5′-TTTT, 5′-TTCN, and 5′-TATV. 50,51 Another reported Cas12a, of Francisella novicida origin (FnCas12a), has a PAM sequence of 5′-KYTV-3′ (K is G or T; Y is C or T; V is A, C, or G) and possesses DNA cleavage activity in human cells at multiple loci. 52 The extended PAM sequences enlarge the selectable regions of target sites and enrich the applications of Cas12a in gene editing.
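The degenerate PAM notation used above (N, K, Y, V) follows the standard IUPAC nucleotide codes, so checking whether a four-base site satisfies a given PAM reduces to pattern matching. The following sketch assumes a small lookup of the PAMs quoted in the text; the dictionary name `PAMS`, the helper names, and the choice of which entries to include are illustrative rather than a complete catalogue.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# PAMs as quoted in the text above (illustrative subset).
PAMS = {
    "AsCas12a/LbCas12a": ["TTTN"],
    "FnCas12a": ["KYTV"],
}


def matches_pam(pam, site):
    """True if the concrete sequence `site` satisfies the degenerate `pam`."""
    pattern = "".join(IUPAC[base] for base in pam.upper())
    return re.fullmatch(pattern, site.upper()) is not None


def compatible_nucleases(site):
    """Names of the listed nucleases whose PAM accepts `site`."""
    return [name for name, pams in PAMS.items()
            if any(matches_pam(p, site) for p in pams)]


if __name__ == "__main__":
    for candidate in ("TTTA", "GCTC", "TTTT"):
        print(candidate, compatible_nucleases(candidate))
```

Engineered variants could be added as further dictionary entries (for example with the TYCV or TATV PAMs mentioned above) without changing the matching logic.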
Cas12a is not only flexible but also shows a high degree of specificity. Kim et al. 53 used Digenome-seq to analyze the whole genome after the action of different gene-editing enzymes. It was found that for the same crRNA, LbCas12a and AsCas12a had 6 and 12 off-target sites, respectively, which were far fewer than those caused by Cas9 (>90 sites). Moreover, Kleinstiver et al. 54  There is very limited information for Cas13c system.
Unlike Cas9, Cas13 proteins do not require PAM sequences to identify their targets, but they do depend on a protospacer flanking site (PFS); that is, the base flanking the protospacer sequence should be A, C, or U. 70-72 Notably, in eukaryotic and prokaryotic cells, Cas13 is activated after recognition and cleavage of RNA, but also has "collateral cleavage" RNase activity, which can cleave adjacent single-stranded RNA and cause cell dormancy or programmed cell death. 70
in a way that rewires naturally occurring biological circuits (either genes or proteins) to achieve the desired logical forms of cellular control. 79 In the last decade, synthetic biology has begun to develop rapidly, with a series of pioneering milestones such as the smallest artificial synthetic cell (named JCVI-syn3.0), 80 and artificial cells that can grow and divide normally. 81 Gradually, we have become able to "read," "write," and "compile" the genome, and to design and synthesize life. Today, biological design that applies engineering principles such as standardization, 82 modularity, 83 digital logic, 84 and mathematically predictable behavior 85 has become central to synthetic biology.
Notably, synthetic biology is developing rapidly in the field of medicine and is bound to have a dramatic impact on the medical field. 86 For example, Williams et al. 87

| Genetic circuits in cancer
The robust and precise switching on and off one or more genes of interest is essential for many biological circuits as well as for industrial applications. Synthetic biology aims at designing modular genetic circuits. At present, CRISPR-based genetic circuits include logic gates, cascades, bistable switches, and temporal and spatial pattern generators, among which logic gates are the most widely explored 90 ( Figure 3a).
Initially, CRISPR systems were used for the design and application of individual logic gates. The dCas9-Mxi1-based NOR gate designed in Saccharomyces cerevisiae enables direct conversion of gRNA inputs into gRNA outputs, allowing the gates to be "wired" together. It implements arbitrary internal logic for a variety of synthetic cellular decision-making systems and shows minimal transcriptional leak and digital responses, forming the basis for large, synthetic, cellular decision-making systems. 91  existed. Subsequently, the G4/ThT complex was formed after the Cas9n/sgRNA complex recognized the two cleavage sites. The fluorescent signal was activated and the process was treated as "ON". The input cases (1,0), (0,1), or (0,0) generated insufficient fluorescent signal, and the output was considered "OFF." 96 Furthermore, the human telomerase reverse transcriptase (hTERT) promoter is considered to be a cancer-specific promoter, 97 while the human uroplakin II gene (hUP II) promoter is a bladder-specific promoter. 98 Liu et al. 99  With the advantages of high programmability, modularity, and orthogonality, CRISPR-based genetic circuits will enable us to construct more complicated and sophisticated synthetic circuits, making them more powerful therapeutics for cancer therapy and other medical treatments.
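The idea that NOR gates can be "wired" together to implement arbitrary logic, and that an AND gate is "ON" only for the input case (1,1), can be made concrete with a purely Boolean abstraction. The sketch below is not a model of any published circuit: it treats each gate as an ideal digital element, and the function names are hypothetical. It only shows that an AND gate can be composed from NOR gates alone.

```python
from itertools import product


def nor(a, b):
    """Ideal NOR gate: output is 1 only when both inputs are 0."""
    return int(not (a or b))


def not_from_nor(a):
    """NOT built from a single NOR gate with both inputs tied together."""
    return nor(a, a)


def and_from_nor(a, b):
    """AND built by wiring three NOR gates: NOR(NOT a, NOT b)."""
    return nor(not_from_nor(a), not_from_nor(b))


def truth_table(gate):
    """Map every two-bit input case to the gate output."""
    return {(a, b): gate(a, b) for a, b in product((0, 1), repeat=2)}


if __name__ == "__main__":
    for inputs, output in truth_table(and_from_nor).items():
        print(inputs, "ON" if output else "OFF")
```

In a biological setting the two inputs might correspond to the activity of two promoters, but real circuits show leak and graded responses that this digital abstraction deliberately ignores.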

| Optogenetic devices in cancer
CRISPR system-mediated gene editing has been well established as a powerful tool for in vitro and in vivo gene regulation, but has not yet been able to achieve such regulation in a spatially and temporally controlled manner.
Optogenetics offers an unprecedented ability to achieve precise spatial and temporal control of cellular activity using light of appropriate intensity and wavelength as a trigger signal. 101

| Cellular barcoding for cancer therapy
Cellular barcoding tags individual cells with unique nucleic acid sequences so that they can be tracked through space and time. At present, cellular barcoding has been widely adopted for fate mapping, lineage tracing, and high-throughput screening, and has greatly contributed to the understanding of developmental biology and gene function. 114 Alejo Rodriguez Fraticelli et al. 115  DNA barcoding can label enormous numbers of cells, but can only provide volumetric resolution and does not yield high-precision phenotypic and somatic cell resolution. 126 To address these issues, a barcoding system that operates at the protein level has been reported.   Each form of delivery has advantages and disadvantages. Specific details can be found in Table 2. Regardless of the delivery form, the largest challenge lies in delivering the cargo across the membrane. A variety of viral, physical, and chemical methods have been developed to achieve successful delivery across the cell membrane.

| CURRENT DELIVERY SYSTEMS OF CRISPR-Cas
The multiple forms and derivatives of the CRISPR-Cas systems as dis- (iii) Delivering plasmid DNA encoding Cas9 and sgRNA, such as pX330, pX458, and pX459. 142,143 The advantages and disadvantages of these three approaches are shown in Table 2. The CRISPR-Cas systems for targeting DNA must enter the nucleus of the target cell in order to have a therapeutic effect. To this end, a number of delivery systems have been developed for the CRISPR-Cas9 system. Depending on whether viral transduction is used, CRISPR-Cas9 delivery strategies can be broadly classified as viral or nonviral approaches; the latter also include a variety of physical and chemical delivery strategies (Table 3).  LV vector simultaneously. 159 LVs have the ability to transduce nondividing cells efficiently compared to other viral vectors, making them an instrumental tool for gene therapies in somatic and germline cells. 160 Since all viral genes are deleted, the LV vector does not activate the immune system. 161 Furthermore, the production of LVs is simpler than that of AAVs and AVs. 162 Taking advantage of these features, Holmgaard et al. 163

| Nonviral delivery
Several nonviral vectors have been developed and successfully applied for CRISPR-Cas delivery (as DNA, mRNA, or protein), such as physical methods to disrupt the cellular barriers, chemical modifications to improve cargo transport and avoid the barriers, and physical encapsulation of the cargo in the vector molecule. 170 The main advantages of nonviral vectors are their ability to accommodate large cargo sizes, ease of generation, good controllability, and safety.

| Physical delivery
Delivery of CRISPR-Cas is achieved by exposing cells to mild physical conditions that temporarily disrupt the physical barriers that prevent the cargo from reaching its intended destination. Microinjection is a physical method of injecting Cas9 and sgRNAs directly into cells using a microscope and a microinjection needle. 171 This method has been used for DNA, RNA, or RNP delivery of Cas9, as well as for direct delivery of gRNAs. 171-173 However, the method relies heavily on well-established experimental facilities and delicate handling to avoid permanent damage to the membrane, and is therefore commonly used for in vitro cell experiments. Electroporation is the temporary disturbance of the lipid bilayer of the plasma membrane by an electric field, thereby enhancing the permeability of the cell membrane. 174 The method has been successful in delivering DNA, RNA, and even RNPs in vitro. 139,175,176 However, electroporation-mediated gene editing is costly and requires specific induction conditions to be set for different cell types. Importantly, the high level of cell death caused by electroporation also limits its clinical application. 177 Both physical methods have been widely used in vitro, but only a few strategies have been used for in vivo delivery of the CRISPR-Cas system, including hydrodynamic injection. 178 Hydrodynamic injection enables delivery of CRISPR-Cas by creating temporary pores in the cell membrane at high pressure for a short period of time. 179  Cell penetrating peptides (CPPs) are short peptides with the ability to cross cell membranes. CPPs are suitable for preclinical and clinical studies due to their low cytotoxicity compared to other vectors and their eventual degradation to amino acids. 185 They are currently used as tools to achieve efficient Cas9 protein and sgRNA delivery. 195,196 Ramakrishna et al. 197

| Other emerging nonviral delivery
Graphene oxide (GO) and black phosphorus (BP) nanosheets can also be used for CRISPR-Cas9 delivery. GO has good bio-compatibility and safety, and is capable of loading Cas9/sgRNA RNP after modification with PEG and PEI. GO-PEG-PEI can rapidly transfer RNP into human cells while retaining Cas9 activity, with a gene editing efficiency of 39%. 205 Besides, BPs, also known as isotopes of elemental However, cancer cells have evolved an "immune escape" mechanism to survive. Cancer immunotherapy (also called "immuno-oncology") is used to enhance the anti-cancer ability of immune cells so that cancer cells cannot complete immune escape. 225 (Table 4). Collectively, the CRISPR-Cas system represents a powerful and promising tool in cancer modeling, diagnosis and treatment.
However, some challenges associated with CRISPR technologies remain to be overcome before moving to clinical applications, including adaptability, editing efficiency, delivery methods, off-target effects, and potential on-target mutagenesis. 232,233 For example, the p53 gene is known as the "guardian of the genome". DNA double-strand breaks can be recognized by p53, which in turn prevents cell division and corrects the error, thus affecting the editing efficiency of CRISPR systems. It has been shown that CRISPR-Cas9 gene editing can activate the p53-mediated DNA damage response, which increases potential risks. 234 These issues must therefore be addressed if CRISPR systems are to be used to precisely target cancer-related genes in human patients. Excitingly, the advent of extended toolkits and synthetic biology offers a solution to these issues. By artificially controlling various components, the behavior of molecules can be precisely regulated, allowing us to efficiently circumvent the shortcomings of CRISPR systems during cancer diagnosis and treatment. For instance, Huang et al. 235 applied logic circuits and optogenetic devices to the split-dCas9 system in an attempt to inhibit bladder cancer progression. The system used AND logic gates with the hTERT and hUP II promoters as activation keys. Using a similar strategy, the expression of p53 or E-cadherin protein could be activated by induction with blue light, thereby inhibiting tumor cell proliferation. Altogether, the combined use of one or more synthetic biology strategies offers a potential therapeutic intervention for cancer treatment.

| CONCLUSIONS
The CRISPR-Cas system is considered a powerful tool for cancer treatment due to its robust gene editing capabilities.

FUNDING INFORMATION
The present work was supported by the Natural Science Foundation