Expanding the CRISPR toolbox for Chinese hamster ovary cells with comprehensive tools for Mad7 genome editing

The production of high‐value biopharmaceuticals is dominated by mammalian production cells, particularly Chinese hamster ovary (CHO) cells, which have been widely used and preferred in manufacturing processes. The discovery of CRISPR‐Cas9 significantly accelerated cell line engineering advances, allowing for production yield and quality improvements. Since then, several other CRISPR systems have become appealing genome editing tools, such as the Cas12a nucleases, which provide broad editing capabilities while utilizing short guide RNAs (gRNAs) that reduce the complexity of the editing systems. One of these is the Mad7 nuclease, which has been shown to efficiently convey targeted gene disruption and insertions in several different organisms. In this study, we demonstrate that Mad7 can generate indels for gene knockout of host cell proteins in CHO cells. We found that the efficiency of Mad7 depends on the addition of protein nuclear localization signals and the gRNAs employed for genome targeting. Moreover, we provide computational tools to design Mad7 gRNAs against any genome of choice and for automated indel detection analysis from next‐generation sequencing data. In summary, this paper establishes the application of Mad7 in CHO cells, thereby improving the CRISPR toolbox versatility for research and cell line engineering.

producing clones are attained through random integration of transgenes; however, this approach provides no control over the transgene integration sites. As a result, the arbitrary nature of integration sites and their variable accessibility for gene expression may lead to clonal instability and loss of productivity (Bandyopadhyay et al., 2019;Vcelar et al., 2018;Wilson et al., 1990;Wurm, 2004).
Furthermore, random transgene integration requires labor-intensive and time-consuming screening to select the best-performing clones (Noh et al., 2013). Similarly elaborate workflows were necessary for gene knockouts, where many early accomplishments required preliminary protein engineering of endonucleases, such as zinc-finger nucleases or transcription activator-like effector nucleases (Fan et al., 2012; Sakuma et al., 2015).
Significant advances were made in cell line engineering with the introduction of the CRISPR-Cas9 technology, which enabled easy genome manipulation, including targeted gene integration and the knockout of genes (Adli, 2018; Grav et al., 2015, 2017; Pickar-Oliver & Gersbach, 2019). Targeted integration of transgenes reduced the risk of gene disruption caused by random integration and allowed for predictable and steady gene expression with minimal clonal variation (Grav et al., 2018; Lee et al., 2019; Pristovšek et al., 2019). Similarly, the convenience and simplicity of gene editing have led to the development of multiplex knockouts of disadvantageous genes (Kol et al., 2020). In this way, the CRISPR tool has been used to improve various parameters of CHO cell bioprocessing, such as viability, protein folding, and secretion (Fischer & Otte, 2019; Hansen et al., 2017; Schweickert & Cheng, 2020; Shin et al., 2021). It has also facilitated the control and manipulation of protein glycosylation as well as the elimination of interfering host cell proteins (HCPs) (Amann et al., 2019; Kol et al., 2020).
Mad7, also known as ErCas12a, is a class II type V-A Cas12a nuclease engineered from and named after the Madagascar-isolated Eubacterium rectale (R. M. Liu et al., 2019;Wierson et al., 2019). It consists of 1262 amino acids organized in a 147.9 kDa monomeric polypeptide, which due to the extensive modification only has 76% nucleotide homology with the native variant (Z. . Notably, the Mad7 nuclease shares just 31% amino acid homology with the canonical Acidaminococcus Cas12a (also known as AsCpf1), making it more distantly related to Cas9 than the canonical Cas12a (Wierson et al., 2019).
Like other Cas12a proteins, Mad7 favors T-rich protospacer adjacent motif (PAM) sites and needs only CRISPR RNA (crRNA) to bind a DNA target site, where it generates a staggered double-strand break (DSB) (Lin et al., 2021; Price et al., 2020). These molecular characteristics differ from those of the Cas9 protein, allowing the different nuclease types to be used in complementary and distinct ways (Swarts & Jinek, 2018). Mad7 identifies 5′-YTTN-3′ PAM sites, making it suitable for genome editing in circumstances where Cas9 utilization is challenging due to low guanine-cytosine content (Lin et al., 2021; Price et al., 2020). Because Mad7 requires only the crRNA element, no trans-activating CRISPR RNA (tracrRNA) component is necessary to mature and combine the crRNA with the nuclease (Bayarsaikhan et al., 2021; Zetsche et al., 2015). For this reason, the guide RNA (gRNA) sequences of the Mad7 system are substantially shorter than those of Cas9 (~100 nt): programming Mad7 for genome editing applications requires gRNA sequences of just 42 nucleotides, making the synthesis of gRNAs for ribonucleoprotein (RNP) delivery more cost-efficient (Tang & Fu, 2018). Mad7 crRNAs are designed with 5′ direct repeats of either 21 or 35 nucleotides followed by a 21-nucleotide protospacer region at the 3′ end that targets the desired genomic site for DNA cleavage (Price et al., 2020). When delivered via plasmids, gRNA expression is driven from a type 3 RNA polymerase III promoter, typically the U6 promoter (Gao et al., 2017; Ma et al., 2014). For high expression, gRNAs expressed from the U6 promoter require a guanine residue (G) in the +1 position (Kulcsár et al., 2017). Because the Mad7 crRNA leads with G in the 5′ direct repeat, the subsequent spacer element can be designed freely. In contrast, when expressing Cas9 gRNAs from U6 plasmids, it is recommended to choose spacer regions with an initial G or to incorporate one by adding or substituting the 5′ nucleotide (Grav et al., 2017; Mullally et al., 2020).
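The length arithmetic and the U6 +1 G rule described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, sequences are written in the DNA alphabet as they would be encoded on a plasmid, and the direct repeat is supplied by the user because the actual Mad7 repeat sequence is not given in this text (the value used in the usage note is a placeholder, not the real repeat).

```python
def mad7_crrna(direct_repeat, spacer):
    """Join a 5' direct repeat and a 3' spacer into one Mad7 crRNA template.

    The lengths follow the text: a 21 or 35 nt direct repeat plus a 21 nt spacer.
    """
    if len(direct_repeat) not in (21, 35):
        raise ValueError("direct repeat is expected to be 21 or 35 nt")
    if len(spacer) != 21:
        raise ValueError("Mad7 spacer is expected to be 21 nt")
    return direct_repeat.upper() + spacer.upper()


def starts_with_g(seq):
    """True if the transcript would carry a G in the +1 position."""
    return seq.upper().startswith("G")


def cas9_u6_spacer(spacer, mode="add"):
    """Give a Cas9 spacer a 5' G by adding or substituting the first base."""
    spacer = spacer.upper()
    if starts_with_g(spacer):
        return spacer                 # already suitable for U6 expression
    if mode == "add":
        return "G" + spacer           # prepend an extra G
    if mode == "substitute":
        return "G" + spacer[1:]       # replace the first base with G
    raise ValueError("mode must be 'add' or 'substitute'")
```

As a usage example, `mad7_crrna("G" + "A" * 20, "ACGTACGTACGTACGTACGTA")` returns a 42 nt template that already starts with G whatever the spacer is, whereas `cas9_u6_spacer("ACGTACGTACGTACGTACGT")` has to modify the spacer itself.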
Mad7 binds genomic regions that match the designed RNA protospacer and have a suitable PAM site immediately upstream. Here, the Mad7 enzyme uses a single RuvC-like endonuclease domain to cut the DNA in a staggered manner, leaving 4-nucleotide 5′-overhangs at the PAM-distal end of the genomic target site. The two resulting cleavage sites are located 19 bases after the PAM on the sense strand and 23 bases after the PAM on the antisense strand (Figure 1). In contrast to Mad7, the canonical Streptococcus pyogenes Cas9 protein uses two nuclease domains (HNH and RuvC-like) to break each DNA strand and generate a nearly synchronous blunt-ended DSB proximal to the PAM (Bayat et al., 2018; Paul & Montoya, 2020; Swarts & Jinek, 2018).
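The cut-site geometry described above reduces to simple coordinate arithmetic, sketched below in Python. The function names and the coordinate convention are assumptions made for illustration: positions are 0-based on the sense strand, a cut coordinate c means the backbone is broken between bases c-1 and c, and the antisense cut is expressed in sense-strand coordinates. The Cas9 offset of 3 bp upstream of the PAM is taken from the Figure 1 caption.

```python
def mad7_cut_sites(pam_start, pam_len=4, sense_offset=19, antisense_offset=23):
    """Return (sense_cut, antisense_cut, overhang) for a Mad7 target.

    pam_start is the 0-based position of the first PAM base (YTTN);
    the protospacer begins directly after the PAM.
    """
    protospacer_start = pam_start + pam_len
    sense_cut = protospacer_start + sense_offset          # 19 bases after the PAM
    antisense_cut = protospacer_start + antisense_offset  # 23 bases after the PAM
    return sense_cut, antisense_cut, antisense_cut - sense_cut


def cas9_cut_site(pam_start, offset=3):
    """Return the blunt Cas9 cut coordinate, 3 bp upstream of the NGG PAM."""
    return pam_start - offset
```

For a Mad7 PAM starting at position 10, `mad7_cut_sites(10)` gives cuts at 33 and 37, reproducing the 4-nucleotide overhang, while a Cas9 PAM at position 30 gives a single blunt cut at 27.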
After a DSB at the genomic target site, DNA repair proceeds through either homology-directed repair (HDR) or the more frequent nonhomologous end joining (NHEJ) (Safari et al., 2019). HDR can be leveraged to integrate donor DNA using homologous regions spanning the DSB, whereas NHEJ rejoins the ends of the severed DNA strands (Sergeeva et al., 2019). The site-specific nuclease will keep cleaving accurately repaired sites until the error-prone NHEJ introduces an indel. However, because Mad7 generates staggered cuts opposite its critical gRNA seed segment, small NHEJ indels do not disrupt the nuclease association. Thus, if the target site seed is preserved, Mad7 can recut the same site until a large indel or HDR recombination event occurs (Kim et al., 2017; Tang & Fu, 2018; Zhang et al., 2021). This capacity makes the nuclease system practical for gene knockout and donor DNA integration into the targeted genome site (Bayarsaikhan et al., 2021; Zhang et al., 2021). In contrast, the Cas9 protein cleaves the DNA in the seed segment of its gRNA, where a single indel can inhibit Cas9 activity and stop further cleavage of the genomic target site (Swarts & Jinek, 2018).
The gene editing activity and functionality of Mad7 have been confirmed in several bacteria, yeasts, filamentous fungi, HEK293 cell lines, zebrafish, rodent embryos, and stem cells (Bayarsaikhan et al., 2021; Han et al., 2022; Jarczynska et al., 2021). Similarly, Mad7 has been used to integrate donor DNA ranging in size from 1 to 14 kb into mammalian cells, making it a suitable alternative CRISPR enzyme for cell line engineering in the industrially relevant CHO production cell line (Bayarsaikhan et al., 2021).
In this study, we evaluated Mad7 for genome editing in CHO cells by measuring indel rates in targeted regions of HCP genes in the CHO cell genome. Two Mad7 variants with 1× or 4× nuclear localization signals (NLS) were tested using RNP and plasmid-based delivery protocols. We observed that both RNP- and plasmid-based delivery protocols of Mad7 were sufficient to convey targeted genome editing in CHO cells, but also that the 4×NLS variant performed significantly better than the 1×NLS variant. Furthermore, we demonstrate that Mad7 could be applied in practical cell line engineering for functional knockouts of glutamine synthetase (GS) and targeted integration of a recombinase-mediated cassette exchange (RMCE) landing pad expressing exogenous genes. Lastly, we provide two software packages for genome editing with Mad7: the first is a performance update to the gRNA finding tool CRISPy (Ronda et al., 2014), now known as CRISPyR, which also works with Mad7 PAM sequences. The second piece of software is Hamplicons, a tool for automating indel detection analysis of next-generation sequencing (NGS) amplicon data.
FIGURE 1 Mad7 endonuclease forms a complex with a CRISPR RNA (crRNA) without using a trans-activating CRISPR RNA (tracrRNA) element. The crRNA contains a 21 nt programmable spacer element at the 3′ end, which determines the DNA cleavage site. Upon target site recognition, the spacer element associates with the strand opposite an immediately upstream protospacer adjacent motif (PAM) site (YTTN). Here, Mad7 generates a double-strand break with staggered 5′ overhangs 19 and 23 bp downstream of the PAM site on the sense and antisense strand, respectively, using RuvC-like endonuclease domains. Cas9 endonuclease forms a complex with a two-part guide RNA (gRNA) composed of a crRNA and a tracrRNA. Alternatively, the crRNA and tracrRNA can be synthesized as a single guide RNA (sgRNA). In this case, the spacer element consists of 20 nt and is positioned at the 5′ end of the crRNA. The spacer binds the strand opposite a directly downstream PAM site (NGG). After target site recognition, Cas9 generates a double-strand blunt-end cut 3 bp upstream of the PAM using RuvC-like and HNH endonuclease domains.
CRISPyR was used to find all Mad7 compatible target sites in the Bgn and the Timp1 gene exons, and four gRNAs targeting the first exon of either gene were selected. For Cas9, the gRNAs and amplicon primers were designed using the CHOPCHOP web tool using default settings for the Cas9 nuclease (Labun et al., 2019). gRNA and amplicon sequences are provided in Supporting Information: Table 1. crRNAs for Mad7 RNPs were purchased as Alt-R CRISPR-Cpf1 crRNA from Integrated DNA Technologies (IDT). sgRNAs for Cas9 RNPs were purchased as TrueGuide Synthetic gRNA from Thermo Fisher Scientific (note: chemical modifications include 2′-O-methyl analogs and phosphorothioate linkages to increase editing efficiency and protect against nuclease degradation). The gRNAs were prepared by reconstituting the gRNA in IDTE buffer to a concentration of 100 µM.

| Design and production of Mad7 variants
Two versions of Mad7 were designed: a 1×NLS version (pNic28-EcMad7-1×NLS) and a 4×NLS version (pNic28-EcMad7-4×NLS). Both were polyhistidine-tagged for protein purification. Escherichia coli BL21 Star (DE3) cells were transformed with the His-tagged Mad7 expression vectors. The 2×YT medium supplemented with kanamycin was inoculated with a single colony and incubated overnight at 37°C.
The culture was diluted in 1-2 L 2×YT medium to an OD600 of 0.1 and grown at 37°C to an OD600 of 0.6. At this point, the cultures were placed in an ice bath for 15-20 min. Next, isopropyl thiogalactoside was added to a final concentration of 0.2 mM, and protein expression was allowed to take place at 18°C overnight (18-20 h). Cells were harvested by centrifugation and resuspended in lysis buffer (20 mM Tris, 500 mM NaCl and 10 mM imidazole, pH 8.0) supplemented with cOmplete Protease Inhibitor Cocktail (Merck; cat#11697498001). After resuspension, benzonase nuclease (Merck; cat#E1014-5KU, 10 μL per 40 mL lysate) and lysozyme (Merck; cat#10837059001, 1 mg/mL lysate) were added, and the cell suspensions were left on ice for 30 min. Cells were disrupted on an Avestin EmulsiFlex C-5 homogenizer (15-20 kpsi), and the insoluble cell debris was removed by centrifugation (15,000g, 4°C, 15 min). All subsequent chromatography steps were carried out at 10°C. The cleared lysates were loaded on 5 mL HisTrap FF columns (Cytiva; cat#17525501). The resins were washed with 10 column volumes of wash buffer (lysis buffer, but with 20 mM imidazole), and the proteins were eluted with 10 column volumes of elution buffer (lysis buffer, but with 250 mM imidazole). Fractions containing the proteins (typically 13.5 mL) were pooled and diluted to 25 mL in dialysis buffer (250 mM KCl, 20 mM HEPES and 1 mM DTT, and 1 mM EDTA, pH 8.0). The samples were dialyzed against 1 L of dialysis buffer at 10°C using a dialysis membrane tubing with a molecular weight cutoff of 6-8 kDa (Spectra/Por standard grade regenerated cellulose, 23 mm wide). The dialysis buffer was replaced after 1-2 h, and dialysis continued overnight. The next day, the dialyzed samples were diluted two-fold in 10 mM HEPES (pH 8) and immediately loaded on 5 mL HiTrap Heparin HP columns (Cytiva; cat#17040601) preequilibrated with buffer A (20 mM HEPES, 150 mM KCl, pH 8.0).
The resins were washed with 2 column volumes of buffer A, and the proteins were eluted using a linear gradient from 0% to 50% of buffer B (20 mM HEPES, 2 M KCl, pH 8.0) over 12 column volumes.
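The salt concentrations implied by the dilution and gradient steps above can be checked with a little arithmetic, sketched here in Python. This assumes ideal linear mixing and that the gradient blends buffer A with buffer B in the usual way; the function names are hypothetical and the calculation is not part of the published protocol.

```python
def mix(conc_a, conc_b, fraction_b):
    """Concentration after ideal mixing of two solutions (same units in and out)."""
    return (1 - fraction_b) * conc_a + fraction_b * conc_b


def kcl_during_gradient(cv, total_cv=12, start_b=0.0, end_b=0.5,
                        kcl_a=150.0, kcl_b=2000.0):
    """KCl concentration (mM) after `cv` column volumes of a linear gradient.

    Defaults follow the text: 0% to 50% buffer B over 12 column volumes,
    buffer A at 150 mM KCl and buffer B at 2 M KCl.
    """
    fraction_b = start_b + (end_b - start_b) * cv / total_cv
    return mix(kcl_a, kcl_b, fraction_b)


# Two-fold dilution of the dialyzed sample (250 mM KCl, 20 mM HEPES)
# in 10 mM HEPES without KCl.
kcl_loaded = mix(250.0, 0.0, 0.5)     # mM KCl in the loaded sample
hepes_loaded = mix(20.0, 10.0, 0.5)   # mM HEPES in the loaded sample
```

Under these assumptions the sample is loaded at 125 mM KCl, slightly below the 150 mM KCl of buffer A, and the gradient ends at 1075 mM KCl.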

| Plasmid construction
The plasmids used for the transfection protocols were constructed with USER cloning. The DNA fragments for USER cloning were generated by PCR amplification with Phusion U Hot Start PCR Master Mix (Thermo Fisher Scientific; cat#F533S) and uracil-containing primers designed with Amuser 1.0 (Genee et al., 2015;Lund et al., 2014). DNA fragments were generated and assembled to construct the final plasmids. E. coli codon- at 37°C in a humidified atmosphere with 5% (v/v) CO 2 on an orbital shaker platform rotating at 120 rpm with a 25 mm shaking amplitude.
Viable cell density (VCD) was monitored using a NucleoCounter NC-200 (ChemoMetec). The cells were passaged in working volumes of 30-60 mL at 0.3 × 10⁶ cells/mL every 2-3 days at least three times before transfection, ensuring the cells were in exponential growth on the day of transfection.

| Plasmid transfection for indel rate analysis
On the day of transfection, 1.5 × 10⁶ cells were seeded per well in a 12-well plate (Falcon; cat#351143). The cells were transfected with Freestyle MAX Reagent (Gibco; cat#16447100) according to the manufacturer's instructions using a 1:1 (w:w) ratio between gRNA and Mad7 plasmids. After transfection, the cells were incubated for 48 h at 37°C in a humidified atmosphere with 5% (v/v) CO₂ on an orbital shaker platform rotating at 120 rpm with a 25 mm shaking amplitude before cell harvest for genomic DNA extraction.

| Preparation of transfected samples for indel rate analysis with NGS
At 48 h posttransfection, the cells were mixed by pipetting up and down. A 100 μL aliquot of cells was then transferred to 96-well PCR plates (Sarstedt; cat#72.1978.202), and the plates were centrifuged at 1000g for 10 min at room temperature. The supernatants were removed by rapid inversion onto a KimWipe. Next, 100 μL of QuickExtract (Lucigen; cat#QE9050) was added to each well, and the cells were resuspended by pipetting up and down several times.
Finally, the plates were incubated at 65°C for 15 min and then at 95°C for 5 min before being stored at −80°C.
Amplicons of the target regions were generated using primers with specific overhang adaptors (primers in Supporting Information:

| NGS data analysis and visualization
The amplicon NGS data were analyzed using the Hamplicons tool and visualized using a Jupyter Lab notebook (https://github.com/laeblab/ Mad7-CHO analysis). All results can be replicated by running the code in the notebook against the raw data uploaded to Zenodo (https://doi.org/10.5281/zenodo.5020293).

| Validation of GS knockout by L-glutamine starvation
Three CHO-S GS knockout clones and a wild-type clone were thawed and recovered in prewarmed CD CHO medium (Gibco; cat#10743029) supplemented with 8 mM L-glutamine (Gibco; cat#25030081). All cultivations were conducted in 6-well plates (Falcon; cat#351146) incubated at 37°C in a humidified atmosphere with 5% (v/v) CO₂ on an orbital shaker platform rotating at 120 rpm with a 25 mm shaking amplitude. When cell viabilities of >95% were obtained, each culture was split in two by centrifuging at 200g for 5 min and resuspended in two different prewarmed media with and without 8 mM L-glutamine supplementation. The cultures were maintained and passaged at 0.5 × 10⁶ viable cells/mL every 2 days. VCD was monitored using a NucleoCounter NC-250 Cell Counter (ChemoMetec). Junction PCR primers are listed in Supporting Information: Table S1. PCR products were visualized on a 1% agarose gel, and clones showing correct amplicons were selected for further verification with copy-number analysis.

| Statistics
One-way analysis of variance was applied to test for statistical differences in the mean of indel rates across samples and the averaged value of the control groups within identical amplicons.
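For readers unfamiliar with the test, the F statistic of a one-way analysis of variance can be computed as in the following Python sketch. It is a generic textbook calculation, not the authors' analysis code; the function name is hypothetical, and the numbers in the usage note are arbitrary toy values rather than data from this study. Turning F into a p value additionally requires the F distribution (for example via `scipy.stats.f_oneway`), and the multiplicity adjustment mentioned in the figure captions is not covered here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of groups
    n = len(all_values)      # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With the toy groups `[[1, 2, 3], [2, 3, 4], [5, 6, 7]]`, the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F = 13 on 2 and 6 degrees of freedom.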

| Software for gRNA design and amplicon analysis
As no software was available for designing Mad7-compatible gRNAs, we extended the Cas9 gRNA software, CRISPy, to include Mad7 configurations. Moreover, the program was rewritten in Rust for performance reasons. The resulting CRISPyR is freely available online (https://github.com/laeblab/crispy). The software works by first indexing all potential gRNA sites in a user-supplied genome sequence. Afterwards, this index can be used to search for gRNAs in desired regions. The selected gRNAs are then compared to all other potential gRNA sites in the genome to evaluate potential off-targets. The program is highly performant and can be run on a standard laptop.
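The three steps described above (index all sites, search a region, compare against all other sites) can be sketched in Python as follows. This is a conceptual illustration only and does not reproduce the CRISPyR implementation, which is written in Rust: the function names, the tuple layout, and the mismatch threshold are assumptions, and a production tool would use a far more efficient index than a plain list.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")
PAM_LEN = 4  # YTTN


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def index_sites(genome, spacer_len=21):
    """Index every YTTN + spacer site on both strands.

    genome maps sequence names to DNA strings. Each entry is
    (name, strand, start, spacer), where start is the leftmost
    forward-strand coordinate of the PAM + protospacer footprint.
    """
    # The lookahead lets overlapping sites be reported as well.
    pattern = re.compile(r"(?=([CT]TT[ACGT])([ACGT]{%d}))" % spacer_len)
    footprint = PAM_LEN + spacer_len
    index = []
    for name, seq in genome.items():
        seq = seq.upper()
        for m in pattern.finditer(seq):
            index.append((name, "+", m.start(), m.group(2)))
        for m in pattern.finditer(reverse_complement(seq)):
            start = len(seq) - m.start() - footprint
            index.append((name, "-", start, m.group(2)))
    return index


def sites_in_region(index, name, begin, end, spacer_len=21):
    """Select indexed sites lying fully inside [begin, end) on one sequence."""
    footprint = PAM_LEN + spacer_len
    return [s for s in index
            if s[0] == name and s[2] >= begin and s[2] + footprint <= end]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def count_off_targets(candidate, index, max_mismatches=3):
    """Count other indexed sites whose spacer is within max_mismatches."""
    return sum(1 for s in index
               if s is not candidate
               and hamming(s[3], candidate[3]) <= max_mismatches)
```

A typical use would be to build the index once per genome, call `sites_in_region` for each exon of interest, and rank the returned candidates by `count_off_targets`, preferring those with the fewest near matches.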
Additionally, we developed a tool called Hamplicons to minimize the time spent analyzing NGS amplicon data from genome editing. Hamplicons uses Hamming distance to analyze amplicons and is also freely available online (https://github.com/laeblab/hamplicons). The user supplies Hamplicons with a FASTA file containing the expected wild-type amplicons plus the raw sequencing (FASTQ) files from the experimental samples. The program then analyzes all FASTQ files to identify the amplicons and generates a report of the indels found in the data. In its base version, Hamplicons is set up to work with Illumina paired-end sequencing with short overlaps between read pairs. The folder and file structure generated by Illumina sequencers is recognized and used to correctly pair and group files. The indel analysis uses Hamming distance to determine whether the ends of a merged read are identical or similar to those of a wild-type amplicon sequence. When a read is recognized as a particular amplicon, the merged read's length is compared to the expected amplicon length, and any discrepancies are reported as indels.
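The read classification described above can be sketched in Python as follows. This is a simplified illustration of the idea and not the Hamplicons code: the function names, the number of end bases compared (`end_len`), and the mismatch tolerance (`max_mismatches`) are hypothetical parameters, and read merging and FASTQ parsing are left out.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def classify_read(merged_read, amplicons, end_len=20, max_mismatches=2):
    """Assign a merged read to an amplicon and report its size change.

    amplicons maps names to wild-type sequences. Returns (name, size_change),
    where a negative value is a deletion and a positive value an insertion,
    or None if neither read end resembles any amplicon.
    """
    if len(merged_read) < end_len:
        return None  # too short to compare both ends
    for name, wild_type in amplicons.items():
        head_ok = hamming(merged_read[:end_len], wild_type[:end_len]) <= max_mismatches
        tail_ok = hamming(merged_read[-end_len:], wild_type[-end_len:]) <= max_mismatches
        if head_ok and tail_ok:
            return name, len(merged_read) - len(wild_type)
    return None


def indel_rate(reads, amplicons, name, **kwargs):
    """Fraction of reads assigned to `name` whose length differs from wild type."""
    changes = []
    for read in reads:
        hit = classify_read(read, amplicons, **kwargs)
        if hit is not None and hit[0] == name:
            changes.append(hit[1])
    if not changes:
        return 0.0
    return sum(1 for c in changes if c != 0) / len(changes)
```

Because only the read ends and the total length are inspected, a sketch like this is fast but cannot see substitutions or balanced events in which an insertion and a deletion of equal size cancel out.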

| Mad7 with 4×NLS is efficient for generating indels in CHO cells
To evaluate whether Mad7 works for genome editing in CHO cells, we used CRISPyR to generate four gRNAs for each of the two HCP genes, Bgn and Timp1. These gene products were previously identified as impurities in CHO cell bioprocesses and were chosen because they were successfully targeted and knocked out using Cas9 and thus known to be readily accessible for CRISPR enzymes (Kol et al., 2020).
Using an RNP electroporation protocol, we tested the genome editing efficiency of two versions of Mad7, one with 1×NLS and one with 4×NLS. The RNPs were prepared by combining the two Mad7 variants with the panel of designed gRNAs. The RNPs were then delivered at two different concentrations, high (50 pmol/sample) and low (10 pmol/sample). The transfected cells were incubated for 48 h before being harvested for genome extraction. Subsequently, PCR amplicons of the targeted genomic regions were generated for NGS analysis.
At high RNP concentrations, Mad7-4×NLS efficiently created indels in a substantial share of the cell population when combined with several of the designed gRNAs (up to ~35%) (Figure 2). It was also observed that the editing efficiencies differed significantly depending on the employed gRNA. For all gRNA designs, the 4×NLS variant outperformed the 1×NLS, which often did not produce statistically significant indel rates.
Transfecting samples with low concentrations of Mad7 RNP did not routinely generate indel rates significantly above the control samples (Supporting Information: Figure 1).

| Indel size patterns produced by Mad7 differ from those produced by Cas9
In parallel with the Mad7 experiments, we also tested the Cas9

| Plasmid-based expression of Mad7 also works for targeted genome editing
As an alternative to the RNP protocol, we tested the genome editing efficiency of plasmid-delivered Mad7 in cotransfections with separate gRNA expression plasmids. Plasmid transfections do not require the initial production and purification of the Mad7 protein, making the approach more straightforward for routine cell line generation.
Expression plasmids were generated for the two Mad7 variants (1×NLS and 4×NLS) and the gRNA panel from the previous RNP experiment. Subsequently, CHO cells were transfected with the plasmids using lipid-based transfection and incubated for 48 h before cell harvest. Genomic DNA was extracted from the harvested cells and used as the template for PCR amplification of the targeted regions. Finally, the amplicons were analyzed using NGS.
Indel analysis showed that expressing Mad7 and gRNA from plasmids was sufficient for CHO cell genome editing, although the indel rates were lower than those observed for the RNP protocol (Figure 5). Similar to the RNP protocol, though less distinctly, the 4×NLS version seemed more effective than the 1×NLS version.
FIGURE 2 Genomic indel rates detected in Chinese hamster ovary (CHO) cells generated by electroporation-delivered Mad7 ribonucleoproteins (RNPs) at high concentration (50 pmol/sample). The Mad7 variants, 1× nuclear localization signals (NLS) and 4×NLS, were tested in complexes with one of eight different gRNAs targeting Bgn or Timp1. The samples are grouped by the next-generation sequencing (NGS) amplicons used to evaluate indel rates in the target regions. The control groups include samples with RNPs that were not electroporated (No electroporation), samples that were electroporated without RNPs (No RNP), and samples without RNPs that were not electroporated (wild-type cells). All samples were conducted in three biological replicates with error bars representing the standard deviations. Multiplicity adjusted p values: *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001.
FIGURE 3 Genomic indel rates detected in Chinese hamster ovary (CHO) cells generated by electroporation-delivered Cas9 ribonucleoproteins (RNPs) at low concentration (10 pmol/sample). The Cas9 nuclease was tested in complexes with one of eight different guide RNAs (gRNAs) targeting BGN or TIMP1. The samples are grouped by the next-generation sequencing (NGS) amplicons used to evaluate indel rates in the target regions. The control groups include electroporated samples without RNPs (No RNP) and untreated wild-type samples. All samples were conducted in three biological replicates with error bars representing the standard deviations. Multiplicity adjusted p values: *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001.

| Mad7 can be integrated into the cell line generation process for functional knockouts and precision integration
After investigating the Mad7 enzyme activity, we set out to verify its application for routine cell line generation. The initial aim was to obtain classical GS knockout cell lines that can be used in the GS-mediated gene amplification system by coupling a gene of interest to the GS gene and then starving the cells of glutamine or inhibiting the GS activity with MSX (Cockett et al., 1990; Noh et al., 2018). For this purpose, a gRNA expression plasmid was designed with a spacer targeting an exact match in each of the two Glul gene variants in the CHO genome. Cells were cotransfected with the Mad7 4×NLS plasmid, the Glul targeting gRNA plasmid, and a transfection-control plasmid expressing GFP. Two days after transfection, GFP fluorescent cells (n = 95) were single-cell sorted using FACS. The clones were expanded, and genomic DNA was harvested for NGS analysis. Frameshift indels were found in one of the two Glul gene variants in 58.9% of the sorted clones and in both variants in 5.3% of the sorted clones. Three double-variant Glul knockout clones were selected to confirm the phenotypic change, which was verified by monitoring cell viability during cultivation with and without L-glutamine supplementation (Figure 6).
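As a small worked illustration of the clone screen above, the Python sketch below shows the usual frameshift criterion (an indel whose net size is not a multiple of three) and the percentage arithmetic. The function names are hypothetical, and the clone counts of 56 and 5 are back-calculated from the reported percentages and n = 95; they are an inference, not numbers stated in the text.

```python
def is_frameshift(size_change):
    """True if a net indel size (negative = deletion) shifts the reading frame."""
    return size_change % 3 != 0


def percent(count, total):
    """Percentage rounded to one decimal, as reported in the text."""
    return round(100 * count / total, 1)


# Inferred counts (assumption): 56 single-variant and 5 double-variant
# clones out of 95 would reproduce the reported 58.9% and 5.3%.
single_variant = percent(56, 95)
double_variant = percent(5, 95)
```

Under this criterion a 3 bp deletion would leave the reading frame intact and would not count as a frameshift knockout, whereas a 1 bp insertion or a 4 bp deletion would.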
FIGURE 4 Size distribution and frequency of indels generated from targeted genome editing with Mad7 and Cas9. The graph shows the proportions of indels in relation to the total number of indels. The x-axis limits have been adjusted to better visualize the majority of indel sizes detected.
FIGURE 5 Genomic indel rates detected in Chinese hamster ovary (CHO) cells generated after combined plasmid expression of Mad7 and given guide RNA (gRNA). The Mad7 variants, 1× nuclear localization signals (NLS) and 4×NLS, were tested in cotransfections with one of eight different gRNAs targeting Bgn or Timp1. The samples are grouped by the next-generation sequencing (NGS) amplicons used to evaluate indel rates in the target regions. The control groups include samples transfected with Mad7 but without a cotransfected gRNA and nontransfected wild-type cells. All samples were conducted in three biological replicates with error bars representing the standard deviations. Multiplicity adjusted p values: *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001.
FIGURE 6 Viability of glutamine synthetase (GS)-knockout clones during cultivation. Full lines indicate cultivations supplemented with 8 mM L-glutamine, and striped lines denote cultivations without L-glutamine supplementation.
Next, we investigated if the Mad7 nuclease could facilitate the targeted integration of transgene elements into the CHO genome through HDR. The ability to perform targeted integration is essential for generating CHO cell lines, and knock-ins in preselected transcriptionally active sites have been used to generate high-producing clones expressing transgenes with minimal clonal variation (Grav et al., 2018; Sergeeva et al., 2020). The accurate insertion allows for minimal perturbation of the genetic context, mitigating confounding factors such as chromosome position effects, copy-number variability, and the risk of gene disruption that random integration could cause (Lombardo et al., 2011; Pilbrough et al., 2009).
Here, we aimed to generate cell lines containing an RMCE landing pad. Once such cell lines are established, recombinases can be used in DNA exchange reactions to more easily insert different transgenes and generate isogenic clones with predictable gene expression (Inniss et al., 2017;Kelley, 2020;Lee et al., 2019). Dhfr-deficient cells were

| DISCUSSION
CRISPR-Cas9-mediated CHO cell line engineering has routinely been used to improve the production of biopharmaceuticals (Schweickert & Cheng, 2020;Sergeeva, 2020). This study demonstrates that the alternative CRISPR nuclease, Mad7, can successfully be applied for CHO cell genome editing. We evaluated the editing efficiencies of two distinct Mad7 variants (1×NLS and 4×NLS) for the knockout of two HCP genes, Bgn and Timp1. The results showed that Mad7 could produce indels when delivered as RNPs via electroporation (up to 35%) and plasmids via lipofection (up to 10%). In general, the 4×NLS variant outperformed the 1×NLS variant, and editing efficiencies were observed to depend significantly on the applied gRNA design.
An additional examination of the generated indel sizes indicated that Mad7 has a disposition for larger nucleotide deletions than Cas9. Lastly, we demonstrated that plasmid-delivered Mad7 could be used in cell line engineering for functional knockout of GS and the targeted integration of an RMCE landing pad expressing mCherry and Dhfr.
FIGURE 7 Schematic representation of the targeted integration of the landing pad into the preselected locus using Mad7 to initiate homology-directed repair (HDR). The donor plasmid vector contains the landing pad cassette flanked by loxP and lox2272 recombinase sites allowing for recombinase-mediated cassette exchange (RMCE) using the Cre recombinase. The cassette holds elements for expression of the fluorescent protein mCherry and dihydrofolate reductase (Dhfr). A gene encoding the short-lived fluorescent protein ZsGreen was placed outside the homology arms on the donor vector to exclude random integration events during cell sorting. gRNA, guide RNA.
FIGURE 8 Relative copy number of mCherry in clonal cells compared to a reference sample COSMC-mCherry from . The error bars represent the standard deviations of three technical replicates.
While Mad7 could produce moderately high indel rates, it was clear that the commercial Cas9 tested in parallel performed even better. More precisely, electroporation of Mad7 RNPs at low concentrations was mostly insufficient to provoke significant genome editing. On the other hand, Cas9 RNPs delivered at the same concentration often achieved very high indel frequencies. We speculate that there can be multiple causes for the substantial discrepancy between the two nucleases. It has been documented that cleavage efficiency varies significantly between target sites and organisms, implying that several factors may influence the binding and cutting efficacy of the RNP complex (G. . For gRNA design, these parameters include the sequence composition, chromatin accessibility, and RNA secondary structure, all contributing to the effectiveness of the gRNA (Jensen et al., 2017). Based on these parameters, numerous advanced computational tools have been created for designing highly performant gRNAs for the well-established Cas9 nuclease, which could be advantageous in a direct comparison (Konstantakos et al., 2022;. Moreover, we also employed a commercial Cas9 platform that uses undisclosed chemical modifications of the sgRNAs to improve editing efficiency and resistance to nuclease degradation. Given the robust usability of Mad7, it seems likely that further tool and protein development could enhance its genome editing efficiency. The performance of Mad7 in CHO cells seems to be on par with the canonical Cas12a nuclease, which was previously established for classical gene knockout and multigene genomic deletions in CHO-K1 cells (Schmieder et al., 2018; Schweickert et al., 2021). Cas12a demonstrated indel frequencies of up to ~27% in a similar RNP transfection protocol, but that study also highlighted the need to screen multiple gRNAs for each gene target to identify highly efficient gRNAs, as half of the gRNAs (n = 8) did not create detectable indels (Schweickert et al., 2021). Likewise, Cas12a has also been evaluated with plasmid expression and was estimated to cause indel frequencies of ~20%, in fact outperforming Cas9 in the same experimental setup (Schmieder et al., 2018). In short, these individual studies indicate that Mad7 operates within the same efficiency range as the well-known and more widely utilized Cas12a.
Establishing Mad7 for genome editing in CHO cell lines is a valuable extension to the CRISPR toolbox used to generate cell lines producing valuable biopharmaceuticals. The extension introduces a new class II type V-A system that uses PAMs orthogonal to Cas9, employs small gRNAs, and exhibits effective targeted gene disruption and insertions (Z. . In general, other type V-A nucleases have been reported to have fewer off-target effects in mammalian cells than Cas9, making them well-suited for mammalian cell line engineering (Bayarsaikhan et al., 2021; DeWeirdt et al., 2021; Strohkendl et al., 2018). Although this has not previously been described or tested in this setup, Mad7, as a type V-A nuclease, is expected to possess an intrinsic RNase activity that enables the processing of a single pre-crRNA array into multiple mature crRNAs without needing RNase III (Safari et al., 2019). This ability has previously been described for the canonical Cas12a and enables multiplex genome editing using a single customized crRNA array instead of multiple separate sgRNA segments (each including promoter, gRNA, and terminator) (Tong et al., 2021; Zetsche et al., 2017). Similar to other CRISPR systems, Mad7 has also been engineered to modulate gene transcription levels using a catalytically "dead" variant, dMad7, for targeted transcription inhibition in both single-target and multiplex CRISPRi (McCarty et al., 2020; Price et al., 2020; Zocca et al., 2022). In the same way, it seems likely that Mad7 can also be employed for gene activation in CRISPRa, as other Cas12a enzymes have been (Y. Liu et al., 2017). Used together, these tools could enable multimodal gene regulation, in which orthogonal CRISPR systems target different gene regions for multiplexed transcriptional activation and inhibition (Z. Price et al., 2020; Schmieder et al., 2018).
Certain limitations should be kept in mind when evaluating the results of this study. Most importantly, the experimental approach, NGS of amplicons, limits the indel sizes that can be detected. If the nuclease causes an extensive deletion during DNA repair that removes or disrupts the amplicon primer binding regions, this event will not appear in the results. Likewise, if large insertions form, the resulting target-region amplicons can become too long for detection with the paired-end sequencing method, which requires short overlaps between read pairs (the theoretical detection limit is ~150 nt for insertions and ~280 nt for deletions). For this reason, only relatively small indels are reported here, which may lead to underreporting of alternative editing events. Such potential underestimation may primarily concern Mad7, as other proteins in the Cas12a family have previously been documented to cause large deletions more systematically than Cas9 (Bayarsaikhan et al., 2021; Kim et al., 2017). It is also important to note that different target-region amplicons were used to evaluate the Mad7 and Cas9 systems independently, meaning that any direct benchmarking between the two should be interpreted with caution. Lastly, no protocol optimization was conducted before assessing the editing efficiencies of the tested nucleases, and an E. coli codon-optimized Mad7 sequence was used in the indel rate analysis experiments with plasmid delivery. Therefore, it seems likely that even higher efficiencies can be achieved by optimizing codon usage, nuclease-to-gRNA ratios, and other transfection parameters.
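The size limits discussed above follow from simple amplicon geometry, which can be sketched as below. This is a minimal illustration, not the analysis pipeline used in this study: it assumes that read pairs must be merged with some minimum overlap and that a deletion is only detectable if both primer binding sites remain intact. The function names and all numeric values (read length, minimum overlap, amplicon length, primer lengths) are hypothetical and were chosen only to show how limits of this order of magnitude can arise.

```python
def max_detectable_insertion(read_len: int, min_overlap: int, amplicon_len: int) -> int:
    """Largest insertion for which the two mates can still overlap and merge.

    The longest fragment a read pair can span is 2 * read_len - min_overlap;
    whatever exceeds the unedited amplicon length is room for an insertion.
    """
    max_merged_len = 2 * read_len - min_overlap
    return max(0, max_merged_len - amplicon_len)


def max_detectable_deletion(amplicon_len: int, fwd_primer_len: int, rev_primer_len: int) -> int:
    """Largest deletion that leaves both primer binding sites intact.

    This is an upper bound: it assumes the deletion lies entirely between
    the two primers, regardless of where the cut site sits in the amplicon.
    """
    return max(0, amplicon_len - fwd_primer_len - rev_primer_len)


if __name__ == "__main__":
    # Hypothetical setup (assumption, not taken from the study):
    # 2 x 250 nt reads, 20 nt minimum overlap, 330 nt amplicon, 25 nt primers.
    print(max_detectable_insertion(read_len=250, min_overlap=20, amplicon_len=330))
    print(max_detectable_deletion(amplicon_len=330, fwd_primer_len=25, rev_primer_len=25))
```

In practice, the usable limits also depend on where the cut site sits relative to the primers and on how the read merger and aligner handle long gaps, so values computed this way should be read as upper bounds.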
In summary, this paper demonstrates the use of Mad7 for genome engineering in CHO cells and provides software for gRNA design and indel analysis. The findings show that nuclear localization sequences are critical for editing efficiency, with the 4×NLS version outperforming the 1×NLS version in almost all instances. Furthermore, the indel rates generated by individual gRNAs vary significantly, emphasizing the importance of gRNA design optimization. By implementing Mad7 for genome editing in CHO cells, we broaden the range and versatility of CRISPR systems for research and CHO cell line engineering, which is essential for producing valuable life-saving biopharmaceuticals.