Genome‐scale screening in a rat haploid system identifies Thop1 as a modulator of pluripotency exit

Abstract Objectives Rats are crucial animal models for basic medical research. Rat embryonic stem cells (ESCs), which are widely studied, can self‐renew and exhibit pluripotency in long‐term culture, but the mechanism by which they exit pluripotency remains obscure. To investigate the key modulators of pluripotency exit in rat ESCs, we perform genome‐wide screening using a unique rat haploid system. Materials and Methods Rat haploid ESCs (haESCs) enable advances in the discovery of unknown functional genes owing to their homozygous and pluripotent characteristics. REX1 is a sensitive marker of naïve pluripotency that is often utilized to monitor pluripotency exit; thus, rat haESCs carrying a Rex1‐GFP reporter are used for genetic screening. Genome‐wide mutations are introduced into the genomes of rat Rex1‐GFP haESCs via the piggyBac transposon, and differentiation‐retarded mutants are obtained after random differentiation selection. The exact mutations are elucidated by high‐throughput sequencing and bioinformatic analysis. The role of a candidate mutation is validated in rat ESCs by knockout and overexpression experiments, and the phosphorylation of ERK1/2 (p‐ERK1/2) is determined by western blotting. Results High‐throughput sequencing analysis reveals numerous insertions related to various pathways affecting random differentiation. Thereafter, deletion of Thop1 (one candidate gene in the screened list) arrests the differentiation of rat ESCs by inhibiting p‐ERK1/2, whereas overexpression of Thop1 promotes the exit of rat ESCs from pluripotency. Conclusions Our findings provide an ideal tool to study functional genomics in rats: a homozygous haploid system carrying a pluripotency reporter that facilitates robust discovery of the mechanisms involved in the self‐renewal or pluripotency of rat ESCs.


| INTRODUCTION
Rats are ideal animal models for pharmacological and physiological studies due to their suitable size and similarity to humans. 1 Authentic rat embryonic stem cells (ESCs) can be derived from preimplantation blastocysts with chemically defined 2i/LIF or 3i/LIF medium 2,3 ; these cells present a naïve pluripotent state and have the capacity to contribute to the germline. These germline-competent rat ESCs greatly facilitate the production of transgenic rats because of their stem cell advantages, 4,5 which are beneficial for developmental and genetic studies in rats. Although rat ESCs show advanced differentiation potential in vivo, their differentiation in vitro is very difficult to achieve because of the severe apoptosis that occurs in this process. 3,6 This drastically hampers the application of rat ESCs in drug selection research in vitro. To differentiate rat ESCs in vitro, an embryoid body (EB)-based method and cultivation in conditioned serum medium can be applied to produce differentiated rat cardiomyocytes and neural progenitors. 7,8 However, these strategies for the differentiation of rat ESCs in vitro are too complicated to be robust, mainly due to the unknown mechanisms underlying self-renewal or pluripotency. It will be fascinating to investigate the key modulators or pathways involved in the self-renewal or differentiation of rat ESCs.
Rat haploid ESCs (haESCs), which were recently produced, are novel pluripotent stem cells with only one set of chromosomes. These cells enable the generation of a genome-scale homozygous mutant library without allelic backups. 9 Similar to rat haESCs, mouse haESCs can also be used to uncover essential target genes of critical biological processes, including pluripotency exit. 10,11 Although some groups have attempted to supplement the culture medium of rat ESCs to maintain self-renewal and pluripotency more stably, 12 even making rat ESCs capable of tetraploid complementation, 13 the exact mechanisms underlying the effects of these approaches remain unknown. Rex1 is a sensitive naïve pluripotency marker gene that is often used as a reporter to monitor self-renewal or pluripotency exit 14 and was recently found to be suitable for rat ESCs. 15 In this study, we introduced a Rex1-GFP reporter into rat haESCs and performed high-throughput genetic screening of random differentiation. Multiple inserted genes and pathways involved in this process were revealed. The findings will be useful for investigating the mechanisms of self-renewal or pluripotency exit.

| The Rex1-GFP reporter indicates the differentiation of rat haESCs
We chose a rat haESC (RAH-1) 9 cell line with a high percentage of haploids (Figure S1A,B) and standard morphology (Figure 1A) to perform experiments. To construct the Rex1-GFP vector, we designed a template including T2A (a coding sequence of a self-cleaving linker peptide), a green fluorescent protein (GFP) gene and homologous recombination insertion sites located downstream of Rex1 (Figure 1B). Next, we electroporated RAH-1 cells with targeting vectors containing a Cas9-GFP system and sorted GFP-positive cells 2 days later for further culturing. After several rounds of enrichment of GFP-positive cells by fluorescence-activated cell sorting (FACS), successful homologous recombination of RAH-1 with GFP downstream of Rex1 (RAH-1 GFP ) was achieved. Genotyping PCR further confirmed that RAH-1 GFP carried a Rex1-GFP reporter (Figure S1C).
To assess the Rex1-GFP reporter, we further investigated RAH-1 GFP through daily culture and differentiation. Almost every colony of RAH-1 GFP cells presented green fluorescence in long-term culture when observed in a FITC channel (Figure S1D). The percentage of GFP-positive cells among RAH-1 GFP cells was as high as 97.3% according to FACS analysis (Figure 1C). Immunostaining for OCT4, NANOG and SSEA-1 in RAH-1 GFP cells demonstrated that pluripotency was not affected by insertion of the Rex1-GFP reporter (Figure S1E).
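The GFP-positive percentage reported by FACS depends on where the positive gate is placed. The following is a minimal sketch of one common way to set such a gate, assuming a GFP-negative control sample is available and that the gate is placed at a high quantile of its intensities. The function name, the quantile default and the input format are illustrative assumptions, not the gating strategy used in this study.

```python
from typing import Sequence


def gfp_positive_fraction(
    sample: Sequence[float],
    negative_control: Sequence[float],
    quantile: float = 0.99,
) -> float:
    """Return the fraction of sample events brighter than a control-derived gate.

    Assumption: the gate is the intensity below which roughly `quantile`
    of the GFP-negative control events fall.
    """
    if len(sample) == 0 or len(negative_control) == 0:
        raise ValueError("both sample and negative_control must be non-empty")
    ranked = sorted(negative_control)
    # Index of the gate value in the sorted control intensities.
    gate_index = min(len(ranked) - 1, int(quantile * len(ranked)))
    gate = ranked[gate_index]
    positives = sum(1 for intensity in sample if intensity > gate)
    return positives / len(sample)
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the values quoted above.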
Thereafter, we initiated random differentiation of RAH-1 GFP cells for 9 days according to a previous protocol 7 with slight modification. The green fluorescence in the differentiated cell cultures decreased gradually over time, especially after attachment of EBs (Figure 1D). FACS analysis showed that the proportion of GFP-positive cells decreased from 97.7% to 7.3% during this process, further confirming this observation (Figure 1E). The quantitative PCR (qPCR) results also suggested that the expression levels of pluripotency genes (Oct4, Nanog and Stella) in differentiated cell cultures decreased over time (Figure 1F), whereas the expression levels of differentiation genes (Irx3, Gata4 and Nestin) increased over time (Figure 1G). Together, these results showed that the Rex1-GFP reporter was a visible tool that could be used to monitor pluripotency and differentiation of rat ESCs efficiently.
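Relative qPCR expression changes such as those described above are often computed with the 2^-ΔΔCt method. The sketch below illustrates that calculation under two assumptions that are not stated in the text: that this method was the one used, and that amplification efficiency is close to 100%. The function and argument names are hypothetical.

```python
def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Fold change of a target gene by the 2^-ddCt method.

    Assumes about 100% amplification efficiency, so that one cycle
    corresponds to a two-fold difference in template.
    """
    # Normalize the target gene to the reference gene within each condition.
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    # Compare the normalized values between conditions.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Toy Ct values: the target amplifies 3 cycles later in the sample than in
# the control after normalization, which corresponds to a fold change of 0.125.
fold_change = relative_expression(25.0, 18.0, 22.0, 18.0)
```

A fold change below 1 would correspond to the decrease seen for pluripotency genes, and a value above 1 to the increase seen for differentiation genes.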

| Using RAH-1 GFP cells to discover genes restricting pluripotency exit
To achieve genome-wide mutations, we utilized the piggyBac (PB) transposon system to introduce mutations into the RAH-1 GFP cells for subsequent genetic screening. An SA (splice acceptor) sequence 16 was inserted into a PB vector (PB-SA-RFP) carrying a tdTomato gene (expressing red fluorescent protein [RFP]) [...] (Figure 2D). qPCR analysis suggested that the expression levels of pluripotency genes (Oct4, Nanog and Klf4) in cells differentiated from mutant RAH-1 GFP cells were significantly higher than those in cells differentiated from RAH-1 GFP cells (Figure 2E). This demonstrated that differentiation of the mutant RAH-1 GFP cells was arrested even under random differentiation conditions.
To further investigate the differentiation-arrested RAH-1 GFP mutants, we harvested GFP-positive cells (57.6% by FACS) from among cells differentiated from mutant RAH-1 GFP cells and plated them back onto feeder cells with 2i/LIF medium (Figure 2F). Typical rat ESC colonies, which we termed the screened library, emerged from the cultures of sorted cells approximately 2-3 days later. For a preliminary analysis of the inserted genes, we randomly picked 10 subclones from the screened library and performed inverse PCR. Multiple distinct insertion fragments were observed after inverse PCR (Figure S2B) and were sequenced to reveal the inserted genes. Several inserted genes, including Cacna1b and Dact1, were identified, indicating that mutations involved in random differentiation of rat ESCs could be detected in our system (Figure S2C). Next, we performed splinkerette PCR 17 with the screened library mixture to prepare samples for next-generation sequencing (NGS). The products of splinkerette PCR showed smeared bands, suggesting that our mutations covered a large proportion of the genome (Figure 2G).
Our splinkerette PCR products were sent to a local company for NGS, and the raw data were analysed according to previous reports. 17,18 Deep sequencing identified approximately 5 million independent insertions located in 18,000 genes. Nearly 50% of the PB-SA-RFP vectors were inserted in the sense orientation, and the rest were inserted in the antisense orientation (Figure 3B). An enrichment analysis with KEGG pathway databases revealed that most inserted genes were related to GABAergic synapses, oxytocin signalling pathways, Ras signalling pathways and others (Figure 3C). To comprehensively understand the inserted genes, we analysed them according to previously reported pluripotency-related pathways (the MAPK, Wnt, JAK-STAT and other pathways). Multiple inserted genes, including Smpd4 and Igfbp4, were enriched in these pathways (Figure 3D). To validate the list of inserted genes, we identified the 10 most frequently inserted genes (Figures 3E,F and S3A) and analysed their expression levels in rat wild-type ESCs (WT-ESCs, DA-5-3) 19 and RAH-1 cells by qPCR. The qPCR results showed that Thop1, which has not previously been linked to pluripotency, was highly expressed in both DA-5-3 and RAH-1 cells (Figure 3G). We were therefore interested in addressing whether Thop1 played an essential role in the pluripotency exit of rat ESCs.
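The per-gene insertion ranking and the sense/antisense split described above can be illustrated with a short sketch. It assumes that each mapped insertion has already been annotated with the gene it hits, the strand of that gene and the strand of the inserted cassette. The record layout and function name are hypothetical and do not reproduce the HaSAPPy output format.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class Insertion(NamedTuple):
    gene: str           # gene overlapping the insertion site
    gene_strand: str    # "+" or "-", strand of the annotated gene
    insert_strand: str  # "+" or "-", strand of the inserted cassette


def summarize_insertions(
    insertions: Iterable[Insertion], top_n: int = 10
) -> Tuple[List[Tuple[str, int]], float]:
    """Return the most frequently hit genes and the sense-orientation fraction."""
    records = list(insertions)
    per_gene = Counter(record.gene for record in records)
    # An insertion is "sense" when the cassette lies on the same strand as the gene.
    sense = sum(1 for r in records if r.gene_strand == r.insert_strand)
    sense_fraction = sense / len(records) if records else 0.0
    return per_gene.most_common(top_n), sense_fraction
```

With a splice-acceptor trap, orientation matters because the cassette is generally expected to disrupt a transcript most effectively when it lies in the same orientation as the gene, which is why the two classes are reported separately.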

| Disruption of Thop1 retards pluripotency exit in rat ESCs
To test whether our candidate inserted genes were related to pluripotency exit of rat ESCs, we chose one of the candidate genes (Thop1) to perform proof-of-principle validation experiments. Thop1 encodes thimet oligopeptidase (THOP1), which is mainly expressed in the testes and the brain 20 and is conserved in many species, including rats. 21 THOP1 is a zinc metallopeptidase that metabolizes a number of bioactive peptides and degrades peptides released by the proteasome, but it has not been found to be involved in pluripotency exit. Well-grown DA-5-3 cells were labelled with the same Rex1-GFP reporter for further investigations (Figure S4A). To delete Thop1 in DA-5-3 GFP cells, we transfected them with Cas9-puro vectors carrying specific sgRNAs (Figure S4B). Puromycin-resistant colonies (Figure S4C) were randomly picked, and genotyping was performed. We obtained three Thop1-knockout (KO) subclones with which to perform subsequent experiments (Figure 4A-C). [...] showed that only DA-5-3 GFP cells cultured without PD0325901 presented obvious p-ERK levels. In contrast, Thop1-KO GFP cells cultured without PD0325901 and both groups of cells cultured in 2i/LIF medium did not show significant p-ERK (Figure 5G), suggesting that Thop1-KO was able to inhibit p-ERK in the same manner as the presence of PD0325901.
To further analyse the role of Thop1 in the random differentiation of rat ESCs in vitro, we overexpressed (OE) Thop1 in DA-5-3 GFP cells via reconstructed PB-Thop1-OE vectors, as indicated (Figure S6A). Thop1-OE colonies were enriched by puromycin selection, and DA-5-3 GFP -vector cells (cells transfected with empty vectors) were used as controls (Figure S6B). Both the qPCR and western blotting results showed that the expression levels of Thop1 were significantly higher in Thop1-OE GFP cells than in DA-5-3 GFP and DA-5-3 GFP -vector cells (Figure 6A,B). During daily culture in 2i/LIF medium, the cell viability of Thop1-OE GFP cells did not differ from that of DA-5-3 GFP and DA-5-3 GFP -vector cells (Figure S6C). However, Thop1-OE GFP cells showed obvious differentiation morphology in culture (Figure 6C). The differentiation of Thop1-OE GFP cells [...] (Figure 6D,E). Next, we performed random differentiation with Thop1-OE GFP, DA-5-3 GFP and DA-5-3 GFP -vector cells for only 6 days. Thop1-OE GFP cells differentiated more rapidly than DA-5-3 GFP and DA-5-3 GFP -vector cells, as shown by both observation and FACS analysis on Day 6 (Figure S6D,F). In addition, the cell viability of Thop1-OE GFP differentiated cells was lower than that of DA-5-3 GFP and DA-5-3 GFP -vector differentiated cells on Day 6 (Figure 6G). The western blotting results showed that the level of p-ERK in Thop1-OE GFP cells was markedly higher than that in DA-5-3 GFP and DA-5-3 GFP -vector cells when cultured in 2i/LIF medium (Figure 6H), suggesting that Thop1 OE could promote ERK phosphorylation to initiate differentiation in vitro.

| DISCUSSION
Rat haESCs have been widely studied since they were established and can not only give rise to transgenic rats via intracytoplasmic injection 9 but also be useful in interspecies hybridization research. 24 Although mouse haESCs have been proven to be powerful tools with which to determine the functions of recessive genes and mutations, 25 whether rat haESCs are also useful for genetic screening has remained elusive.
Here, we performed high-throughput mutagenesis in rat haESCs using a modified PB transposon system and obtained two independent mutant libraries covering almost the whole genome with millions of insertions (Figure 3A,B). A fragment of SA 26 constructed into the PB vector was utilized to improve the efficiency of mutagenesis (Figure S2A), enabling integration into both exons and introns to generate trapped genes. Combining splinkerette PCR and NGS is a well-developed strategy to read tremendous amounts of raw data on PB integrations without too much noise. 27 This strategy was used in our rat haploid system to quickly reveal the outcomes of high-throughput mutagenesis based on homozygosity. In addition to the transposon mutagenesis strategy, CRISPR sgRNA pools have also been widely utilized to conduct genome-wide mutagenesis in diploid cells; these pools are also suitable for genetic screening for various purposes, including investigation of pluripotency exit. 28,29 The key reason for the difficulty of rat ESC differentiation in vitro is that the mechanisms underlying self-renewal and pluripotency exit are not well known. To address this knowledge gap, we performed genetic screening of pluripotency exit using our rat mutant library. It is difficult to distinguish surviving cells from differentiated rat ESCs, mainly because severe apoptosis occurs during the differentiation process in vitro. 6,7 This greatly limits the application of rat ESCs in regenerative medicine studies and pharmacological development. Through high-throughput genetic screening with rat Rex1-GFP haESCs, we revealed multiple inserted genes related to pluripotency that were involved in various pathways (Figure 3D). In particular, deletion of Thop1 (one of the inserted genes we identified) was shown to retard the differentiation of rat ESCs by inhibiting the phosphorylation of ERK1/2.
Although we also observed differentiation when cells were cultured without LIF, there was no significant difference between Thop1-KO cells and WT-ESCs (Figure 5D,E). Whether Thop1-KO is related to the LIF/STAT3 pathway requires further investigation. In addition, overexpression of Thop1 led to ERK phosphorylation in rat ESCs, thus inducing significant differentiation in vitro. Overall, Thop1 is a pluripotency exit gene for rat ESCs that plays a role in the MAPK pathway (Figure 6F).
No previous report has directly linked Thop1 to pluripotency. However, we found that Dusp1, Dusp9 and Cyld were up-regulated in Thop1-KO rat ESCs. The Dusp family is a critical mediator of BMP signalling that controls appropriate ERK activity and is critical for cell fate determination of mouse ESCs. 31,32 Cyld encodes the deubiquitinase cylindromatosis (CYLD), which can inhibit K63 ubiquitination and prevent activation of ERK1 in human cancer cells. 33,34 Together, these results support the view that Thop1 affects pluripotency exit by regulating p-ERK1/2.
In conclusion, we utilized rat Rex1-GFP haESCs to perform genetic screening and uncovered useful information on pluripotency exit. Genome-scale mutations, including those in Thop1, were identified as being related to differentiation. Our findings not only prove that rat haESCs have advantages for use in functional genomics but also shed light on the probable mechanism underlying the self-renewal and pluripotency of rat ESCs.
Haploid cells were harvested according to logical gating with a 1n peak on a cell sorter (Beckman, EQ). [...] Table S1. To deliver vectors into rat ESCs, approximately 2 × 10^6 cells were electroporated with 5 μg of plasmids using an electroporator (Thermo, NEON) at 1300 V for 10 ms with 3 pulses.

| Molecular and cellular analysis and CCK-8 assay
Immunostaining and karyotype analysis were performed according to a previous report. 36 The primary antibodies used included anti-OCT4 (Abcam, ab18976), anti-NANOG (Abcam, ab80892) and anti-SSEA-1 (CST, 4744) antibodies. The secondary antibodies were purchased from Abcam. The nuclei were stained with Hoechst 33342 for 10 min.
For the cell viability assay, the differentiated cells were plated into 96-well plates at a density of 5000 cells per well and cultured with serum medium for 24 h. Then, the cell cultures were incubated with 5 mg/ml buffer from a Cell Counting Kit (CCK-8) (Yeasen, 40203ES60) for another 4 h. The signals of the culture media were read using a BioTek luminescence reader (Bio-Rad) at 450 nm.
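CCK-8 readings at 450 nm are commonly converted to relative viability by subtracting a blank and normalizing to a control group. The sketch below shows that conversion, assuming that blank wells (medium plus reagent, no cells) and control wells were measured alongside the samples. The text does not describe the normalization that was actually applied, and the names used here are hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_viability(
    sample_od: Sequence[float],
    control_od: Sequence[float],
    blank_od: Sequence[float],
) -> float:
    """Viability of a sample as a percentage of the control group.

    All inputs are replicate optical densities read at 450 nm.
    """
    blank = mean(blank_od)
    control_signal = mean(control_od) - blank
    if control_signal <= 0:
        raise ValueError("control signal must exceed the blank")
    sample_signal = mean(sample_od) - blank
    return sample_signal / control_signal * 100.0
```

Under this convention, a value of 100 means the sample is indistinguishable from the control, so a lower viability of overexpressing cells would appear as a value below 100.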

| Inverse PCR and splinkerette PCR
For inverse PCR, the genomic DNA extracted from cells was digested with PsuI (Thermo, FD1554), self-ligated, and amplified with two rounds of nested PCR. 17 The PCR products were purified using a PCR product purification kit (Sangon Biotech, B518141) and inserted into the pEASY-Blunt Simple plasmid (Transgene, CB111-02) in preparation for Sanger sequencing. The basic steps of splinkerette PCR were similar to those of inverse PCR but with different primers. All primers and splinkerette adaptors used are listed in Table S1. Sanger sequencing was performed by Tsingke, and the splinkerette PCR products for NGS were sent to another local company (Novogene).

| Bioinformatic analysis of insertions and RNA-seq
For the PB-mutation library, the adapters and PB tags were removed from the read pairs using the FASTX toolkit (http://hannonlab.cshl.edu/fastx_toolkit/commandline.html) before mapping to the genome. HaSAPPy 18 was run to align the trimmed reads to the genome assembly from UCSC (mm10). 38 We analysed the insertions for each gene with Gencode 39 with the default parameters.
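The tag-removal step can be pictured with a small sketch. It is not the FASTX toolkit procedure itself: it is a plain-Python illustration that assumes each informative read begins with a known PB tag followed directly by genomic sequence, and it uses the fact that piggyBac integrates at TTAA sites as a sanity filter. The tag in the usage example is a placeholder, not the real PB terminal sequence, and the function name and length cutoff are hypothetical.

```python
from typing import Iterable, Iterator


def trim_pb_tag(
    reads: Iterable[str], pb_tag: str, min_len: int = 20
) -> Iterator[str]:
    """Yield the genomic flank of reads that start with the PB tag.

    Reads lacking the tag, flanks that do not begin with TTAA and
    flanks shorter than `min_len` are discarded.
    """
    tag = pb_tag.upper()
    for read in reads:
        sequence = read.strip().upper()
        if not sequence.startswith(tag):
            continue  # no transposon end found at the start of the read
        flank = sequence[len(tag):]
        # piggyBac inserts at TTAA, so a genuine junction starts with it.
        if flank.startswith("TTAA") and len(flank) >= min_len:
            yield flank


# Usage with a placeholder tag (an assumption made for illustration only).
example_reads = ["ACGTACGTTTAAGGCC", "ACGTACGTGGGGTTAA", "TTTTTTTT"]
flanks = list(trim_pb_tag(example_reads, pb_tag="ACGTACGT", min_len=8))
```

Only the trimmed flanks would then be passed to the aligner, so that the transposon-derived bases do not interfere with mapping.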
All RNA-seq libraries were sequenced by a local company (Novogene). The raw data were processed with the FASTX toolkit to remove noise and adaptors. Genes identified by DESeq2 with an adjusted p value <0.05 were considered differentially expressed genes (DEGs). The R heatmap function was used to perform hierarchical clustering.
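The adjusted p value threshold can be made concrete with a short sketch of the Benjamini-Hochberg procedure, which is the default adjustment method in DESeq2. This is a plain-Python illustration of the thresholding step only: it assumes raw p values are already available per gene and does not reproduce the DESeq2 model or its independent filtering. The function names and gene labels are hypothetical.

```python
from typing import Dict, List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


def select_degs(pvalue_by_gene: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Return genes whose adjusted p value is below `alpha`."""
    genes = list(pvalue_by_gene)
    padj = benjamini_hochberg([pvalue_by_gene[g] for g in genes])
    return [gene for gene, q in zip(genes, padj) if q < alpha]
```

Note that a gene with a raw p value below 0.05 can still fail the adjusted threshold when many genes are tested, which is the reason for filtering on the adjusted value.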