Advanced techniques for gene heterogeneity research: Single‐cell sequencing and on‐chip gene analysis systems

Gene heterogeneity leads to wide-ranging differences in cellular behaviors, such as tumor drug-resistant mutation, epithelial-mesenchymal transition, and migration, posing significant challenges to the development of biomedicine. Traditional gene analysis methods, such as polymerase chain reaction, use a large population of cells as the gene source, so the gene properties of any specific single cell are hidden in the bulk gene information. Recent decades have seen the emergence of single-cell gene analysis techniques, which offer unprecedented opportunities to study gene heterogeneity with high precision and high throughput. In this review, we summarize the state-of-the-art techniques for single-cell sequencing and on-chip gene analysis systems. The principles of each technique are introduced in detail, with a focus on application scenarios in gene heterogeneity research. Looking forward, we also discuss the challenges of current technologies and point out future directions for facilitating the technical improvement and clinical application of single-cell gene analysis techniques.

Fully understanding the subtle gene differences among single cells is crucial for decoding gene heterogeneity. 6 In the past decades, conventional genetic analysis methods in cytobiology, such as polymerase chain reaction (PCR) and gel electrophoresis, primarily take bulk populations of cells as analytes, which neglect the specific gene phenotypes from single cells. For the purpose of gene heterogeneity research, there is an urgent desire to accurately interrogate these distinctive subpopulations at the single-cell level. [7][8] In recent years, advanced cytobiology and bioengineering technologies have offered alternative choices to realize gene analysis at the single-cell level. From the point of view of cytobiology, single-cell sequencing technologies have been widely applied to uncover molecular mechanisms, analyze gene multi-omics, and classify tumor subtypes. [9][10][11][12] For instance, nanopore sequencing technology has the capability of ultra-long reading length without the limitation of gene labels, which attracts the attention of researchers in single-cell gene sequencing. [13][14] In the field of bioengineering, the micro-electromechanical systems make the design of Lab-on-Chip systems available for single-cell analysis with marked advantages, including accelerated efficiency, automation, and ultrahigh throughput. [15][16][17] Especially, the on-chip single living cell gene analysis enables in-situ insight into the dynamic gene properties and real-time cell behaviors. [18][19][20] To address the limitations of gene content from single cells, 21 researchers in cytobiology are gradually forming alliances with bioengineering scientists to develop advanced single-cell gene analysis technologies for gene heterogeneity research.
In this review, we systematically introduce the recent advances in both off-chip single-cell sequencing techniques and on-chip single-cell gene analysis systems (Figure 1). We summarize the recent progress of single-cell sequencing for three types of cellular omics, that is, genome, transcriptome, and epigenome. Furthermore, the novel on-chip bioanalytical systems for high-throughput cell manipulation and the interrogation of single-cell genes are introduced in detail according to their structural features. We also discuss the current challenges faced in single-cell gene analysis, as well as the future opportunities for clinical application. This review presents the advanced single-cell gene analysis techniques in cytology and bioengineering, with the aim of promoting the study of gene heterogeneity and providing references for precision medicine.

SINGLE-CELL SEQUENCING TECHNOLOGY
Single-cell sequencing technology is widely used for sequencing cell genomes, transcriptomes, and epigenomes, and has shown its versatility in revealing gene heterogeneity among individual cells. [22][23][24] Since single-cell RNA sequencing was reported by Tang et al. in 2009, 25 single-cell sequencing technologies have had a significant impact in the fields of oncology, developmental biology, microbiology, and neuroscience, and were ranked among the most anticipated biotechnologies by Nature in 2020. [26][27][28] In this section, we summarize the recent single-cell sequencing technologies for the analysis of genome, transcriptome, and epigenome.

Single-cell genome sequencing
Single-cell genome sequencing (SGS) is an effective method to screen single nucleotide polymorphism (SNP) and copy number variation (CNV) from individual cells. 11 In its workflow, SGS generally contains the procedures of whole-genome amplification (WGA) due to the limited number of DNAs in a cell (∼6-12 pg). 29 How to avoid the technological noise introduced during the amplification is the main challenge in SGS. 30 In addition, the demands for detection throughput and accuracy are the continuous driving force to promote the development of gene sequencing technology. 31 Therefore, the recent efforts are focused on these two aspects for improving SGS.
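As a rough plausibility check on the ∼6-12 pg figure quoted above, the DNA mass of a single cell can be estimated from its base-pair count. The sketch below is only an illustration: the haploid genome size (about 3.1 × 10^9 bp) and the average mass of a base pair (about 650 g/mol) are approximate values assumed here, not taken from this review, and the function name is hypothetical.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average molar mass of one base pair
HAPLOID_HUMAN_BP = 3.1e9     # assumed size of one haploid human genome copy


def dna_mass_pg(base_pairs):
    """Return the mass in picograms of double-stranded DNA of this length."""
    grams = base_pairs * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e12  # 1 g = 1e12 pg


# A resting diploid cell carries two genome copies; a cell that has
# replicated its DNA (G2/M) carries four.
diploid_pg = dna_mass_pg(2 * HAPLOID_HUMAN_BP)
replicated_pg = dna_mass_pg(4 * HAPLOID_HUMAN_BP)
print(f"diploid: {diploid_pg:.1f} pg, replicated: {replicated_pg:.1f} pg")
```

Under these assumptions the estimate lands at a few picograms per cell, the same order of magnitude as the quoted range, which is why amplification is needed before library preparation.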

Whole-genome amplification
According to the amplification characteristics, WGA can be divided into PCR amplification, isothermal amplification, the hybrid method based on PCR and isothermal amplification, and linear amplification. [32][33][34][35][36] PCR amplification includes primer extension preamplification and degenerate oligonucleotide-primed PCR (DOP-PCR). The primer extension preamplification method uses a random primer of 15 bases that anneals at 37°C and is extended at 55°C in repeated cycles for gene replication. By using this method, Zhang et al. achieved full amplification of the genome from single haploid cells. 32 The primer of DOP-PCR has a six-base random sequence at the 3′ end that anneals randomly to genomic DNA at 25°C. 33 Then the temperature is raised to 55°C to start the PCR cycles that amplify the entire genome. DOP-PCR, as an efficient amplification technology, has been used to sequence and quantify genes in single cells. 4,37 Its low coverage and high error rates, however, may lead to false SNPs. By modifying the primers and the DNA polymerase, the improved DOP-PCR has significantly increased the amplification efficiency and quality relative to the original DOP-PCR. 38 However, due to the participation of Taq DNA polymerase, non-specific amplification, incomplete site coverage, and a high base mutation rate limit the further application of improved DOP-PCR.
FIGURE 1 Single-cell gene analysis techniques for gene heterogeneity research. Recent single-cell sequencing techniques for the analysis of genome, transcriptome, and epigenome. Single-cell genome sequencing techniques mainly include next-generation sequencing techniques featuring sequencing-by-synthesis and third-generation sequencing techniques based on the electrical signal produced as DNA crosses a nanopore. Single-cell transcriptome sequencing adds the procedure of RNA reverse transcription (RT) before sequencing and can be divided into full-length sequencing and tag-based sequencing. Single-cell epigenetic sequencing techniques are introduced according to their application in analyzing DNA modification, histone modification, chromosome accessibility, and chromosome conformation. On-chip single-cell gene analysis systems consist of micro-/nano-chips based on the microarray, microdroplet, and physical mechanics. Microarray chips provide gene analysis with single-cell resolution by microstructure-based cell manipulation. Microdroplet chips isolate numerous cells into single cells by dividing the cell-containing solution into microdroplets. Physical mechanics, such as dielectrophoresis, acoustic waves, and optical traps, provide a non-contact way to manipulate and interrogate the cells
Multiple displacement amplification (MDA) is one of the most widely used isothermal amplification methods. 34 Under a constant temperature, primers consisting of six bases hybridize randomly with the template. A strand displacement reaction then occurs under the action of phi29 DNA polymerase. The displaced single strands hybridize randomly with further primers to form branched amplification. Owing to the use of phi29 DNA polymerase, MDA has a larger genome coverage than PCR, which supports its applicability in SNP genotyping, chromosome mapping, DNA blotting, and gene sequencing. A whole-genome and exome single-cell sequencing method based on MDA, known as single nucleus genome sequencing, has also been developed for sequencing single-cell DNAs from tumor cells. 39 This method uses G2/M nuclei as targets, with an average gene coverage of 91%. However, under the conditions of isothermal amplification, MDA requires templates of high quality to avoid non-specific interference.
To solve this problem, droplet MDA encapsulates single-cell genes into nanoliter-scale reaction volumes with a commercially available liquid dispenser for gene amplification, which achieves up to 80% single-cell genome coverage with minimal contamination and high throughput. 40 The hybrid method based on PCR and isothermal amplification, such as multiple annealing and looping-based amplification cycles (MALBACs), is initiated by a group of random primers with 27 common bases at the 5′ end and eight random bases at the 3′ end (Figure 2A). 35 The random primers hybridize with templates uniformly at 0°C and produce semiamplicons at 65°C with DNA polymerase. The semiamplicons are denatured at 94°C, annealed at 0°C, and extended at 65°C to form full amplicons. The full amplicons are then looped at 58°C to prevent their further amplification and the cross-hybridization of DNAs. Researchers have used MALBACs to detect digital CNV across the whole genome of single cancer cells and to reveal significant CNV differences in the chromosomes of cancer cells. 41 Compared with MDA, MALBACs show more uniform genome coverage. However, the SNV false-positive rate of genes amplified by MALBACs is higher than that of MDA because the fidelity of the DNA polymerase in MALBACs is lower than that of phi29 DNA polymerase.
Linear amplification via transposon insertion (LIANTI) first fragments single-cell genes by Tn5 transposition with a well-designed transposon carrying a T7 promoter (Figure 2B). 36 The DNA fragments labeled by the T7 promoter are linearly amplified into thousands of RNA copies by in vitro transcription. After reverse transcription and RNA digestion, the second strands are synthesized to form double-stranded LIANTI amplicons for DNA library preparation. LIANTI reduces non-specific amplification and significantly improves amplification uniformity and accuracy, and has been used for the detection of micro-CNV with a resolution of 1000 bases. 35 To improve the throughput of LIANTI, Yin et al. designed a three-level single-cell combinatorial indexing scheme to achieve a throughput of 1 million cells per test, which significantly enhanced the detection efficiency of CNV. 42
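The reason multi-level combinatorial indexing can reach such throughput is simple arithmetic: the number of distinguishable cells grows as the product of the barcodes used in each round. The sketch below illustrates this under loudly stated assumptions: the barcode counts per round (384 each) are hypothetical and not taken from the cited work, and barcode assignment is treated as uniformly random, which is a simplification of any real protocol.

```python
from math import prod


def barcode_space(barcodes_per_round):
    """Number of distinct barcode combinations across all indexing rounds."""
    return prod(barcodes_per_round)


def expected_collision_fraction(n_cells, n_combinations):
    """Probability that a given cell shares its combination with another cell.

    Assumes each cell receives a combination uniformly at random and
    independently of all other cells (a simplifying assumption).
    """
    return 1.0 - (1.0 - 1.0 / n_combinations) ** (n_cells - 1)


rounds = [384, 384, 384]                 # hypothetical three-level design
combinations = barcode_space(rounds)     # 384 ** 3 combinations
collision = expected_collision_fraction(1_000_000, combinations)
print(combinations, round(collision, 4))
```

With these illustrative numbers, about one million cells can be indexed while fewer than 2% of them are expected to share a barcode combination with another cell.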

Genome sequencing technology
The amplified single-cell gene library relies on gene sequencing technology to determine the sequence of the gene. To date, sequencing technology has undergone three generations of development. The first generation of sequencing technology was invented by Sanger et al. in 1977. 43 This sequencing method uses dideoxynucleotides, which cannot form a phosphodiester bond during DNA synthesis because they lack the hydroxyl group at the 3′ end, to interrupt the DNA synthesis reaction. Therefore, the first-generation sequencing technology is called "the dideoxy chain-termination method", also known as the Sanger method. In the sequencing process, four types of radioactive isotope-labeled dideoxynucleotides are added separately to four DNA synthesis reaction systems. After gel electrophoresis and autoradiography, the DNA sequence to be tested is determined according to the positions of the electrophoretic bands. This sequencing method is simple, convenient, and highly accurate, and is widely used in the gene identification of known mutation sites. 35 However, it is difficult to use the Sanger method to screen large samples with no candidate genes or with a large number of candidate genes.
The dominant technologies for single-cell gene sequencing include reversible terminator sequencing, such as HiSeq/MiSeq and Ion Torrent platforms. 43 Both HiSeq/MiSeq and Ion Torrent adopt the strategy of sequencing-by-synthesis for gene sequencing, which is also called next-generation sequencing. In the sequencing process on HiSeq/MiSeq platform, 44 the deoxynucleotide triphosphates (dNTPs) labeled with specific fluorescence are protected by 3′-OH to guarantee that only one dNTP bonds to the synthetic chain in one cycle. The fluorescence signals on dNTPs can be recorded and converted into the information of dNTPs. After the quenching of fluorescence and removal of the 3′-OH protecting group, the next round of sequencing reaction is carried out. More than 99.9% accuracy and low cost make the HiSeq/MiSeq platform quickly become the leading sequencing technology for clinical diagnosis and gene research. 44 In the Ion Torrent platform, the dNTPs are designed with two pyrophosphate groups. 45 The DNA strands are fixed in the micropores of the semiconductor chip, and then dNTPs are sequentially incorporated. When the DNA polymerase polymerizes the dNTPs onto the extending DNA strands, a hydrogen ion is released to change the pH in the reaction well. The sensor can convert the ion signal directly into a digital signal to read the DNA sequence. Without the expensive optical system, the Ion Torrent platform uses the semiconductor to collect signals, which significantly reduces the cost and simplifies operations. 46 Compared with the Sanger method, next-generation sequencing greatly increases the sequencing throughput and reduces the sequencing time of a human genome from 3 years to less than 1 week. However, their read length is much shorter (50-300 bp) than Sanger sequencing (∼1000 bp), so that more than 70% of human gene structure changes cannot be detected. 
[47][48][49][50] The third-generation sequencing techniques, including single-molecule real-time (SMRT) sequencing and nanopore sequencing, which are also called single-molecule sequencing, significantly increase the read length (>10000 bp) of gene sequencing. 14,51-54 SMRT sequencing uses nanoscale zero-mode waveguide holes in which a DNA polymerase and a DNA sample are immobilized (Figure 2C). 14 When the laser illuminates the bottom of the zero-mode waveguide holes, only the fluorescent group carried by the dNTP in this area is excited and detected, which greatly reduces the background fluorescence interference. This sequencing method does not require PCR amplification and can cover regions with high guanine and cytosine (GC) content and highly repetitive regions. In nanopore sequencing, proteins with a nanopore, such as α-hemolysin, Mycobacterium smegmatis porin A, and curli-specific gene G (CsgG), are anchored on an electrically resistant membrane and provide the nanochannels through which nucleic acid molecules pass (Figure 2D). 55 Meanwhile, the characteristic current change caused by the nucleic acid during the transmembrane process is recorded and used to identify the specific bases. By identifying the electrical signal of bases, nanopore sequencing enables label-free gene sequencing with ultra-long read lengths equal to or even longer than those of SMRT sequencing. [56][57] However, at present, the accuracy and the nanopore assembly technology of nanopore sequencing still need to be further improved. 58
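To make the semiconductor (Ion Torrent) readout described in this section more concrete, the following minimal sketch shows how a series of per-flow signals could be turned into a base sequence: nucleotides are flowed one at a time, and the signal in each flow is roughly proportional to the number of bases incorporated. The flow order `"TACG"`, the example signal values, and the function name are assumptions made here for illustration; real base callers apply signal normalization and error models that are not shown.

```python
def decode_flows(flow_order, signals):
    """Translate per-flow incorporation signals into a base sequence.

    flow_order: nucleotides flowed cyclically, e.g. "TACG" (assumed order).
    signals: one normalized signal per flow; about 1.0 per incorporated base.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        # Round to the nearest whole number of incorporations; a signal
        # near 0 means this nucleotide was not incorporated in this flow.
        count = max(0, round(signal))
        bases.append(nucleotide * count)
    return "".join(bases)


# Eight flows (two cycles of T, A, C, G) with made-up signal values.
read = decode_flows("TACG", [1.02, 0.05, 1.96, 0.98, 0.03, 1.1, 0.0, 2.9])
print(read)
```

In this toy example, the last flow has a signal of 2.9 that must be rounded to three incorporations, which hints at why runs of identical bases are harder to call from an analog signal than isolated bases.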

Single-cell transcriptome sequencing
The cell transcriptome, for example, messenger RNA (mRNA) and small interfering RNA (siRNA), profiles random variability according to the internal or external influences, such as differentiation stages and environmental stressors. 59 Single-cell transcriptome sequencing offers profound insights into the variability of gene expressions. 60 Because of the challenges on direct RNA sequencing, dozens of methods have been involved in developing the single-cell RNA reverse transcription and complementary DNA (cDNA) library construction before gene amplification and sequencing, 30,61 which can be briefly divided into two categories: full-length sequencing and tag-based sequencing.

Full-length sequencing
The full-length sequencing techniques sequence the complete mRNAs of cells without RNA fragmentation to obtain uniform coverage and an increased number of mappable reads. Their wide coverage supports isoform discovery, the analysis of splicing events, and SNP identification for allele-specific gene expression. 10 Full-length sequencing for single-cell transcriptome analysis mainly includes switching mechanism at 5′ end of the RNA transcript sequencing (Smart-seq), Quartz-seq, multiple annealing and dC-tailing-based quantitative single-cell RNA sequencing (MATQ-seq), and single-cell universal poly(A)-independent RNA sequencing (SUPeR-seq).
In Smart-seq, mRNAs hybridize with oligo(dT)-containing primers for reverse transcription, and a poly(C) overhang is formed by the addition of cytosine nucleotides. 62 Primers containing oligo(dG) then hybridize with the poly(C) overhang to form the cDNAs. The full-length cDNAs are amplified by PCR to produce nanogram-scale DNAs for gene sequencing. Owing to its high RNA coverage, Smart-seq is suited for analyzing alternative splicing in detail, and it has been applied to identify the gene expression of circulating tumor cells (CTCs) and to highlight intratumoral heterogeneity in primary glioblastoma. [62][63] To further increase the sensitivity and cDNA yield of Smart-seq, locked nucleic acids, together with high concentrations of MgCl2 and betaine, are introduced to improve the thermal stability and the cDNA production, respectively (Figure 3A). 64 However, the second-generation Smart-seq is only suited for the analysis of polyadenylated RNAs. In addition, its results cannot reflect the specificity of mRNAs.
In the Quartz-seq method, exonuclease I is employed to digest the reverse transcription primers, and a poly(A) tail is added at the 3′ end of the first-strand cDNA. The second strand is synthesized by using tag primers that inhibit the amplification of surviving primers. Quartz-seq has the advantages of high reproducibility and sensitivity, and has been used to analyze the heterogeneity of gene expression and the different cell-cycle stages among single cells. 65 Quartz-seq 2 uses reverse transcription primers composed of a barcode sequence, a unique molecular identifier (UMI), and oligo-dT to endow each cell with a unique barcode. 66 The high conversion efficiency of UMIs allows Quartz-seq 2 to detect more transcripts from limited sequence reads at a minimal cost.
Inspired by the primers used in MALBACs, the primers of MATQ-seq are mainly composed of G, A, and T bases for obtaining uniform coverage (Figure 3B). 67 At the same time, in order to reduce the bias introduced during PCR amplification, a UMI with a random hexamer is used to label the sequences specifically. The MATQ-seq method reaches 89.2% capture efficiency without the concern of 3′ or 5′ end bias, which provides an efficient tool for detecting low-abundance genes. In addition, MATQ-seq can distinguish differences in gene expression among cells of the same type. 67
Circular RNAs (circRNAs) are single-stranded, covalently closed RNA molecules that reportedly play an important part in miRNA inhibition and tumorigenesis. [68][69] In order to study circRNAs, SUPeR-seq was developed, which uses primers with a universal anchor sequence (AnchorX-T15N6) to reverse-transcribe the circRNAs and a second universal anchor sequence (AnchorY-T24) to generate the second cDNA strands. 70 Deoxyadenosine triphosphate and dideoxyadenosine triphosphate are added at a ratio of 100:1 to synthesize a poly(A) fragment at the 3′ end of the cDNAs. SUPeR-seq has been used to detect circRNAs in the early embryonic development of mammals. By using random primers with anchor sequences, SUPeR-seq has also been used to detect both poly(A)+ and poly(A)- RNAs from single cells without the concern of ribosomal RNA contamination and 3′ end bias. 70 Although this method shows high sensitivity, the time-consuming process of cell separation remains a challenge for high-throughput single-cell circRNA analysis. 71

Tag-based sequencing
Tag-based sequencing techniques typically use UMIs to achieve the simultaneous characterization of multiplexed samples with improved throughput, and are mainly used for the quantification of gene expression. 72 In general, the tag-based approach is relatively insensitive because the mappable reads are limited to one end of the transcripts. Among the tag-based sequencing techniques, the 3′ tag-based methods have developed rapidly, including massively parallel single-cell RNA-sequencing (MARS-seq), cell expression by linear amplification and sequencing (CEL-seq), and gene expression cytometry sequencing (Cyto-seq). MARS-seq, developed by Amit's group, is the representative method for large-scale parallel sequencing. 73 In MARS-seq, single cells are individually sorted into 384-well plates by fluorescence-activated cell sorting (FACS). The mRNAs from lysed cells are annealed to T7 promoter-containing primers that carry unique molecular identifiers. The adoption of three levels of barcodes, that is, molecular, cellular, and plate-level tags, allows MARS-seq to realize robust multiplex analysis. This method not only significantly improves the sequencing throughput and repeatability but also reduces the experimental cost, and has been used to handle in vivo samples containing a variety of cell subsets. [73][74] However, errors are easily introduced at the 3′ end in the purification step. Improved MARS-seq (MARS-seq 2) optimizes the concentration and composition of the reverse transcription primers, as well as the removal step of residual DNA templates, which greatly reduces the cell-to-cell contamination and the error rate at the 3′ end. 75 MARS-seq 2 has been used to identify cell types in different tissues and unique model systems. 75 To overcome the challenge of limited amounts of RNAs, Hashimshony et al. developed the CEL-seq technique to amplify the RNAs with high efficiency ( Figure 3C). 
76 In this method, individual cells are mixed with uniquely barcoded primers for cDNA synthesis. Then the cDNAs are fragmented and purified before a modified version of the Illumina directional RNA protocol is performed. The sequences carrying both Illumina adaptors are selected for PCR and then sequenced with paired-end reads. Because the barcoded samples can be pooled, the amplification efficiency of CEL-seq is greatly improved. By adding a five-base UMI upstream of the barcodes, CEL-seq 2 detects over 30% more genes than CEL-seq with higher accuracy, at the cost of increased operation time. 77 Cyto-seq has gained widespread attention due to its ability to perform high-throughput analysis of single cells. 78 In the process of Cyto-seq, mRNAs from single cells hybridize with combinatorial library-modified beads in the well arrays. These beads are labeled with specific recognition probes to track every single cell. Subsequently, the mixtures are pooled for reverse transcription, gene amplification, and sequencing. Researchers have applied Cyto-seq to examine thousands of heterogeneous cells for characterizing complex samples of the human hematopoietic system and demonstrated the ability of Cyto-seq to identify major subsets within human peripheral blood mononuclear cells. 78
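The quantification step shared by these tag-based methods can be sketched as follows: reads that carry the same cell barcode, gene, and UMI are treated as PCR copies of one original molecule and counted once. This is a minimal illustration, assuming reads have already been mapped to genes and that barcodes and UMIs are error-free; the function name, barcodes, and gene labels are hypothetical example values.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count unique UMIs per (cell barcode, gene).

    reads: iterable of (cell_barcode, umi, gene) tuples.
    Returns a dict mapping (cell_barcode, gene) to the number of distinct UMIs.
    """
    umis_seen = defaultdict(set)
    for cell, umi, gene in reads:
        umis_seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


reads = [
    ("CELL_A", "AACGT", "GAPDH"),
    ("CELL_A", "AACGT", "GAPDH"),  # same UMI again: PCR duplicate, not counted
    ("CELL_A", "TTGCA", "GAPDH"),
    ("CELL_B", "AACGT", "GAPDH"),  # same UMI in another cell: separate molecule
    ("CELL_B", "GGATC", "ACTB"),
]
counts = count_molecules(reads)
print(counts)
```

Counting distinct UMIs rather than raw reads is what makes the expression estimate insensitive to uneven amplification of individual molecules.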

Single-cell epigenetic sequencing
Epigenetics refers to changes in gene expression levels caused by non-gene sequence changes, which is essential in investigating functional genomic elements, regulatory gene variations, and cellular behavior diversity. [79][80][81] The study of epigenomes mainly includes DNA modifications, chromatin accessibility, histone modifications, and chromosome conformation. 82 At the single-cell resolution, single-cell epigenetic sequencing techniques offer a broad view for exploring epigenetic heterogeneities that are masked in the conventional measurement of bulk cellular populations. 83

DNA modification
The chemical modification of DNA groups can change the genetic performance without changing the DNA sequences. 84 For example, DNA methylation may change the chromatin structure, DNA conformation, DNA stability, or DNA-protein interaction by adding a methyl group to the 5-carbon of cytosine in CpG dinucleotides. [85][86] Recently, several methods have been developed for sequencing DNA modifications at the single-cell level, including reduced representation bisulfite sequencing (RRBS), single-cell BS, and single-nucleus methylome sequencing (snmC-seq).
RRBS is a genome-wide DNA methylation sequencing method that uses a restriction endonuclease to digest DNAs and a size-selection strategy to enrich CpG-dense regions. 87 In single-cell RRBS, genomic DNAs from lysed single cells are digested by restriction enzymes and ligated to DNA fragment adapters. 88 After bisulfite conversion, the converted DNAs are purified with a Zymo spin column for PCR amplification. This method can determine the methylation status of more than 1.5 million CpG sites in the single-cell genome and distinguish between non-methylation and complete methylation at a single CpG site. 88 However, it is limited by its low coverage (10%) of CpG sites. Compared with single-cell RRBS, the single-cell BS method increases the coverage of CpG dinucleotides to 50% of the entire genome. 89 The procedures, including cell lysis, bisulfite conversion, preamplification, adaptor tagging, library amplification, sequencing, alignment, and methylation calling, can be completed within 3 days for DNA methylation analysis. The snmC-seq strategy integrates FACS, plate-based bisulfite treatment, and DNA purification to analyze DNA methylation in single nuclei. 90 The high read mapping rate and multiplexed reactions significantly improve its throughput for large-scale cell type classification. Luo et al. have used this method to profile >6000 methylomes from single neuronal nuclei and identify neuronal subpopulations in the frontal cortex. 90 In order to further increase the read mapping rate and reduce adapter dimers, they subsequently proposed snmC-seq 2 with optimized experimental factors to improve the quality of libraries (Figure 4A). 91
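The methylation-calling step common to these bisulfite-based methods rests on a simple rule: bisulfite converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C. The sketch below applies this rule to a single read, under assumptions stated here for illustration only: the read is already aligned to the forward strand of the reference without gaps, and the sequences and function name are made-up examples.

```python
def call_cpg_methylation(reference, converted_read):
    """Call methylation at forward-strand CpG cytosines of an aligned read.

    Returns {position: True} if the cytosine is still read as C (methylated)
    and {position: False} if it is read as T (unmethylated, converted).
    Positions showing any other base are skipped as uninformative.
    """
    if len(reference) != len(converted_read):
        raise ValueError("read must be aligned to the reference without gaps")
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            base = converted_read[i]
            if base == "C":
                calls[i] = True
            elif base == "T":
                calls[i] = False
    return calls


reference = "ACGTTCGACCGA"   # CpG cytosines at positions 1, 5, and 9
read = "ACGTTTGATCGA"        # after bisulfite conversion and sequencing
calls = call_cpg_methylation(reference, read)
methylation_level = sum(calls.values()) / len(calls)
print(calls, methylation_level)
```

In this toy read, two of the three CpG sites remain as C and are called methylated, while the non-CpG cytosine at position 8 is converted and ignored by the caller.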

Chromosome accessibility
Chromosome accessibility is the degree to which nuclear macromolecules are able to physically contact chromosomal DNA, which broadly reflects the regulatory capacity of chromatin organization and function. 92 Pico-seq, single-cell transposase-accessible chromatin using sequencing (scATAC-seq), single-cell transposome hypersensitive sites sequencing, and single-cell micrococcal nuclease sequencing are universal methods for detecting chromosome accessibility at the single-cell level.
Pico-seq was developed to detect the DNase I hypersensitive sites in chromatin (Figure 4B). [93][94] The single cells sorted by FACS are digested in the presence of DNase I. After that, end repair, adaptor ligation, PCR amplification, and sequencing are performed in turn for the interrogation of open chromatin regions. This method greatly expands the application of DNase I hypersensitive site analysis in genetic transformation research. Similarly, scATAC-seq uses a programmable microfluidic platform to capture single cells and the hyperactive Tn5 transposase to insert Illumina sequencing adaptors into the accessible chromatin regions. 95 After transposition, the gene library is collected by PCR with barcoded primers. This method can dissect single-cell epigenetic heterogeneity and link cis- and trans-effectors to the variability of accessibility profiles. However, it suffers from a low recovery rate of DNA fragments, low throughput, and poor data collection efficiency. Single-cell transposome hypersensitive sites sequencing adopts the super-mutant Tn5 transposase and performs in vitro transcription-based amplification. 96 This method improves the coverage of cell-specific distal enhancers and is more sensitive than scATAC-seq. In single-cell micrococcal nuclease sequencing, MNase not only acts as an endonuclease to cleave the internucleosomal DNAs but also acts as an exonuclease to degrade DNA cleavage products. 97 In addition to chromatin accessibility, this method can simultaneously measure genome-wide nucleosome positioning in the same single cells. 98

Histone modification
In eukaryotic cells, histones package linear DNA molecules into strings of nucleosomes. Histone modifications, such as acetylation, methylation, and phosphorylation, participate in the control and regulation of gene expression. 99-100 Errors in this epigenetic regulation may cause a series of diseases, including allergy and related disorders. 81 To study histone modifications at the single-cell level, a series of sequencing technologies has been developed, including single-cell chromatin immune-cleavage sequencing, combinatorial barcoding and targeted chromatin release, and so forth. Chromatin immunoprecipitation in parallel with sequencing can be used to study DNA-protein interactions across the whole genome, and is an alternative technique for histone modification analysis. 101 In this method, DNA-protein complexes are fragmented and treated with exonucleases to trim unbound oligonucleotides. Antibodies are employed to bind the DNA-protein complexes for the specific enrichment of the DNA fragments bound to the target proteins. Then the enriched DNA fragments are amplified by ligation-mediated PCR before high-throughput sequencing. However, conventional chromatin immunoprecipitation in parallel with sequencing requires millions of cells as the starting material. 102 The subsequent single-cell chromatin immune-cleavage sequencing (scChIC-seq) achieves the dissection of DNA-protein interactions at the single-cell level (Figure 4C). 103 Nevertheless, its genome alignment rate for single cells is only about 6.1%, which greatly increases the sequencing cost and reduces the throughput. He's group integrated barcoding and targeted chromatin release to develop the universal, high-quality, and high-throughput combinatorial barcoding and targeted chromatin release technology. This strategy enables the measurement of tens of thousands of single cells (∼12,000 reads/cell) per test under both native and cross-linked conditions. 104

Chromosome conformation
The 3D chromosome structures have shown an intimate connection with gene transcription, replication, DNA damage, and repair. 105 The chromosome conformation capture (3C) method allows the detection of spatial proximity between DNA sequences by the digestion and subsequent re-ligation of the crosslinked chromatin in the cell nucleus. 106 Subsequent techniques have been continuously improved on this basis, leading to methods for single-cell chromosome conformation analysis, such as single-cell high-throughput 3C sequencing (scHi-C), single-cell combinatorial indexing Hi-C (sciHi-C), and diploid chromatin conformation capture. Hi-C takes the whole nucleus as the research object to study the spatial relationship of chromatin DNAs by using high-throughput sequencing technology and bioinformatic analysis. 107 Jin et al. adopted this method to study the interaction of enhancers and promoters in chromatin, and suggested that the targets of cell-specific enhancers are already hardwired into the chromatin architecture in each cell lineage. 93 The scHi-C method shows higher resolution for the interrogation of chromatin conformation, with the capability to confirm the changes in the 3D genome structure among single cells. 108 However, low throughput is a barrier to the development of scHi-C. In order to improve the throughput, sciHi-C was developed by using two rounds of split-pool cell barcoding (Figure 4D). 109 Thus, thousands of single cells can be analyzed in one assay. Its obvious deficiency, however, is insufficient resolution. Compared with sciHi-C, the diploid chromatin conformation capture method reported by Tan et al. offers a high spatial resolution. 110 This method combines transposon-based WGA with an algorithm to detect the chromatin contacts and impute the two chromosome haplotypes linked by each contact, realizing the reconstruction of the single-cell genome structure.
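The output of these conformation-capture methods is usually summarized as a contact matrix, in which each entry counts the re-ligated fragment pairs that connect two genomic bins. The sketch below builds such a matrix for one chromosome. It is an illustration only, assuming contacts are already given as pairs of mapped positions on the same chromosome; the function name, bin size, and example coordinates are hypothetical.

```python
def contact_matrix(contacts, chrom_length, bin_size):
    """Bin pairwise contacts into a symmetric matrix of counts.

    contacts: iterable of (pos_a, pos_b) positions in base pairs, each
              smaller than chrom_length.
    Returns a list of lists where entry [i][j] counts contacts between
    bin i and bin j.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in contacts:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix


# Three made-up contacts on a 3000 bp toy chromosome with 1000 bp bins.
contacts = [(100, 2500), (150, 2900), (1200, 1300)]
matrix = contact_matrix(contacts, chrom_length=3000, bin_size=1000)
for row in matrix:
    print(row)
```

The bin size sets the resolution of the map: a single cell yields few contacts, so smaller bins leave most entries empty, which is one way to see the trade-off between throughput and resolution discussed above.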

ON-CHIP SINGLE-CELL GENE ANALYSIS
Conventional single-cell gene analysis technologies struggle to overcome the common issues of low throughput and high cost, which impedes their wide application in clinics. 111 In addition, the complex operational processes may lead to sample contamination, compromising accuracy and causing the dropout of low-abundance genes (∼picograms). 112 On-chip technologies enable the manipulation of millions of cells in nanoliter reaction chambers to decrease the input contents, reduce contamination, and improve detection throughput. 15,[113][114] By integrating with gene analysis technologies, on-chip technologies offer additional capabilities, including high throughput, small sample consumption, and convenient operability for single-cell gene heterogeneity research. [115][116][117] According to their principles and structural features, we introduce the current on-chip systems based on microarrays and microdroplets. In addition, physical mechanics-assisted chips based on microfluidics are also introduced as an advanced technology for accurate cell manipulation, which allows the control of specific cells in single-cell gene analysis.

Microarray-based chips
The microarray-based chips feature numerous repeated microstructure units for patterning cells. 116 Thus, gene analysis of different cells can be implemented simultaneously in these isolated, non-interfering spaces. The merits of microarray-based methods, including their controllable structure and portable operation, make them a widespread platform for high-throughput cell manipulation and single-cell gene analysis. 118 According to the characteristics of the microstructure units, we describe three types of array chips, based on microtraps, microwells, or microchambers, and introduce their applications in single-cell gene analysis.

Microtrap arrays
Microtrap arrays adopt well-designed microstructures to isolate single cells on microfluidic chips. 119 Cells flowing along the microchannels are trapped by the microstructures and then subjected to gene analysis. For example, Kimmerling et al. utilized hydrodynamic microtrap arrays to capture and culture single cells (Figure 5A). 120 After multi-generational culture, single-cell RNA sequencing was implemented to reveal both lineage- and cell cycle-dependent transcriptional signatures. For the isolation and purification of genomic DNA from individual cells, Tian and coworkers developed a micropillar array that could capture single cells and extract DNA within the microfluidic channel for WGA. 121 In addition to capturing cells based on their sizes, modification with antibodies or aptamers endows chips with the functions of immunological recognition and specific capture, which have been widely used in single-cell analysis. Kwon's group used EpCAM antibody-labeled microcolumns to capture CTCs and a laser to isolate the microcolumns carrying target cells. 122 This single-cell separation method maintains the integrity of cells and guarantees the accuracy of gene sequencing. However, the above-mentioned methods all rely on lysed cells for gene extraction and sequencing, which makes it impossible to track the behavior of living cells. By integrating a modified atomic force microscopy probe, Li et al. realized in-situ mRNA extraction from single CTCs trapped in a high-density microfluidic array (Figure 5B). 123 This platform has the potential for in-situ monitoring of cell behaviors at the single-cell level, in areas such as drug resistance and stem cell biology.

Microwell arrays
The microwell array-based chips feature large-scale arrays of microwells whose sizes are similar to those of single cells. 124 The cells are patterned into single-cell arrays on the microwell chip with the assistance of external forces, such as centrifugal force, magnetism, gravity, and vacuum. [125][126][127][128][129][130] Since the scale of microwells is easily expanded by chip design and fabrication, microwell array-based chips can readily meet the requirements of high throughput. Gao et al. proposed a microwell array containing more than 20,000 units to separate CTCs into independent units by centrifugation (Figure 5C). 125 After on-chip cell lysis and PCR amplification, duplex EGFR mutation genes were read out in parallel from single CTCs. Because cells bound to magnetic beads move under a magnetic field, cells can also be loaded magnetically. 126 Meanwhile, as all cells are separated into small units, the CTC-containing units are purified by eliminating the interference from white blood cells. The CTCs in microwells can be directly extracted by micromanipulation for single-cell genetic analysis. To further improve cell extraction efficiency, a semiautomated single-cell aspirator was developed to enrich CTC populations from a specialized microwell array. 127 This method is a convenient platform for reliable single-cell molecular characterization, including transcriptomic and genomic assays. For the interrogation of RNAs in living cells, Chang's group developed a 3D micropore array-based electroporation platform to deliver RNA-targeted molecular beacons into single cells (Figure 5D). 131 The cells were rapidly loaded into thousands of microwells by vacuum. By virtue of the concentrated electric field at the micropores, safe cell membrane perforation and efficient molecular beacon delivery were completed under a low voltage. This platform provides a simple yet high-throughput probe delivery method for intracellular RNA investigation, showing its potential for monitoring living cell behaviors. 132-133

Microchamber arrays
Microchamber arrays couple microvalves for flow control with microchambers for cell confinement, which realizes single-cell control and genetic analysis in an automatic manner. 134 For example, Huang's group adapted a microvalve-based multilayer microfluidic device that allowed on-chip cell capture and lysis. 17 mRNA was extracted from eight single cells simultaneously for cDNA preparation, increasing the throughput and experimental efficiency of the subsequent single-cell whole-transcriptome sequencing. Li et al. established a tri-state microvalve-based microfluidic chip to precisely control the movement of cells and reagents. 135 This method was capable of in-situ detection of multiple oncogenic mutations at the single-cell level. In addition, by integrating on-chip cell manipulation and single-cell sequencing, they also realized the analysis of key driver and drug-resistance mutations from single CTCs in clinical blood samples (Figure 5E). 136 Although the tri-state valve structure has the advantages of high accuracy and minimal cell loss, its structural complexity imposes strict requirements on chip preparation. To simplify the fabrication and operation of the microchamber, a one-pot microfluidic device was developed to perform single-cell RNA sequencing based on the principle of diffusion-based reagent swapping (Figure 5F). 137 The reagents were driven into the chambers by the concentration gradient, thus dispensing multiple chambers for multi-step treatment.
Although microarray-based devices increase the throughput while reducing sample consumption, their inherent deficiencies are noticeable. Because cells are microscale objects, the microarray structures for cell manipulation are always confined to a limited space, which inevitably complicates device fabrication. Even though current precision processing technologies are able to fabricate various chips, complex devices remain challenging to use in clinical applications. In addition, the fixed structure of microarray chips means that they can only handle a fixed number of single cells at once. In particular, the throughput of devices based on microtraps and microchambers is always traded off against the complexity of their structure. In terms of operational convenience and detection throughput, there is still a gap between most research-based chips and real clinical application scenarios.

Microdroplet-based chips
Microdroplet-based chips provide valuable tools for single-cell gene analysis due to their capability of isolating massive numbers of individual cells with minimal reactants. 138 Picoliter or even femtoliter droplets serving as individual reactors avoid wasting samples while offering single cells a controllable and portable reaction environment. 139 Rapid progress has been made in microdroplet-based technology to generate droplets on demand and detect single-cell genes on-chip. 140-141

Microdroplets for single-cell isolation and analysis
One application of microdroplet-based chips is to divide a large population of cells into single cells, each encapsulated in a small droplet. 134 Based on the specific design of the microfluidic channels, the aqueous phase is sheared into monodisperse microdroplets, in which individual cells are confined by an inert oil phase. The resulting single-cell microdroplets provide versatile platforms for single-cell gene analysis. 142 For example, Chen et al. integrated a phase-switch droplet microfluidic chip with the digital lysate approach to construct a molecular map of transcriptome perturbation during the cell cycle. 143 Subsequently, they utilized this platform to encapsulate single cells into picoliter-sized hydrogel droplets and realized single-cell RNA sequencing with high efficiency and accessibility (Figure 6A). 144 Another common application of microdroplet-based chips is to encapsulate cells separately and then mix them with microdroplets containing reaction reagents. In this way, single-cell gene analysis can be accomplished in the microdroplets with minimal sample consumption. For example, Rotem et al. combined microdroplets and DNA barcodes to sequence chromatin data at single-cell resolution (Figure 6B). 145 This method was able to determine the identity of each cell and recapitulate high-quality chromatin profiles for each cell state. The strategy of integrating microdroplets with barcode-based gene sequencing is also suited to the ultra-high-throughput identification of microbes from environmental samples (> 50,000 microbes/run), showing its powerful applicability in bioanalysis. 146 In addition to single-cell gene sequencing, the combination of microdroplets and isothermal amplification technology provides a simple method for the detection of specific gene sequences in single cells. [147][148] Guo and coworkers employed the DNA hybridization chain reaction to probe target miRNA in single cells. 149 The miRNAs released from single cells interacted with the DNA amplifiers encapsulated in the microdroplets. Owing to the amplification by the cyclic cascade reaction, single-cell miRNAs could be detected through the increased fluorescence signals, with a high throughput of 300-500 cells/min. Chung et al. developed a droplet-based microfluidic platform that permitted on-chip droplet sorting and merging for multi-step amplification reaction assays within 1 h (Figure 6C). 150 The lysis buffers and reactant mixtures were sequentially added to the microdroplet reactors to carry out single-cell reverse transcription loop-mediated isothermal amplification for the specific quantification of mRNA expression levels in single cells.
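How many droplets actually contain exactly one cell can be estimated with a simple occupancy model. The sketch below assumes that cells are loaded randomly, so that the number of cells per droplet follows a Poisson distribution with mean `lam`; this is a common modeling assumption rather than a result stated in the works cited above, and the loading values are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k cells in a droplet for mean occupancy lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def droplet_occupancy(lam: float) -> dict:
    """Fractions of empty, single-cell, and multi-cell droplets."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    p_empty = poisson_pmf(0, lam)
    p_single = poisson_pmf(1, lam)
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
        # Of the droplets that contain cells, how many hold exactly one.
        "single_among_occupied": p_single / (1.0 - p_empty),
    }


for lam in (0.05, 0.1, 0.5, 1.0):
    occ = droplet_occupancy(lam)
    print(
        f"lam={lam:<4} empty={occ['empty']:.3f} single={occ['single']:.3f} "
        f"multiple={occ['multiple']:.3f} "
        f"single_among_occupied={occ['single_among_occupied']:.3f}"
    )
```

Under this model, dilute loading keeps most occupied droplets at one cell but leaves many droplets empty, which is one reason why high droplet generation rates matter for overall throughput.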

Microbead-assisted microdroplet
Microbeads as carriers are encapsulated in microdroplets and subsequently mixed with single cells for gene extraction and effective enrichment owing to their strong adsorption capacity. 113,151 These gene-loaded microbeads are easily controlled through the multiple steps of single-cell gene analysis. For example, poly(dT)-coated magnetic beads and single cells are simultaneously encapsulated in emulsion microdroplets by flow focusing for mRNA capture. 152 The magnetic beads, as the carriers of mRNAs, participate in all subsequent analytical procedures, including mRNA purification, reverse transcription, PCR, and high-throughput sequencing. By controlling the magnetic separation rack, this method  Figure 6D). 154 Targeting drug-resistant breast cancer models, they found a typical chromatin loss signature within untreated drug-sensitive tumors and resistant cells. To promote the widespread application of microdroplet-based microfluidic techniques, a 3D-printed and barcoded microparticle-assisted droplet microfluidic control instrument was developed (Figure 6E). 155 Thirteen transcriptomically distinct clusters were revealed from the sequencing results of 20,387 isolated single cells. This automatic instrument paves the way for microdroplet-based microfluidic chips to perform single-cell transcriptome characterization at low cost and in a user-friendly manner. The single-cell gene analysis methods based on microdroplet techniques have an absolute advantage in throughput (up to thousands of single cells per second). However, because the genes are obtained from lysed cells, the simultaneous behavior analysis of living cells is impossible in microdroplets of limited volume. In particular, for cell transcriptomes that are susceptible to environmental disturbance, only real-time dynamic detection in living cells can yield abundant and faithful information. In addition, the comprehensive analysis of multiple markers, including DNAs and RNAs, is also a problem that microdroplet-based chips need to solve.
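The way barcoded beads let pooled sequencing reads be traced back to individual cells can be sketched in a few lines. This is an illustrative toy, not the pipeline of any instrument cited above: it assumes each read has already been parsed into a (cell barcode, molecular tag, gene) triple, and the use of a per-molecule tag to collapse amplification duplicates is an assumption introduced here. The barcodes and reads are invented.

```python
from collections import defaultdict


def count_matrix(reads):
    """Group reads by cell barcode and count distinct molecules per gene.

    `reads` is an iterable of (cell_barcode, molecule_tag, gene) tuples.
    Reads sharing barcode, tag, and gene are treated as PCR duplicates.
    """
    molecules = defaultdict(set)
    for cell_barcode, molecule_tag, gene in reads:
        molecules[(cell_barcode, gene)].add(molecule_tag)

    counts = defaultdict(dict)
    for (cell_barcode, gene), tags in molecules.items():
        counts[cell_barcode][gene] = len(tags)
    return dict(counts)


# Invented toy reads: two cells, one duplicated molecule in the first cell.
toy_reads = [
    ("AAAC", "tag1", "EGFR"),
    ("AAAC", "tag1", "EGFR"),  # duplicate of the read above
    ("AAAC", "tag2", "EGFR"),
    ("GGTT", "tag1", "ACTB"),
]
print(count_matrix(toy_reads))
```

Because the cell identity travels with the bead-derived barcode, all droplets can be pooled after capture and processed in a single sequencing library.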

Physical mechanics-assisted chips based on the microfluidics
Physical mechanics generated by dielectrophoresis (DEP), acoustic waves, or optical traps can control cells in a non-contact yet precise manner. [156][157] With the assistance of physical mechanics, microfluidic chips can combine the advantages of controllability and label-free operation in single-cell separation and manipulation. 117 The emerging physical mechanics-assisted chips are expected to become candidates for developing multifunctional on-chip single-cell gene analysis strategies. 158

Dielectrophoresis
DEP moves cells by a polarizing force in a non-uniform electric field generated from alternating current or direct current potentials. The moving direction of cells, including attraction to or repulsion from the electrodes, can be changed by controlling the electric field frequency. 159 Because DEP requires only simple components (i.e., microelectrodes), it has become a versatile tool for on-chip cell manipulation, which can further promote the automation of on-chip single-cell gene analysis. [160][161] DEP microelectrode arrays combined with microwell arrays are applied for the precise loading of large-scale single cells. 161 For example, Fan's group aligned an electrode array with a microwell array to provide a controllable platform for the effective loading of cells into microwells (Figure 7A). 162 After mixing with barcoded beads, mRNA from single lysed cells was captured by the DNA oligomers for the preparation of the sequencing library. Then, the unique transcriptome information was obtained by single-cell RNA sequencing. Under the control of DEP, Lee's group developed a 3D nanochannel-electroporation platform capable of loading about 60,000 cells/cm² on a microwell array-based chip (Figure 7B). 128 This platform realized the efficient delivery of plasmids into living cells for gene editing, and could potentially be applied to single-cell gene analysis by delivering molecular probes into cells. In addition, DEP techniques are also used to control the position of cells in microdroplets. 163 The size of the microdroplets was further decreased by shearing in a bifurcated channel to reduce the reaction volume and the matrix interference with cell detection. Currently, a commercial DEP-based chip named DEP Array has been used to control cells to form single-cell arrays. 164 Specific single cells are moved along set routes under an electric field. This system has been applied to separate CTCs from clinical samples, identified by immunofluorescent staining under a fluorescence microscope.
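The frequency dependence of attraction and repulsion in DEP can be illustrated with the real part of the Clausius-Mossotti factor. The sketch below assumes the simplest textbook model, a homogeneous sphere in a lossy medium; real cells are usually described by shell models, and the permittivity and conductivity values used here are hypothetical and chosen only to show a sign change.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity in F/m


def complex_permittivity(rel_permittivity, conductivity_s_per_m, freq_hz):
    """Complex permittivity eps - j*sigma/omega of a lossy dielectric."""
    omega = 2.0 * math.pi * freq_hz
    return complex(rel_permittivity * EPS0, -conductivity_s_per_m / omega)


def cm_factor_real(freq_hz, particle, medium):
    """Real part of the Clausius-Mossotti factor for a homogeneous sphere.

    `particle` and `medium` are (relative permittivity, conductivity in S/m).
    A positive value means attraction to high-field regions (positive DEP),
    a negative value means repulsion (negative DEP).
    """
    eps_p = complex_permittivity(*particle, freq_hz)
    eps_m = complex_permittivity(*medium, freq_hz)
    return ((eps_p - eps_m) / (eps_p + 2.0 * eps_m)).real


# Hypothetical parameters, for illustration only.
PARTICLE = (60.0, 0.5)   # relative permittivity, conductivity (S/m)
MEDIUM = (78.0, 0.01)    # low-conductivity aqueous buffer

for freq in (1e3, 1e5, 1e7, 1e9, 1e10):
    k = cm_factor_real(freq, PARTICLE, MEDIUM)
    direction = "attraction" if k > 0 else "repulsion"
    print(f"f = {freq:8.0e} Hz  Re[K] = {k:+.3f}  ({direction})")
```

With these illustrative values the factor is positive at low frequency, where conductivity dominates, and negative at high frequency, where permittivity dominates, so tuning the frequency alone switches the direction of the force.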

Acoustic wave
An acoustic wave is generated by applying alternating electric current signals at a resonance frequency to an interdigitated transducer deposited on a piezoelectric substrate. 165 Acoustic wave-assisted control systems, also called acoustic tweezers, take advantage of this principle to convert the electrical signal into an acoustic radiation force that traps single cells, and have shown promising potential for single-cell manipulation. [166][167] Guo et al. presented acoustic tweezers that manipulate cells in 3D in a microfluidic chamber using 2D surface acoustic waves (Figure 7C). 168 The propagation of surface acoustic waves through the microfluidic chamber produces a 3D acoustic field and induces acoustic streaming. Meanwhile, stable 3D trapping nodes are formed in the chamber to capture the cells. Accordingly, the cells can be precisely manipulated by adjusting the positions of these 3D trapping nodes. By integrating a microfluidic chip with an acoustic field, the rotational manipulation of single cells has also been realized, which is vital for revealing hidden genetic and cellular details. 169 Vlandas' group combined miniaturized acoustic tweezers with holography to synthesize acoustic wavefields for the selective manipulation and positioning of individual human cells. 170 These methods exhibit the advantages of acoustic wave-based manipulation, such as high precision and safety for single living cells. Additionally, acoustic wave-produced droplets are also used for the release of single cells. Zhao's group employed an acoustic droplet dispenser to controllably generate oxidized alginate microdroplets that contained enzymes to degrade the on-chip CTCs accurately. 171 The subsequent gelation of single cells in alginate microdroplets guaranteed the integrity of CTCs for gene analysis.
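The link between transducer geometry, resonance frequency, and trapping positions can be shown with two textbook relations. The sketch below assumes that the acoustic wavelength equals the period of the interdigitated transducer, that the resonance frequency is the wave speed divided by that wavelength, and that the pressure nodes of a standing wave are spaced half a wavelength apart. The wave speed (roughly 3990 m/s, a value often quoted for lithium niobate) and the 200 µm period are illustrative assumptions, not parameters from the cited devices.

```python
def saw_frequency_hz(wave_speed_m_s: float, idt_period_m: float) -> float:
    """Resonance frequency of a transducer whose period sets the wavelength."""
    if wave_speed_m_s <= 0 or idt_period_m <= 0:
        raise ValueError("wave speed and period must be positive")
    return wave_speed_m_s / idt_period_m


def node_spacing_m(idt_period_m: float) -> float:
    """Distance between neighboring pressure nodes of a standing wave."""
    return idt_period_m / 2.0


WAVE_SPEED = 3990.0   # m/s, assumed approximate value for the substrate
IDT_PERIOD = 200e-6   # m, hypothetical transducer period

print(f"resonance frequency: {saw_frequency_hz(WAVE_SPEED, IDT_PERIOD) / 1e6:.2f} MHz")
print(f"node spacing: {node_spacing_m(IDT_PERIOD) * 1e6:.0f} um")
```

Under these assumptions, a finer transducer period raises the operating frequency and packs the trapping nodes closer together, which is the basic design lever for matching node spacing to cell size.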

Optical trap
An optical trap consists of a laser beam that is highly focused by a high-numerical-aperture objective lens to a diffraction-limited spot. The intense light gradient near the focal region forms a stable 3D trapping zone for capturing and manipulating cells. 172 Optical trap-assisted devices combine a transparent microfluidic channel with laser beams for the high-throughput and precise capture of single cells. 173 As the formation of optical traps and the generation of Raman signals both rely on lasers, their combination not only realizes the identification of specific cells but also simplifies the device. 174 Stocker's group constructed a cell sorting platform by integrating an optical trap with a microfluidic chip (Figure 7D). 175 Based on the specific Raman spectra acquired as cells pass through the laser in the microchannel, this method could sort up to 500 cells/h from microbial communities with high accuracy (98.3% ± 1.7%) and automation. Because optical traps have a limited reach, besides the common strategy of using microfluidic chips to deliver cells into the optical trap, the combination of multiple physical forces can also realize single-cell manipulation. For instance, the electric field generated by DEP moves the cells into the controllable range of the optical trap, and the cells are then captured by the optical trap and driven precisely into specific on-chip microstructures. 176 The cooperation of these two technologies makes full use of their respective strengths to realize the study of cell interactions at the single-cell level, and has the potential to control single cells for high-throughput genetic analysis.
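Two of the quantities mentioned above lend themselves to a quick estimate. The sketch below uses the Airy-disk relation (diameter of about 1.22 times the wavelength divided by the numerical aperture) as an approximation of a diffraction-limited spot, and converts the reported sorting rate of 500 cells/h into an average time budget per cell. The 1064 nm wavelength and numerical aperture of 1.2 are hypothetical values chosen for illustration.

```python
def airy_spot_diameter_nm(wavelength_nm: float, numerical_aperture: float) -> float:
    """Approximate diffraction-limited spot diameter (Airy disk) in nm."""
    if wavelength_nm <= 0 or numerical_aperture <= 0:
        raise ValueError("wavelength and numerical aperture must be positive")
    return 1.22 * wavelength_nm / numerical_aperture


def seconds_per_cell(cells_per_hour: float) -> float:
    """Average time available per cell at a given sorting throughput."""
    if cells_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return 3600.0 / cells_per_hour


# Hypothetical laser and objective; throughput taken from the text.
print(f"spot diameter: {airy_spot_diameter_nm(1064.0, 1.2):.0f} nm")
print(f"time per cell at 500 cells/h: {seconds_per_cell(500.0):.1f} s")
```

The micrometer-scale spot explains the limited reach of a single trap, while the per-cell time budget reflects the time needed to acquire a Raman spectrum before each sorting decision.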
Although the physical mechanics-based methods bring noticeable benefits to single-cell research, such as non-contact and 3D cell manipulation, their requirements for complicated and expensive external control, complex programming, and tedious condition optimization may compromise their practicality.

CONCLUSION AND OUTLOOK
Single-cell gene analysis technology makes it feasible to decipher important information in life science and sheds light on a series of new fields by identifying the gene heterogeneity among cells from the same source. In this review, we summarized the recent single-cell sequencing techniques and on-chip single-cell systems, focusing on their applications in single-cell gene analysis. The principles and properties of novel single-cell sequencing techniques are introduced according to three types of omics, that is, genome, transcriptome, and epigenome. On-chip gene analysis systems, as current innovative methods, are classified by their implementation rationale, with an emphasis on their application of single-cell manipulation for high-throughput and convenient single-cell gene analysis. Despite marked development, single-cell gene analysis technologies still face a variety of challenges on their way to revolutionizing life science and biotechnologies. In terms of single-cell sequencing, gene amplification is a universal solution for dealing with the small amount of genetic material in a single cell. However, during the amplification process, errors caused by the low fidelity of the amplification enzyme become a source of uncertainty in genetic analysis, which reduces the accuracy and repeatability of library construction. Improving the fidelity of enzymes or introducing error correction systems can effectively improve the objectivity and accuracy of gene library construction. Furthermore, this issue could be avoided if libraries constructed without amplification were sequenced directly. However, this requires highly accurate sequencing methods to avoid reducing gene coverage. In addition, massive gene sequencing results are often accompanied by complex data analysis processes to obtain useful information on gene heterogeneity. Therefore, the development of simple and reliable sequence analysis methods is also a necessary task for future single-cell sequencing technologies.
In terms of on-chip gene analysis systems, most methods mainly focus on improving specific technical aspects of single-cell separation. However, the on-chip exploration of the in-depth mechanisms of gene heterogeneity is often neglected. For example, on-chip gene analysis systems can give full play to their advantages in tracking living cells, as well as in the systematic study of the relationship between genetic differences and cell behaviors. In addition, considering the practical problems of clinical application, the analysis of gene heterogeneity is inseparable from the identification of a large number of samples, which creates an urgent need for high-throughput and automatic platforms. Another direction for researchers is therefore to develop an integrated gene analysis platform that can realize single-cell separation, gene extraction, and sequence detection. In conclusion, current single-cell gene analysis technology needs further research and development to advance the study of gene heterogeneity and to serve as a powerful support for precision medicine.

ACKNOWLEDGMENTS
This work is supported by funding from the Beijing Natural Science Foundation (No. 7212204), Beijing Advanced Innovation Center for Biomedical Engineering and Beihang University, and National Natural Science Foundation of China (Nos. 32071407 and 62003023).

CONFLICT OF INTEREST
The authors declare no conflict of interest.