Integration of single‐cell and bulk transcriptomics reveals immune‐related signatures in keloid

Keloid is a pathological dermatological condition that manifests as an overgrowth scar secondary to skin trauma. This study endeavored to excavate immune‐related signatures of keloid based on single‐cell RNA (scRNA) sequencing data and bulk RNA sequencing data.


| INTRODUC TI ON
Keloid performs as a fibroproliferative skin disease characterized by excessive deposition of extracellular matrix (ECM). It mainly occurs in the chest, back, earlobes, and joints secondary to wound healing, often accompanied by local pruritus, pain, and limited range of motion, which seriously worsens the function and appearance of the affected part and the life quality of patients. 1 The main methods to treat keloid clinically include surgery, radiotherapy, compression therapy, topical medication, and glucocorticoid injection. However, the above treatments have a high recurrence rate and poor efficacy. 2 Therefore, it is essential to investigate the pathogenesis of keloid to promote the curative outcome. With the in-depth study of the pathogenesis of keloid, the transition of the immune microenvironment is considered to be one of the critical factors influencing the progression of keloid. Dysregulated chronic inflammatory state is common in keloid, where various immune cells, cytokines, and their related intracellular signal transduction pathways participate in the formation and development of keloid. 3,4 Investigating the changes in the molecular immune mechanism of keloid and discovering immune-related gene markers can deepen our understanding of the pathogenesis, promote the precise therapy of keloid and improve the prognosis.
Single-cell RNA (scRNA) sequencing analysis is an innovative technique whose ability and practicability for gene expression analysis have been proved in identifying cell types and subtypes in multiple diseases previously, allowing clustering of cells to study the differences in gene expression and cell development between groups. 5,6 Keloid comprises numerous cells, including fibroblasts, myofibroblasts, and vascular endothelial cells. Therefore, scRNA sequencing analysis allows us to study the genetic features of keloid accurately.
In this study, we investigated the expression characteristics of immune-related genes and possible regulatory mechanisms in keloid through bioinformatic analysis combining single-cell and bulk transcriptomics.

| Data collection
Two datasets were mined from the NCBI GEO database (https:// www.ncbi.nlm.nih.gov/geo/). The GSE163973 dataset contains scRNA sequencing data of three keloid samples and three normal scar samples. 7 The GSE113619 dataset includes bulk RNA sequencing data from 18 keloid and 14 healthy control skin samples. 8 After de-duplication, 1509 immune-related genes (IRGs) were derived from the ImmPort database.

| Preprocessing of scRNA sequencing data
The "Seurat" package was utilized for quality control and effective cell screening. 9 The "FindVariableFeatures" method was then applied to extract genes with high intercellular variability, and the top 2000 genes with significant fluctuations were extracted for subsequent analysis. After normalization of gene expression in effective cells with a linear regression model, principal component analysis (PCA) was carried out to reduce the dimensionality and choose the best parameters for clustering.

| Cell clustering and the screen of marker genes
We deployed the "tSNE" algorithm to reduce the dimensionality of the top 20 principal components (PCs) and implemented a clustering analysis. 10 Subsequently, the parameters were set as "min.pct = 0.2" and "only.pos = TRUE", and "FindAllMarkers" in the "Seurat" R package was used to screen the marker genes for each cluster. Afterward, we identified the cell types and matched the marker genes of the corresponding cluster to specific cell types based on the "SingleR" package and the CellMarker database. 11,12

| Pseudotime trajectory analysis
The R package "Monocle" was deployed for pseudotemporal ordering analysis following three steps: gene screening, dimensionality reduction, and cell sorting. 13

| Identification of differentially expressed genes (DEGs)
With the threshold of p-value <0.05 and |log 2 FC| > 0.2, we filtered the differentially expressed genes (DEGs) in each cell type between keloid group and normal scar group in GSE163973 through "Seurat" Conclusion: In conclusion, we analyzed the cell types and functional differences in the keloid through scRNA and bulk RNA sequencing data. We identified three immunerelated signatures (VCAM1, CALCRL, and HLA-DPB1) in keloid, providing a basis for further in-depth investigation of the molecular mechanisms of keloid and exploration of therapeutic targets.

K E Y W O R D S
biomarker, bulk RNA sequencing, immune-related genes, keloid, single-cell RNA sequencing R package. Meanwhile, DEGs between keloid and control groups in the GSE113619 dataset were screened using the "DESeq2" package with the threshold of p-value <0.05 and |log 2 FC| > 0.5. 14

| Functional annotation
Gene Ontology (GO) functional enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, and single-gene Gene Set Enrichment Analysis (GSEA) was implemented by the R package "clusterProfiler" with the threshold of p-value < 0.05. 15

| Screening of immune-related signatures
In order to screen the immune-related signatures in keloid, we intersected the results of DEGs of endothelial cells between keloid and normal group from scRNA-sequencing data, DEGs between keloid and healthy control skin samples from bulk RNA sequencing data, and immune-related genes obtained from immPort database by plotting the Venn diagram on the online website Jvenn. 16 The "pROC" R package generated the ROC curves for biomarkers.

| Potential therapeutic drugs prediction
The Drug Gene Interaction Database (DGIdb) (https://www.dgidb. org/) was used to predict potential drugs or molecular compounds that interacted with the biomarkers, and the drug-gene interaction network was visualized by the Cytoscape software. 17

| Validation of the expression of immunerelated signatures
To further confirm the results analyzed from the public database, four keloid samples and four skin samples were collected for validation experiments from our center from January 2019 to July 2021 (Table S1).
The total RNA of 10 samples was extracted by Trizol (Ambion, Austin, USA) following the manufacturer's instructions. Based on the producer's indication, the cDNA was synthesized using the First strand cDNA synthesis kit (Servicebio). Then, RT-qPCR was carried out utilizing the 2×Universal Blue SYBR Green qPCR Master Mix (Servicebio) under the direction of the manual. The primer sequences for PCR are listed in Table 1. The relative expression level was standardized to the loading control GAPDH and computed by the 2 −ΔΔCq method. 18

| Statistical analysis
All analysis was conducted using the R software 3.6.1, and the Wilcoxon test or Student's t-test was employed to compare the data from different groups. If not otherwise stated, the p-value < 0.05 was considered statistically significant.

| Quality control and clustering of scRNA-seq data
We analyzed scRNA sequencing data from the GSE163973 dataset, which included 21 488 cells in the keloid group and 19 167 cells in the control group. After filtering, 11 856 cells from the keloid group and 12 116 cells from the control group were retained for downstream analysis ( Figure S1A). Through variance analysis, we acquired the top2000 highly variable genes ( Figure S1B). Next, we proceeded with PCA to reduce the dimensions of the data. Consequently, PCA showed no significant divergence of cells in the keloid and control groups ( Figure S1C).
Afterward, the tSNE algorithm was utilized for clustering the 20 principal components ( Figure S1D,E), and all cells were discerned into 23 cell clusters ( Figure 1A,B). Furthermore, we uncovered the marker genes for each cell cluster, and the top5 marker genes were presented in a heatmap and bubble graph ( Figure 1C,D). Based on the markers obtained for each cluster, we annotated each cell cluster employing "singleR" and "CellMarker". Clusters 0,4,7,14,15,18 were annotated as endothelial cells. Clusters 1,2,11,20 were annotated as tissue stem cells. Clusters 3,9 were annotated as keratinocytes. Cluster 5 was annotated as fibroblasts. Clusters 6,12,13 were annotated as chondrocytes. Clusters 8,16,21 were annotated as endothelial cells. Cluster 10 was annotated as DC. Cluster 17 was annotated as neurons. Cluster 19 was annotated as T cells. Cluster 22 was annotated as smooth muscle cells ( Figure 1E).
The expression level of the marker genes in the 10 cell types mentioned above was illustrated in Figure S2. We then counted and compared the number and percentage of each cell type in the keloid and control group.
As revealed in Figure 1F,G, there was a significant difference in the percentage of endothelial cells between the keloid and control group.

| Pseudotime trajectory analysis
To further investigate the potential relationships between the different cell types, we conducted a pseudotime trajectory analysis. The

| Functional distinction in each cell type between the keloid and control groups
To explore the distinction in the function of each cell type between the keloid and control groups, we implemented variance expression analysis and functional enrichment analysis of each cell type. Depending on p-value < 0.05 and |log 2 FC| > 0.2, we excavated the DEGs in each cell type and listed them in Table S2. The corresponding functional enrichment results were detailed in Table S3. The partial enrichment results of GO (BP, Biological Process), GO (CC, Cell Component), GO (MF, Molecular Function), and KEGG for each cell type were shown in the bubble plots ( Figure 3). For endothelial cells, DEGs were mainly enriched in functions as "extracellular matrix organization," "extracellular structure organization," "external encapsulating structure organization," "cellular response to tumor necrosis factor," "positive regulation of cell adhesion," "wound healing," "regulation of cell growth", "regulation of apoptotic signaling pathway," and immune-relevant biological processes and pathways such as "immune response-regulating signaling pathway," "TNF signaling pathway," and "IL-17 signaling pathway". We also noted that DEGs in chondrocytes, epithelial cells, fibroblasts, keratinocytes, neurons, smooth muscle cells, and tissue stem cells were relevant to "regulation of cell growth".

| Identification of immune-related signatures in keloid
To further screen for DEGs in keloid samples, we downloaded transcriptome data from the GSE113619 dataset and conducted differential expression analysis. A total of 806 DEGs (keloid vs. control) were filtered based on a threshold of p-value < 0.05 and |log 2 FC| > 0.5, including 296 upregulated and 510 downregulated genes ( Figure 4A,B, Table S4). Functional enrichment analysis was executed further to probe the functions of the DEGs. As displayed in  Figure 4C,D. We observed that these genes were mainly related to "regulation of cell-matrix adhesion," "regulation of actin filament-based process," "regulation of apoptotic signaling pathway," "Focal adhesion," "VEGF signaling pathway," cell cycle and apoptosis-related pathways, such as "FoxO signaling pathway," "mTOR signaling pathway," and "Hippo signaling pathway," and immune-related process and pathways, such as "regulation of T cell receptor signaling pathway," "regulation of dendritic cell chemotaxis," "B cell receptor signaling pathway," "Toll-like receptor signaling pathway," and "T cell receptor signaling pathway".
As both DEGs of endothelial cells from scRNA sequencing data and DEGs from bulk sequencing data were enriched in multiple immune-related pathways and biological processes ( Figure 5A-D), we took the intersection of the above DEGs with IRGs and three genes (VCAM1, CALCRL, and HLA-DPB1) were acquired ( Figure 5E). The AUC values of the ROC curves of these three genes were all greater than 0.7, indicating that the expression of these three genes had a favorable ability to distinguish keloid samples from control samples ( Figure 5F). Hence, VCAM1, CALCRL, and HLA-DPB1 were defined as immune-related signatures in keloid. In addition, the AUC value of the combination of the three biomarkers (0.938) was higher than any of the sole genes, indicating higher efficiency of making a distinction ( Figure 5F). The conclusion was further confirmed by the PCA result, which showed that control and keloid samples could be separated by the expression of the three genes ( Figure 5G).

| Potential molecular mechanisms of immunerelated signatures in keloid
To probe the possible mechanism of three immune-related signatures in keloid, we implemented the single-gene GSEA for each gene. The detailed information of enriched GO items, KEGG, and Reactome pathways was listed in Table S6. Top10 GO items, KEGG, and Reactome pathways were shown in Figure 6. We uncovered that CALCRL might be associated with "regulatory T cell differentia-

| Gene-drug network of immune-related signatures in keloid
To initially explore potential drugs targeting the three immunerelated signatures, we predicted 16 drugs targeting three biomarkers through the DGIdb database (Table 2). Ultimately, we constructed a gene-drug network containing 19 nodes and 16 edges (Figure 7).
In this network, Clozapine and Aspirin targeted HLA-DPB1.

| Validation of immune-related signatures in clinical samples
As illustrated in Figure 8A

| DISCUSS ION
Wound healing and keloid formation are complicated procedures involving various cell types, inflammatory and immune responses, and growth factors. 19 However, there is no consensus regarding the most acceptable treatment with negligible side effects for improving keloid. Recent evidence suggested that VCAM-1 was closely associated with the progression of inflammation and angiogenesis. During inflammatory responses, VCAM-1 was triggered by TNFα and required for leukocyte adhesion and transendothelial migration. 26,27 A previous study discovered that in mouse models of pulmonary TA B L E 2 The predicted drugs targeting the three immune-related signatures.  in tissue. 32 In keloid, dexamethasone can effectively inhibit the expression of VEGF and the proliferation of fibroblasts. 33 Another study demonstrates that dexamethasone can inhibit the expression of VCAM-1 in rats. 34 Therefore, we suspect that dexamethasone may reduce the expression of VCAM-1 during wound healing by reducing inflammation and attenuating keloid.
Calcitonin receptor like receptor (CALCRL) has been reported to function as a receptor for CALCA, which could regulate in vitro angiogenesis of human endothelial cells and placental development, and the knockout of CALCRL led to hydrops fetalis and cardiovascular defects. [35][36][37] Another study uncovered that CALCRL might induce anti-inflammatory signaling, resulting in decreased endothelial inflammation in vitro and in vivo, playing a protective role in atherosclerosis. 38 In addition, CALCRL was supposed to play a vital role in multiple types of tumor growth and neovascularization by amplifying signals necessary for pathologic angiogenesis and lymphangiogenesis. [39][40][41] Therefore, we assume that the upregulation of CALCRL may also lead to excessive dermal and epidermal neogenesis by enhancing  were similar pathophysiological characteristics between keloid and migraine, including transforming growth factor-β1 gene and neurogenic inflammation, so the above drugs may also inhibit keloid, although there was insufficient evidence and further verification is needed. 45 HLA-DPB1 is a member of the major histocompatibility complex class II, located in the highly polymorphic human leukocyte antigen region II on human chromosome 6. HLA-DPB1 antigen is mainly distributed on the surface of B lymphocytes, T cells, and macrophages.
By combining with peptide molecules, HLA-DPB1 antigen can present antigen information to CD4 + T cells, promote T-cell proliferation and cytotoxic reaction, and thus play an essential role in the immune system. 46 At present, it has been found that multiple SNP loci of the HLA-DPB1 gene are related to immune diseases such as rheumatoid arthritis and vasculitis. 47,48 The previous study also demonstrated HLA-DPB1 as a risk factor for relapse in antineutrophil cytoplasmic antibodyassociated vasculitis, indicating its significant role in the inflammatory and immune response in endothelial cells. 49 The gene-drug network predicted Clozapine and Aspirin as therapeutic medications target against HLA-DPB1. Aspirin is a non-selective COX inhibitor, and has an extensive clinical application in pain, fever, and inflammation. An earlier study has shown that in vitro aspirin reduces scar-related gene expression and scar tissue formation in injury tendon healing. 50 However, no evidence indicates that it can impede keloid formation during skin wound healing. We predicted through the database that clozapine and aspirin might be targeted drugs for HLA-DPB1. Clozapine is a dopamine and 5-hydroxytryptamine receptor blocker, and is intended for the treatment of schizophrenia. A recent study has shown that overexpression and agonism of the 5-hydroxytryptamine receptor might promote skin regeneration after injury. 51 Therefore, clozapine may inhibit keloid formation by antagonizing the 5-hydroxytryptamine receptor to reduce excessive epithelial and dermal hyperplasia. Persistent inflammation of the wound microenvironment is a high-risk factor of keloid. 3 As a famous non-steroidal anti-inflammatory drug, aspirin has been proven effective in inhibiting keloid, but its interaction mechanism with HLA-DPB1 and its role in keloid formation stays unclear. 52 In summary, we screened different cell types in keloid, explored the functional differences of the above cells, uncovered the distinction in the proportion of vascular endothelial cells between keloid and normal skin, and identified three immune-related signatures through the combination of scRNA sequencing and bulk transcriptome analysis for the first time. Furthermore, we verified the molecular signatures of keloid by ROC curve and RT-qPCR, which might provide novel insight into the molecular mechanism of the occurrence and progression of keloid. However, the potential molecular mechanism requires further study and experimental verification in future research.

AUTH O R CO NTR I B UTI O N S
Hanwen Wang, and Ziheng Zhou participated in study design and data collection. Hanwen Wang, Ziheng Zhou, and Jinming Tang carried out data analysis. Hanwen Wang drafted the final manuscript.
Julin Xie, Shaohai Qi, and Jinming Tang revised the manuscript. All authors read and approved the final manuscript.

FU N D I N G I N FO R M ATI O N
This work was supported by the National Natural Sciences Foundation of China (82072178 and 81871566).

CO N FLI C T O F I NTE R E S T
The authors declare that they have no competing interests.

E TH I C S S TATEM ENT
The study protocol was approved by the Ethics Committee of The First Affiliated Hospital, Sun Yat-sen University.

DATA AVA I L A B I L I T Y S TAT E M E N T
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.