Understanding of mouse and human bladder at single‐cell resolution: integrated analysis of trajectory and cell‐cell interactive networks based on multiple scRNA‐seq datasets

Abstract Objectives To elaborately decipher the mouse and human bladders at single‐cell levels. Materials and Methods We collected more than 50,000 cells from multiple datasets and created, up to date, the largest integrated bladder datasets. Pseudotime trajectory of urothelium and interstitial cells, as well as dynamic cell‐cell interactions, was investigated. Biological activity scores and different roles of signaling pathways between certain cell clusters were also identified. Results The glucose score was significantly high in most urothelial cells, while the score of H3 acetylation was roughly equally distributed across all cell types. Several genes via a pseudotime pattern in mouse (Car3, Dkk2, Tnc, etc.) and human (FBLN1, S100A10, etc.) were discovered. S100A6, TMSB4X, and typical uroplakin genes seemed as shared pseudotime genes for urothelial cells in both human and mouse datasets. In combinational mouse (n = 16,688) and human (n = 22,080) bladders, we verified 1,330 and 1,449 interactive ligand‐receptor pairs, respectively. The distinct incoming and outgoing signaling was significantly associated with specific cell types. Collagen was the strongest signal from fibroblasts to urothelial basal cells in mouse, while laminin pathway for urothelial basal cells to smooth muscle cells (SMCs) in human. Fibronectin 1 pathway was intensely sent by myofibroblasts, received by urothelial cells, and almost exclusively mediated by SMCs in mouse bladder. Interestingly, the cell cluster of SMCs 2 was the dominant sender and mediator for Notch signaling in the human bladder, while SMCs 1 was not. The expression of integrin superfamily (the most common communicative pairs) was depicted, and their co‐expression patterns were located in certain cell types (eg, Itgb1 and Itgb4 in mouse and human basal cells). Conclusions This study provides a complete interpretation of the normal bladder at single‐cell levels, offering an in‐depth resource and foundation for future research.


| INTRODUC TI ON
Single-cell sequencing provides unprecedented resolution to help us understand and analyze the process of normal tissue growth and disease occurrence. 1 In particular, intercellular communication analysis at the single-cell level plays an essential role in helping us analyze physiological and pathological states. 2 The bladder is an important organ in the human urinary system, and the comprehensive analysis at the single-cell level is helpful for us to elaborate and explore its regular physiological basis and also in order to establish the foundation for further dissecting of certain changes in conditions of inflammation or tumor. Previous large-scale cell atlas analysis [3][4][5] has constructed the fundamental structs of multiple organs, but none of them comprehensively decomposed the details of the bladder. On the contrary, scattered data from prior reports 6,7 associated with bladder (single-cell level) with a limited number of cells per study may not be able to provide an all-round understanding. Yu et al. 8 have reported an admirable study sketching a singlecell transcriptomic map of bladder in both mouse and human, but this research is entirely lacking any cell-cell interaction analysis. Li et al. 9 have demonstrated a novel cell type labeled by Plxna4 using scRNA-seq in the mouse bladder, but this study is only focusing on the urothelial layer and without cell communicative network analysis or validation of any human samples. Studies have suggested that cross talk between cells profoundly impacts the development and the regeneration of the respiratory system. 10 Data from scRNA-seq also provide insights into the landscape of intercellular cross talk of liver cells in health and disease. 11 On the basis of multiple datasets, we comprehensively analyzed the bladder data of human and mice at the single-cell level and focused on the trajectory analysis of urothelial and interstitial cells in the bladder and their interactions between cells.

| Single dataset preprocessing and normal bladder at the single-cell level
First, to establish initial independent analysis datasets for mouse and human bladder, respectively, we collected single-cell RNA sequencing data of normal bladder tissues from GSM4201633 12 (mouse, 3′ 10X Genomics, Illumina HiSeq 2500) and GSM4850577 6 (human, 5′ 10× Genomics, HiSeq X Ten). Then, data were analyzed using Seurat (v3.0 and v4.0) 13 in R (version 4.0.0) following standard workflow from website vignette (https://satij alab.org/seura t/). Uniform manifold approximation and projection for dimension reduction (UMAP) was done as the preferable way to display the most clustering data unless it is not available. After clustering of cells, the identification of cell groups was primarily according to the previous well-known marker genes 7,14 using the "FindAllMarkers" function which identifies differentially expressed genes between one cluster and all other cells using Wilcoxon rank-sum tests.
The cell-cycle score was calculated by the "CellCycleScoring" function which would assign each cell a score, based on its expression of G2/M and S-phase markers, and other scores (eg, hypoxia, metabolism, and histone modifications) were added by the "AddModuleScore" function which could calculate supervised module scores for any given gene list. R Package clusterProfiler 15 (version 3.18.1) was used for gene set enrichment analysis (GSEA) using differentially expressed genes (both positive and negative) across cell clusters. Most GSEA gene lists were obtained from the msigdbr (version 7.1.1) package. Genes associated with N 6 -methyladenosine (M6A) modifications were manually gathered from the published articles. 16,17 Developmental trajectory and pseudotime analysis of pan-fibroblasts in mouse and human bladder was performed using monocle3 18 (version 0.2.1). Of note, these two datasets were considered independent analysis objects and were not into subsequent integration.
Our initial preliminary attempts found that the integration process would affect the subsequent trajectory analysis of urothelial and interstitial cells in the bladder. Whether reciprocal principal component analysis (RPCA) or canonical correlation analysis (CCA), both would impair the significance of typical pseudotime genes. So, we performed trajectory analysis on each sample (with the available amount of cells), respectively, using the monocle3 package.
Top genes (with the highest Moran I score) were displayed in scatter plots. In addition, the top 50 genes of each analysis via a pseudotime pattern were collected, and then mitochondrial or ribosomal genes were excluded; the selection range would expand to the top 60-100 if there were not sufficient left genes (n > 25) after removal. These genes were showed in Venn plots to explore the potential common key trajectory genes by intersection.

| Integrated mouse and human bladder datasets and cell-cell interactions
Then, these multiple datasets were integrated into the mouse and human combined datasets, respectively, based on the "FindIntegrationAnchors" function, which can find a set of anchors between a list of Seurat objects, and these anchors can later be used to integrate the objects with CCA method. Then, these cells were subsequently normalized, scaled, linear-dimensional/ nonlinear reduced, and clustered by Seurat mainly using default parameters. 13 Cell-type identities were assigned on the basis of the present cluster biomarkers (differentially expressed gene) and the previously reported marker genes of human and mouse bladder. 8 Cell-cell communication networks were identified and visualized by CellChat (Version 1.0.0) package. 19 Nebulosa (v1.3.0) R package was used to calculate and visualize the joint density of co-expression patterns.

| Overview of the single dataset of mouse and human bladder at the single-cell level
After quality control, 6,962 and 6,982 cells, respectively, of mouse and human bladder were analyzed. In mouse and human bladder, we recognized 21 and 18 cell clusters, respectively ( Figure 1A-B). Cell identity was almost consistent with previous and their source papers.
But, we reconsidered the definition of a particular cluster described as fibroblast/smooth muscle cell (SMC) by Shuai He et al. 6 Most fibroblast/SMC were identified as myofibroblasts in our study, which is more often used and widely accepted in other researches. Only a few cells with extremely high SMC markers (eg, ACTA2, ACTG2, and CNN1) and relatively low typical fibroblast markers (eg, S100A4, COL3A1, and COL1A1) were defined as the myofibroblasts/SMC cluster. Besides, the human bladder dataset (GSM4850577) was lacking in urothelial cells, and it might be caused by different sampling procedures compared with other studies. 6,8 Though the mouse bladder dataset (GSM4201633) contained a pool of eight mice bladders, it seemingly had a relatively low cell capture per bladder which might be due to its stricter dissection and recombination processing for samples. 12

| Marker gene of cell clusters in mouse and human bladder
For both human and mouse bladder, the following markers col-

| Assumed biological activity scores across cell types
Similar to the default calculation function of cell-cycle scores in the Seurat package, curated gene lists were employed to create activity scores of glucose, lactate metabolism, hypoxia, histone 3 (H3) acetylation, histone (H) methylation, and M6A. After scaled to 1-6 (6 for highest score) using different color bars by default method, scores were showed in the feature plot ( Figure 1C-D). In mouse bladder, glucose score was significantly high in most urothelial cells and part of smooth muscle cells, while lactate score had a similar but slighter trend. Hypoxia score was relatively low in urothelial cells (especially for urothelial cells 3), and the score of H3 acetylation was roughly equally distributed across cell types, while histone methylation and M6A scores were high in urothelial and endothelial cells. Though lacking urothelial cells, a similar pattern was seen in the human bladder; for example, H3 acetylation scores were evenly distributed among cells, and hypoxia score was higher in fibroblasts and immune cells. Glucose score was also elevated in immune cells and urothelial cells (a small population containing 17 cells) in the human bladder.
These scores were much more consistent inside the same tissue type (epithelial or interstitial tissues) than across different tissue types. Serial of H3 acetylations (H3K4ac, H3K9ac, H3K23ac, and H3K27ac) had already been recognized as major regulators of transcription activation. 20 In our study, the score of H3 acetylation varied for individual cells inside a specific cell cluster but seemingly had a similar pattern across all cell types, which may indicate that transcriptional levels pattern at different tissues might be resembled. However, discrepancies of metabolic and hypoxia levels among tissues were easily observed, representing the fact that each kind of tissue had its unique function and morph. These phenomena also raised a question that evidence found by bulk RNA-seq data at a prior era that could not locate physiological and pathological changes on a specific cell cluster may be worthwhile for re-study again at the single-cell level.

| Differentially expressed genes and pseudotime trajectory of pan-fibroblasts in mouse and human bladder
We subset original datasets (Seurat clusters) to create pan-fibroblasts datasets and then did differentially expressed genes (DEGs) analysis also by the "FindAllMarkers" function. Mouse pan-fibroblasts In human, DEGs were related to multiple biological functions of wounding, wounding healing, response to external stimulus, coagulation, and also a variety of receptor activity pathways.
Data of pan-fibroblasts were import into Monocle 3 package from Seurat objects, and pseudotime trajectory analysis was constructed ( Figure 2C-D) following standard workflow with default parameters. In mouse bladder, fibroblast 4 seemed to be the beginning cells, fibroblasts 1 and 3 played intermediate roles, and fibroblasts 2 would be the end of the trajectory. We also extracted several top genes via a pseudotemporal pattern, among them, Car3, Dkk2, Tnc, and Bmp5 were highly expressed in the early stage, and S100a6, Col1a2, and Fth1 were in the later stage. We also depicted the distribution of these genes via four clusters. Car3, Dkk2, Tnc, and Bmp5 were almost exclusively expressed in fibroblasts 4 cluster. In the human bladder, fibroblasts 4 was at the beginning point, fibroblasts 1 and 3 were at the middle ways, while fibroblasts 2 were located at the end of development. Top pseudotemporal genes including FBLN2, PLA2G2A, SH3BGRL3, and S100A10 that highly expressed in early and/or middle stages and DPT, COLEC11, and C7 in the later stage of trajectory. Fibroblasts 2 exclusively expressed COLEC11 but lacking FBLN2 and PLA2G2A expressions, while fibroblasts 3 did not express DPT. Notably, the core pseudotemporal genes associated with fibroblasts trajectory seemed to be different across species (mouse and human). Furthermore, a previous article showed that Tnc and Bmp5 exhibited a cross-organ similarity in Tnc + Cd34− fibroblasts in the colon and bladder of the mouse, while Dkk2 displayed a co-expression pattern with Tnc in the mouse bladder. 14 In addition, PLA2G2A, SH3BGRL3, and S100A10 have also been reported as top pseudotemporal genes in the human bladder from another study. 8 Together, these data supported the repeatability and the robustness of our trajectory analysis and reflected potential organotypic or species differences. Using Venn plots ( Figure 3A), we identified several interdatasets common genes via pseudotime of pan-fibroblasts or urothelial cells. Tnc, Clec3b, Car3, Cxcl14, Grem2, Dkk2, and Spon1 were found significant in all mouse pan-fibroblasts datasets. CCDC80 and FBLN1 were found in four (out of five) human pan-fibroblasts datasets. Similarly, Tmsb4x, Gstm1, S100a6, and Gsta4 were found significant in all mouse urothelial cells, and EIF1, TPT1, FTH1, UPK1A, TMSB4X, S100A6, etc. were simultaneously found in two (out of three) human urothelial cell datasets. Then, we tried to explore the heterogeneity and homogeneity of chosen genes among datasets and species. First, we noticed that some genes (such as Car3 and Dkk2) significantly showing a pseudotime pattern in mouse pan-fibroblasts even could not be found in the scRNA-seq expression matrix of their human counterparts.

| Identification of common trajectory genes of urothelium and pan-fibroblasts in bladder based on multiple non-integrated datasets
Furthermore, Cd34 and Tnc were reported as markers of differentially located fibroblasts in mouse and showed a strong pseudotime pattern in mouse pan-fibroblasts datasets but not in human, while FBLN1 was a top gene in human pan-fibroblasts datasets but not in mouse ( Figure S2). Second, only a few candidates (eg, S100A6 and TMSB4X for urothelial cells) were considered as shared pseudotime genes in both human and mouse ( Figure S3A-B). Unfortunately, the pseudotime trends of S100A6 and TMSB4X seemed to be slight across all datasets, even though they still hit the statistical threshold. Thirdly, even if a certain gene was found significant in all datasets,its pattern would easily differ from the different data sources. CXCL14 and DCN seemed to have an inconsistent pattern in pan-fibroblasts among datasets and between species (Figure 3C-D). However, we found out that UPK1A

| Integrated datasets of mouse and human bladder
Besides the above independent analysis on each dataset respectively, we tried to integrate these datasets (not including

| Cell-cell communication in integrated datasets
Then, we performed cell-cell interaction analysis by the CellChat R package to further explore the dynamic cross talk in the bladder.  The details of significant contributing signals of incoming and outgoing patterns in all cell clusters are displayed in Figure S5A-B.

| Interactions between urothelial basal cells and fibroblasts in mouse bladder
We further explored the potential communication in depth between  Figure 6A and Figure S6A-B). In the collagen signaling network, fibroblasts, especially, myofibroblasts were the senders, all urothelial cells were receivers, and this pathway was influenced by almost all cell clusters except for immune cells. In the FN1 pathway, myofibroblasts were the strongest sender, urothelial cells (especially basal cells) were the receivers, and it was only heavily mediated by smooth muscle cells. In the ncWNT signaling network, urothelial cells were the strong senders; myofibroblasts were the major receiver.

| Interactions between urothelial basal cells and smooth muscle cells in human bladder
In addition, we discovered the estimated interactions between the bottom of the urothelium (basal cells 1-2) and the certain type of smooth muscle cells (detrusor) in the human bladder. Using similar processes, we explored the interactions when smooth muscle cells were targets and basal cells were sources, and vice versa ( Figure 5). also played a significant role as a receiver in the CD46 pathway network, while smooth muscle cells 1 did not. However, smooth muscle cells 1 were the receiver for the visfatin signaling network, while smooth muscle cells 2 was not.

| Interactive networks of integrin superfamily in the bladder and its adhesion to cells and extracellular matrix
Integrin superfamily contains 18 αsubunits (ie, ITGA1, ITGAV, Then, we browsed the expression of these subunits in mouse (all 26 subunits) and human (25 subunits, ITGAD not available) bladder ( Figure S7A-B). In both mouse and human, ITGB2 was primarily expressed in immune cells, ITGB1 was the most broadly expressed subunit gene, and ITGA1 was only expressed in smooth muscle and endothelial (potentially include pericytes) cells. Co-expression of ITGB1 and ITGB4 was located in basal cells in both mouse and human, implying this layer was tightly connected to the basement membrane ( Figure 7D-E). Co-expression of Itga11 and Itgb1 has precisely occurred in myofibroblasts of mouse bladder ( Figure 7F).
Meanwhile, some differences between species have been observed; myofibroblasts in the human bladder not only highly expressed ITGA11 but also relatively highly expressed ITGA8 when myofibroblasts in mouse bladder only highly expressed Itga11. The overall expression of ITGA2 was significantly lower in the mouse bladder. Itgb7 was highly expressed in immune cells of mouse bladder, while ITGB7 was barely expressed across all human bladder cell clusters. Taken together, these findings were consistent with the previous reports, 21-23 except for a few potential organs or species specificities.

| DISCUSS ION
In recent years, whether it is the construction of large-scale cellular atlases or the analysis of single-cell data from a small number of samples, we are trying to understand cell fate, development, and communication at an unprecedented depth, so as to ultimately grasp and interpret the underlying mechanisms of phenomena that can be observed by us even at the naked eye level. In our study, we eventually collected and analyzed more than 23,000 mouse cells and 29,000 human cells from 11 bladder samples (six different scRNAseq datasets). Notably, sample isolation and processing, along with the different scRNA-seq platforms, would significantly affect the results. Rapid and efficient sample processing and non-over-digested single-cell suspensions might be a prerequisite to maintain favorable cell status (relatively low mitochondrial or ribosomal contamination) and facilitate subsequent data analysis. Followed by standard Seurat package workflow, we suggested a resolution between 0.3 and1 to find distinct cell clusters for bladder cells ranging from 3,000 to 20,000 ( Figure S8). Normally, a higher resolution would only create more sub-clusters of urothelial cells or fibroblasts, and whether these sub-groups of cells exist unique features was still unknown since there was no widely accepted optimal resolution for distinguishing different cell groups. Interestingly, it appeared that more immune cells can be detected in human samples, and we suspected that this could be related to the germ-free environment in which the mice were raised and the short laboratory animal lifespan.
The canonical marker genes of most cell clusters in the bladder were well established. However, we found that sometimes it was difficult to distinguish myofibroblasts from smooth muscle cells when a group of cells highly expressed ACTA2 and barely expressed fibroblast markers. Also, it would be challenging to decipher immune cells in the bladder when the cell amount was limited and marker genes were mixed. As a stratified epithelium, urothelium is typically com- Metabolic patterns were one of the customized features for a certain cell cluster. We found that urothelial cells seem to have a higher glucose metabolic activity score, implying that they were in a constant state of renewal and, therefore, have a high energy Fibroblasts are known to be mesenchymal origin cells and comprise the majority of interstitial cells in the bladder 26 with undisputed important biological functions, especially for tissue fibrosis, wound contraction, and the formation of extracellular matrix. 14 DEGs were intensely associated with morphogenesis pathways, which was expected, as fibroblasts somehow could shape the water content and tensile properties in tissues. 14 Tightly connected to the ECM, these enriched pathways were also correlated with multiple receptor signaling. As we can see, these DEGs-enriched pathways were nearly identical between mouse and human, representing a great similarity in fibroblast function between species.
In accordance with the previous findings, we validated that PLA2G2A, S100A10, and SH3BGRL3 were lowly expressed at the bladder. PLA2G2A is a prominent marker of fibroblasts in the bladder, and its expression would be substantially reduced in bladder tissue from patients with prune belly syndrome. 27 Also, evidence from human and rat lung tissue microarray data indicated that PLA2G2A was overexpressed in patients with idiopathic pulmonary fibrosis, and in our speculation, which was most likely due to an increase in the abundance of fibroblasts. 28 ECM genes (eg, FBLN1 and FBLN2) also exhibited a pseudotime pattern in the human pan-fibroblasts dataset, fibroblasts derived from patients with synpolydactyly (hand malformations) showed alterations in the level of FBLN1 splice variants, 29 and ablation of Fbln2 in mice cardiac fibroblasts protected against progressive ventricular dysfunction, reducing the mortality after myocardial infarction. 30 In mouse bladder, Car3, Dkk2, Tnc, and Bmp5 were among top trajectory genes with pseudotime expression features, overexpression of DKK2 would reduce the activation of human cardiac fibroblasts, 31 Tnc was involved in modulating ECM integrity and preventing skin aging, 32 and Bmp5 was an antifibrotic factor that related to fibroblast-myofibroblast transdifferentiation in rat kidney interstitial fibroblasts. 33 Cxcl14 and Grem2 consistently showed a pseudotime trait in all five mouse pan-fibroblasts; previous studies displayed that the Cxcl14 axis in fibroblasts can interact with multiple cancer cells and acts as a multi-modal stimulator with tumor-supporting properties, [34][35][36] and the activation of Grem2 in fibroblasts would promote pulmonary fibrosis. 37 For the development of urothelial cells, TMSB4X was a top gene in all datasets regardless of the mouse or human tissue sources, depletion of TMSB4X would cause abnormal stability of adherence junction in epidermal cells, 38 and a developmental trajectory using single-cell proteomics revealed TMSB4X significantly decreased during hair-cell differentiation. 39 In addition, increasing expression of UPK2, UPK1A, and UPK3A ( Figure 3B) has been seen through all datasets in both mouse and human bladder. Knockout of these genes in mice would cause several abnormalities, such as poorly differentiated umbrella cells and vesicoureteral reflux with hydronephrosis. 40 In mouse embryonic day 11-12, progenitor cells of urothelium were formed with the expression of SHH, FOXA2, TP63, and uroplakins (most be UPK3A) but without KRT5. 40  the remaining genes (Gstm1, Tmsb4x, S100a6, and Malat1) were still aligned with our study. This phenomenon might imply that the urothelial cells may have been more heavily damaged during the sample preparation because of their exposure as the outermost layer or their intolerance and sensitivity to cell digestive agents (multiple enzymes).
To give a better perspective of the entire bladder, we have integrated the above datasets (two initial independent datasets not included) using the Seurat R package (CCA method) with slightly more looser quality control parameters compared with earlier independent analysis. In general, the results displayed that the integration appropriately addressed and merged the original cell subpopulations, underlying the major cell types in the mouse and human bladder ( Figure 4A-B).
To our knowledge, this is the first study to integrate bladder scRNAseq data from different platforms, focusing on this specific organ, and thus produced the largest data of normal bladder at single-cell levels.
Then, cell-cell communication analysis was conducted using a recently published R package CellChat. 19 Notably, these dynamic interactive networks were broadly dispersed across cell types ( Figure 4C-D).  41 Also, the existence of laminin, collagen, and elastin in the bladder submucosa matrix was maintained as valuable bioactive factors even after the decellularization and extraction processes. 42 For these interstitial cells (eg, fibroblasts), their close proximity to the urothelium and smooth muscle cell (detrusor) seemed to suggest their modulating or bridging role in the bladder wall. 40 Communication between human bladder smooth muscle cells and suburothelial myofibroblasts was directly associated with overactive bladder syndrome and could be profoundly affected by different cytokines. 43 At last, we took integrin superfamily, TGF, and VEGF pathways as cases for illustrating the different roles and distribution patterns of signaling in mouse bladder (Figure 7). It is obvious and intuitive that the different patterns correspond to varying functions and localization of signaling pathways.
In summary, we collected multiple datasets to comprehensively dissect the bladder at a single-cell level. To date, this is the first and largest integration study of the normal bladder using singlecell transcriptome data. DEGs and pseudotime analysis of panfibroblasts revealed similarity in function and potential distinct development trajectory between mouse and human bladders.
Whether these heterogeneities are caused by any technical factors during scRNA-seq needs further investigation. TMSB4X and S100A6 show a pseudotemporal signature in the multiple mouse or human urothelial cell datasets, and the specific roles they play need to be further examined. Tons of interactive communications could be recognized in our large-scale integrated bladder datasets, and future studies could proceed to explore whether these paired signals are significantly altered under pathological conditions. Also, we provide information on which signaling pathways are enriched in particular cell clusters (eg, urothelial basal cells, fibroblasts, and smooth muscle cells) of the bladder and what roles (eg, sender, receiver, and mediator) different cells play in the pathways. The exact mechanisms of how these signaling pathways are synergistically regulated by a variety of distinct cells and function stably are worth further exploration.

ACK N OWLED G M ENTS
The present study was supported by the National Natural Science Foundation of China (Grant No. 81570684 and 81970657). The funders had no roles in the design of the study, data collection, interpretation, and analysis, decision to publish, or writing of the manuscript.

CO N FLI C T O F I NTE R E S T
The authors declare no competing interests.

E TH I C S A PPROVA L A N D CO N S E NT TO PA RTI CI PATE
Not applicable to this study.

CO N S ENT FO R PU B LI C ATI O N
All authors read and approved the final version of the manuscript for publication.

DATA AVA I L A B I L I T Y S TAT E M E N T
The datasets used in the present study are available on the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/). The analysis code can be found in the GitHub repository (https:// github.com/Bowen Shi-scRNA/ bladd er-scRNA -seq).