Single‐cell RNA landscape of cell fate decision of renal proximal tubular epithelial cells and immune‐microenvironment in kidney fibrosis

2. The plotcelltrajectory() function was used for visualization after ordering cells. The analysis of branch expression analysis modeling, BEAM in Monocles package, was performed for pseudotime-related genes. The results were visualized using the plot_genes_branched_heatmap function with genes grouped by the threshold of q < 1e-4 from the BEAM analysis.


Integration, dimensionality reduction, and cell cluster annotation
To integrate the gene expression data of all samples, the functions of FindIntegrationAnchors and IntegrateData, developed by the Satija Lab 4 , were applied. Highly variable genes (HVGs) were identified using the FindVariableFeatures() function with nfeatures = 3,000. The cell cycle effect was scored and eliminated by the ScaleData() function. Dimensionality reduction was conducted by the embedding methods of t-SNE and UMAP. The FindAllMarkers() function was used to identify differentially expressed genes (DEGs) for cell clusters compared to the others (adjusted p < 0.05) (Supplementary Table S2). Cell annotation was performed by referring to the DEGs, published studies, Humpheys lab website (http://humphreyslab.com/SingleCell/), and the PanglaoDB website (https://panglaodb.se/index.html). 5-9. Then, 28 clusters were identified by Seurat, and dimensionality reduction was visualized by t-SNE and UMAP. Particularly, they were: (1)

Differentially expressed genes (DEGs) and enrichment analysis of function
The DEGs in the cell populations between the IFTA (IgAN, CIN) and none to mild IFTA (donor, MN, LN) groups were obtained by the FindMarkers() function with p < 0.05 and logfc.threshold = 0.25 (Supplementary Table 3). The DEGs of PTs were divided into "up", "stable", and "down" according to avg_log2FC >0, =0, or <0. Functional enrichment analysis was performed based on the GO and KEGG database by the compareCluster() function of the ClusterProfiler R package (v3.18.1) 10-12 . GSVA was also performed with all genes of PTs 13 , based on the HALLMARK and KEGG databases. The AUCell package (v1.4.1) of R was used to evaluate the activity of biological pathways in the concerned cell types. For quantifying the metabolic activity in a specific cell cluster, ssGSEA 14 and scMetablism 15 were applied. The ssGSEA was performed based on all DEGs of PTs (logfc.threshold = 0.01). 3

Interaction analysis between cell types
To determine and visualize cell crosstalks between innate kidney cells and the immune microenvironment, the R package of CellChat 16 was used. Briefly, we loaded the Seurat object group by cell types into CellChat.
Secreted signaling in CellChatDB.human was set as the reference database. The standard preprocessing steps were then applied by the standard parameter set for the identifyOverExpressedGenes, identifyOverExpressedInteractions, projectData, computeCommunProb, computeCommunProbPathway, and aggregateNet functions. Non-Negative Matrix Factorization (NMF) was used to acknowledge the communication patterns of cells.
PHATE, RNA velocity, and pseudotime trajectory analysis PHATE (v1.0.7) was applied for nonlinear and unsupervised dimensionality reduction and visualization of cell development branches during IFTA development 17, 18 . Cells with less than 1,200 UMI/cell or over 25,000 UMI/cell, and the genes detected in fewer than 100 cells were filtered before normalization. The default parameters were used for the official pipeline. RNA velocyto was used for calculating the ratio of unspliced to spliced mRNA 19 . Transcriptional kinetics and developmental trajectory of PTs during the progression of IFTA were modeled by scVelo (v0.2.4) 20 and PAGA 21, 22 using default parameters and the filtered PT cluster from Seurat (as mentioned in Quality control and preprocessing). The R package of Monocle2 (v2.8.0) 23 was used for analyzing the pseudotime trajectories of PTs. Negbinomial.size was applied as the parameter expressionFamily in the newCellDataSet function to process the sparseMatrix.
Low-quality cells (mean expression of genes less than 0.1) were filtered before the trajectory analysis. The SetOderingFilter() function was then used for selecting cell clustering genes with parameters of mean_expression set as 0.1 and dispersion_empirical set as 1. Dimension reduction was then performed using the function of reduceDimension with parameters of "DDRTree" and "maxcomponents" set as 2. The plotcelltrajectory() function was used for visualization after ordering cells. The analysis of branch expression analysis modeling, BEAM in Monocles package, was performed for pseudotime-related genes. The results were visualized using the plot_genes_branched_heatmap function with genes grouped by the threshold of q < 1e-4 from the BEAM analysis.
Briefly, co-expression modules were built by GENIE3 between transcription factors (TFs), genes, and the corresponding weight in the interested cell group. The RcisTarget package was used to find overrepresented TF-binding motifs for the credibility of gene regulatory networks (each TF and its target genes). Cell-typespecific transcription factors were validated using the RSS algorithm 26 .

Animals
We used male Nr1h4-/-mice (8-10 weeks old) as described in our previous study 75 , and sex-matched C57BL/6 wild-type mice from Jackson Lab.

Cell culture
The human PT cell line, HK-2 (ATCC CRL-2190), was cultured using Gibco TM DMEM containing bovine serum. For in vitro studies, the cells were stimulated by H2O2 (10 mM) for 2 h.

Folic acid model
Wild-type or Nr1h4-/-mice were injected with 0.3 M NaHCO3 in the control group or 0.3 M NaHCO3 + 250 mg/kg folic acid (FA) in the experiment group. Mouse kidneys were harvested seven days after injection.

Administration of the Nr1h4 inhibitor
The mice used in the experiment were orally-gavaged with Gly-β-MCA (10 mg/kg) (MCE #HY-114392) dissolved in corn oil after FA modeling every other day for one week 28 , or intraperitoneally injected with Z-Guggulsterone (25 mg/kg) dissolved in corn oil (MCE #HY-110066) every day for one week 29 . The mice in 5 the control group were intraperitoneally provided with only corn oil after FA modelling every day for one week.

Western blotting
The tissue was homogenized in RIPA and centrifuged. The supernatant was collected and quantified using the BCA method. Then, the SDS loading buffer was added, sonication was performed, and the supernatant was boiled for 5 min, following which electrophoresis was performed using 10% SDS-PAGE gels. The samples were then transferred onto a PVDF membrane and blocked by BSA. The samples were incubated with the primary antibodies (anti-rabbit PAH, Abcam, #ab191415; anti-mouse NR1H4/FXR, CST, #72105S; anti-Rabbit HNF4A, Abcam, #ab200142; anti-mouse BAX, Proteintech, #60267-1-Ig; anti-rabbit CASPASE-3, CST, Cat#9662; anti-mouse BCL2, CST,#15071S) and the corresponding secondary antibodies. Membrane development was performed using the HRP Substrate Kit from Millipore and imaged using ImageReader LAS-4000 (Fujifilm).

QPCR
We isolated the total RNA from the kidney cortex with TRIzol and quantitated it by performing UV spectrophotometry. The RNA (1 ng) was then used to synthesize cDNA using a kit from Tiangen (#KR118).
The expression of the mRNA was assessed by performing qPCR using TB Green® Premix Ex Taq™ (TaKaRa, #RR420A).

Small interfering RNA (siRNA) transfection
The siRNAs targeting PAH were provided by Genomeditech and added to HK-2 cells at a final concentration of 100 nM 24 h before they were treated with H2O2 using Lipofectamine3000 from Thermo (#L3000150).

Flow cytometry
The HK2 cells with or without H2O2 modeling and stained with Annexin-PI were analyzed through flow cytometry (BD Pharmingen™, #556547). 6 SPSS v26.0 (IBM), Prism v9.0 (GraphPad), and R v4.0.5 were used for conducting the statistical analysis of the clinical cohort data. Differential analysis between the two clinical groups was conducted by performing the Student's t-test or Mann-Whitney U test. The differences were considered to be statistically significant at p < 0.05.

External data validation
The renal scRNA-seq data on IgA nephropathy with IFTA grade (GSE127136) 30 and snRNA-seq data on diabetic nephropathy with IFTA grade (GSE131882) 31 were downloaded from the GEO database.
Integration, dimensionality reduction, cell cluster annotation, and downstream analysis by Seurat were performed following the same method as mentioned above.