Single‐cell RNA sequencing of human prostate basal epithelial cells reveals zone‐specific cellular populations and gene expression signatures

Despite evidence of genetic signatures in normal tissue correlating with disease risk, prospectively identifying genetic drivers and cell types that underlie subsequent pathologies has historically been challenging. The human prostate is an ideal model to investigate this phenomenon because it is anatomically segregated into three glandular zones (central, peripheral, and transition) that develop differential pathologies: prostate cancer in the peripheral zone (PZ) and benign prostatic hyperplasia (BPH) in the transition zone (TZ), with the central zone (CZ) rarely developing disease. More specifically, prostatic basal cells have been implicated in differentiation and proliferation during prostate development and regeneration; however, the contribution of zonal variation and the critical role of basal cells in prostatic disease etiology are not well understood. Using single‐cell RNA sequencing of primary prostate epithelial cultures, we elucidated organ‐specific, zone‐specific, and cluster‐specific gene expression differences in basal cells isolated from human prostate and seminal vesicle (SV). Aggregated analysis identified ten distinct basal clusters by Uniform Manifold Approximation and Projection. Organ specificity compared gene expression in SV with the prostate. As expected, SV cells were distinct from prostate cells by clustering, gene expression, and pathway analysis. For prostate zone specificity, we identified two CZ‐specific clusters, while the TZ and PZ populations clustered together. Despite these similarities, differential gene expression was identified between PZ and TZ samples that correlated with gene expression profiles in prostate cancer and BPH, respectively. Zone‐specific profiles and cell type‐specific markers were validated using immunostaining and bioinformatic analyses of publicly available RNA‐seq datasets. Understanding the baseline differences at the organ, zonal, and cellular level provides important insight into the potential drivers of prostatic disease and guides the investigation of novel preventive or curative treatments. Importantly, this study identifies multiple prostate basal cell populations and cell type‐specific gene signatures within prostate basal epithelial cells that have potential critical roles in driving prostatic diseases. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.


Introduction
It has been well established that developmental origin for certain tissues contributes to disease risk.This phenomenon is particularly evident in the prostate, rendering it a good model to understand fundamental questions about cells and tissue of disease origin.The human prostate is an androgen-regulated organ of the male reproductive system that is anatomically segregated into three distinct glandular zonescentral (CZ), peripheral (PZ), and transition (TZ) [1][2][3].Histologically, all three zones have an epithelial compartment, containing luminal secretory cells, basal cells, and neuroendocrine cells, and a stromal compartment, containing fibroblasts and smooth muscle cells; however, there is zonal variation in the cellular composition of the stromal and epithelial compartments that is not well understood [2][3][4].Importantly, these developmental and histologic differences underlie a divergence in pathology between prostatic zones; while most prostate cancer tumors arise in the PZ, benign prostatic hyperplasia (BPH) occurs primarily in the TZ, and prostatic disease rarely originates from the CZ [1,4].Moreover, the CZ is reported to be developmentally linked with seminal vesicles (SVs), which are another androgen-driven organ that rarely develops disease [3].Thus, identifying baseline molecular differences between the prostate zones (CZ versus PZ versus TZ) at a cellular level has the high potential to provide insight into the zonal-specific origins of prostatic diseases.
Within each of these zones, the basal cell population has emerged as a critical player in differentiation and proliferation during prostate development and regeneration [5][6][7][8].Because basal cells are thought to remain bi-potent even in the adult prostate, it is likely that this cell population contributes to the aberrant cell proliferation in both prostate cancer and BPH [6,7].Therefore, investigating genetic signatures in the basal cell populations of the TZ and PZ has high potential to provide insight into prostate disease etiology, especially in comparison to basal cells in the CZ.Several studies have identified functional differences between basal cells across prostatic zones including differential response to sex steroid hormones, differential rates of proliferation and apoptosis, and differential gene expression [9][10][11].However, these studies used a targeted approach to determine expression changes, which limits our knowledge of genetic differences to previously identified markers.Additionally, many studies have noted the importance of the paracrine interaction between the stromal and epithelial compartments in the development of prostatic disease; basal cells, as part of the epithelial compartment, serve as the interface between stromal cells and the secretory luminal epithelial cells [11,12].Characterizing the basal cell populations and the associated stromal differences in each of these zones provides a more complete picture related to the development of disease.
Recently, single-cell RNA sequencing (scRNA-seq) has become a powerful technique to investigate cellular population and gene expression differences between cells and tissues in a sensitive and specific way.Several publications have sought to characterize baseline changes that may underlie disease in each prostatic zone [13][14][15].These studies provide valuable insight into the heterogeneity of cell types within the prostate zones; however, one common limitation is the broad scope of cell type characterization, resulting in the identification of a single homogeneous basal epithelial population across all prostatic zones [13][14][15].To gain better resolution for differentially expressed genes in the basal cell populations between zones, we isolated human primary prostate epithelial cells from CZ, PZ, and TZ biopsies using basal/ epithelial-cell selection media.Utilizing scRNA-seq, we investigated differential gene expression and gene ontology pathways between organs (prostate and seminal vesicle), prostate zones (CZ, PZ, and TZ), and by population clustering.Further, we validated our findings using publicly available RNA-seq datasets and immunostaining of human prostate tissue.Finally, we utilized RNA-seq to determine differential gene expression and pathway analysis of stromal cells in each prostate zone.Understanding the differences, specifically in the basal populations at the organ, zonal, and cellular levels, provides important insight into the potential drivers of prostatic disease and guides the investigation of novel preventive or curative treatments.On a broader scale, this study identifies a strategy to utilize gene signatures in normal tissues to understand subsequent pathologies.

Tissue collection and processing
Fresh human prostate and seminal vesicles were obtained from six patients under an IRB-approved tissue collection protocol with informed consent (supplementary material, Table S1).Human prostate epithelial cells (PrECs) and seminal vesicle epithelial cells (SVECs) were processed and maintained in basal/epithelial selection media (Keratinocyte Serum-free Media, KSFM, Thermo Fisher Scientific, Waltham, MA, USA) as described previously [16,17] and in detail in Supplementary materials and methods.Human stromal cells (PrSCs and SVSCs) were cultured in RPMI (Thermo Fisher Scientific) supplemented with 10% fetal bovine serum.Eight biopsies were collected from each patient (two from each zone -CZ, PZ, TZ, and two from SVs); one patient was used for scRNA-seq and the remaining five were used for validation.Zone specificity and exclusion of cancer was confirmed in collaboration with board-certified pathologists in the UIC Health pathology department and tissue pathology core facility.Input cells and viability were recorded before sequencing (Table 1).1).

Genome alignment and Seurat analysis
Genome alignment was completed in collaboration with the UIUC Carver Biotechnology Center.A custom reference genome was made using Gencode's GRCh38 genome and v35 gene models (equivalent to Ensembl release 101) using 10X Genomics' Cell Ranger software v 4.0.0.The called cells from Cell Ranger were read into R (R 4.1.2npackd.org,https://satijalab.org/seurat/) and analyzed using Seurat (v.4.0.5, https://satijalab.org/seurat/), as described previously [18] and in detail in Supplementary materials and methods and supplementary material, Figures S1 and S2.In brief, the standard Seurat pipeline [sctransform normalization, principal component analysis, cluster calling, and UMAP (Uniform Manifold Approximation and Projection)] reduction was used for quality control analysis [19].Differentially expressed genes (DEGs) were determined in SVECs versus PrECs, and by pairwise comparisons of the prostate zones (CZ versus PZ versus TZ) (supplementary material, File S1).Datasets have been made publicly available within the Gene Expression Omnibus (GEO) under accession number GSE241556.

RNA sequencing of stromal cells
Total RNA was isolated and purified from PrSC/SVSC cultures using the RNeasy Mini Kit (Qiagen, Valencia, CA, USA).Libraries were prepared using the KAPA Stranded mRNA-Seq Kit for Illumina ® platforms (Roche Sequencing and Life Science, Inc., Santa Clara, CA, USA), and Illumina ® TruSeq Adaptor primers, according to the manufactor's protocols (supplementary material, Table S2).Libraries were sequenced on an Illumina ® platform in collaboration with UIUC as single end reads.Raw sequence data were trimmed using the Trimmomatic-0.36 package [20] with default parameters, and quality control was confirmed using the Galaxy online tool (https://usegalaxy.org/).Each sequence was aligned to human genome (hg38) using kallisto v0.46.1 [21] and analyzed using EdgeR -Bioconductor package [22].Gene set enrichment analysis (GSEA) was subsequently conducted using pairwise comparison, normalized enrichment score (NES), and false discovery rate (FDR < 0.05) (supplementary material, File S2).
IRB approval was obtained when necessary.Quality control, genome alignment, and feature annotation are described in detail in Supplementary materials and methods.In brief, quality control of raw FASTQ and trimming utilized FastQC and cutadapt [28,29].Genome alignment was performed in STAR using the hg38 human reference genome [30].PCR duplicates were removed using Picard [31] and reads were cross-checked against a reference database using BWE MEM for human ribosomal sequences from NCBI and Ensembl [32].Genomic features were quantified using FeatureCounts, and Ensembl was used for annotations for human genome v.hg38 [33,34].These combined RNA-seq datasets were interrogated for our zonal markers.Pairwise statistics between genes of interest using CPM-TMM (TPM) values was conducted using a Mann-Whitney U-test.

Gene ontology (GO) and expression analysis
Pathway analysis was performed in ShinyGO version 0.77 [35] using the following parameters: pathway database = GO biological process; number of pathways to show = 20; FDR cutoff < 0.05; and pathway size 2 > X > 2000.For organ-specific analyses, gene ontology was performed on the top 100 genes from SV up-and down-regulated gene lists.For zone-and cluster-specific analyses, the complete lists of significant genes identified by pairwise comparison were queried for gene ontology in ShinyGO.Heat maps were generated in GraphPad/Prism v. 9.4.1 (GraphPad, Dotmatics, Boston, MA, USA).

Quantitative PCR for validation
PrEC sets (PZ, TZ, and CZ) and matching SVECs from three additional patients were used for validation of the organ-and zone-specific gene expression identified by scRNA-seq.To quantify expression, quantitative PCR (qPCR) was performed as previously described [36].In brief, RNA was isolated using the Qiagen RNeasy Mini Kit with DNAse digestion (Qiagen, Valencia, CA, USA) and converted to cDNA (qScript cDNA Synthesis Kit; QuantaBio, Beverly, MA, USA).Expression was quantified using Power SYBR Green Master Mix (Invitrogen, Waltham, MA, USA) with custom primers (supplementary material, Table S3), relative to reference gene RPL13A (ΔCT).Relative expression (ΔΔCT) was calculated as fold-change compared with CZ.Experiments were performed in triplicate and statistical significance was determined by ANOVA using GraphPad/Prism v. 9.4.1.

Immunohistochemistry (IHC)
Serial sections from a single patient whole-mount human prostate were provided by Dr Douglas Strand from University of Texas-Southwestern.Human kidney, bladder, and muscle tissues provided in collaboration with UI Health Biorepository were stained as antibody controls (supplementary material, Figure S3).IHC was performed following the Abcam IHC protocol -IHC-P protocol v3.doc (http://abcam.com).In brief, tissues were dewaxed in xylene and rehydrated in an ethanol series; antigen retrieval was performed in citrate buffer (pH 6).Tissues were blocked using hydrogen peroxide and normal horse serum, and DAB was used as a chromogen with hematoxylin as a nuclear counterstain.

Statistical analyses
Statistical analyses for qPCR validation and DbGaP were performed in GraphPad/Prism v. 9.4.1 using a one-way ANOVA and Mann-Whitney U-tests, respectively.All statistical analysis for scRNA-seq was performed in Seurat in collaboration with the HPCBio Core at UIUC.In brief, a differential expression test was performed between each cluster and the rest of the sample for each feature.The log 2 fold-change was reported with the p value determined by a negative binomial test and adjusted for multiple testing via the Benjamini-Hochberg procedure.RNA-sequencing statistics were performed with EdgeR using an exact test, as described previously [22].

Results
To comprehensively evaluate differentially expressed genes (DEGs) between each prostate zone, we utilized the 10X Genomics platform to perform single-cell RNA sequencing (scRNA-seq) on human basal/epithelial PrEC cultures isolated from CZ, PZ, TZ, and seminal vesicle (SV) biopsies from a single patient.We identified 11 distinct cell populations by unsupervised clustering and UMAP analysis (Table 2 and Figure 1A).By grouping the UMAP projections by zone, we found distinct clustering of SVECs compared with all prostate zones (Figure 1B).Overall, there was little cell population variability between PZ and TZ samples; however, clusters 7 and 9 appeared to be CZ-specific (Figure 1B,C).Utilizing literature-based cell population markers, we identified each PrEC cluster as basal, basal/dividing, basal/transit-amplifying, basal/intermediate, or basal/stem; two CZ-specific clusters; and three SVEC clusters (Figure 1D) [13,[37][38][39][40]. Interestingly, at least one basal marker was present in every cluster, in some cases coexpressing with markers for transit-amplifying, stem, luminal, or proliferating cells (Figure 1D and supplementary material, Figure S2).Based on these data, there are clear population differences (1) at the organ level, where SVECs cluster separately from all PrEC samples; (2) at the zonal level, where CZ samples are the most distinct of the three prostate zones; and (3) among discrete basal populations within each zone that have not been previously described.

Gene expression was distinct in human seminal vesicle versus prostate basal cells
To elucidate organ-specific population differences, we determined differential gene expression between SVECs and PrECs.Gene ontology analyses of the top 100 differentially expressed genes identified SVEC up-regulated pathways including tubule differentiation/development, response to gonadotropin stimulus, androgen metabolic process, and bud morphogenesis (Figure 2A).Conversely, SVEC down-regulated pathways included prostate gland development/morphogenesis, cell fate specification and commitment, urogenital sinus development, blood vessel morphogenesis, and response to growth factors (Figure 2A).The top 20 up-and downregulated genes between SVECs and PrECs from each of the zones (CZ, PZ, and TZ) are shown in a heat map (Figure 2B).Several up-regulated (EMX2, PAX8, SIM1) and down-regulated (CDH11, PITX1, SALL3) genes were chosen for further validation based on a uniform and distinct expression pattern in SVECs and PrECs (Figure 2C).Quantitative PCR validation of samples from three additional patients showed significantly increased expression of EMX2, PAX8, and SIM1 in SV compared with CZ, PZ, and TZ, and significantly decreased expression of CDH11, PITX1, and SALL3 in SV compared with the prostate zones (Figure 2D).These data support our findings that SVECs cluster distinctly from PrECs.
Understanding the fundamental differences in basal cells within these organs may provide a foundation for determining susceptibility to disease.Identification of distinct prostate basal cell lineages 215

Single-cell sequencing analysis revealed gene expression differences in basal cells across prostate zones
To further assess PrEC zone-specific differences, we used gene ontology pathway analysis to determine biological pathways that are up-regulated in each zone (Figure 3A).Interestingly, angiogenesis and tube development pathways were up-regulated in PZ and TZ but not in CZ (Figure 3A).Specific up-and down-regulated genes in each zone from scRNA-seq were identified (Figure 3B,C and supplementary material, File S1) and validated using additional patient samples (n = 3).APLN and NNMT were significantly increased in PZ versus CZ and TZ; PADI3 was significantly increased in CZ versus PZ and TZ; and SFRP1, CCN6, and KRT23 were significantly increased in TZ versus CZ and PZ (Figure 3D).In publicly available RNA-sequencing  Identification of distinct prostate basal cell lineages 217 datasets comparing 38 benign tissue biopsies adjacent to prostate cancer (PZ) with 20 benign BPH-adjacent biopsies from the TZ, we confirmed that APLN and NNMT were significantly increased in PZ compared with TZ, and that SFRP1 was significantly increased in TZ compared with PZ (Figure 3E) [19,[23][24][25][26][27].CCN6 was also  increased in TZ versus PZ but did not reach statistical significance (Figure 3E).To test the potential clinical relevance of these findings, we used publicly available datasets to investigate PZ-specific genes in prostate cancer; APLN and NNMT were significantly increased in metastases (n = 425) and castration-resistant prostate cancer (CRPC, n = 115) compared with benign adjacent (n = 38) (Figure 3F).Similarly, TZ-specific genes were differentially regulated in BPH; SFRP1 was significantly decreased in BPH (n = 54) and BPH stromal nodules (n = 9) compared with normal prostate (n = 20), and CCN6 was significantly increased in BPH stroma from laser-capture microdissection (LCM, n = 7) compared with normal prostate (Figure 3F).These data support the concept that DEGs in basal cells among prostate zones likely dictate the functional and pathological divergences observed in clinical data.
To determine the in situ expression and localization of these zone-specific markers, we performed immunohistochemistry on whole-mount human prostate tissue for NNMT (PZ-specific), PADI3 (CZ-specific), and CCN6 (TZ-specific) (Figure 4).NNMT was highly expressed in basal cells of PZ versus TZ, CZ, and urothelial cells in the urethra (Ur) (Figure 4A).Similarly, PADI3 was highly expressed in basal cells of CZ, where it appeared to localize to the cell membrane (Figure 4B).CCN6 was highly expressed in some basal cells of the TZ, and was not expressed in PZ, CZ, or Ur (Figure 4C).Importantly, all three proteins appear to be basal cellspecific in their respective zones, validating our singlecell sequencing analysis and providing evidence of basal population differences between prostate zones.

RNA-sequencing analysis revealed gene expression differences in stromal cells across prostate zones
Anatomically, prostatic basal epithelial cell populations are adjacent to the stroma [41,42].There is extensive evidence in the prostate literature that supports a paracrine signaling role for stromal cells, where secreted factors including growth factors, cytokines, extracellular matrix (ECM) proteins, and microRNAs are involved in prostate development and homeostasis [41][42][43].Importantly, disruptions in this paracrine signaling axis have been implicated in the development and progression of prostatic disease [12,44]; however, zone specificity of this stromal paracrine signaling has not been addressed.To determine zone-specific differences for stromal cells in the prostate, we performed bulk RNA-sequencing on stromal cells isolated from each prostate zone (CZ, PZ, TZ), followed by gene ontology (Figure 5A and supplementary material, File S2).The top ten upand down-regulated genes between zones were compiled into heatmaps (Figure 5B,C).In all three zones, there was differential regulation of ECM genes; VCAN and TNXB were up-regulated in CZ, COL3A4 was down-regulated in PZ, and MARCOL and SPINK13 were down-regulated in TZ, suggesting that secreted factors from the stroma may play a role in the differential gene expression in prostate epithelial cells (Figure 5B,C).Interestingly, in CZ stromal cells, zinc-activated ion channel ZACN and zinc metalloprotease AMZ1 were up-regulated (Figure 5B).In the PZ stroma, GABBR2 and RYR2 were up-regulated, suggesting increased ion-mediated signaling (Figure 5B).Finally, in TZ stroma, CXCL5 was up-regulated, while NFKBIL1 was downregulated, suggesting a role in chemokine/immune response (Figure 5B,C).Taken together, these data suggest that zonespecific paracrine signaling in stromal cells may contribute to zone-specific gene expression in basal cells.

Prostate zone-specific basal populations were identified by scRNA-seq
These zone-specific gene expression changes in both basal and stromal cells suggest that there are differences at the population level across zones that likely indicate functional divergences.To further characterize basal populations within these prostate zones, we created an array of all genes that are expressed in at least three clusters; this analysis showed that despite clear differences between basal population clusters, several basal cell markers are present in each population (Figure 5A).Interestingly, the two CZ-specific clusters expressed very few markers that were highly conserved in the other basal populations but did express several shared genes with the Basal/TA population (ISG15, S100A2, PRAC1,  220 JE Vellky et al S100A6, MAGED2) (Figure 6A).Conserved genes between basal populations were quantified using a Venn diagramwhile no genes were shared in all five populations, many genes overlapped in pairwise comparison (Figure 6B).The two CZ-specific populations had 23 genes in common, with 42 and 29 genes specific to CZ-1 and CZ-2, respectively (Figure 6B).Gene ontology analysis of up-and down-regulated genes in each basal population showed several interesting pathways including up-regulation of vasculature development in Basal/TA cells, up-regulation of response to interferon and cytokines in the Basal/Int.2 population, and down-regulation of apoptotic process in Basal/Int. 1 cells (Figure 6C,D).To determine CZ-specific basal cell pathways, we carried out gene ontology analysis on the 23 shared up-regulated genes for the CZ-specific clusters.This analysis identified several pathways of interest including sequestering of metal ion, cristae formation, response to interleukin-1, and programmed cell death (Figure 6E).Finally, an array of differentially expressed genes between the two CZ-specific clusters showed that CZ-1 contained several stem-related genes (LCN2, KRT6A, KRT13, CD24) [38], while CZ-2 contained genes related to vesicular export.Taken together, these data suggest zone specificity of distinct basal cell populations in normal prostate tissue, particularly in the CZ.A better understanding of these distinct cell populations could provide insight into the differential pathologies between prostate zones.

Discussion
Here, we have elucidated and characterized organspecific, zone-specific, and basal cluster-specific gene expression differences in human prostate and seminal vesicle.These studies validated previous findings and implicated several additional genes as novel biomarkers for prostate disease.They identified and delineated gene expression differences between prostate zones and associated them with disease initiation, thus potentially dictating functional and pathological divergences observed clinically.Distinct zonal-specific paracrine Identification of distinct prostate basal cell lineages 221 pathways in prostate stromal cells were likewise identified.Taken together, these data collectively implicate zonal-specific and basal cell-specific genes as potential key drivers of BPH and prostate cancer which could elucidate biomarkers for prevention, staging, and pharmacologic intervention.
Our data challenge the current model of a homogeneous basal population and suggest the presence of multiple basal sub-populations with distinct phenotypes and disease etiologies.We identified basal markers that were shared between all zones, but also significant gene expression differences among these basal populations.Differential pathway analyses implicated functional divergences among basal cell clusters, including 'epithelial cell differentiation' and 'vasculature development' in the Basal/TA cluster, immune response-related pathways in the Basal/Int.2 cluster, and 'angiogenesis' and 'tube development' in the Basal/Int.1 cluster.A better understanding of basal population heterogeneity could provide insight into aberrant cell growth related to zonespecific pathologies.
The conclusions drawn here also present interesting avenues for future research.First, aggregated clustering analysis identified a clear divergence in cell populations in the central zone compared with the peripheral and transition zones.Two populations were specific to CZ samples, supporting the dogma that the CZ is the most distinct of the prostate zones.Interestingly, gene ontology analysis of biological pathways that are upregulated in CZ versus PZ and TZ identified 'endoderm development' as a significantly up-regulated pathway in the central zone, supporting the idea that CZ, like PZ and TZ, arises from the endodermal urogenital sinus, not the mesodermal Wolffian duct [1,3].Additionally, transcription factor expression in different germ layers is a critical component in determining organ specification; specifically, FOXA1 and NF1B have been identified as regulators of prostate-specific gene expression [45].In our data, we see enriched FOXA1 expression in the prostate versus seminal vesicle, supporting a common endodermal origin among prostatic zones (data not shown).Second, the multiple genes identified have potential functions as drivers of disease progression and thus as potential therapeutic targets (supplementary material, Table S4).For example, APLN is a bioactive peptide that has been implicated in GPCR signaling to regulate fluid homeostasis, cardiovascular function, insulin secretion, and vasculature development [46].It has also been implicated in several cancer types, including prostate cancer, where genetic loss-of-function was shown to decrease the invasion and migration of prostate cancer cells [47,48].Moreover, NNMT is a methyltransferase that plays a role in xenobiotic detoxification, and has been implicated as an independent prognostic marker of progression-free survival and overall survival in patients with advanced prostate cancer [49].Conversely, SFRP1 was up-regulated in TZ versus PZ patient samples in our data and may play a role in BPH pathology.SFRP1 is a repressor of the WNT signaling pathway; the role of WNT signaling in BPH has previously been characterized and provides rationale for the decrease of SFRP1 that we saw in BPH samples versus normal TZ.These findings indicate that genes identified in our study were similarly implicated in previous studies to contribute to adverse pathologies, supporting the clinical relevance of our findings.
While these data provide insight into gene expression differences that may suggest pathologic divergence in prostate zones, there are limitations.First, the PrECs utilized in this study were dissociated and cultured in vitro prior to sequencing to isolate basal cell populations that are often overlooked in scRNA-seq of whole prostate samples.It is likely that growing cells in culture selects for certain populations and excludes others, and therefore conclusions from in vitro studies must be validated in patient samples in situ to confer clinical relevance.In terms of population exclusion, we should recognize that some basal populations may have been lost in the process of culturing cells; in particular, we did not see the presence of club or hillock populations that have recently been characterized [13].If resolution for basal cell subtypes improves, future studies should investigate zonal basal population heterogeneity in fresh tissue samples.Second, the biopsies used in this study were from men diagnosed with prostate acinar adenocarcinoma.Despite histologic confirmation of 'normal' tissue, it is possible that adjacent normal tissue from men with prostate cancer is not genetically representative of tissues from healthy donors.In this case, we do think our samples are representative of normal tissue based on the IHC validation of a healthy donor (Figure 4), but further sequencing of healthy prostate should be addressed in future studies.Third, while it is likely that both luminal and basal progenitors contribute to disease progression and recurrence [13,15,[50][51][52], this study focused on the basal cell compartment to gain better resolution of basal population heterogeneity.Integrating the roles of both basal and luminal progenitor populations is necessary to completely understand developmental and regenerative growth of the human prostate, and follow-up studies should address this interface.
In conclusion, this study has utilized single-cell RNA sequencing to identify organ-specific, zone-specific, and basal cluster-specific gene expression differences in human prostate tissue.The presented data add to our knowledge of differential gene expression within human prostate zones and assess pathways and population differences among basal cells.A better understanding of these genetic divergences may provide insight into the genetic pathways involved in the development of prostatic disease, allowing for improved therapeutic interventions.
design, data collection and data analysis.JD and MM-C contributed to data analysis, data interpretation and generation of figures.KV-N and AK-B contributed to data collection.DVG contributed to study design and data interpretation.All authors were involved in writing the manuscript and had final approval of the submitted and published versions.

Figure 2 .
Figure 2. Organ-specific gene ontology, gene expression, and validation.(A) Gene ontology analysis for up-regulated and down-regulated pathways in seminal vesicle epithelial cells (SVECs) versus prostate epithelial cells (PrECs), where bars indicate fold-enrichment and color indicates the FDR-adjusted p value.(B) Heat map of the top 20 up-and down-regulated genes between SVECs and PrECs from each zone (CZ, PZ, and TZ), where red is high expression and blue is low expression.(C) Dot plots representing cell-by-cell positivity for several genes of interest.Color depicts the intensity of log 2 expression in each cell, where red is high expression, orange is low expression, and gray is no expression.EMX2, PAX8, and SIM1 were highly expressed in the SVEC clusters versus PrEC, while CDH11, PITX1, and SALL3 were highly expressed in PrEC versus SVEC.(D) Quantitative PCR of SVECs and PrECs (CZ, PZ, TZ) collected from additional patients was used to validate select genes.EMX2, PAX8, and SIM1 were significantly increased in SV compared with CZ, PZ, and TZ, while CDH11, PITX1, and SALL3 were significantly decreased in SV compared with CZ, PZ, and TZ.Bar charts show fold-change (FC) normalized to RPL13A and relative to CZ for n = 3 patients.Error bars show SEM.Statistical analysis in GraphPad/Prism used one-way ANOVA with Tukey's multiple comparison test.FDR, false discovery rate.*p < 0.05, **p < 0.01, ***p < 0.001.Figure C was generated using Loupe Browser 6.0 (10X Genomics).

Figure 3
Figure 3 Legend on next page.
218 JE Vellky et al © 2023 The Authors.The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.www.pathsoc.orgJ Pathol 2024; 262: 212-225 www.thejournalofpathology.com

Figure 3 .
Figure 3. Prostate epithelial cell zone-specific gene ontology, gene expression, and validation.(A) Gene ontology for differentially up-regulated genes in each prostate zone (CZ, PZ, TZ).Bars indicate fold-enrichment and color indicates the FDR-adjusted p value.(B and C) Heat maps of the top 12 up-and down-regulated genes in each zone (CZ, PZ, and TZ), where red is high expression and blue is low expression.(D) Quantitative PCR of PrECs (CZ, PZ, TZ) collected from additional patients was used to validate select genes.Expression of APLN and NNMT was significantly increased in PZ versus CZ and TZ; PADI3 was significantly increased in CZ versus PZ and TZ; and SFRP1, CCN6, and KRT23 were significantly increased in TZ versus CZ and PZ.Bar charts show fold-change (FC) normalized to RPL13A and relative to CZ for n = 3 patients.Error bars show SEM; one-way ANOVA with Tukey's multiple comparison test.(E) Data mining from publicly available RNA-sequencing data compared 38 benign tissue biopsies adjacent to prostate cancer (PZ) with 20 benign BPH-adjacent biopsies from the TZ.APLN and NNMT were significantly increased in PZ compared with TZ, and SFRP1 was significantly increased in TZ compared with PZ.CCN6 was higher in TZ than in PZ but did not reach statistical significance (Student's t-test).(F) Data mining from publicly available RNA-sequencing data in prostate cancer progression and BPH.APLN and NNMT were significantly increased in advanced prostate cancermetastases (n = 425) and CRPC (n = 115) compared with benign adjacent (n = 38).NNMT was significantly increased in NEPC (n = 15) compared with benign adjacent (n = 38).SFRP1 was significantly decreased in BPH (n = 54) and BPH stromal nodules (n = 9) compared with normal TZ prostate (n = 20), and CCN6 was significantly increased in BPH stroma from laser-capture microdissection (BPH Stroma LCM, n = 7) compared with normal TZ prostate.Bar charts show fold-change transcripts per million (TPM).Error bars show SEM; one-way ANOVA with Tukey's multiple comparison test.FDR, false discovery rate.*p < 0.05, **p < 0.01, ***p < 0.001; ****p < 0.0001.

Figure 4 .
Figure 4. Expression and localization of zone-specific markers in situ.(A) Immunohistochemical staining for NNMT (brown) in human prostate showed high expression in basal cells of the peripheral zone (PZ, yellow arrowheads) and in stromal cells adjacent to the urothelial cells of the urethra (Ur).There was minimal NNMT expression in the central zone (CZ) and transition zone (TZ).(B) PADI3 (brown) was detected in all zones of the prostate, with the highest expression in basal cells of the CZ (yellow arrowheads), where it appears to localize to the membrane.(C) CCN6 (brown) was highly expressed in discrete basal cells of the TZ (yellow arrowheads).There was low expression in the CZ and Ur, and minimal expression in PZ.Top panels show stitched images of the whole transverse prostate section from 4Â objective magnification images and zonal panel images were taken at 60Â objective magnification.Hematoxylin (blue) was used as a nuclear counterstain.

Figure 5 .
Figure 5. Prostate stromal cell zone-specific gene ontology and gene expression.(A) Gene ontology for differentially up-regulated genes in each prostate zone (CZ, PZ, TZ) from RNA sequencing of stromal cells.Bars indicate fold-enrichment and color indicates the FDR-adjusted p value.(B) Heat map of the top ten up-regulated genes in stromal cells from each zone (CZ, PZ, and TZ), where red is high expression and blue is low expression.(C) Heat map of the top ten down-regulated genes in stromal cells from each zone (CZ, PZ, and TZ), where red is high expression and blue is low expression.CZ, central zone; PZ, peripheral zone; TZ, transition zone; TPM, transcripts per million.

Figure 6 .
Figure 6.Analysis of distinct basal populations across prostate zones.(A) Shared expression markers are represented in an array, where green indicates expression and white indicates no expression.Inclusion criteria included markers that were expressed in at least three clusters.At least one basal cell marker was present in each population.Clusters were identified as Basal, Basal/Dividing (Div.),Basal/Transit-amplifying (TA), Basal/Intermediate (Int.) 1 and 2, and Basal/Stem, and central zone-specific 1 and 2 (CZ-1, CZ-2).(B) Conserved genes between basal populations were quantified using a Venn diagramwhile no genes were shared by all five populations, many genes overlapped in pairwise comparison.The Basal/Stem population was excluded in this comparison due to a lack of robust independent gene expression.Two populations were identified as CZ-specific: CZ-1 and CZ-2.In these clusters, there were 23 genes in common, with 42 genes specific to CZ-1 and 29 genes specific to CZ-2.(C) Gene ontology analysis of up-regulated pathways in each basal population showed distinct differences between clusters.Bars indicate fold-enrichment and color indicates the FDR-adjusted p value.(D) Gene ontology analysis of down-regulated pathways in each basal population showed distinct differences between clusters.Bars indicate fold-enrichment and color indicates the FDR-adjusted p value.(E) Gene ontology analysis of shared genes between the CZ-specific clusters.Bars indicate foldenrichment and color indicates the FDR-adjusted p value.(F) An array of DEGs between the two CZ-specific clusters, where green indicates expression and white indicates no expression.CZ-1 contained several stem-related genes (LCN2, KRT6A, KRT13, CD24), while CZ-2 contained genes related to vesicular export.Figure B was generated using https://bioinformatics.psb.ugent.be/webtools/Venn/.

Table 1 .
Sequencing inputs, reads, and cell count per sample.

Table 2 .
Cells and significant genes per cluster.