Dysregulated expression of androgen metabolism genes and genetic analysis in hypospadias

Abstract Background The aberrant expression of genes involved in androgen metabolism and genetic contribution are unclear in hypospadias. Methods We compared gene expression profiles by RNA sequencing from five non‐hypospadiac foreskins, five mild hypospadiac foreskins, and five severe hypospadiac foreskins. In addition, to identify rare coding variants with large effects on hypospadias risk, we carried out whole exome sequencing in three patients in a hypospadias family. Results The average expression of androgen receptor (AR) and CYP19A1 were significantly decreased in severe hypospadias (p < .01) and mild hypospadias (p < .05), whereas expression of several other androgen metabolism enzymes, including CYP3A4, HSD17B14, HSD3B7, HSD17B7, CYP11A1 were exclusively significantly expressed in severe hypospadias (p < .05). Compound rare damaging mutants of AR gene with HSD3B1 and SLC25A5 genes were identified in the different severe hypospadias. Conclusions In conclusion, our findings demonstrated that dysregulation of AR and CYP19A1 could play a crucial role in the development of hypospadias. Inconsistent AR expression may be caused by the feedback loop of ESR1 signaling or combined genetic effects with other risk genes. This findings complement the possible role of AR triggered mechanism in the development of hypospadias.


| INTRODUCTION
Hypospadias (OMIM#: 300633) is among the most frequently congenital diseases in the male reproductive system. The average incidence of hypospadias is 2/1000 births, and the prevalence is suggested to be increasing during 1980-2010 (Yu et al., 2019). However, the exact biological mechanism underlying the development of hypospadias is unknown. Hypospadias is thought to be a multifactorial disorder, involving both genetic and environmental factors (Joodi et al., 2019). The hypospadias association studies demonstrated that a variety of genes and pathways are known to contribute the etiology of hypospadias, and these genes could affect the risk of hypospadias through protein-protein interactions (PPIs) . Hormones are required for normal sex organs to grow into either male or female organs during the early development. Androgens and estrogens are the important steroid hormones responsible for the development of external genitalia (Balaji et al., 2020;Bouty, Ayers, Pask, Heloury, & Sinclair, 2015).
The penile development is controlled through the balance of androgen receptor (AR, OMIM#: 313700) to estrogen receptor α (ESR1, OMIM#: 133430) levels during the neonatal period (Zheng, Armfield, & Cohn, 2015). Oestrogenic agents and androgen-blocking agents are the two major categories, which are used to induce hypospadias in rats and mice (Cunha, Sinclair, Risbridger, Hutson, & Baskin, 2015). Androgen signaling is regulated by the nuclear AR and play a pivotal role in masculinization during development. Disruption of androgen-induced genes, such as MAFB, can interfere with urethral formation and results in hypospadias (Matsushita et al., 2018). Missense mutations of AR accounted for 3% of patients, which may play a critical role in the cause of isolated hypospadias (Kalfa et al., 2013). While any defect in the androgen signaling may lead to hypospadias, the AR expression in hypospadias are controversial with the variable results. When compared to the non-hypospadiac children, AR expression exhibited various levels, including elevated (Balaji et al., 2020;Pichler et al., 2013), similar (Celayir, Moralioglu, Cetiner, Kir, & Celayir, 2019;Tack et al., 2019) or decreased (Lin, Xie, Chen, & Li, 2016;Silva et al., 2013). Various polymorphisms in ESR1 are associated with the risk of hypospadias (Choudhry et al., 2015), and ESR1 expression are also reported to be associated with hypospadias (Qiao et al., 2012;Wang et al., 2007). Despite the critical functions of androgen signaling and estrogen signaling in the process of hypospadias, regulation of gene expression and genetic contribution for hypospadias remain poorly unknown.
The exact molecular events for the development of hypospadias are just beginning to be addressed, the aim of this study was to investigate the gene expression profiles and the genetic basis to explore the possible biological mechanism, especially regulated by genes involved in androgen metabolism in hypospadias.

| Ethical compliance
The study was conducted after approval from the Ethics Committee of the Shanghai Children's Hospital in China (approve #: 2014R022-F01 and 2020R018-E01). Each patient was informed of the purpose of this study, and written consents were obtained from all participants or their parent/legal guardian. Ethical approval was obtained from the Shanghai Children's Hospital in China.

| Human subjects
We performed RNA sequencing and exome sequencing study design in a male Han Chinese population. In all, 18 subjects, including 15 foreskin specimen for RNA sequencing and three blood samples for whole exome sequencing (WES) analysis, were included in this study.
For RNA sequencing analysis, 15 prepuces of children who underwent consecutive circumcision either because of phimosis (controls; n = 5) or because of hypospadias repair (mild hypospadias, n = 5; severe hypospadias; n = 5), were included in this study. Three hypospadias samples for WES were enrolled from a hypospadias family. The 18 samples were collected with mean age 4.4 ± 2.1 years during 2015 to 2018 at Shanghai Children's Hospital, Shanghai Jiaotong University. Patients with hypospadias were diagnosed by the Department of Urology.

| RNA sequencing
The RNA was isolated from human tissues with TRIzol reagent according to the manufacturer's instructions (Life Technologies). RNA samples that pass quality parameters were used to construct RNA libraries using TruSeq RNA Library Prep Kit v2 (Illumina) according to the manufacturer protocol. The libraries were then sequenced on the Illumina NovaSeq 6000 platform with a paired-end read length of 150 bp.

| RNA analysis
RNA-Seq fastq data was then carried out based on the following protocols. Skewer software (Jiang, Lei, Ding, & Zhu, 2014) was performed to dynamically remove the 3′ ends, linker sequences, and low mass fragments of raw reads. Quality assessment was examined by FastQC tool (www. bioin forma tics.babra ham.ac.uk/proje cts/fastq c/). Trimmed clean reads were aligned against human reference genome (hg38) by STAR (Dobin et al., 2013). Then, the transcriptome was assembled using StringTie (Pertea et al., 2015) software based on the Ensembl reference (Homo sapiens GRCh38.90) (Cunningham et al., 2019). The FPKM (Fragments Per Kilobase of transcript per Million mapped reads) method was applied to estimate differentially expressed genes.
The obtained RNA expression data was used for comparative analysis. We applied surrogate variable analysis (SVA) (Leek, Johnson, Parker, Jaffe, & Storey, 2012) to remove batch effects. The limma package (Smyth, 2004) was used to assess the differentially expressed genes between hypospadiac foreskins and non-hypospadiac foreskins. The P-values were adjusted for multiple comparisons using the False Discovery Rate (FDR). The adjusted p < .05 and 2-fold were used as the cut-off criteria for differentially expressed genes between severe hypospadias or hypospadias (including mild and severe hypospadias) and controls, whereas the p < .05 and 1.5-fold were used as the cut-off criteria for differentially expressed genes between mild hypospadias and controls. Function heatmap.2 was performed for the graphical display of the dendrogram (Gentleman et al., 2004). GeneSense (Chen et al., 2014) and STRING (https://www.strin g-db.org) were used to identify PPIs encoded by hypospadias risk associated genes in human.

| WES, variant annotation and data analysis
We performed the WES analysis and variant annotation according to the method of Chen . As described in this method, coding variants were classified as synonymous, missense, LoF (loss of function, including splice acceptor/donor, stop gained/lost, initiator codon and frameshift indels) and others. The missense variants, that were predicted to be deleterious by SIFT (Kumar, Henikoff, & Ng, 2009) and damaging by PolyPhen-2 (Adzhubei et al., 2010), were annotated as deleterious missense variants (D-mis). Rare damaging variants (LoF and D-mis) were selected that had a minor allele frequency (MAF) ≤1% in ExAC (http://exac.broad insti tute.org) and 1000 Genomes Project. Multiple alignments of the SLC25A5 proteins across species were performed by the CLUSTALW program built in Mega software (http://www. megas oftwa re.net/). RNA-Seq data of 27 different tissues from 95 individuals, which is part of the Human Protein Atlas (www.prote inatl as.org) (BioProject: PRJEB4337) (Fagerberg et al., 2014), was downloaded from NCBI.

| Molecular subtyping based on gene expression and hierarchical cluster analysis
Two different types (severe and mild) of hypospadiac foreskins and non-hypospadiac foreskins (controls) were prepared in this study. Figure 1a showed that the gene expression patterns in severe hypospadias and controls are more different compared with mild hypospadias. Overall, 498 genes were identified differentially expressed in severe hypospadias, mild hypospadias and all (severe and mild) hypospadias compared with controls. Among these proteins encoded by differentially expressed genes, 11 proteins were predicted to interact with ESR1, while four proteins interact with AR. We performed unsupervised hierarchical clustering analysis of the 11 genes, which encode proteins interacting with ESR1. Results showed that the classifications of samples into hypospadias and controls (Figure 1b).

| Combined effects of AR variants and other risk variants cause diverse phenotypes in a three-generation hypospadias family
The hypospadias family of this study included 12 individuals in a three-generation pedigree (Figure 3) originating from Han Chinese. Six male family members were clinically investigated: all three cases showing severe hypospadias (II-2, III-3, III-4) with the variable outcomes, and three individuals are unaffected. The urethral opening of these patients was located in the perineum. Among these three severe hypospadias, patient III3, III4 presented small penis with ventural curvature, bilateral accessible testis in scrotum. III-4 was accompanied with crytorchidism and solitary kidney, and all of them have no other deformity.
In addition, two candidate loci were identified in chromosome X in our recruited family. These two rare damaging variants [p.Glu293* or p.E293* (NM_001152.4: c.877G>T) and p.Gly121Cys or p.G121C (NM_001152.4: c.361G>T)] were located in SLC25A5 (OMIM#: 300150) ( Table 1). All these variants did not exist in 1KGP. The p.Gly121Cys mutation of SLC25A5 gene was found in the ExAC database with MAF < 0.01, while p.Glu293* in SLC25A5 was a novel LoF variant and not found in the ExAC database. The SLC25A5 121 glycine residue and 293 glutamic residue are located in a highly conserved region across species (Figure 4a). All these variants were identified in all patients and is likely responsible for recessive inheritance. Furthermore, SLC25A5 is expressed in testis tissue from NCBI browser (Figure 4b).

| DISCUSSION
AR plays a pivotal role in the development of hypospadias. However, the research studies on dysregulated expression of AR are controversial and the exact mechanism is unclear. In the present study, we investigated the possible mechanisms leading to aberrant expression of genes involved in androgen metabolism in hypospadias by gene expression profiling and WES. Using data from gene expression profiling, we demonstrated that the significantly expressed genes are enriched in biological process of proteins interacting with ESR1. Additionally, we documented that mild and severe hypospadias express lower average transcript levels for AR and CYP19A1, whereas the expression of several steroidogenic enzymes, including CYP3A4, HSD17B14, HSD3B7, HSD17B7, and CYP11A1 are significantly decreased in severe hypospadias, compared with non-hypospadiac tissues. The genetic data further showed that the compound damaging variants of AR genes and other risk genes existed in severe hypospadias with the variable phenotypes.  Our findings suggest that AR and ESR1 signaling play a critical role in the development of mild and severe hypospadias using the gene expression profiling. The human AR gene, also known as the dihydrotestosterone receptor, is located on X-chromosome and a member of the nuclear hormone receptor family. Although AR defect may play a causative role during the development of hypospadias, the AR expression (Balaji et al., 2020;Celayir et al., 2019;Lin et al., 2016;Pichler et al., 2013;Silva et al., 2013;Tack et al., 2019) in hypospadias is still debatable. Lin et al. (2016) and Silva et al. (2013) reported the decreased AR expression in boys with hypospadias compared with controls. Our study in Chinese children with hypospadias demonstrated the similar results to those studies. The finding of the aberrant expression of AR and other enzymes (CYP19A1, CYP3A4, HSD17B14, HSD3B7, HSD17B7, and CYP11A1), suggesting that the hypospadiac androgen synthesis may occur at various levels within individual patient. In addition, our findings suggests that genes encode for proteins interacting with ESR1 are enriched in hypospadias, indicating that ESR1 signaling plays an crucial role in prenatal penile development. The expression of estrogen receptor α in mouse fetal penile tissue was demonstrated in some previous studies, and women exposed to diethylstilbestrol in utero increased the risk of hypospadias (Kim et al., 2004;Klip et al., 2002). Our study provided preliminary evidence that the imbalance of AR and ESR1 signaling are present in hypospadias, which may affect the individual patient by various routes.
Furthermore, our genetic results identified a rare damaging AR allele (p.Arg841His) and other hypospadias risk mutants in all three hypospadias patients suggesting that compound variants might contribute to the genetic etiology of hypospadias in this family. The AR p.Arg841His variant is located in the ligand-binding domain and has been previously reported in Asian patients with androgen intensitivity syndrome (Wang, Gong, Wang, & Qin, 2017), which is known to be associated with hypospadias. The HSD3B1 gene, a member of HSD3B gene family, encodes the type I enzyme and is exclusively expressed in the adrenal, small intestine and placenta. HSD3B1 expression is crucial to produce progesterone for the establishment and maintenance of pregnancy (Lai et al., 2017). Genetic mutation in HSD3B1 is identified to be associated with prostate cancer susceptibility (Chang et al., 2002;Hearn et al., 2018). Carmichael and colleagues detected one SNP in HSD3B1 which was associated with moderate hypospadias through sex hormone biosynthesis and metabolism (Carmichael, Witte, Ma, Lammer, & Shaw, 2014). In this family, a rare damaging variant p.Thr353Met in HSD3B1 has been identified. Collectively, our study provided strong direct evidence that rare damaging variant p.Thr353Met in HSD3B1 could increase hypospadias risk via sex hormone biosynthesis and metabolism. In addition to AR and HSD3B1, two rare damaging alleles in one novel candidate genes SLC25A5 were identified in our recruited family. SLC25A5 gene, previously named adenine nucleotide translocator 2 (ANT2), is ubiquitously expressed and is a transcriptionally active ADP/ATP translocase gene in human (Chen et al., 1990). SLC25A5 has been related to non-syndromic intellectual disability (Vandewalle et al., 2013). And the deletion encompassing SLC25A43, LOC100303728, SLC25A5, CXorf56, UBE2A, NKRF, and SEPT6, is associated with hypospadias (de Leeuw et al., 2010). However, the function of SLC25A5 remains poorly understood. The data herein provided direct evidence that SLC25A5 may contribute the genetic etiology of hypospadias.
Although genetic research in AR has been investigated for association with the risk of hypospadias (van der Zanden et al., 2012), the genetic contribution of AR gene to the risk of hypospadias is not clear. Rare damaging variants have a greater impact on protein function and have larger effects than common variants, which were used in identifying causal genes in birth defect disease (Chen, Kuang, Finnell, & Wang, 2018;Chen, Lei, Cao, et al., 2018;Lei et al., 2019). The diverse compound rare damaging variants of AR and other hypospadias risk mutants in hypospadias might be the major reason to cause the high interpatient variability. The AR network  indicated that ESR1 interacts with AR, and rare damaging variant (p.Thr353Met) in HSD3B1 may also triggers a feedback loop, affecting the expression of the AR.
In conclusion, our study demonstrated that disrupted AR and CYP19A1 expression could play a causative role in the development of hypospadias. Inconsistent AR expression may be caused by the regulation of ESR1 signaling or combined genetic effects of AR with other risk genes.