Androgen receptor‐mediated transcriptional repression targets cell plasticity in prostate cancer

Androgen receptor (AR) signaling remains the key therapeutic target in the management of hormone‐naïve‐advanced prostate cancer (PCa) and castration‐resistant PCa (CRPC). Recently, landmark molecular features have been reported for CRPC, including the expression of constitutively active AR variants that lack the ligand‐binding domain. Besides their role in CRPC, AR variants lead to the expression of genes involved in tumor progression. However, little is known about the specificity of their mode of action compared with that of wild‐type AR (AR‐WT). We performed AR transcriptome analyses in an androgen‐dependent PCa cell line as well as cross‐analyses with publicly available RNA‐seq datasets and established that transcriptional repression capacity that was marked for AR‐WT was pathologically lost by AR variants. Functional enrichment analyses allowed us to associate AR‐WT repressive function to a panel of genes involved in cell adhesion and epithelial‐to‐mesenchymal transition. So, we postulate that a less documented AR‐WT normal function in prostate epithelial cells could be the repression of a panel of genes linked to cell plasticity and that this repressive function could be pathologically abrogated by AR variants in PCa.


Introduction
With an estimated 1.27 million newly diagnosed men worldwide in 2018, prostate cancer (PCa) remains the second most common cancer in men according to the GLOBOCAN project and is the fifth leading cause of death from cancer in men, which places it as a vitally important public health issue [1]. PCa cell growth and survival rely on the bio-availability of androgens, such as testosterone and its derived dihydrotestosterone (DHT), whose action is mediated by androgen receptor (AR) [2][3][4].
Androgen receptor is a ligand-dependent transcription factor that belongs to the nuclear receptor superfamily [2,5]. Schematically, in the absence of a ligand, AR is localized in the cytoplasm, folded by chaperone proteins in an inactive but ligand-binding competent state [6]. Following ligand binding, AR translocates to the nucleus and binds to androgen-responsive elements (AREs) present in enhancers, superenhancers, introns, and/or promoters of target genes [7]. Thereafter, AR recruits pioneer factors and cofactors that favor chromatin opening and, in turn, gene expression [8][9][10][11]. This AR cistrome landscape can be pathologically reprogrammed in human prostate cancer [12][13][14].
Androgen receptor remains a key therapeutic target in the management of hormone-naïve-advanced PCa and CRPC [15,16]. However, the efficacy of androgen deprivation therapy is transient as all patients will ultimately relapse [17,18]. Several molecular mechanisms can drive progression to CRPC [19], and most of them maintain AR signaling pathways in an active state [17]. Indeed, nonsense mutations and diverse AR gene rearrangements that result in the expression of constitutively active AR variants emerge as important molecular mechanisms that lead to CRPC [20][21][22][23][24]. Besides their role in therapeutic response, AR variants seem to play a key role in prostate cancer progression to a more aggressive stage. Indeed, AR-Q641X and AR-V7, but not AR-WT, lead to the upregulation of mesenchymal markers, in particular, N-Cadherin, Vimentin, Snail, and ZEB1 [25]. Moreover, contrary to what is observed with constitutively active AR variants, AR-WT binding to AREs present in CDH2 intron 1 following dihydrotestosterone stimulation is not accompanied by an upregulation of N-cadherin [26]. Altogether, these data suggest that AR-WT and constitutively active AR variants differently control the expression of a panel of genes at the transcriptional level.
To explore this hypothesis further, we performed in the present study RNA-seq and proteomic analyses, as well as cross-analyses of experimental data with other publicly available PCa cell transcriptomic datasets, to draw a full landscape of the distinctive transcriptional activities of AR-WT, AR-V7, and AR-Q641X in PCa cells. We found that DHT-activated AR-WT inhibited the expression of a panel of genes and that this property was pathologically lost with the expression of constitutively active AR variants. We further showed that the panel of genes repressed by DHT-activated AR-WT encoded effectors of cell membrane and cell adhesion functions. So, we postulate that one of the expected normal functions of AR in prostatic tissue could be to prevent the expression of genes linked to cell plasticity. As cell plasticity is closely linked to tumor progression, it may be of interest to pay attention to the long-term consequences of AR targeting on PCa cell features.

RNA extraction
Transduced LNCaP and C4-2B cells were seeded in RPMI-1640 complete medium with charcoal-treated FBS and without geneticin and puromycin. After 48 h, cells were treated for 24 h with 20 ng·mL⁻¹ doxycycline to induce the expression of AR-WT, AR-Q641X, AR-V7, or eGFP alone. Also, as indicated, cells were concomitantly treated with 10 nM DHT or ethanol (EtOH) as vehicle. Total RNA was isolated using the NucleoSpin® RNA II assay (Macherey-Nagel, Hoerdt, France) according to the manufacturer's protocol. This experiment was performed in three biological replicates for each condition.

Data normalization and differential gene expression analysis on DESeq2
All raw expression data were normalized and analyzed with the DESeq2 Bioconductor package (DESeq2 1.30.0) in R (version 4.0.3) with default settings [33,34]. Comparisons of interest were performed in order to obtain the (base 2) log of the fold changes (log2FC) and the corresponding adjusted P-values. For our experimental dataset, differentially expressed genes were calculated either between DHT-treated control or AR-WT expressing cells and vehicle-treated control, or between vehicle-treated AR-Q641X or AR-V7 expressing cells and vehicle-treated control, as indicated.
From the GSE125014 dataset, differentially expressed genes were calculated between DHT- or enzalutamide-treated and vehicle-treated LNCaP cells. From the GSE148397 dataset, differentially expressed genes were calculated between R1881- or R1881 plus darolutamide-treated and vehicle-treated VCaP cells. From the GSE151429 dataset, differential gene expression was calculated between doxycycline-treated cells and vehicle-treated cells in the absence of androgen. To take into account the large dispersion observed with low read counts and to obtain more accurate log2FC estimates, shrinkage of the estimates (lfcShrink function) was applied using 'apeglm' (version 1.12.0) as the type of shrinkage estimator [35].
For all RNA-seq data, log2FC results are expressed as the mean ratio of the indicated number of observations for each condition. P-values were adjusted for multiple testing using the Benjamini and Hochberg procedure [36], and differences were considered statistically significant when P-value < 0.05. Differentially expressed genes were defined according to the following criteria: P-value < 0.05 and log2FC > 1 or < −1 for the present study, GSE125014 and GSE151429 datasets, and P-value < 0.01 and log2FC > 2 or < −2 for the GSE148397 dataset.
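The thresholding rule described above can be illustrated with a minimal Python sketch. It is not the DESeq2 implementation used in the study: the function names `benjamini_hochberg` and `call_deg` are hypothetical, the input values are invented for illustration, and the sketch assumes that the P-value thresholds apply to Benjamini-Hochberg adjusted P-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values, in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value to the smallest, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def call_deg(log2fc, padj, lfc_cutoff=1.0, alpha=0.05):
    """Classify one gene as 'up', 'down', or 'ns' (not significant)."""
    if padj < alpha and log2fc > lfc_cutoff:
        return "up"
    if padj < alpha and log2fc < -lfc_cutoff:
        return "down"
    return "ns"


# Invented example values (not taken from the study).
raw_p = [0.01, 0.04, 0.03, 0.5]
padj = benjamini_hochberg(raw_p)
labels = [call_deg(lfc, p) for lfc, p in zip([-1.5, 2.0, 0.4, 3.0], padj)]
```

The stricter criteria applied to the GSE148397 dataset would correspond to calling `call_deg(log2fc, padj, lfc_cutoff=2.0, alpha=0.01)`.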

Functional enrichment analysis
To identify significantly enriched pathways in different experimental conditions, gene set enrichment analysis (GSEA) was performed on preranked datasets sorted by log2FC using the GSEA 4.1.0 desktop application [37,38]. A conservative scoring approach was defined by setting the scoring scheme parameter to classic (unweighted). The gene set collection used for GSEA was the Molecular Signatures Database (MSigDB) hallmark collection (v7.2). The GSEA Preranked tool provides, for each gene set of the collection, an enrichment score that reflects how often members of that gene set occur at the top or bottom of the ranked dataset. Then, the score was normalized for each gene set to account for the size of the set. Only results with a false discovery rate (FDR) q-value < 0.25 were considered significant, as recommended by the developers of the GSEA tool, and are presented ranked by their GSEA normalized enrichment score. To further analyze pathways and biological functions that could be specifically associated with AR-repressive activity, significantly downregulated genes in the three datasets were pooled. The resulting panel of AR-repressed genes was uploaded on the Gene Ontology (GO) Consortium's Web site (http://geneontology.org/) for overrepresentation analysis among the three GO categories biological process (BP), molecular function (MF), and cellular component (CC) [39,40]. Results with a P-value < 0.05 and a fold enrichment > 2 were selected as significant, and the first ten enriched terms of each category were presented according to their P-value. Furthermore, the MSigDB hallmark collection was used on the Enrichr online tool (https://maayanlab.cloud/Enrichr/) for the enrichment analysis of the repressed genes. This approach allows the identification of significant overlaps between our list of genes and a gene set of the collection [41,42]. Significantly enriched gene sets (P-value < 0.05) were presented ranked by P-value with indication of the number of overlaps.
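As an illustration of the classic (unweighted) scoring scheme mentioned above, the following Python sketch computes a running-sum enrichment score for one gene set on a preranked list. It is a simplified sketch and not the GSEA desktop application: the function name `classic_enrichment_score` and the toy gene identifiers are hypothetical, and the normalization and FDR estimation steps performed by GSEA are not reproduced here.

```python
def classic_enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    ranked_genes: gene identifiers sorted by decreasing log2FC.
    gene_set: collection of gene identifiers belonging to the set.
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Each hit adds 1/n_hit; each miss subtracts 1/n_miss.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list (hypothetical identifiers), most upregulated first.
ranked = ["G1", "G2", "G3", "G4", "G5", "G6"]
es_top = classic_enrichment_score(ranked, {"G1", "G2"})     # set at the top
es_bottom = classic_enrichment_score(ranked, {"G5", "G6"})  # set at the bottom
```

A positive score indicates that the gene set is concentrated among upregulated genes, and a negative score that it is concentrated among downregulated genes.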

Time-course proteomic analysis
LNCaP cells stably expressing eGFP-AR-WT were seeded in complete medium and then starved for 48 h in RPMI-1640 complete medium with charcoal-treated FBS before treatment with 10 nM DHT or vehicle. After 24 and 48 h, cells were collected by centrifugation and washed 4 times in phosphate-buffered saline. Sample preparation and mass spectrometry were performed at the IGBMC proteomic platform. Briefly, cell pellets containing about 2 × 10⁶ cells were lysed in 1% SDS, 0.1 M Tris pH 8.5, 50 mM DTT, and sonicated.

Screen for androgen receptor partner interactions
In order to analyze a potential difference in partner recruitment between AR-WT and constitutively active AR variants, we used the previously described proximity-dependent biotin identification (BioID2) approach [43]. Plasmids pMyc-BioID2-AR-WT, pMyc-BioID2-AR-Q641X, and pMyc-BioID2-AR-V7, in which AR-WT or AR variants were fused to the N-terminus of the humanized Aquifex aeolicus BioID2 protein, were constructed from pMyc-BioID2 (#74223, Addgene, Teddington, UK). Then, Myc-BioID2-AR-WT, Myc-BioID2-AR-Q641X, or Myc-BioID2-AR-V7 fragments were inserted in pLVX-TRE3G from the Tet-On 3G inducible expression lentiviral system (Takara Bio Europe, Saint-Germain-en-Laye, France) to yield a doxycycline-inducible expression of the respective transgenes in LNCaP cells. The unfused Myc-BioID2 transgene was considered as control. For biotinylating AR partners, 1.5 × 10⁶ transduced cells were plated in p100 dishes for 48 h up to 80% cell confluence, and then cells were placed in fresh medium containing 2 ng·mL⁻¹ doxycycline, 50 μM biotin, and 10 nM DHT. After 24 h, cells were lysed in RIPA lysis and extraction buffer supplemented with 25 U·mL⁻¹ of benzonase and 1X Protease Inhibitor Cocktail. Cell extracts were then sonicated (50% Amplitude, 10 s Pulse-ON, 20 s Pulse-OFF; Q700 ultrasonic processor, QSonic, Newtown, CT, USA) and clarified by centrifugation at 14 000 g at 4°C for 15 min. Biotinylated proteins were purified using the streptavidin-affinity approach. Briefly, lysates were incubated overnight at 4°C and under agitation in 200 μL magnetic streptavidin-coupled beads (Invitrogen™ Dynabeads™, Thermo Fisher Scientific, Waltham, MA, USA). After centrifugation, beads were washed twice in 2% SDS, once in RIPA, twice in a washing buffer containing 10% glycerol, 50 mM HEPES-NaOH pH 8, 150 mM NaCl, 2 mM EDTA, and 0.1% NP-40, and then resuspended in 85 μL Laemmli buffer. Biotinylated proteins were then eluted from beads at 98°C for 5 min and separated by 7.5% SDS-PAGE.
All experiments were repeated three times on separate days.

Mass spectrometry analysis
Protein samples were reduced, alkylated, and digested at 37°C for AR interactome analysis, or double digested with Lys-C and trypsin at 37°C for the time-course proteomic approach. Peptide mixtures were then desalted on a C18 spin-column and dried on a Speed-Vacuum before LC-MS/MS analysis. Peptides were analyzed using an Ultimate 3000 nano-RSLC (Thermo Scientific, San Jose, CA, USA) coupled in line with an LTQ-Orbitrap ELITE mass spectrometer via a nano-electrospray ionization source (Thermo Scientific). Briefly, peptides were loaded in triplicate on a C18 Acclaim PepMap100 trap column (Thermo Fisher Scientific) and then separated on a C18 Accucore nanocolumn (Thermo Fisher Scientific) with linear gradients of acetonitrile and analyzed with a TOP20 CID data-dependent MS method. For the time-course proteomic approach, proteins were identified by database searching using Sequest-HT (Thermo Fisher Scientific) with Proteome Discoverer 2.4 software (PD2.4, Thermo Fisher Scientific) on the Homo sapiens database (SwissProt, reviewed, release 2020_04_06, 20286 entries). Oxidation and carbamidomethylation were set as variable and fixed modifications, respectively. Peptides were filtered with an FDR at 1%, rank 1, and proteins were identified with a minimum of 2 unique peptides. The Label-Free Quantification was based on the XIC (Extracted Ion Chromatogram), where protein abundances were calculated from the average of peptide abundances using the TOP N approach (where N = 3, the 3 most intense peptides for each protein), and only unique peptides were used for the quantification. Quantification values were exported to Perseus for statistical analysis [44]. For AR interactome analysis, proteins were identified by database searching against the human database using MaxQuant 1.6.5.0. Precursor and fragment mass tolerance before recalibration were set at 20 ppm and 0.6 Da, respectively. Trypsin was set as the enzyme, and up to two missed cleavages were allowed.
Carbamidomethylation was set as a fixed modification, and oxidation and N-term acetylation as variable modifications. Proteins were identified with a minimum of two unique peptides and were filtered with an FDR < 1%. Normalization and quantitative values (iBAQ) were processed with Perseus 1.6.2.0.
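The TOP N label-free quantification rule described above (N = 3, unique peptides only, at least two unique peptides per protein) can be sketched as follows. This is an illustrative simplification and not the Proteome Discoverer implementation: the function names, the protein labels, and the abundance values are hypothetical.

```python
def top_n_abundance(peptide_abundances, n=3):
    """Mean abundance of the n most intense peptides of one protein."""
    top = sorted(peptide_abundances, reverse=True)[:n]
    if not top:
        raise ValueError("at least one peptide abundance is required")
    return sum(top) / len(top)


def quantify_proteins(unique_peptides, n=3, min_peptides=2):
    """Quantify each protein from its unique peptide abundances.

    unique_peptides: dict mapping a protein label to a list of abundances of
    its unique peptides. Proteins with fewer than min_peptides are skipped.
    """
    return {
        protein: top_n_abundance(abundances, n)
        for protein, abundances in unique_peptides.items()
        if len(abundances) >= min_peptides
    }


# Invented abundances for two hypothetical proteins.
example = {
    "protein_A": [10.0, 40.0, 20.0, 30.0],  # top 3 are 40, 30, 20
    "protein_B": [8.0],                     # only one unique peptide: skipped
}
quantified = quantify_proteins(example)
```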

BioID interactome analysis
After the normalization step of mass spectrometry data, differences in AR partners were calculated between AR-Q641X and AR-WT or AR-V7 and AR-WT conditions. To identify partners that were specifically lost in the presence of AR variants, differentially underrepresented proteins (P-value < 0.05 and log2FC < −1) were searched from intersected AR-Q641X or AR-V7 and AR-WT data. The list of interactors underrepresented in the presence of AR variants compared with AR-WT was then submitted to Gene Ontology analysis as described above (see Functional enrichment analysis). Identification of experimentally proved AR interactors was performed using the BioGRID database (https://thebiogrid.org) [45].
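The selection of underrepresented partners can be sketched in Python as below. The sketch assumes one possible reading of the procedure, namely that a partner is retained when it passes both thresholds in each variant versus AR-WT comparison. The function name `underrepresented`, the protein labels, and all numerical values are hypothetical.

```python
def underrepresented(results, lfc_cutoff=-1.0, alpha=0.05):
    """Return proteins with P-value < alpha and log2FC < lfc_cutoff.

    results: dict mapping a protein label to (log2FC versus AR-WT, P-value).
    """
    return {
        protein
        for protein, (log2fc, pvalue) in results.items()
        if pvalue < alpha and log2fc < lfc_cutoff
    }


# Invented comparison tables (variant versus AR-WT).
q641x_vs_wt = {"P1": (-2.1, 0.001), "P2": (-1.4, 0.030), "P3": (-0.5, 0.010)}
v7_vs_wt = {"P1": (-1.8, 0.004), "P2": (-1.2, 0.200), "P3": (-3.0, 0.001)}

# Partners lost with both variants: intersection of the two filtered sets.
lost_with_both = underrepresented(q641x_vs_wt) & underrepresented(v7_vs_wt)
```

In this toy example only `P1` is retained, because `P2` fails the P-value threshold for AR-V7 and `P3` fails the log2FC threshold for AR-Q641X.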

Statistics
qPCR results represent mean ± standard error of the mean (SEM) of three biological repeats. Statistical analysis was performed using Student's t-test by comparing the control, AR-WT, AR-Q641X, or AR-V7 condition versus the eGFP condition treated with vehicle, and P-values < 0.05 (*) were considered to be statistically significant.
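For illustration, the mean, SEM, and two-sample Student's t statistic for triplicate measurements can be computed with the Python standard library as sketched below. The function names and the triplicate values are hypothetical, and equal variances are assumed. The sketch stops at the t statistic and degrees of freedom; the P-value would then be read from the t distribution, for example with `scipy.stats.ttest_ind`.

```python
from math import sqrt
from statistics import mean, stdev, variance


def sem(values):
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return stdev(values) / sqrt(len(values))


def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Invented triplicates: one condition versus the vehicle-treated eGFP reference.
condition = [1.0, 2.0, 3.0]
reference = [4.0, 5.0, 6.0]
t_stat, dof = student_t(condition, reference)
summary = (mean(condition), sem(condition))
```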

Data analysis and graphic representation
All data analysis and visualization were performed in Python 3 [46] using the pandas, bokeh, matplotlib, numpy, scipy, and seaborn packages.

Distinctive transcriptomic program of constitutively active AR variants in prostate cancer cells
We have previously reported dual transcriptional activities between constitutively active AR and wild-type AR (AR-WT) in prostate cancer [26]. Constitutively active AR variants have been involved in the expression of mesenchymal markers, while on the contrary, AR-WT seems to impede such expression. Our previous data suggest that AR-WT may play an occluding function to prevent the expression of mesenchymal markers, as evidenced for CDH2 expression. To delineate molecular mechanisms involved in this duality, we first compared the global transcriptome profile triggered by AR-WT with those of the constitutively active AR variants AR-V7 and AR-Q641X. RNA-seq was performed in androgen-sensitive LNCaP cells, which express an endogenous AR containing the T878A mutation. LNCaP cells were lentivirally transduced to coexpress, in a doxycycline-dependent manner, either AR-WT, AR-Q641X, or AR-V7 in fusion with eGFP. LNCaP cells expressing eGFP alone and treated with vehicle were considered as control to calculate the log2 fold change in gene expression between different experimental conditions (Fig. S1). We first validated our model by analyzing DHT-induced gene expression modifications in control cells and in LNCaP cells expressing AR-WT. Twenty-four hours after DHT treatment, 183 genes were downregulated and 466 were overexpressed in control, indicating transcriptional activities of the T878A endogenous AR in LNCaP cells (Table 1).
A panel of 30 known androgen-responsive genes in prostate tissue epithelium was used to further validate the experimental model [47] (Fig. S1). When focused on DHT-activated AR-WT, the number of down- and upregulated genes markedly shifted to 395 and 465, respectively. Besides, 1558 and 1091 genes were found to be deregulated (|log2FC| > 1; P-value < 0.05) in the presence of AR-Q641X or AR-V7, respectively (Fig. 1A; Table 1). Notably, only about 17% of these genes (257 out of 1558 and 184 out of 1091, respectively, for AR-Q641X and AR-V7) were common with those deregulated in the presence of DHT-activated AR-WT (Fig. 1A).
Altogether, these data indicate that the transcriptional programs of AR-Q641X and AR-V7 do not completely mirror those of DHT-liganded AR-WT or AR-T878A in LNCaP cells, and highlight again differential transcriptional activities between constitutively active AR variants and DHT-activated full-length AR. It was also interesting to note that when assessed at the same time, in the same cellular model, AR-V7 shared with AR-Q641X only 57% (887 out of 1558; |log2FC| > 1; P-value < 0.05) of deregulated genes, suggesting further transcriptional specificity among these two AR variants (Fig. 1A).

AR variants exhibited reduced transcriptional repression activities in prostate cancer cells
To further investigate differential gene regulation between full-length AR and constitutively active AR variants, we focused on their transcriptional repression activities. Transcriptional repression activity that was evident in control and AR-WT co-expressing cells appeared to be lost in the presence of AR-Q641X and AR-V7 (Fig. 2; Table 1). Indeed, nearly 28% and 46% of deregulated genes were downregulated, respectively, in control and in the presence of AR-WT. In contrast, the percentage of repressed genes dropped to 5% and 6% in the presence of constitutively active AR-Q641X and AR-V7, respectively (Fig. 2; Table 1). We next wondered whether ARv567es, another constitutively active AR variant, elicited similar results. We therefore used a publicly available RNA-seq GEO dataset, GSE125014, obtained in LNCaP cells expressing ARv567es in a doxycycline-inducible manner. A similar asymmetry of down- and upregulated genes was also observed (Fig. S2), reinforcing our idea that, compared with AR-WT, transcriptional repression capacities of constitutively active AR variants are disturbed.
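The proportions of repressed genes quoted for the control and AR-WT conditions follow directly from the counts of down- and upregulated genes reported earlier in this section (183 and 466 in control; 395 and 465 with DHT-activated AR-WT). A short Python check, in which the function name `repressed_fraction` is introduced only for illustration:

```python
def repressed_fraction(n_down, n_up):
    """Fraction of deregulated genes that are downregulated."""
    return n_down / (n_down + n_up)


# Counts of (downregulated, upregulated) genes reported in the text.
counts = {
    "control + DHT": (183, 466),
    "AR-WT + DHT": (395, 465),
}

percent_repressed = {
    condition: round(100 * repressed_fraction(down, up))
    for condition, (down, up) in counts.items()
}
```

The same calculation applies to the AR-Q641X and AR-V7 conditions using the counts listed in Table 1.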
In this light, the following gene panel, TMPRSS2, KLK3, ITGA3, HDAC9, COL16A1, SMARCD3, and ITGB4, was selected to further validate by RT-qPCR this dual transcriptional regulation between DHT-activated AR-WT and constitutively active AR variants in LNCaP and in C4-2B cells. As expected, TMPRSS2 and KLK3 were positively regulated in all conditions, and the drop in the expression level of ITGA3, HDAC9, COL16A1, SMARCD3, and ITGB4 observed with DHT-activated AR-WT was significantly attenuated in cells expressing constitutively active AR variants (Fig. 3; Fig. S3). We next checked, by performing a time-course proteomic analysis in LNCaP cells, whether the repressive transcriptional activity of DHT-activated AR-WT was noticeable at the protein level. Indeed, a significant and time-dependent increase in the number of underrepresented proteins (|log2FC| > 0.6; P-value < 0.05) was observed, including a panel of 14 proteins for which the corresponding encoding gene was part of the downregulated genes in the presence of DHT-activated AR-WT (Fig. S4, Table S2). Together, these data suggest that following DHT activation, AR-WT could trigger the downregulation of a specific panel of genes and that this property would be pathologically lost by constitutively active AR variants.
Table 1. Significant differentially expressed genes. LNCaP cells were transduced to express AR-WT, AR-Q641X, AR-V7, or the empty eGFP plasmid (control) and were exposed to DHT or vehicle (EtOH) as indicated. Gene expression levels in the presence of DHT-activated AR-WT, AR-Q641X_EtOH, or AR-V7_EtOH (Datasets A) were compared with the reference dataset B corresponding to Control_EtOH. The number of differentially expressed genes with adjusted P-value < 0.05 and |log2FC| > 1 is indicated. The number of under- and overexpressed genes, A < B and A > B, respectively, is also indicated.

Similar AR-binding sites associated with both up- and down-transcriptional regulation by DHT-activated wild-type AR
A possible molecular mechanism to explain differences in transcriptional repressive activities between AR-WT and constitutively active AR variants could be distinctive recognition of so-called negative AREs (nAREs) [48]. To explore this hypothesis, we first wondered whether there was a difference in AR-binding sites associated with transcriptional activation and repression by DHT-activated AR. So, available AR ChIP-seq data from LNCaP and VCaP cells were downloaded from the GEO database and intersected with the lists of up- and downregulated genes. Sequences corresponding to AR-binding sites (500 bp centered on the peak summit) were subsequently retrieved from the human reference genome (hg19/GRCh37) and submitted to the MEME-ChIP webserver (https://meme-suite.org/meme/tools/meme-chip) for motif analysis (Fig. S5). FOXA1 and AR motifs were the first DNA sequence motifs that were similarly found in AR-binding sites associated with up- and downregulated genes by DHT-activated AR in LNCaP and VCaP cells (Fig. S6).
Fig. 2. Distribution of RNA-seq data. Volcano plot representing the distribution of RNA-seq data of the four experimental conditions. Genes with adjusted P-value < 0.05 and |log2FC| > 1 are shown in red (significantly downregulated genes) and blue (significantly upregulated genes). RNA-seq was performed from three biological replicates.
Fig. 3. Loss of transcriptional repression activities of AR variants AR-Q641X and AR-V7 in LNCaP cells. Gene expression was analyzed by qPCR. The log2 fold change in gene expression was calculated between the four experimental conditions and the control (eGFP) cells treated with vehicle as reference. Bar graphs represent mean of 3 biological repeats. Student's t-test was used to compare control, AR-WT, AR-Q641X, or AR-V7 condition with the eGFP condition treated with vehicle. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, ns, nonsignificant.
These data suggest that gene downregulation by AR-WT was not linked to a difference in the DNA sequence of AR-binding sites, and prompted us to investigate instead differences in the regulatory complexes formed around DHT-activated AR-WT and constitutively active AR variants.

AR variants disengaged from corepressor recruitment
As the AR ligand-binding domain and AF-2 are known to present interfaces for the recruitment of numerous coregulators, the lack of the C-terminal part in constitutively active AR variants could lead to the formation of particular complexes, thereby explaining the decrease in their transcriptional repression capacities. To analyze regulatory complexes formed around AR-WT and constitutively active AR variants, the BioID2 approach was applied (Fig. S7). A mass spectrometry analysis was thereafter carried out on purified biotinylated proteins from myc-BioID2, myc-BioID2-AR-WT, myc-BioID2-AR-Q641X, and myc-BioID2-AR-V7 transduced LNCaP cells. After raw data normalization, 78 biotinylated proteins were underrepresented in the presence of myc-BioID2-AR-Q641X and myc-BioID2-AR-V7 compared with myc-BioID2-AR-WT. An enrichment analysis indicated that these 78 potential AR partners fit mainly in GO molecular function terms around transcriptional regulation, including 'nucleic acid binding', 'transcription coregulator activity', 'heterocyclic compound binding', 'organic cyclic compound binding', 'transcription corepressor activity', and 'transcription regulator activity' (Fig. 4A). According to the BioGRID database, among the 13 proteins related to 'transcription corepressor activity', 7 are known as experimentally proved AR partners, including BCOR, NCOR1, NCOR2, and PIAS1 (Fig. 4B).
These data suggest that constitutively active AR variants may lose transcriptional repression capacities due to the building of singular transcriptional regulatory complexes and that this may be linked to the lack of the ligand-binding domain and AF-2.
We next asked why constitutively active AR variants, which are linked to castration-resistant prostate cancer, the most aggressive stage of the disease, lose their repressive capacities. So, to get insight into key issues involved in AR transcriptional repressive activities in advanced prostate cancer, we decided to further focus on biological and/or molecular signatures associated with our lists of deregulated genes.

The repressive transcriptomic program of wild-type AR targets cell adhesion features in prostate cancer cells
We first used the GSEA tool to analyze hallmarks associated with deregulated genes for each experimental condition. The 10 most upregulated hallmark gene sets in the presence of constitutively active AR-Q641X and AR-V7 referred to 'myogenesis', 'androgen response', 'apical junction', and interestingly to 'epithelial-mesenchymal transition' function (Table 2). Similar hallmark gene sets were associated with transcriptional activities of ARv567es (Fig. S2). These hallmark gene sets associated with cell membrane and migration functions were supported by the high level of expression of FBLN5, TGM2, COL16A1, Tuberin (TSC2), Integrin Alpha-7, and RRAS (Ras-Related) in the presence of constitutively active AR variants (Fig. 1B). The above-mentioned hallmark gene sets associated with constitutively active AR variants were not gained in control cells, nor in the presence of DHT-activated AR-WT. For these two latter conditions, the hallmarks 'E2F targets', 'androgen response', 'MTORC1 signaling', 'G2M checkpoint', 'MYC targets', and 'unfolded protein response' were among the ten most significantly upregulated ones (Table 2). This was consistent with the role of DHT-activated AR in PCa cell proliferation after a period of hormone depletion and in the regulation of unfolded protein response pathways [49,50].
We next considered cellular and molecular functions of the panel of downregulated genes in the presence of DHT-activated wild-type AR. A Gene Ontology enrichment analysis revealed 'anatomical structure morphogenesis', 'cell adhesion', 'biological adhesion', 'regulation of neuron projection development', and 'cell morphogenesis' as the top five biological processes associated with the 395 downregulated genes in the presence of DHT-activated AR-WT (Fig. 5; Table S3). These results lead us to postulate that following DHT activation, AR-WT could ultimately trigger the repression of a panel of genes and that some of these genes would be involved in cell plasticity.
To strengthen our hypothesis, we further investigated the profile of transcriptional repression by full-length AR in available RNA-seq datasets from prostate cancer cells. We chose the GSE125014 and GSE148397 datasets, originating from LNCaP cells expressing the T878A mutant AR and from VCaP cells expressing a wild-type AR, respectively. LNCaP cells were stimulated with 10 nM DHT for 4 and 24 h, and VCaP cells were stimulated with 1 nM R1881 for 8 or 22 h. In order to compare our dataset to those of GSE125014 and GSE148397, all data files were processed with DESeq2 for normalization and identification of differentially expressed genes. As expected, AR-T878A in LNCaP cells and AR-WT in VCaP cells triggered gene repression following ligand stimulation in a time-dependent manner. The number of downregulated genes in LNCaP cells (P-value < 0.05 and |log2FC| > 1) was about 3 and 300 after 4 and 24 h of DHT treatment, respectively. In VCaP cells, significantly downregulated genes (P-value < 0.01 and |log2FC| > 2) were about 560 and 1260 after 8 and 22 h of R1881 stimulation, respectively (Fig. 6). We next used the GSEA tool to analyze hallmarks associated with deregulated genes for each experimental condition. A time-dependent change was observed for the top six downregulated hallmark gene sets, particularly for 'epithelial-mesenchymal transition' (Table 3).
These data indicate that gene repression by AR could significantly affect cell membrane and cell adhesion features. We further showed that when a panel of 64 AR-regulated genes involved in these biological processes and molecular functions was chosen, their level of expression was globally higher in the presence of enzalutamide, darolutamide, or constitutively active AR variants (Fig. 8), suggesting a similar transcriptional profile resulting from AR inhibition or from the expression of constitutively active AR.

Discussion
Androgen receptor (AR) is widely described as an androgen-dependent transcription factor that plays a critical role during the natural history of prostate cancer. AR contributes to the upregulation of key genes for prostate cancer progression [51][52][53]. However, few studies have focused on the transcriptional repressive function of AR. In this study, genomic activities of wild-type AR (AR-WT) were compared with those of AR-Q641X and AR-V7, two constitutively active AR variants that are associated with castration-resistant prostate cancer. We report here a duality in the repressive function of AR-WT and constitutively active AR variants. Indeed, compared with DHT-activated AR-WT, the number of repressed genes markedly dropped in the presence of AR-Q641X and AR-V7, suggesting that transcriptional repressive function by AR-WT could be pathologically lost in the context of constitutively active AR variants that are devoid of the ligand-binding domain and AF-2.
Androgen receptor genomic activity relies on different key steps including androgen binding, nuclear translocation, AR binding as homodimers to AREs localized in enhancers, superenhancers, introns, and/or promoters, recruitment of pioneer factors and cofactors for chromatin remodeling, and ultimately transcriptional control of target genes. While the different mechanisms that link AR to the upregulation of target genes have been widely described [7][8][9][52], molecular mechanisms associated with AR-repressive function are less studied. Also, at the cellular level, functional consequences of the AR-repressive role remain poorly studied. It has been reported that AR transcriptional repressive function requires DNA binding [54]. However, no consensus negative ARE has been associated with this AR-repressive function so far [48]. It was tempting to take advantage of our list of genes repressed in the presence of DHT-activated AR-WT to query publicly available AR ChIP-seq datasets for putative negative AREs. Indeed, the analysis of 500 bp genomic sequences around peak summits from GEO datasets GSE121021 and GSE148358 [31,55] with the MEME-ChIP program [56] revealed AR and FOXA1 motifs as the two most represented motifs among AR-binding sites for both the panel of down- and upregulated genes. This suggests that, as far as we can conclude from available AR cistrome datasets, gene repression by a full-length AR does not rely on binding to peculiar AREs. Besides, it has been reported that constitutively active AR variants display their own cistrome [57][58][59]. Consequently, it remains to be determined whether particular pioneer factor recruitment and distinctive chromatin conformation around downregulated genes could explain the duality between full-length AR and constitutively active AR for gene repression.
Androgen receptor transcriptional repressive function could also be associated with its ability to recruit repressive complexes that cause chromatin inaccessibility [48][60][61][62]. AR transcriptional activities can be negatively controlled by histone deacetylases (HDACs), such as HDAC1, HDAC2, or SIRT, and by corepressors such as NCoR/SMRT [4,63,64]. AR also interacts with the histone lysine methyltransferase EZH2, which catalyzes the repressive H3K27me3 mark [65]. Changes in the expression level of these key epigenetic regulators could potentially be a mechanism associated with the loss of repressive capacity of constitutively active AR variants. No such deregulation was evident in our data: the expression levels of AR corepressors mostly fell within the gray, nonsignificant area of the volcano plots showing the RNA-seq data for our four experimental conditions (Fig. 2). Differential coregulator recruitment is another mechanism that could be related to the decreased transcriptional repressive capacity observed with constitutively active AR variants. The C-terminal part of AR, which encompasses the LBD and AF-2, largely contributes to cofactor recruitment. The loss of the LBD and AF-2 in constitutively active AR variants could therefore affect corepressor recruitment. Using the BioID2 approach, which biotinylates proteins that interact directly or indirectly with the bait or lie within proximity (~10 nm) of it, with DHT-activated AR-WT, AR-Q641X, or AR-V7 as baits, we highlighted a lower recruitment of corepressors by constitutively active AR variants. Additional technologies and cellular models are required to investigate this property in more depth. At the cellular level, the functional consequences of full-length AR and constitutively active AR variant transcriptional activities have been relatively well described [25][66][67][68][69]. However, the cellular consequences of AR-repressive function in prostate cancer cells remain elusive.
Indeed, ontology analysis of genes downregulated in LNCaP cells 24 h after the addition of R1881 reveals only 'signal transducer activity' as the most represented functional category among R1881-downregulated genes [70]. Also in the LNCaP cell line model, Zhao et al. [65] suggest that AR-repressed genes are developmental regulators involved in cell differentiation. In PC3 prostate cancer cells, which do not express AR, the functional characterization of genes downregulated following transfection with a full-length wild-type AR includes GO terms involved in transport and cellular localization, and in general metabolic processes such as the tricarboxylic acid cycle, which, according to the authors, is consistent with a growth inhibition phenotype [71]. In the VCaP prostate cancer cell model, Gao et al. [72] used AR ChIP-seq and transcriptome profiling to identify genes required for DNA replication as highly enriched among androgen-repressed genes. In brief, few functional analyses of genes repressed by AR have been reported so far. Here, our data indicate that, at the cellular level, AR-WT repressive function could significantly target genes involved in cell adhesion, which was not the case with constitutively active AR variants.

Conclusions
Altogether, these observations support a model in which androgens and full-length AR signaling negatively regulate EMT in prostate epithelial cells. In contrast, constitutively active AR variants would pathologically upregulate EMT genes to promote tumor progression. We therefore believe that the transcriptional repressive program of the full-length wild-type AR in prostate cancer is a determinant of epithelial cell behavior and of the inhibition of tumor progression. Consequently, the systematic targeting of full-length AR in prostate cancer deserves attention.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Fig. S2. Analysis of ARv567es transcriptional activity in LNCaP cells. A doxycycline-inducible expression system (GEO dataset GSE125014) was used to analyze transcriptomic changes mediated by ARv567es in LNCaP cells. (A) Volcano plot representing the distribution of differential gene expression calculated between doxycycline-treated cells and vehicle-treated cells in the absence of androgen. Genes with adjusted P-value < 0.05 and |log2 FC| > 1 are shown in red (significantly down-regulated genes) and blue (significantly up-regulated genes). (B) Gene Set Enrichment Analysis of the LNCaP-ARv567es expressing cells data showing significant enrichment for "androgen response", "epithelial mesenchymal transition" and "apical junction" gene sets (NES: Normalized enrichment score). All enrichment scores have a nominal P-value = 0 and an FDR q-value < 0.005.
Fig. S3. Validation of transcriptional repression activity of AR in C4-2B cells by qPCR. The log2 fold change in gene expression was calculated between the four experimental conditions and the control (eGFP) cells treated with vehicle as reference. Bar graphs represent the mean of 3 biological repeats. Student's t-test was used to compare control, AR-WT, AR-Q641X or AR-V7 condition with the eGFP condition treated with vehicle. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, ns, non-significant.
Fig. S4. RNA-seq and mass spectrometry (MS) cross-analysis of AR-WT repressive activity. (A) Among the 395 down-regulated genes, only 14 were identified by MS. (B) Volcano plot representing the distribution of MS data and cross-analysis with RNA-seq. Proteins with adjusted P-value < 0.05 and |log2 FC| > 0.6 are shown in red (significantly under-represented proteins) and blue (significantly over-represented proteins). The numbers of differentially represented proteins are indicated below the plots. The 14 down-represented proteins at 24 and 48 h after DHT treatment are shown in green.
Fig. S5. Pipeline for RNA-seq/ChIP-seq intersection and motif analysis. AR ChIP-seq data available in narrowPeak file format were downloaded from the Gene Expression Omnibus (GEO) database. Sample GSM3424005 [55], referring to AR ChIP-seq from LNCaP cells cultured in complete medium, provides 21127 AR binding sites (peaks). Samples GSM4462682, GSM4462683 and GSM4462684 [31], corresponding to three replicates of VCaP cells treated with 1 nM R1881 for 22 h, were first subjected to the bedtools intersect function from the pybedtools library in Python 3 to identify 30941 AR peaks common to the three replicates. Then, the Genomic Regions Enrichment of Annotations Tool (GREAT version 4.0.4) program was used to associate the AR binding sites with putative target genes using the single nearest gene method and 1000 kb as the maximum extension (http://great.stanford.edu/public/html). Intersection of these ChIP-seq AR target genes with genes identified as differentially expressed in RNA-seq data provided a list of AR peaks associated with genes up-regulated or down-regulated by AR in LNCaP and VCaP cells. For motif analysis, sequences corresponding to AR binding sites (500 bp centered on the peak summit) were retrieved from the human reference genome (hg19/GRCh37) and submitted to the MEME-ChIP webserver (https://meme-suite.org/meme/tools/meme-chip).
Fig. S6. Motif analysis of AR peaks in LNCaP and VCaP cells. The first two DNA sequence motifs found by the MEME-ChIP program in AR binding sites associated with genes up-regulated and down-regulated by AR in LNCaP cells (A) and in VCaP cells (B).
Table S1. List of primers used for RT-qPCR.
Table S2. Significantly regulated proteins in MS analysis.
Table S3. Gene Set Enrichment Analysis down-regulated pathways.
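The replicate intersection step described for Fig. S5 can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the pipeline actually used: the caption does not specify which bedtools intersect options were applied, so the example assumes that a peak from the first replicate is retained when it overlaps at least one peak in each of the other replicates. The names `common_peaks` and `_overlaps_any` are hypothetical, and intervals are assumed to be 0-based and half-open, as in BED files.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# (chromosome, start, end), 0-based half-open coordinates as in BED files
Peak = Tuple[str, int, int]


def _index_by_chrom(peaks: Sequence[Peak]) -> Dict[str, List[Tuple[int, int]]]:
    """Group peak intervals by chromosome for faster lookups."""
    index: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in peaks:
        index[chrom].append((start, end))
    return index


def _overlaps_any(peak: Peak, index: Dict[str, List[Tuple[int, int]]]) -> bool:
    """True if `peak` shares at least 1 bp with any interval in `index`."""
    chrom, start, end = peak
    return any(start < other_end and other_start < end
               for other_start, other_end in index.get(chrom, []))


def common_peaks(reference: Sequence[Peak],
                 others: Sequence[Sequence[Peak]]) -> List[Peak]:
    """Keep reference peaks that overlap a peak in every other replicate.

    Assumption: this mimics reporting each reference peak once when an
    overlap exists; the authors' exact intersect settings are not stated.
    """
    indexes = [_index_by_chrom(replicate) for replicate in others]
    return [peak for peak in reference
            if all(_overlaps_any(peak, index) for index in indexes)]
```

This simple scan compares every reference peak with all peaks on the same chromosome, which is adequate for a small example; for tens of thousands of peaks per replicate, a dedicated interval library such as the one named in the caption would be the more practical choice.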