The androgen receptor (AR) initiates important developmental and oncogenic transcriptional pathways. The AR is known to bind as a homodimer to 15-base pair bipartite palindromic androgen-response elements; however, few direct AR gene targets are known. To identify AR promoter targets, we used chromatin immunoprecipitation with on-chip detection of genomic fragments. We identified 1,532 potential AR-binding sites, including previously known AR gene targets. Many of the new AR target genes show altered expression in prostate cancer. Analysis of sequences underlying AR-binding sites showed that more than 50% of AR-binding sites did not contain the established 15 bp AR-binding element. Unbiased sequence analysis showed 6-bp motifs, which were significantly enriched and were bound directly by the AR in vitro. Binding sequences for the avian erythroblastosis virus E26 homologue (ETS) transcription factor family were also highly enriched, and we uncovered an interaction between the AR and ETS1 at a subset of AR promoter targets.
The transcriptional function of the androgen receptor (AR) is essential for normal male sexual development and drives the onset, and subsequent progression, of prostate cancer (PrCa; Chen et al, 2004; Notini et al, 2005). PrCa is the most common solid malignancy in men in the EU, and resulted in more than 85,000 deaths in 2004 alone (Boyle & Ferlay, 2005). Many of the current biomarkers of PrCa are androgen-regulated genes, including prostate-specific antigen (human glandular kallikrein (KLK3)/PSA), illustrating the enhanced AR activity in PrCa. Androgen ablation is an effective first-line therapy for the treatment of advanced PrCa; however, recurrence is common and is associated with androgen independence (Scher et al, 1997). Despite loss of response to anti-androgens, most advanced PrCa express AR and have an active AR signalling cascade (van der Kwast et al, 1991). Cell line models also suggest that the AR is required for androgen-independent PrCa cell growth (Haag et al, 2005). It is important to identify the main pathways downstream of AR transactivation that might mediate the ‘androgen-independent’ function of the AR. These pathways are likely to contribute to the growth and progression of PrCa, and might include new candidate biomarkers or future therapeutic targets.
The activated AR is known to bind as a homodimer to androgen-response elements (AREs), which consist of two 6-base pair ‘half-sites’ arranged as inverted or direct repeats separated by 3 bp. The in vitro-derived (SELEX) consensus sequence for the AR was found to be the inverted repeat 5′-AGAACAnnnTGTACC-3′ (Roche et al, 1992). However, sequence alignment of known AR genomic binding sequences reveals both inverted repeat and direct repeat consensus sequences (Verrijdt et al, 2003). This degeneracy of functional AREs, and also the divergence from the in vitro-derived binding sequence, make computational prediction of AR-binding sites in the human genome problematic. Alternative approaches have therefore been taken to identify AR-regulated genes. Several studies have used expression microarray techniques to identify AR transcriptional targets (Waghray et al, 2001; Chen et al, 2004; Velasco et al, 2004; Haag et al, 2005). Although these approaches have successfully identified androgen-regulated transcriptional events, they cannot discern direct gene targets of the AR from secondary transcriptional events. Therefore, such expression studies neither give insights into the genomic sequences that are bound by the AR nor identify the initiating signals that ultimately produce the large number of downstream androgen-regulated transcriptional events. To identify direct transcriptional targets of the AR, we used chromatin immunoprecipitation (ChIP) with on-array detection (ChIP-chip) in the androgen-responsive LNCaP PrCa cell line.
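The difficulty of predicting AR-binding sites from the consensus alone can be illustrated with a minimal sketch. The code below scans both strands of a sequence for the SELEX-derived consensus quoted above, treating the three spacer positions as wildcards and allowing a user-chosen number of mismatches. The function names, the mismatch parameter and the example sequence are assumptions made for illustration; this is not the search procedure used in the study.

```python
# Consensus quoted in the text; lower-case n marks the 3-bp spacer (any base).
CONSENSUS = "AGAACAnnnTGTACC"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_mismatches(window, pattern):
    """Count positions where window differs from pattern, ignoring N positions."""
    return sum(
        1
        for base, ref in zip(window, pattern)
        if ref != "N" and base != ref
    )


def find_are_like(seq, pattern=CONSENSUS, max_mismatches=0):
    """Scan both strands for windows within max_mismatches of the pattern.

    Returns (strand, start, window, mismatches) tuples. For the "-" strand,
    start is the offset within the reverse-complemented sequence.
    """
    seq = seq.upper()
    pattern = pattern.upper()
    width = len(pattern)
    hits = []
    for strand, target in (("+", seq), ("-", reverse_complement(seq))):
        for start in range(len(target) - width + 1):
            window = target[start:start + width]
            mismatches = count_mismatches(window, pattern)
            if mismatches <= max_mismatches:
                hits.append((strand, start, window, mismatches))
    return hits


# Hypothetical test sequence containing one exact consensus match.
example_hits = find_are_like("TTTAGAACAGCTTGTACCGG")
```

Raising `max_mismatches` quickly increases the number of matching windows, which is one way to see why degenerate AREs are hard to predict by sequence alone.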
Results and Discussion
AR ChIP-chip analysis
LNCaP cells were androgen deprived before stimulation with a synthetic androgen (R1881) or vehicle (ethanol). DNA enriched by ChIP with an AR antibody was labelled and hybridized to an array with tiling coverage of 24,275 gene promoter regions. Data from biological replicate AR ChIP-chip experiments were filtered by signal intensity (>twofold increase in androgen versus vehicle), reproducibility (Wilcoxon P-value ⩽0.01) and probe coverage (>four probes contributing to these scores; see the supplementary information online for details). This analysis identified 1,532 promoter regions as potential AR-binding sites, each more than twofold enriched in androgen-stimulated cells compared with vehicle-treated cells (Fig 1A; see supplementary Table 1 and supplementary information online).
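The three filters described above can be sketched as a simple predicate over per-promoter summary scores. The record layout, field names and example values below are hypothetical, and the fold enrichment is assumed to be a linear androgen-to-vehicle ratio; the actual scores were computed with limma as described in the supplementary information.

```python
from dataclasses import dataclass


@dataclass
class PromoterScore:
    """Per-promoter summary (hypothetical layout, not the study's data format)."""
    promoter: str
    fold_enrichment: float  # androgen / vehicle ratio, linear scale (assumed)
    wilcoxon_p: float       # reproducibility P-value across replicates
    n_probes: int           # probes contributing to the scores


def passes_filters(score, min_fold=2.0, max_p=0.01, min_probes=4):
    """Apply the thresholds stated in the text: >twofold, P <= 0.01, >four probes."""
    return (
        score.fold_enrichment > min_fold
        and score.wilcoxon_p <= max_p
        and score.n_probes > min_probes
    )


def candidate_sites(scores):
    """Return the names of promoters that pass all three filters."""
    return [s.promoter for s in scores if passes_filters(s)]


# Made-up values chosen so that each record fails a different filter.
example_scores = [
    PromoterScore("promoter_1", 3.1, 0.004, 9),   # passes all filters
    PromoterScore("promoter_2", 1.8, 0.001, 12),  # fold enrichment too low
    PromoterScore("promoter_3", 2.6, 0.050, 10),  # not reproducible
    PromoterScore("promoter_4", 4.0, 0.002, 4),   # too few probes
]
example_candidates = candidate_sites(example_scores)
```

Keeping the thresholds as keyword arguments makes it easy to see how the candidate list changes when any single cut-off is relaxed.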
The thresholds for AR ChIP-chip allowed detection of 15 known direct AR target genes identified using literature searches (supplementary Table 2 online). These 15 known AR targets show a range of AR ChIP-chip enrichment values. For example, the well-studied AR targets in the carbonic anhydrase 3, PSA and pepsinogen C promoters had enrichment scores of 2.26, 2.51 and 4.46, respectively, suggesting that direct AR targets were present even among the lower scoring candidates. There are, at present, few examples in the literature of direct AR gene targets; therefore, to identify a larger number of androgen-responsive genes in our AR ChIP-chip data, we compared our data with gene lists from a published meta-analysis of six gene expression data sets, all of which used the LNCaP cell line to identify androgen-regulated genes (Velasco et al, 2004; see the supplementary information online for details). We identified 92 genes that were enriched by AR ChIP-chip and were also androgen regulated in one or more expression data sets (supplementary Table 3 online), making them strong candidates for direct transcriptional regulation by the AR in response to androgen stimulation (see the supplementary information online).
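The comparison with expression data amounts to intersecting the ChIP-chip gene list with the union of the androgen-regulated gene lists. A minimal sketch follows; the gene and study names are placeholders rather than data from the study, and the function name is an assumption.

```python
def candidate_direct_targets(chip_enriched, expression_sets):
    """Map each ChIP-enriched gene to the expression data sets that report it.

    chip_enriched: iterable of gene names enriched by ChIP-chip.
    expression_sets: dict of data set name -> set of androgen-regulated genes.
    Only genes regulated in one or more data sets are returned.
    """
    regulated_in_any = set().union(*expression_sets.values())
    overlap = set(chip_enriched) & regulated_in_any
    return {
        gene: sorted(
            name for name, genes in expression_sets.items() if gene in genes
        )
        for gene in sorted(overlap)
    }


# Placeholder identifiers only; no real gene assignments are implied.
example_chip = {"GENE_A", "GENE_B", "GENE_C"}
example_expression = {
    "study_1": {"GENE_A", "GENE_X"},
    "study_2": {"GENE_A", "GENE_C"},
}
example_overlap = candidate_direct_targets(example_chip, example_expression)
```

Recording which data sets support each gene, rather than only the overlap itself, makes it possible to rank candidates by the number of independent studies that report them.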
To validate our AR ChIP-chip promoter targets, a set of 26 gene promoters with a range of enrichment and significance scores were assessed by independent AR-ChIP and quantitative PCR (Fig 1A,B). Three known AR-binding sites in the diazepam-binding inhibitor, KLK2 and KLK3 promoters (Young et al, 1992; Murtha et al, 1993), and 23 new, randomly selected gene promoters were assessed. There was an increase in AR binding at 23 promoters, after 1 h of androgen exposure, suggesting a false discovery rate of approximately 13% in the AR ChIP-chip data (Fig 1B). AR recruitment was also assessed in the DUCaP PrCa cell line at eight candidate promoters. Six of these promoters were enriched in androgen-treated DUCaP cells (Fig 1C), suggesting that many of the AR-binding sites identified by ChIP-chip in LNCaP cells might be common AR targets in PrCa cells.
AR target genes
Gene Ontology analysis of the 1,532 genes associated with AR ChIP-chip-enriched promoters showed AR target genes to be involved in protein synthesis, development, secretion, apoptosis and transcription (Fig 2A; supplementary Table 4 online). To assess the clinical relevance of these AR ChIP-chip target genes, we retrieved expression values for these genes from publicly available clinical PrCa expression array data sets (Holzbeierlein et al, 2004; Varambally et al, 2005; Tomlins et al, 2007). Interestingly, we found genes that were overexpressed in primary PrCa and metastatic PrCa compared with benign prostate epithelia, and also genes that were downregulated after androgen ablation therapy (Fig 2A; supplementary Table 4 online). Further comparisons with published expression array gene lists from cell line experiments showed subsets of AR target genes upregulated in response to androgen treatment and genes that were downregulated after RNA interference (RNAi) ‘knock-down’ of the AR in LNCaP cells (Fig 2A; supplementary Table 4 online; Velasco et al, 2004; Haag et al, 2005).
The 92 AR target genes, which were also found to be androgen regulated in LNCaP cells, represent the strongest candidates for direct AR transcriptional targets (Fig 2A; supplementary Table 3 online). Several of these androgen-regulated AR gene targets were overexpressed in primary PrCa samples compared with benign samples in two independent clinical expression array data sets (Fig 2B,E; Varambally et al, 2005; Tomlins et al, 2007). AR target genes with Gene Ontology annotations for development and transcription had increased expression in primary PrCa and a subset was also upregulated in metastatic PrCa (Fig 2C,D). The upregulation of AR targets with transcriptional annotations suggests complex transcriptional changes downstream of the AR that might be activated in PrCa. These analyses show that many of the direct AR target genes identified by ChIP-chip analysis have cancer-related functional annotations, are upregulated in clinical PrCa and might therefore have potential as biomarkers or future therapeutic targets.
Binding site analysis
We examined the 1,532 AR-binding sites identified by ChIP-chip for the presence of ARE-like sequences, to determine the preferred binding sequences of the AR. Although the occurrence of 15-bp ARE sequences was enriched in the AR promoter targets compared with non-candidate promoters (χ² test, P<2 × 10⁻⁶), only 410 (26.8%) of the 1,532 AR promoter-binding sites contained sequences that resembled the established 15-bp AREs (Fig 3A,B). The AR ChIP targets, which were identified as androgen-regulated genes in published expression array data, had a similar occurrence of AREs (26 of 92 genes, 28%), suggesting that functional AR target genes might lack the established 15-bp AR-binding sequence.
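The enrichment test reported above compares motif occurrence in candidate and non-candidate promoters, which can be framed as a 2 × 2 contingency table. The sketch below implements a Pearson χ² test with one degree of freedom and no continuity correction, which is an assumption, because the exact variant used is not stated here. The candidate counts (410 with an ARE, 1,122 without) follow from the text; the non-candidate counts are hypothetical placeholders, so the resulting statistic is not expected to reproduce the published P-value.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom, without
    continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Candidate promoters: counts taken from the text.
candidates_with_are = 410
candidates_without_are = 1532 - 410

# Non-candidate promoters: HYPOTHETICAL counts for illustration only.
# The total assumes every other promoter on the 24,275-promoter array
# is a non-candidate; the split between the two cells is made up.
noncandidates_with_are = 5000
noncandidates_without_are = (24275 - 1532) - 5000

statistic, p_value = chi_square_2x2(
    candidates_with_are,
    candidates_without_are,
    noncandidates_with_are,
    noncandidates_without_are,
)
percent_with_are = round(100 * candidates_with_are / 1532, 1)
```

The same function applies to the other enrichment and co-occurrence comparisons in this section once the corresponding four counts are available.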
To identify conserved sequences in the AR-bound promoters in an unbiased manner, we used the Nested Motif Independent Component Analysis (MICA) motif recognition software (Down & Hubbard, 2005). Nested MICA analysis using a ‘training set’ of 225 AR ChIP-chip-enriched sequences did not identify any over-represented sequences that resembled the established 15-bp ARE sequence. However, several frequently occurring and highly constrained motifs were present among the Nested MICA searches for enriched 6-bp motifs (supplementary Fig S1A,B online). Among the most frequently occurring non-repetitive motifs were two that resembled one-half of the 15-bp ARE sequences (motifs 2 and 6; Fig 3A,C). The more constrained motif 2 AR ‘half-site’, which aligned to one-half of the 15-bp ARE sequence (Fig 3A,C), was used for subsequent sequence analysis. This 6-bp AR ‘half-site’ occurred in 1,212 (79.2%) of the AR candidate promoter sequences, including 876 (57.2%) of the AR ChIP-chip sequences that did not contain a 15-bp ARE sequence (Fig 3D), and was enriched in the AR ChIP-chip candidate promoters compared with non-candidate promoters present on the array (χ² test, P<2 × 10⁻⁵).
We used an in vitro oligonucleotide pull-down assay to examine AR binding to the 6-bp AR ‘half-site’. As a positive control, the AR was shown to bind to the KLK2 promoter ARE sequence more strongly than a scrambled 15-bp oligonucleotide (Fig 3E,F). The AR also bound specifically to the 6-bp AR ‘half-site’ sequence from the UNQ9419 promoter, but not a scrambled control oligonucleotide (Fig 3E,F), showing that the AR can bind directly to these 6-bp ‘half-sites’. To examine AR recruitment to in vivo 6-bp ‘half-sites’, we used AR ChIP and quantitative PCR for the UNQ9419 promoter region, which lacks a 15-bp ARE sequence (Fig 3G,H). ChIP analysis showed androgen-dependent recruitment of the AR to the UNQ9419 promoter at a level similar to that of the KLK2 promoter (Fig 3I,K). An androgen treatment time course showed transient UNQ9419 upregulation and sustained upregulation of KLK2 expression (Fig 3J,L). These data show that the AR can bind directly to 6-bp ‘half-sites’ and that AR transactivation might occur in genes adjacent to these 6-bp AR-binding sites. Direct AR binding to 6-bp ‘half-sites’ raises important questions about our understanding of AR biology and further work will be required to determine the mechanism by which AR is recruited to these ‘half-sites’ (see supplementary Fig S1C online for potential models).
AR interacts with the ETS1 transcription factor
Further sequence analysis using the Genomatix Matbase program (Cartharius et al, 2005) and Nested MICA identified frequent consensus binding sequences for the avian erythroblastosis virus E26 homologue (ETS) family of transcription factors (Fig 4A; see the supplementary information online). The ETS-like Nested MICA motif 9 and AR 6-bp ‘half-site’ co-occurred in 1,073 (70%) of the AR-enriched promoters (χ² test, P<2 × 10⁻⁸), suggesting that there might be co-recruitment of ETS transcription factors and the AR to a subset of promoters. To investigate further the association of AR and ETS, we selected the ETS1 transcription factor as a candidate, as the ETS1-binding site was among the most common ETS sequence motifs found in the AR ChIP-chip promoters (as identified by Genomatix Matbase) and ETS1 was recently reported to be overexpressed in PrCa (Alipov et al, 2005).
ChIP with an ETS1 antibody showed that ETS1 was associated with a subset of AR-bound promoters (Fig 4B,C). The CCNG2 and UNQ9419 promoters contain predicted ETS1-binding sites (Fig 4B) and showed androgen-dependent recruitment of both the AR and ETS1 by ChIP (Figs 1B, 4C). However, the PRAME promoter does not contain a predicted ETS1-binding site (Fig 4B) and was not enriched by ETS1 ChIP (Fig 4C). Knockdown of endogenous ETS1 by RNAi resulted in a matched reduction in CCNG2 and UNQ9419 transcripts, but did not affect the level of PRAME transcript (Fig 4D). Conversely, ETS1 overexpression resulted in enhanced transcription of CCNG2 and UNQ9419, but not PRAME (Fig 4E). Confocal microscopy showed that endogenous ETS1 was located throughout LNCaP cells grown in steroid-depleted media and was redistributed to the nucleus together with transfected AR–green fluorescent protein (GFP; Fig 4F) or endogenous AR (Fig 4G) in LNCaP cells on androgen treatment. In an AR luciferase reporter assay, ETS1 transfection enhanced AR transactivation in an AR- and androgen-dependent manner (supplementary Fig S2F online). The AR was co-immunoprecipitated using an ETS1 antibody, suggesting a direct interaction between the AR and ETS1 in LNCaP cells (supplementary Fig S2D online). These data indicate androgen-dependent recruitment of ETS1 to a subset of AR promoter targets and also that endogenous ETS1 is required for expression of these AR target genes (see the supplementary information online).
Recruitment of the AR to 6-bp ‘half-sites’ raises the question of how the AR is selectively recruited to its targets in large mammalian genomes. This might occur by co-operative binding with other factors, such as ETS1. As ETS1 is recruited to AR promoter targets in response to androgens, it is unlikely that ETS1 acts as a ‘pioneer factor’ for the AR, as has been reported for FOXA1 and the estrogen receptor (Carroll et al, 2005; Laganiere et al, 2005). The implication that ETS transcription factors transactivate the AR might have an impact on our understanding of the functional effects of TMPRSS2-ETS gene fusions given the prevalence of this rearrangement in PrCa (Tomlins et al, 2005).
Methods
Cell culture and transfection. LNCaP and DUCaP cells were grown in RPMI (Invitrogen, Paisley, UK) media supplemented with 10% FBS. Cells were transfected with the GeneJuice transfection reagent (Merck, Nottingham, UK), according to the manufacturer's instructions. The pSG5–ETS1 expression construct was a kind gift from Dr A. Bègue (Baillat et al, 2002).
Chromatin immunoprecipitation. Cells were grown to 70–80% confluence in phenol-red-free RPMI (Invitrogen) supplemented with 10% charcoal-stripped FBS (HyClone, Logan, UT, USA) for 48 h before stimulation with 1 × 10⁻⁹ M R1881 (synthetic androgen) or an equal volume of ethanol for 1 h. ChIP was carried out as previously described, using 5 μg of AR (N20; Santa Cruz Biotechnology, Santa Cruz, CA, USA) or ETS1 (C20; Santa Cruz Biotechnology) antibodies (see the supplementary information online for details; Oberley et al, 2003).
ChIP-chip. Biological replicate hybridizations were carried out for AR ChIP from LNCaP cells stimulated for 1 h with R1881 or ethanol (vehicle), co-hybridizing with labelled total genomic DNA to NimbleGen Systems (Madison, WI, USA) 1.5 kb promoter arrays. Raw data for AR ChIP-chip experiments are available through ArrayExpress (accession E-TABM-233, http://www.ebi.ac.uk/arrayexpress/). Array analysis was carried out using the limma package in the R statistical software (see the supplementary information online for details).
Sequence analysis. Nested MICA motif recognition software (v0.7.2) was used to identify conserved sequence motifs in a ‘training’ set of 225 highly enriched AR target sequences (see supplementary Table 5 and supplementary information online for details). Searches for the AR, ETS1 and the conserved 6 bp Nested MICA sequence motifs were carried out on all 1,532 AR ChIP-chip targets, using the position weight matrices shown in supplementary Table 6 online (see the supplementary information online for details).
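Scanning sequences with a position weight matrix can be sketched as follows. The code converts per-position base counts into log-odds scores against a uniform background and reports windows scoring at or above a threshold. The count matrix, pseudocount, background frequency and threshold are toy values chosen for illustration; they are not the matrices from supplementary Table 6, and this sketch scans one strand only.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds scores.

    counts: list of dicts, one per motif position, mapping each base to a count.
    """
    matrix = []
    for column in counts:
        total = sum(column[base] for base in BASES) + 4 * pseudocount
        matrix.append(
            {
                base: math.log2(((column[base] + pseudocount) / total) / background)
                for base in BASES
            }
        )
    return matrix


def scan(seq, matrix, threshold):
    """Return (start, window, score) for forward-strand windows scoring >= threshold."""
    seq = seq.upper()
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


def toy_counts(consensus, n=10):
    """Build a TOY count matrix in which every site matches the consensus."""
    return [
        {base: (n if base == ref else 0) for base in BASES}
        for ref in consensus
    ]


# Toy 6-bp matrix built from one half of the consensus quoted in the Introduction.
toy_matrix = log_odds_matrix(toy_counts("AGAACA"))
toy_hits = scan("CCAGAACACC", toy_matrix, threshold=9.0)
```

In practice the threshold controls the trade-off between sensitivity and specificity, and a reverse-complement scan would be needed to cover both strands.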
Quantitative real-time PCR. Real-time quantitative PCRs were carried out in an ABI Prism 7900, using SYBRgreen PCR master mix (Applied Biosystems, Warrington, UK). Reactions were carried out in triplicate and with biological replicates. Primers are shown in supplementary Table 7 online.
Immunoprecipitation. LNCaP cell lysates were incubated with AR (N20, Santa Cruz), ETS1 (C20, Santa Cruz) or control rabbit IgG antibodies. Immune complexes were isolated on protein-A/G beads, washed four times with RIPA buffer and resuspended in sample loading buffer, before western blotting for AR (N20, Santa Cruz).
Immunofluorescence. LNCaP cells on coverslips were treated, transfected and stained for AR (AR-N20, Santa Cruz) and ETS1 (C20, Santa Cruz or 1G11, Novocastra Laboratories, Newcastle upon Tyne, UK) as indicated (see the supplementary information online for details).
C.E.M. was funded by a Cancer Research UK programme grant awarded to D.E.N. I.G.M. is supported by Cancer Research UK. N.L.B.-M. and A.G.L. were funded by a Cancer Research UK programme grant awarded to Simon Tavaré. B.A. was supported by an EMBO Longterm Fellowship.