The androgen receptor gene mutations database: 2012 update


  • Bruce Gottlieb,

    Corresponding author
    1. Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
    2. Department of Human Genetics, McGill University, Montreal, Quebec, Canada
    • Lady Davis Institute for Medical Research, Jewish General Hospital, 3755 Cote Ste. Catherine Road, Montreal, Quebec H3T 2E1, Canada.
  • Lenore K. Beitel,

    1. Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
    2. Department of Human Genetics, McGill University, Montreal, Quebec, Canada
    3. Department of Medicine, McGill University, Montreal, Quebec, Canada
  • Abbesha Nadarajah,

    1. Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
  • Miltiadis Paliouras,

    1. Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
  • Mark Trifiro

    1. Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
    2. Department of Human Genetics, McGill University, Montreal, Quebec, Canada
    3. Department of Medicine, McGill University, Montreal, Quebec, Canada

  • Communicated by John McVey


The current version of the androgen receptor gene (AR) mutations database is described. A major change to the database is that the nomenclature and numbering scheme now conform to all Human Genome Variation Society norms. The total number of reported mutations has risen from 605 to 1,029 since 2004. The database now contains a number of mutations that are associated with prostate cancer (CaP) treatment regimens, while the number of AR mutations found in CaP tissues has more than doubled from 76 to 159. In addition, in a number of androgen insensitivity syndrome (AIS) and CaP cases, multiple mutations have been found within the same tissue samples. For the first time, we report on a disconnect within the AIS phenotype–genotype relationship in our own patient database, in that over 40% of our patients with a classic complete AIS or partial AIS phenotype did not appear to have a mutation in their AR gene. The implications of this phenomenon for future locus-specific mutation database (LSDB) development are discussed, together with the concept that mutations can be associated with both loss- and gain-of-function, and the effect of multiple AR mutations within individuals. The database is available on the internet, and a web-based LSDB with the variants, built on the Leiden Open Variation Database platform, is also available. Hum Mutat 33:887–894, 2012. © 2012 Wiley Periodicals, Inc.


In a commentary in Nature, Hayden (2009) suggested that, in light of the so far disappointing ability of genome-wide association studies to find genes that make a significant contribution to common diseases such as cancer, researchers could utilize available resources more effectively by using next-generation sequencing (NGS) to identify genes responsible for rare disorders, presumably caused by single genes. Therefore, the over 700 locus-specific genetic databases (LSDBs) now listed on the Human Genome Variation Society (HGVS) Website, which include the androgen receptor gene (AR) mutation database (ARDB), are likely to be supplemented by many more. Of more importance for genetic databases, however, is that such NGS studies are likely to alter both the nature and the significance of the information in LSDBs.

As we have noted [Gottlieb et al., 2004], the effectiveness of all LSDBs has been based on a number of assumptions. First, that a particular mutation is indeed responsible for an altered phenotype, with few LSDBs actually making an effort to record whether the pathogenicity of specific mutations has been proven. Such data, however, will likely become increasingly important in light of the enormous variation being found within the human genome, so much so, that it has become increasingly difficult to determine a definitive human genome reference sequence [Gottlieb et al., 2010]. In the ARDB, the serendipitous discovery of five normal individuals that have variant AR (i.e., p.S207R, p.P517S, p.S598R, p.R727L, and p.E794D) may further challenge this critical genetic principle, in that a definitive wild-type reference sequence for the AR may actually not exist.

Nevertheless, we have taken the important step of changing the nucleotide and amino acid numbering schemes, as well as the nomenclature, so that the database now conforms to all of the current HGVS standards.

Second, it is assumed that almost all AR mutations have been inherited, i.e., are germline, and therefore will be excellent predictors of the inheritance pattern of the disease phenotype. However, there are, as illustrated by the growing number of AR mutations found in prostate cancer (CaP) tissue, an increasing number of AR mutations that are somatic rather than germline in origin. The third assumption is that a clear distinction exists between mutations and polymorphisms (defined as gene alterations that occur in >1% of a population). However, sequencing an increasing number of whole human genomes has resulted in the discovery of millions of single-nucleotide polymorphisms and other polymorphisms, thousands of which are exonic, which in many instances have been found to be far from benign. Finally, it has been assumed that if an individual exhibits a disease phenotype, then it will almost always be possible to find mutations in the specific genes that are the recognized genetic cause of that disease or phenotypic condition. However, we highlight in this article that a substantial number of individuals possessing a typical androgen insensitivity syndrome (AIS) phenotype do not appear to have any mutations in the sequenced regions of their AR. These observations suggest that the clinical value of the ARDB might not be as great as initially thought, and its usefulness may be further complicated by the fact that specific AR mutations have been found to be associated with different phenotypes, including cases wherein the same mutation can appear to be the cause of both loss-of-function (androgen insensitivity) and gain-of-function (CaP) phenotypes.

All of these challenges to the basic assumptions behind LSDBs have made many databases, in their present form, less than ideal for determining the significance of specific gene alterations as causal agents of particular diseases and conditions. The ARDB now contains many of these same mutational conundrums and issues. In this article, we report the changes in data that have been recorded in the ARDB over the past 7 years, and the challenges that these changes pose to the future relevance and usefulness not just of the ARDB, but of all LSDBs.

Database Information

Loss-of-Function Mutations

Androgen insensitivity syndrome

The AR (MIM# 313700) is a member of the superfamily of nuclear receptors that function as ligand-dependent transcription factors. Intracellular AR is essential for androgen action, whether of testosterone or of its 5α-reduced derivative (5α-dihydrotestosterone). Hence, the AR is essential for normal primary male sexual development before birth (masculinization), and for normal secondary male sexual development around puberty (virilization). AR dysfunctions in XY individuals result in AIS (MIM# 300068).

The present version of the ARDB is now based on the NCBI reference sequence NM_000044.2. This differs from the original numbering scheme, used over the past 20 years, which was based on GenBank mRNA sequence M20132.1 [Lubahn et al., 1988]. A note has been added to the ARDB homepage that clearly explains the differences in the nucleotide and amino acid numbering, so that researchers can correctly identify mutations when looking up references that report AR mutations, as almost all have used the previous numbering scheme. These differences are: (1) the open reading frame starts at nucleotide 1,116 instead of 363 and (2) the variable polyglutamine tract length is two longer (23 instead of 21), whereas the variable polyglycine tract length is one shorter (23 instead of 24) for NM_000044.2 versus M20132.1, respectively. As a result, the AR of the new reference sequence is one amino acid longer, that is, 920 aa, leading to a +1 shift in mutation numbering compared with most previously published mutations in the DNA-binding domain (DBD) and ligand-binding domain (LBD). The authors have also completed an update of the ARDB to ensure it conforms to the present HGVS standard reporting nomenclature. To further increase the usefulness of the ARDB, in addition to fully and partially searchable versions in FileMaker Pro and PDF formats, a version is now also available in Excel, and the data have been deposited into the Leiden Open Variation Database [Fokkema et al., 2011].
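The renumbering described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated in the text (polyglutamine tract +2, polyglycine tract −1, net +1 downstream of both tracts), not the procedure used by the curators. The function name `legacy_to_refseq` and the tract boundary arguments are hypothetical; the boundary values in the usage line are placeholders that would need to be checked against the two sequences.

```python
# Shifts stated in the text for NM_000044.2 relative to M20132.1.
POLYGLN_SHIFT = 2    # polyglutamine tract is two residues longer
POLYGLY_SHIFT = -1   # polyglycine tract is one residue shorter


def legacy_to_refseq(pos: int, polygln_end: int, polygly_end: int) -> int:
    """Map an M20132.1-based residue number to NM_000044.2 numbering.

    polygln_end and polygly_end are the last legacy residue numbers of the
    two variable tracts. They are caller-supplied assumptions, not values
    taken from the article. Residues inside a tract are left unshifted by
    that tract, which is a simplification.
    """
    if pos < 1:
        raise ValueError("residue numbers start at 1")
    shift = 0
    if pos > polygln_end:
        shift += POLYGLN_SHIFT
    if pos > polygly_end:
        shift += POLYGLY_SHIFT
    return pos + shift


# Placeholder tract boundaries, for illustration only.
Q_END, G_END = 80, 475
print(legacy_to_refseq(877, Q_END, G_END))  # 878: net +1 in the LBD
```

Any residue downstream of both tracts, which includes the whole DBD and LBD, receives the net +1 shift regardless of the exact boundary values supplied.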

The ARDB now contains over 800 entries of mutations causing AIS, representing over 500 different AR mutations from more than 850 patients with AIS. There has been a large increase in the number of reported AR mutations since the last published report on the database [Gottlieb et al., 2004], with the number of entries rising by about 70%, from 605 to 1,029 (as of September 1, 2011). This has been partly attributed to the ease of sequencing the AR, but the increase might easily have been even larger, because many mutations are not reported in the literature unless they are unique. Furthermore, even unique mutations may not be reported, as they may have been found strictly in a clinical setting, as a result of sequencing of blood solely to establish a diagnosis of AIS. Until now, it has been the policy of the curator not to accept submissions to the ARDB unless they have been accepted for publication, to ensure adequate quality control of the data. However, because it has become increasingly less likely that new AR mutations will be published, the home page explains that this policy has been changed. This will allow unpublished AR mutations to be included in the ARDB, provided that the curator is satisfied that the sequencing has come from qualified research or clinical laboratories. To ensure quality control, the clinical laboratories will be required to be accredited by a recognized body such as the College of Pathologists, or AABB (formerly the American Association of Blood Banks), or in Europe to be certified by EuroGeneTest as meeting the ISO 15189 standards.

While there is still an unequal distribution of the mutations along the length of the exonic regions of the AR, new mutations are increasingly being reported that fill in areas where mutations have not been previously reported. It is also apparent that the types of mutations differ along the length of the AR. While it is still true that nearly all AIS mutations in exon 1 appear to cause complete AIS (CAIS; 89 out of 124; Table 1), since 2004 there has been an increase in mild AIS (MAIS) mutants reported (from seven to 22), which are solely due to substitution mutations. Why a significant number of missense mutations between aa 214 and 511 should cause MAIS remains a mystery, but it does suggest that missense mutations in this part of exon 1 have a mild effect on AR function. However, while the total number of exon 1 mutations (excluding those associated with CaP) has more than doubled from 54 to 124, this still represents only about 25% of the total loss-of-function mutations, despite the fact that exon 1 encodes more than half of the AR protein [Gottlieb et al., 1999]. It should be noted that this increase might partially be a reflection of the increasing ease of sequencing exon 1. What has not changed, however, is that very few AR mutants (24) have been reported in splicing and untranslated regions of the AR (Table 1). In the C-terminal LBD, there is a striking preponderance of single-base substitution mutations, although since 2004 there has been a slight narrowing of the ratio of CAIS (144) to partial AIS (PAIS) and MAIS (110) substitution mutations (1.3/1) compared with 1.4/1 in 2004 [Gottlieb et al., 2004].

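Table 1. Nature and Distribution of Unique AR Mutations that Cause Disease

Loss-of-function disease

| Phenotype | Type of mutation | N-terminal domain (a) | DNA-binding domain (b) | Hinge region (c) | Ligand-binding domain (d) | Splice site | Intron |
|---|---|---|---|---|---|---|---|
| CAIS | Single-base substitution | 8 | 28 | | 121 | 15 | 1 |
| | Premature termination | 35 | 4 | 1 | 18 | | |
| | Complete gene deletion | 4 (e) | | | | | |
| | Partial gene deletion | 9 | 8 | | 4* | | |
| | Deletion (1–4 bases) | 20 | 4 | | 10* | | |
| | Insertion | 11 | 2* | | 3* | 1 | |
| | Duplication | 2 | 3* | | 1* | | |
| | Indel | | | | 1* | | |
| PAIS | Single-base substitution | 9 | 20 | 3 | 93 | 2 | 1* |
| | Multiple-base substitution | | 1 | | 1 | | |
| | Premature termination | 2 | | | | | |
| | Deletion (1–4 bases) | 2 | | | 2 | | 3 |
| MAIS | Single-base substitution | 22 | 4* | | 15 (g) | | |
| | Partial gene deletion | | | | 1 | | |
| | Deletion (1–4 bases) | | | | 1 | | |
| Premature ovarian failure | Single-base substitution | 1 | | | 2 | 1 | |

Gain-of-function disease

| Phenotype | Type of mutation | N-terminal domain (a) | DNA-binding domain (b) | Hinge region (c) | Ligand-binding domain (d) | Splice site | UTR |
|---|---|---|---|---|---|---|---|
| Prostate cancer (f) | Single-base substitution | 42 | 7 | 3 | 52 | | 2 |
| | Premature termination mutations | 1 | | | 4 | | |
| | Deletion (1–4 bases) | 2 | 2* | 1* | 1 | 1* | |
| | Insertion | 1* | | | | 1 | 1 |
| Breast cancer | Single-base substitution | | 2 | | | 1 | |
| Larynx cancer (f) | Deletion (30 bases) | 1 | | | | | |
| Liver cancer (f) | Single-base substitution | 4* | | | 1* | | |
| Testicular cancer (f) | Single-base substitution | 3* | | | | | |

  a. aa 1–534.
  b. aa 559–624.
  c. aa between DBD and LBD, 625–663.
  d. aa 664–919.
  e. Applies to all domains.
  f. Somatic mutation.
  g. Description of phenotype somewhat ambiguous.
  *Mutations not reported in 2004.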

The identification of specific mutations in the AR, starting in the early 1990s, as the cause of AIS has quite naturally resulted in the diagnosis of the disorder being increasingly dependent on finding AR mutations. Over the years, our laboratory has occasionally been unable to identify AR mutations in putative AIS patients that exhibit the classical AIS phenotype. Initially, the possible significance of such cases was discounted with the assumption that they were outliers, possibly because of some posttranslational event. As the number of such cases has grown, we decided to examine our own Lady Davis Institute (LDI) AIS database of AIS patients. The results were quite surprising. Out of the 75 listed patients of CAIS, no AR mutation has been identified in 25 patients, and of the 63 reported patients of PAIS, no AR mutation has been found in 37 patients (Table 2). It should be noted that all eight AR exons were sequenced from patients' genital skin and we also screened for mutations in another gene associated with an AIS-like phenotype, namely 5α-reductase 2 (SRD5A2). It is important to note that in many of these cases the diagnosis of AIS was determined before AR sequencing was available, by a detailed examination of traditional AIS-defining phenotype characteristics that included classical physical features (i.e., ambiguous external genitalia), as well as measuring androgen levels and conducting AR biochemical studies on patient-derived genital skin fibroblasts when these were available. Further, our findings are not unique, as the AIS Patient Database at the University of Cambridge, England, reports that while mutations are present in 95% of their CAIS patients, they have only been found in 25% of their PAIS patients (Dr. John Davies, personal communication). 
What is particularly striking about the LDI patients in whom no mutation has been identified is that a significant number have normal AR binding properties: 11 out of 25 CAIS patients and 17 out of 37 PAIS patients (Table 2). This is perhaps particularly surprising in the CAIS patients, although it should be noted that in the ARDB, 30 CAIS patients are reported as having normal binding, with 11 having mutations in the DBD. In summary, this suggests that, at least in some of these cases, the involvement of aberrant AR protein in the AIS phenotype may not be so clear cut.
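As a worked check of the proportions quoted above, the following short Python sketch recomputes them from the patient counts given in the text. The dictionary layout and the function name `summarize` are illustrative choices, not part of the LDI database itself.

```python
# Counts quoted in the text for the LDI AIS patient database.
LDI_COUNTS = {
    "CAIS": {"patients": 75, "no_mutation": 25, "normal_binding": 11},
    "PAIS": {"patients": 63, "no_mutation": 37, "normal_binding": 17},
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def summarize(counts: dict) -> dict:
    """Compute mutation-negative and normal-binding percentages per phenotype."""
    summary = {}
    for phenotype, c in counts.items():
        summary[phenotype] = {
            # Share of all patients with no AR mutation identified.
            "pct_no_mutation": percent(c["no_mutation"], c["patients"]),
            # Share of mutation-negative patients with normal AR binding.
            "pct_normal_binding": percent(c["normal_binding"], c["no_mutation"]),
        }
    total_patients = sum(c["patients"] for c in counts.values())
    total_negative = sum(c["no_mutation"] for c in counts.values())
    summary["overall_pct_no_mutation"] = percent(total_negative, total_patients)
    return summary


result = summarize(LDI_COUNTS)
print(result["CAIS"])                      # 33.3% mutation-negative, 44.0% normal binding
print(result["PAIS"])                      # 58.7% mutation-negative, 45.9% normal binding
print(result["overall_pct_no_mutation"])   # 44.9, i.e. "over 40%" overall
```

The overall figure (62 of 138 patients) matches the statement that over 40% of patients with a classic AIS phenotype had no identified AR mutation.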

Table 2. AIS Patients from the Lady Davis Institute Database
Columns: AIS phenotype; AR mutation found; Number of patients; AR biochemical activity (a): Zero, Reduced, Normal; Number of cases with no biochemical data (only DNA available).

  a. AR biochemical activity can include Bmax values and, when some binding activity is present, Kd and k values.


Of particular interest is the doubling in the number of MAIS mutations (from 17 to 44), almost all of which are associated with some form of male infertility. Further, as the majority of these mutations have been found in exon 1, we have suggested that AR exon 1 mutations might be a cause of some cases of idiopathic male infertility [Gottlieb et al., 2005]. Finally, a decreasing proportion of mutation entries, 15% (151/1,029) as opposed to 21% (128/605) in 2004, contain data indicating the pathogenic effect of the putative mutation, obtained by reconstituting the mutation and observing the effect on AR protein function. This most probably reflects the scientific certitude that any AR mutation identified must be responsible for the AIS phenotype. However, in light of the number of patients with apparent classical AIS phenotypes that are not directly associated with an AR mutation (Table 2), perhaps it is time to at least reconsider relying largely on the presence of AR mutations to diagnose AIS.

Premature ovarian failure

Recently, a study of the AR in Indian women found a number of cases of very early onset of menopause, known as premature ovarian failure (POF), to be associated with AR mutations [Panda et al., 2011]. Although this is just a single study, previous attempts to identify a cause of POF have been unsuccessful, so screening for AR mutations could become a promising line of investigation.

Somatic and multiple mutations

The increasing ability to sequence DNA, not just from blood cells but also from diseased tissues, has made it possible to identify mutations that are somatic, rather than germline, in origin. As might be expected, the vast majority of such mutations have been found in prostate tumors and are presumed to be gain-of-function mutations (see below). The number of AIS somatic mutations identified is still relatively small, that is, 25 (Supp. Table S1), nine of which are the result of somatic mosaicism. This is perhaps because most AR sequencing has traditionally been done using either genital skin tissue or blood, but not both. It is interesting to speculate that, because almost all sequencing is now done using only blood, in some cases wherein no AR mutations have been identified, somatic mosaicism may exist, with the mutation being present solely in genital tissues or vice versa.

One of the most surprising recent developments has been the identification of individuals that have multiple AR mutations (Supp. Table S1), ranging from two to five mutations in each case. Not surprisingly, most (27/36) have been found in prostate tumors, as they are more than likely to be the result of somatic mutations. The nine cases of multiple AR mutations in AIS individuals are more difficult to explain and will be discussed later.

Gain-of-Function Mutations

Prostate Cancer

To date, 159 AR mutations have been found in CaP tissue (MIM# 176807), almost all being single-base substitutions due to somatic rather than germline mutations. While the largest share still occurs in the LBD (∼45%), a substantial minority occurs in exon 1 (∼30%). Originally, it was thought that the AR was not expressed in CaP tissues, but this is clearly not the case [Edwards et al., 2003]. Considerable controversy has revolved around conflicting studies that have only sometimes found a significant number of AR mutations in CaP tissues [Culig et al., 2002]. It has been argued that AR mutations only appear during the later stages of CaP, and in addition, some studies have indicated that anti-androgen treatments have actually resulted in AR mutations [Hyytinen et al., 2002; Steinkamp et al., 2009].

A considerable limiting factor is that, with a few exceptions, experiments to prove pathogenicity have not been done for CaP-associated AR mutations, which considerably reduces the value of the data. Even when such experiments have been done, their significance has remained debatable because the mutations in question are gain-of-function mutations. To date, there have been a few attempts to use bioinformatics analysis to predict the pathogenicity of these mutations, including by ourselves [Mooney et al., 2003], but a consensus on a suitable experimental analysis of AR gain-of-function mutants has yet to be reached. However, a possible means of analysis has been suggested in an article that identified, in a mouse model of CaP, an AR mutant (p.R753Q) that has also been found to cause AIS. When the mutant was transfected into CV-1 cells, very little transactivation was observed, but when it was transfected into a cancer cell line (PC-3), AR-mediated transactivation levels were in the normal range [O'Mahony et al., 2008]. In fact, this dual transfection type of analysis might prove to be a robust indicator of a cancer-causing gain-of-function mutation.

Splice Variants in CaP

One of the biggest challenges in treating CaP with antiandrogens is that in almost all cases, if treatment continues long enough, patients will develop resistance to treatment. Recent reports have found one possible explanation with the discovery of a number of AR splice variants within CaP tissues [Guo and Qiu, 2011; Haille and Sadar, 2011]. Almost all these splice variants lack an LBD, which could explain why antiandrogens no longer function.

Kennedy's disease (Spinobulbar Muscular Atrophy, SBMA)

Kennedy's disease (MIM# 313200), a spinobulbar motor neuronopathy associated with MAIS, is one of the classic trinucleotide repeat expansion diseases that cause inherited neurodegenerative disorders. It is the result of an expansion of the glutamine-coding (CAG)8–35CAA tract in exon 1 of the AR to a total n of at least 38 [Beitel et al., 2005]. The MAIS component of spinobulbar muscular atrophy (SBMA) may reflect a loss of AR transcriptional regulatory activity due to an expanded polyglutamine (polyGln) tract. It should be noted that in SBMA, the AIS phenotype is quite variable. Because subjects with CAIS, including those with complete AR deletions, do not develop SBMA, either the polyCAG-expanded AR or the polyGln-expanded AR protein appears to be toxic to motor neurons through a gain-of-function, not a loss-of-function.

The biochemical, histopathologic, and neurophysiologic features of SBMA appear to be secondary to motor denervation. However, a myogenic effect has been shown to be caused by overexpression of wild-type AR within skeletal muscle fibers in an SBMA mouse model [Johansen et al., 2009], suggesting novel pathways that may explain AR gain-of-function. Three new possible causes for AR gain-of-function in SBMA—i.e., ligand-driven interactions, impairment of autophagy, and interruption of axonal transport—that have been postulated since the 2004 ARDB update are listed in Supp. Table S2, but the definitive gain-of-function mechanisms that result in SBMA remain elusive [Ranganathan and Fischbeck, 2010].

AR Structure–Function Relationships

The genetic and clinical significance of all LSDBs is based on the genotype–phenotype relationship. In the ARDB, one of the more surprising findings that challenges this relationship is the variable phenotypic expression of specific mutations, in which identical mutations produce different phenotypes. There are now 45 specific mutations reported in the ARDB that are associated with variable phenotypes (Table 3). Several scenarios might explain variable AIS phenotypes, and indeed one such mechanism, somatic mosaicism, has already been proven to cause at least nine such cases [Gottlieb et al., 2001]. Considerably more problematic is the fact that in 10 cases the same mutation can cause both AIS, presumably due to a loss-of-function of the AR, and CaP, presumably due to a gain-of-function of the AR.

Table 3. Androgen Receptor Gene Mutations Associated with Variable Phenotypes

| No. | DNA change | Exon | Predicted protein | Phenotype |
|---|---|---|---|---|
| 1 | c.521T>G | 1 | p.(Leu174*) | CAIS and PAIS |
| 2 | c.646G>A | 1 | p.(Gly216Arg) | PAIS, MAIS, and normal |
| 3* | c.1174C>T | 1 | p.(Pro392Ser) | CAIS, PAIS, MAIS, and testicular cancer |
| 4 | c.1644G>T | 2 | p.(Leu548Phe) | PAIS and MAIS |
| 5 | c.1714T>C | 2 | p.(Tyr572His) | PAIS and MAIS |
| 6* | c.1742A>G | 2 | p.(Lys581Arg) | PAIS and CaP |
| 7 | c.1792C>A | 3 | p.(Ser598Arg) | PAIS and normal |
| 8 | c.1822G>A | 3 | p.(Arg608Gln) | PAIS and MAIS |
| 9 | c.1847G>A | 3 | p.(Arg616His) | CAIS, PAIS, and MAIS |
| 10 | c.1847G>C | 3 | p.(Arg616Pro) | CAIS and PAIS |
| 11 | c.1937C>A | 4 | p.(Ala646Asp) | CAIS, PAIS, MAIS, and normal |
| 12 | c.2077_2079delA | 4 | p.(Asn693del) | CAIS and PAIS (male and female) |
| 13 | c.2086G>A | 4 | p.(Asp696Asn) | CAIS, PAIS, and MAIS |
| 14 | c.2110A>G | 4 | p.(Ser704Gly) | CAIS and PAIS |
| 15 | c.2117A>G | 4 | p.(Asn706Ser) | CAIS and PAIS |
| 16* | c.2169G>T | 4 | p.(Leu723Phe) | CAIS and CaP |
| 17* | c.2180G>T | 5 | p.(Arg727Leu) | Normal and CaP |
| 18* | c.2224G>T | 5 | p.(Trp742Cys) | PAIS and CaP |
| 19 | c.2231G>T | 5 | p.(Gly744Val) | CAIS and PAIS |
| 20* | c.2233C>T | 5 | p.(Leu745Phe) | CAIS and CaP |
| 21 | c.2238T>C | 5 | p.(Met746Thr) | CAIS and PAIS |
| 22 | c.2250A>G | 5 | p.(Met750Val) | CAIS and PAIS |
| 23* | c.2256G>A | 5 | p.(Trp752*) | CAIS, PAIS, and CaP |
| 24 | c.2249A>G | 5 | p.(Asn757Ser) | PAIS and MAIS |
| 25* | c.2291A>G | 5 | p.(Tyr764Cys) | PAIS and CaP |
| 26 | c.2296G>A | 5 | p.(Ala766Thr) | CAIS and PAIS |
| 27 | c.2299C>T | 5 | p.(Pro767Ser) | CAIS and PAIS |
| 28 | c.2324G>A | 6 | p.(Arg775His) | CAIS and PAIS |
| 29 | c.2343G>A | 6 | p.(Met781Ile) | CAIS, PAIS, and MAIS |
| 30* | c.2359C>T | 6 | p.(Arg787*) | CAIS and CaP |
| 31 | c.2367G>T | 6 | p.(Arg789Ser) | PAIS and MAIS |
| 32 | c.2382G>C | 6 | p.(Glu794Asp) | MAIS and normal |
| 33* | c.2395C>G | 6 | p.(Gln799Glu) | PAIS, MAIS, and CaP |
| 34 | c.2402C>T | 6 | p.(Thr801Ile) | PAIS and MAIS |
| 35 | c.2444G>A | 6 | p.(Ser815Asn) | PAIS and MAIS |
| 36 | c.2464C>G | 7 | p.(Leu822Val) | PAIS and MAIS |
| 37* | c.2517C>T | 7 | p.(Leu839) | PAIS and CaP |
| 38 | c.2521G>T | 7 | p.(Arg841Cys) | PAIS (male and female) |
| 39 | c.2522G>A | 7 | p.(Arg841His) | CAIS and PAIS (male and female) |
| 40 | c.2528T>C | 7 | p.(Ile843Thr) | CAIS and PAIS |
| 41 | c.2567G>A | 7 | p.(Arg856His) | CAIS, PAIS, and MAIS |
| 42* | c.2599G>A | 7 | p.(Val867Met) | CAIS, PAIS, and CaP |
| 43 | c.2612C>G | 8 | p.(Ala871Gly) | PAIS and MAIS |
| 44* | c.2659A>G | 8 | p.(Met887Val) | MAIS and liver cancer |
| 45 | c.2668G>A | 8 | p.(Val890Met) | CAIS and PAIS |

  CAIS, complete androgen insensitivity syndrome; PAIS, partial androgen insensitivity syndrome; MAIS, mild androgen insensitivity syndrome; CaP, prostate cancer.
  The nucleotide and amino acid numbering system in the database is based on the NCBI RefSeq NM_000044.2.
  *Mutations associated with cancer phenotypes as well as AIS.
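Because Table 3 lists each variant in both coding-DNA (c.) and protein (p.) form, the two descriptions can be cross-checked: with HGVS c. numbering starting at the A of the start codon, nucleotide n falls in codon ceil(n/3). The following Python sketch shows one way such a check might be written. The regular expressions cover only simple single-base substitutions, and the function names are hypothetical. A mismatch only flags an entry for manual review; it does not by itself establish which field, if either, is wrong.

```python
import re

# Simple single-base substitutions only, e.g. "c.646G>A".
CDNA_RE = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")
# Predicted missense or nonsense changes, e.g. "p.(Gly216Arg)" or "p.(Leu174*)".
PROT_RE = re.compile(r"^p\.\(([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|\*)\)$")


def codon_from_cdna(c_pos: int) -> int:
    """Codon number containing coding nucleotide c_pos (1-based)."""
    return (c_pos + 2) // 3


def is_consistent(dna: str, protein: str) -> bool:
    """True if the c. position falls in the codon named by the p. description."""
    d = CDNA_RE.match(dna)
    p = PROT_RE.match(protein)
    if d is None or p is None:
        raise ValueError(f"unsupported variant description: {dna} / {protein}")
    return codon_from_cdna(int(d.group(1))) == int(p.group(2))


# Three rows copied from Table 3.
rows = [
    ("c.521T>G", "p.(Leu174*)"),
    ("c.646G>A", "p.(Gly216Arg)"),
    ("c.2249A>G", "p.(Asn757Ser)"),
]
for dna, protein in rows:
    status = "ok" if is_consistent(dna, protein) else "check manually"
    print(dna, protein, status)
```

Of the three rows shown, the first two agree, while c.2249 falls in codon 750 rather than 757, so that entry would be flagged for a look at the original report.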

AR CAG Tract Length Variation as a Risk Factor for Disease and/or Aberrant Physical or Mental Conditions

The database lists the length of the polymorphic CAGnCAA (polyGln) and polyGGCn (polyGly) repeats that are present in exon 1, wherever those data are available. Currently, there are 160 entries in the database that show the length of CAG repeats and somewhat fewer that show the length of GGC repeats. The mean value of the CAG repeat length is 22.34, which is significantly longer than in controls [Elhaji et al., 2001], although the number of database entries is too small to draw any specific conclusions with regard to their possible role in either AIS or CaP. However, as more data become available, it will be interesting to see whether increases in CAG repeat length, which reduce the efficacy of the AR [Mhatre et al., 1993], play any role in determining the AIS phenotype.

Over the past few years, there has been considerable growth in the number of studies that have examined possible relationships between AR CAG repeat length and the risk of developing certain diseases and conditions (see Supp. Table S2). These include female breast cancer (MIM# 114480), CaP, and uterine and endometrial cancer (MIM# 608089), all of which have been reviewed by Rajender et al. (2007); colorectal cancer [Di Fabio et al., 2009; Ferro et al., 2002]; esophageal cancer (MIM# 133239) [Dietzsch et al., 2003]; and head and neck cancers (MIM# 275355) [dos Santo et al., 2004]. Noncancer diseases and conditions include male infertility and bone and mineral density, reviewed by Rajender et al. (2007); acne, hirsutism, and alopecia [Sawaya and Shalita, 1998]; endometriosis (MIM# 131200) [Shaik et al., 2009] and uterine leiomyoma (MIM# 150699) [Shaik et al., 2009]; Alzheimer's disease (MIM# 104300) [Lehman et al., 2003]; arthritis (MIM# 180300) [Kawasaki et al., 1999]; platelet reactivity [Kuliczkowski et al., 2010]; hypertension (MIM# 145500) [Pausova et al., 2010]; muscle and adipose tissue changes [Nielsen et al., 2010]; autism (MIM# 209850) [Henningsson et al., 2009]; personality traits [Westberg et al., 2009]; and violent criminal behavior [Rajender et al., 2008].

AR Coregulators and Interacting Proteins

The number of reported AR coregulators and interacting proteins has increased substantially, from 70 to 311, since 2004, as summarized in Table 4. However, it is important to note that in a number of cases either the protein's properties (10.0%) or its AR interaction domain (45.7%), or both, have not yet been characterized. Supp. Table S3 is an extract of the list of associated proteins that is available as an online table at the ARDB Website. This table also lists the protein function, whether the protein has been identified as a coactivator or corepressor, whether the interaction is direct or indirect, and the AR domain involved, if known. There are also hyperlinked references available in the online version. As expression of these AR interacting proteins could vary between and within individuals, such information may well be useful in explaining, at least in some cases, why identical genotypes result in different phenotypes.

Table 4. Summary Data: Androgen Receptor Coregulators and Interacting Proteins

| Category | Number | % |
|---|---|---|
| Coregulator activity: not reported/no effect | 31 | 10.0 |
| Interaction with AR: not reported | 46 | 14.8 |
| AR interaction domain: multiple domains/full-length AR | 43 | 13.8 |
| AR interaction domain: not reported | 142 | 45.7 |

  AR NTD, AR N-terminal domain; DBD, DNA-binding domain; DBDh, DBD-hinge region; LBD, ligand-binding domain.


AIS without AR Mutations

The classical assumption that the AIS phenotype is directly the result of a mutation in the AR has been challenged by the observation that some individuals with an AIS phenotype have not, to date, been found to have AR mutations. Traditionally, the phenotype in these individuals was assumed to result from mutations in other genes associated with an AIS phenotype. In the case of CAIS, it has been found that mutations in two other genes, SRD5A2 and 17β-hydroxysteroid dehydrogenase-3 (17βHSD3), can cause a CAIS-like phenotype. As noted, we did test for SRD5A2 mutations, but not for 17βHSD3 mutations. However, testosterone levels in our patients were high in most cases, which is not usual in patients with 17βHSD3 mutations. Further, we followed a significant number of our patients through puberty and none of them underwent any significant virilization, which is another classical feature of 17βHSD3 deficiency [Lee et al., 2007]. Nevertheless, the number of CAIS cases in which no AR mutations were found is substantially greater than has been previously reported. The most obvious suggestion is that perhaps these patients are not suffering from CAIS at all, but from a phenotypically similar but genetically different disease. In particular, those patients who had normal AR binding would seem to be prime candidates for such a conclusion. However, as pointed out, there are 30 reported CAIS patients who have AR mutations, but normal AR binding. One possible explanation is that a postligand-binding factor is involved, such as one of the numerous interacting proteins, but such a hypothesis will require a much more detailed analysis.

For PAIS cases, although the percentage without mutations was significantly higher, it was not as high as in the Cambridge AIS database (Dr. John Davies, personal communication). It has been suggested that mutations in the steroidogenic factor 1 or mastermind-like domain containing 1 (MAMLD1) genes might also be responsible for a PAIS phenotype. However, a recent examination of 28 PAIS patients found only one patient with a double polymorphism of MAMLD1 [Gaspari et al., 2011].

Another possibility is that mutant coregulator proteins might affect normal AR functioning. However, as noted in a review by Achermann and Hughes [2008], in almost all such cases, mutations in cofactor genes have not been found. Although the discovery of AIS individuals without AR mutations has been assumed to be an extremely rare situation and so has been largely ignored, the detailed analysis of the LDI AIS database suggests that such cases are in fact not so rare. Indeed, two studies on PAIS have already reported no AR mutations in 85–90% of sporadic cases and 10–15% of familial cases [Sultan et al., 2002], or in 72% of all cases [Ahmed et al., 2000]. Further, if mutations in other genes or in cofactors are not directly involved, then one possibility is that the answer might lie in a more detailed examination of the AR protein and AR gene. In particular, the use of NGS of genital tissues might reveal additional subtleties of AR mutations and their expression. The immediate problem is that diagnosis of AIS has now become increasingly based on finding AR mutations in either genital skin tissues or blood. This has resulted in an increasing number of individuals with classical AIS symptoms who have been left in diagnostic limbo when no AR mutations are identified.

Loss- and Gain-of-Function Mutations

As reported in the ARDB, the somatic mutations found in CaP tissues seem to elicit a gain-of-function as opposed to a loss-of-function. This results in individuals who have a normal male phenotype except that they suffer from CaP. In fact, in the case of CaP tumors, which like most tumors consist of heterogeneous tissue, there is often a wide range of cell types within each tumor, ranging from normal to advanced cancers, which may indicate a concomitant genetic heterogeneity within the tumors. Thus, mutation databases will have to reflect such possible genetic variation, by much more closely identifying the tissue phenotype associated with a specific gene alteration, whether a mutation or polymorphism. Fortunately, the arrival of techniques such as laser capture microdissection (LCM) has allowed us to examine genes in very specific cell and tissue types. Over the last few years, we have used LCM to study the much-examined relationship between AR CAG repeat length and both prostate [Alvarado et al., 2005; Sircar et al., 2007] and breast cancer [Gottlieb et al., 2010]. Until recently, the idea that the same gene could acquire mutations that result in both loss- and gain-of-function was considered impossible. However, studies of well-known genes, such as the tumor suppressor gene p53, have shown that though most identified mutations cause a loss-of-function, mutations have now been found that promote tumorigenesis, that is, cause a gain-of-function [Brosh and Rotter, 2009]. The ability of genes such as the AR to have more than one function is likely to have considerable significance for single-locus-specific databases such as the ARDB.

Such findings provide more motivation for confirming the pathogenicity of mutations, although this of course cannot be done conclusively except within the context of the whole organism. A mutant protein can be expressed to determine whether its functionality is impaired, which would at least indicate that a particular gene alteration could result in a disease phenotype. Finally, we report on the highly unusual phenomenon of the same AR mutation associated with both a loss-of-function and a gain-of-function (Table 3). These observations further complicate the already problematic relationship between genotype and phenotype and strongly suggest that some posttranslational event might well be responsible for what would appear to be very different protein products. In particular, the role of associated proteins in a particular cellular environment may well be crucial in determining the nature and functionality of the proteins produced.

Multiple AR Mutations within the Same Tissue Samples

The presence of multiple mutations within tissues from single individuals poses a number of interesting questions from a genetic perspective, including the following: How might an individual organism acquire multiple mutant forms of specific genes that each result in a loss-of-function? Indeed, the issue of multiple single-gene mutations within an individual organism has become an important genetic discussion [Drake, 2007]. However, in the case of gain-of-function mutations, that is, in CaP, it might be a little easier to understand in light of certain hypotheses about cancer that are based on tissues accumulating mutations in a number of genes. Therefore, it is perhaps not surprising that the majority of cases of multiple AR mutations have been found in CaP tissues. However, in a recent study in which we examined AR CAG repeat length variation in breast cancer tissues using NGS techniques [Gottlieb et al., 2010], we found that even with a relatively small number of cells (c. 2,500), there appeared to be a number of different variant ARs present. Further, these different ARs seemed to be present in different cells, which may be the case, not just due to CAG repeat length variation, but also in cases of mutations due to sequence variation. Therefore, it is possible that multiple AR mutations are much more common than previously thought, even in tissue samples from AIS individuals, but are undetectable because of the limitations of traditional sequencing techniques. Finally, the presence of a minority of cells with different ARs, including in some cases cells with wild-type genes, could explain how the same AR mutation appears to result in different AIS phenotypes.


Presently, most LSDBs have been created with the intention of listing and organizing gene mutations and their phenotypic expression in order to correlate specific phenotypes with specific genotypes. The finding of AIS cases without any obvious AR mutation, the identification of mutations that can confer both a gain-of-function and a loss-of-function, and the discovery of multiple mutations within the same diseased tissues, all suggest that such correlations are likely to become increasingly complex. As we mentioned in our last update [Gottlieb et al., 2004], it had already become clear that the role of LSDB curators was becoming more, rather than less, important, particularly as it related to using LSDBs as tools to help in understanding specific structure–function relationships. However, as further details about the AR and its relationship to its produced protein have been uncovered, it is now clear that solely identifying the genotype of an AIS individual may no longer be sufficient to draw conclusions about the effect of an AR mutation on the person's phenotype. In particular, tools such as NGS, mass spectrometry, and molecular dynamics modeling have begun to reveal many more details of the factors that can result in a particular phenotype. Thus, it would seem that the time when gene mutational data obtained from blood is the only significant information required to determine the cause of genetic diseases and conditions is coming to an end. It is clearly time to take a much broader approach to constructing genetic databases, in which other factors, not just the sequence of genes from blood, become an integral part.


We thank the Canadian Institutes of Health Research, the Prostate Cancer Research Foundation of Canada, and the Kennedy's Disease Association for supporting our work on AR mutations and the various diseases associated with them.