ABSTRACT: The androgen receptor (AR) is important in reproductive organ development, as well as tissue homeostasis of the pancreas, liver, and skeletal muscle in adulthood. The trinucleotide (CAG)n repeat polymorphism in exon 1 of the AR gene is thought to regulate AR activity, with longer alleles conferring reduced receptor activity. Therefore, the evaluation of the allelic distribution of the AR (CAG)n repeat in various ethnic groups is crucial in understanding the interindividual variability in AR activity. We evaluated ethnic variation of this AR polymorphism by genotyping individuals from the multiethnic Hyperglycemia and Adverse Pregnancy Outcome study cohort. We genotyped 4421 Caucasian mothers and 3365 offspring of European ancestry; 1494 Thai mothers and 1742 offspring; 1119 Afro-Caribbean mothers and 1142 offspring; and 780 Hispanic mothers and 770 offspring of Mexican ancestry from Bellflower, California. The distributions of (CAG)n alleles among all 4 ethnic groups are significantly different (P < .0001). Pairwise tests confirmed significant differences between each pair of ethnicities tested (P < 10⁻²⁸). The relative AR (CAG)n repeat length in the different groups was as follows: Afro-Caribbean (shortest repeat lengths and greatest predicted AR activity) < Caucasian < Hispanic < Thai (longest repeat length and lowest predicted AR activity). Significant interethnic differences in the allele frequencies of the AR exon 1 (CAG)n polymorphism exist. Our results suggest that there may be potential ethnic differences in androgenic pathway activity and androgen sensitivity.
The human androgen receptor (AR) gene is located on chromosome Xq11.2-q12 and codes for a protein with 3 major functional domains: the N-terminal transactivation domain, the DNA-binding domain, and the androgen-binding domain. The protein functions as a steroid hormone–activated transcription factor, which signals through classical and nonclassical signaling pathways (Katzenellenbogen and Katzenellenbogen, 1996; Rahman and Christian, 2007). AR is expressed in many tissues during development and adulthood, influencing an enormous range of physiologic processes including organ and tissue growth and differentiation, cognition, muscle hypertrophy, bone density, and insulin levels (Gelmann, 2002).
The AR gene contains a highly polymorphic (CAG)n repeat in exon 1 encoding a glutamine tract in the N-terminal transactivation domain of the protein, which becomes active only after AR binds to its ligand (Palazzolo et al, 2008). The polyglutamine tract length is inversely correlated with the transcriptional competence of the receptor, with longer tracts being associated with lower levels of AR-mediated transcription in both normal and disease states (Beilin et al, 2000; Buchanan et al, 2004; Harada et al, 2010).
Expansion of the AR (CAG)n repeat beyond the normal range (>38 repeats) causes spinal and bulbar muscular atrophy (SBMA), also known as Kennedy disease, which is an X-linked recessive form of spinal muscular atrophy. Two mechanisms have been proposed for how the expansion causes SBMA: loss of normal AR function, which induces neuronal degeneration, or a toxic property acquired by the pathogenic AR, which damages motor neurons (Katsuno et al, 2006). Interestingly, some populations of affected individuals display a variety of endocrinologic symptoms including gynecomastia, infertility, androgen insensitivity, and increased incidence of type 2 diabetes (Mariotti et al, 2000), which would indicate a global disruption of AR function.
AR (CAG)n repeats within the normal range (10–36 repeats) are associated with numerous endocrine cancers, male and female infertility, and many other neurological and endocrine conditions (Palazzolo et al, 2008; Wu et al, 2008; Chatterjee et al, 2009; Ludwig et al, 2009; Shaik et al, 2009). There is a continuous relationship between AR (CAG)n repeat length and metabolic traits in Caucasian men, where repeat number correlates negatively with AR sensitivity and positively with body fat, insulin levels, and leptin in healthy and diabetic men (Zitzmann et al, 2003). Additionally, testosterone and luteinizing hormone levels are higher in diabetic men with >24 repeats, reflecting reduced negative feedback through a less sensitive receptor (Stanworth et al, 2008). In women with polycystic ovary syndrome, a disorder characterized by hyperandrogenism, the association between testosterone and insulin resistance may be modified by the (CAG)n repeat polymorphism (Mohlig et al, 2006). Therefore, ethnic differences in allele distributions of this marker could have a far-reaching impact on diverse conditions in men and women.
Here we report the allelic distribution of the (CAG)n polymorphism in Caucasian, Afro-Caribbean, Hispanic, and Thai populations from the Hyperglycemia and Adverse Pregnancy Outcome (HAPO) cohort. We genotyped 14 833 mothers and neonates of these 4 ethnicities and tested for differences in allele distributions of the AR (CAG)n repeat.
Characteristics of the HAPO Study participants and outcomes have been previously reported (HAPO Study Cooperative Research Group et al, 2008; HAPO Study Cooperative Research Group, 2009). The Caucasian population (4421 mothers, 3365 neonates) represented the largest group in this study, and the Hispanics (780 mothers, 770 neonates) were the smallest group. The numbers of samples and chromosomes genotyped for each center and ethnicity are given in Table 1. The ethnic distribution was 52.5% Caucasian, 21.8% Asian, 15.2% Afro-Caribbean, and 10.5% Hispanic.
Table 1. Maternal and neonate DNA samples by ethnicity successfully genotyped at the AR (CAG)n locus
| || ||Mothers||Neonates||Total|
|Center||Ethnicity||No. Samples||No. Chromosomes||No. Samples||No. Chromosomesᵃ||No. Samples||No. Chromosomes|
|Belfast||Caucasian (United Kingdom)||1331||2662||1045||1537||2376||4199|
|Manchester||Caucasian (United Kingdom)||982||1964||548||805||1530||2769|
|Toronto||Caucasian (North America)||654||1308||651||977||1305||2285|
| ||Total Caucasian||4421||8842||3365||4980||7786||13 822|
|Bellflower||Hispanic (North America)||780||1560||770||1156||1550||2716|
| ||Total||7814||15 628||7019||10 437||14 833||26 065|
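The neonatal chromosome counts in Table 1 are lower than twice the sample counts, which is what one would expect for an X-linked locus. The sketch below is an illustration only: it assumes each female neonate contributes 2 X chromosomes and each male contributes 1 (an inference from the X-chromosome location of AR, not a rule stated in the table), and uses that assumption to back out the implied sex split. The function name `split_by_sex` is hypothetical.

```python
# Neonate (samples, chromosomes) pairs copied from the rows of Table 1 shown above.
NEONATES = {
    "Belfast": (1045, 1537),
    "Manchester": (548, 805),
    "Toronto": (651, 977),
    "Bellflower": (770, 1156),
    "All centers": (7019, 10437),
}


def split_by_sex(n_samples, n_chromosomes):
    """Infer (females, males), ASSUMING 2 X chromosomes per female and 1 per male.

    With F + M = samples and 2F + M = chromosomes, F = chromosomes - samples.
    """
    females = n_chromosomes - n_samples
    males = n_samples - females
    if females < 0 or males < 0:
        raise ValueError("counts are inconsistent with the X-linked assumption")
    return females, males


for center, (n, chrom) in NEONATES.items():
    females, males = split_by_sex(n, chrom)
    print(f"{center}: {females} female and {males} male neonates implied")
```

Under the same assumption, every mother contributes 2 chromosomes, which matches the maternal columns (for example, 1331 samples and 2662 chromosomes in Belfast).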
The CEPH control samples were genotyped with an 89.5% success rate and a plate-to-plate concordance of 99.5%. Two percent of the HAPO participant samples were selected randomly and regenotyped without knowledge of previously assigned genotypes. Among these, 99.9% concordance was observed. We obtained AR (CAG)n genotypes for 90.6% of the samples.
AR allele designations represent the number of repeats; for example, allele 6 (A06) has 6 CAG repeats. We identified 34 alleles at the AR (CAG)n locus: A06 and A08–A40. The frequency of each allele varied widely across populations, and the distribution of the alleles by ethnic group for mothers and neonates combined is presented in the Figure. The range of alleles differed among ethnicities, with the Thais and Caucasians having the largest range (6–40 repeats, 32 alleles total) and the Afro-Caribbeans having the smallest range (9–37 repeats, 24 alleles total). The Caucasians, our largest group (7786 individuals, 13 822 chromosomes), had the highest number of rare alleles (frequency <.01, n = 18), and the Afro-Caribbeans, our second smallest group (2261 individuals, 3933 chromosomes), had the fewest rare alleles (n = 6). A20 through A23 were common to all ethnicities (frequency >.05).
The most common CAG repeats were A18 in Afro-Caribbeans (.17), A21 in Caucasians (.19), A23 in Hispanics (.15), and A22 in Thais (.21). The mean CAG repeat lengths were Afro-Caribbean 19.6 ± 3.2, Caucasian 21.9 ± 2.9, Hispanic 22.6 ± 3.1, and Thai 23.1 ± 3.3. These results differ slightly from those published by Edwards et al (1992), likely because of differences in sample size, stringency of ethnic classification, and location of participant recruitment.
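As an illustration of how summaries of this kind can be derived from a table of allele counts, the following sketch computes a count-weighted mean repeat length, its standard deviation, and the list of rare alleles. It assumes the "±" values are standard deviations over chromosomes, which the text does not state. The counts in `toy_counts` are invented for the example and are not HAPO data, and the function names are hypothetical.

```python
import math


def repeat_length_summary(allele_counts):
    """Return (mean, sd) of repeat length from {repeat_number: chromosome_count}."""
    n = sum(allele_counts.values())
    mean = sum(r * c for r, c in allele_counts.items()) / n
    # Sample variance over chromosomes (n - 1 denominator); this choice is an assumption.
    var = sum(c * (r - mean) ** 2 for r, c in allele_counts.items()) / (n - 1)
    return mean, math.sqrt(var)


def rare_alleles(allele_counts, threshold=0.01):
    """Return the sorted repeat numbers whose frequency is below the threshold."""
    n = sum(allele_counts.values())
    return sorted(r for r, c in allele_counts.items() if c / n < threshold)


# Invented counts for 200 chromosomes, NOT taken from the study.
toy_counts = {12: 1, 20: 60, 21: 80, 22: 50, 24: 9}

mean, sd = repeat_length_summary(toy_counts)
print(f"mean repeat length = {mean:.2f}, sd = {sd:.2f}")
print("rare alleles (frequency < .01):", rare_alleles(toy_counts))
```

The rare-allele threshold defaults to .01 to mirror the definition used above; a different cutoff can be passed as `threshold`.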
Figure. Distribution of AR (CAG)n alleles by ethnic group in the Hyperglycemia and Adverse Pregnancy Outcome study. Top panel: only alleles with a frequency >.001 are shown. The alleles within the inner box are the alleles used for the χ² analysis for all ethnicities. Lower left panel: individuals in each ethnicity with the lowest number of CAG repeats. Lower right panel: individuals in each ethnicity with the highest number of CAG repeats. White bars indicate Afro-Caribbean participants; dark gray, Caucasians; black, Hispanics; light gray, Thais. n = number of chromosomes.
Because the maternal and fetal genotypes are not independent (each neonate inherits 1 of the mother's alleles), all statistical tests were performed on maternal genotypes only. The numbers of chromosomes and their corresponding frequencies for each allele used in the maternal χ² tests are presented in Table 2. All 4 ethnic populations were in Hardy-Weinberg equilibrium (HWE). Because of the large number of rare alleles in each ethnicity and the presence of population-specific alleles, we performed a Pearson χ² test on all ethnicities together using only alleles A13–30, which revealed significant differences in allele distributions between the 4 ethnic groups (P < .0001; Table 3). These differences were confirmed using the Monte Carlo test on all alleles present in each population (P < .001). Separate χ² analyses were also performed for each of the 6 pairwise comparisons of ethnicities (for example, Caucasian vs Hispanic). For these tests we maximized the alleles used for each pair, because a slightly different subset of alleles qualified for each pairwise χ² analysis. These analyses confirmed that there are highly significant differences between each pair of ethnicities tested (P < 10⁻²⁸), even after adjusting for multiple testing (Table 3).
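The two kinds of test described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the analysis pipeline used in the study: the Pearson statistic follows the textbook formula, while the Monte Carlo step is assumed here to be a permutation of group labels across chromosomes, since the text does not specify the resampling scheme. The function names and the 2 × 2 toy table are hypothetical.

```python
import random


def pearson_chi2(table):
    """Pearson chi-square statistic and df for rows (alleles) x columns (groups)."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_tot) - 1)
    return stat, df


def monte_carlo_p(table, n_perm=2000, seed=0):
    """Permutation P value: shuffle group labels over chromosomes (assumed scheme)."""
    rng = random.Random(seed)
    observed_stat, _ = pearson_chi2(table)
    # Expand the table into one allele label and one group label per chromosome.
    alleles, groups = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            alleles += [i] * count
            groups += [j] * count
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(groups)
        permuted = [[0] * len(table[0]) for _ in table]
        for a, g in zip(alleles, groups):
            permuted[a][g] += 1
        stat, _ = pearson_chi2(permuted)
        if stat >= observed_stat:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy table: 2 alleles x 2 groups, invented counts (not study data).
toy = [[30, 10], [10, 30]]
stat, df = pearson_chi2(toy)
print(f"chi-square = {stat:.1f}, df = {df}")
print(f"Monte Carlo P = {monte_carlo_p(toy):.4f}")
```

A permutation approach does not rely on the large-sample χ² approximation, which is one reason a Monte Carlo test is a natural complement when many alleles are rare.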
Table 2. Number of chromosomes and their corresponding frequencies for each allele used in the maternal χ² analysesᵃ
|Allele||No. Chromosomes||Frequency||No. Chromosomes||Frequency||No. Chromosomes||Frequency||No. Chromosomes||Frequency|
Table 3. Results of the Pearson χ² analyses testing for differences in maternal allele distribution between ethnicitiesᵃ
|Populations||Alleles Tested||No. Chromosomes Tested||df||χ² Value||P Value|
|All four ethnicities||A13–30||15 496||51||3215.9||<.0001ᵇ|
|Afro-Caribbean vs Caucasian||A13–30||11 006||17||1716.7||<.0001ᵇ|
|Afro-Caribbean vs Hispanic||A14–29||3756||15||750.0||<.0001ᵇ|
|Afro-Caribbean vs Thai||A13–31||5162||18||1621.3||<.0001ᵇ|
|Caucasian vs Hispanic||A14–30||10 334||16||203.1||1.86 × 10⁻³⁴|
|Caucasian vs Thai||A13–32||11 740||19||904.5||1.26 × 10⁻¹⁷⁹|
|Hispanic vs Thai||A14–29||4490||15||170.0||2.39 × 10⁻²⁸|
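The degrees of freedom in Table 3 can be cross-checked against the allele ranges. The sketch below assumes that every allele within each listed range was included in the test, so that df = (number of alleles − 1) × (number of groups − 1). It also applies a Bonferroni adjustment to the largest pairwise P value; Bonferroni is used only as an example, because the text does not name the multiple-testing method. The names `TABLE3` and `expected_df` are hypothetical.

```python
# (comparison, first allele, last allele, number of groups, reported df), from Table 3.
TABLE3 = [
    ("All four ethnicities", 13, 30, 4, 51),
    ("Afro-Caribbean vs Caucasian", 13, 30, 2, 17),
    ("Afro-Caribbean vs Hispanic", 14, 29, 2, 15),
    ("Afro-Caribbean vs Thai", 13, 31, 2, 18),
    ("Caucasian vs Hispanic", 14, 30, 2, 16),
    ("Caucasian vs Thai", 13, 32, 2, 19),
    ("Hispanic vs Thai", 14, 29, 2, 15),
]


def expected_df(first, last, n_groups):
    """df for a contingency table, ASSUMING every allele in the range is used."""
    n_alleles = last - first + 1
    return (n_alleles - 1) * (n_groups - 1)


mismatches = [
    name for name, first, last, groups, df in TABLE3
    if expected_df(first, last, groups) != df
]
print("rows whose df differs from the expectation:", mismatches)

# Bonferroni adjustment (an assumed method) for the 6 pairwise comparisons.
largest_pairwise_p = 2.39e-28  # Hispanic vs Thai, the largest reported pairwise P value
n_tests = 6
adjusted_p = min(1.0, largest_pairwise_p * n_tests)
print(f"Bonferroni-adjusted P for the weakest pair = {adjusted_p:.2e}")
```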
We genotyped over 14 800 individuals (26 065 chromosomes) at the AR (CAG)n locus in 4 ethnic groups from the HAPO cohort. This is the first study to document AR (CAG)n allele frequencies in 4 populations of this magnitude.
Interethnic differences in the distribution of genetic polymorphisms and how these affect human susceptibility to various disorders, including the development of type 2 diabetes, have been documented (Chambers et al, 2009; Tan et al, 2010). In multiethnic studies on SBMA, the number of AR (CAG)n repeats seen in affected individuals typically ranges from 35 to 57 repeats, but differs by ethnicity (Lund et al, 2001). Differences in the AR (CAG)n allelic distribution, which are expected to result in population-based differences in androgen sensitivity, may be due to either natural selection of alleles by environmental factors or fixation of allele frequency through founder effects.
In this study, Caucasians had the highest number of rare alleles and Afro-Caribbeans had the fewest. This is likely because of different sample sizes for each population as well as the fact that the Afro-Caribbean group is an island population and may thus have a smaller effective population size. Interestingly, within our Caucasian, Hispanic, and Thai populations, a very small number (<0.25%) of individuals have repeat lengths that fall within the disease range for SBMA (≥35 repeats). After sequencing individuals to confirm the actual number of repeats, we identified 19 mothers (9 Thai, 3 Hispanic, 7 Caucasian), 9 female neonates (3 Thai, 3 Hispanic, 3 Caucasian), and 1 male Caucasian neonate with 35–40 repeats. In addition, a small proportion (<2%) of individuals in each ethnicity have very short (≤12) repeats (n = 28–61). These very short repeats have been observed to contribute to phenotypes indicative of AR malfunction, indicating that repeat lengths on either side of the normal range are associated with impaired receptor function (McPhaul et al, 1991). This has been further corroborated by cell culture studies that have revealed a critical range (16–29 CAG repeats) for maintenance of the protein's stability and function (Buchanan et al, 2004).
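The proportion of individuals in the SBMA range can be checked with simple arithmetic. The sketch below assumes that the "<0.25%" figure refers to the three populations pooled and that the denominator is the number of successfully genotyped mothers and offspring in those populations, as given in the abstract. Neither assumption is stated explicitly in the text.

```python
# Individuals with 35-40 repeats, as itemized in the paragraph above.
SBMA_RANGE = {
    "Thai": {"mothers": 9, "female neonates": 3, "male neonates": 0},
    "Hispanic": {"mothers": 3, "female neonates": 3, "male neonates": 0},
    "Caucasian": {"mothers": 7, "female neonates": 3, "male neonates": 1},
}

# Mothers + offspring genotyped per population, from the abstract.
GENOTYPED = {
    "Thai": 1494 + 1742,
    "Hispanic": 780 + 770,
    "Caucasian": 4421 + 3365,
}

carriers = sum(sum(group.values()) for group in SBMA_RANGE.values())
pooled_total = sum(GENOTYPED.values())
pooled_pct = 100 * carriers / pooled_total

print(f"{carriers} of {pooled_total} individuals ({pooled_pct:.2f}%) have 35-40 repeats")
```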
Age of onset of puberty for both males and females varies by ethnicity (NHANES, 1997). These ethnic differences may be explained by differences in hormone levels or in hormone action in target tissues or potentially by the ethnic variation in the AR described here. Furthermore, these ethnic differences in AR (CAG)n may also contribute to interindividual variation in terms of disease risk, disease severity, and drug response. Given the association between AR activity and (CAG)n repeat length, our study suggests that an individual's response to androgen and androgenic therapies might vary based on AR (CAG)n genotypes.
In conclusion, we have shown that there are significant ethnic differences in allele distribution of the AR (CAG)n polymorphism. This finding provides insight into ethnic differences in normal reproductive processes and may help in predicting risk and severity for a number of diseases. Our study provides an important starting point for better understanding the molecular basis underlying ethnic differences in androgen sensitivity and underscores the importance of using large multiethnic cohorts to examine genetic variation at known disease loci.