Fragile X Syndrome: The FMR1 CGG Repeat Distribution Among World Populations


  • Emmanuel Peprah

    1. Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA

Emmanuel Peprah, Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Building 12A, RM 4047, 12 South Dr. MSC 5635, Bethesda, MD 20892, USA. Tel: 301-402-65630; Fax: 301-451-5426; E-mail:

Previous address: Department of Human Genetics, Emory University School of Medicine, 615 Michael Street, RM 335, Atlanta, GA 30322, USA.


Fragile X syndrome (FXS) is characterized by moderate to severe intellectual disability, which is accompanied by macroorchidism and distinct facial morphology. FXS is caused by the expansion of the CGG trinucleotide repeat in the 5′ untranslated region of the fragile X mental retardation 1 (FMR1) gene. The syndrome has been studied in ethnically diverse populations around the world and has been extensively characterized in several populations. Similar to other trinucleotide expansion disorders, the gene-specific instability of FMR1 is not accompanied by genomic instability. Currently we do not have a comprehensive understanding of the molecular underpinnings of gene-specific instability associated with tandem repeats. Molecular evidence from in vitro experiments and animal models supports several pathways for gene-specific trinucleotide repeat expansion. However, whether the mechanisms reported from other systems contribute to trinucleotide repeat expansion in humans is not clear. To understand how repeat instability in humans could occur, the CGG repeat expansion is explored through molecular analysis and population studies which characterized CGG repeat alleles of FMR1. Finally, the review discusses the relevance of these studies in understanding the mechanism of trinucleotide repeat expansion in FXS.


Fragile X syndrome (FXS; OMIM 300624) is caused by the expansion of the CGG repeat in the 5′ untranslated region (UTR) of the fragile X mental retardation 1 (FMR1) gene (OMIM 309550) located on the X chromosome (Fu et al., 1991; Verkerk et al., 1991). The prevalence of FXS is estimated at ∼1/4000 males and ∼1/8000 females, which has been substantiated by other reports (Murray et al., 1996; Turner et al., 1996; Crawford et al., 2001; Garber et al., 2006; Coffee et al., 2009). In over 98% of the cases, FXS is caused by expansion of the triplet repeats; in addition, others have reported that rare single point mutations and genetic variants also cause FXS without expansion of the CGG repeat (De Boulle et al., 1993; Tarleton et al., 2002; Collins et al., 2010). Non-CGG genetic variants account for about 1% of mutations (Collins et al., 2010), with the length of the CGG repeat being the most important genetic variant that causes FXS and determines the carrier status of individuals. For example, individuals with 5–45 copies of the CGG repeat are unaffected, those with 45–54 CGG repeats are called intermediates or “grey zone,” those with 55–199 CGG repeats are classified as premutation carriers, and individuals with >200 CGG repeats are classified as having a full mutation with associated intellectual and developmental disability (Kronquist et al., 2008). The CGG repeat is unstable over a specific threshold; for example, premutation alleles can expand to the full mutation upon transmission from mother to offspring (Fu et al., 1991). Repeat expansions in the intermediate or “grey zone” have variable expansion characteristics; this is attributed to familial factors that influence the stability of the repeat upon transmission to offspring (Nolin et al., 1996). Examination of gametes from foetuses that harbour the FXS mutation showed that the FMR1 mutation exists in maternal oocytes in the unmethylated state (Malter et al., 1997).
Individuals with FXS receive the full mutation allele from their mothers, because sperm from full mutation males carry only premutation alleles; however, some reports demonstrate that asymptomatic males can transmit the full mutation to offspring (Zeesman et al., 2004).
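
The repeat-size classes quoted above can be written as a small lookup. The Python sketch below is illustrative only: the function name `classify_cgg_allele` is hypothetical, the boundaries follow the ranges given in the text (Kronquist et al., 2008), and the handling of exactly 45 repeats (treated here as intermediate) and exactly 200 repeats (treated here as a full mutation) is an assumption, because the quoted ranges leave those two values ambiguous.

```python
def classify_cgg_allele(repeats: int) -> str:
    """Return the allele class for an FMR1 CGG repeat count.

    Boundaries follow the ranges quoted in the text; the treatment of
    exactly 45 and exactly 200 repeats is an assumption of this sketch.
    """
    if repeats < 5:
        return "below reported normal range"
    if repeats < 45:
        return "normal"
    if repeats <= 54:
        return "intermediate (grey zone)"
    if repeats < 200:
        return "premutation"
    return "full mutation"


# Example: classify a few illustrative repeat counts.
for n in (30, 45, 54, 55, 199, 230):
    print(n, classify_cgg_allele(n))
```

A clinical laboratory would apply its own validated cut-offs; the point here is only that classification depends on repeat length alone, not on the interruption pattern discussed later.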

The lengthening of the CGG repeat, the cause of FXS, is hypothesized to occur with the addition of length-specific interruptions (e.g., AGG, CGA, or CGGG) at the distal end of the CGG array with incremental additions of smaller CGG arrays (Eichler et al., 1995a). The molecular basis of CGG repeat lengthening is suggested to have arisen from independent mutational events with rapid proliferation of interspersion events (Eichler et al., 1995a). Homogeneity of the interspersions is incompatible with known rates of mutation and random mutation theory, suggesting a short evolutionary period for polarized lengthening of the CGG repeat (Miyamoto et al., 1987; Eichler et al., 1995b). This polarized lengthening mechanism could have occurred via recombination (i.e., unequal chromatid exchange), gene conversion, or replication slippage, suggesting a complex mutational history in primates (Eichler et al., 1995b).

Genetic Basis

The protein product of the FMR1 gene, fragile X mental retardation protein (FMRP), is a highly conserved protein found in primate species and other mammals (Eichler et al., 1995b). FMRP is an mRNA binding protein expressed in various tissues and is essential for neuronal and intellectual development (Bassell & Warren, 2008). FMRP inhibits translation of numerous genes involved in synaptic plasticity by altering the expression of these genes via mRNA sequestration (Bassell & Warren, 2008). The localization of FMRP with the polyribosomes of dendritic spines suggests that FMRP can regulate local protein synthesis important for spine development and synaptic plasticity, which are essential processes for learning and intellectual development (Antar & Bassell, 2003; Antar et al., 2005). In the absence of FMRP, dysregulation of local translation of mRNA occurs, leading to imbalance in the spatial and temporal control of protein levels at synaptoneurosomes (Muddashetty et al., 2007). Individuals with FXS display long, thin, and immature dendritic spines, which are similar to the dendritic spine morphology of Fmr1 knockout (KO) mice (Comery et al., 1997; Grossman et al., 2006; Mineur et al., 2006; Baker et al., 2010). In addition, Fmr1 KO mice display learning behaviours that are also associated with FXS (Grossman et al., 2006; Mineur et al., 2006; Baker et al., 2010).

The CGG repeat in FMR1 is transcribed into mRNA, but the translation initiation site is downstream of the CGG repeat; thus the repeat is not translated (Tassone et al., 2011). The length of the CGG repeat has been shown to be inversely associated with translational efficiency, as shorter CGG repeats allow for efficient translation (Ludwig et al., 2011; Tassone et al., 2011). Beyond a certain threshold, the length of the CGG repeat decreases translational efficiency, resulting in increased FMR1 expression but decreased FMRP production (Tassone et al., 2007; Peprah et al., 2010a). When the FMR1 CGG repeat expands to the full mutation, methylation of the CGG repeats occurs. The expanded CGG tract is recognized as a CpG island, which significantly decreases transcription of FMR1, resulting in ablation of FMRP expression (Godler et al., 2010).

Clinical Manifestations

Premutation carriers have increased FMR1 transcript levels with decreased FMRP levels (Tassone et al., 2007). FXS adult males tend to be tall and have macroorchidism, a prominent forehead, a long narrow face with a highly arched palate, a prominent mandible, and large ears, which become more pronounced with age (Terracciano et al., 2005). Females with FXS have the typical long face and mandibular prognathism phenotype seen in affected males, and large everted ears (Terracciano et al., 2005). Affected individuals of both sexes also have delayed speech and intellectual disability with an IQ range between 20 and 70 (Terracciano et al., 2005). Mosaicism of FXS has been observed; these individuals have IQs that vary from high functioning to moderate or low functioning (Fengler et al., 2002; Han et al., 2006). Psychiatric and mood disorders have been examined in premutation carriers. Several reports indicate a significant association of psychiatric and mood disorders with premutation status in both male and female carriers (reviewed by Bourgeois et al., 2009). Further work is needed to distinguish disorders not associated with premutation carrier status (i.e., environmental cause and/or life circumstances) from psychiatric disorders attributed to the FMR1 premutation allele (Bourgeois et al., 2009).

Female premutation carriers

Traditionally, it was believed that carriers of the FMR1 premutations were clinically normal; however, recent data has indicated that these individuals have problems associated with their carrier status. Recently, increased psychological symptoms in premutation carriers have been reported (Hessl et al., 2005). In females, one third of individuals with the full mutation have mild intellectual impairment with associated behaviours including shyness, poor eye contact, and learning disabilities (Terracciano et al., 2005).

The length of the CGG repeat contributes to the variation in age at menopause. The FMR1 repeat size in the intermediate or grey zone is associated with an increased risk of fragile X-associated premature ovarian insufficiency (FXPOI; Bretherick et al., 2005; Bodega et al., 2006). FXPOI is defined as menopause before the age of 40 associated with FMR1 premutation carrier status (Kenneson & Warren, 2001; Bodega et al., 2006; De Caro et al., 2008). When the FMR1 repeat size exceeds 79 CGG repeats, the risk for ovarian dysfunction is clinically significant; however, this risk appears to plateau or decrease among women with very long CGG repeats (Ennis et al., 2005; Sullivan et al., 2005).

Several groups have demonstrated that female premutation carriers have a higher incidence of FXPOI when compared to women in the general population (Kenneson & Warren, 2001; Bodega et al., 2006; De Caro et al., 2008). It is estimated that approximately 20–28% of female premutation carriers manifest FXPOI (Oostra & Willemsen, 2003; Welt et al., 2004). The hormonal changes exhibited by these women are consistent with early ovarian aging attributed to decreased follicle number and function (Welt et al., 2004).

Clinical effects of FXPOI are loss of fertility and hypoestrogenism (Woad et al., 2006; De Caro et al., 2008). Because of the serious consequences of FXPOI, women who experience ovarian dysfunction atypical for their age without another medical explanation are being tested in increasing numbers for the FMR1 premutation (Pastore et al., 2006). However, pregnancy has occurred in 5–10% of women whose diminished ovarian function led to a diagnosis of FXPOI (Kalantaridou et al., 1998; Woad et al., 2006).

Male premutation carriers

Male carriers of premutation alleles exhibit mechanistically distinct problems from female carriers (Terracciano et al., 2005). Evidence suggests that premutation males have a reduced ability to recruit the left hippocampus during recall (Koldewyn et al., 2008). Premutation males performed significantly worse on immediate recall tasks compared to age matched controls (Koldewyn et al., 2008). Examination via functional magnetic resonance imaging in premutation males indicated a reduced amygdala volume, with reduced FMRP expression being one of the primary factors for alteration of brain function and behaviour (Hessl et al., 2011).

Fragile X-associated tremor/ataxia syndrome (FXTAS) is estimated to occur in 30% of male premutation carriers (Hagerman et al., 2001; Hagerman et al., 2008). FXTAS is a significant cerebral and cerebellar white matter disease, and affected males exhibit signs of onset of tremor in their 50s with gradual progression of symptoms to incorporate ataxia (Hagerman et al., 2001; Hessl et al., 2005; Greco et al., 2006). The neuropathological characteristics of FXTAS have been extensively characterized (Hagerman et al., 2001; Hessl et al., 2005; Greco et al., 2006). Neurohistological studies of the brains of symptomatic elderly premutation carriers have demonstrated that neuronal degeneration occurs with the presence of eosinophilic intranuclear inclusions in both neurons and astroglia (Oostra & Willemsen, 2003; Greco et al., 2006; Iwahashi et al., 2006). Iwahashi et al. (2006) examined the inclusions in the brains of premutation elderly males and found several inclusion-associated proteins. Surprisingly, there were no dominant protein species in the inclusions and ubiquitinated proteins represented a minor component (Hagerman et al., 2001; Greco et al., 2006; Iwahashi et al., 2006). In FXTAS, inclusion formation is not because of a lack of proteasomal degradation of nuclear proteins but attributed to a gain of function by the FMR1 transcript (Handa et al., 2005; Garber et al., 2006). Female carriers also develop FXTAS, but the symptoms are less severe compared to male premutation carriers (Hagerman et al., 2004).

Genetic Studies of the FMR1 CGG Repeat in Diverse Populations

FXS has been studied extensively in several western European populations. In most studies, analysis of the CGG repeat number has been undertaken due to its ability to expand to the full mutation and because of the corresponding associated diseases found in premutation carriers (Willemsen et al., 2011). In addition, various methods have been used to determine the CGG repeat sizes in different reports making cross-population comparisons difficult; however several reports used protocols by Fu et al. (1991) making cross-population comparisons at the CGG repeat possible. In FMR1, 30 and 29 copies of CGG repeats are the most common numbers of repeats found in western European ancestry populations (Buyle et al., 1993; Oudet et al., 1993a, b; Malmgren et al., 1994; Tranebjaerg et al., 1994; Matilainen et al., 1995; Syrrou et al., 1996; Arrieta et al., 1999; Table 1). There is substantial evidence of a strong founder effect in western European populations (Chakravarti, 1992; Richards et al., 1992; Buyle et al., 1993; Oudet et al., 1993b; Malmgren et al., 1994; Chiurazzi et al., 1996b). However, the founder effect is not present in eastern European populations of Slavic origin (Dokić et al., 2008). Within western European populations, significant differences in allelic and haplotypic distributions exist between normal chromosomes found in the general population and chromosomes that harbour the full mutation which causes FXS (Rousseau et al., 1995; Crawford et al., 2001). This particular distribution of normal and fragile X chromosomes is hypothesized to occur because a limited number of primary events may have been at the origin of most present-day chromosomes that harbour the full mutation in founder western European populations (Chakravarti, 1992; Morton & Macpherson, 1992). 
Such founder chromosomes may have carried a number of CGG repeats in an upper-normal range or “grey zone,” from which recurrent multistep expansion mutations could have arisen (Buyle et al., 1993; Oudet et al., 1993a, b; Malmgren et al., 1994).

Table 1. Distributions of CGG repeat alleles among world populations, adapted from Sharma et al. (2001).

| Country (population) | Sample size | No. of CGG repeat variants (range) | Common repeat(s) | Reference |
|---|---|---|---|---|
| Brazil (African Brazilians) | 255 | 26 (15–43) | 30, 29 | Mingroni-Netto et al. (2002) |
| Brazil (Amerindians) | 46 | 26 (15–43) | 30, 29 | Mingroni-Netto et al. (2002) |
| Brazil (European Brazilians) | 64 | 26 (15–43) | 30, 20 | Mingroni-Netto et al. (2002) |
| Cameroon (Bamileke, Bororo, and Sanga) | 47 | 22 (22–41) | 29, 30 | Chiurazzi et al. (1996a) |
| Chile (European and Asian ancestry) | NR (192 X chrs screened) | 19 (19–44) | 30, 29 | Jara et al. (1998) |
| China (Chinese ancestry) | 177 | 16 (19–40) | 29, 30 | Zhou et al. (2006) |
| Croatia (European ancestry) | 74 | 26 (20–45) | 30, 31 | Dokić et al. (2008) |
| Ghana (multi-ethnic) | 350 | 23 (18–54) | 30, 29 | Peprah et al. (2010b) |
| India (multi-ethnic) | 265 | 26 (19–50) | 29, 28 | Sharma et al. (2001) |
| India (multi-ethnic) | 99 | 15 (19–40) | 30, 29 | Zhou et al. (2006) |
| Indonesia (multi-ethnic) | 1069 | 32 (NR) | 29, 30 | Faradz et al. (2000) |
| Indonesia (Malay) | 178 | 18 (19–40) | 29, 30 | Zhou et al. (2006) |
| Japan (Japanese ancestry) | 946 | 24 (6–54) | 27, 26 | Otsuka et al. (2010) |
| Mexico (Mestizos) | 207 | 23 (15–87) | 32, 30 | Barros-Nunez et al. (2008) |
| Mexico (Tarahumaras) | 140 | 13 (15–87) | 32, 30 | Barros-Nunez et al. (2008) |
| Mexico (Huichols) | 138 | 14 (19–87) | 30, 29 | Barros-Nunez et al. (2008) |
| Mexico (Western region) | 129 | 23 (16–76) | 32, 30 | Rosales-Reynoso et al. (2005) |
| United Kingdom (multi-ethnic) | 254 | 30 (13–49) | 30, 29 | Jacobs et al. (1993) |
| United States (African American) | 213 | 32 (14–55) | 30, 29 | Crawford et al. (2000a) |
| United States (European ancestry) | 200 | 39 (11–56) | 30, 29 | Crawford et al. (2000a) |

Faradz et al. (2000) conducted an extensive survey of male samples in 12 subpopulations in Indonesia. In the population, 32 different CGG repeat alleles were present (Faradz et al., 2000). Twenty-nine and 30 CGG repeats accounted for 72% of the alleles present in the population. The 29-repeat allele was the most frequent, as in Chinese ancestry populations (Faradz et al., 2000; Zhou et al., 2006). The Indonesian population showed a much lower frequency of CGG repeat alleles with fewer than 29 repeats and a higher frequency of alleles with 36 or more repeats when compared to western European ancestry populations (Faradz et al., 2000). The data were similar to those from other Asian populations, in which the 29 allele is present at a higher frequency than the 30 allele (Faradz et al., 2000; Zhou et al., 2006; Chiu et al., 2008; Table 1). FXS is present in 2.8–8.6% of intellectually disabled institutionalized males in the Japanese and Chinese populations (Arinami et al., 1986; Zhong et al., 1995). In the Chinese populations, the most common CGG repeat alleles are 29 followed by 30 (Zhong et al., 1994; Zhong et al., 1995; Tzeng et al., 1999).
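
The quantities reported in these surveys (number of distinct alleles, size range, and the most common repeats) can be derived from a list of genotyped repeat counts. The following Python sketch shows one way to do this; the function name `allele_summary` and the sample values are hypothetical and do not reproduce data from any cited study.

```python
from collections import Counter


def allele_summary(repeat_counts, top=2):
    """Summarise a sample of CGG repeat counts (one per X chromosome).

    Returns (number of distinct alleles, (min, max), [(allele, frequency), ...])
    with the `top` most frequent alleles listed first.
    """
    counts = Counter(repeat_counts)
    total = sum(counts.values())
    # Sort by descending count, then by ascending allele size for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    common = [(allele, n / total) for allele, n in ranked[:top]]
    return len(counts), (min(counts), max(counts)), common


# Hypothetical sample of 20 X chromosomes (illustrative values only).
sample = [29] * 8 + [30] * 6 + [31] * 2 + [20, 23, 36, 41]
n_alleles, size_range, common = allele_summary(sample)
print(n_alleles, size_range, common)
```

Because studies differ in sizing method, summaries of this kind are only comparable across populations when the underlying genotyping protocols are comparable.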

In Mexican populations, the trinucleotide repeat number varied from 16 to 40 (Rosales-Reynoso et al., 2005). A modal repeat number of 32, a second peak at 30, and a minor peak at 34 were detected within this population (Rosales-Reynoso et al., 2005). The 32 repeat is the most frequent allele for Mestizos and Tarahumaras in the Mexican population (Barros-Nunez et al., 2008). Huichols display the 30 and 29 profile found in other populations (Barros-Nunez et al., 2008). Note that 10.5% of the Mexican population had larger repeats (i.e., 34+ repeats), which is similar to patterns observed in Indonesian and Chinese ancestry populations (Rosales-Reynoso et al., 2005). Rosales-Reynoso et al. (2005) concluded that the Mexican population, with a significant number of large alleles (34–40), would be at a higher risk for allelic expansion. However, cytogenetic expression of the Xq27.3 fragile site showed no statistical differences when compared with those from other populations (Diaz-Gallardo et al., 1995; Gonzalez-del Angel et al., 2000).

Data collected in Brazil among different ethnic groups, including samples from quilombos, Amerindians, and the ethnically mixed but mainly European-derived population of Sao Paulo, revealed that the 30 CGG repeat allele of FMR1 was the most frequent in all the groups. A second peak at 20 repeats was present in the population of Sao Paulo only, confirming this peak as a western European peculiarity (Mingroni-Netto et al., 1999; Mingroni-Netto et al., 2002; Angeli & Capelli, 2005). Similar to the Brazilian study, studies conducted in the Chilean population showed that the most common CGG repeat allele was 30, with 29 being the second most common (Aspillaga et al., 1998; Jara et al., 1998; Arrieta et al., 1999).

Molecular screening of institutionalized populations in India revealed that the prevalence of FXS was 7–8% (Sharma et al., 2001). In the population studied, 26 distinct alleles were present, ranging from 19 to 50 repeats (Sharma et al., 2001). The most frequent allele size in the population was 29 repeats, followed by 28 repeats, and minor peaks at 30 and 31 repeats were also observed (Sharma et al., 2001; Zhou et al., 2006). The frequency of FXS was fourfold higher in males than in females; however, because of the stringent criteria employed in the Indian study, comparison cannot be made with studies of institutionalized populations conducted in Western countries. The Western studies included all unexplained intellectual disability cases, whereas the Indian study included only mildly to moderately intellectually disabled individuals with or without a family history and a fragile X clinical phenotype (Sharma et al., 2001).

Studies of the frequency of the fragile X allele in African ancestry populations are few in number (Chiurazzi et al., 1996a; Eichler & Nelson, 1996; Kunst et al., 1996; Peprah et al., 2010b); however, African American (AA) FMR1 alleles have been well characterized (Crawford et al., 1999; Crawford et al., 2000a, c; Crawford et al., 2002). In AAs, 37 distinct repeat sizes are present (Crawford et al., 2002). The prominent peak was a CGG repeat of 30, followed by 29 and 31 repeats (Crawford et al., 2002). Twenty different CGG repeat-size alleles and 55 different CGG structures were identified in AAs, who showed greater heterozygosity than other populations (Crawford et al., 2000c). The African study by Chiurazzi et al. (1996a) demonstrated that the predominant repeat sizes were 29 and 30 repeats, with 31 and 32 repeats also high in frequency. In Ghanaians, the distribution of CGG repeats is similar to that in AAs, with 30 and 29 CGG repeats being the most frequent alleles (Peprah et al., 2010b). The Ghanaian study has provided significant insight into the frequency of CGG repeats in this African population. Characterization of the FMR1 CGG repeat in diverse populations is starting to occur. Substantial ascertainment of diverse populations is needed before a thorough understanding of CGG repeat instability in world populations can be reached.
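
Heterozygosity comparisons such as the one above rest on allele frequencies. As a minimal sketch, the Python function below computes Nei's unbiased gene diversity, n/(n − 1) × (1 − Σp²), from a list of alleles. The choice of this estimator is an assumption made for illustration and may differ from the statistic used in the cited studies; the function name and the two samples are hypothetical.

```python
from collections import Counter


def gene_diversity(alleles):
    """Nei's unbiased gene diversity: n/(n-1) * (1 - sum(p_i^2)).

    `alleles` holds one allele label (e.g. a CGG repeat count or a
    repeat structure string) per sampled chromosome.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


# Hypothetical samples: one dominated by a single allele, one more diverse.
narrow = [30] * 9 + [29]
broad = [29, 30, 31, 32, 29, 30, 20, 23, 41, 36]
print(round(gene_diversity(narrow), 3), round(gene_diversity(broad), 3))
```

A sample spread across many alleles yields a value closer to 1, which is the sense in which a population with more repeat sizes and structures shows greater heterozygosity.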

Prevalence of the FMR1 Mutation in Diverse Populations

Several studies have elucidated the haplotype background of the FMR1 instability in unaffected and affected populations. In many cases the data could not be compared between studies containing different populations because of the diverse methods used for genotyping. These include different haplotype reconstruction schemes, differences in the published nomenclature used for flanking markers, and the use of different numbers of short tandem repeats (STRs; e.g., two flanking markers instead of the commonly used three STRs). Many studies consisted of screenings of institutionalized individuals with intellectual disabilities; without further analysis, it was not possible to calculate prevalence estimates from them. These investigations yielded cursory confirmation of FXS but could not be extrapolated to the general population. Reports that address most of these issues and produce prevalence estimates abound, but one limitation is that these reports used populations of primarily European ancestry, with few exceptions (Hill et al., 2010). Despite these issues, we attempted to summarize the current literature on FXS prevalence rates worldwide (Table 2). Table 2 indicates that the majority of the studies being conducted in non-European populations are currently in their infancy.

Table 2. Reported prevalence estimates of the fragile X syndrome among world populations, adapted from Crawford et al. (2001).

| Country | Population | No. positive/no. tested | Estimated prevalence: general population | Estimated prevalence: targeted population (%) | Reference |
|---|---|---|---|---|---|
| Australia¹ | SN | 10/472 | 1/4350 | 2.1 | (Turner et al., 1986; Turner et al., 1996) |
| Brazil | SN | 0/83 | | | (Mulatinho et al., 2000) |
| Brazil | SN | 5/256 | | 2.0 | (Haddad et al., 1999) |
| Canada | GP | 1/24,446 | 1/24,446 | | (Rousseau et al., 2007) |
| Chile | SN | 4/214 | | 1.9 | (Aspillaga et al., 1998) |
| Croatia | SN | 4/114 | | 0.9–2.6 | (Hecimovic et al., 2002) |
| Croatia | SN | 14/81 | | 17.3 | (Hecimovic et al., 1998) |
| Estonia | GP, SN | 14/516 | 1/27,115 | 2.7 | (Puusepp et al., 2008) |
| Egypt | SN | 34/200 | | 5.9 | (Behery, 2008) |
| France | SN | 10/403 | | 2.5 | (Gerard et al., 1997) |
| Greece and Cyprus | CR | 8/611 | 1/4246 | 1.3 | (Patsalis et al., 1999) |
| Guadeloupe, FWI | IR | 11/163 | 1/2359 | 6.7 | (Elbaz et al., 1998) |
| India | SN | 3/146 | | 2.5 | (Pandey et al., 2002) |
| India | CR | 19/360 | | 5.3 | (Jain et al., 1998) |
| Israel³ | GP | 3/14,334 | | 1.5 | (Toledano-Alhadef et al., 2001) |
| Iran | GP, SN, CR | 32/508 | | 3.4–15.3 | (Pouya et al., 2009) |
| India | SN | 9/93 | | 9.7 | (Sharma et al., 2001) |
| Japan | GP | 0/946 | 1/10,000 | | (Otsuka et al., 2010) |
| Japan | SN | 2/256 | | 0.8 | (Nanba et al., 1995) |
| Korea | SN | 4/65 | | 6.15 | (Yim et al., 2008) |
| Kuwait | SN | 11/182 | | | (Bastaki et al., 2004) |
| Mexico | GP, SN | 0/129 | | | (Rosales-Reynoso et al., 2005) |
| Mexico | CR | 2/53 | | 3.8 | (Gonzalez-del Angel et al., 2000) |
| Netherlands | CR | 10/197 | | 5.1 | (van den Ouweland et al., 1994) |
| Netherlands | SN | 9/866 | 1/6045 | 2.0–2.4 | (de Vries et al., 1997) |
| Poland | GS, SN | 6/201 | | 3.0 | (Mazurczak et al., 1996) |
| Saudi Arabia | SN | 12/94 | | 6.4 | (Al Husain et al., 2000) |
| Spain | SN | 8/92 | | 8.7 | (Arrieta et al., 1999) |
| Spain | GP | 2/5000 | 1/2466 | | (Rife et al., 2003) |
| Spain | SN | 11/182 | | 6.0 | (Mila et al., 1997) |
| Spain² | GS, SN, CR | 5/180 | 1/6200–1/8200 | 2.7 | (Millan et al., 1999) |
| South Africa | SN | 9/148 | | 6.1 | (Goldman et al., 1997; Goldman et al., 1998) |
| Tasmania | GP, SN | 0/1253 | | | (Mitchell et al., 2004) |
| Taiwan | GP | 1/10,046 | 1/10,000 | | (Tzeng et al., 2005) |
| Taiwan | SN | 4/206 | | 1.9 | (Tzeng et al., 2000) |
| Thailand | SN | 5/94 | | 5.3 | (Ruangdaraganon et al., 2000) |
| Turkey | CR | 5/166 | | 3.0 | (Tuncbileck et al., 1999) |
| Turkey | SN | 14/120 | | 11.7 | (Demirhan et al., 2003) |
| Yugoslavia⁴ | SN | 2/97 | | 2.06 | (Major et al., 2003) |
| United Kingdom¹ | GS | 4/180 | 1/8918 | 2.2 | (Jacobs et al., 1993) |
| United Kingdom | SN | 1/138 | | 0.7 | (O’Dwyer et al., 1997) |
| United Kingdom | SN | 4/103 | 1/4130 | 3.9 | (Slaney et al., 1995) |
| United Kingdom | SN | 20/3738 | 1/5530 | 0.5 | (Youings et al., 2000; Murray et al., 1996) |
| USA | GS | 7/2324 | 1/2545–1/3717 | 0.3–0.4 | (Crawford et al., 2002) |
| USA | GP | 7/36,124 | 1/5161 | | (Coffee et al., 2009) |
| USA | GP, GS, CR | 1226/119,232 | | 0.61–1.4 | (Strom et al., 2007) |
| USA | CR | 10/188 | | 3.7 | (Kaplan et al., 1994) |

  1. Only point estimate provided.
  2. Provided a range, not a point estimate; Millan et al. (1999) acknowledge that persons with mild intellectual disability could have been missed.
  3. Calculated based on premutation carriers (n = 207).
  4. Calculated based on available data.

GP, general population; GS, general special needs population; SN, special needs population with intellectual disability; CR, clinical referral for individuals with intellectual disability of unknown aetiology.

Several different populations have been surveyed for the FMR1 premutation which includes extensive research on intellectually disabled individuals in diverse populations (Arinami et al., 1986; Jacobs et al., 1986; Zhong et al., 1995; Elbaz et al., 1998; Crawford et al., 1999). Children with learning disabilities have also been tested for the FMR1 full mutation (Webb et al., 1986; Slaney et al., 1995; Crawford et al., 1999). Screening for the FMR1 mutation is occurring beyond institutionalized individuals with intellectual disability, to encompass women of reproductive age (Hill et al., 2010).

General population surveys have been carried out in western European ancestry populations and have contributed to accurate calculations of prevalence estimates. The lowest prevalence estimates for FXS have been reported in Canada, Estonia, Japan, and Taiwan (Table 2). The prevalence estimates for these countries were significantly lower when compared to the other western countries which have carried out fragile X testing (Crawford et al., 2001). Since 2008, other reports from countries including Egypt and Iran characterizing the FMR1 mutation in special needs populations have been published (Table 2). This suggests that (1) diagnostic criteria for FXS are becoming widely accepted, (2) characterization of the FMR1 CGG repeat is recognized as a method to determine the aetiology of intellectual disability in diverse populations, and (3) the method is cost effective and accurate. These are a few of the parameters that must be met by the various screening methodologies for the protocols to be adopted and used in population screening for the FMR1 mutation (Pembrey et al., 2001). As more reports on the distribution of CGG repeats from normal, premutation, and full mutation individuals in diverse populations are produced, these data can be compared to those from well-characterized (e.g., western European) populations to gain a better understanding of the frequency of CGG repeat expansion variants of the FMR1 locus in diverse populations. This information will be important in (1) understanding genetic instability at the locus, (2) elucidating cis elements which are associated with genetic instability, and (3) understanding CGG expansion risk, which could be of interest to genetic counsellors and also to FXS families and premutation carriers who would eventually want to have children.
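
To make the arithmetic behind a general-population estimate concrete, the Python sketch below converts screening counts into a point prevalence and an approximate 95% interval. The Wilson score interval is used here as an illustrative choice and is not necessarily the method used by the cited studies; the function name is hypothetical. The example counts (7 positives among 36,124 screened) are taken from the Coffee et al. (2009) row of Table 2.

```python
import math


def prevalence_with_wilson_ci(positives, tested, z=1.96):
    """Return (point estimate, lower bound, upper bound) for a proportion.

    Uses the Wilson score interval, which behaves better than the simple
    normal approximation when the number of positives is small.
    """
    if tested <= 0 or not 0 <= positives <= tested:
        raise ValueError("invalid counts")
    p = positives / tested
    denom = 1 + z * z / tested
    centre = (p + z * z / (2 * tested)) / denom
    half = z * math.sqrt(p * (1 - p) / tested + z * z / (4 * tested * tested)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Counts from Table 2: 7 positives among 36,124 individuals screened.
p, lower, upper = prevalence_with_wilson_ci(7, 36124)
print(f"point estimate: about 1 in {round(1 / p)}")
print(f"approximate 95% interval: 1 in {round(1 / upper)} to 1 in {round(1 / lower)}")
```

The point estimate reproduces the 1/5161 figure listed in Table 2. With so few positive cases the interval is wide, which is one reason estimates from small or targeted samples are difficult to compare across populations.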

Factors Associated with Repeat Instability

Several different populations have been surveyed to determine the role in which cis-elements contribute to the expansion of CGG repeats, using population-based or targeted studies which include intellectually disabled individuals with and without full mutations (Arinami et al., 1986; Jacobs et al., 1986; Elbaz et al., 1998; Crawford et al., 1999). At present, the evidence supports both a cis model (chromosomal structure and genetic elements listed in Table 3) and a trans model (DNA replication and repair enzymes listed in Table 4) in expansion disorders. Because of the enigmatic nature of FXS and other trinucleotide repeat disorders, a “unified” model is needed to describe the instability encompassing both cis elements and trans factors.

Table 3. Chromosomal elements that affect FXS repeat instability.

| Factors | Effect on repeat expansion (somatic) | Reference |
|---|---|---|
| Repeat length | Increase or decrease expansion | (Eichler et al., 1994) |
| Number of interruptions within the repeat | Increase repeat stability | (Eichler et al., 1996) |
| CGG repeat purity of the repeat at the 3′ end | Decrease stability | (Crawford et al., 2000c) |
| 5′ position of the first AGG interruption | Increase stability | (Eichler et al., 1995a) |
| Haplotype background of the mutation | Increase or decrease stability | (Kunst & Warren, 1994) |
| SNP associated with expansion (ss71651738) | Increase repeat instability | (Ennis et al., 2007) |

Table 4. Genes associated with trinucleotide repeat instability, adapted from Kovtun & McMurray (2008).

| Gene | Effect on repeat expansion | System | Reference |
|---|---|---|---|
| ATR (Mec1) | Increase in repeat expansions* | FXS mouse | (Entezam & Usdin, 2009) |
| ATM | Increase in repeat expansions | FXS mouse | (Entezam & Usdin, 2009) |
| FEN1 (Rad27) | Increase in repeat expansions* | Yeast/HD mouse | (Spiro et al., 1999) |
| MSH2 | Decrease in repeat expansions* | DM/HD mouse | (Savouret et al., 2003; Pearson et al., 1997; Tome et al., 2009) |
| MSH3 | Decrease in repeat expansions* | DM/HD mouse | (Manley et al., 1999; Foiry et al., 2006; Owen et al., 2009) |
| MSH6 | Decrease in repeat expansions | DM/HD mouse | (Savouret et al., 2003) |
| OGG1 | Decrease in repeat expansions | HD mouse | (Kovtun et al., 2007) |
| Pms2 | Decrease in repeat expansions | DM mouse | (Gomes-Pereira et al., 2004) |

*Absence of gene has been shown to affect intergenerational expansions.

ATR, Ataxia telangiectasia and Rad3 related kinase; DM, myotonic dystrophy; Fen-1, Flap Endonuclease; HD, Huntington disease; Msh2, MutS homologue 2; Msh3, MutS homologue 3; Msh6, MutS homologue 6; OGG1, 7,8-dihydro-8-oxo-guanine-DNA glycosylase; Pms2, Postmeiotic segregation increased 2.

Current data suggest three mutational pathways that could explain the stepwise progression to the full mutation allele (Eichler et al., 1996; Crawford et al., 2000c). These pathways were identified via haplotype associations based on three STRs flanking the FMR1 CGG repeat: DXS548, FRAXAC1, and another dinucleotide microsatellite, FRAXAC2 (a description of each STR can be found in Peprah et al., 2010b). The mutation pathway for each haplotype relies mainly on the multiallelic model of CGG repeat expansion, in which loss of AGG interruption and addition of CGG repeats eventually result in the full mutation (Morton & Macpherson, 1992; Eichler et al., 1996). The first pathway, represented by the 2–1–3 haplotype, was associated with highly interrupted CGG repeats containing several AGG interspersions; alleles on this haplotype were proposed to retain their AGG interruptions while slowly expanding into the intermediate CGG repeat range through addition of CGGs at the polar end (i.e., the 3′ end of the repeat tract; Eichler & Nelson, 1996). The second pathway, the 6–4–5 haplotype, was associated with “asymmetrical” CGG repeat patterns and was hypothesized to progress rapidly towards CGG expansion because loss of the AGG interruption within the CGG repeat allows alleles on this haplotype to bypass intermediate CGG repeats (Eichler et al., 1996). The third pathway, the 4–4–5 haplotype, suggested that the absence of AGG interruption in the CGG array (i.e., AGG interruption at the 5′ end of the CGG repeat) increased instability of the repeat (Crawford et al., 2000c). Each expansion pathway was hypothesized to result from a different mutational process, and each process could involve several mechanisms that mediate the mutation (Snow et al., 1993; Eichler et al., 1994; Kunst & Warren, 1994; Zhong et al., 1995; Eichler et al., 1996; Gunter et al., 1998; Crawford et al., 2000b, c). If one mechanism were the initial predisposing factor, it might not be the primary mechanism by which the CGG repeat reaches the premutation threshold. The exact expansion mechanism(s) remain to be elucidated.
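The interruption patterns discussed above can be made concrete with a small illustrative sketch. It is not taken from any of the cited studies: the function name `summarize_repeat_tract`, the input format (a repeat tract written 5′ to 3′ as a string of CGG and AGG triplets), and the choice to count AGG triplets toward the total repeat length are all assumptions made for illustration. The sketch reports the number of AGG interruptions and the length of each uninterrupted CGG run, including the pure CGG run at the 3′ end, which is the region where CGG addition is proposed to occur.

```python
def summarize_repeat_tract(seq):
    """Summarize AGG interruptions in a CGG repeat tract written 5' to 3'.

    Hypothetical helper for illustration only; the input format and the
    returned field names are assumptions, not an established convention.
    """
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("tract length must be a multiple of 3")

    # Split the tract into consecutive triplets.
    triplets = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    unexpected = set(triplets) - {"CGG", "AGG"}
    if unexpected:
        raise ValueError(f"unexpected triplets: {sorted(unexpected)}")

    # Record the length of each uninterrupted CGG run, ordered 5' to 3'.
    # Every AGG closes the current run; the final run is the 3' end.
    runs = []
    current = 0
    for triplet in triplets:
        if triplet == "CGG":
            current += 1
        else:
            runs.append(current)
            current = 0
    runs.append(current)

    return {
        # Assumption: AGG triplets are counted toward the total repeat length.
        "total_repeats": len(triplets),
        "agg_count": triplets.count("AGG"),
        "cgg_runs": runs,
        "three_prime_pure_cgg": runs[-1],
    }


# Two hypothetical alleles of equal total length: one with two AGG
# interruptions, and one in which the 3'-most AGG is replaced by CGG.
interrupted = "CGG" * 9 + "AGG" + "CGG" * 9 + "AGG" + "CGG" * 9
lost_agg = "CGG" * 9 + "AGG" + "CGG" * 19

print(summarize_repeat_tract(interrupted))
print(summarize_repeat_tract(lost_agg))
```

In this toy comparison both alleles have the same total length, but losing the 3′-most AGG lengthens the uninterrupted 3′ CGG run from 9 to 19 repeats. The sketch deliberately applies no instability threshold, since no such cutoff is stated here.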

Haploinsufficiency in DNA Repair/Replication Proteins

FXS, like other trinucleotide repeat expansion disorders, is locus specific. The lack of genomewide instability of the kind observed in some cancers (Foulkes, 2008) suggests that the mechanism of repeat expansion might not include mutations in trans-acting factors (Mirkin, 2006). Locus-specific expansions imply participation of DNA repair/replication proteins in the expansion process (reviewed by McMurray, 2010). Many enzymes are involved in DNA repair and replication, including those that resolve stalled replication forks and those important in replication repair. Such enzymes include ATR, ATM, MSH2, and MSH3 (Pearson et al., 1997; Spiro et al., 1999; Entezam & Usdin, 2008; Table 4). ATR is known to play a role in the resolution of stalled replication forks and the removal of DNA lesions. ATR haploinsufficiency is reported to increase intergenerational expansion of CGG repeats with a maternal bias (Entezam & Usdin, 2008). In contrast, ATM haploinsufficiency is associated with repeat expansion with a significant paternal bias (Entezam & Usdin, 2009). The ATR-sensitive mechanism is hypothesized to act on maternal transmission, whereas the ATM-sensitive mechanism shows a male expansion bias (Entezam & Usdin, 2009). The roles of MSH2 and MSH3 in trinucleotide repeat instability have been extensively reviewed by others (Brouwer et al., 2009; McMurray, 2010). The model of trinucleotide expansion via haploinsufficiency of DNA repair/replication proteins has been explored primarily in mouse models. These and other proteins, including MSH6, FEN1, and OGG1, may serve as potential indicators of repeat expansions in FXS.

Recently, transcript expression has been analyzed in human FXS patients (Bittel et al., 2007; Rosales-Reynoso et al., 2010). The expression data indicated significant downregulation of Rad9A, a DNA repair and cell cycle checkpoint protein within the ATR/ATM pathway responsible for the response to DNA damage (Rosales-Reynoso et al., 2010). Rad9A expression was decreased in fragile X patients compared to controls, supporting the hypothesis that reduced expression of at least Rad9A could lead to locus-specific expansion in humans. However, because transcript expression data are not easily correlated with protein expression, further studies will be needed to determine whether Rad9A haploinsufficiency also leads to FMR1 CGG repeat expansion.

Conclusion and Prospective

CGG expansion in FMR1 is associated with FXTAS and FXPOI in premutation carriers of the expanded repeats and with FXS in individuals with the full mutation. This group of disorders caused by the FMR1 mutation impacts families, making screening of the CGG repeat critical to understanding expansion risk in families and populations (Crawford et al., 2001). The FMR1 full mutation can be identified simply, by molecular means and by phenotypic features, and this has allowed successful screening and diagnosis of affected individuals and carriers of the premutation.

A number of studies have focused on newborn screening or general population surveys (Tzeng et al., 2005; Coffee et al., 2009). FXS screening studies have used robust methods, which have substantiated the prevalence estimates of FXS in the general Caucasian population (Coffee et al., 2009). However, the prevalence of FXS in the Taiwanese population is suggested to be lower than that in European ancestry populations (Tzeng et al., 2005). Other studies have found that nonexpansion variants in or around FMR1 contribute only marginally to the prevalence of FXS (Collins et al., 2010). Applying these screening methodologies to previously undiagnosed cases of intellectual disability will help identify the cause of these conditions.

Because of the currently enigmatic nature of trinucleotide expansion disorders, a “unified” model encompassing both cis elements and trans factors is needed to describe the instability of repeat disorders. Simply stated, if haploinsufficiency of repair/replication proteins is present in FX families in addition to the DNA structures associated with expansions, this will be a significant contribution to understanding trinucleotide repeat expansion disorders (Morton & Macpherson, 1992; Eichler et al., 1994; Eichler & Nelson, 1996; Crawford et al., 2000c). The current understanding of trinucleotide expansion disorders suggests that many of these expansions arose from several different mechanisms. DNA elements (e.g., expanded repeats) must be present in addition to decreased expression of trans factors, creating a mutable background that predisposes individuals or families to locus-specific expansions. In most animal models, expansions are observed in large premutation repeat backgrounds, which suggests that one mechanism could be the initial predisposing factor without being the primary mechanism by which the repeat reaches the pathogenic threshold. Understanding the mechanism of trinucleotide repeat expansion in FXS would also benefit the study of other trinucleotide repeat expansion disorders (e.g., myotonic dystrophy and Huntington disease). Finally, the evolutionary significance of locus-specific repeat expansion disorders cannot be overstated, and their study will also engender greater understanding of the evolution of the human genome and the maintenance of genome fidelity.


This project was supported by the Emory University Fellowships in Research and Science Teaching (FIRST) Program. Additional support from the Intramural Research Program of the NIH, National Human Genome Research Institute, Center on Genomics and Global Health is acknowledged. I would also like to acknowledge the kind editorial assistance of the NIH Fellows Editorial Board.