The inherited ataxias: Genetic heterogeneity, mutation databases, and future directions in research and clinical diagnostics


  • Joshua Hersheson,

    Corresponding author
    1. Department of Molecular Neuroscience, Institute of Neurology, University College London, London, United Kingdom
    • Department of Molecular Neuroscience, Institute of Neurology, University College London, Queen Square, London WC1N 3BG, United Kingdom.
  • Andrea Haworth,

    1. Department of Molecular Neuroscience, Institute of Neurology, University College London, London, United Kingdom
  • Henry Houlden

    1. Department of Molecular Neuroscience, Institute of Neurology, University College London, London, United Kingdom
    2. MRC Centre for Neuromuscular Diseases, Institute of Neurology and The National Hospital for Neurology and Neurosurgery, Queen Square, London WC1N 3BG, UK

  • For the Databases in Neurogenetics Special Issue


The inherited cerebellar ataxias are a diverse group of clinically and genetically heterogeneous neurodegenerative disorders. Inheritance patterns of these disorders can be complex with autosomal dominant, autosomal recessive, X-linked, and mitochondrial inheritance demonstrated by one or more ataxic syndromes. The broad range of mutation types found in inherited ataxia contributes to the complex genetic etiology of these disorders. The majority of inherited ataxias are caused by repeat expansions; however, conventional mutations are important causes of the rarer dominant and recessive ataxias. Advances in sequencing technology have allowed for much broader testing of these rare ataxia genes. This is relevant to the aims of the Human Variome Project, which aims to collate and store gene variation data through mutation databases. Variant data is currently located in a range of public and commercial resources. Few locus-specific databases have been created to catalogue variation in the dominant ataxia genes although there are several databases for some recessive genes. Developing these resources will facilitate a better understanding of the complex genotype–phenotype relationships in these disorders and assist interpretation of gene variants as testing for rarer ataxia genes becomes commonplace. Hum Mutat 33:1324–1332, 2012. © 2012 Wiley Periodicals, Inc.


The inherited cerebellar ataxias are a group of neurodegenerative disorders in which the dominant feature is progressive cerebellar degeneration, resulting in impairment of balance, gait, coordinated limb movements, and speech. Ataxia may present as an isolated cerebellar syndrome or more often is associated with a broad spectrum of neurological manifestations including pyramidal, extrapyramidal, sensory, and cognitive dysfunction. Given the significant clinical heterogeneity of these disorders and the complexity of the cerebellum and its associated connections, it is unsurprising that there exists significant genetic heterogeneity.

The prevalence of genetic forms of ataxia has been difficult to accurately determine and several reviewers suggest that previous data have underestimated the extent of the problem [Klockgether, 2011]. Worldwide prevalence of autosomal dominant ataxia has been reported to be between 1.2 and 41:100,000 [Wardle and Robertson, 2007].

Inheritance patterns of these disorders can be complex with autosomal dominant, autosomal recessive, X-linked, and mitochondrial inheritance demonstrated by one or more ataxic syndromes. The broad range of mutation types found in the inherited ataxias contributes to the complex genetic etiology of these disorders. Many of the more common ataxias are caused by a range of polynucleotide repeat expansions including trinucleotide, pentanucleotide, and hexanucleotide repeats; however, point mutations, deletions, and duplications are also represented. Dominant cerebellar ataxia genetic loci are designated with the spinocerebellar ataxia (SCA) prefix, with SCA36 being the most recent locus to be reported. The precise number of loci is open to interpretation as the list of approved SCA designations has been “polluted” with a variety of other related syndromes (SCA18, a predominantly sensory ataxia; SCA29, a congenital nonprogressive ataxia), allelic disorders (SCA15/16), and an SCA designation without a reported locus (SCA9). Setting these discrepancies aside, there are a total of 30 separate SCA loci with the associated genetic variant identified in 21 of these. In addition, there are a number of other dominantly inherited neurological syndromes in which ataxia is a prominent feature including Huntington disease, dentatorubral-pallidoluysian atrophy (DRPLA), Alexander disease, and Gerstmann–Sträussler–Scheinker disease. The recessive ataxias are even more diverse, with nearly 100 genes having been identified (Washington Neuromuscular Website), and although many of these are exceedingly rare, the commonest inherited ataxia worldwide is caused by recessively inherited mutations in the frataxin (FXN) gene, resulting in Friedreich ataxia (FRDA).

This article provides an overview of the genetics of the inherited ataxias, with a focus on the rarer spinocerebellar ataxia genes in which conventional mutations are causal. The second part of the article discusses the current situation with regard to mutation databases in ataxia.

Genetics of the Autosomal Dominant Ataxias

Due to Repeat Expansions

The majority of the dominantly inherited ataxias are caused by repeat expansions in either coding or noncoding parts of the relevant genes [Dueñas et al., 2006]. Polyglutamine (CAGn) expansions are the most common of these and comprise SCAs 1, 2, 3, 6, 7, and 17 and DRPLA. Genotype–phenotype correlations of these disorders are well described [Schöls et al., 2004] with the disease manifesting above a threshold of CAG repeats. The noncoding expansion SCAs comprise SCA8 (CTGn), SCA10 (ATTCTn), SCA12 (CAGn), SCA31 (TGGAAn), and SCA36 (GGCCTGn). Larger repeat numbers generally result in an earlier age of onset and a more severe phenotype; because expanded repeats tend to lengthen further on transmission, this underlies the genetic anticipation seen across generations. Anticipation has been demonstrated to a varying degree by all of the repeat expansion SCAs. A summary of the dominant ataxia genes is provided in Table 1.
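The repeat motifs listed above can be made concrete with a small sketch. The function below counts the longest uninterrupted run of a motif in a sequence string. It is an illustration only, not a diagnostic sizing method: the function name and the toy sequences are hypothetical, and no pathogenic thresholds are encoded because none are given here.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    sequence = sequence.upper()
    motif = motif.upper()
    # One or more consecutive copies of the motif, e.g. (?:CAG)+
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = (len(m.group(0)) // len(motif) for m in pattern.finditer(sequence))
    return max(runs, default=0)


# Hypothetical toy sequences, not real patient data.
toy_cag = "TTG" + "CAG" * 5 + "CCG" + "CAG" * 2   # interrupted CAG tract
toy_attct = "GG" + "ATTCT" * 3 + "A"              # pentanucleotide tract

cag_units = longest_repeat_run(toy_cag, "CAG")
attct_units = longest_repeat_run(toy_attct, "ATTCT")
```

In practice, large expansions are sized with dedicated laboratory assays; a string scan like this only illustrates what "repeat number" means for the trinucleotide, pentanucleotide, and hexanucleotide motifs named in the text.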

Table 1. Autosomal Dominant Ataxias
Name | OMIM # | Locus | Gene | Protein | Mutation | % of ADCA | Geographical distribution | Characteristic features

Repeat expansions: coding
SCA1 | 164400 | 6p22.3 | ATXN1 | Ataxin 1 | CAG repeat | 6–27% | Common: South Africa, Japan, India, Italy, Australia | Hyperreflexia, sensory neuropathy, mild cognitive impairment
SCA2 | 183090 | 12q24.12 | ATXN2 | Ataxin 2 | CAG repeat | 13–18% | Common: United States, Spain, India, Mexico, Italy | Polyneuropathy, parkinsonism, dysphagia
SCA3 | 109150 | 14q32.12 | ATXN3 | Ataxin 3 | CAG repeat | 20–50% | Most common worldwide | Spasticity, polyneuropathy, dystonia, parkinsonism
SCA6 | 183086 | 19p13.2 | CACNA1A | Calcium channel, voltage dependent, P/Q type, α1A subunit | CAG repeat | 13–15% | Common: United States, Germany, Australia, Taiwan | Late onset, pure ataxia
SCA7 | 164500 | 3p14.1 | ATXN7 | Ataxin 7 | CAG repeat | 3–5% | Finland, Mexico, South Africa | Retinal degeneration
SCA17 | 607136 | 6q27 | TBP | TATA box-binding protein | CAG repeat | Rare | United Kingdom, Belgium, France, Germany, Japan | Dementia
DRPLA | 125370 | 12p13.31 | ATN1 | Atrophin 1 | CAG repeat | 0.8:100,000 (Japan); rare worldwide | Japan, Portugal, United States | Dementia, epilepsy
HD | 143100 | 4p16.3 | HTT | Huntingtin | CAG repeat | 3–7:100,000 | Worldwide | Chorea, dementia

Repeat expansions: noncoding
SCA8 | 608768 | 13q21.33 | ATXN8OS | Ataxin 8 opposite strand | CTG repeat | 3% | Common: Finland | Pure ataxia
SCA10 | 603516 | 22q13.31 | ATXN10 | Ataxin 10 | ATTCT repeat | Unknown | Mexico, Brazil | Seizures
SCA12 | 604326 | 5q32 | PPP2R2B | Protein phosphatase 2, regulatory subunit B, β | CAG repeat | Rare worldwide; 7% in India | Common: India | Tremor, polyneuropathy
SCA31 | 117210 | 16q21 | BEAN | Brain expressed associated with NEDD4 | TGGAA repeat | 8–40% in Japan; rare worldwide | Japan, esp. Nagano prefecture | Spasmodic torticollis
SCA36 | 614153 | 20p13 | NOP56 | Nuclear protein 56 | GGCCTG repeat | 6.3% in Galicia, Spain; 9 Japanese families | Spain, Japan | Motor neurone involvement

Conventional mutations
SCA5 | 600224 | 11q13 | SPTBN2 | Beta 3 spectrin | Deletions, missense mutations | Rare | United States, Germany, France | Pure ataxia, facial myokymia, gaze palsy
SCA11 | 611695 | 15q15.2 | TTBK2 | Tau tubulin kinase 2 | Nonsense, frameshift deletions/insertions | Rare | United Kingdom, France, Germany | Pure ataxia
SCA13 | 605259 | 19q13.3-q13.4 | KCNC3 | Potassium channel, voltage-gated, Shaw-related subfamily, member 3 | Missense | 1% (France) | France, Philippines | Early onset, mental retardation
SCA14 | 176980 | 19q13.4 | PRKCG | Protein kinase C gamma | Missense, deletion | 2% (France) | United Kingdom, France, Netherlands, United States, Japan, Australia | Myoclonus, dystonia
SCA15 | 606658 | 3p26-p25 | ITPR1 | Inositol 1,4,5-triphosphate receptor type 1 | Multi-exon or whole gene deletion, missense | 1.8% (France); 0.3% (Japan) | United Kingdom, France | Pure ataxia
SCA20 | 608687 | 11q12 | - | - | 260 kb duplication | Rare | Australia | Dentate calcification, bulbar symptoms
SCA23 | 610245 | 20p13 | PDYN | Prodynorphin | Missense | Rare | Netherlands | Pyramidal signs
SCA27 | 609307 | 13q33.1 | FGF14 | Fibroblast growth factor 14 | Missense | Rare | Netherlands | Onset with tremor, psychiatric episodes
SCA28 | 610246 | 18p11.21 | AFG3L2 | ATPase family gene 3-like 2 | Missense, deletion | 3% | Italy, France, United Kingdom | Ophthalmoplegia, spasticity
SCA35 | 613908 | 20p13 | TGM6 | Transglutaminase 6 | Missense | Rare | China | Pure ataxia

Due to Conventional Mutations

A minority of the dominant ataxia syndromes (SCAs 5, 11, 13, 14, 15, 20, 23, 27, 28, and 35) is caused by conventional mutations. In a French ataxia series, conventional mutations accounted for 6% of all dominant ataxia, repeat expansions accounted for 45% with the remaining 48% being genetically undiagnosed [Durr et al., 2009]. Genotype–phenotype correlations are much harder to determine in this group, owing to the limited number of families affected by these mutations. Functional analysis of potassium channels (EA1, SCA13) and calcium channels (SCA6, EA2) has demonstrated a correlation between the degree of functional impairment and the severity of the phenotype. In contrast to the repeat expansion SCAs, these disorders often have a “purer” cerebellar phenotype (ADCAIII), with a slower rate of progression.

SCA5 is caused by mutations in the SPTBN2 gene, which encodes β-III spectrin [Ikeda et al., 2006]. Missense mutations and in-frame deletions have been described, resulting in a pure cerebellar syndrome with onset between 15 and 50 years. The first SCA5 kindred was reported in 1994 with 56 affected individuals over 10 generations who were descendants of the paternal grandparents of Abraham Lincoln. SCA5 has also been reported in French and German pedigrees [Zühlke et al., 2007].

SCA11, initially reported in two British families, is caused by stop mutations, frameshift insertions or deletions in the TTBK2 gene, resulting in a pure cerebellar syndrome with normal life expectancy [Houlden et al., 2007]. Pathogenic variants in TTBK2 have also been reported in French and German families [Bauer et al., 2010].

SCA13 was initially reported in French and Filipino families and is caused by missense mutations in KCNC3, which encodes a voltage-gated potassium channel [Figueroa et al., 2010]. There is a wide phenotypic spectrum that correlates with different missense mutations. The childhood-onset form, in which motor and mental developmental delay is a common feature, has been associated with two variants, g.10693G>A (p.Arg423His) and g.10767T>C (p.Phe448Leu), described in European and Filipino families, respectively [Figueroa et al., 2011]. Females are more frequently affected and, of the two missense mutations reported, the p.Phe448Leu variant results in the more severe phenotype. The p.Arg423His variant has also been reported in a Caucasian family in the United States.

SCA14 is caused by mutations in PRKCG [Yabe et al., 2003], resulting in a variable ataxic phenotype, which may include myoclonus, dystonia, or peripheral neuropathy. The onset is usually in adulthood. The majority of mutations (missense) have been reported in exons 4, 5, 10, and 18. It has been reported in more than 20 families from Europe, Japan, and Australia [Klebe et al., 2005].

SCA15/16 is caused by heterozygous deletions of the 5′ part of the ITPR1 gene [van de Leemput et al., 2007] although a missense mutation (c.1480G>A p.V494I) has been reported. The ITPR1 protein is highly expressed in cerebellar Purkinje cells and is an important modulator of intracellular calcium signaling. SCA15/16 is characterized by a mild cerebellar ataxia with slow disease progression. In a French ataxia series, SCA15 was identified in 1.8% of patients [Marelli et al., 2011]. SCA15/16 shares a locus with SCA29, raising the possibility that they are allelic disorders.

SCA20 has been described in a single Australian family of Anglo-Celtic descent and is the result of a 260 kb duplicated region comprising >12 genes at 11q12 [Knight et al., 2004]. Bulbar symptoms including dysphonia and spasmodic cough in addition to dentate nucleus calcification are characteristic of this condition.

SCA23 is due to missense mutations of PDYN [Bakalkin et al., 2010], which encodes prodynorphin protein, an opioid neuropeptide precursor. This causes a relatively pure cerebellar syndrome with a late onset (43–73 years) and slow progression. The disease has been reported in only a single large Dutch ataxia family and was not identified on screening a large German ataxia series [Schicks et al., 2011].

SCA27 causes an early-onset ataxia [Brusse et al., 2006] with associated cognitive deficits, and head or limb tremor and dyskinesia that can be exacerbated by stress or exercise. The causal gene, fibroblast growth factor 14 (FGF14), was identified in a large Dutch kindred; both missense and nonsense mutations have been reported. Life expectancy is normal; however, most affected patients are unable to walk by the seventh to eighth decade. The disease has also been reported in a German ataxia patient.

SCA28 is caused by a mutation in AFG3L2, which encodes a mitochondrially located metalloprotease [Di Bella et al., 2010]. Missense mutations have been reported which are commonly located in the proteolytic domain of the protein with a mutation hotspot in exons 15–16. SCA28 has a typically early onset between 12 and 36 years and is characterized by a slowly progressive cerebellar ataxia with ophthalmoparesis and lower limb hyperreflexia. The disease is estimated to account for 1.5% of European ADCA cases [Cagnoli et al., 2010].

SCA35 is caused by mutations in the cerebral transglutaminase TGM6 and was the first dominant ataxia gene to be identified through exome sequencing [Wang et al., 2010]. Missense mutations were reported in two Chinese families with a late-onset cerebellar syndrome and associated upper motor neuron involvement. There was moderate progression with patients commonly using a wheelchair 20 years after disease onset.

Episodic Ataxias

The episodic ataxias are a group of heterogeneous channel disorders characterized by attacks of ataxia, which may be associated with a range of other neurological manifestations including myokymia, migraine, seizures, or chorea. Eight episodic ataxia syndromes have been described: EA 1–7 and episodic ataxia with paroxysmal choreoathetosis and spasticity (CSE). EA 1 and 2 are the most common and best characterized of these. The genes for EA 1, 2, 5, and 6 (Table 2) have been identified with linkage loci mapped in EA 3, 7, and CSE. Episodic ataxia is rare with a combined incidence of <1:100,000.

Table 2. Episodic Ataxias
Name | OMIM # | Locus | Gene | Protein | Mutation | Geographical distribution | Characteristic features
EA1 | 160120 | 12p13.32 | KCNA1 | Potassium channel, voltage gated, shaker-related subfamily, member 1 | Missense | Worldwide | Attacks last seconds–minutes; myokymia
EA2 | 108500 | 19p13.2 | CACNA1A | Calcium channel, voltage dependent, P/Q type, α1A subunit | Missense, nonsense, large deletions | Worldwide | Attacks last hours; allelic with SCA6, familial hemiplegic migraine
EA5 | 613855 | 2q23.3 | CACNB4 | Calcium channel, voltage dependent, β-4 subunit | Missense | French-Canadian family | Attacks last hours–days; late onset, seizures
EA6 | 612656 | 5p13.2 | SLC1A3 | Solute carrier family 1 (glial high affinity glutamate transporter), member 3 | Missense | United States, Netherlands | Alternating hemiplegia, seizures

EA1 is primarily due to missense mutations in KCNA1 [Browne et al., 1994] although truncation mutations have been reported. The disease is characterized by brief periods of ataxia (seconds to minutes) and interictal myokymia. The degree of channel impairment correlates with the severity of the phenotype. Mutations associated with severe phenotypes, which may be poorly treatment responsive or associated with seizures or neuromyotonia, show the most significant impairment of potassium channel function.

EA2 is due to a range of mutations in CACNA1A [Ophoff et al., 1996], which include missense, nonsense, aberrant splicing, and nucleotide insertions and deletions. EA2 is typified by longer periods of ataxia lasting several hours with baseline nystagmus and progressive ataxia. There is a wide spectrum of phenotypes associated with mutations in CACNA1A. EA2 is allelic with SCA6 and familial hemiplegic migraine (FHM). Most of the mutations that cause EA2 disrupt the open reading frame, whereas FHM is caused primarily by missense mutations.

EA5 has been described in a single French-Canadian family that was heterozygous for a missense mutation in the CACNB4 gene, resulting in a phenotype similar to EA2 [Escayg et al., 2000]. The precise functional effects of this mutation are not clear as the same mutation was identified in a German family with generalized epilepsy but no ataxia.

EA6 was initially reported in a patient from the United States presenting with characteristic episodes of hemiplegia, seizures, and ataxia. A de novo mutation was identified in the SLC1A3 gene, which results in complete loss of function of the protein EAAT1—a glutamate transporter localized to astrocytes. Other cases have been reported in the Netherlands with the p.C186S variant that resulted in a milder phenotype without the manifestations of seizures or alternating hemiplegia [de Vries et al., 2009].

Recessive Ataxias

The recessive ataxias are a particularly diverse group of disorders that are generally early onset with significant variation in clinical phenotype, which is variably associated with neuropathy, ophthalmological disturbance, seizures, and a range of other neurological and non-neurological manifestations. These disorders are discussed in detail in other reviews; a nonexhaustive summary of recessive ataxia genes is given in Table 3. FRDA is the most common recessive ataxia worldwide [Palau and Espinós, 2006] and is mainly due to homozygous GAA expansions in the FXN gene, although a few patients are compound heterozygous for a point mutation and the GAA-repeat expansion. Some common pathological pathways have been described in the recessive ataxias including DNA repair dysfunction, mitochondrial dysfunction, defects in lipoprotein metabolism, and protein chaperone dysfunction. There is significant overlap of clinical phenotype with a range of metabolic ataxias (Table 4), which are invariably complex multisystem disorders that can result in severe disability despite dietary modification where possible.

Table 3. Autosomal Recessive Ataxias
Disease | OMIM # | Gene | Protein | Mutation | Incidence/carrier frequency | Geographic distribution | Characteristic features
Friedreich ataxia | 606829 | FXN | Frataxin | GAA repeat expansions; point mutations in compound heterozygotes | Incidence: 1:30,000–50,000; carrier frequency: 0.9–1.6% | Worldwide except natives to: Far East, sub-Saharan Africa, Australia, America | Spasticity, neuropathy, cardiac involvement
Ataxia-telangiectasia | 607585 | ATM | Ataxia telangiectasia mutated | Deletions: splice-site related; nonsense; missense | Incidence: 1:400,000–450,000 live births; carrier frequency: 0.35–1% | Reported in many worldwide populations | Oculomotor apraxia; extrapyramidal features; increased cancer risk/radiosensitivity
Ataxia-telangiectasia like disorder (ATLD) | 604391 | MRE11A | Meiotic recombination 11, S. cerevisiae, homolog of | Missense | 25 reported cases worldwide | Saudi Arabia (15 cases), Japan (4 cases), UK (4 cases), Italy (2 cases) | Similar to ATM but milder phenotype
Ataxia-oculomotor apraxia type 1 | 208920 | APTX | Aprataxin | Insertion, deletion, missense | Rare worldwide; more common in Portuguese and Japanese populations | Portugal, Japan, France, Tunisia | Oculomotor apraxia, peripheral neuropathy
Cerebellar ataxia with muscle coenzyme Q10 deficiency | 607426 | APTX | Aprataxin | Missense | Rare | Single Italian family | Low coenzyme Q10 levels; late-onset hypergonadotrophic hypogonadism
Ataxia-oculomotor apraxia type 2 | 606002 | SETX | Senataxin | Nonsense, missense | Carrier frequency: 2.1–3.5%; incidence: 1:400,000 (Alsace) | Commoner in French-Canadian populations | Oculomotor apraxia (variable); extrapyramidal features; peripheral neuropathy
Spastic ataxia of Charlevoix-Saguenay (ARSACS) | 270550 | SACS | Sacsin | Stop-gain deletions and point mutations most common | Carrier frequency (Quebec): 4.5%; incidence: 1/1930 | Most common in Quebec; Tunisian, Turkish, Italian, Japanese families reported | Myelinated retinal fibers; prominent lower limb spasticity
Cerebellar ataxia, seizures and ubiquinone deficiency | 612016 | ADCK3 | aarF domain containing kinase 3 | Missense, splice site, frameshift, deletion | Rare | French, Dutch, British families reported | Mental retardation, seizures, low coenzyme Q10 levels
Spinocerebellar ataxia with axonal neuropathy (SCAN1) | 607250 | TDP1 | Tyrosyl DNA phosphodiesterase 1 | Missense | Rare | Saudi Arabian family | Axonal neuropathy
Autosomal recessive spinocerebellar ataxia type 8 | 610743 | SYNE1 | Synaptic nuclear envelope protein 1 | Splice site, intronic | Rare worldwide; 3rd most common ARCA in Quebec | Canada | Hyperreflexia
Autosomal recessive spinocerebellar ataxia type 10 | 613728 | ANO10 | Anoctamin 10 | Missense, splice site, deletion | Rare | French, Dutch, Serbian families | Tortuous conjunctival vessels

Table 4. Metabolic Ataxias
Disease | OMIM # | Gene | Protein | Mutation | Incidence/carrier frequency | Geographic distribution | Characteristic features
Ataxia with selective vitamin E deficiency | 277460 | TTPA | Tocopherol transfer protein alpha | Frameshift, missense | Prevalence: 0.55–3.5:1,000,000 | United Kingdom, French, Italian, Moroccan, Japanese families reported | Low vitamin E; resembles FRDA
Abetalipoproteinemia | 200100 | MTTP | Microsomal triglyceride transfer protein | Missense, nonsense | Prevalence: <1:1,000,000 | Global | Acanthocytosis; pigmentary retinal degeneration; polyneuropathy
Refsum disease | 266500 | PHYH | Phytanoyl-CoA hydroxylase | Missense, nonsense, deletions, splice site mutations | Prevalence: 1:1,000,000 | Global | Deafness, retinitis pigmentosa, ichthyosis, demyelinating polyneuropathy
Cerebrotendinous xanthomatosis | 213700 | CYP27A1 | Cytochrome P450 subfamily XXVIIA, polypeptide 1 | Missense, deletions, splice site mutations | Prevalence (Moroccan Jews): 1:108; United States: 1:50,000 | More common in Moroccan Jews | Widespread cholesterol deposits: tendons, brain, lungs; cataracts; dementia
Niemann Pick Type C | 607623 | NPC1 | NPC1 protein | Deletions, point mutations | Prevalence: 1:100,000–150,000 | Global | Extrapyramidal features, seizures, dementia
Wilson's disease | 277900 | ATP7B | ATPase, Cu(2+)-transporting, beta polypeptide | Point mutations, nonsense mutations | Prevalence: 1:10,000–30,000 | Global; higher incidence in China, Japan, Sardinia | Extrapyramidal features; liver disease

Future Directions in Ataxia Research and Diagnostics

The rate of discovery of new ataxia genes has accelerated enormously in recent years, commensurate with rapid advances in next-generation sequencing (NGS) technologies [Bamshad et al., 2011]. Exome sequencing in particular has demonstrated its utility in the identification of causal genes in a variety of Mendelian disorders including ataxia [Montenegro et al., 2011; Ng et al., 2010; Pierson et al., 2011]. Although exome sequencing is a useful tool in new gene discovery, its cost and the significant data storage burden associated with processing exome samples prohibit its routine use in a diagnostic setting. A modification of the technology, targeted enrichment and sequencing [Mertes et al., 2011; Schlipf et al., 2011], will allow focused panels of relevant genes to be sequenced in highly multiplexed and extremely cost-effective runs. This approach is much more likely to replace traditional Sanger sequencing in diagnostic laboratories as the technology becomes cheaper and more widespread, and it has the potential to transform the capabilities of diagnostic laboratories and research groups worldwide.

The provision of diagnostic tests for the inherited ataxias is generally limited by cost and technical considerations. In the United Kingdom, the following tests are available for most patients (depending on the clinical presentation and inheritance pattern): SCA1, 2, 3, 6, 7, 12, 17, HD, DRPLA, PRNP, FXN, ATM, AOA1/2. Testing for rarer genes is often available from specific international diagnostic laboratories or on a research basis from interested research groups.

It seems likely that in future, diagnostic laboratories will be able to offer relatively low-cost screening for all known ataxia genes using targeted NGS techniques, which are already being employed to screen for genetic conditions including nonsyndromic deafness [Walsh et al., 2010] and hypertrophic cardiomyopathy [Voelkerding et al., 2010].

Despite the advantages of NGS in gene discovery and diagnostics over conventional methods, significant challenges remain with the interpretation and storage of the wealth of data generated by these NGS applications. Exome sequencing identifies on average between 20,000 and 24,000 single-nucleotide variants per sample [Bamshad et al., 2011]. Most analysis pipelines for NGS variant data include a step to filter sequenced variants against control datasets such as dbSNP, 1000 Genomes, and the Washington Exome Variant Server. These datasets should be used with caution, however, as there have been reports of “contamination” of some datasets, including dbSNP, with rare pathogenic disease mutations that are not sufficiently annotated [Walsh et al., 2010]. This can lead to the exclusion of potentially pathogenic variants solely on the basis of their presence in these datasets.
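The control-filtering step and its caveat can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline of any particular laboratory: `Variant`, `filter_against_controls`, and the coordinates used are hypothetical, and the control set and the list of known disease mutations are assumed to have been loaded elsewhere as sets of keys.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set, Tuple

VariantKey = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str

    @property
    def key(self) -> VariantKey:
        return (self.chrom, self.pos, self.ref, self.alt)


def filter_against_controls(
    variants: Iterable[Variant],
    control_keys: Set[VariantKey],
    known_disease_keys: FrozenSet[VariantKey] = frozenset(),
) -> Tuple[List[Variant], List[Variant]]:
    """Split variants into (kept, rescued).

    kept:    variants absent from the control dataset.
    rescued: variants present in controls but also listed as known disease
             mutations, set aside for manual review instead of being dropped.
    """
    kept: List[Variant] = []
    rescued: List[Variant] = []
    for variant in variants:
        if variant.key not in control_keys:
            kept.append(variant)
        elif variant.key in known_disease_keys:
            rescued.append(variant)
    return kept, rescued


# Hypothetical coordinates for illustration only.
novel = Variant("19", 1000, "G", "A")
common = Variant("3", 2000, "C", "T")
mislabelled = Variant("11", 3000, "T", "C")

controls = {common.key, mislabelled.key}
disease_list = frozenset({mislabelled.key})

kept, rescued = filter_against_controls([novel, common, mislabelled], controls, disease_list)
```

Routing control-listed disease mutations to a separate review list, instead of discarding them, is one way to guard against the dataset contamination described above.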

Subsequent filtering steps take into account specific inheritance patterns (e.g., exclusion of all but homozygous variants in recessive disorders) and may make use of linkage data or homozygosity mapping, where available, to further refine the list of variants. Variants can also be stratified according to the impact of the variant on protein structure and function and the degree of evolutionary conservation. A range of bioinformatics tools are available to enable this including the commonly used SIFT (Sorting Intolerant from Tolerant) and PolyPhen2. Often multiple tools are used for in silico analyses of pathogenicity. A study by Thusberg et al. (2011), investigating the performance of pathogenicity prediction methods, found MutPred and SNPs&GO to be the best performing tools; however, they noted that no single method performed optimally according to their specified parameters. Functional analysis of variants is usually limited to research laboratories and is generally not practicable in a diagnostic setting.
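A sketch of these later filtering steps, assuming variant calls have already been annotated with zygosity and with per-tool in silico predictions. The class and function names, the gene labels, and the prediction values are all hypothetical; the tool names are those mentioned in the text, and counting "damaging" votes is just one simple way to combine multiple predictors.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Call:
    gene: str
    change: str
    zygosity: str  # "hom" or "het"
    # Tool name -> True if that tool predicts the variant to be damaging.
    predictions: Dict[str, bool] = field(default_factory=dict)


def keep_homozygous(calls: List[Call]) -> List[Call]:
    """Recessive model as described in the text: exclude all but homozygous variants."""
    return [c for c in calls if c.zygosity == "hom"]


def damaging_votes(call: Call) -> int:
    """Number of tools that predict the variant to be damaging."""
    return sum(1 for is_damaging in call.predictions.values() if is_damaging)


def rank_by_consensus(calls: List[Call]) -> List[Call]:
    """Order calls so that variants flagged by more tools come first."""
    return sorted(calls, key=damaging_votes, reverse=True)


# Hypothetical annotated calls for illustration only.
calls = [
    Call("GENE_A", "p.Ala10Val", "het", {"SIFT": True, "PolyPhen2": True}),
    Call("GENE_B", "p.Gly20Arg", "hom", {"SIFT": False, "PolyPhen2": True}),
    Call("GENE_C", "p.Leu30Pro", "hom", {"SIFT": True, "PolyPhen2": True}),
]

candidates = rank_by_consensus(keep_homozygous(calls))
candidate_genes = [c.gene for c in candidates]
```

A vote count does not replace functional analysis, and, as noted above, no single prediction method performs optimally; the ranking is only a way to prioritize variants for further review.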

Equally significant in advancing such research will be the extensive sharing of next-generation datasets and associated phenotypic information for the benefit of national and international collaborations. The Miller School of Medicine at the University of Miami recently launched the Genome Variant Database for Neuromuscular Diseases. The aim of the resource is to share genomic data on patients and families with neuromuscular disorders including Charcot–Marie–Tooth disease, hereditary spastic paraplegia, and amyotrophic lateral sclerosis. Complete variant data determined through exome sequencing are provided for a range of families in which the pathogenic mutation is currently unknown. Although ataxia families are not currently represented, this is a good model for NGS data sharing in investigating neurological disorders.

Neurogenetics Databases

Although the challenges in developing mutation databases for ataxia genetics are by no means unique, they are particularly well aligned to those outlined by the Human Variome Project (HVP) and its neurogenetics consortium, which aims to develop a global collaboration for the collection, storage, interpretation, and sharing of genetic variation [Cotton et al., 2009]. Recent meetings of the HVP neurogenetics consortium [Haworth et al., 2010] have determined that global access to comprehensive repositories of genetic variant data is particularly apposite for neurogenetics due to the large number of disease genes, significant genetic heterogeneity, clinical variability, and complex genotype–phenotype relationships in neurological disorders. The meetings also noted significant shortcomings in the current situation with regard to databases for genes relevant to neurogenetics.

Ataxia Gene Variants within General Mutation Databases

Although a number of publicly available, well-curated neurological databases exist, including those for Charcot–Marie–Tooth disease (Inherited Peripheral Neuropathies Database), Parkinson disease (PDGene database), and Alzheimer disease (AD & FTD Mutation Database), ataxia genes are poorly represented in comprehensive databases. Before detailing the currently available ataxia-specific resources, it is first worth considering the general mutation databases in which the majority of ataxia gene variation has been deposited. Both the Online Mendelian Inheritance in Man database (OMIM) and the Human Gene Mutation Database (HGMD) contain variant information on ataxia genes curated from the medical literature. OMIM is a publicly available resource accessed through the National Center for Biotechnology Information website and provides, where available, detailed clinical information, albeit with a limited selection of reported variants. The HGMD attempts to curate all known published gene mutations responsible for human disease through automated searches of medical literature and also includes variants reported in locus-specific databases (LSDBs). Access to the full HGMD database requires a commercial license although a limited version is publicly available. Unlike OMIM, however, the HGMD eschews detailed phenotypic information.

For the more widely tested ataxia genes (SCAs 1, 2, 3, 6, 7, 12, 17, DRPLA, FXN, ATM, AOA1/2), diagnostic laboratories hold a wealth of legacy data on pathogenic fragment lengths, other deleterious mutations, and nonpathogenic polymorphisms. Most variant information from diagnostic laboratories is reported through the medical literature and not through direct submission to online databases. This may in part reflect the seemingly laborious process of data submission; however, concerns about future data ownership, patient consent, and confidentiality issues are equally relevant. In the United Kingdom, the Diagnostic Mutation Database, established in 2005 by the National Genetic Reference Laboratory, is intended as a repository of diagnostic variant data to support the diagnostic process in UK genetic testing laboratories. This resource is primarily aimed at UK diagnostic laboratories although a large number of international laboratories, including those in China, Canada, and New Zealand, have signed up to the service. No ataxia genes are currently represented on the database; this is likely because the majority of the ataxia gene mutations commonly tested in diagnostic laboratories are of the repeat expansion type, for which interpretation and genotype–phenotype correlation depend mainly on expanded allele length and for which the pathogenic ranges are generally well documented in the medical literature. Advances in sequencing technology are likely to soon result in a much broader range of ataxia genes being tested in a diagnostic setting. This will strengthen the need to employ such databases in order to share variant data, to support the interpretation of new variants, and to improve the quality and consistency of diagnoses.

Locus- and Disease-Specific Databases

It has been widely argued that the best way to share data is through publicly available, open-access databases and that LSDBs are a viable solution to meeting this need [Samuels and Rouleau, 2011]. LSDBs are well suited to high-penetrance monogenic disorders such as the inherited ataxias. Although some exist for a handful of ataxia genes, the list is far from comprehensive and does not reflect the extent of variant data that has been collected on these genes by research and diagnostic laboratories worldwide. The most widely available platform for the creation of LSDBs is the Leiden Open Variation Database (LOVD), supported by the European Community's Seventh Framework Programme under the GEN2PHEN project. Although the creation of these databases is straightforward, there are well-reported limitations relevant to the ongoing curation of variant data and maintenance of the database [Cotton et al., 2008]. Detailed recommendations for the curation of LSDBs have previously been published [Celli et al., 2011], and while it is acknowledged that expertly curated, up-to-date LSDBs offer significant benefits for patients and the research community, continued funding of these projects is a not inconsiderable challenge.

Only one truly ataxia disease-specific LSDB is currently listed on the Human Genome Variation Society mutation database list, although there are a number of specific genes curated within other more general gene collections (see Table 5). Most of the known ataxia genes are represented in some form on various LOVD installations; however, the majority of these have merely been identified by the LOVD team as being in need of a curator and have no variant submissions to date. None of the conventional mutation SCA genes are represented in LSDBs, although most of the reported mutations in these genes can be found within HGMD. A number of the recessive genes are listed on LOVD, including those for several of the metabolic ataxias.

Table 5. Ataxia Mutation Databases
Name | Website | Genes listed | Phenotypic information | Unique variants | Last updated | Format
SCA-LSVD | | ATXN2, ATXN3, ATXN8OS, PPP2R2B, ATN1, ATXN7, CACNA1A, ATXN10, TBP, FXN | Yes | 612 (repeat size only) | February 2009 | LOVD
SACSIN database | | | | | 2008 | Excel
Human DNA POLG Mutation Database | | | | ∼230 | Unknown | HTML
Cerebrotendinous | | | | | 2010 | LOVD
Refsum disease | | | | | 2008 | LOVD
Niemann–Pick type C disease gene variation database | http://npc.fzk.de | NPC1 | Yes | 244 | May 2011 | Web-form

SCA-LSVD is an LOVD installation that was created to deposit repeat-oriented variant information on 400 SCA families identified from a tertiary referral center in north India between 1998 and 2007 [Faruq et al., 2009]. Data on the repeat size of SCAs 1, 2, 3, 6, 7, 8, 12, 17, and FXN are reported on the database together with detailed phenotypic information on the individuals screened. As the data were all collected through fragment length analysis, no additional nonpathogenic polymorphism data were submitted. The study authors report that they were in the process of curating variations on all ataxia-related genes. While this aim is consistent with the intended function of such databases, it is worth noting that no submissions have been made to the database for approximately 2 years. As a means of sharing variant information obtained in epidemiological studies, LOVD installations are undoubtedly convenient, but it would perhaps be more useful to the wider research community if disease-specific databases such as this were curated on an ongoing basis with a gene list that reflects the current population of known disease genes.

Ataxia Disease Registries

It has been suggested that a strategy for the development of neurological LSDBs should be initiated by international, multidisciplinary, disease-centered networks. One of the achievements of the EUROSCA project, funded by the European Commission, was to establish the world's largest DNA registry of SCA patients, together with detailed clinical information. Data were collected on over 3,000 patients affected by dominantly inherited ataxia, including both those with and those without a genetic diagnosis. This Internet-based registry is available to participating investigators and is arguably one of the largest collections of ataxia gene variant information.

EFACTS is a project funded under the EU FP7 framework that has engaged a network of European collaborators to adopt a translational research strategy for FRDA. One of the primary aims of this project is to populate a pan-European FRDA database linked to biobanks of patient material. Like EUROSCA, this registry will be available only to participating investigators, and it is not clear whether the FXN gene variant data collated in this project will be made publicly available.

It is not clear which organizations are best suited to meeting the challenges of developing comprehensive variant databases for ataxia genes linked to detailed phenotypic information. Although significant financial, technical, and ethical issues regarding the use of large patient datasets are yet to be fully addressed, robust guidance for tackling these issues has been provided by a number of interested parties. Given the rapid advances in sequencing technology, it is imperative that ataxia research groups worldwide adopt a coherent strategy for meeting these challenges.