Keywords:

  • amyotrophic lateral sclerosis;
  • ALS;
  • resequence;
  • clinical information

Abstract

  1. Top of page
  2. Abstract
  3. Introduction
  4. Database Structure
  5. Data Sources
  6. Utility
  7. Accessibility and Usage
  8. Discussion
  9. Acknowledgements
  10. References
  11. Supporting Information

An amyotrophic lateral sclerosis (ALS) mutation database has been constructed as a publicly accessible online resource for recording the nucleotide and amino acid variants identified in genes associated with ALS, along with corresponding clinical conditions. The database currently consists of more than 600 entries, including about 180 unique variants found in 25 disease-causative or disease-related genes. In addition to published data collected from literature, novel variants identified by microarray resequencing in our laboratory are incorporated into the database. Every reported gene has a respective page that provides information on its variation positions with various statistics, clinical characteristics, and primary references, as well as gene-sequence and protein-structure information that will assist in assessing variation significance. Users can access a homology search function to find variations in arbitrary sequences of interest and to check if they have already been described in the database. This database is expected to fulfill an essential need in terms of integrating comprehensive information on genetic and clinical data related to ALS, which will subsequently deepen our understanding of the possible mechanisms of the disease, as well as help with the clinical practice and treatment of ALS. The database is accessible at: https://reseq.lifesciencedb.jp/resequence/SearchDisease.do?targetId=1. Data submission is open to all researchers and is highly encouraged. Hum Mutat 31:1003–1010, 2010. © 2010 Wiley-Liss, Inc.


Introduction

Amyotrophic lateral sclerosis (ALS; MIM♯ 105400), also known as Lou Gehrig's disease, is a rapidly progressive neurodegenerative disease that is characterized by degeneration of motor neurons in the motor cortex and spinal cord. It is a common neurodegenerative disorder worldwide, affecting people of all ethnic backgrounds. In Japan, the incidence of ALS is 0.4–1.9 per 100,000 people per year and the prevalence is 2–7 per 100,000 people (Japan Intractable Diseases Information Center, http://www.nanbyou.or.jp/sikkan/021_i.htm); worldwide, the corresponding figures are 1–2 per 100,000 people per year and 4–6 per 100,000 people [Mitsumoto et al., 1998]. Although the majority of individuals with ALS have no family history of the disease and represent sporadic cases, about 5–10% of patients have an inherited form, in which the disease occurs more than once in the family lineage and is referred to as familial ALS [Camu et al., 1999]. Although the pathogenic mechanisms that cause ALS are not yet fully understood, there is growing evidence that many genetic factors play important roles in pathogenesis and pathophysiology. By far the most extensively studied gene thought to be responsible for ALS is SOD1 (MIM♯ 147450). Since the first identification of a defective SOD1 gene among cases of familial ALS in 1993 [Rosen et al., 1993], more than 100 different mutations have been identified in its sequence [Wroe et al., 2008]. Some studies have shown that mutations in the SOD1 gene result in the gain of a new but still unclear function that is toxic to motor neurons. Even the SOD1 mutations, however, account for only about 20% of familial cases and about 2% of all cases. Recently, various new mutations have been reported in familial ALS patients, for example, in the FUS gene (MIM♯ 137070) [Kwiatkowski et al., 2009; Vance et al., 2009] and the TARDBP gene (MIM♯ 605078) [Gitcho et al., 2008; Van Deerlin et al., 2008]. Other rare genetic causes of familial ALS include mutations in the SETX gene (MIM♯ 608465) in the case of juvenile ALS [Chen et al., 2004]. As to sporadic ALS, some studies have reported mutations in genes related to familial ALS, such as the TARDBP gene [Sreedharan et al., 2008] and the ANG gene (MIM♯ 105850) [Greenway et al., 2006]. However, the major causative genes of sporadic ALS have not yet been resolved. An increasing number of genes have been reported to be associated with ALS in various forms [Schymick et al., 2007]. Some of these genetic variations are considered causal factors, while others may indirectly influence ALS susceptibility. However, how they cause or predispose a person to ALS has yet to be determined.

Due to recent advances in sequencing technology, such as high-throughput genotyping techniques based on microarrays, the number of newly identified genetic variations has expanded rapidly in the past few years. One drawback is that it has become extremely difficult to retrieve pertinent information efficiently from large amounts of text. This is problematic because complete and accurate information on variations and their clinical effects is essential for identifying correlations between genotypes and phenotypes and for understanding disease mechanisms. Creating integrated databases that contain all reliable information on both genetic and clinical data is now recognized as the best way to address this need, and a number of locus-specific mutation databases targeting various human genetic disorders have been established (e.g., [Gout et al., 2007; Runz et al., 2008; Tang et al., 2008]).

In this study, we have created a new relational mutation database for ALS that aims to provide a complete and up-to-date overview of all nucleotide variants identified in ALS patients along with the corresponding clinical data. The ALSOD Consortium has built the first register database of mutations in ALS patients (ALSODatabase) [Radunovic and Leigh, 1999; Wroe et al., 2008]. We expect our own database to play a key and complementary role, especially in terms of accumulating variations in the Asian region. It is intended to be a valuable tool for epidemiological and pathophysiological research through correlating genotypes with phenotypes in patients and evaluating the clinical significance of detected variations. In addition, the centralized mutation database is intended to be a useful resource in actual clinical practice, especially as an aid to predicting disease course and prognosis based on clinical information of patients with the same mutations, because the progress and symptoms of ALS can largely depend on the mutation positions even within the same gene. All types of variations, including mutations, variants of uncertain pathogenic significance, and polymorphisms found in 25 disease-causative or disease-related genes, are accumulated in the database. In addition to published data exhaustively collected from literature, our original resequencing results obtained using a microarray are also incorporated [Takahashi et al., 2008]. The database provides information on genetic variations, clinical conditions, and primary references, as well as general gene information thought to be helpful for assessing the variation effects. Various cross-references to other public resources are also incorporated to assist further exploration of variations of interest.

Database Structure

The ALS mutation database is designed to be a publicly accessible online resource with a user-friendly interface. Variations are described in accordance with the nomenclature guidelines of the Human Genome Variation Society (http://www.hgvs.org/) [den Dunnen and Antonarakis, 2000]. In the interest of privacy protection, special permission is required to access the parts of the data that are directly related to the patients' sequences and clinical information. MySQL, a relational database-management system (DBMS), is used to organize the data, and a Web-based user interface is provided via an Apache HTTP server with Java Servlet/JSP running on Tomcat. The configuration of the database system is shown in Supp. Figure S1. Programs for parsing the data and uploading data sets to the database are written in Perl and shell script.

The database structure is composed of 28 tables that are interrelated by unique identifiers (Ids). The centerpiece of the database is the “gene” table that is linked to another master table called the “mut_gene” table. Each table is surrounded by several complementary tables. The “gene” table has a list of all the genes concerned but contains only a minimum of information about the genes themselves (including Entrez gene ID). Other tables recording gene information are connected to the “gene” table via the Entrez gene ID. Each reported variant is numbered with a unique variation ID, and its data are divided between several tables. The “mut_gene” table lists the variation ID with the corresponding Entrez gene ID, thus acting as a gateway that links the tables of the variation data to the “gene” table. Supp. Figure S2 presents the simplified schema of the database structure. The constituent tables are briefly described in Table 1.

Table 1. Summary of Database Tables

Table             Description
Gene              A list of all genes concerned containing some gene information: Entrez gene ID, gene symbol, gene full name, and accession IDs of representative mRNA and amino acid sequences
Gene_syn          Gene synonyms for each gene
rel_pos           Genomic position for each gene
exp_seq           Sequence information from original resequencing experiment
rel_gene          Accession IDs of reference sequences
align_pos         Mapping information of mRNA sequences
aalign_pos        Mapping information of amino-acid sequences
exp_seq_info      Mapping information of experimental sequence
mr_seq            mRNA sequences
aa_seq            Amino acid sequences
Structure_3D      Structure information
Structure_2D      Secondary structure information (InterProScan)
func_structure    Secondary structure information (Uniprot/Swiss-Prot)
function          List of function names for secondary structures
mut_Gene          Unique variation ID with corresponding Entrez gene ID
mut_id            Variation IDs in other variation databases for each variant
mut_pos           Variation position for each variant
mut_kinds         Variation type for each variant
mut_report        Various statistics of each variant
mut_report_sum    Various statistics of each variant
pat_mut           Patient ID with corresponding variation ID
patients          Patients' detail
diagnosis         Clinical information for each variant
reference         Publication information
eth_info          A list of ethnic groups
align_pat_seq     Alignment results of all patients' sequences
alignment         A list of file names for amino acid alignments of the human sequence with orthologs
a_diagnosis       List of references
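
The gateway role of the “mut_gene” table can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module in place of the MySQL server described above, and the column names (entrez_gene_id, symbol, variation_id, hgvs_cdna), the variation IDs, and the sample rows are assumptions made for illustration. Only the table names and the linkage through the Entrez gene ID are taken from the description of the database.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server described in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene (
    entrez_gene_id INTEGER PRIMARY KEY,  -- hypothetical column name
    symbol         TEXT NOT NULL         -- hypothetical column name
);
CREATE TABLE mut_gene (
    variation_id   INTEGER PRIMARY KEY,  -- unique variation ID
    entrez_gene_id INTEGER NOT NULL REFERENCES gene(entrez_gene_id)
);
CREATE TABLE mut_pos (
    variation_id   INTEGER NOT NULL REFERENCES mut_gene(variation_id),
    hgvs_cdna      TEXT NOT NULL         -- hypothetical column name
);
""")

# Illustrative rows only: the variation IDs are invented for this sketch.
conn.execute("INSERT INTO gene VALUES (6647, 'SOD1')")
conn.executemany("INSERT INTO mut_gene VALUES (?, ?)", [(1, 6647), (2, 6647)])
conn.executemany("INSERT INTO mut_pos VALUES (?, ?)",
                 [(1, "c.272A>C"), (2, "c.14C>T")])


def variants_for_gene(symbol):
    """Return (variation_id, hgvs_cdna) rows for a gene symbol."""
    query = """
        SELECT m.variation_id, p.hgvs_cdna
        FROM gene g
        JOIN mut_gene m ON m.entrez_gene_id = g.entrez_gene_id
        JOIN mut_pos  p ON p.variation_id   = m.variation_id
        WHERE g.symbol = ?
        ORDER BY m.variation_id
    """
    return conn.execute(query, (symbol,)).fetchall()


rows = variants_for_gene("SOD1")
```

In this sketch every variation-level table is reached from “gene” only through “mut_gene”, which mirrors the gateway design described above.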

Data Sources

General gene information was obtained from the NCBI Entrez Gene Database (http://www.ncbi.nlm.nih.gov/). The GRCh37/hg19 sequence assembly (http://genome.ucsc.edu/cgi-bin/hgGateway) is used as the human genomic reference sequence. The reference mRNA and amino acid sequences for the respective genes were imported from the NCBI and Uniprot/Swiss-Prot databases (http://www.ebi.ac.uk/uniprot/), respectively. Of these, the longest mRNA and amino acid sequences were adopted as the representative sequences for each gene. We used GMAP to map all mRNA sequences against the genome and thus obtain the exon/intron structure. GMAP is a stand-alone program to map and align sequences to a genome on the basis of a minimal-sampling strategy for genome mapping, an oligomer-chaining method for approximate alignment, and a dynamic programming procedure for splice-site identification [Wu and Watanabe, 2005]. The coding regions were determined in accordance with the NCBI definitions. The motifs and domains of the protein sequences were extracted with InterProScan (http://www.ebi.ac.uk/Tools/InterProScan/), a widely used motif/domain-finding tool. Information on the secondary structures of the protein sequences was also derived from Uniprot/Swiss-Prot. Three-dimensional structural information related to the proteins was retrieved from the PDB database. Structures not available in PDB were predicted with Phyre, an ensemble system of multiple fold-recognition algorithms [Bennett-Lovsey et al., 2008]. Amino acid sequences from other species that are orthologous to the reference human sequences were found through BLAST searches at NCBI. Multiple alignments were created with the ClustalW program, using only the resulting sequences from major species. Multiple alignments of all patients' sequences for each gene were also produced with the ClustalW program.

Variation reports were exhaustively collected from full articles by expert curators. Every variation entry was manually examined, and essential information, such as genomic position, mRNA and amino acid changes, population, type of ALS (sporadic or familial), size of the studied sample, control group, and data from patients including diagnostic features, was extracted from the data source. Some statistics, such as the odds ratio and P-value, were calculated. In this study, we represent the position of each variation with respect to amino acid sequences in two ways: first, as in the original description in the publication, and second, in terms of the representative sequences mentioned above. To allow comparison between variations, diagnostic features are described in as unified a manner as possible in the database.

Novel variations identified in our laboratory have also been incorporated into the database. We developed a DNA microarray-based high-throughput resequencing system for genes associated with ALS and other neurodegenerative diseases. As of October 2009, comprehensive resequencing gene analysis has been carried out for 10 patients with familial ALS and 35 patients with sporadic ALS. All of the sequence variations determined were confirmed by direct nucleotide sequence analysis. We also performed direct sequencing on a total of 238 control genomic DNA samples. The system detected point mutations with 100% accuracy and accomplished the resequencing of 270 kbp in three working days with a base-call accuracy of more than 99.9%. Full details of the system and analysis are described in Takahashi et al. [2008].

Utility

The main page of the database presents the list of gene symbols (Fig. 1). Each of the gene symbols links to its own page, which is structured into the following sections: “Detailed information,” “Sequence information,” “Mutation and clinical information,” “Multiple alignments,” “3D structure,” “Structure information,” “Overview of mutations,” and “Mutation Search.”

Figure 1. Main page of ALS variation database.

The “Detailed information” section mainly summarizes general gene information, for example, the full name and official symbol with all known aliases and its position on the genome. In the “Sequence information” section (Fig. 2), schematic drawings of the conformations of the mRNAs from the public database and those of the original experiments are shown. Users can change the plot ratio between exon and intron regions upon selection. Here, intron regions common to all the sequences are shrunk in length according to the user's specified ratio. The locations of the variants found are superimposed over the gene sequence in order to visualize the variation distribution along the gene, where pathogenic variants are depicted by arrows whose length is in proportion to the variation rate (in other words, the number of patients with the variation divided by the number of all investigated patients). Additional drawings include active site information and motif/domain information related to the gene sequence in order to help in assessing the variation significance in gene functions.

Figure 2. Schematic drawings of sequence conformations with locations of variants found.

All available information on each variant is given in the “Mutation and Clinical Information” section. This includes the genomic position, accession IDs of the mRNA and amino acid reference sequences, a description of the variation at both the DNA and amino acid levels, the studied population, the type of ALS (sporadic or familial), variation statistics, patient data (including hetero- or homozygosity, sex, and clinical features at diagnosis), and publication details. For each variation entry, the following are calculated: the number of patients reported to have the variation, the ratio of the number of patients to all people examined with the variation, the ratio of the number of patients with the variation to all patients examined, the ratio of the number of controls with the variation to all controls examined, the odds ratio, the 95% confidence interval (CI), and the P-value. Clinical details cover age of onset, duration, disease type (upper motor neuron [UMN] and/or lower motor neuron [LMN]), onset site (lower limb, upper limb, or bulbar), and years until initiation of a respirator or death. Additional clinical characteristics are also summarized. We set out all variants in a table with three display options (Fig. 3): in “Mutation Data only” mode, unique variants are listed in the order of their genomic positions and no patient data are shown; in “Simple/Detail” mode, variants associated with each patient are described in some or full detail; in “Experimental Data only” mode, only novel variants identified in our laboratory are included. To help in predicting the variation effect, the “Structure” column provides, where available, a link to a Jmol applet (http://jmol.sourceforge.net/) that displays the three-dimensional structure of the protein with the corresponding variation highlighted.
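
The case/control statistics listed above can be sketched as follows. The text does not state which interval or test the database uses, so the Woolf (log-based) confidence interval, the Haldane-Anscombe 0.5 correction for zero cells, and the two-sided Fisher exact test shown here are assumptions. The function names are hypothetical. The example reuses the sample sizes given under Data Sources (45 patients, 238 controls), but the carrier counts are invented.

```python
import math


def odds_ratio_ci(case_with, case_total, ctrl_with, ctrl_total, z=1.96):
    """Odds ratio and Woolf (log-based) 95% CI for a 2x2 carrier table.

    If any cell is zero, 0.5 is added to every cell (an assumption of
    this sketch, not a documented choice of the database).
    """
    a, b = case_with, case_total - case_with
    c, d = ctrl_with, ctrl_total - ctrl_with
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def fisher_exact_two_sided(case_with, case_total, ctrl_with, ctrl_total):
    """Two-sided Fisher exact P-value.

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more probable than the observed one.
    """
    n_with = case_with + ctrl_with
    n = case_total + ctrl_total

    def prob(x):
        # Probability that x of the n_with carriers fall among the cases.
        return (math.comb(case_total, x)
                * math.comb(ctrl_total, n_with - x)
                / math.comb(n, n_with))

    p_obs = prob(case_with)
    x_min = max(0, n_with - ctrl_total)
    x_max = min(case_total, n_with)
    return sum(p for p in (prob(x) for x in range(x_min, x_max + 1))
               if p <= p_obs * (1 + 1e-9))


# Invented carrier counts: 5 of 45 patients, 2 of 238 controls.
or_, lo, hi = odds_ratio_ci(5, 45, 2, 238)
p_value = fisher_exact_two_sided(5, 45, 2, 238)
```

With these invented counts the odds ratio is (5 × 236) / (40 × 2) = 14.75. For rare variants, where one cell is often very small, an exact test is generally preferred over a chi-square approximation, which is why it is used in this sketch.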

Figure 3. Partial table of mutation and clinical information.

The “Overview of mutations” section shows a bird's eye view of patients' sequences and clinical characteristics, allowing users to take a first glance at genotype–phenotype correlations in a single table (Fig. 4). The alignments of all patients' sequences are provided in the “Multiple alignments” section, where variations found are represented in color. Not all of the variations shown here are related to ALS, however. Possibly pathogenic variants that are not found in the control samples are listed in the “Experimental Data only” table provided in the “Mutation and clinical information” section. Access to these pages requires prior permission in order to protect the patients' information (the application procedure is described in the following subsection).

Figure 4. Partial table of overview of mutations.

Generally, the evolutionary conservation of amino acids among different species indicates that these amino acids make an important contribution to protein functions; thus, variations in such amino acids might be of great significance in disease pathogenesis [Kulkarni et al., 2008]. The “Multiple alignments” section shows amino acid alignments of the reference human sequence with orthologous sequences of other organism species to help in assessing the variation significance. The “Structure information” section presents information on the secondary structures of the protein sequences to help in predicting the influence of variations of interest on protein functions.
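
As a rough illustration of how conservation at a variant position might be read off such an alignment, the sketch below scores one alignment column by the fraction of sequences that share its most common residue. The scoring rule, the function name, and the toy sequences are assumptions made for illustration. The database itself presents the alignments for visual inspection and is not described as computing such a score.

```python
from collections import Counter


def column_conservation(aligned_seqs, column):
    """Fraction of sequences sharing the most common residue in a column.

    aligned_seqs: equal-length strings from a multiple alignment ('-' = gap).
    column: 0-based alignment column index.
    Gaps never count as the shared residue but do count in the denominator.
    """
    residues = [seq[column] for seq in aligned_seqs]
    counts = Counter(r for r in residues if r != "-")
    if not counts:
        return 0.0
    return counts.most_common(1)[0][1] / len(residues)


# Toy alignment (invented sequences, not real orthologs).
alignment = [
    "MKAVCVLKG",
    "MKAVCVLKG",
    "MQAVCVLRG",
    "MKAV-VLKG",
]

fully_conserved = column_conservation(alignment, 0)      # all 'M'
partially_conserved = column_conservation(alignment, 1)  # K, K, Q, K
```

A variant that falls in a column scoring close to 1.0 changes a residue shared by nearly all aligned species, which is the situation the paragraph above flags as potentially significant.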

Last, the “Mutation search” section has a link to the functional page that enables users to search the database for any variants relevant to ALS. By submitting arbitrary sequences of interest in a single FASTA format, users are informed whether variants exist in the sequence, and if so, whether they are novel ones or have been previously reported. Here, a sequence-similarity search is performed using SSEARCH, which is a well-validated alignment program based on the Smith-Waterman algorithm [Smith and Waterman, 1981]. Both mRNA and amino acid sequences are permitted as queries.
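
The Smith-Waterman recurrence that underlies this kind of search can be sketched in a few lines. This is not SSEARCH: the function name is hypothetical, and the simple match/mismatch scores and linear gap penalty are assumptions chosen for brevity, whereas production tools typically use substitution matrices and affine gap penalties.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between two sequences (linear gap penalty).

    Only the previous and current rows of the scoring matrix are kept,
    so memory use is proportional to len(target).
    """
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


exact = smith_waterman_score("ACGT", "TTACGTTT")    # query embedded in target
one_mismatch = smith_waterman_score("ACGT", "ACCT")  # single substitution
```

A query that matches a stored sequence except at one position scores just below the perfect-match score, and inspecting the aligned position is what would let a search tool report a variant as previously described or novel.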

For ease of use and navigation, various useful features, such as cross references spanning Web pages and external public databases, have been built into the database. The full details are provided in the database legend.

Accessibility and Usage

The database is located at: https://reseq.lifesciencedb.jp/resequence/SearchDisease.do?targetId=1. The “Overview of mutations” and “Multiple alignments” sections contain unpublished patient information and require an ID and password, which are issued following a simple Web-based application. The database is designed for the inclusion of new variants when discovered, both from published data and from direct submission. Submission is open to all laboratories and is highly encouraged. Expert curators are dedicated to the curation of the imported data, and users are invited to contact them in order to submit new variants and report additional information for existing variants with their published journal information. All data are stored in coded form with appropriate safeguards.

Discussion

As of October 2009, the mutation database contained 638 entries extracted from 107 publications including 178 unique variants covering 25 genes. Among them, 440 entries (69%) reported in 55 publications (51%) are made up of variations found in the SOD1 gene (MIM♯ 147450), which indicates the principal focus placed on this gene in current ALS-based genetic research. Table 2 summarizes the number of entries, unique variants, and references involved in each gene. For genes with a large number of reports, a large portion of entries are related to familial forms of ALS, whereas most of the entries for genes with a small number of reports are related to sporadic ALS. On the whole, genetic research of ALS has thus far been much more focused on familial ALS, and more entries are related to familial ALS. Recently, however, an increasing number of rare variations have been reported for sporadic ALS in genes related to familial ALS. The respective numbers of the two cases for each gene are also presented in Table 2. Figure 5 shows the distribution of the number of references per variant. Most of the variants are reported in only one article. The exceptions are those in the SOD1 gene; about half of them are described in more than one article. All of the variants with more than four references are those in the SOD1 gene.

Figure 5. Number of references per variant.

Table 2. Summary of Entries in Database

Gene name   No. of entries (familial/sporadic)   No. of unique variants (familial/sporadic/both)   No. of references
SOD1        440 (313/32)                         88 (57/9/6)                                       55
ANG         49 (17/27)                           17 (3/6/3)                                        4
TARDBP      29 (21/8)                            14 (6/8/0)                                        4
VAPB        19 (19/0)                            1 (1/0/0)                                         1
NEFH        14 (4/10)                            7 (1/6/0)                                         3
ALS2        13 (0/7)                             8 (8/0/0)                                         6
VEGFA       13 (0/7)                             6 (0/4/0)                                         5
PON1        9 (0/9)                              6 (0/6/0)                                         5
DCTN1       9 (7/2)                              6 (4/2/0)                                         4
HFE         6 (0/6)                              1 (0/1/0)                                         2
LIF         5 (0/0)                              1 (0/0/0)                                         1
PON2        4 (0/4)                              3 (0/3/0)                                         3
OGG1        4 (0/4)                              1 (0/1/0)                                         1
APOE        3 (0/2)                              2 (0/1/0)                                         3
SMN1        3 (0/3)                              2 (0/2/0)                                         3
PRPH        3 (0/3)                              3 (0/3/0)                                         2
SETX        3 (3/0)                              3 (3/0/0)                                         1
DPP6        2 (0/2)                              1 (0/1/0)                                         2
PON3        2 (0/2)                              1 (0/1/0)                                         2
APEX1       2 (0/2)                              1 (0/1/0)                                         1
CHMP2B      2 (2/0)                              2 (2/0/0)                                         1
CNTF        1 (0/1)                              1 (0/1/0)                                         1
FGGY        1 (0/1)                              1 (0/1/0)                                         1
ITPR2       1 (0/1)                              1 (0/1/0)                                         1
SPAST       1 (0/1)                              1 (0/1/0)                                         1
Total       638 (399/127)                        178 (85/59/9)                                     107

For almost all of the genes, most patients are found to be heterozygous for the variations. Homozygous variations are reported for the SOD1 gene NM_000454.4:c.272A>C (corresponding to the amino acid substitution p.Asp90Ala). The c.272A>C variation is reported in the heterozygous state in control samples and appears to follow an autosomal recessive pattern of inheritance, which is not the case for c.272A>T (p.Asp90Val). For c.272A>C, we do not find any significant difference in clinical conditions (such as the age of onset) between patients with the homozygous variation and those with the heterozygous one. As for the variations in the ALS2 gene (MIM♯ 606352), all of the patients (19 individuals reported in seven publications) are homozygous. For the eight kinds of variations in ALS2, the seven publications report that patients' family members with a heterozygous variation do not develop ALS, whereas all family members with a homozygous variation develop ALS. Further, more kinds of variations are reported in control samples for ALS2 than for other genes. Currently, the number of variation kinds in case samples and in control samples is 8 and 34, respectively. Of the variations in control samples, 13 are nonsynonymous. A similar tendency is found only in TARDBP (MIM♯ 605078): the number of variation kinds is 14 in case samples and 65 (7 nonsynonymous) in control samples, including homozygous variations. Minor allele frequencies of nonsynonymous variations in control samples are more than 0.01, except for two variations. Considering the low prevalence rate of ALS, most of them might not be pathogenic even in a homozygous state. (It is important to note, however, that variation reports without disease outcomes may suffer from publication bias.) With regard to the type of variations, most are single nucleotide variations. Deletions and insertions are found in the SOD1 gene and the PRPH gene (MIM♯ 170710). As for the ALS2 gene, all of the variations in case samples are frame-shift deletions. Variations in the NEFH gene (MIM♯ 162230) are also all deletions, except one that is an insertion. One of the two variations in the SMN1 gene (MIM♯ 600354) is a duplication; the other is a deletion.
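
The argument from minor allele frequency and prevalence can be made concrete with a short calculation. It assumes Hardy-Weinberg proportions, which the text does not state, and compares the expected frequency of homozygotes at the 0.01 threshold with the upper worldwide prevalence figure quoted in the Introduction (6 per 100,000). The function name is hypothetical.

```python
def expected_homozygotes_per_100k(minor_allele_freq):
    """Expected homozygous carriers per 100,000 people under Hardy-Weinberg (q^2)."""
    return minor_allele_freq ** 2 * 100_000


maf = 0.01                # threshold quoted in the text
prevalence_upper = 6      # worldwide prevalence upper bound per 100,000 (Introduction)

homozygotes = expected_homozygotes_per_100k(maf)
exceeds_prevalence = homozygotes > prevalence_upper
```

Under these assumptions a single allele at a frequency of 0.01 would be expected to produce about 10 homozygotes per 100,000 people, more than the total prevalence of ALS. A variation that common is therefore unlikely to be fully penetrant even in the homozygous state.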

This mutation database provides a comprehensive overview of the clinical information collected from patients with the variations, allowing users to quickly assess the importance and possible influences of variations of interest. The age of onset and the duration of the disease are of particular interest, because they provide an insight into how specific variations influence the disease course and prognosis. Although variations in most genes show a wide range of values for these two features, a clear trend is seen for variations in the ALS2 gene. For the eight kinds of variations in the database, all patients show an extremely early age of onset (around 1 year old) with long duration (>30 years), indicating a connection between ALS2 variations and early onset followed by a milder course of the disease. It is also clear that a markedly short duration (3 months or less) results from two variations in the SOD1 gene, NM_000454.4:c.20G>T [Kohno et al., 1999] and c.304G>C [Sato et al., 2004, 2005], which lead to the amino acid substitutions p.Cys6Gly and p.Asp101His, respectively. Four patients are reported to have these variations, and all of them had a duration of 2 or 3 months. Interestingly, the amino acid substitution p.Asp101Asn caused by the c.304G>A variation, unlike p.Asp101His caused by c.304G>C, does not appear to be associated with short duration (25 individuals are reported, and their duration ranges from 2 to 4 years). This suggests that the duration depends on which amino acid is substituted, probably because of the change in protein structure caused by the bulky amino acid histidine. Similarly, the amino acid substitution p.Gly41Ser seems to result in a relatively shorter duration (around 1 year for 9 patients reported) than the substitution p.Gly41Asp (>11 years for 11 patients reported). The same holds for the amino acid substitution p.Asn86Lys (around 2 years for 7 patients reported) compared to p.Asn86Ser (>5 years for 5 patients reported). In addition, as reported in previous studies, the SOD1 variation NM_000454.4:c.14C>T (corresponding to the amino acid substitution p.Ala4Val) is associated with a relatively short duration (<1–2 years). All of the 104 patients reported showed a duration of 1 to 2 years. As the amount of clinical information increases, it will become possible to pursue extensive investigations of the relationships between variations and other characteristics such as onset site and initial symptoms. Furthermore, information on ethnic groups will enable users to understand regional research efforts regarding the respective genes. This information is also valuable for identifying significant genetic factors, namely, those reported to be present in many ethnic groups. Apart from SOD1, variations in the ANG gene (MIM♯ 105850) have so far been reported from the largest number of populations, with eight ethnic groups studied, covering individuals of Arab, Asian, and European descent.

In summary, this new mutation database will serve as a valuable tool, for both researchers and clinicians, for investigating genetic evidence and ultimately developing new therapeutics for ALS through systematic data mining of the integrated genetic and clinical information. Identifying the relationships between various genetic factors is an essential step forward in understanding how they affect ALS, particularly when the involvement of multiple genes, and of a number of variants in each gene, in the disease mechanisms is considered. Our future work will focus on developing an information analysis system that can assess the variation effects in terms of their associations with each other.

Acknowledgements

The mutation database was constructed as part of the Life Science Integrated Database Project conducted by the Japan Ministry of Education, Culture, Sports, Science, and Technology.

References

  • Bennett-Lovsey RM, Herbert AD, Sternberg MJE, Kelley LA. 2008. Exploring the extremes of sequence/structure space with fold recognition in the program Phyre. Proteins 70:611625.
  • Camu W, Khoris J, Moulard B, Salachas F, Briolotti V, Rouleau GA, Meininger V. 1999. Genetics of familial ALS and consequences for diagnosis. J Neurol Sci 165:S21S26.
  • Chen YZ, Bennett CL, Huynh HM, Blair IP, Puls I, Irobi J, Dierick I, Abel A, Kennerson ML, Rabin BA, Nicholson GA, Auer-Grumbach M, Wagner K, Jonghe PD, Griffin JW, Fischbeck KH, Timmerman V, Cornblath DR, Chance PH. 2004. DNA/RNA helicase gene mutations in a form of juvenile amyotrophic lateral sclerosis (ALS4). Am J Hum Genet 74:11281135.
  • den Dunnen JT, Antonarakis SE. 2000. Mutation nomenclature extensions and suggestions to describe complex mutations: a discussion. Hum Mutat 15:712.
  • Gitcho MA, Baloh RH, Chakraverty S, Mayo K, Norton JB, Leritch D, Hatanpaa KJ, White 3rd CL, Bigio EH, Caselli R, Baker M, Al-Lozi MT, Morris JC, Pestronk A, Rademakers R, Goate AM, Cairns NJ. 2008. TDP-43 A315T mutation in familial motor neuron disease. Ann Neurol 63:535538.
  • Gout AM, Martin NC, Brown AF, Ravine D. 2007. PKDB: polycystic kidney disease mutation database—a gene variant database for autosomal dominant polycystic kidney disease. Hum Mutat 28:654–659.
  • Greenway MJ, Anderson PM, Russ C, Ennis S, Cashman S, Donaghy C, Patterson V, Swingler R, Kieran D, Prehu J, Morrison KE, Green A, Acharya KR, Brown Jr RH, Hardiman O. 2006. ANG mutations segregate with familial and “sporadic” amyotrophic lateral sclerosis. Nat Genet 38:411–413.
  • Kohno S, Takahashi Y, Miyajima H, Serizawa M, Mizoguchi K. 1999. A novel mutation (Cys6Gly) in the Cu/Zn superoxide dismutase gene associated with rapidly progressive familial amyotrophic lateral sclerosis. Neurosci Lett 276:135–137.
  • Kulkarni V, Errami M, Barber R, Garner HR. 2008. Exhaustive prediction of disease susceptibility to coding base changes in the human genome. BMC Bioinformatics 9:S3.
  • Kwiatkowski Jr TJ, Bosco DA, Leclerc AL, Tamrazian E, Vanderburg CR, Russ C, Davis A, Gilchrist J, Kasarskis EJ, Munsat T, Valdmanis P, Rouleau GA, Hosler BA, Cortelli P, de Jong PJ, Yoshinaga Y, Haines JL, Pericak-Vance MA, Yan J, Ticozzi N, Siddique T, McKenna-Yasek D, Sapp PC, Horvitz HR, Landers JE, Brown Jr RH. 2009. Mutations in the FUS/TLS gene on chromosome 16 cause familial amyotrophic lateral sclerosis. Science 323:1205–1208.
  • Mitsumoto H, Chad DA, Pioro EP. 1998. Amyotrophic lateral sclerosis. New York: Oxford University Press.
  • Radunovic A, Leigh PN. 1999. ALSODatabase: database of SOD1 (and other) gene mutations in ALS on the Internet. Amyotroph Lateral Scler Other Motor Neuron Disord 1:45–49.
  • Rosen DR, Siddique T, Patterson D, Figlewicz DA, Sapp P, Hentati A, Donaldson D, Goto J, O'Regan JP, Deng HX, Rahmani Z, Krizus A, McKenna-Yasek D, Cayabyab A, Gaston SM, Berger R, Tanzi RE, Halperin JJ, Herzfeldt B, Van den Bergh R, Hung WY, Bird T, Deng G, Mulder DW, Smyth C, Laing NG, Soriano E, Pericak-Vance MA, Haines J, Rouleau GA, Gusella JS, Horvitz HR, Brown Jr RH. 1993. Mutations in Cu/Zn superoxide dismutase gene are associated with familial amyotrophic lateral sclerosis. Nature 362:59–62.
  • Runz H, Dolle D, Schlitter AN, Zschocke J. 2008. NPC-db, a Niemann-Pick Type C disease gene variation database. Hum Mutat 29:345–350.
  • Sato T, Nakanishi T, Yamamoto Y, Andersen PM, Ogawa Y, Fukada K, Zhou Z, Aoike F, Sugai F, Nagano S, Hirata S, Ogawa M, Nakano R, Ohi T, Kato T, Nakagawa M, Hamasaki T, Shimizu A, Sakoda S. 2005. Rapid disease progression correlates with instability of mutant SOD1 in familial ALS. Neurology 65:1954–1957.
  • Sato T, Yamamoto Y, Nakanishi T, Fukada K, Sugai F, Zhou Z, Okuno T, Nagano S, Hirata S, Shimizu A, Sakoda S. 2004. Identification of two novel mutations in the Cu/Zn superoxide dismutase gene with familial amyotrophic lateral sclerosis: mass spectrometric and genomic analyses. J Neurol Sci 218:79–83.
  • Schymick JC, Talbot K, Traynor BJ. 2007. Genetics of sporadic amyotrophic lateral sclerosis. Hum Mol Genet 16:R233–R242.
  • Smith TF, Waterman MS. 1981. Identification of common molecular subsequences. J Mol Biol 147:195–197.
  • Sreedharan J, Blair IP, Tripathi VB, Hu X, Vance C, Rogelj B, Ackerley S, Durnall JC, Williams KL, Buratti E, Baralle F, de Belleroche J, Mitchell JD, Leigh PN, Al-Chalabi A, Miller CC, Nicholson G, Shaw CE. 2008. TDP-43 mutations in familial and sporadic amyotrophic lateral sclerosis. Science 319:1668–1672.
  • Takahashi Y, Seki N, Ishiura H, Mitsui J, Matsukawa T, Kishino A, Onodera O, Aoki M, Shimozawa N, Murayama S, Itoyama Y, Suzuki Y, Sobue G, Nishizawa M, Goto J, Tsuji S. 2008. Development of a high-throughput microarray-based resequencing system for neurological disorders and its application to molecular genetics of amyotrophic lateral sclerosis. Arch Neurol 65:1326–1332.
  • Tang S, Zhang Z, Kavitha G, Tan EK, Ng SK. 2008. MDPD: an integrated genetic information resource for Parkinson's disease. Nucleic Acids Res 37:D858–D862.
  • Van Deerlin VM, Leverenz JB, Bekris LM, Bird TD, Yuan W, Elman LB, Clay D, Wood EM, Chen-Plotkin AS, Martinez-Lage M, Steinbart E, McCluskey L, Grossman M, Neumann M, Wu IL, Yang WS, Kalb R, Galasko DR, Montine TJ, Trojanowski JQ, Lee VM, Schellenberg GD, Yu CE. 2008. TARDBP mutations in amyotrophic lateral sclerosis with TDP-43 neuropathology: a genetic and histopathological analysis. Lancet Neurol 7:409–416.
  • Vance C, Rogelj B, Hortobagyi T, De Vos KJ, Nishimura AL, Sreedharan J, Hu X, Smith B, Ruddy D, Wright P, Ganesalingam J, Williams KL, Tripathi V, Al-Saraj S, Al-Chalabi A, Leigh PN, Blair IP, Nicholson G, de Belleroche J, Gallo JM, Miller CC, Shaw CE. 2009. Mutations in FUS, an RNA processing protein, cause familial amyotrophic lateral sclerosis type 6. Science 323:1208–1211.
  • Wroe R, Wai-Ling Butler A, Andersen PM, Powell JF, Al-Chalabi A. 2008. ALSOD: the amyotrophic lateral sclerosis online database. Amyotroph Lateral Scler 9:249–250.
  • Wu TD, Watanabe CK. 2005. GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21:1859–1875.

Supporting Information

Additional Supporting Information may be found in the online version of this article.

Filename | Format | Size | Description
humu_21306_sm_SupplInfo.pdf | PDF | 121K | Supplementary Materials