Comprehensive analysis of candidate genes for photosensitivity using a complementary bioinformatic and experimental approach

Address correspondence to Sarah von Spiczak, MD, Department of Neuropediatrics, University Medical Centre Schleswig-Holstein, Arnold-Heller-Str. 3, Building 9, 24105 Kiel, Germany. E-mail: s.vonspiczak@pedneuro.uni-kiel.de

Summary

Photoparoxysmal response (PPR) is a highly heritable electroencephalographic trait characterized by an increased sensitivity to photic stimulation. It may serve as an endophenotype for idiopathic generalized epilepsy. Family linkage studies identified susceptibility loci for PPR on chromosomes 5q35.3, 8q21.13, and 16p13.3. This study aimed to identify key candidate genes within these loci. We used bioinformatics tools for gene prioritization integrating information on biologic function, sequence data, gene expression, and others. The prime candidate gene from this analysis was sequenced in 48 photopositive probands. Presumed functional implications of identified polymorphisms were investigated using bioinformatics methods. The glutamate receptor subunit gene GRIN2A was identified as a prime candidate gene. Sequence analysis revealed various new polymorphisms. None of the identified variants was predicted to be functionally relevant. We objectified the selection of candidate genes for PPR without an a priori hypothesis. Particularly among the various ion channel genes in the linkage regions, GRIN2A was identified as the prime candidate gene. GRIN2A mutations have recently been identified in various epilepsies. Even though our mutation analysis failed to demonstrate direct involvement of GRIN2A in photosensitivity, in silico gene prioritization may provide a useful tool for the identification of candidate genes within large genomic regions.

Despite efforts to unravel the genetics of idiopathic (genetic) generalized epilepsies (IGEs), success in identifying genetic markers for IGEs has been limited. Using photosensitivity (photoparoxysmal response, PPR) as an endophenotype has been advocated to reduce phenotypic and genetic heterogeneity and complexity (Helbig et al., 2008).

Photosensitivity is characterized by an abnormal visual sensitivity of the brain to photic stimulation and occurs at a high incidence in patients with IGE. Recently, a meta-analysis of several published linkage studies and additional families revealed suggestive nonparametric linkage for three loci on chromosomes 5q35.3, 8q21.13, and 16p13.3 (De Kovel et al., 2010).

In total, 450 genes are located within these three linkage regions. Identifying key candidate genes among them is difficult when relying on experimental methods alone, such as traditional sequence analysis. Moreover, conventional selection of candidate genes is prone to personal bias because it requires an a priori hypothesis.

Within this study, we explore a combined approach of candidate gene selection and analysis using both bioinformatics tools and candidate gene sequencing.

Methods

The study protocol was approved by the local ethics committee, and all participants or their parents/legal guardians gave written informed consent. The study outline is shown in Figure S1. Detailed information and relevant literature on all bioinformatic programs are presented as Supporting Information.

Computational prioritization of candidate genes

Four web-based, freely available bioinformatics tools were selected in order to use complementary methods covering different data sources as input.

The chromosomal locations of the linkage regions published by De Kovel et al. (2010) ±5 Mb (5q35.3 = chr5:171,600,001-185,915,260; 8q21.13 = chr8:75,100,001-89,600,000; 16p13.3 = chr16:1-12,900,000; NCBIBuild37) were used as input for the different programs. For Endeavour and for PROSPECTR and SUSPECTS, which require a training set of genes, epilepsy genes listed in OMIM (Online Mendelian Inheritance in Man, search term “epilepsy”) were used.
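As a minimal sketch of how such input intervals can be handled, the following Python snippet checks whether a gene lies completely inside one of the three regions. The interval boundaries are those given in the text and the GRIN2A coordinates are those listed in Table 1; the names `Region`, `LINKAGE_REGIONS`, and `locus_for_gene` are hypothetical helpers introduced for illustration and are not part of any of the programs used in the study.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


# Input intervals as stated in the text (NCBI Build 37),
# already including the +/-5 Mb margin.
LINKAGE_REGIONS = {
    "5q35.3": Region("chr5", 171_600_001, 185_915_260),
    "8q21.13": Region("chr8", 75_100_001, 89_600_000),
    "16p13.3": Region("chr16", 1, 12_900_000),
}


def locus_for_gene(chrom: str, start: int, end: int) -> Optional[str]:
    """Return the name of the region that fully contains the gene, or None."""
    for name, region in LINKAGE_REGIONS.items():
        if chrom == region.chrom and start >= region.start and end <= region.end:
            return name
    return None


# GRIN2A coordinates as listed in Table 1.
print(locus_for_gene("chr16", 9_847_261, 10_276_611))  # 16p13.3
```

Requiring full containment is an illustrative choice; a gene that only partly overlaps a region boundary would need a looser overlap test.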

Study cohort

The study cohort comprised 48 probands with PPR type II–IV (Waltz et al., 1992) recruited at the Department of Neuropediatrics, University Hospital Schleswig-Holstein (Kiel, Germany). Patients were diagnosed with IGEs or were without history of seizures.

DNA from individual blood samples was extracted using commercially available kits.

Sequence analysis

Mutation analysis of GRIN2A including all exons, exon–intron boundaries, and the promoter region was performed using the NCBI Primer Set RSS000057426.1 and additionally designed primers (Primer3, http://www.primer3.sourceforge.net, Table S1). Amplification by polymerase chain reaction (PCR) and bidirectional sequencing was performed following standard protocols. Polymorphisms and InDels were identified by NovoSNP 3.0 (Weckx et al., 2005).

In silico analysis to predict functional impact of polymorphisms

To assess possible functional implications of newly identified single nucleotide polymorphisms (SNPs), we used a complementary set of bioinformatics tools. Default parameters were used for all programs.

Nonsynonymous coding polymorphisms: PolyPhen2, SIFT, and SNAP (Table 2).

Intronic SNPs: BDGP and HSF V2.4 (Table 3).

Analysis of potential effects of polymorphisms within the promoter region on transcription factor binding sites (TFBS): FastSNP and SNPInspector 2.2 (Table 4).

Follow-up of variant chr16:g.10277068G>A

The polymorphism chr16:g.10277068G>A was investigated in a control cohort of 358 healthy blood donors of reported German descent, who were negative for neurologic and psychiatric diseases as screened by standardized questionnaires (Popgen cohort, http://www.popgen.de), using a custom-made TaqMan™ SNP assay (Applied Biosystems, Carlsbad, CA, U.S.A.).

Results

Computational prioritization of candidate genes

The top 50 results generated by each program were compared to identify genes shared by at least three of the four programs. Detailed results are shown in Table 1. Because of its biologic function as a subunit of the ionotropic glutamate receptor and recent reports of involvement in epilepsy (Endele et al., 2010; Reutlinger et al., 2010), GRIN2A was chosen for subsequent analysis by mutation screening.
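The overlap criterion (a gene must appear in the top list of at least three of the four programs) can be sketched as follows. This is an illustration only: the gene names and list contents are hypothetical placeholders, not the actual program output, and `consensus_genes` is a helper name introduced here.

```python
from collections import Counter


def consensus_genes(top_lists, min_tools=3):
    """Return genes that occur in the top list of at least `min_tools` programs."""
    counts = Counter()
    for genes in top_lists.values():
        counts.update(set(genes))  # count each gene once per program
    return sorted(gene for gene, n in counts.items() if n >= min_tools)


# Hypothetical toy lists standing in for the top 50 genes of each program.
toy_lists = {
    "Endeavour": ["GENE_A", "GENE_B", "GENE_C"],
    "Prioritizer": ["GENE_A", "GENE_D"],
    "G2D": ["GENE_A", "GENE_B", "GENE_D"],
    "PandS": ["GENE_B", "GENE_C", "GENE_A"],
}

print(consensus_genes(toy_lists))  # ['GENE_A', 'GENE_B']
```

Counting list membership rather than combining raw scores avoids having to compare the different scoring scales of the four programs.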

Table 1.   Results of candidate gene prioritization

| Gene symbol | Full name | Position | Endeavour (a) | Prioritizer (b) | G2D (c) | PandS (d) |
|---|---|---|---|---|---|---|
| DDX41 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 41 | Chr. 5q35.3: 176,938,578–176,944,470 | | x (0.283) | x (0.268) | x (0.607/6) |
| B4GALT7 | Xylosylprotein beta 1,4-galactosyltransferase, polypeptide 7 (galactosyltransferase I) | Chr. 5q35.3: 177,027,101–177,037,348 | x (49) | x (0.184) | x (0.018) | x (0.607/11) |
| CLCN7 | Chloride channel 7 | Chr. 16p13.3: 1,494,935–1,525,085 | x (1) | x (0.128) | x (0.003) | x (0.677/2) |
| PKD1 | Polycystic kidney disease 1 (autosomal dominant); transient receptor potential cation channel, subfamily P, member 1 | Chr. 16p13.3: 2,138,711–2,185,899 | x (6) | | x (0.062) | x (0.717/4) |
| ABCA3 | ATP-binding cassette, subfamily A (ABC1), member 3 | Chr. 16p13.3: 2,325,877–2,390,747 | x (40) | | x (0.046) | x (0.717/6) |
| GRIN2A | Glutamate receptor, ionotropic, N-methyl-d-aspartate 2A | Chr. 16p13.3: 9,847,261–10,276,611 | x (5) | | x (0.032) | x (0.601/1) |

Numbers in parentheses indicate program scores; x means the gene scored among the top 50 genes of the respective program.

(a) Endeavour creates ranks for each data source that are fused by order statistics, resulting in an overall rank (global prioritization score: 1 → n).

(b) Prioritizer assesses the interaction of genes located within different susceptibility loci with respect to their involvement in common gene networks. It calculates an empiric p-value by permutation to correct for differences in the topology of gene networks.

(c) G2D gives an R-score (relative score). Low values (close to zero) indicate the possibility of a relation between the sequence investigated and a certain disease.

(d) The output of PROSPECTR is a score α (0–1) representing the likelihood of involvement in disease, with higher scores suggesting a higher likelihood. SUSPECTS combines the score of PROSPECTR with additional scores for co-expression of candidate genes with known disease genes, rare Interpro domains, and semantic similarities of GO terms, weighted for the amount of information available. This total score is converted into a total rank.

Study cohort

Details on diagnoses are given in Table S2.

Sequence analysis

Sequencing revealed seven previously unknown single nucleotide polymorphisms. Apart from one polymorphism within the promoter region (chr16:g.10277263G>A, NCBIBuild37), which was present in the heterozygous state in 8/48 probands (16.7%) and in the homozygous state in one proband (2.1%), all SNPs were found in the heterozygous state in a single patient each.

In silico analysis to predict functional impact of polymorphisms and follow-up of variant chr16:g.10277068G>A

Detailed results of the in silico analysis are shown in Tables 2–4. For polymorphism chr16:g.10277068G>A, a new binding site for transcription factor MZF1 was predicted. Investigation of this variant in controls revealed a heterozygous status in 3 of 358 control individuals, equivalent to a minor allele frequency of 0.4%. None of the other polymorphisms identified was rated as functionally relevant by at least two bioinformatics tools.
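The reported minor allele frequency follows from counting alleles rather than individuals: each of the 358 controls carries two alleles, so three heterozygous carriers contribute 3 minor alleles out of 716. A minimal sketch of this arithmetic is given below; the function name is a hypothetical helper, and the absence of homozygous carriers among the controls is an assumption, since the text reports heterozygous carriers only.

```python
def minor_allele_frequency(n_het: int, n_hom_minor: int, n_individuals: int) -> float:
    """Minor allele frequency from genotype counts in a diploid sample."""
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    minor_alleles = n_het + 2 * n_hom_minor  # homozygotes carry two minor alleles
    return minor_alleles / (2 * n_individuals)


# Control cohort: 3 heterozygous carriers among 358 individuals
# (assumption: no homozygous carriers).
maf_controls = minor_allele_frequency(n_het=3, n_hom_minor=0, n_individuals=358)
print(f"{maf_controls:.1%}")  # 0.4%
```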

Table 2.   In silico analysis of the nonsynonymous coding SNP g.9858211A>G

| Localization | Position (NCBIBuild37), base exchange | AS exchange | PolyPhen2 score* | PolyPhen2 prediction | SIFT score | SIFT prediction | SNAP accuracy | SNAP prediction |
|---|---|---|---|---|---|---|---|---|
| Exon 14 | g.9858211A>G | Thr1064Ala | 0.003 | Benign | 0.74 | Tolerated | 78% | Neutral |

*For PolyPhen2, the HumDiv-trained program was used as recommended for complex traits.

Table 3.   In silico analysis of intronic noncoding SNPs

| Localization | Position (NCBIBuild37), base exchange | AS exchange | BDGP | HSF V2.4, wild-type | HSF V2.4, mutated |
|---|---|---|---|---|---|
| 5′UTR, exon 2 | g.10275749C>A | | No difference | No difference | No difference |
| Intron 9/10 | g.9923799A>C | | No difference | ASS 82.02; DSS 80.32 | ASS 53.07 (a); DSS 75.57 |
| Intron 13/14 | g.9858894T>C | | No difference | No difference | No difference |

ASS, acceptor splice site; DSS, donor splice site.

(a) Variation −35.3%, resulting in loss of acceptor splice site in mutated allele.

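The variation quoted for the intron 9/10 variant is the relative change of the mutated score against the wild-type score. A short sketch of that arithmetic, using the values from Table 3, is shown below; `relative_change_percent` is a hypothetical helper name, and no scoring threshold of the prediction program is assumed or reproduced here.

```python
def relative_change_percent(wild_type: float, mutated: float) -> float:
    """Relative change of a splice-site score, in percent of the wild-type score."""
    return (mutated - wild_type) / wild_type * 100.0


# Scores for g.9923799A>C as listed in Table 3.
ass_change = relative_change_percent(82.02, 53.07)  # acceptor splice site
dss_change = relative_change_percent(80.32, 75.57)  # donor splice site

print(round(ass_change, 1))  # -35.3
print(round(dss_change, 1))  # -5.9
```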
Table 4.   In silico analysis of promoter SNPs

| Localization | Position (NCBIBuild37), base exchange | AS exchange | FastSNP transcription factor | FastSNP score | SNPInspector 2.2 transcription factor | SNPInspector 2.2 matrix sim |
|---|---|---|---|---|---|---|
| 5′UTR, promoter | g.10277263G>A | | TFBS not changed | | New site for EEF2 | 0.863 |
| 5′UTR, promoter | g.10277068G>A | | New site for MZF1 | 94.8 | New site for MZF1 | 0.991 |
| | | | | | Site lost for WT1 | 0.943 |
| | | | | | Site lost for XCPE1 | 0.801 |
| 5′UTR, promoter | g.10276998T>C | | TFBS not changed | | Site lost for BCL6 | 0.76 |

TFBS, transcription factor binding site; transcription factors: EEF2, eukaryotic translation elongation factor 2; MZF1, myeloid zinc finger 1; WT1, Wilms tumor 1; XCPE1, X core promoter element 1; BCL6, B-cell CLL/lymphoma 6.

Discussion

In the present study, we attempted to identify candidate genes by combining several complementary bioinformatics tools with traditional sequencing techniques.

Photosensitivity can be used as an endophenotype for idiopathic generalized epilepsies (Helbig et al., 2008). Endophenotypes have been used to investigate the genetic background of common complex diseases, assuming that the genetic basis of the endophenotype is less complex than the genetic basis of the disease in question. This concept has been applied to psychiatric disorders (Cannon & Keller, 2006) and neurologic syndromes (Stefansson et al., 2007).

A recent analysis of whole-genome linkage data for photosensitivity combining previous linkage studies and the analysis of additional families revealed several linkage peaks (De Kovel et al., 2010).

Previous candidate gene studies on photosensitivity failed to demonstrate reproducible major effects of the investigated genes. The genes investigated were chosen because of their biologic functions (e.g., Von Spiczak et al., 2010) and presumed involvement in idiopathic epilepsies (e.g., Lorenz et al., 2006).

Traditional candidate gene selection identifies genes based on known biologic functions and considerations of possible involvement in pathophysiologic processes. Accordingly, this method is highly subjective (Zhu & Zhao, 2007). Systematically screening the available information is beyond the possibilities of individual researchers. Screening all genes located within a given linkage region is time consuming and cost-intensive. Although recent technical advances such as next-generation sequencing techniques are likely to overcome these problems, these methods are expensive and not yet available for most researchers. Accordingly, alternative approaches for candidate gene selection are necessary.

Several online bioinformatics tools have been developed to facilitate candidate gene prioritization (Tranchevent et al., 2010). These programs differ with respect to data sources (gene ontology, gene function, sequence data, protein–protein interactions, information on gene expression, and others), computational methods, and prioritization algorithms. The basic approach of selecting candidate genes for follow-up studies by computational prioritization is, therefore, to some degree similar to the way researchers proceed. However, the amount of data analyzed for prioritization is far larger than what individual researchers can evaluate. The programs used within our study were selected based on the complementarity of data sources used for gene prioritization, aiming to avoid bias toward specific aspects and data categories and to increase overall reliability.

In our study, gene prioritization revealed six genes as prime candidates for photosensitivity identified by at least three of the four bioinformatics tools. Of these, GRIN2A encodes the 2A subunit of the ionotropic N-methyl-d-aspartate (NMDA) glutamate receptor. NMDA receptors (NMDARs) are critically involved in excitatory synaptic transmission, plasticity, and excitotoxicity in the central nervous system. Recently, involvement of GRIN2A in human epilepsies was suggested (Endele et al., 2010; Reutlinger et al., 2010). Given the functional implications of NMDARs and the involvement of GRIN2A in epilepsy, the identification of GRIN2A by candidate gene prioritization seems both plausible and reliable.

To assess the importance of GRIN2A in photosensitivity, we sequenced the gene in photosensitive probands. One promoter variation at chr16:g.10277068G>A was predicted to create a new binding site for transcription factor MZF1 (myeloid zinc finger 1). Accordingly, this newly created MZF1 binding site may affect the expression of GRIN2A in the CNS. The polymorphism was found to be present at a minor allele frequency of 0.4% in a control population. Additional studies are needed to further evaluate this finding.

None of the other polymorphisms identified was concordantly rated to have an impact on protein function, splicing, or gene regulation by the applied programs.

In summary, a combination of bioinformatics tools and traditional sequencing efforts was applied to evaluate candidate genes for photosensitivity. Gene prioritization revealed GRIN2A as an intriguing candidate. Although sequence analysis failed to show a direct role of GRIN2A in photosensitivity, GRIN2A is increasingly recognised in human epilepsies. We suggest that in silico prioritization has the capability to identify exciting genes within a given linkage region.

Acknowledgments

We would like to thank I. Urbach, M. Newsky, A. Dietsch, S. Greve, and M. Depta for technical assistance. We are grateful to A. Ackerhans and K. Moldenhauer for database management. SvS receives institutional support from the Christian-Albrechts-University Kiel, Germany and received a scholarship from the German Epilepsy Society for research activities (Otfrid-Foerster-Stipendium).

Disclosure

None of the authors has any conflict of interest to disclose.

We confirm that we have read the Journal’s position on issues involved in ethical publication and affirm that this report is consistent with those guidelines.
