Polygenic transmission and complex neuro developmental network for attention deficit hyperactivity disorder: Genome-wide association study of both common and rare variants


  • Li Yang and Benjamin M. Neale contributed equally to this study.
  • For Psychiatric GWAS Consortium-ADHD subgroup authors see Supplementary Materials.

Correspondence to:

Yufeng Wang, Peking University Sixth Hospital, 51, Huayuan Bei Road, Haidian District, Beijing 100191, China.

E-mail: wangyf@bjmu.edu.cn

Correspondence to:

Stephen V. Faraone, Ph.D., Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY 13210.

E-mail: sfaraone@childpsychresearch.org


Attention-deficit hyperactivity disorder (ADHD) is a complex polygenic disorder. This study aimed to discover common and rare DNA variants associated with ADHD in a large homogeneous Han Chinese ADHD case–control sample. The sample comprised 1,040 cases and 963 controls. All cases met DSM-IV ADHD diagnostic criteria. We used the Affymetrix 6.0 array to assay both single nucleotide polymorphisms (SNPs) and copy number variants (CNVs). Genome-wide association analyses were performed using PLINK. SNP-heritability and SNP-genetic correlations with ADHD in Caucasians were estimated with genome-wide complex trait analysis (GCTA). Pathway analyses were performed using the Interval enRICHment Test (INRICH), the Disease Association Protein–Protein Link Evaluator (DAPPLE), and the Genomic Regions Enrichment of Annotations Tool (GREAT). We did not find genome-wide significance for single SNPs but did find an increased burden of large, rare CNVs in the ADHD sample (P = 0.038). SNP-heritability was estimated to be 0.42 (standard error 0.13, P = 0.0017) and the SNP-genetic correlation with European Ancestry ADHD samples was 0.39 (SE 0.15, P = 0.0072). The INRICH, DAPPLE, and GREAT analyses implicated several gene ontology cellular components, including neuron projections and synaptic components, which are consistent with a neurodevelopmental pathophysiology for ADHD. This study suggested that the genetic architecture of ADHD comprises both common and rare variants. Some common causal variants are likely to be shared between Han Chinese and Caucasians. Complex neurodevelopmental networks may underlie ADHD's etiology. © 2013 Wiley Periodicals, Inc.


Attention-deficit hyperactivity disorder (ADHD) is a common behavioral disorder of childhood, affecting 3–6% of school-age children around the world [Faraone et al., 2003]. It has been viewed as a polygenic, multifactorial disorder. Both common and rare DNA variants contribute to its complex etiology [Poelmans et al., 2011; Stergiakouli et al., 2012; Williams et al., 2012].

Genome-wide association studies (GWAS) are hypothesis-free, interrogate all genes and regulatory regions of the genome, and have the potential to discover novel risk genes. The first GWAS of ADHD, performed by Neale et al. [2008], analyzed 438,784 SNPs in 909 Caucasian ADHD trios. Although none of the SNP association tests achieved genome-wide significance, the top 25 SNPs (based on P-value) implicated some interesting candidate genes, including the cytoskeleton organizer DCLK1, the extracellular matrix component SPOCK3, the cell-cell adhesion protein CDH13, and two potassium-channel regulators, KCNIP1 and KCNIP4. Using the same sample set, Lasky-Su et al. [2008] performed a quantitative genome-wide association analysis of ADHD symptoms. A high percentage (30/32, 94%) of the genes hit by the 58 SNPs with P-values less than 10⁻⁵ were brain-expressed, including five related to transcription factors.

Meanwhile, Lesch et al. [2008] used independent DNA pools from 343 ADHD-affected adults and 304 controls for association analyses of the ADHD diagnostic phenotype. Of the 30 top-hit genes, seven were involved in cell adhesion/migration/neurogenesis (e.g., CDH13, ASTN2, CSMD2, ITGAE, ITGA11, CDH23, SDK2), two regulated synaptic plasticity (e.g., CTNNA2, KALRN), three were transcription factors (MYT1L, TFEB, SUPT3H), and one coded for a potassium channel (KCNC1) [Lesch et al., 2008].

Neale et al. [2010a] performed case-control analyses in 896 cases with DSM-IV ADHD and 2,455 controls. A consensus dataset of 1,033,244 SNPs was imputed (using the HapMap Phase III European CEU and TSI samples as the reference). No genome-wide significant associations were found. The most significant results implicated PRKG1, FLNC, TCERG1L, PPM1H, NXPH1, CDH13, HK1, and HKDC1. Combining data from four ADHD GWAS projects, Neale et al. [2010b] performed a meta-analysis in a sample of 2,064 trios, 896 cases, and 2,455 controls. Even with this much larger sample size, no genome-wide significant associations were found. One reason for this is that the samples were underpowered to estimate effect sizes of common variants of small effect. This has been confirmed by analyses that estimate the variance contributed by all common variants together. Genome-wide complex trait analysis (GCTA) [Yang et al., 2010, 2011; Lee et al., 2011], applied to ADHD samples (4,163 cases and 12,040 controls) from the Psychiatric Genomics Consortium, estimated SNP chip heritability to be 0.28 (SE 0.02; Psychiatric GWAS Consortium ADHD Group. Paper submitted for publication).

Copy number variations (CNVs) have also been implicated in the etiology of ADHD. Elia et al. [2010] found that inherited rare CNVs in an ADHD sample were significantly enriched for genes known to be important for psychological and neurological functions, including learning, behavior, synaptic transmission, and central nervous system development. Williams et al. [2010, 2012] found an increased burden of large, rare CNVs and reported an excess of chromosome 16p13.11 and 15q13.3 duplications and an overlap between CNVs reported for ADHD and autism spectrum disorders. Elia et al. [2012] further showed that CNVs affecting the metabotropic glutamate receptor genes GRM5, GRM7, GRM8, and GRM1 were enriched across several independent samples.

In summary, although ADHD is acknowledged to be a genetic disorder, GWAS has not revealed any common SNP variants with genome-wide significance. This study examined both common and rare variants, using polygenic and pathway analyses, to evaluate the genetic etiology of ADHD in a large homogeneous Han Chinese case–control sample.



One thousand and forty ADHD cases (876 boys, 84.2%) aged between 6 and 16 years (mean 9.7 ± 2.4 years) were recruited from the Child and Adolescent Psychiatric Outpatient Department of the Sixth Hospital, Peking University. All cases met DSM-IV ADHD diagnostic criteria. A clinical diagnosis was first made by a senior child and adolescent psychiatrist based on the parent- and teacher-completed ADHD Rating Scale-IV (ADHD-RS-IV), and then confirmed by semi-structured interview with the parents and child using the Chinese version of the Clinical Diagnostic Interview Scale [Barkley, 1998; Yang et al., 2004]. Those with major neurological disorders (e.g., epilepsy), schizophrenia, pervasive developmental disorder, or mental retardation (IQ < 70) were excluded. The sample consisted of 680 (65.4%) ADHD combined type and 360 (34.6%) inattentive type cases. The comorbidities included oppositional defiant disorder (ODD) in 380 patients (36.5%), conduct disorder in 58 (5.6%), and tic disorder in 167 (16.1%).

Nine hundred and sixty-three controls were students from local elementary schools, healthy blood donors from the Blood Center of the First Hospital, Peking University, and healthy volunteers from our institute. Six hundred and eight were males (63.1%). The mean age was 15.4 ± 8.8 years. Parents or the adults themselves completed the ADHD Rating Scale-IV (ADHD-RS-IV) to exclude ADHD. Major psychiatric disorders, family history of psychosis, severe physical diseases, and substance abuse were also excluded according to a medical history report form. All the cases and controls were of Han Chinese descent.

The study was approved by the Institutional Review Board of the Peking University Health Science Center. After complete description of the study to the subjects, written informed consent was obtained from parents of the ADHD probands.


Both cases and controls were genotyped using the Affymetrix 6.0 array at CapitalBio Ltd. (Beijing) using the standard Affymetrix protocol. Samples of cases and controls were added in equal proportion to each chip to avoid batch effects. The Affymetrix 6.0 array included 906,600 SNP probes and 946,000 CNV probes. The SNP genotypes were called with BIRDSEED v2, while CNVs were called with Genotyping Console (GTC) 4.0 using default parameters. A total of 2,003 cases and controls passed the first-stage sample quality control, with call rates >98%, no first- or second-degree relative relationships, and genders consistent with site reports.

Data Quality Control and Statistical Analysis

Data quality control and association analysis were performed using PLINK 1.07 [Purcell et al., 2007, http://pngu.mgh.harvard.edu/purcell/plink/]. For inclusion of SNPs we required: call rate >95%, MAF >1%, and HWE P-value >10⁻⁶. After data cleaning, there were 656,051 SNPs for the association analyses. To examine population stratification, we performed multi-dimensional scaling (MDS). In the pair-wise MDS plot for 10 dimensions, the majority of subjects were tightly clustered, suggesting no substantial population stratification (SF1). We then conducted logistic regression to adjust the association P-value, using the 10 principal components from the MDS procedure as covariates.
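The SNP inclusion thresholds above can be illustrated with a minimal sketch. This is not the PLINK implementation: the function names and the count-based input are hypothetical, and a 1-df chi-square approximation stands in for the HWE test actually used by the software.

```python
import math

def hwe_chisq_p(n_aa, n_ab, n_bb):
    """Approximate HWE P-value from genotype counts (1-df chi-square).

    Assumption: a chi-square approximation is used here for brevity;
    it is a stand-in, not the test implemented in PLINK.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_snp_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.95, min_maf=0.01, min_hwe_p=1e-6):
    """Apply the three thresholds quoted in the text to one SNP."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    if call_rate <= min_call_rate or maf <= min_maf:
        return False
    return hwe_chisq_p(n_aa, n_ab, n_bb) > min_hwe_p
```

For example, a SNP with genotype counts 250/500/250 and no missing calls passes all three filters, whereas a SNP with no heterozygotes at an allele frequency of 0.5 fails the HWE filter.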

CNV calling only included segments larger than 100 kb, spanning at least 10 consecutive, informative SNPs. Quality control for samples excluded 136 individuals (71 cases, 65 controls) who carried more than 40 apparent CNVs. Analysis focused on rare CNVs with frequency <1%. We used the human reference sequence of NCBI Build 36.1 (hg18) to filter known segmental duplications. Known common CNVs defined by the Genome Structural Variation Consortium (http://projects.tcag.ca/variation/ng42m_cnv.php) and known gaps of at least 200 kb in the SNP array were also filtered. Burden analysis counted the number of total CNVs, deletions, and duplications in cases and controls, and calculated the CNV rate as well as the percentage of cases and controls that carried rare CNVs. The significance of CNV differences between cases and controls was assessed by permutation test with 50,000 replicates.
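The permutation test for CNV burden can be sketched as follows. This is an illustration only: the function name, the one-sided direction (more CNVs in cases), and the add-one form of the empirical P-value are assumptions, not details confirmed by the text.

```python
import random

def burden_permutation_p(case_counts, control_counts, n_perm=50000, seed=0):
    """One-sided permutation P-value for a higher mean CNV count in cases.

    case_counts / control_counts: number of rare CNVs carried by each person.
    Case/control labels are shuffled n_perm times; the P-value is the share
    of shuffles whose case-minus-control mean difference is at least as
    large as the observed one (add-one corrected, an assumed convention).
    """
    rng = random.Random(seed)
    n_cases = len(case_counts)
    pooled = list(case_counts) + list(control_counts)

    def mean_diff(values):
        cases, controls = values[:n_cases], values[n_cases:]
        return sum(cases) / len(cases) - sum(controls) / len(controls)

    observed = mean_diff(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                  # permute case/control labels
        if mean_diff(pooled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

When every individual carries the same number of CNVs, each shuffle ties the observed difference and the function returns 1.0.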

Polygenic analyses

To investigate the contribution of common SNPs to variation in liability to ADHD, we estimated the SNP-heritability using GCTA [Yang et al., 2011]. A non-zero heritability is estimated if cases are genetically more similar to other cases than they are to controls [Lee et al., 2011]. We removed individuals such that no pair had genetic similarity relationship >0.05 (as this may inflate estimates unfairly), so that 1,010 cases and 917 controls remained. We used Caucasian samples from the Psychiatric Genomics Consortium for ADHD (4,163 ADHD cases and 12,040 controls) and a bivariate model of analysis [Lee et al., 2012b] to estimate the SNP-genetic correlation between Han Chinese and Caucasians for liability to ADHD. Since the SNP frequencies differ between ethnic groups the additive genetic similarities between individuals i and j were estimated as

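$$A_{ij} = \frac{1}{L}\sum_{l=1}^{L}\frac{\left(x_{il} - 2p_{l(i \in s)}\right)\left(x_{jl} - 2p_{l(j \in s)}\right)}{\sqrt{2p_{l(i \in s)}\,q_{l(i \in s)}\;2p_{l(j \in s)}\,q_{l(j \in s)}}}$$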

for the L SNPs with minor allele frequency >0.01 and imputation R² > 0.6 (L = 917,066), where i ∈ s denotes the population s to which individual i belongs, p and q = 1 − p are the allele frequencies of the first and other allele, and xil is the number of first alleles for the lth SNP in individual i. The analysis model included sex, cohort, and 20 ancestry principal components as covariates.
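A minimal sketch of this similarity measure is given below. It assumes the usual GCTA-style form, in which each genotype is centred and scaled using the allele frequency of the individual's own population. The function name and list-based inputs are hypothetical.

```python
import math

def additive_similarity(x_i, x_j, p_i, p_j):
    """Additive genetic similarity between individuals i and j.

    x_i, x_j: counts of the first allele (0, 1, or 2) at each of L SNPs.
    p_i, p_j: first-allele frequencies at the same SNPs, taken from the
              population each individual belongs to (assumed polymorphic).
    """
    total = 0.0
    n_snps = 0
    for xi, xj, pi, pj in zip(x_i, x_j, p_i, p_j):
        qi, qj = 1.0 - pi, 1.0 - pj
        scale = math.sqrt(2.0 * pi * qi * 2.0 * pj * qj)
        total += (xi - 2.0 * pi) * (xj - 2.0 * pj) / scale
        n_snps += 1
    return total / n_snps
```

When both individuals come from the same population, p_i and p_j coincide and the expression reduces to the single-population genetic relationship estimate.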

Pathway analysis

To determine if any neurobiological pathways were implicated by our association signals, we input our top-hit intervals from the SNP and CNV association analyses into the Interval enRICHment Test (INRICH) [Lee et al., 2012a]. Associated intervals for SNPs included those with P-values <10⁻⁴ after correcting for the MDS components. The SNP tagging function in PLINK was used to generate LD-independent genomic intervals (tag r²: 0.2, tag kb: 1,000). We included CNV intervals that were more prevalent in cases than in controls with at least a trend toward statistical significance (P < 0.15). We used the Gene Ontology (GO) nodes as our target gene sets. After size filtering, 5,237 target gene sets (nodes), each comprising at least three genes, were examined. Interval overlap was limited to 20 kb up/downstream of a gene. The number of overlapping genes was recorded as Reali. Ten thousand replicates generated random interval sets, each matching the number of associated intervals. The empirical gene-set P-value equals the proportion of replicates with at least Reali random intervals overlapping with genes in a target gene set. Bootstrapping-based re-sampling was used to correct the empirical gene-set P-values for multiple testing over all gene sets.
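The empirical gene-set P-value can be sketched as below. The add-one correction is an assumption: it is inferred from the smallest empirical P-value in Table I (0.00009999, which equals 1/10,001 for 10,000 replicates) and is not stated in the text. The function name is hypothetical.

```python
def gene_set_empirical_p(real_overlap, random_overlaps):
    """Empirical P-value for one target gene set.

    real_overlap:    number of associated intervals overlapping the gene set.
    random_overlaps: the same count for each random interval set (replicate).
    Returns the add-one corrected share of replicates whose overlap is at
    least as large as the observed one.
    """
    hits = sum(1 for r in random_overlaps if r >= real_overlap)
    return (hits + 1) / (len(random_overlaps) + 1)
```

With 10,000 replicates and no replicate reaching the observed overlap, the function returns 1/10,001, the smallest value attainable under this convention.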

To explore potential physical interactions among proteins encoded in associated intervals, we used a second method for pathway analysis, that is, Disease Association Protein–Protein Link Evaluator (DAPPLE) [Rossin et al., 2011]. In consideration of the contributions of both common and rare variants to the etiology of ADHD, and that both might separately capture nodes in the ADHD pathogenesis network, we used the same genomic intervals for both SNPs and CNV that we used for INRICH. DAPPLE uses experimentally validated, protein–protein interaction databases to identify direct and indirect networks from associated proteins and scores network and protein connectivity. We built 10,000 random networks and compared these with the ADHD associated networks to determine if the connectivity of the ADHD networks and each seed protein was greater than expected by chance.

For the third pathway analysis, we used the Genomic Regions Enrichment of Annotations Tool (GREAT) [McLean et al., 2010] to assess enrichment of cis-regulatory regions. GREAT examines not only proximal but also distal regulatory regions up to 1 Mb upstream or downstream of transcription start sites. In addition to the typical calculation of gene-based P-values for enrichment, GREAT computes a binomial test over genomic regions, which uses the fraction of the genome associated with each ontology term as the probability of selecting the term. This method explicitly accounts for the variability in length of gene regulatory domains, eliminating the bias that leads to false positive enrichments for distal regulatory regions.
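The binomial test over genomic regions described above can be sketched as follows. This illustrates the statistic only and is not GREAT's own code; the function names are hypothetical, and the fraction of the genome annotated with a term is supplied by the caller.

```python
import math

def binomial_enrichment_p(n_regions, n_hits, term_fraction):
    """P(X >= n_hits) for X ~ Binomial(n_regions, term_fraction).

    n_regions:     number of input genomic regions.
    n_hits:        regions falling in the regulatory domains of the term.
    term_fraction: fraction of the genome annotated with the term, used as
                   the per-region probability of hitting the term by chance.
    """
    return sum(
        math.comb(n_regions, k)
        * term_fraction ** k
        * (1.0 - term_fraction) ** (n_regions - k)
        for k in range(n_hits, n_regions + 1)
    )

def fold_enrichment(n_regions, n_hits, term_fraction):
    """Observed hits divided by the number expected by chance."""
    return n_hits / (n_regions * term_fraction)
```

Because the chance probability is the annotated fraction of the genome rather than the fraction of genes, terms whose genes have long regulatory domains are not favored simply for covering more sequence.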


Single Variant Analyses

The quantile-quantile (QQ) plot (SF2) for the SNP associations was almost completely diagonal. The lambda statistic (λ) was 1.02. The distribution of observed P-values did not deviate from the distribution expected under the null hypothesis of no association. The corrected Manhattan plot is shown in SF3. The lowest P-values were about 10⁻⁵ to 10⁻⁶. The SNPs associated with P-values of 10⁻⁵ or lower are listed in Supplementary Table SI. All hit genes were expressed in brain. Most of them were known to be involved in neurodevelopment (including cell adhesion, neuron migration, neurite outgrowth, neuronal morphogenesis, and synaptic plasticity: ITGA1, NYAP2, ADAM28, CNTN2, LRFN2, NTM, GJA1, FLRT2, PRKG1, PICK1, CAMK2G; glutamate receptor and transporter: GRIK4, GRM7, SLC38A1; and related transcription factors: PAPOLA, MED27, TAF2, ZNF516).

We included 3,460 rare CNVs (1,817 in cases and 1,643 in controls) in the analyses, with all segments intersecting with one or more genes (hg18). Burden analyses showed a significantly higher rate of rare CNVs (1.875 vs. 1.830 rare CNVs per individual, ratio: 1.02, P = 0.038) and a higher proportion of individuals carrying rare CNVs (55.8% vs. 51.2%, ratio: 1.09, P = 0.026) for the ADHD group than for controls. Association analyses found six regions nominally associated with ADHD (P < 0.05, with 50,000 permutation tests), though none of them survived genome-wide correction (Supplementary Table SII).
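The reported CNV rates can be reproduced as mean numbers of rare CNVs per individual, under the assumption that the burden analysis used the samples remaining after the CNV quality-control exclusions described in the Methods (1,040 − 71 = 969 cases and 963 − 65 = 898 controls):

$$\frac{1817}{969} \approx 1.875, \qquad \frac{1643}{898} \approx 1.830, \qquad \frac{1.875}{1.830} \approx 1.02$$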

Polygenic Analyses

The estimate of the SNP-heritability calculated in the bivariate analysis was 0.42 (SE 0.13) for the Han Chinese sample. A likelihood ratio test of the null hypothesis of zero SNP-heritability gave P = 0.0017. In the bivariate analysis, the SNP-heritability for the European ancestry sample was 0.28 (SE 0.02, P = 0), in close agreement (as expected) with the univariate estimate (PGC Cross Disorder Group, paper in submission). The estimate of the SNP-genetic correlation between the Chinese and European samples (rg-SNP) was 0.39 (SE 0.15, P = 0.0072).

Pathway Analyses

Interval enrichment tests of the most significantly associated SNPs found 23 pathways enriched for associated signals (Table I). Although none of these achieved significance after correcting for multiple comparisons, many implicated neurobiological functions potentially relevant to ADHD, e.g., neuron projection morphogenesis (ITGA1, GJA1), neuron migration (PRKG1, GJA1), endocytic vesicle membrane (PICK1, CAMK2G), and synaptic transmission (PICK1, CAMK2G, SLC38A1, GRM7). A pathway related to transcription was also observed, namely transcription initiation from RNA polymerase II promoter (MED27, TAF2). Interval enrichment tests of CNVs found nine pathways nominally significant at P < 0.05. None achieved significance after correcting for multiple comparisons (Table I). Most were related to transmembrane transport, including water, sodium, and potassium ion transport.

Table I. Pathways Enriched for Associated SNPs and CNVs by INRICH Test

GO ID | Target | Target size | Interval no. | Emp. P | Cor. P | Gene list

Pathways enriched for associated SNPs (a)
GO:0009268 | Response_to_pH | 12 | 2 | 0.00009999 | 0.14917 | ARSB, GJA1
GO:0043403 | Skeletal_muscle_tissue_regeneration | 9 | 2 | 0.00019998 | 0.192162 | PLAU, GJA1
GO:0048812 | Neuron_projection_morphogenesis | 18 | 2 | 0.00029997 | 0.231354 | ITGA1, GJA1
GO:0007160 | Cell-matrix_adhesion | 72 | 3 | 0.00049995 | 0.295941 | VCL, ITGA1, BCL2L11
GO:0005916 | Fascia_adherens | 9 | 2 | 0.00079992 | 0.378724 | VCL, GJA1
GO:0006936 | Muscle_contraction | 93 | 3 | 0.00109989 | 0.447111 | VCL, ITGA1, GJA1
GO:0030666 | Endocytic_vesicle_membrane | 24 | 2 | 0.00269973 | 0.661668 | PICK1, CAMK2G
GO:0006367 | Transcription_initiation_from_RNA_polymerase_II_promoter | 67 | 2 | 0.00379962 | 0.727854 | MED27, TAF2
GO:0005741 | Mitochondrial_outer_membrane | 85 | 2 | 0.00679932 | 0.847231 | GJA1, BCL2L11
GO:0007229 | Integrin-mediated_signaling_pathway | 58 | 2 | 0.010399 | 0.894421 | ITGA1, ADAMDEC1
GO:0005178 | Integrin_binding | 64 | 2 | 0.0114989 | 0.907419 | ITGA1, ADAMDEC1
GO:0007268 | Synaptic_transmission | 266 | 4 | 0.0121988 | 0.915217 | PICK1, CAMK2G, SLC38A1, GRM7
GO:0005764 | Lysosome | 154 | 2 | 0.0157984 | 0.946011 | ARSB, GJA1
GO:0030165 | PDZ_domain_binding | 54 | 2 | 0.019898 | 0.965807 | GRM7, GJA1
GO:0015293 | Symporter_activity | 112 | 2 | 0.019898 | 0.965807 | SLC38A1, SLC16A8
GO:0005624 | Membrane_fraction | 467 | 4 | 0.0243976 | 0.982004 | ITGA1, SLC16A8, PSD3, BCL2L11
GO:0001764 | Neuron_migration | 59 | 2 | 0.0255974 | 0.983003 | PRKG1, GJA1
GO:0045121 | Membrane_raft | 110 | 2 | 0.0256974 | 0.983203 | ITGA1, GJA1
GO:0006814 | Sodium_ion_transport | 118 | 2 | 0.0265973 | 0.984003 | SLC38A1, SCN9A
GO:0005654 | Nucleoplasm | 732 | 4 | 0.0214979 | 0.968806 | MED27, CAMK2G, DSCC1, PAPOLA
GO:0017124 | SH3_domain_binding | 101 | 2 | 0.0357964 | 0.992402 | BAIAP2L2, GJA1
GO:0043234 | Protein_complex | 152 | 2 | 0.0437956 | 0.995601 | PICK1, VCL
GO:0001701 | In_utero_embryonic_development | 142 | 2 | 0.0459954 | 0.996601 | GJA1, BCL2L11

Pathways enriched for associated CNVs (b)
GO:0006833 | Water_transport | 36 | 2 | 0.00549945 | 0.508498 | AQP9, ADCY8
GO:0005244 | Voltage-gated_ion_channel_activity | 149 | 4 | 0.0115988 | 0.734653 | KCNH7, KCNQ1, SCN10A, SCN11A
GO:0055085 | Transmembrane_transport | 619 | 6 | 0.0125975 | 0.752849 | KCNH7, SCN10A, KCNQ1, AQP9, ADCY8, SCN11A
GO:0042493 | Response_to_drug | 241 | 3 | 0.0164984 | 0.834233 | USP47, CPS1, SCN11A
GO:0005248 | Voltage-gated_sodium_channel_activity | 15 | 2 | 0.0172983 | 0.84883 | SCN10A, SCN11A
GO:0001518 | Voltage-gated_sodium_channel_complex | 12 | 2 | 0.0172983 | 0.84883 | SCN10A, SCN11A
GO:0006814 | Sodium_ion_transport | 118 | 2 | 0.0258974 | 0.909018 | SCN10A, SCN11A
GO:0006811 | Ion_transport | 565 | 5 | 0.030397 | 0.928814 | KCNH7, KCNQ1, KCNT2, SCN10A, SCN11A
GO:0006813 | Potassium_ion_transport | 153 | 3 | 0.0392961 | 0.956809 | KCNH7, KCNQ1, KCNT2

(a) With corrected P-value <10⁻⁴.
(b) Including CNVs more prevalent in cases than in controls with P < 0.15.

DAPPLE identified 16 direct connections among proteins in 152 associated regions (Table II). Compared to 10,000 random networks, the associated network (SF4) was significantly enriched for direct connectivity (16 vs. 9.7, P = 0.030). The connected proteins formed six groups. Their functions involved cell adhesion/synaptic formation/plasticity, especially glutamatergic synaptic plasticity, as well as related transcription factors. For each seed protein, taking the best of the direct and indirect scores and correcting for the number of tests as well as for the number of genes in one locus, we identified seven genes with significant connectivity as candidate genes for future research: NCL (P = 2 × 10⁻⁴), KCNH7 (P = 8 × 10⁻⁴), NXPH1 (P = 1 × 10⁻³), LANCL1 (P = 6 × 10⁻³), CNTNAP2 (P = 9 × 10⁻³), SV2C (P = 1.2 × 10⁻²), and PICK1 (P = 4 × 10⁻²).

Table II. Direct Connections Between Proteins Encoded by Genes Overlapped With Potential Associated SNPs and CNVs

[Direct connection diagrams not reproduced]

Gene | Associated SNP/interval | Gene function (a)
NXPH1 | chr7: 8471930-8567358 | The encoded protein forms a very tight complex with alpha neurexins, a group of proteins that promote adhesion between dendrites and axons
NRXN1 | chr2: 50226778-50323795 | The encoded protein functions as a cell adhesion molecule and receptor. May play a role in formation or maintenance of synaptic junctions
CNTN2 | rs2802837 | A neuronal membrane protein that functions as a cell adhesion molecule. It may play a role in the formation of axon connections in the developing nervous system
CNTNAP2 | chr7: 145653708-145657817; chr7: 145755346-145813647 | This gene encodes a member of the neurexin family, which functions as cell adhesion molecules and receptors. The protein is localized at the juxtaparanodes of myelinated axons, mediates interactions between neurons and glia during nervous system development, and is also involved in localization of potassium channels within differentiating axons
ZMIZ1 | chr10: 80350172-80476638 | The encoded protein regulates the activity of various transcription factors
SCN10A | chr3: 38787799-38889198 | Voltage-gated sodium channels are integral membrane glycoproteins that are responsible for the initial rising phase of action potentials in most excitable cells
GJA1 | rs7753979 | This gene is a member of the connexin gene family. The encoded protein is a component of gap junctions. Gap channels allow electrical and biochemical coupling between cells in excitable tissues, such as neurons and heart
TNNT2 | chr1: 199488134-199580414 | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. This gene is expressed most highly in the heart, but is also expressed in the brain
GRM7 | rs13317247 | GRM7 and GRM3: L-glutamate is the major excitatory neurotransmitter in the central nervous system, and it activates both ionotropic and metabotropic glutamate receptors. Glutamatergic neurotransmission is involved in most aspects of normal brain function and can be perturbed in many neuropathologic conditions. GRM3 belongs to the metabotropic glutamate receptors Group II, while GRM7 belongs to Group III
GRM3 | chr7: 86132321-86244019 | (shared entry with GRM7, above)
PICK1 | rs8142185 | The protein encoded by this gene has been shown to interact with multiple glutamate receptor subtypes. Probable adapter protein that binds to and organizes the subcellular localization of a variety of membrane proteins containing some PDZ recognition sequence. Involved in the clustering of various receptors, possibly by acting at the receptor internalization level. Plays a role in synaptic plasticity by regulating the trafficking and internalization of AMPA receptors
EPHA7 | chr6: 94194161-94216651 | This gene belongs to the ephrin receptor subfamily of the protein-tyrosine kinase family. EPH and EPH-related receptors have been implicated in mediating developmental events, particularly in the nervous system
CAMK2G | rs10824051, rs11000831 | The product of this gene is one of the four subunits of an enzyme which belongs to the Ca(2+)/calmodulin-dependent protein kinase subfamily. Calcium signaling is crucial for several aspects of plasticity at glutamatergic synapses
ADAM28 | rs7012077 | This gene encodes a member of the ADAM (a disintegrin and metalloprotease domain) family. Members of this family have been implicated in a variety of biological processes involving cell-cell and cell-matrix interactions, including neurogenesis
MED20 | chr6: 41904795-42015370 | Component of the Mediator complex, a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes
MED27 | rs10512416, rs6597539 | Mediator complex subunit 27. These factors work with co-activators to direct transcriptional initiation by the RNA polymerase II apparatus
TAF2 | rs3812463, rs6469849, rs6469852, rs6989791, rs7012857 | Stabilizes TFIID binding to core promoter. Transcription factor TFIID is one of the general factors required for accurate and regulated initiation by RNA polymerase II
PAPOLA | rs7149784, rs7160641 | Polymerase that creates the 3'-poly(A) tail of mRNAs
POLR2F | rs8142185 | This gene encodes the sixth largest subunit of RNA polymerase II, the polymerase responsible for synthesizing messenger RNA in eukaryotes, that is also shared by the other two DNA-directed RNA polymerases
NCL | rs16828074 | Nucleolin plays a role in pre-rRNA transcription and ribosome assembly. May play a role in the process of transcriptional elongation
RSL1D1 | chr16: 11823863-11943065 | Ribosomal L1 domain containing
BYSL | chr6: 41904795-42015370 | Required for processing of 20S pre-rRNA precursor and biogenesis of 40S ribosomal subunits. May be required for regulating cell adhesion during implantation of human embryos

Gene and pathway implicated in other studies of ADHD or psychiatric disorders:

  • Both NXPH1 and NRXN1 were among the top hit genes (10⁻⁵) in ADHD GWAS by Neale et al. [2010a]. CNVs of NXPH1 were found in ASD families [Salyakina et al., 2011]. CNVs of NRXN1 were also observed in schizophrenia and autism [Doherty et al., 2012].
  • CNTNAP2 has been implicated in multiple neurodevelopmental disorders, including Gilles de la Tourette syndrome, schizophrenia, epilepsy, autism, ADHD, and mental retardation. CNTNAP5 was among the top hit genes in ADHD GWAS by Neale et al. [2010a]. ZMIZ1 was among the top hit genes in Neale et al. [2008].
  • Cell adhesion molecules had been reported in previous ADHD GWAS studies [Lasky-Su et al., 2008; Zhou et al., 2008; Neale et al., 2010a].
  • Four SNPs of SCN10A showed nominal association with ADHD in the meta-analysis by Neale et al. [2010b], with the lowest P-value = 0.022 for rs7430438.
  • rs7740467, which is approximately 3 kb upstream of GJA1, was found nominally significant (P = 0.0204) in the meta-analysis of ADHD GWAS by Neale et al. [2010b].
  • Eight SNPs of TNNT2 showed nominal association with ADHD in the meta-analysis by Neale et al. [2010b], with the lowest P-value = 0.015 for rs10800775.
  • More than 20 SNPs of GRM7 showed association with ADHD in the meta-analysis by Neale et al. [2010b], with the lowest P-value = 2.96E−3 for rs1532544. CNVs of GRM7 were found in a genome-wide copy number variation study of ADHD [Elia et al., 2012]. rs17031835, rs12491620, rs1450099, and rs3749380 of GRM7 were associated with SCZ [Ganda et al., 2009; Ohtsuki et al., 2008; Shibata et al., 2009]. GRM3 and EPHA3 were identified to be candidate genes for ASD [Casey et al., 2012]. GRM3 was associated with SCZ in some candidate gene studies [Cherlyn et al., 2010]. It was also related to psychosis and relapse in bipolar disorder [Dalvie et al., 2010]. PICK1 is located in the linkage and association region of SCZ [Pulver et al., 1994; Hong et al., 2004; Fujii et al., 2006].
  • rs12661215 of EPHA7 was found nominally significant (P = 0.0107) in the meta-analysis of ADHD GWAS by Neale et al. [2010b]. rs2664283 in CAMK2G was found nominally significant (P = 7.5E−3) in the meta-analysis of ADHD GWAS by Neale et al. [2010b].
  • The metabotropic glutamate receptor gene family and its interacting genes were previously found to be enriched with CNVs in ADHD samples [Elia et al., 2012].
  • rs3218100, 15 kb downstream of MED20, was nominally significant (P = 7.06E−3) in the meta-analysis of ADHD GWAS by Neale et al. [2010b].
  • rs10901091, which is approximately 90 kb upstream of MED27, was found nominally significant (P = 1.13E−3) in the meta-analysis of ADHD GWAS by Neale et al. [2010b]. Three SNPs, approximately 135 kb downstream of PAPOLA, showed association with ADHD in the meta-analysis by Neale et al. [2010b], with the lowest P-value = 7.02E−3 for rs1969795. POLR2F is located in the linkage and association region of SCZ [Pulver et al., 1994; Hong et al., 2004; Fujii et al., 2006]. RSL1D1 is located in 16p13, a linkage region for ADHD [Fisher et al., 2002; Smalley et al., 2002]. A SNP, rs464017, 65 kb downstream of RSL1D1, was nominally significant (P = 0.0387) in the meta-analysis of ADHD GWAS by Neale et al. [2010b]. A SNP, rs3218100, about 4 kb downstream of BYSL, was nominally significant (P = 7.06E−3) in the meta-analysis of ADHD GWAS by Neale et al. [2010b].
  • Transcription regulating proteins, such as members of the ZNF family, DMRT2, FHIT, FOXP1, and MEIS2, had been implicated in previous GWAS [Lasky-Su et al., 2008; Neale et al., 2010b].

(a) From the UCSC Browser, UniProtKB, NCBI's OMIM, and GeneCards.

Using the same set of associated SNPs and CNVs for GREAT analyses, we found significant enrichment for 6 GO Cellular Component terms after correcting for multiple comparisons (Table III). The six terms were from two GO branches and their child nodes (Fig. 1): synapse (15 genes hit, FDR Q-val: 0.0055; three child nodes were also significant: synapse part, synaptic membrane, and presynaptic membrane) and neuron projection (16 genes hit, FDR Q-val: 0.013; one child node was also significant: axon).

Table III. Significantly Enriched Gene Ontology Terms by Genes Associated With Regulatory Regions

Term name | Binom (a) rank | Binom raw P-value | Binom FDR Q-Val | Binom fold enrichment | Binom observed region hits | Binom region set coverage (%) | Hyper (a) rank | Hyper FDR Q-Val | Hyper fold enrichment | Hyper observed gene hits | Hyper total genes | Hyper gene set coverage (%)
Presynaptic membrane | 1 | 2.4669E−11 | 2.7654E−8 | 9.4056 | 16 | 10.53 | 1 | 5.6208E−3 | 13.5654 | 6 | 50 | 3.85
Synaptic membrane | 2 | 4.2028E−11 | 2.3557E−8 | 4.7787 | 26 | 17.11 | 5 | 1.3568E−2 | 5.2175 | 9 | 195 | 5.77
Synapse part | 4 | 1.2576E−9 | 3.5243E−7 | 3.5619 | 30 | 19.74 | 2 | 6.9311E−3 | 4.2597 | 13 | 345 | 8.33
Neuron projection | 13 | 1.5147E−4 | 1.3062E−2 | 2.0886 | 28 | 18.42 | 4 | 1.2807E−2 | 3.1788 | 16 | 569 | 10.26

(a) “Binom” represents the binomial test over genomic regions; “Hyper” represents the hypergeometric test over genes.
Figure 1. Neurodevelopmental network predicted by proximal and distal regulatory regions among the top hits of the genome-wide SNP and CNV association analyses.


This GWAS of ADHD, comprising 1,040 cases and 963 controls, is the first performed in a homogeneous Han Chinese population. Although we did not find any genome-wide significant SNP or CNV variants, we did find significant evidence for a polygenic SNP component and an increased burden of rare CNVs.

The significant SNP-heritability implies that common variants are associated with ADHD, but that our sample is underpowered to detect them at the stringent significance level imposed by the genome-wide burden of multiple testing. The SNP-heritability in Han Chinese was 0.42 (SE 0.13). Although the point estimate is higher than that for the larger European ancestry sample from the PGC-ADHD, 0.28 (SE 0.02), its high standard error shows that the estimates are not significantly different. The estimate of the SNP-genetic correlation (rg-SNP) was 0.39 (SE 0.15, P = 0.0072), which indicates that common SNP risk variants are shared by the Han Chinese and European Ancestry samples. To our knowledge, this is the first such correlation reported for any disease or disorder. The significant correlation indicates that ancient common variants associated with ADHD are shared between the ethnic groups. However, the point estimate of the SNP correlation between Han Chinese and European Ancestry samples is lower than that between sub-samples of the European Ancestry cohort. Specifically, when the PGC-ADHD data were split into two sub-samples, the SNP-heritability estimates were 0.21 (SE 0.07) for the first sample and 0.41 (SE 0.03) for the other sample, with a genetic correlation of 0.71 (SE 0.17), implying, as expected, more sharing of associated variants and/or higher linkage disequilibrium between causal variants and SNPs within than between ethnic populations.
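The statement that the two SNP-heritability estimates do not differ significantly can be checked with a rough calculation. The sketch below assumes the two estimates are independent and approximately normally distributed, and it uses the standard error of 0.13 reported in the Results; the function name is hypothetical.

```python
import math

def difference_z_test(est1, se1, est2, se2):
    """Two-sided z-test for the difference between two independent estimates."""
    z = (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided normal tail probability: erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Han Chinese estimate vs. European ancestry (PGC-ADHD) estimate
z, p = difference_z_test(0.42, 0.13, 0.28, 0.02)
print(f"z = {z:.2f}, two-sided P = {p:.2f}")
```

Under these assumptions the z statistic is close to 1, well short of conventional significance thresholds, which agrees with the conclusion drawn in the text.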

Although no individual SNP reached genome-wide significance, our most significant findings implicated genes participating in neurodevelopmental processes such as cell adhesion, neuron migration, neurite outgrowth, neuronal morphogenesis, and synaptic plasticity. Similar sets of genes were also suggested by previous ADHD GWAS and a meta-analysis (see Supplementary Table SI). For example, PRKG1 was implicated by Neale et al. [2010a]; ITGA1, CAMK2G, and CAMK1D were implicated in the meta-analysis by Neale et al. [2010b]; and ITGAE and ITGA11 were implicated by Lesch et al. [2008]. Some of our top genes code for glutamate receptors and transporters. The same genes and gene family members (GRM7, GRIK1) were reported in the quantitative GWAS by Lasky-Su et al. [2008], the meta-analysis by Neale et al. [2010b], and the genome-wide CNV study by Elia et al. [2012]. Some genes related to transcription (ZNF544, ZNF385D, ZNF423, ZNF516, ZNF75A, DMRT2, FHIT, FOXP1, and MEIS2) were also implicated by Lasky-Su et al. [2008] and by the meta-analysis of Neale et al. [2010b].

Although not significant after correcting for multiple comparisons, the pathways revealed by the INRICH analyses of associated SNPs involved neurobiological functions consistent with the prior findings discussed above. For example, neuron projection morphogenesis and neuron migration pathways were implicated by genes encoding adhesion molecules (e.g., GJA1, ITGA1, PRKG1). Neuron migration and axon guidance toward the target in the development of the nervous system involve interactions between molecules on the surface of the axon and those in the extra-cellular matrix [Tsiotra et al., 1993]. The endocytic vesicle membrane and synaptic transmission pathways involve glutamatergic synaptic function. Transcription is a ubiquitous biological process. If, as our findings suggest, the transcription-related pathway is implicated in ADHD's pathophysiology, any defects in the implicated transcription network must require other etiological factors to lead to a pathophysiologic state.

Because the GO “pathways” used by INRICH are based on bibliometric gene annotations rather than experimental data, we also used DAPPLE, which is based only on experimentally documented physical interactions among proteins. Considering the complexity of the genetic basis of ADHD, we hypothesized that both common and rare variants contribute to the disorder and act on similar functional classes of genes [Poelmans et al., 2011; Stergiakouli et al., 2012]. The DAPPLE analyses showed that the proteins implicated by our GWAS were significantly more likely to be interconnected with one another than expected by chance, suggesting that risk variants might exist in suites of genes that participate in the biological processes underlying these protein–protein interaction networks. The DAPPLE results are consistent with the INRICH results in implicating three pathways: cell adhesion (NXPH1–NRXN1, CNTN2–CNTNAP2–ZMIZ1), glutamate synaptic development (GRM7–PICK1–GRM3, PICK1–EPHA7), and the transcription pathway (TAF2–PAPOLA–POLR2F–MED27–MED20, POLR2F–NCL–RSL1D1–BYSL).

Using the regulatory annotation of associated signals, GREAT depicted a clearer outline of associated genes, which encoded proteins comprising neuronal cellular components from the Neuron Projection and Synapse branches of the GO tree. Most of the genes from these pathways were consistent with the INRICH and DAPPLE findings; they encode adhesion molecules, glutamate receptors and proteins involved in axon and synapse development (Supplementary Table SIII).

All the above pathways are consistent with the hypothesis that mis-wiring of the brain during neurodevelopment might cause ADHD. Similar conclusions were drawn by Lesch et al. [2008] and Franke et al. [2009] based on the findings from existing GWAS, which suggested that neuronal spine formation and plasticity might underlie the pathophysiology of ADHD. Consistent with these ideas, a recent integration of ADHD GWAS findings found significant evidence for a neurodevelopmental network of directed neurite outgrowth [Poelmans et al., 2011]. Although our findings are consistent with prior work, they also provide evidence for a more comprehensive network involving neuron migration, neurite outgrowth, neuronal morphogenesis, and synaptic plasticity, especially glutamatergic synaptic development. The glutamate system is a reasonable candidate for ADHD's pathophysiology because glutamate is the major excitatory neurotransmitter in the central nervous system and regulates catecholaminergic activity, which has been implicated in ADHD by neurobiological [Scassellati et al., 2012] and treatment [Faraone and Glatt, 2010] studies.

Although our findings are intriguing, we have captured only fragments of the puzzle of ADHD's etiology in this study and could not paint the full picture. Our work must be considered in the context of its limitations. We had no genome-wide significant findings for any single variant, which might be due to the sample size. However, our bioinformatic and pathway analyses found some interesting genes and neurobiological pathways, which implicate a complex neurodevelopmental network underlying ADHD. Our finding of a significant polygenic component suggests that there are many common SNP variants with small effect sizes that increase the risk for ADHD. Individually, these SNPs will be difficult to detect with currently available sample sizes.
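To illustrate why individual small-effect SNPs are hard to detect in a sample of this size, the following Python sketch approximates the power of a two-sided allelic test at the conventional genome-wide threshold. It is a rough normal approximation, not the analysis performed in this study. The odds ratio (1.10), the control risk-allele frequency (0.30), the threshold of 5 × 10⁻⁸, and the function name `allelic_test_power` are illustrative assumptions; only the case and control counts come from the text.

```python
from statistics import NormalDist


def allelic_test_power(n_cases, n_controls, control_freq, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided allelic (two-proportion) z-test."""
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    case_freq = odds_ratio * control_freq / (1 - control_freq + odds_ratio * control_freq)
    # Each individual contributes two alleles.
    case_alleles, control_alleles = 2 * n_cases, 2 * n_controls
    pooled = (case_alleles * case_freq + control_alleles * control_freq) / (
        case_alleles + control_alleles
    )
    se = (pooled * (1 - pooled) * (1 / case_alleles + 1 / control_alleles)) ** 0.5
    # Expected value of the z statistic under the assumed effect.
    ncp = abs(case_freq - control_freq) / se
    z_crit = -NormalDist().inv_cdf(alpha / 2)
    # Probability of crossing the threshold in the direction of the true effect;
    # the negligible contribution of the opposite tail is ignored.
    return NormalDist().cdf(ncp - z_crit)


# Sample size of this study versus a hypothetical sample ten times larger.
power_now = allelic_test_power(1040, 963, control_freq=0.30, odds_ratio=1.10)
power_10x = allelic_test_power(10400, 9630, control_freq=0.30, odds_ratio=1.10)
print(f"power with 1,040 cases and 963 controls: {power_now:.2e}")
print(f"power with ten times as many subjects: {power_10x:.2f}")
```

Under these assumed values the power at the current sample size is far below 1%, and it remains modest even with a tenfold larger sample, which is consistent with detecting a polygenic signal in aggregate while finding no single genome-wide significant SNP.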


We thank all colleagues who helped collect and manage the data. We thank the patients and the family members who provided data for this project, which was supported by the following grants: National Natural Science Foundation of China (30800302 to Li Yang, 81071109 to Q.Q.), Chinese Ministry of Health grants (200802073 to Q.Q.), Ministry of Education Program for New Century Excellent Talents in University (NCET-11-0013 to Q.Q.), Ministry of Science and Technology grants (2007BAI17B03 to B.X., Y.W.). US National Institutes of Health Grants R13MH059126, R01MH62873, U01MH085518 and R01MH081803 to S.V.F., U01MH085515 to M.D., and K23MH066275-01 to J. Elia; and R01MH58277 to S.S.; Funding from the Australian Research Council (FT0991360) and the Australian National Health and Medical Research Council to N.R.W. (1011506, 1047956). Institutional Development Award to the Center for Applied Genomics from the Children's Hospital of Philadelphia to H.H. Affymetrix Power Award, 2007 to B.F.; NHMRC (Australia) and Sidney Sax Public Health Fellowship (443036) to S.E.M.; Wellcome Trust, UK for sample collection to L.K. UMC Utrecht Genvlag Grant and Internal Grant of Radboud University, Nijmegen Medical Centre to J.B., the Deutsche Forschungsgemeinschaft (KFO 125, SFB 581, GRK 1156 to K.P.L., ME 1923/5-1, ME 1923/5-3, GRK 1389 to C.F., and J. Meyer, SCHA 542/10-3 to H. Schäfer) and the Bundesministerium für Bildung und Forschung (BMBF 01GV0605 to K.P.L.). PND0080/2011, PI040524, PI080519, PI1100571, PI1101629, PI1201139, CP09/00119, 092330/31, 2009SGR1554, SAF2012-33484, and 2009SGR0971 (to M. Casas, J.A. Ramos-Quiroga, Mònica Bayés, Bru Cormand, Cristina Sánchez-Mora, Marta Ribasés). Canadian Institutes of Health Research MOP 64277, 44070, 74699.

Dr. Li Yang received a research grant from Janssen Science Council of China. She has been a speaker for Janssen and received travel support from Eli Lilly. In the past year, Dr. Faraone received consulting income and/or research support from Shire, Akili Interactive Labs and Alcobra and research support from the National Institutes of Health (NIH). He is also on the Clinical Advisory Board for Akili Interactive Labs. In previous years, he received consulting fees or was on Advisory Boards or participated in continuing medical education programs sponsored by: Shire, Alcobra, Otsuka, McNeil, Janssen, Novartis, Pfizer and Eli Lilly. Dr. Faraone receives royalties from books published by Guilford Press: Straight Talk about Your Child's Mental Health and Oxford University Press: Schizophrenia: The Facts. Dr. Yufeng Wang has served on an advisory board and received funding for research, lectures, and travel from Xi'an Janssen Pharmaceutical, and has served as an advisor to Eli Lilly and Company. The other authors reported no financial relationships with commercial interests.