*Correspondence to: Stephen V. Faraone, Ph.D., Department of Psychiatry, SUNY Upstate Medical University, 750 East Adams Street, Syracuse, NY 13210. Phone: 315-464-3113; Fax: (315) 849-1839; E-mail: firstname.lastname@example.org
The purpose of this study was to determine whether single nucleotide polymorphisms (SNPs) within candidate genes for attention deficit hyperactivity disorder (ADHD) are associated with the age at onset for ADHD. One hundred and forty-three SNPs were genotyped across five candidate genes for ADHD (DRD5, SLC6A3, HTR1B, SNAP25, DRD4) in 229 families with at least one affected offspring. SNPs with the highest estimated power to detect an association with age at onset were selected for each candidate gene, using a power-based screening procedure that does not compromise the nominal significance level. A time-to-onset analysis for family-based samples was performed on these SNPs to determine if an association exists with age at onset for ADHD. Seven consecutive SNPs surrounding the D5 dopamine receptor gene (DRD5) were associated with the age at onset for ADHD; FDR-adjusted q-values ranged from 0.008 to 0.023. This analysis indicates that individuals with the risk genotype develop ADHD earlier than individuals with any other genotype. A haplotype analysis across the 6 significant SNPs that were in linkage disequilibrium with one another, CTCATA, was also found to be significant (p-value = 0.02). We did not observe significant associations with age at onset for the other candidate loci tested. Although definitive conclusions await independent replication, these results suggest that a variant in DRD5 may affect age at onset for ADHD.
Attention deficit hyperactivity disorder (ADHD) (AHC [MIM 143465]) is diagnosed by symptoms of inattention or hyperactivity and impulsivity that manifest themselves before the age of seven, persist for at least six months, and are accompanied by impairment in multiple settings (American Psychiatric Association, 1994). The prevalence of ADHD is 8 to 12 percent (Faraone et al. 2003). Family, twin and adoption studies suggest the etiology of ADHD is influenced by genes throughout the lifespan (Faraone et al. 2005, 2004a,b).
Researchers have long considered age at onset as an informative phenotype for studying the genetic etiology of disorders (Faraone et al. 2004; Chen et al. 1992). For some common late-onset disorders like Alzheimer's disease, identifying early onset individuals and families has led to the successful identification of rare, highly penetrant disease-causing alleles (Corder et al. 1994). In contrast to most psychiatric disorders, age at onset has not been examined in genetic studies of ADHD. This may reflect the fact that the diagnostic criteria for ADHD require individuals to show symptoms by the age of seven, constraining the variability in age at onset. Age at onset of ADHD may be useful in refining the ADHD phenotype, by dividing ADHD individuals into separate, more genetically homogeneous groups. As with other disorders, such as Alzheimer's disease and diabetes, we hypothesize that this strategy could lead to the successful identification of disease susceptibility loci.
The validity of the DSM-IV age at onset requirement has been challenged for some time by research that suggests age at onset has modest test-retest reliability, and is not associated with important attributes of the disease such as severity of symptoms, types of adjustment difficulties, or the persistence of the disorder (Applegate et al. 1997; Barkley & Biederman, 1997; Faraone, 2000). From a genetic standpoint it is not clear that individuals who meet all criteria except for onset before age seven are genetically different from those who have onset by age seven, and therefore it may be useful to include individuals who meet all the remaining criteria for ADHD in genetic analyses. Because age at onset has been used successfully to identify susceptibility genes for other psychiatric disorders, it is natural to ask whether the age at onset of ADHD may also be a useful phenotype for identifying susceptibility genes for ADHD. In this paper we use the age at onset of ADHD in a logrank test, designed specifically for family-based genetic studies, to determine whether there are genetic influences on age at onset for any of the candidate genes specified above for ADHD.
Materials and Methods
Two hundred and twenty-nine ADHD families were recruited through several ongoing research studies being conducted at Massachusetts General Hospital (MGH) pediatric psychopharmacology clinic. Ninety families were ascertained from the longitudinal case-control family studies of boys and girls with ADHD. Probands were recruited from the MGH general pediatrics and pediatric psychopharmacology clinics, as well as from health maintenance organizations (HMO) in the Boston area. Ascertainment of these individuals occurred prior to the publication of DSM-IV, so the affection status for probands and their relatives was based on DSM-III-R criteria. Children and adolescents of 6 to 18 years of age were eligible to participate in this study. Potential subjects were excluded if they had major sensorimotor handicaps, psychosis, autism, inadequate command of the English language, or an Intelligence Quotient (IQ) less than 70. Participants were also excluded if they were adopted or their nuclear family was not able to participate in the study. All of the ADHD probands met the DSM-III-R diagnostic criteria for ADHD at the time of the clinical referral; they all had active ADHD symptoms at the time of recruitment.
In addition 83 families were ascertained from an affected sibling pair linkage study for ADHD, 37 families were ascertained from a sample of bipolar families, 17 families were ascertained from a family study of ADHD adults, and 2 families were ascertained from a study of substance abuse. Recruitment and inclusion and exclusion criteria were similar to the longitudinal study for ADHD boys and girls, with the following exceptions: 1) ADHD cases were obtained from the MGH pediatric psychopharmacology clinic, the child psychiatry clinic at Children's Hospital in Boston, or by referrals from individual child psychiatrists throughout the community; 2) ascertainment was based on DSM-IV diagnoses; 3) the pediatric bipolar studies ascertained cases for bipolar disorder and did not screen out cases with psychosis. Individuals 18 years of age or older provided written informed consent, mothers provided written informed consent for minor children, and children provided written assent to participate in this study. Because the same research group conducted these studies, the ascertainment criteria for ADHD did not differ among studies (e.g., children enrolled as bipolar probands for the family study of bipolar disorder would have qualified for enrollment in the ADHD studies if they also met the criteria for ADHD), thereby limiting potential heterogeneity. Additional details about this sample have been reported elsewhere (Smoller et al. 2006).
ADHD Diagnostic Assessment
We collected psychiatric information from children using K-SADS-E (Epidemiologic Version), a widely used semi-structured psychiatric diagnostic interview with established psychometric properties (Orvaschel, 1994). The interview inquired about the child's lifetime history of psychopathology. This included information on the affection status of ADHD and the age at which each child was diagnosed with the disorder, which were the primary variables of importance in this analysis. The K-SADS-E provides a standardized method of obtaining and recording symptoms necessary for the assessment of most Axis I categories. For all children, including siblings, psychiatric data were collected from the mother. In addition children 12 years and older were evaluated directly. Discrepancies between the child's and the parent's interview were resolved by the diagnostic procedures discussed below. We did not directly interview children younger than 12 years because they are limited in their expressive and receptive language abilities, they lack the ability to map events in time, and they have limited powers of abstraction. Given these limitations, there is a real question about whether the young child's self-perceptions, memories, feelings and reported behaviour can be reliably assessed through self-report. Although limited, studies on the use of interview techniques among young children show that their replies are unreliable (Achenbach & McConaughy, 1987; Breton et al. 1995; Edelbrock et al. 1985; Schwab-Stone et al. 1994).
Final diagnostic assignment was made after a blind review of all available information by a diagnostic committee, chaired by Dr. Joseph Biederman and composed of three board-certified child and adolescent psychiatrists and licensed clinical psychologists. The interviewers were instructed to take extensive notes about the symptoms for each disorder. These notes and the structured interview data were reviewed by the diagnostic committee, so that the committee could make a best estimate diagnosis as described by Leckman et al. (1982). Definite diagnoses were assigned to subjects who met all diagnostic criteria. Subthreshold diagnoses were assigned to those subjects who met most, but not all, required criteria. Diagnoses presented for review were considered definite only if a consensus was achieved in that criteria were met to a degree that would be considered clinically meaningful. By “clinically meaningful” we mean that the data collected from the structured interview indicated that the diagnosis should be a clinical concern, due to the nature of the symptoms, the associated impairment and the coherence of the clinical picture. To combine discrepant parent and offspring reports we used the most severe diagnosis from either source as the consensus diagnosis, unless the diagnosticians suspected that the source was not supplying reliable information. Interviewers of subjects were blind to all prior data collected from that subject and his or her family members.
The age at onset of ADHD was defined as the age at which a child first exhibited at least two symptoms from the inattentive or hyperactive-impulsive category. The individual must also have been subsequently diagnosed with ADHD. The age at onset used in the analysis was therefore the age of initial symptoms, not the age of formal diagnosis. The minimum age at onset that could be recorded was 1 year.
The diagnostic criteria for ADHD require that ADHD symptoms occur by age seven. In this paper we were particularly interested in the age when ADHD behaviour began. By including only the individuals that met the age criterion we limited the range of the variable of interest. Therefore in this study 9 individuals were included who met all DSM-IV criteria for ADHD, except that their ADHD symptoms began after age seven years.
The genotyping for this project was completed prior to the completion of HapMap. Because of this, tag SNPs were not used and a larger number of SNPs were tested to ensure the accuracy of the haplotype block structure and that the SNPs selected fully covered each gene's block structure. One hundred and forty-three single nucleotide polymorphisms (SNPs) were selected across, or flanking, five candidate genes for ADHD. These included 35 SNPs for SLC6A3, 13 for DRD4, 13 for DRD5, 63 for SNAP25, and 19 for HTR1B. None of the SNPs genotyped for DRD5 were in the regulatory region or the coding region. This is because two pseudogenes for DRD5 exist, psi DRD5P1 and psi DRD5P2, each of which shares 94% homology with DRD5 (Nguyen et al. 1991). This homology complicates the genotyping of SNPs within DRD5. Therefore SNPs were genotyped surrounding, but not in, DRD5. To evaluate SNP assay quality and characterize the linkage disequilibrium (LD) structure we screened the SNPs in 12 multigenerational CEPH pedigrees. SNPs were selected for testing in the ADHD family sample if they met the following quality control metrics: a) genotyping call rate >90%; b) genotypes in Hardy-Weinberg equilibrium; and c) no Mendelian errors. Haplotype blocks were generated with our data using the EM algorithm and the haplotype block criteria of Gabriel et al. (2002) as implemented in Haploview (Barrett et al. 2005). We compared the haplotype block structure from our data with the block structure in the CEPH families, in order to verify that the two were similar. For the significant findings the haplotype block structures were verified using our data. Genotyping was done by MALDI-TOF mass spectrometry (Buetow et al., 2001).
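The three quality control metrics listed above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names, the Pearson chi-square form of the Hardy-Weinberg test, and the 3.841 cutoff (the 0.05 critical value for 1 degree of freedom) are all assumptions made for illustration, since the text does not state how equilibrium was assessed.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic (1 df) for departure from Hardy-Weinberg
    equilibrium, given genotype counts in unrelated individuals."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip cells with zero expected count (monomorphic SNPs).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def passes_qc(call_rate, hwe_stat, mendelian_errors,
              min_call_rate=0.90, chi2_cutoff=3.841):
    """Apply the three filters: call rate > 90%, no evidence against HWE
    (assumed cutoff), and no Mendelian errors."""
    return (call_rate > min_call_rate
            and hwe_stat < chi2_cutoff
            and mendelian_errors == 0)
```

For example, genotype counts of 25, 50, and 25 match the Hardy-Weinberg expectation exactly and give a statistic of zero, whereas a sample with no heterozygotes at all gives a large statistic and would fail the filter.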
Family-Based Association Test-Logrank (FBAT-Logrank)
Family-based association tests (FBATs) are tests that use genetic data from multiple family members to evaluate the possible association between a disease phenotype and a gene allele. PBAT is a computer program that implements several extensions of family-based association tests (Lange, 2003). The general methodology underlying these tests compares the test statistic for association to the conditional distribution of offspring genotypes, where conditioning is on the minimal sufficient statistic of nuisance parameters that are specified in the model (Rabinowitz & Laird, 2000). Nuisance parameters include the distribution of the phenotypes being studied, the parental allele frequencies, and the model for ascertainment. This conditioning approach eliminates the sensitivity of the analysis to misspecification of any of the nuisance parameters. Therefore, this approach is robust to misspecification of the phenotype distribution, population admixture, and missing parental genotypes.
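As a sketch of the conditioning idea, the general FBAT statistic is commonly written in the following form. The notation is an assumption introduced here for illustration and does not appear in this paper: X_{ij} is the coded genotype and T_{ij} the coded trait of offspring j in family i, and S_i is the sufficient statistic for family i (the parental genotypes when they are observed).

```latex
U = \sum_{i} \sum_{j} T_{ij}\,\bigl( X_{ij} - E[X_{ij} \mid S_i] \bigr),
\qquad
Z = \frac{U}{\sqrt{\operatorname{Var}(U \mid S)}}
```

Under the null hypothesis of no association, Z is approximately standard normal. Because both the expectation and the variance are computed from Mendelian transmission conditional on S_i, the reference distribution does not depend on allele frequencies or on how the families were ascertained.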
Lange et al. (2004) derived a logrank test, FBAT-logrank, that can be applied to family-based association data in the PBAT program. FBAT-logrank incorporates a commonly used survival analysis approach, the logrank test, into the FBAT test statistic. In this context the logrank test compares the rates with which individuals of different genotypes are diagnosed with ADHD. When the Breslow estimator is used, it can be shown that this methodology is equivalent to the proportional hazards approaches of Mokliatchouk et al. (2001). When parental genotypes are known the FBAT-logrank is also equivalent to Shih & Whittemore's (2002) method.
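To make the logrank component concrete, the sketch below computes the standard two-group logrank statistic for unrelated individuals. It is not the FBAT-logrank statistic itself, which replaces the group comparison with a comparison against the conditional distribution of offspring genotypes; the function name and input layout are assumptions for illustration.

```python
def logrank_statistic(times, events, groups):
    """Standard two-group logrank chi-square statistic (1 df).

    times  : age at onset, or age at censoring
    events : 1 if onset was observed, 0 if censored
    groups : 0 or 1 (for example, 1 = carries the genotype of interest)
    """
    data = list(zip(times, events, groups))
    event_times = sorted({t for t, e, _ in data if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Risk set: everyone whose onset or censoring age is >= t.
        n = sum(1 for u, _, _ in data if u >= t)
        n1 = sum(1 for u, _, g in data if u >= t and g == 1)
        # Onsets occurring exactly at t, overall and in group 1.
        d = sum(1 for u, e, _ in data if u == t and e == 1)
        d1 = sum(1 for u, e, g in data if u == t and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return obs_minus_exp ** 2 / variance
```

At each distinct onset age the observed number of onsets in one group is compared with the number expected if both groups shared the same onset rate; the squared sum of these differences, divided by its variance, is referred to a chi-square distribution with one degree of freedom.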
The FBAT-logrank test is implemented in the context of the PBAT screening procedure. The PBAT screen selects a subset of SNPs based on power estimates calculated at each SNP (Lange et al. 2003, 2002; Lange & Laird, 2002). Through the use of the conditional mean model, these power estimates are calculated in a way that is statistically independent of the genetic data that are used in the subsequent FBAT-logrank tests. First, the conditional mean model uses parental genotype data to infer offspring genotypes. This information is then used as the predictor variable in a regression model, and the offspring phenotype is used as the response variable. The estimated slope from this regression equation is the genetic effect estimate that is then used, along with the specified genetic model and the allele frequency, to calculate the power that each SNP has to detect the association of interest. This power calculation is independent of the subsequent statistical test because parental genotypes are used in the power calculation, whereas offspring genotypes are used in the FBAT-logrank test statistic. Therefore, the SNP selection does not compromise the significance level of the subsequent statistical test. These power estimates can then be used to rank order the SNPs. The subset of SNPs with the greatest power to detect the association of interest is then selected and tested for a genetic association using the FBAT-logrank statistic. In summary, the PBAT screening and testing algorithm follows four steps: 1) power is calculated for all SNPs using the conditional mean model; 2) the SNPs are rank-ordered according to the power calculations; 3) the subset of SNPs with the greatest power is selected; and 4) statistical analyses are performed on the selected subset of SNPs. Like the actual logrank test, this screening procedure can use unaffected offspring as well; however, our data had only affected offspring and therefore only affected offspring were used in the analysis.
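The ranking and selection steps of the screen can be sketched as follows. This is only an illustration, not PBAT code: the power estimates are taken as given inputs (the conditional mean model calculation is not reproduced), and the function name, the dictionary layout, and the per-gene grouping with a free parameter k are assumptions.

```python
from collections import defaultdict


def select_top_snps(power_by_snp, gene_by_snp, k):
    """Rank SNPs by estimated power within each gene and keep the top k.

    power_by_snp : dict mapping SNP name -> estimated power (assumed given)
    gene_by_snp  : dict mapping SNP name -> gene name
    k            : number of SNPs to retain per gene
    """
    by_gene = defaultdict(list)
    for snp, power in power_by_snp.items():
        by_gene[gene_by_snp[snp]].append((power, snp))
    selected = {}
    for gene, entries in by_gene.items():
        # Highest power first; only the retained SNPs go on to be tested.
        ranked = sorted(entries, key=lambda item: item[0], reverse=True)
        selected[gene] = [snp for _, snp in ranked[:k]]
    return selected
```

Because the power estimates are derived from parental genotypes, selecting SNPs this way does not use the offspring transmissions on which the subsequent test is based, which is why the selection does not inflate the significance level.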
The primary purpose of the PBAT screening algorithm is to select a subset of genotyped SNPs to test. Genetic information from family-based designs can be divided into two independent parts: the within-family information and the between-family information. The between-family genetic information uses the parental genotypes to infer all of the possible offspring genotypes, whereas the within-family genetic information uses the offspring genotypes directly. The between-family information is used in the screening step while the within-family information is used to calculate the FBAT statistic. Because the between- and within-family information are both estimating the same genetic effect, using the between-family information to calculate power in the screening step results in a power estimate that is based on the genetic effect we are trying to estimate in the FBAT statistic.
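The between-family information can be illustrated by computing the expected coded genotype of an offspring from the parental genotypes alone, under Mendelian transmission. This is a minimal sketch assuming both parents are genotyped; the function name and the 0/1/2 coding (number of copies of the minor allele) are assumptions for illustration.

```python
def expected_offspring_coding(mother, father, model="recessive"):
    """Expected coded offspring genotype given parental genotypes.

    mother, father : number of copies of the minor allele (0, 1, or 2)
    model          : "recessive" codes 1 only for two copies of the allele;
                     "additive" codes the number of copies (0, 1, or 2)
    """
    for g in (mother, father):
        if g not in (0, 1, 2):
            raise ValueError("genotype must be 0, 1, or 2")
    p_m = mother / 2  # probability the mother transmits the minor allele
    p_f = father / 2  # probability the father transmits the minor allele
    if model == "recessive":
        return p_m * p_f  # probability of inheriting two copies
    if model == "additive":
        return p_m + p_f  # expected number of copies
    raise ValueError("model must be 'recessive' or 'additive'")
```

Two heterozygous parents, for instance, give an expected recessive coding of 0.25. The screening step works with such expectations, while the test itself uses the difference between the observed offspring coding and this expectation.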
In this analysis all SNPs with a minimum of 20 informative families were screened using the minor allele in a recessive genetic model. The recessive model was selected because it requires fewer model assumptions than the additive model. The additive model assumes that the heterozygous genotype has an effect that is midway between those of the two homozygous genotypes, while the recessive model simply compares two groups. The following algorithm was used to identify the set of SNPs that were subsequently tested: 1) The power to detect association at each SNP was calculated using the conditional mean model; 2) The SNPs were rank-ordered according to the power to detect an association with age at onset of ADHD; 3) The top 8 SNPs per gene with the highest power were selected; 4) The selected SNPs were then used in logrank tests for age at onset of ADHD, where the phenotype of interest was ADHD affection status. We tested 8 SNPs per gene because this number falls between 5 and 10, both of which have been recommended as numbers of SNPs to select using the screening algorithm. Although the number of SNPs to retain and test at each gene is somewhat arbitrary, we selected 8 SNPs because this allows one to test several SNPs at each gene while still restricting the overall number of tests. A detailed description of the logrank methodology is provided elsewhere (Lange et al. 2004). Findings were adjusted for multiple comparisons using the false discovery rate (FDR) correction at each candidate gene, with an allowable FDR of 0.05 (Benjamini & Hochberg, 1995). The FDR controls the expected proportion of false positives among the findings declared significant, unlike several other multiple comparisons methods such as Bonferroni, which control the probability of generating at least one false positive result. Q-values, the FDR equivalent of p-values, were then used to determine the significance of a finding.
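The FDR adjustment can be sketched with the Benjamini-Hochberg step-up procedure, applied here to the p-values from a single gene. The function name is an assumption, and the returned values are Benjamini-Hochberg adjusted p-values, which are often reported as q-values; whether PBAT or another tool was used to compute the q-values in this study is not stated.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues
```

A SNP is declared significant at an allowable FDR of 0.05 when its adjusted value is at most 0.05. With 8 SNPs tested per gene, the smallest p-value is multiplied by 8 and the largest is left unchanged, subject to the monotonicity constraint.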
When significant findings were identified on a grouping of consecutive SNPs, the haplotype block structure was identified and a logrank haplotype analysis of the specific haplotype was performed in PBAT using a sliding window throughout the haplotype. We used haplotype analysis in addition to individual SNP analyses because the LD between SNPs is not perfect. Therefore, the haplotype analysis and the individual SNP analyses may reveal different information.
Descriptive information on the families is listed in Table 1 below. Of the 229 families, 199 had both parents genotyped, 15 families had 1 parent genotyped with at least two siblings genotyped, and 15 families had zero parents genotyped with an average of 2.6 siblings genotyped (all families with zero genotyped parents had at least 2 siblings genotyped). Of the individuals used in this analysis, 23.6 percent had discrepant age at onset information between the parent and child responses; however, only 10.2 percent of these discrepancies were greater than 3 years. As explained above, in the case of discrepancies between the two reported ages, the information collected from the parent was typically used, as information from the children has often been found to be unreliable.
Table 1. Descriptive Statistics on the Individuals used in the FBAT-Logrank Analysis
Total number of people genotyped
Number of families
Number of affected children per family
Average age at onset of ADHD (SD) in offspring
Age at onset quartile 1 (25%)
Age at onset quartile 2 (50%)
Age at onset quartile 3 (75%)
Age at onset quartile 4 (100%)
Gender Distribution of those with an age at onset for ADHD in offspring
The selected SNPs from the screening procedure are listed from highest to lowest power within each candidate gene in Table 2. The Table lists the SNP name, the number of informative families at each SNP, and the nominal FBAT-logrank p-value. The asterisks next to the p-values indicate that the association was significant after adjusting for multiple comparisons using the FDR. For the significant SNPs surrounding DRD5 the adjusted p-values ranged from 0.008 to 0.023. For those SNPs that were significantly associated with age at onset, the risk genotypes and the physical location of the SNP (Human Genome Working Draft, http://genome.ucsc.edu/ May 2004 freeze) are provided in Table 2. Table 3 lists the genotype frequencies for each of the significant SNPs and Table 4 provides the primer sequences to aid future replication studies. We also found that the conservation was quite high for all candidate genes across the 17 species that are compared using UCSC genome browser.
Table 2. FBAT-Logrank Results from the SNPs Selected in the PBAT Screen
Number of Informative Families
Position (base pairs)
*SNP is significant at the α= 0.05 level after adjusting for multiple comparisons.
Table 3. Allele Frequencies of significant DRD5 SNPs
Minor Allele Frequency
Major Allele Frequency
Table 4. Primer sequences for the significant SNPs
Forward PCR Primer
Reverse PCR Primer
Multi-base Extension Primer
This sample included individuals with other psychiatric diagnoses in addition to ADHD. To ensure that the age at onset results were not due to a higher genetic liability from multiple psychiatric diagnoses, the analyses were repeated using covariates to adjust for other psychiatric conditions. We examined this with both a binary covariate indicating whether the subject had any comorbidity, and a continuous variable (the number of comorbidities). Using either definition the results did not change. Of the SNPs with significant associations to the age at onset of ADHD, hCV12062485 was also nominally associated with ADHD affection status (p-value = 0.0495).
Cumulative probability plots of age at onset were generated for the significant SNPs. Figures 1a and 1b show the cumulative probability plots for rs7690455 and hCV12062454, respectively. In each Figure the solid line shows individuals with the risk genotype and the dashed line shows the other genotype. For all of the significant results the recessive genotype was associated with a later age at onset of ADHD. The risk genotype in Figure 1 therefore refers to those individuals without the recessive genotype who had an earlier age at onset for ADHD, while the other genotype is the recessive genotype.
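The curves in such a plot can be sketched as an empirical cumulative distribution of onset ages within each genotype group. This assumes, as in this sample, that every individual included is affected and has a recorded age at onset, so no censoring adjustment is needed; the function name is an assumption for illustration.

```python
def cumulative_onset_curve(onset_ages):
    """Return (age, cumulative proportion with onset by that age) pairs."""
    n = len(onset_ages)
    if n == 0:
        raise ValueError("no onset ages supplied")
    curve = []
    for age in sorted(set(onset_ages)):
        affected_by_age = sum(1 for a in onset_ages if a <= age)
        curve.append((age, affected_by_age / n))
    return curve
```

Computing one curve for individuals with the risk genotype and one for the remaining individuals, and plotting both against age, gives the kind of comparison described above: the curve for the group with earlier onset rises faster.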
All of the significant SNPs had a cumulative probability plot similar to that presented in Figure 1a, with the exception of hCV12062485, which was more similar to Figure 1b. Figure 2 illustrates the haplotype block structure and LD (as measured by r²) between the SNPs with significant associations. The haplotype blocks generated using our data revealed that six of the seven significant SNPs are in LD with each other, with DRD5 in the middle of this region. Using the CEPH data all seven of these SNPs were in one haplotype block. The degree of LD varies notably across the 6 SNP haplotype block, with two SNPs in perfect LD (hCV12062484 and hCV12062485) and many SNPs with only modest LD (r² ≈ 0.40). The 6 SNP haplotype, TAGGCG, was also found to be significant (p-value = 0.02).
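For reference, the pairwise LD measure can be sketched as below. This assumes the two-locus haplotype frequency is already known; in practice Haploview estimates it from unphased genotypes, and that estimation step is not reproduced here. The function and parameter names are assumptions for illustration.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation between alleles at two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom
```

A value of 1 means that each allele at one SNP always occurs with the same allele at the other, as reported for hCV12062484 and hCV12062485, while a value near 0.40 means that one SNP only partly predicts the other, which is why single-SNP and haplotype analyses can give different information.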
Our work suggests that variants surrounding DRD5 may influence the age at onset of ADHD. Using a logrank family-based analysis, we observed association with age at onset for 7 SNPs surrounding DRD5; individuals with the risk genotype were affected with ADHD earlier than those without the risk genotype. Many studies have reported associations of ADHD with the candidate genes discussed in this paper; however, few studies have found concrete ways in which these genes affect ADHD individuals. This is one of the first analyses that suggests candidate genes for ADHD may affect the disorder through the age at onset of the disease. It appears that individuals with a risk genotype at any of the seven significant SNPs surrounding DRD5 may be susceptible to developing ADHD earlier. These findings are also consistent with the evidence that is emerging from the ADHD literature, which suggests that ADHD results from small effects of several genes across the genome acting simultaneously in a complex way. With this finding there is evidence that DRD5 influences a specific attribute of ADHD, age at onset of the disease. It is also possible that the use of age at onset presents one way to refine the ADHD phenotype, which could be more effective in finding genetic effects.
Although the findings suggest that SNPs surrounding DRD5 are associated with age at onset of ADHD, and 6 of the 7 are in LD with the gene, these results must be interpreted cautiously because none of the 7 significant SNPs are actually contained within DRD5. Therefore, it is more likely that these SNPs are in LD with a functional variant that is causing the observed association, than that any of these SNPs is a functional variant in itself.
When we examined the distribution of age at onset among individuals with the risk genotype, we found that a large number of individuals reported that their ADHD began at age one. Age one is the earliest age at onset that was accepted in the interview. Therefore it is likely that, for these individuals, onset actually occurred as soon as they were born. This may suggest that we are observing two groups: the risk genotype group that more often has ADHD onset at birth, and the group without the risk genotype that has ADHD onset later in childhood. Because many individuals in the risk genotype group have such early onset, environmental exposures outside of the womb are less likely to influence their development of the disorder.
None of the other candidate genes had significant associations using FBAT-logrank; however, this analysis cannot rule out the possibility that an association exists. The absence of an association could be explained by many things, most notably low power. The sample used in this analysis was small and a larger sample would have had more power to detect modest genetic effects, which are likely to exist for some of these candidate genes.
There are several limitations of this study. One disadvantage was that we were not able to identify all of the individuals in the sample who met all of the diagnostic criteria for ADHD with the exception of the age at onset criterion. By not having information on these individuals, the survival analysis was less precise in the upper extremity of the age at onset distribution. Genetic studies such as this may benefit more from using individuals that meet all of the diagnostic criteria for ADHD with the exception of the age at onset criterion. In this way more precise information can be gathered about those who exhibit all ADHD signs and symptoms, but do not meet the age at onset criterion. A second limitation is that the methodology implemented in this analysis may not detect significant associations of age at onset with a SNP that has less estimated power to detect an association, as these tests will be dropped when the SNPs are rank ordered according to power in the screening procedure. This problem could be greater for larger genes. The goal of our SNP selection procedure was to reduce the number of SNPs tested and hence reduce the number of multiple comparisons. That, in turn, increases our power. If we had selected more SNPs for the larger genes that would have increased our multiple comparison problem. We think this is not a problem for testing our main hypothesis, i.e., that each gene contains a causal SNP or haplotype that affects the age at onset of ADHD. A third limitation is that the ADHD families used in this analysis were ascertained through different studies. Because some of these studies had ascertainment criteria other than ADHD (e.g., selecting through bipolar probands, requiring two or more ADHD siblings) our results may not be generalizable to samples recruited only through a single ADHD proband. 
Although families were recruited using slightly different diagnostic systems (DSM-III-R or DSM-IV), the collection of data on inattentive and hyperactive-impulsive symptoms and age at onset was uniform throughout the sample, and the correspondence of ADHD diagnoses between these systems is high. Finally, the reliability of age at onset of ADHD as an outcome variable is one of the biggest limitations of this study. Because one year is the minimum age at onset that could be recorded, the variable is left censored. In addition, it is often difficult to determine the age at onset when the first symptoms can occur at such a young age. There were also some discrepancies between the age at onset reported by parents and by the children themselves, although this is likely attributable to the difficulties children have with accurately recalling something so early in their life. Some children may report a later age at onset due to issues of access to service. Although there are clear limitations of the reliability of the age at onset variable, unreliability would obscure associations with age at onset rather than create false positive findings.
This is the first study to evaluate the association between age at onset of ADHD and ADHD candidate genes. Our findings suggest that one candidate gene for ADHD, DRD5, may affect ADHD through disease onset; however additional studies of age at onset with this gene are necessary before these findings can be substantiated. Our results provide suggestive evidence that genes have some involvement in the age at onset of ADHD, which further supports the idea that age at onset is a biologically meaningful feature of the disorder, and may be useful in clarifying its genetic heterogeneity. Future studies in molecular genetics that identify functional proteins in DRD5 may be useful in clarifying how and why DRD5 affects the age at onset of ADHD.
This work was sponsored by the National Institute of Mental Health through the Psychiatric Epidemiology and Biostatistics training grant (T32-MH017119) as well as the following grants from the National Institutes of Health: MH059532, R01HD37694, R01HD37999, and R01MH66877. We would also like to thank Dr. Pamela Sklar for her useful comments on this manuscript.