The Diagnostic Utility of Executive Function Assessments in the Identification of ADHD in Children
Background: Deficits in executive functions have been widely reported to characterise individuals with ADHD. The aim of this study was to evaluate the utility of a range of executive function measures for identifying children with ADHD.
Method: Eighty-three children with ADHD and 50 normally-developing children without ADHD were assessed on measures of cognitive inhibition, set-shifting, planning, problem-solving, response inhibition, sustained attention and working memory. Measures of sensitivity, specificity, likelihood ratios and diagnostic odds ratios were calculated.
Results: Executive function tasks effectively discriminated between children with and without ADHD. Measures of response inhibition and working memory contributed the most to the discriminant function.
Conclusions: Cognitive measures of executive function can be used to help identify children with ADHD and could be useful as additional diagnostic tools for clinical practitioners.
Key Practitioner Message:
- ADHD diagnoses are often based heavily upon symptoms assessed by behavioural checklists. These can lack diagnostic utility.
- It is possible to enhance clinical diagnoses of ADHD by employing neuropsychological / cognitive tests of executive functioning.
- Where there is little opportunity to undertake a full range of cognitive measures, brief tests of response inhibition and working memory can provide high levels of discrimination between individuals with and without ADHD.
- Guidance from clinicians about the difficulties in executive functioning experienced by children with ADHD may prove helpful to teachers and parents.
Deficits in executive functions, the high-level cognitive processes involved in goal-directed behaviour, are widely believed to lie at the core of attention deficit hyperactivity disorder (ADHD). It is therefore perhaps surprising that current diagnostic practice makes little systematic use of assessments of executive dysfunction, with clinical assessment relying heavily on descriptions of behaviour in multiple settings (typically, at home and either at school or work) as markers for the disorder. The aim of the present study was to investigate whether the deficits in executive function that are widely reported to accompany ADHD can themselves provide reliable indices of this relatively common clinical condition.
ADHD is a disorder characterised by atypically high levels of hyperactive/impulsive behaviour and of inattention (APA, 1994). Children and adults are typically diagnosed through psychiatric services on the basis of elevated levels of these symptoms on behaviour checklists such as the Conners Rating Scales (Conners, 1997) and the ADHD-IV Rating Scale (DuPaul et al., 1998), optionally combined with semi-structured clinical interviews and formal and informal observations of the individual. There are several concerns about the heavy reliance on behaviour checklists for ADHD diagnosis. The discriminant validity of behaviour ratings for ADHD symptoms is far from impressive (Gomez et al., 2003, 2005). Levels of agreement between behavioural ratings across settings are low (Antrop et al., 2002; Gomez et al., 2003, 2005; Mares et al., 2007). Additionally, ratings are subject to the negative halo effect, which describes how one negative attribute or behaviour in a person can influence a rater to evaluate other behaviours and attributes negatively (Schachar, Sandberg, & Rutter, 1986; Stevens, Quittner, & Abikoff, 1998). Finally, there is substantial symptomatic overlap between ADHD and other disorders such as autism and conduct disorder (Barkley, 1990). These issues have prompted the recent call for multi-method assessments to add validity to the diagnosis of ADHD (Pineda et al., 2007).
Impairments in executive functions are widely recognised as primary features of ADHD (Barkley, 1997; Castellanos et al., 2006; Martinussen et al., 2005; Nigg, 2001). Most taxonomies of executive function include inhibition, shifting, planning and working memory (WM). Inhibition involves overriding dominant or prepotent responses at either the motoric or cognitive level, shifting involves switching between multiple tasks or mental sets, and planning refers to the ability to think ahead to achieve a goal. WM is a multi-component system of storage and attentional control that supports the temporary storage and mental manipulation of material for brief periods of time (Baddeley, 2000).
Individuals with ADHD have been widely reported to have impairments in these key executive functions: in inhibition (Nigg, 2001; Pennington & Ozonoff, 1996), in shifting (Oades & Christiansen, 2008; van Mourik, Oosterlaan & Sergeant, 2005), in planning (Barkley, 2003; Solanto et al., 2007), and in WM (Barkley, 2006; Martinussen et al., 2005; Willcutt et al., 2005). Deficits have also been widely reported in cognitive activities that require sustaining attention over extended periods of time (Pasini et al., 2007; Solanto et al., 2007). One such measure, the Continuous Performance Test (Conners & Multi-Health Systems Staff, 2004), is at present one of the few cognitive assessments used to inform clinical diagnosis of ADHD, with high frequencies of false responses to non-target stimuli (known as commission errors) characterising many individuals with ADHD (Ricco et al., 2002).
A variety of explanations have been advanced for the extensive deficits in executive function found to accompany ADHD. Some theorists have proposed that core inhibitory problems cause secondary disruption to other executive functions and underlie the broader constellation of cognitive and behavioural deficits associated with ADHD (Barkley, 1997). Other multiple deficit models, however, suggest that ADHD results from the additive or interactive effects of a number of factors that include inhibition and WM (Castellanos & Tannock, 2002; Willcutt et al., 2005).
In the present study, we investigated the extent to which measures of executive function could be used by clinical practitioners to identify children who are likely to have ADHD. Although some studies have previously examined the discriminant validity of individual measures of memory and learning (Phelps, 1996) or of latent constructs derived from a battery of neuropsychological assessments (Pineda et al., 2007), there have to our knowledge been no studies comparing the predictive value of different executive function measures. In our study, children with ADHD and a group of typically-developing children of the same age completed a battery of standardised tests designed to tap the full range of executive functions reported to be impaired in ADHD. Participants completed measures of cognitive inhibition (Color-Word Interference), set shifting (Trail Making), problem-solving (Card Sort) and planning (Tower) from the Delis-Kaplan Executive Function System (D-KEFS; Delis, Kaplan & Kramer, 2001). The Automated Working Memory Assessment (AWMA; Alloway, 2007) was also administered. This test battery is a standardised tool with high construct validity consisting of verbal and visuo-spatial short-term memory (STM) tests, and verbal and visuo-spatial WM tests that tap both the central executive and the appropriate domain-specific stores of the Baddeley and Hitch model of WM (Baddeley & Hitch, 1974; Baddeley, 2000). Finally, the children completed the Walk-Don’t Walk test of response inhibition from the Test of Everyday Attention for Children (TEA-Ch; Manly et al., 1999), and the Continuous Performance Test (CPT) of sustained attention (Conners & Multi-Health Systems Staff, 2004).
Two groups of children participated in the study. The first group consisted of 83 children aged between 8 and 11 years with a clinical diagnosis of ADHD. These children were recruited through pediatric psychiatrists and community pediatricians based in the North-East of England. Their mean age was 9 years 9 months (SD = 11.98 months), and there were 71 boys and 12 girls. The majority were receiving psycho-stimulant medication for ADHD (methylphenidate n = 64, dexamphetamine n = 2, dexedrine n = 2, imipramine n = 1) and 15 were receiving no medication. Children prescribed medication for their ADHD symptoms stopped taking it 24 hours before testing. No children with autistic spectrum disorders were included in the sample. A comparison group of 50 typically developing, non-ADHD children aged between 8 and 11 years from the same schools was also recruited, with a mean chronological age of 9 years 10 months (SD = 11.89 months). This group consisted of 20 girls and 30 boys. Ethical approval was obtained through the local National Health Service ethics board (Hartlepool & North Tees Local Research Ethics Committee) and through Durham University’s Ethics Committee. Consent was obtained from parents/guardians and children, with appropriate opportunities for withdrawal.
Four tests of the D-KEFS (Delis et al., 2001) were administered. The Trail-Making test assessed abilities to shift attention between mental sets, and consisted of a number of different conditions. In the Number-Letter Sequencing condition, children attempted to draw connecting lines between circles containing letters and numbers, in increasing alternating sequence (A-1-B-2-C-3, etc.). Other conditions within this test measured the basic processes necessary for the completion of the Number-Letter Sequencing condition. These included a Visual Scanning condition, which measured the children’s visual scanning speed. This required children to cross out targets (all the number 3s) in an array of letters and numbers as quickly as possible. A Motor Speed condition, in which children had to connect a series of unlabelled circles as quickly as possible, was also included, as were separate Number Sequencing and Letter Sequencing conditions. The latter two conditions required children to connect either consecutively numbered or consecutively lettered circles. Completion times were calculated for each condition, and converted to scaled scores (M = 10.00, SD = 3.00). Errors were calculated and converted to cumulative percentiles as per the instructions in the user manual.
Cognitive inhibition of a prepotent response was assessed by the Color-Word Interference test. One condition, Color-Word, involved a standard Stroop task in which children inhibited the over-learned verbal response of naming the colour word, and instead named the dissonant colour in which the words were printed (e.g., if the word green was printed in red ink, the correct response was ‘red’). A further condition, the Color-Word with Switch condition, tested both inhibition and switching: the child was now instructed to name the colour of the ink (as in the Color-Word condition) for all words except those displayed in a box. For these trials the child was instructed to switch to naming the colour word, not the colour of the ink. Further conditions measured the basic processes necessary for the completion of the Color-Word and Color-Word with Switch conditions. These were a Color-Naming condition, which required children to name colours, and a Word Reading condition, which required children to read colour words. Completion times, converted to scaled scores, were produced for all four tasks and errors were scored as cumulative percentiles.
The D-KEFS Card Sort test measured the initiation of problem-solving behaviour and conceptual learning, as well as the ability to inhibit and control previous responses to engage in flexible thinking when problem solving. This test required children to sort six cards into two groups of three according to different dimensions such as shape, colour or semantic information written on the cards. The number of correct sorts was scored and converted to a scaled score.
Planning, rule learning and the ability to inhibit an impulse response were measured using the D-KEFS Tower test. The test required children to move five disks of different sizes that were arranged on three pegs from a start position to an end state. The children had to adhere to two rules when attempting to reach the end state: i) only one disk was to be moved at a time, and ii) no disk was to be placed on a smaller disk. A total achievement scaled score was calculated, which reflected the number of moves it took the child to complete the task. Rule violations, the number of times children broke the rules of the task, were scored as cumulative percentiles.
In addition to the four tests of the D-KEFS, children completed the K test of the CPT (Conners & Multi-Health Systems Staff, 2004), which measured sustained attention. In this test, a series of letters appeared on the computer screen and the child was required to press the space bar in response to the letter K, but not to respond when any other letter appeared. In total, 480 stimuli were each presented for 250 ms, with an inter-stimulus interval of one second. The target stimuli appeared on 140 of the trials at random intervals. Omission errors (missed targets) and commission errors (responses to non-targets) were scored as counts.
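The scoring rule described for the CPT can be illustrated with a minimal sketch. The function name `score_cpt`, its arguments, and the six-trial excerpt are hypothetical and are not taken from the published test software; the sketch only assumes that each trial records the letter shown and whether the space bar was pressed.

```python
def score_cpt(stimuli, responses, target="K"):
    """Count omission and commission errors for a letter-detection task.

    stimuli   -- letters shown, one per trial
    responses -- booleans, True if the key was pressed on that trial
    """
    if len(stimuli) != len(responses):
        raise ValueError("stimuli and responses must have the same length")
    omissions = 0    # target shown, but no response
    commissions = 0  # non-target shown, but a response was made
    for letter, pressed in zip(stimuli, responses):
        if letter == target and not pressed:
            omissions += 1
        elif letter != target and pressed:
            commissions += 1
    return omissions, commissions


# Hypothetical six-trial excerpt, not data from the study.
stimuli = ["K", "A", "K", "B", "K", "C"]
responses = [True, False, False, True, True, False]
omissions, commissions = score_cpt(stimuli, responses)
```

In this excerpt the third trial is a missed target and the fourth is a response to a non-target, so each error type is counted once.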
Response inhibition was measured using the Walk-Don’t Walk test from the TEA-Ch (Manly et al., 1999). In this test, children were given an A4 sheet showing paths made up of footprints. They listened to a CD that played Go and No-Go sounds and were required to dot the next footprint on the path with a marker pen when they heard the Go sound. They were instructed not to respond to a No-Go sound. The Go sounds were presented in a regular, rhythmic order with the No-Go sounds occurring at random, unpredictable intervals. Inter-tone intervals began at 1500 ms and were systematically reduced throughout the task, reaching a minimum of 500 ms at item 20. The duration of the task was approximately six minutes. The total number of correct responses out of 20 was converted to a scaled score (M = 10, SD = 3).
Children also completed all 12 subtests of the AWMA (Alloway, 2007), providing three tests each of verbal STM (Digit Recall, Word Recall, and Nonword Recall), visuo-spatial STM (Dot Matrix, Block Recall, and Mazes Memory), verbal WM (Backwards Digit Recall, Listening Recall, and Counting Recall), and visuo-spatial WM (Mr. X, Spatial Span, and Odd-One-Out).
Descriptive statistics for the principal measures are provided in Table 1. Group comparisons were conducted using t-tests; significance levels and effect size values (Cohen’s d) are also shown. The groups differed significantly on many, but not all, of the executive function measures.
Table 1. Descriptive statistics for executive function measures as a function of group

| Test | Measure | ADHD n | ADHD M | ADHD SD | Comparison n | Comparison M | Comparison SD | t | p | d |
|---|---|---|---|---|---|---|---|---|---|---|
| Trail Making test | visual scanning time | 80 | 11.18 | 2.86 | 50 | 11.84 | 2.71 | 1.25 | 0.22 | 0.22 |
| | motor speed time | 80 | 10.73 | 2.73 | 50 | 11.16 | 2.51 | 0.80 | 0.43 | 0.14 |
| | number sequencing time | 80 | 9.91 | 3.33 | 50 | 10.58 | 2.94 | 1.02 | 0.31 | 0.18 |
| | number sequencing errors | 80 | 90.75 | 23.08 | 50 | 97.10 | 15.20 | 1.91 | 0.06 | 0.31 |
| | letter sequencing time | 80 | 9.19 | 3.70 | 50 | 9.92 | 2.99 | 1.38 | 0.17 | 0.25 |
| | letter sequencing errors | 80 | 76.75 | 29.61 | 50 | 88.98 | 27.50 | 2.41 | 0.02 | 0.42 |
| | number-letter sequencing time | 80 | 10.51 | 2.85 | 50 | 10.56 | 2.92 | 0.09 | 0.93 | 0.02 |
| | number-letter sequencing errors | 80 | 32.86 | 22.38 | 50 | 45.51 | 18.30 | 3.51 | 0.00 | 0.58 |
| Color-Word Interference test | color naming time | 77 | 10.25 | 2.99 | 49 | 12.20 | 3.45 | 3.62 | 0.00 | 0.62 |
| | color naming errors | 77 | 47.19 | 37.31 | 49 | 73.27 | 35.70 | 4.27 | 0.00 | 0.72 |
| | word reading time | 77 | 9.88 | 3.29 | 49 | 12.16 | 2.14 | 4.99 | 0.00 | 0.75 |
| | word reading errors | 77 | 54.84 | 44.67 | 49 | 82.45 | 33.40 | 4.44 | 0.00 | 0.70 |
| | color-word with switch time | 77 | 10.55 | 3.25 | 49 | 12.29 | 2.87 | 3.07 | 0.00 | 0.54 |
| | color-word with switch errors | 77 | 41.49 | 24.53 | 49 | 53.37 | 20.30 | 4.58 | 0.00 | 0.78 |
| Card Sort test | number of free sorts | 83 | 5.71 | 2.49 | 50 | 7.86 | 2.26 | 4.99 | 0.00 | 0.82 |
| Tower test | total achievement | 83 | 13.95 | 4.80 | 50 | 12.94 | 3.83 | 1.34 | 0.18 | 0.23 |
| Continuous Performance Test | omissions | 83 | 34.39 | 23.07 | 50 | 23.36 | 21.20 | 2.75 | 0.01 | 0.48 |
| Response inhibition | walk / don’t walk | 83 | 3.83 | 3.31 | 50 | 9.28 | 3.69 | 8.80 | 0.00 | 1.25 |
| Working memory | verbal STM | 83 | 98.82 | 16.81 | 50 | 112.3 | 10.2 | 5.11 | 0.00 | 0.84 |
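For readers who want to check a row of Table 1 from its summary statistics, the following sketch computes an independent-samples t statistic and Cohen's d. It assumes an equal-variance t-test and a d based on the pooled standard deviation; the paper does not state which variants were used, so recomputed values need not match the tabled ones. The function names are illustrative only.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation of two independent groups."""
    numerator = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    return math.sqrt(numerator / (n1 + n2 - 2))


def t_and_d(n1, m1, sd1, n2, m2, sd2):
    """Return (t, d) as absolute values, assuming equal variances."""
    sp = pooled_sd(n1, sd1, n2, sd2)
    diff = abs(m1 - m2)
    t = diff / (sp * math.sqrt(1 / n1 + 1 / n2))
    d = diff / sp
    return t, d


# Walk / Don't Walk row of Table 1: ADHD group first, comparison group second.
t, d = t_and_d(83, 3.83, 3.31, 50, 9.28, 3.69)
```

Under these assumptions the t statistic for this row comes out close to the tabled value, whereas d depends on which standardiser is chosen and can differ from the reported figure.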
The ADHD group committed significantly more errors in three conditions of the Trail Making test: Number Sequencing, Letter Sequencing and Number-Letter Sequencing. However, there were no significant differences between the groups in terms of completion time in these conditions, nor were there any significant differences in the Visual Scanning or Motor Speed conditions of this test.
The ADHD group also committed significantly more errors than the non-ADHD group in all conditions of the Color-Word Interference test: the Word Reading, Color Naming, Color-Word and Color-Word with Switch conditions. The non-ADHD group performed each of these conditions significantly faster than the ADHD group, but the completion times of the ADHD group still fell within the normal range.
On the Tower test, the groups did not differ significantly on total achievement score, indicating that there were no significant differences in the number of moves taken to complete the task. However, the ADHD group committed a significantly greater number of rule violations than the non-ADHD group.
The ADHD group was also significantly impaired on the Card Sort test relative to the non-ADHD group, demonstrating poorer problem-solving and conceptual skills. They also committed a significantly greater number of both omissions and commissions on the CPT than the non-ADHD children.
Finally, the ADHD group scored significantly more poorly than the non-ADHD group across all four aspects of WM: verbal STM, visuo-spatial STM, verbal WM and visuo-spatial WM. These differences were significant for all 12 WM subtests.
Discriminant function analyses were conducted to evaluate the extent to which performance on executive function measures accurately predicted whether children had been diagnosed with ADHD or not. In the first analysis, all of the principal executive function measures were entered. The resulting function was significant, Λ = .40, χ² (26, N = 126) = 100.57, p < .001. Canonical variate correlation coefficients for this function are shown in Table 2. Group membership was classified correctly for 85.7% of the children, with 88.3% of the ADHD and 81.6% of the comparison children correctly assigned. With leave-one-out cross validation, a method that assesses the extent to which the function can predict a new sample, 77.8% of the sample were correctly classified, 80.5% of the ADHD and 73.5% of the comparison children. Acceptable levels of classification range between 70% and 90% (Glascoe & Squires, 2007; Miesels, 1988).
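The logic of leave-one-out cross validation can be sketched in a deliberately simplified form. The example below assumes a single predictor, two groups, equal prior probabilities and a common variance, in which case a linear discriminant rule reduces to assigning each child to the group with the nearer mean. It is not the multivariate analysis reported here, and the scores are hypothetical.

```python
from statistics import mean


def classify(score, mean_a, mean_b):
    """Assign a score to the group ('a' or 'b') whose mean is closer."""
    return "a" if abs(score - mean_a) <= abs(score - mean_b) else "b"


def leave_one_out_accuracy(scores, labels):
    """Classify each case using group means estimated without that case."""
    correct = 0
    cases = list(zip(scores, labels))
    for i, (score, label) in enumerate(cases):
        rest = [case for j, case in enumerate(cases) if j != i]
        mean_a = mean(s for s, lab in rest if lab == "a")
        mean_b = mean(s for s, lab in rest if lab == "b")
        if classify(score, mean_a, mean_b) == label:
            correct += 1
    return correct / len(cases)


# Hypothetical scaled scores: group 'a' tends to score low, group 'b' high.
scores = [2, 3, 4, 8, 9, 10, 11, 5]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
accuracy = leave_one_out_accuracy(scores, labels)
```

Because each case is classified by a rule that was estimated without it, the resulting accuracy is usually somewhat lower than the accuracy obtained when the same cases are used both to fit and to test the function.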
Table 2. Canonical variate correlations

| Test | Measure | Canonical variate correlation |
|---|---|---|
| Trail Making test | visual scanning time | 0.09 |
| | motor speed time | 0.06 |
| | number sequencing time | 0.06 |
| | number sequencing errors | 0.09 |
| | letter sequencing time | 0.08 |
| | letter sequencing errors | 0.16 |
| | number-letter sequencing time | 0.01 |
| | number-letter sequencing errors | 0.25 |
| Color-Word Interference test | color naming time | 0.25 |
| | color naming errors | 0.29 |
| | word reading time | 0.32 |
| | word reading errors | 0.27 |
| | color-word with switch time | 0.23 |
| | color-word with switch errors | 0.37 |
| Card Sort test | number of free sorts | 0.33 |
| Tower test | total achievement | −0.09 |
| Continuous Performance Test | omissions | −0.18 |
| Response inhibition | walk / don’t walk | 0.62 |
| Working memory | verbal STM | 0.34 |
The classification rates from the discriminant function analyses reported above were used to compute likelihood ratios (Sackett et al., 1991), which quantify the extent to which members of one group are more likely to score either above or below a particular cut-off value (in this case, derived from a discriminant function analysis) than members of another group. For this study, the positive likelihood ratio (LR+) is calculated by dividing the proportion of children with ADHD who were correctly classified by the discriminant function as belonging to that group by the proportion of children from the comparison group who were misclassified as belonging to the ADHD group. The negative likelihood ratio (LR-) is obtained by dividing the proportion of children with ADHD who were wrongly classified as belonging to the comparison group by the proportion of comparison group children who were correctly classified as such. The LR+ value was 4.58, indicating that children with ADHD were at least 4.5 times more likely to score poorly on executive function measures than children without ADHD. The LR- value was .14. The diagnostic odds ratio (LR+/LR-), a summary measure of the degree of discrimination between the groups provided by the executive function measures, was 34.29. Diagnostic odds ratios range from 0 to infinity; values over 1 indicate that a test discriminates between groups, with higher diagnostic odds ratios indicating better discriminant ability.
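The definitions above translate directly into a few lines of arithmetic. In the sketch below, `sensitivity` is the proportion of children with ADHD who were correctly classified and `specificity` is the proportion of comparison children who were correctly classified; the function name is illustrative, and the inputs are the whole-battery classification rates reported earlier (88.3% and 81.6%).

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, diagnostic odds ratio) from two classification rates.

    sensitivity -- proportion of the clinical group classified correctly
    specificity -- proportion of the comparison group classified correctly
    """
    lr_pos = sensitivity / (1 - specificity)   # hit rate / false-alarm rate
    lr_neg = (1 - sensitivity) / specificity   # miss rate / correct-rejection rate
    dor = lr_pos / lr_neg
    return lr_pos, lr_neg, dor


lr_pos, lr_neg, dor = likelihood_ratios(0.883, 0.816)
```

Values recomputed in this way from rounded percentages will not necessarily reproduce the published figures to the second decimal place, so the output is best treated as a check on the arithmetic rather than as a replacement for the reported values.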
These data clearly establish that multiple executive function measures can reliably discriminate between children with and without ADHD. However, their utility for clinical practice may be severely limited by the number of cognitive tests that can be undertaken in a single assessment: together, these tests take approximately 90 minutes to administer. To address this practical issue, we sought to identify the measures that provided the best individual predictors of group membership, guided by the canonical variate correlation coefficients for the first discriminant analysis. These coefficients represent the relative contribution of each dependent variable to group separation: the larger the value, the greater the contribution. The four variables with the highest coefficients were response inhibition (.62), visuo-spatial WM (.46), verbal WM (.46) and visuo-spatial STM (.44). These four variables were entered into the second discriminant analysis, Λ = .52, χ² (4, N = 132) = 83.70, p < .001. The classification function correctly predicted group membership for 82% of the sample, with 85.56% of the ADHD group and 76% of the non-ADHD group correctly classified. With leave-one-out cross validation, the classification function was unchanged. The LR+ was 3.56 and the LR- was .19, yielding a diagnostic odds ratio of 18.74, again providing excellent group differentiation.
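The link between Wilks' Λ and the accompanying chi-square statistic can be illustrated with Bartlett's approximation, which is the conversion commonly applied in discriminant analysis. The paper does not say how its chi-square values were obtained, so the sketch below is an assumption about the method rather than a description of it; the function name is illustrative.

```python
import math


def wilks_chi_square(wilks_lambda, n_cases, n_predictors, n_groups=2):
    """Bartlett's chi-square approximation for Wilks' lambda.

    chi2 = -(N - 1 - (p + g) / 2) * ln(lambda), with df = p * (g - 1)
    """
    multiplier = n_cases - 1 - (n_predictors + n_groups) / 2
    chi_square = -multiplier * math.log(wilks_lambda)
    df = n_predictors * (n_groups - 1)
    return chi_square, df


# Four-predictor analysis: lambda = .52, N = 132, two groups.
chi_square, df = wilks_chi_square(0.52, 132, 4)
```

With these inputs the approximation gives a chi-square of about 83.7 on 4 degrees of freedom, which agrees with the value reported for the four-variable function.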
Separate discriminant function analyses were run for each of the four variables entered into the previous analysis to explore how useful a single measure might be at discriminating between ADHD and non-ADHD groups. The visuo-spatial WM measure was entered into the first of these analyses, Λ = .74, χ² (1, N = 132) = 38.69, p < .001. This function correctly classified 74.4% of the sample, with 84.3% of the ADHD group and 58% of the non-ADHD group correctly identified. The classification function was unchanged with leave-one-out cross validation. The LR+ was 2.01 and the LR- was .03, yielding a diagnostic odds ratio of 67. When entered as a single predictor, the verbal WM measure, Λ = .74, χ² (1, N = 132) = 39.01, p < .001, correctly predicted group membership for 71.4% of the sample: 77.1% of the ADHD and 62% of the non-ADHD group. The LR+ was 2.03, the LR- was .37 and the diagnostic odds ratio was 5.49 for this function, demonstrating that the verbal WM measure was not as good a single predictor as the visuo-spatial WM measure. The next analysis, conducted on the visuo-spatial STM measure, Λ = .74, χ² (1, N = 132) = 38.69, p < .001, revealed that it was better than the verbal WM measure, but poorer than the visuo-spatial WM measure, at predicting group membership for the ADHD group (81.9% were correctly classified). However, this measure yielded a high LR-, 1.52, which produced a very low diagnostic odds ratio of 1.17, indicating that it is not as good overall at discriminating between the groups as the WM measures.
The final discriminant function analysis was performed on the best predictor of group membership in this dataset, response inhibition (see Table 2). The resulting function, Λ = .63, χ² (1, N = 133) = 60.65, p < .001, correctly predicted group membership for 78.2% of the sample, with 83.1% of the ADHD and 70% of the non-ADHD groups correctly classified, yielding LR+ = 2.77 and LR- = .24, and a diagnostic odds ratio of 11.54.
Taken together, this series of analyses shows that the response inhibition and visuo-spatial WM measures were the best single predictors of ADHD group membership (response inhibition, 83.1% and visuo-spatial WM, 84.3%), but that the response inhibition task was better at predicting non-ADHD group membership (response inhibition, 70%, visuo-spatial WM, 58%).
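The overall classification rate for a single predictor is a weighted average of the two group-specific rates, weighted by group size. The sketch below makes that relationship explicit using the response inhibition figures reported above (83.1% of 83 children with ADHD and 70% of 50 comparison children); the function name is illustrative.

```python
def overall_accuracy(sensitivity, n_cases, specificity, n_controls):
    """Overall proportion correct as a size-weighted average of group rates."""
    correct = sensitivity * n_cases + specificity * n_controls
    return correct / (n_cases + n_controls)


# Response inhibition as a single predictor.
accuracy = overall_accuracy(0.831, 83, 0.70, 50)
```

This weighting is why two measures with similar accuracy for the ADHD group can differ in overall accuracy: the measure that classifies the comparison group better gains on the remaining part of the sample.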
This study investigated the extent to which cognitive assessments of executive functions could be used to identify children with ADHD. In line with previous research (Barkley, 1990, 1997; Castellanos et al., 2006; Holmes et al., 2008; Martinussen et al., 2005; Nigg, 2001; Willcutt et al., 2005), our large sample of children with ADHD was found to perform more poorly than typically-developing children of the same age on measures of cognitive inhibition, motor inhibition, set shifting, planning, card sorting and WM.
Our data establish that these cognitive assessments provide excellent degrees of discrimination between children who do and do not have a diagnosis of ADHD. Using scores on all of the tests, 88% of children with ADHD and over 80% of the comparison group were correctly classified. Classification accuracy was still high when only the single best behavioural predictor was used, with 83% of the children with ADHD and 70% of the comparison group assigned to the correct group. This test, which took approximately five minutes to administer, was the Walk-Don’t Walk test of the TEA-Ch (Manly et al., 1999), and involved the child responding by placing a dot on the next footprint on a sheet if he or she heard one distinctive sound, but not if another less common one was heard. It provides a measure of inhibition of a prepotent motor response, and children with ADHD performed very poorly on the test, frequently responding to the non-target sound rather than withholding a response, as appropriate.
These findings indicate that cognitive measures of executive functions can be used, with high degrees of accuracy, to help to identify children who are likely to have ADHD. Importantly, their administration could be readily incorporated into a standard clinical assessment without unduly extending the length of the session: a 5-minute assessment in our study provided over 80% diagnostic accuracy for children with ADHD. Inclusion of such measures in clinical assessments would go some way towards alleviating concerns about the reliance upon behaviour checklists in the diagnosis of ADHD, and would also contribute to the recognised need for multi-method assessments in identifying ADHD (Pineda et al., 2007). Furthermore, this form of assessment would go beyond reaching a diagnosis by describing behavioural characteristics through the counting of signs and symptoms, to assess the underlying neuropsychological mechanisms thought to be impaired in individuals with ADHD (Seidman et al., 1997; Barkley et al., 2001); deficits which are independent of comorbid disorders, such as ODD or anxiety (Oosterlaan, Scheres, & Sergeant, 2005). As work is undertaken to develop the diagnostic criteria for DSM-V, there continues to be considerable debate about whether the syndrome is best characterised along dimensional lines or as a discrete category. This is further complicated by the recognition that the symptoms of ADHD can have a range of origins (American Academy of Pediatrics, 2000). The inclusion of cognitive/neuropsychological assessments alongside checklists and clinical evaluation offers the opportunity to refine the nature and degree of difficulty a child is experiencing, not only helping to develop a more focused intervention, but also adding to the process of refining the understanding of etiologies.
Although this study demonstrates the potential utility of tests of executive function in the diagnosis of ADHD, caution is recommended when using only a subset of a battery of neuropsychological tests. As seen in the present study, using only four tests correctly identified 86% of children with ADHD, but it also misclassified 24% of those without the disorder. It is therefore advisable to use such tests in conjunction with behaviour checklists and clinical evaluation. It should also be noted that while these tests accurately discriminate between individuals with ADHD and a typically-developing sample, it is not clear from the present data whether they are able to distinguish between specific neuro-developmental disorders. To investigate this, future studies could apply the current methodology to other clinical samples, such as children with obsessive compulsive disorder or Asperger’s syndrome.
This research was supported by a project grant from the Economic and Social Research Council of Great Britain.