How does exercise benefit performance on cognitive tests in primary-school pupils?


  • This article is commented on by Barnett on page 580 of this issue.

Dr Liam Hill at College of Life Sciences and Medicine, Mental Health (Applied Clinical Sciences), University of Aberdeen, Clinical Research Centre, Royal Cornhill Hospital, Aberdeen AB25 2ZH, UK.


Aim  We have previously demonstrated improved cognitive performance after a classroom-based exercise regime. In this study, we examined the reproducibility of this effect in a more socio-economically diverse sample and also investigated whether cognitive benefits of exercise were moderated by body mass index (BMI) or symptoms of attention-deficit–hyperactivity disorder (ADHD).

Method  A crossover design trial (2wks in duration) randomized 552 children (mean age 9y 8mo, SD 1y 2mo; range 8–12y) by their school into two counterbalanced groups. Children were eligible to participate provided that they did not receive any additional support. One group received a classroom-based programme of physical exercise on week 1 and then no programme on week 2, and this order was reversed for the other group. Each week, all participants completed a cognitive test battery that was delivered in one part per day at the end of each school day.

Results  On the cognitive tests, a significant interaction between counterbalance group and exercise was observed (p<0.001). Benefits occurred only for participants who exercised during the second week (mean improvement 3.85, standard error 1.39). Although test scores were affected by age, sex, and level of ADHD symptoms, the effect of exercise was not moderated by either these factors or BMI.

Interpretation  Exercise interventions have a positive effect (with variable magnitude) on cognitive performance, possibly by facilitating practice effects. These effects are not moderated by sex, ADHD symptom level, or BMI.


BDNF  Brain-derived neurotrophic factor

CTB  Cognitive test battery

Depcat  Deprivation category

What this paper adds

  •  Exercise interventions have a positive effect (with variable magnitude) on cognitive performance.
  •  The article confirms that benefits can occur in response to a single episode of exercise, but only with repeated testing.
  •  This study found no relationship between the magnitude of benefit received from exercise and sex, baseline attentional level, or body mass index.

Evidence suggests clear associations between children’s physical fitness and their academic achievement,1,2 and experimental research has demonstrated that exercise interventions can directly improve cognitive and behavioural functioning in children.3–6 Nonetheless, methodological weaknesses within most studies (such as uncontrolled practice effects,5 small sample size,4 and limited generalizability6) indicate that further studies are required.

Hill et al.3 conducted a randomized controlled trial to investigate the effects of a classroom-based exercise programme on cognitive performance in 1224 children. Over a 2-week period a cognitive test battery (CTB) was administered twice to each participant randomized either to group A (who received exercise intervention in week 1) or group B (who received exercise intervention in week 2), creating a counterbalanced design. A significant improvement in overall CTB performance was found post exercise and this was independent of age.

The data reported by Hill et al.3 are of great potential interest, but as they come from a single study we felt that it was necessary to see whether the results could be replicated. Also, when the postcodes of the schools sampled in Hill et al.3 were cross-referenced with the Scottish Index of Multiple Deprivation,7 the sample was found to be drawn exclusively from the second least-deprived quintile of the population, indicating a need to conduct the study in a more socio-economically diverse sample.

We also sought to test an assertion8 that acute bouts of physical activity are particularly effective for improving cognitive function in children with more symptoms of attention-deficit–hyperactivity disorder (ADHD). Acute exercise has been documented as up-regulating brain-derived neurotrophic factor (BDNF), and it is through this mechanism that immediate improvements in cognitive function are thought to occur (other mechanisms become relevant in relation to chronic exercise).9,10 Correspondingly, BDNF has also been implicated in the pathogenesis of ADHD.11 Thus, it is conceivable that those children with more symptoms of ADHD might be most sensitive to post-exercise increases in BDNF and obtain the most benefit from it. In line with other current thinking, we conceive of ADHD as affecting children at the extreme end of a dimensional continuum, such that variability in the general population will have the same origins as differences between children with and without the disorder.12

Recently, Davis et al.6 demonstrated that exercise improves executive functioning specifically in overweight children and speculated that weight might moderate the effects of exercise. There is evidence of more scope for exercise to benefit overweight individuals: in adolescence, individuals with a higher body mass index (BMI) show mild deficits in executive functioning compared with their normal-weight peers,13 and extremely overweight children have been shown to have reduced BDNF levels.14 Thus, we also felt it appropriate to investigate BMI as a moderator variable.



Nine schools were recruited by Aberdeen City Council (north-east Scotland), and, of the 1226 children from the senior section of these schools (primaries 4–7; age increases sequentially by primary level from 8y up to 12y) invited to participate in the study, 760 consented (62%). Students were eligible to participate provided they could do so without the need for additional support (e.g. owing to a disability). None of these schools was involved in the previous study.3 Using school postcodes and the Scottish Index of Multiple Deprivation,7 two of these schools were classified as being in deprivation category (depcat) 1 areas (least deprived), two in depcat 3, three in depcat 4, and two in depcat 5 (most deprived).

In order to blind participants to the experimental aims, consent for the exercise intervention and the test battery was given separately, and the children received a full explanation of their participation only at debriefing. Parent/guardian consent was fully informed from the outset. Ethical approval was granted by the local ethics committee of the University of Aberdeen.


The frequency with which participants displayed ADHD-like symptoms was measured using a standardized parental report questionnaire (the Vanderbilt ADHD Diagnostic Parent Rating Scale15) that was sent home along with the consent forms.

Each participant’s height and weight were measured by the school before the experimental stage of the study. To assess the reliability of this measurement method, 10% of the participants from each school were randomly selected for re-measurement by the researchers using a standardized protocol16 and the recommended equipment – a Leicester Portable Height Measure® (SECA; Birmingham, UK) and Quattrotronic scales® (Soehnle; Nassau, Germany).

The experimental stage of the study replicated the crossover design used in the previous experiment (summarized in Fig. 1). The two schools at each level of socio-economic status were randomly assigned to either group A or group B. Schools in group A completed the exercise intervention in week 1 and the control condition in week 2, while those in group B did the opposite. This counterbalancing was an attempt to ensure equal numbers within each counterbalance arm at each socio-economic level. Thus, in depcat 4 the two smallest schools were grouped together for this randomization.

Figure 1. Experimental design: group A schools received the classroom exercise programme in week 1 (30min after lunch) and no physical exercise in week 2; in group B schools, the opposite order was applied. Two versions of each psychometric test (α and β) were administered, one per week, at the end of a specific weekday. Psychometric tests: PSA, paced serial addition; SOT, size ordering task; LS, listening span; DSB, digit-span backward; VC, visual coding.

On their scheduled week, participants completed the exercise intervention approximately 30 minutes after lunch. At the end of each day on both the intervention and control weeks, the psychometric testing was conducted within the classroom in the last 15 minutes of the school day. Children sat at their desks and completed the tasks under traditional ‘test’ conditions (no conferring, no pausing once started, and no assistance from the teacher). To ensure uniform delivery, stimuli and instructions were pre-recorded on to a CD and, in order to reduce practice effects, different versions of the battery of equal difficulty were delivered week by week. Teachers silently invigilated, judging whether participants were compliant with the instructions. Following administration, teachers noted any participants whom they had observed being non-compliant and these responses were excluded from further analysis. Responses were written in response booklets and later marked by two independent markers who were blind to the experimental hypothesis to ensure internal validity (10% were also double marked).


The cognitive test battery (CTB) used a selection of existing psychometric tests that were modified so that each test could be presented orally to whole classes simultaneously and written responses obtained within 15 minutes. The protocol was similar to that employed previously by Hill et al.,3 although some instructions were reworded and more practice trials were included to improve participants’ understanding of the task demands. Also, all but one of the tests (‘paced serial addition’) were modified to include more trials and/or had their marking schemes altered. These changes were made to increase the amount of response data collectable within the 15-minute administration time or to make it harder for participants to give invalid responses (i.e. to ‘cheat’).

The CTB subtests were definable as mental tracking tasks ‘requiring participants to track two or more stimuli or associated ideas simultaneously, alternatively, or sequentially on double or multiple tracking tests involving dividing and/or shifting attention’.17 The tests used were ‘paced serial addition’18 (delivered on Mondays), ‘size ordering task’19 (Tuesdays), ‘listening span task’20 (Wednesdays), ‘digit-span backwards’21 (Thursdays), and ‘visual coding’21 (Fridays). For each subtest, two versions of equivalent difficulty were created, α and β, to reduce practice effects from retesting. For example, the α and β versions of the digit-span backwards test used different sets of randomized number sequences. Participants completed the α version in week 1 of the experiment and the β version in week 2. The verbal protocols for all the adapted tests are available as supplementary material published online.

The exercise intervention was developed by Aberdeen City Council’s physical education curriculum support team. Lasting 10–15 minutes and directed by the teacher within the classroom, students completed a series of prescribed exercises while standing behind their desks (e.g. jogging on the spot, a basic sequence of jumps, etc.). This was intended to be moderately intensive for the average student (i.e. participants should begin to perspire and be slightly out of breath by the end of it; see supplementary material published online). During the control week, participants completed non-physically active, enjoyable curricular activities that were set by their class teacher (e.g. art, music, story time).

Statistical analysis

The inattention and hyperactivity/impulsivity subscales of the Vanderbilt ADHD Diagnostic Parent Rating Scale were scored using the standard protocol. If responses were less than 50% complete, they were excluded. Participant response was categorized into three levels for analysis: (1) individuals with zero symptom report scores for both subscales; (2) individuals with non-zero but subclinically significant symptom report scores; (3) individuals with a clinically significant score on either or both of the subscales (based on DSM-IV-TR criteria).22
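The three-level categorization described above can be sketched as follows. This is only an illustration: the function name, the treatment of each subscale score as a count of endorsed symptoms, and the cut-off of six symptoms per subscale are assumptions made here, since the text states only that clinical significance was based on DSM-IV-TR criteria.

```python
from typing import Optional

# Assumption: a subscale is treated as clinically significant when at least
# six of its symptoms are endorsed. The paper does not state the cut-off.
CLINICAL_SYMPTOM_COUNT = 6

# From the text: questionnaires less than 50% complete were excluded.
MIN_COMPLETENESS = 0.5


def classify_adhd_report(inattention: int,
                         hyperactivity: int,
                         fraction_complete: float) -> Optional[str]:
    """Return 'zero', 'subclinical' or 'clinical'; None if the report is excluded."""
    if fraction_complete < MIN_COMPLETENESS:
        return None  # too incomplete to score
    if (inattention >= CLINICAL_SYMPTOM_COUNT
            or hyperactivity >= CLINICAL_SYMPTOM_COUNT):
        return "clinical"  # clinically significant on either or both subscales
    if inattention == 0 and hyperactivity == 0:
        return "zero"  # zero symptom report scores on both subscales
    return "subclinical"  # non-zero but below the clinical cut-off
```

For instance, a fully completed report with two inattention symptoms and no hyperactivity/impulsivity symptoms would fall in the subclinical level under these assumptions.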

Participants’ height and weight were used to calculate their BMI, by dividing weight (in kilograms) by height (in metres) squared. This was then categorized as normal weight, overweight, or obese, based on accepted criteria.23 Agreement between the repeated height and weight measurements taken by the schools and the researcher was examined using Bland–Altman plots.
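As a minimal sketch of the two calculations in this paragraph, the code below computes BMI from weight and height and summarizes agreement between two sets of measurements in the Bland–Altman manner (mean difference and 95% limits of agreement). The function names are hypothetical and the 1.96 multiplier is the conventional choice rather than a detail given in the text. The weight categories are not reproduced here because the cut-offs come from the cited criteria.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def bland_altman(first: Sequence[float],
                 second: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean difference, lower limit, upper limit) for paired measurements.

    The limits of agreement are the mean difference plus or minus 1.96
    standard deviations of the paired differences (conventional choice).
    Requires at least two pairs.
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Illustrative values only, not data from the study.
school_heights_cm = [140.0, 135.0, 150.0]
researcher_heights_cm = [139.0, 136.0, 148.0]
bias, lower, upper = bland_altman(school_heights_cm, researcher_heights_cm)
```

A Bland–Altman plot would then show each pair's difference against the pair's mean, with horizontal lines at the three returned values.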

Scores on the CTB subtests were converted into percentages of the maximum possible score on each test and then averaged to give each child an overall performance score for each week. Overall performance scores included only performances on subtests that had been validly completed on both weeks. Participants were excluded from further analysis if their overall performance score was based on data from fewer than three complete subtests.
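The scoring rule can be illustrated with a short sketch. The function name, the dictionary layout, and the maximum scores in the example are hypothetical; only the rule itself (percentage of maximum, averaged over subtests valid in both weeks, with at least three such subtests required) is taken from the text.

```python
from statistics import mean
from typing import Dict, Optional, Tuple

Scores = Dict[str, Optional[float]]  # subtest name -> raw score, None if invalid


def overall_scores(week1: Scores,
                   week2: Scores,
                   max_scores: Dict[str, float],
                   min_subtests: int = 3) -> Optional[Tuple[float, float]]:
    """Return (week 1, week 2) overall performance scores on a 0-100 scale.

    Only subtests validly completed in BOTH weeks are averaged. Returns None
    when fewer than `min_subtests` such subtests exist (participant excluded).
    """
    valid = [t for t in max_scores
             if week1.get(t) is not None and week2.get(t) is not None]
    if len(valid) < min_subtests:
        return None
    pct1 = [100 * week1[t] / max_scores[t] for t in valid]
    pct2 = [100 * week2[t] / max_scores[t] for t in valid]
    return mean(pct1), mean(pct2)


# Hypothetical maximum scores and raw scores, for illustration only.
maxima = {"PSA": 20, "SOT": 10, "LS": 10, "DSB": 10, "VC": 40}
w1 = {"PSA": 10, "SOT": 5, "LS": None, "DSB": 8, "VC": 20}
w2 = {"PSA": 12, "SOT": 6, "LS": 7, "DSB": None, "VC": 30}
result = overall_scores(w1, w2, maxima)  # uses PSA, SOT and VC only
```

In this example the listening span and digit-span backwards subtests are each missing in one week, so both are dropped from both weekly averages, which keeps the two scores comparable.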

Linear mixed modelling24 was carried out by the first and third authors using SAS (version 9.1, service pack 4; SAS Institute Inc., Cary, NC, USA) to examine the effects of the exercise intervention on the CTB and whether this factor interacted with any potential moderators. A restricted maximum likelihood method was used, with a compound symmetry covariance structure for modelling the random effect of individual participants. This random effect of individual participants was also nested within a random effect of school to take into account clustering by school in the randomization procedure.

The model set the overall performance scores for each of the 2 weeks as a repeated measure and examined the following factors as fixed effects: intervention (exercise vs control), counterbalance group (A or B), primary level, sex, BMI classification, and Vanderbilt ADHD Diagnostic Parent Rating Scale classification. The model was specified to consider all the main effects and any two-way interactions that involved the intervention factor. Only participants who had valid overall performance scores for both weeks and data available on all the fixed factors being considered were included within the model. Post hoc pairwise comparisons of differences in the least square means were conducted using Tukey’s test.
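One way to write down a model of this kind is sketched below. The notation is introduced here for illustration and is not taken from the paper: y is the overall performance score of child i in school s in week t, E indicates the exercise week, G indicates counterbalance group A, and the vector z holds indicator codes for primary level, sex, BMI classification, and ADHD symptom classification. The normality of the random terms is the usual assumption for linear mixed models rather than something stated in the text.

```latex
y_{sit} = \beta_0 + \beta_E E_{sit} + \beta_G G_s
        + \mathbf{z}_{si}^{\top}\boldsymbol{\gamma}
        + \beta_{EG}\, E_{sit} G_s
        + E_{sit}\, \mathbf{z}_{si}^{\top}\boldsymbol{\delta}
        + u_s + v_{i(s)} + \varepsilon_{sit},
\qquad
u_s \sim N(0, \sigma_u^2), \quad
v_{i(s)} \sim N(0, \sigma_v^2), \quad
\varepsilon_{sit} \sim N(0, \sigma_\varepsilon^2)
```

Here the coefficients in δ and the coefficient β_EG are the two-way interactions with the intervention, so a moderator effect would appear as a non-zero element of δ. The participant term v gives the two weekly scores from one child a constant covariance, which corresponds to a compound symmetry structure, and the school term u reflects clustering by school.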


Attrition rate was 8%, with 64 participants (696 remaining) unable to complete at least three components of the CTB fully on both weeks. This was primarily owing to school absence.

A minimum of 11% of the data for each variable was double marked and then double entered. The mean disagreement level was 2%, ranging between 0% and 6% depending on the particular variable. The Bland–Altman plots showed no clinically significant differences between the school-based measures of height and weight and those obtained using the standardized protocol (absolute difference in heights 1.3cm, SD 0.9cm; absolute difference in weights 0.74kg, SD 0.87kg; see supplementary material published online).

Of the 696 participants left after dropout, 552 (79%) satisfied the criteria for inclusion in the linear mixed modelling analysis. Table I shows the demographics of this group, how they were distributed across the counterbalance, and the frequency of missing data. Variable participation rates within the recruited schools resulted in a highly skewed distribution of depcats within the sample as a whole (and between counterbalance groups), so depcat was not included as an analysis variable.

Table I.   Summary of demographic variables and their distribution across counterbalance groups

| Variable | n (% of n) | Group A (% within group) | Group B (% within group) |
|---|---|---|---|
| Total nᵃ | 552 (100) | 319 (57.8) | 233 (42.2) |
| Male | 295 (53.4) | 162 (50.8) | 133 (57.1) |
| Female | 257 (46.6) | 157 (49.2) | 100 (42.9) |
| % Missingᵇ | 1.8 | 0.0 | 3.5 |
| Primary level | | | |
| 4 | 141 (25.5) | 77 (24.1) | 64 (27.5) |
| 5 | 144 (26.1) | 85 (26.6) | 59 (25.3) |
| 6 | 133 (24.1) | 75 (23.5) | 58 (24.9) |
| 7 | 134 (24.3) | 82 (25.7) | 52 (22.3) |
| BMI classification | | | |
| Normal | 389 (70.5) | 212 (66.5) | 177 (76.0) |
| Overweight | 118 (21.4) | 75 (23.5) | 43 (18.5) |
| Obese | 45 (8.2) | 31 (10.0) | 13 (5.6) |
| % Missingᵇ | 12.9 | 11.0 | 15.3 |
| ADHD symptom reportᶜ | | | |
| Zero symptoms | 237 (42.9) | 137 (42.9) | 100 (42.9) |
| Subclinical symptom level | 261 (47.3) | 151 (47.3) | 110 (47.2) |
| Clinical symptom level | 54 (9.8) | 31 (9.7) | 23 (9.9) |
| % Missingᵇ | 12.9 | 11.0 | 15.0 |
| Depcat code | | | |
| 1 (least deprived) | 212 (38.4) | 97 (30.4) | 115 (49.4) |
| 2 | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| 3 | 138 (25.0) | 105 (32.9) | 33 (14.2) |
| 4 | 154 (27.9) | 99 (31.0) | 55 (23.6) |
| 5 (most deprived) | 48 (8.7) | 18 (5.6) | 30 (12.9) |
| CTB subtest completenessᵈ | | | |
| Paced serial addition | 489 (88.6) | | |
| Size ordering | 490 (88.8) | | |
| Listening span | 474 (85.9) | | |
| Digit-span backwards | 484 (87.7) | | |
| Visual coding | 478 (86.6) | | |

ᵃNumber of participants who completed three out of the five tests on the cognitive test battery (CTB) and had data for all moderator variables. ᵇThe percentage of the total sample, after attrition, which was still ineligible for analysis because of data missing on this variable. ᶜDivided into relevant subscales. ᵈNumber of participants who had complete data for this variable. BMI, body mass index; ADHD, attention-deficit–hyperactivity disorder; depcat, deprivation category.

The significance of the main effects and two-way interactions is reported in Table II, and the estimated regression coefficients and their standard errors are reported in Table III. These tables show the influence of the significant main effects of sex, primary level, ADHD symptom report, and counterbalance group on CTB performance (overall performance score). The counterbalance group main effect was also complicated by an interaction with the effect of the exercise intervention.

Table II.   The significance of fixed effects factors on overall cognitive test battery score, analysed using linear mixed modelling

| Fixed effect | Numerator degrees of freedom | Denominator degrees of freedom | F | p |
|---|---|---|---|---|
| Counterbalance group | 1 | 542 | 6.83 | 0.009 |
| Primary level | 3 | 542 | 61.39 | <0.001 |
| BMI classification | 2 | 542 | 2.59 | 0.076 |
| ADHD symptom report | 2 | 542 | 14.12 | <0.001 |
| Two-way interactions | | | | |
| Intervention×counterbalance group | 1 | 542 | 115.73 | <0.001 |
| Intervention×primary level | 3 | 542 | 1.59 | 0.190 |
| Intervention×BMI classification | 2 | 542 | 0.03 | 0.975 |
| Intervention×ADHD symptom report | 2 | 542 | 0.00 | 0.997 |

BMI, body mass index; ADHD, attention-deficit–hyperactivity disorder.

Table III.   Estimates of the regression coefficients for predictors of cognitive test battery performance

| Parameter | Estimated regression coefficient (SE) |
|---|---|
| Fixed effects | |
| Intercept | 43.26 (3.30)ᵃ |
| Exercise | 6.30 (2.30)ᵇ |
| Counterbalance groupᵈ | |
| A | 1.67 (1.28) |
| Sex | |
| Male | −5.39 (1.28)ᵃ |
| Primary level | |
| 4 | −21.58 (1.78)ᵃ |
| 5 | −9.91 (1.76)ᵃ |
| 6 | −6.03 (1.80)ᵃ |
| BMI classification | |
| Normal | 4.94 (2.33)ᶜ |
| Overweight | 3.84 (2.59) |
| ADHD symptom report | |
| No symptoms | 10.52 (2.26)ᵃ |
| Subclinical symptom level | 6.24 (2.23)ᵇ |
| Clinical symptom level | – |
| Intervention×counterbalance group | |
| Exercise in group A | −9.58 (0.89)ᵃ |
| Control in group A | – |
| Exercise in group B | – |
| Control in group B | – |
| Intervention×sex | |
| Exercise in males | 0.39 (0.89) |
| Control in males | – |
| Exercise in females | – |
| Control in females | – |
| Intervention×primary level | |
| Exercise in primary 4 | −0.56 (1.24) |
| Control in primary 4 | – |
| Exercise in primary 5 | −2.54 (1.24)ᶜ |
| Control in primary 5 | – |
| Exercise in primary 6 | −0.95 (1.26) |
| Control in primary 6 | – |
| Exercise in primary 7 | – |
| Control in primary 7 | – |
| Intervention×BMI classification | |
| Exercise with normal weight | 0.02 (1.63) |
| Control with normal weight | – |
| Exercise with overweight | 0.23 (1.80) |
| Control with overweight | – |
| Exercise with obese | – |
| Control with obese | – |
| Intervention×ADHD symptom report | |
| Exercise with no symptoms | −0.09 (1.58) |
| Control with no symptoms | – |
| Exercise with subclinical symptom level | −0.02 (1.56) |
| Control with subclinical symptom level | – |
| Exercise with clinical symptom level | – |
| Control with clinical symptom level | – |
| Random parameters | |
| Participant (nested within school) | 162.41 |

Reference levels within each categorical variable are represented by a dash. ᵃp<0.001; ᵇp<0.01; ᶜp<0.05. ᵈA, exercise in week 1, control in week 2; B, opposite of A. ADHD, attention-deficit–hyperactivity disorder.

Pairwise comparisons were conducted, contrasting the mean overall CTB performance (range 0–100) between groups and conditions of interest. Females significantly outperformed males on the CTB (mean difference [95% confidence interval, CI] 5.20 [2.84–7.55]; p<0.001). Performance significantly improved with primary level; children in primary 5 outperformed those in primary 4 (mean 10.68 [CI 6.47–14.88]; p<0.001), children in primary 6 outperformed those in primary 5 (4.67 [CI 0.41–8.93]; p=0.025), and children in primary 7 outperformed those in primary 6 (6.51 [CI 2.12–10.85]; p<0.001). Additionally, Vanderbilt ADHD Diagnostic Parent Rating Scale classification was significant: zero symptom report scorers outperformed subclinical symptom report scorers (4.26 [CI 1.32–7.19]; p=0.002) and those with subclinical levels outperformed those with clinical levels (6.22 [CI 1.31–11.14]; p=0.009).

The interaction between the intervention and counterbalance group was similar in nature to the one noted in our previous study.3 The effect of the intervention on CTB performance was influenced by the counterbalance group. In week 1 the exercise and baseline groups showed no significant difference between one another (−2.40 [CI −5.99 to 1.19]; p=0.314). In week 2 the exercising group significantly outperformed the non-exercising group (3.85 [CI 0.26–7.44]; p=0.030).


These results indicate that the findings of Hill et al.3 are robust and can be replicated in different schools with a more diverse socio-economic population. Both studies found that a classroom-based exercise regime conducted after lunch can improve performance on a CTB at the end of the school day and were in agreement that these effects were moderated by an interaction with the counterbalance group. This is a significant finding given that this study is only the second to date (Hill et al.3 being the first) to demonstrate benefits of exercise upon childhood cognition while using a rigorous interparticipant crossover experimental design.

Nevertheless, we found no evidence to support our conjecture that participants’ BMI might moderate the amount of cognitive benefit that they gain from exercise,6 suggesting that the benefits of exercise are similar for all children, irrespective of their weight. Alternatively, heavier children may simply have performed at a lower level of activity or exercised at a relatively similar rate while self-adjusting for weight.

We also found no relationship between participants’ scores on an ADHD questionnaire and the magnitude of cognitive benefit that they received from exercise, suggesting that the benefit on attentional abilities conferred by exercise is independent of the level of ADHD symptoms. This finding may be relevant if practitioners are considering applying exercise routinely in educational settings. Our findings suggest that any such programmes may be as usefully applied to whole groups of children as to selected individuals with recognized attention difficulties.8

It is also feasible that socio-economic status may have been a variable moderating the benefits of exercise. Unfortunately, we were forced to exclude depcat from our statistical analysis owing to skewed participation rates between depcat levels and were unable to assess this hypothesis empirically.

Missing data, a potential source of bias, were reasonably minimal for the moderator variables and the individual subtests of the CTB (all <15%). Carryover effects from one week to the next were unlikely, given that the effect of the exercise intervention was anticipated to be acute. Long-term (chronic) improvements in cognitive performance have been documented only for much longer interventions.25

Interestingly, in both this study and that by Hill et al.,3 the benefits of exercise were confined to the second week of the study. Exercise boosted performance on the second administration of the CTB, over and above any improvements from practice, while it had no effect on performance the first time the CTB was administered. This would seem to argue against the notion that exercise has a direct effect on the cognitive functions required for the tracking tasks and to suggest instead that exercise enhances the capacity to utilize previously encoded memories (i.e. participants remember effective task strategies). Animal studies9 have shown that exercise enhances synaptic functioning during learning and memory formation (i.e. it promotes long-term potentiation). Furthermore, in a spatial learning task, hippocampal BDNF facilitated recall as well as learning.10 Therefore, if the first testing session resulted in learning (‘learning to learn’) that was encoded in the form of long-term potentiation, the effect of the exercise intervention would have been to enhance the recall of that learning in the second testing session.

The mean improvement (3.85) compares well with the mean improvement seen by primary level or the mean deterioration by ADHD symptom report level. However, these comparisons should be viewed cautiously given the CI (0.26–7.44) surrounding the estimate of the exercise effect. While the effect of exercise on cognition is statistically significant, its size is poorly estimated and conservative interpretation (based on the lower end of the CI) would suggest only a small mean improvement due to exercise. More research needs to be conducted to determine accurately the effect size of exercise on cognition.

In conclusion, two large-scale randomized controlled trials (out of only two well-controlled trials that we know to have been conducted) have found that classroom-based exercise can improve cognitive function in children. In both studies the effect was limited to the second testing session, indicative of a specific enhancement of practice effects. This effect on learning could be explained by an up-regulation of BDNF in response to exercise.10 In animal studies9 similar effects on cognition after acute (rather than chronic) exercise have been shown, and these were mediated by changes in BDNF. We found no evidence that this effect was moderated by BMI, sex, or level of ADHD symptoms, suggesting that the benefits of exercise might be best maximized by incorporating exercise more regularly as a component of mainstream education.


The research received funding and support from Aberdeen City Council. The corresponding author received a scholarship from the Medical Research Council while conducting this research. We are extremely grateful to all of the teachers, parents, guardians, and especially children who made the research possible. We would also like to thank Dr Helen Brown for her advice on using linear mixed-effects modelling.