N‐back training and transfer effects revealed by behavioral responses and EEG

Abstract
Introduction: Cognitive function performance decreases in older individuals compared to young adults. To curb this decline, cognitive training is applied, but it is not clear whether it improves only the trained task or also other cognitive functions. To investigate this, we considered an N‐back working memory (WM) training task and verified whether it improves both trained WM and untrained cognitive functions.
Methods: As EEG studies have noted task difficulty and age‐related changes in time‐locked EEG responses, called event‐related potentials (ERPs), we focused on the relation between the P300 ERP component, task difficulty level, and behavior response accuracy and reaction time (RT) in young and older healthy adults. We used two groups of young and older healthy participants to assess the effect of N‐back training: cognitive training group (CTG) and passive control group (PCG). Before and after training, cognitive tests were administered to both groups to evaluate transfer effects.
Results: Despite the observed age‐related differences in the P300 ERP component and in terms of RT and accuracy, our findings demonstrate a stronger improvement in the trained task for older CTGs compared to younger CTGs, larger near‐ and far‐transfer effect to WM and fluid intelligence for both younger and older CTGs, and a far‐transfer effect to attention but only for older adults. Significant differences in response accuracy were shown between young and older subjects in spatial memory and attention tests.
Conclusion: The application of a WM training is a promising tool for both healthy adults, and in particular for older subjects, as it showed physiological and behavioral differences in cognitive plasticity across life span and evidence of benefits in the trained task and near‐/far‐transfer effects to other cognitive functions.


| INTRODUCTION
Cognitive decline has been well documented in healthy older adults across different cognitive domains (attention, working memory [WM], spatial memory, reasoning) (for review, see Au et al., 2015; Karbach & Verhaeghen, 2014; Soveri, Antfolk, Karlsson, Salo, & Laine, 2017) and, in view of the rapidly increasing elderly population, will be a growing concern of healthcare organizations in the near future. Given that decline in WM, one of the main cognitive functions, is paralleled by neurochemical, structural, and functional changes in the aging brain (Bopp & Verhaeghen, 2005), the study of cognitive decline during one's life span, and, more importantly, what can be done to slow it down, has gained interest from the research community. Motivated by the alleged ability to rekindle plasticity processes in the brain, cognitive training has been promoted as effective in improving cognitive function performance after extensive training (Blacker, Negoita, Ewen, & Courtney, 2017; Kundu, Sutterer, Emrich, & Postle, 2013; Yang, Krampe, & Baltes, 2006).
However, cognitive training studies that provide direct evidence of transfer effects are still scarce and the outcomes mixed, even though the ultimate aim of cognitive training is to go beyond what has been trained and eventually to improve one's quality of life (Brehmer et al., 2012; Klingberg, 2010; Richmond, Morrison, Chein, & Olson, 2011). Transfer effects to untrained tasks are typically classified into near and far. In the first case, the untrained task also relies on WM; in the latter case, it relies on other cognitive functions in addition to WM, such as reasoning, intelligence, and attention (Klingberg, 2010).
When targeting transfer effects in relation to WM training, the choice of the transfer task should, in our opinion, be based on two factors: transfer effects after WM training reported in literature studies and the relationship with the trained task. There have been reports of effective WM training improvements in untrained tasks, such as spatial WM, attention, and fluid intelligence, in young (Anguera et al., 2013; Jaeggi et al., 2008, 2010) and older adults (Borella et al., 2010, 2013; Li et al., 2008). Also, significant correlations were found between WM decline in older adults and inhibition and processing speed by Borella, Carretti, and Beni (2008), and between WM and fluid intelligence in terms of the temporary retention of a certain amount of information (Kyllonen & Christal, 1990) and of attentional control processes (Salthouse, Pink, & Tucker-Drob, 2008). Considering these characteristics, we will verify whether near- and far-transfer effects can be observed following WM training. When using a WM task, we will stay in line with the WM model defined by Baddeley (1992). It refers to a cognitive system that provides temporary storage and manipulation of information necessary to execute complex cognitive tasks. More recently, the WM model proposed by Oberauer (2009) considers WM as "a blackboard for information processing on which we can construct new representations with little interference from old memories" and requires six elements for a WM system, dividing declarative and procedural WM. From our point of view, also supported by Baddeley (2012), this theory is very complex and could be difficult to evaluate experimentally.
Our WM training relies on an N-back task, a WM task introduced by Kirchner (1958) as a visuospatial task with four load factors ("0-back" to "3-back"), and by Mackworth (1959) as a visual letter task with up to six load factors. The task involves multiple processes: WM updating, which includes the encoding of incoming stimuli, the monitoring, maintenance, and updating of the sequence, and stimulus matching (matching the current stimulus to the one that occurred N positions back in the sequence). It reflects a number of core executive functions (EFs) besides WM, such as inhibitory control and cognitive flexibility, as well as other higher order EFs such as problem solving, decision making, and selective attention (Kane & Engle, 2002). It has been shown that the N-back task consistently activates the dorsolateral prefrontal cortex as well as parietal regions in the adult brain (Owen, McMillan, Laird, & Bullmore, 2005). Schneiders, Opitz, Krick, and Mecklinger (2011) have shown that with N-back training it is possible to achieve an improvement in performance and an alteration in brain activity, such as a decreased activation in the right superior middle frontal gyrus (Brodmann area [BA] 6) and posterior parietal regions (BA 40).
The aim of the present study is to verify whether N-back task performance improves during N-back training and EEG recording, and whether transfer effects to other (untrained) cognitive functions can be observed, such as spatial memory, attention, and fluid intelligence, in two different groups of healthy young and older subjects. Although mixed results have been reported (Clark, Lawlor-Savage, & Goghari, 2017; Lawlor-Savage & Goghari, 2016; Salminen, Frensch, Strobach, & Schubert, 2015; Stephenson & Halpern, 2013), in light of the results obtained in previous studies for both near- (Li et al., 2008; Stephenson & Halpern, 2013) and far-transfer effects (Jaeggi et al., 2008, 2010) in healthy young adults and near- (Heinzel et al., 2014; Stepankova et al., 2013) and far-transfer effects (Borella et al., 2010; Heinzel et al., 2016) in healthy older adults, we hypothesize that improvements in the trained task and near- and far-transfer effects are observed in both age-groups, with greater gains in young compared to older adults. Besides behavioral responses, we will also record ERP responses as they have been shown to reflect the time course of cognitive and sensory processes during cognitive task performance.
We will thereby focus on the P300, a positive ERP component appearing approximately 300 ms after stimulus presentation, as it has been related to updating WM (Smith-Spark & Fisk, 2007), to executive functions (Finnigan, O'Connell, Cummins, Broughton, & Robertson, 2011;Zanto, Toy, & Gazzaley, 2010), and to the neural mechanisms behind training-induced performance changes.

| Subject recruitment
We recruited 18 healthy young subjects (seven females and 11 males, mean age 26.15 years, range 21-34 years), undergraduate and graduate students and staff of KU Leuven University, and 28 healthy older subjects (15 females and 13 males, mean age 63.11 years, range 53-69 years), the latter recruited via posters, social media, and the university's Academic Center for General Practice.
Participants were healthy, reported normal or corrected vision and no history of psychiatric or neurological diseases, were not on medication, and had never participated in WM training (Table 1).

| Cognitive training program
Healthy younger participants were assigned to two subgroups, cognitive training group (CTG, N = 9) and passive control group (PCG, N = 9), and healthy older subjects to two subgroups, CTG (N = 14) and PCG (N = 14), and the results were used to evaluate improvements in WM task performance and to detect transfer effects to other cognitive tasks.
Cognitive training group-young participants performed WM training (1-, 2-, 3-back task) with visual feedback on the correctness of their behavioral responses and monetary reward (max. 10 €/session), while PCG participants did not undergo any training. We decided to give them not only monetary reward but also feedback because, according to self-determination theory, intrinsic motivation plays an important role in engaging individuals in activities, giving a sense of satisfaction and increasing performance (Deci & Ryan, 1985). During all training sessions, EEG was recorded.
In the first experimental session (pretest), each participant was informed about the experimental procedure and invited to sign the informed consent form. The next day, the participants performed the behavioral pretest session, and from the third meeting onward, the CTGs (young and older) started the training sessions. The study was approved by our university's ethical committee.

| Stimuli
For the N-back stimuli, pictures of meaningful objects were used, presented for 1,000 ms followed by a 2,000-ms interstimulus interval to which a jitter of ±100 ms was added and during which the picture was replaced by a fixation cross (Figure 2). This was the moment when participants were required to press a button on the keyboard (33% of the pictures were targets). If the response was correct, a green face (visual feedback) appeared on the screen, and if it was wrong, a red face appeared. We opted for colorful pictures that are easy to understand not only for healthy subjects, but also for patients with cognitive decline, in view of future studies.
The sequences with identical difficulty levels (1-, 2-, 3-back) were grouped into 2-min blocks across four sessions. Each session included two repetitions of three sequences with increasing load level (i.e., from 1- to 3-back). In total, there were eight blocks. For each sequence, there were 60 stimuli, presented in pseudorandom order. Before starting with the three sequences, a practice run consisting of 10 stimuli for each difficulty level was used to explain our N-back task.
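To make the stimulus description concrete, the following minimal sketch builds one pseudorandom N-back sequence of 60 stimuli in which roughly 33% are targets, together with jittered interstimulus intervals. It is an illustration only, not the presentation software used in the study: the function names, the pool of 20 picture identifiers, the exact placement rule for targets, and the uniform distribution of the jitter are all assumptions.

```python
import random


def make_nback_sequence(n, length=60, target_rate=0.33, n_items=20, seed=None):
    """Return (sequence, target_positions) for one N-back block.

    Items are integer picture IDs in range(n_items); n_items is hypothetical.
    """
    rng = random.Random(seed)
    n_targets = round(target_rate * length)
    # A target can only occur once N earlier stimuli exist.
    target_positions = set(rng.sample(range(n, length), n_targets))
    sequence = []
    for i in range(length):
        if i in target_positions:
            # Target: repeat the picture shown N positions back.
            sequence.append(sequence[i - n])
        else:
            # Nontarget: any picture except the one shown N positions back.
            candidates = [k for k in range(n_items)
                          if i < n or k != sequence[i - n]]
            sequence.append(rng.choice(candidates))
    return sequence, sorted(target_positions)


def jittered_isis(length=60, isi_ms=2000, jitter_ms=100, seed=None):
    """Interstimulus intervals of isi_ms +/- jitter_ms (uniform jitter assumed)."""
    rng = random.Random(seed)
    return [isi_ms + rng.uniform(-jitter_ms, jitter_ms) for _ in range(length)]
```

Calling `make_nback_sequence(3, seed=0)` yields a 3-back block; only the positions listed in the second return value match the picture shown three steps earlier.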

| Transfer effect assessment
All participants were administered a battery of pre- and post-tests to evaluate whether there are transfer effects to other cognitive functions (attention, spatial memory, and fluid intelligence) and to assess the effect of using a different version of the N-back task (without visual feedback and nonadaptive  (Raven & John Hugh Court, 1998). The behavioral pre- and post-tests were administered to compare task performance between groups (CTG and PCG) for the untrained tasks (nonadaptive N-back, TOVA, CORSI, and RAVEN test).
In view of possible transfer effects, it is interesting to note similarities and differences between the N-back task and the CORSI test (Persson & Reuter-Lorenz, 2008; Zhao, Wang, Liu, & Zhou, 2011): Both the CORSI and the N-back task measure WM and the capacity for temporarily retaining information, but the CORSI test simply quantifies the spatial span and calls upon the recollection process of previously presented items, while the N-back task is a more complete task as it involves several cognitive processes, including WM updating, and adopts different task rules by using recognition of previously pre-
to indicate the magnitude of the significant differences. (Cohen, 1973) were reported (η² = SS_effect/SS_total, where SS = sums of squares) for significant differences.
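The effect-size definition η² = SS_effect/SS_total can be illustrated with a short sketch. For simplicity it assumes a one-way design with a single grouping factor, which is simpler than the three-way ANOVA reported in this study; the function name and the example data are hypothetical.

```python
def eta_squared(groups):
    """Eta squared for a one-way design: SS_effect / SS_total.

    `groups` is a list of lists, one list of observations per condition.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Total sum of squares: every observation against the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # Effect sum of squares: group means against the grand mean,
    # weighted by group size.
    ss_effect = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                    for g in groups)
    return ss_effect / ss_total


# Hypothetical scores for two conditions.
example = eta_squared([[1, 2, 3], [4, 5, 6]])
```

Multiplying the returned proportion by 100 gives the percentage form in which η² is reported in the results.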

| Working memory training: Behavioral results
For CTG-young participants, we observed that RT decreased with the number of training sessions. To test this, we performed t tests between the first and middle sessions, between the first and last sessions, and between the middle and last training sessions.
We found a significant effect between the first and last sessions for 3-back (p < 0.05, d = 1.72) and between the first and middle sessions for 3-back (p < 0.05, d = 1.34), indicating that RT decreased significantly between the first and middle sessions but not between the middle and last sessions, for which we did not find any significant result. In contrast, when we looked at accuracy, the main effect of session was not significant (p = 0.31), indicating that accuracy did not substantially increase as a result of training, although there was a trend of improvement between the first and middle sessions. Interestingly, comparing the middle and last sessions, we observed a decrease in performance, probably due to boredom of the young subjects.
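A session-to-session comparison of this kind can be sketched as follows. The text does not state which t test variant or which formula for Cohen's d was used, so this example assumes a paired t test on per-participant mean RTs and a d based on the pooled standard deviation of the two sessions. The RT values are invented for illustration.

```python
import math
import statistics


def paired_t(first, last):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [a - b for a, b in zip(first, last)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


def cohens_d(first, last):
    """Cohen's d using the pooled SD of the two sessions (assumed variant)."""
    pooled_sd = math.sqrt((statistics.variance(first)
                           + statistics.variance(last)) / 2)
    return (statistics.mean(first) - statistics.mean(last)) / pooled_sd


# Hypothetical mean RTs (ms) of four participants, first vs. last session.
rt_first = [600, 620, 640, 660]
rt_last = [550, 560, 600, 610]
t_value, df = paired_t(rt_first, rt_last)
d_value = cohens_d(rt_first, rt_last)
```

The p value would then be read from a t distribution with `df` degrees of freedom, which is left out here to keep the sketch free of external libraries.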
We also examined accuracy and RT during N-back training of older adults of CTG (Figure 4). We performed a t test analysis between the first and middle sessions, the first and last session, and the middle and last sessions. We found for RT a significant effect

| Working memory training: EEG results
As neuroimaging studies have shown that, during N-back task performance, the most activated brain regions are the lateral premotor cortex, dorsal cingulate and medial premotor cortex, dorsolateral and ventrolateral prefrontal cortex, frontal poles, and medial and lateral posterior parietal cortex (Gevins et al., 1990), and that the midline electrodes are the most significant ones (Mahncke et al., 2006; Watter, Geffen, & Geffen, 2001), we decided to analyze ERPs, more specifically the P300, using electrodes located over these areas: Fz, Pz, and Cz. Figures 5 and 6 show P300 amplitudes (250-400 ms) in three different sessions during training (first, middle, and last sessions) for CTG in young and older adults.
to the 1- and 2-back tasks. Taken together, these data support the observation that the P300 amplitude decreases with increased task load/difficulty, but that with N-back training, it is possible to reverse the process, as revealed by an increased P300 amplitude for the 3-back task compared to the easier ones (1- and 2-back).
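The P300 measure used here (target minus nontarget, 250-400 ms) can be sketched for a single channel as below. This is a simplified illustration, not the authors' EEG pipeline: it assumes already preprocessed, baseline-corrected epochs stored as plain lists of samples, and the function names and toy data are hypothetical.

```python
def average_trials(trials):
    """Average equally long single-trial epochs sample by sample (the ERP)."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]


def mean_amplitude(erp, times_ms, start_ms=250, end_ms=400):
    """Mean amplitude of an ERP within the given latency window."""
    window = [v for v, t in zip(erp, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples inside the requested window")
    return sum(window) / len(window)


def p300_difference(target_trials, nontarget_trials, times_ms):
    """Mean 250-400 ms amplitude of the target-minus-nontarget difference wave."""
    target_erp = average_trials(target_trials)
    nontarget_erp = average_trials(nontarget_trials)
    difference = [a - b for a, b in zip(target_erp, nontarget_erp)]
    return mean_amplitude(difference, times_ms)


# Toy single-channel data sampled every 100 ms (values in microvolts).
times = [0, 100, 200, 300, 400, 500]
targets = [[0, 0, 0, 4, 6, 0], [0, 0, 0, 6, 8, 0]]
nontargets = [[0, 0, 0, 1, 1, 0]]
p300 = p300_difference(targets, nontargets, times)
```

In practice the same computation would be repeated per channel (Fz, Cz, Pz), load level, and session before entering the values into the ANOVA.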
We also analyzed P300 amplitudes of the midline electrodes (Fz, Cz, Pz) with a three-way ANOVA (N-back, target, and session) for CTG-old participants. We found significant effects for the interaction between first and middle sessions × target for 3-back in channel Pz, F(1) = 4.2120, p < 0.05, η² = 8.15%, and for the interaction between the first and last sessions × target for 3-back in channel Pz, F(1) = 14.2780, p < 0.001, η² = 11.84%. Compared to the healthy young subjects, the P300 amplitude (target minus nontarget) was significant in the parietal area, while for young subjects, it was in frontal and central areas. Furthermore, the P300 amplitude was higher for the N-back tasks that were easier (1- and 2-back) and lower for the more difficult one (3-back). In this case, after training the older adults, the P300 amplitude increases for the most difficult task (3-back), showing that the P300 amplitude decreases with increasing task load/difficulty and that N-back training can change the neural response of the subject (Figure 6). These findings confirm the results of Gevins and Smith (2003) who reported that training on an N-back task shows EEG changes in responses to changes in the mental effort required for task performance.
FIGURE 5 Event-related potentials (ERPs) during the training for cognitive training group (CTG young). P300 amplitude difference (target minus nontarget) shown for channels Fz, Cz, and Pz for the first (black curves) and middle sessions (red curves) (left three columns) and first (black curves) and last (red curves) sessions (right three columns) of nine young adults of the CTG. Significance was measured using three-way ANOVA (p < 0.01). Error bars indicate SEM
in channel Pz, F(1) = 11.6941, p < 0.001, η² = 12.45%. We did not find any significant difference between young and older adults in the middle training session.

| Transfer effects (pre-and post-tests)
Percent correct responses (means and standard deviation) for each task are presented in Tables 2 and 3 for healthy young and older subjects, for pre- and post-tests. We did not find any significant intragroup differences in pretest performance for the healthy young and older adults, while we found significant intergroup differences be- (Figure 9). Significant effects were found for RT between CTG and PCG for pre-post N-back task (p < 0.05, d = 0.48) (Figure 10).
Since ,  showed that younger adults benefit more from cognitive training than older adults and Bherer et al. (2005) showed the opposite (older subjects gained more positive effects than younger ones), we also analyzed the differences between young and older adults for CTG and for PCG separately. We used a t test for the factors age-group (young and older) and pre-post training. Significant pre-post training differences were found in accuracy for CTG (young vs. older) for TOVA (p < 0.05, d = 1.08) and CORSI (p < 0.05, d = 1.1), and for PCG (young vs. older) for TOVA (p < 0.05, d = 1.57) and CORSI (p < 0.05, d = 2.1). Finally, we considered Pearson's correlation between P300, accuracy, and RT.
Our results revealed significant correlations in healthy older adults between 1-back P300 and RT (p < 0.05, r = 0.49783), between 2-back P300 and RT (p < 0.0001, r = 0.96356), and between 3-back P300 and RT (p < 0.05, r = 0.52979) in channel Fz; between 2-back P300 and RT (p < 0.001, r = 0.89268) and between 3-back P300 and RT (p < 0.001,
TABLE 3 Pre- and post-test performance (accuracy) of training (N = 14) and passive control groups (N = 14) of older healthy subjects. Conventions are as in Table 2
FIGURE 7 Accuracy pre-post tests (TOVA, RAVEN, CORSI) for two groups of young adults (CTG and PCG). Pre- and post-test performance (in % correct responses) of young adults of the CTG (left) and PCG (right) for the N-back task, TOVA, CORSI, and RAVEN. Error
While significant correlations were found between P300 and both accuracy and RT for older adults, we found a significant correlation only between RT and P300 for young adults. Our results showed significant correlations between RT and 1-back P300 (p < 0.001, r = −0.99202) and between RT and 3-back P300 (p < 0.01, r = 0.89219) in channel Cz; and between RT and 1-back P300 (p < 0.05, r = −0.764) and between RT and 3-back P300 (p < 0.001, r = 0.9716) in channel Pz.
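For reference, Pearson's r between per-participant P300 amplitude and RT can be computed as in the sketch below. The function name and the example values are hypothetical and do not reproduce the data of this study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical P300 amplitudes (microvolts) and mean RTs (ms) of five participants.
p300_amplitude = [2.1, 3.4, 1.8, 4.0, 2.9]
reaction_time = [640, 590, 660, 560, 610]
r = pearson_r(p300_amplitude, reaction_time)
```

A negative r, as in this toy example, means that participants with larger P300 amplitudes tended to respond faster.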

| DISCUSSION
The main purpose of our study was to investigate whether cognitive training improves only trained task performance or also transfers to other cognitive tasks. To verify this, we performed a study where we subjected a group of healthy young and older subjects to 10 N-back training sessions, and assessed their performance on a battery of untrained cognitive tasks (different version of N-back, TOVA, CORSI, and RAVEN tests) before and after training. To assess whether level of task difficulty affected training outcome, we considered groups of young and older participants (CTG) that performed the 1-, 2-, 3-back version of the N-back task and other groups of young and older participants (PCG) that performed no training but were subjected to the same pre- and post-test battery. We found for our CTG of healthy young subjects that training indeed improves N-back task performance compared to PCG. Additionally, their transfer effects to other untrained tasks were significant for near-transfer tasks using an untrained, nonadaptive version of the N-back task (WM task) and for a far-transfer task, RAVEN, that measures fluid intelligence, also compared to PCG. Furthermore, Cohen's d also confirmed the large effects after WM training in the pre- and post-tests. As mentioned above, transfer effects were found for tasks that overlap in terms of cognitive processes involved, for the untrained N-back version, in WM ability, and for the RAVEN test, in temporary information retention and attentional control processes (Jaeggi et al., 2008). These results showed clearly that it was not a test-retest effect, as PCG did not show any significant differences in pre- and post-tests.
difficult to allocate attention to spatial differences, as the CORSI test requires. In contrast, the untrained N-back transfer task differed in feedback and task difficulty level, but the stimuli were only visual, not spatial, as for the trained task.
Additionally, although the TOVA shares the attentional process with the trained task, it also involves a spatial process that was not trained for. For our healthy older subjects, we found significant improvements in N-back, TOVA, and RAVEN test performance for CTG compared to PCG. Also, these results support the aforementioned theory of cognitive processes overlapping different tasks. In addition, although unexpected, we found a significant difference in TOVA for attention. The significance of these findings was clearer for older subjects compared to young adults because attentional control decreases with age (Karbach & Verhaeghen, 2014), and there is more room for improvement compared to young individuals. We do not think that experimental conditions, in terms of task or training features and the presence of the researcher, might have affected these results, as we kept the same conditions in each session for both groups. We do believe that individual characteristics, such as expectation and motivation, could have affected our results (cf. Deci & Ryan, 1985). These results are in line with the studies of Yang et al. (2006) and Bherer et al. (2005) who showed that, also in the aging brain, the capacity for plasticity improves cognitive functioning. However, the present findings are in contrast with other studies that showed greater improvements for young subjects compared to older subjects. In general, we found that both young and older adults achieved gains in N-back task performance after training, but this did not transfer to the same extent to untrained functions, as for older adults we reported improvements in N-back (near-transfer effect) and RAVEN (far-transfer effect) and also in TOVA (far-transfer effect), while in young adults only in N-back and RAVEN, indicating greater improvements for older adults (Bherer et al., 2005).
We verified our initial hypothesis by showing improvements in the trained task and near- and far-transfer effects, although it was surprising to find a larger gain for older adults compared to young subjects, and no transfer to CORSI although it is a near-transfer task (Klingberg, 2010).
Additionally, we tested whether differences in P300 amplitude were visible after five sessions or whether it was necessary to consider 10 sessions. We observed significant differences in P300 amplitude between first and middle sessions and between first and last sessions (target minus nontarget) for young and older subjects, complementing the study of Friedman and Simpson (1994) who used a simple oddball paradigm to observe differences in ERP amplitude between young and older adults. Our results showed a higher P300 for young adults in frontal, central, and parietal areas, especially for the first five sessions. The P300 of the most difficult N-back task (3-back) was most strongly affected by training, as it increased in the last session of training compared to the easier tasks, 1- and 2-back. Also, older adults showed a higher P300 amplitude after five and ten sessions of N-back training, and no significant differences between the middle and last sessions. Compared to the healthy young subjects, the P300 amplitude was stronger over the parietal area. As for young adults, P300 amplitude became higher for the most difficult task, showing the effectiveness of N-back training. These findings for both training groups showed significant differences between the first five sessions, but not for the last five sessions, suggesting that five training sessions could be enough to reach significant improvements in P300 for the trained task in healthy adults. In general, our initial hypothesis was verified as we measured an increase in P300 amplitude with training, indicating that the task was easier for the participants. In particular, the most significant effect was found for the highest difficulty level, the 3-back task, suggesting a large improvement in storage, manipulation, and updating processes involved in the N-back task.
In summary, in light of our results, an issue that deserves further consideration is why N-back training in the study with young and older adults did not produce significant near-transfer effects in a similar task (CORSI), whereas in contrast, ,  observed a near-transfer effect to another memory task. We hypothesize that this might be due to specific task-trained features and strategies developed by the participants during training that could help in storing and manipulating information.

| CONCLUSION
Studying cognitive plasticity across different epochs in one's life span has become very important given the steadily increasing life expectancy. We decided to investigate the potential of cognitive training to compensate for age-related cognitive decline and provided evidence of beneficial effects, both in healthy young and in older subjects. In addition, the cognitive decline in WM is also supported by decline in the frontoparietal regions that have an important role in WM (Rajah & D'Esposito, 2005). We noticed differences in P300 responses for young and older adults and showed that N-back training not only improves WM but also transfers to attention and fluid intelligence for young and older adults. These results provide evidence for brain plasticity, in particular in older adults, although the degree and extent of it are expected to decrease with age. In the future, we want to repeat the same experiment with older adults performing a multisensory N-back task, more specifically a dual (visual and auditory) N-back task, as Salminen et al. (2016) found larger improvements compared to a single N-back task in young subjects, and their use of specific strategies during task performance. It remains to be seen whether these tasks can be used in practice to maintain or even improve quality of life across ages.