The impact of augmented reality on cognitive load and performance: A systematic review

Background: Previous studies on augmented reality (AR)-enriched learning and training have reported conflicting results regarding the cognitive load involved: some authors report that AR can reduce cognitive load, whereas others have shown that AR is perceived as cognitively demanding and can lead to poorer performance. Objectives: The aim of this study is to systematically analyse previous research on AR and cognitive load, including performance, and thus to contribute to answering the question of how AR affects cognitive load when used in learning environments. Methods: This study applied the systematic review method. A total of 58 studies were identified and analysed using rigorously defined inclusion and exclusion criteria. The results are reported as a synthesis. Results and Conclusions: Compared to other technologies, AR seems to be less cognitively demanding and also leads to higher performance. However, these results are based on media comparison studies, which have been criticized for years. Spatial AR performs better than see-through AR. However, the latter can be improved by visual cues and the addition of learning activities, as value-added studies have revealed. Major takeaways: The essential findings of this study are that the technology used, for example, AR glasses, can unnecessarily increase cognitive load and that many studies still focus on the comparison of AR with more traditional media. Fewer studies applied alternative research designs, for example, value-added comparisons. However, such designs are better suited to investigating design principles for AR-enriched learning environments, which can in turn reduce cognitive load as well as positively affect performance.

illustrate. In these studies, the potential of AR is primarily related to constructivist and situated learning approaches, like inquiry- or problem-based learning, in science, technology, engineering and mathematics education (Ibáñez & Delgado-Kloos, 2018), medical and health care education (Pugoy et al., 2016; Zhu et al., 2014) as well as for skills acquisition in assembly (Wang et al., 2016), manufacturing (Bottani & Vignali, 2019) and surgery (Meola et al., 2017).
Major challenges of AR in education reported by researchers are technological issues, usability and practical concerns, like not fitting into a traditional classroom set-up or not being appropriate for large group teaching scenarios (overview in Akçayır & Akçayır, 2017, pp. 7-8). Another issue is the risk of cognitively overloading learners when they interact with AR technology. However, study results are contradictory: Some authors, like Goff et al. (2018), Santos et al. (2014, 2016), Sommerauer and Müller (2014, 2018) and Tang et al. (2003), provide evidence that AR can keep cognitive load low or even reduce it, thus freeing up working memory capacities and facilitating learning. Other authors, like Akçayır and Akçayır (2017), Antonioli et al. (2014), Cheng and Tsai (2013), Dunleavy et al. (2009), van Kreveln and Poelman (2010) and Wu et al. (2013), show that AR is distracting and provides too much information at once while learners work on learning tasks. Both sides argue in terms of empirically validated principles from cognitive load theory (CLT; Sweller, 1988, 2011) and the theory of multimedia learning (Mayer, 2019b).
While the other challenges mentioned have been addressed in previous studies and solutions to overcome them have been proposed, such as AR-related pedagogical or instructional models (Cuendet et al., 2013; Drljevic et al., 2017; Wang, 2017) and standards for designing AR systems to increase usability (Guest et al., 2018; Wild et al., 2020), the cognitive load issue has not yet been systematically investigated in any study. Due to the contradictory results, we see an urgent need to expand knowledge on this topic. Therefore, the aim of our study is to systematically review the literature on AR and cognitive load to find out how and to what extent AR impacts cognitive load during learning. In order to unravel the relationship between these two entities, we took into account the following four aspects:
1. For what purpose was AR used in the studies?
2. How does AR affect cognitive load and performance compared to other media?
3. How do the different types of AR affect cognitive load and performance?
4. Which features can improve the effectiveness of AR regarding performance and cognitive load?
In the next sections, we provide details on the theoretical background on cognitive load in AR, our methodological approach and the coding of the data. Then, we present results of our systematic review. Finally, we discuss these results with regard to the four aspects mentioned above. The article closes with limitations, an outlook for future research and a conclusion.

| BACKGROUND
A widely accepted definition of AR is stated in Azuma (1997) and in Azuma et al. (2001, p. 34), considering three major characteristics:
1. Representation of real and virtual objects simultaneously in a real environment.
2. Interactions run in real time.
3. Alignment of real and virtual objects to each other (also known as geometrical registration).
For a long time, AR was limited to research environments, the military sector or sophisticated marketing studies. In 2004, the availability of tracking systems for mobile phones marked the starting point for today's AR systems (Arth et al., 2015; Kipper, 2013). In 2016, AR became known to a broader public with PokemonGo (Qiao et al., 2019), a location-based game using GPS data to match real world locations of the player (Paavilainen et al., 2017). In contrast, vision-based AR uses image recognition technology to blend digital content onto an object. Learners in Habig (2019) point the camera of a tablet at printed markers ('trigger images') to see a 3D model of the double helix in chemistry education. With spatial AR technology, digital information is displayed directly onto a physical object without the need to carry a device. This AR type is often used in informal learning environments, like museums, to visualize phenomena in science (Yoon et al., 2017).
Increasingly used are see-through AR systems, which depend on more expensive wearables, like AR glasses (Bower & Sturman, 2015). See-through AR provides an ego-centric presentation of information while the user is simultaneously gesturing with the whole body or interacting with other objects (Oh et al., 2018). In the corporate world, see-through AR is often called mixed reality, but in fact, following the definition in Azuma et al. (2001), using glasses or head-mounted displays to extend the real world through virtual content is a type of AR. More recently, with markerless AR, content is projected directly onto a free surface with the help of a smartphone's camera (Brito & Stoyanova, 2018) without the need for trigger images or markers. Qiao et al. (2019) estimate that web AR technology will be more future-proof, as no additional application is needed to display AR content. Users just open a web browser on a mobile device, enter a string in a search engine and click on 'watch in 3D'. To date, research on learning and training with AR is dominated by mobile types of AR, like vision-based and location-based applications, which include combinations of text, pictures, animations, videos and 3D objects (Arici et al., 2019; Ibáñez & Delgado-Kloos, 2018).

| AR-based learning and training
Learning and training with AR can be related to Mayer's (2002, 2014) cognitive theory of multimedia learning (CTML), as verbal and pictorial information is presented simultaneously. Basic assumptions for meaningful learning with multimedia, that is, application of knowledge to problem-solving, are the processing of words and pictures in two different channels (Paivio, 1991), learning as a generative activity (Wittrock, 1992) and the limited capacity of working memory as proposed in CLT (Sweller, 1988; Sweller et al., 2019). In the last 30 years, a large number of studies have provided evidence for the claims made in CTML, resulting in various multimedia design principles. The application of these principles in the design of instructional media contributes to improving learning for three reasons: First, they reduce extraneous processing, that is, extraneous cognitive load (ECL; Mayer & Moreno, 2003), through the signalling and redundancy principles, freeing up cognitive capacity (Mayer & Fiorella, 2014). Second, they manage essential processing so as not to overload working memory capacities through the segmenting or pre-training principle (Mayer & Pilegard, 2014). Third, they foster generative processing through generative learning strategies or social cues (voice principle, personalization principle and embodiment principle) to engage and motivate learners to put effort into learning with the multimedia instruction (Mayer, 2019b).
For AR, researchers argue that the unique features of this technology, like the annotation of the real world with virtual objects at the same time and place, are more beneficial for learning due to their potential to overcome the violation of these principles compared to more traditional media. As a result, unnecessary cognitive load might be reduced or kept low, thus promoting learning and task performance (Bressler & Bodzin, 2013; Goff et al., 2018; Santos et al., 2014, 2016; Sommerauer & Müller, 2014, 2018). Other authors argue that AR learning and training applications provide too much information as well as more distracting factors, like the devices used. Hence, the risk of cognitive overload is high and must be kept in mind when using AR for learning purposes (Akçayır & Akçayır, 2017; Antonioli et al., 2014; Cheng & Tsai, 2013; van Kreveln & Poelman, 2010; Wu et al., 2013).

| Cognitive load
Theoretically, the importance of considering cognitive load during instruction is grounded in the aforementioned CLT and the assumption of a human cognitive architecture consisting of a sensory register, a working memory with limited capacity and a long-term memory with unlimited storage size (Sweller, 1988; Sweller et al., 1998, 2019). Most important in research on CLT is working memory, its limited capacity and its interplay with long-term memory. Knowledge already stored in the form of schemata, that is, knowledge organized by chunking, can extend the working memory capacity and thus help to process more intellectual activities like problem solving. Therefore, the goal of instruction must be to support the construction of schemata in working memory by not overloading its capacities (Paas & Sweller, 2014; Sweller, 2011). Consequently, besides prior knowledge as a learner's individual causal factor influencing working memory capacity, the task itself with which learners acquire new knowledge has to be taken into account. For example, novice learners learn better with the help of worked examples than with unguided inquiry, while experts often do not need this help and can start directly with problem-solving tasks (Kalyuga, 2007; Wittwer & Renkl, 2010). This is due to the high element interactivity of complex tasks, which results in intrinsic cognitive load (ICL). To better deal with ICL, the second main load type in CLT, ECL, must be reduced. ECL is caused by instructional design, learners' characteristics (e.g., motivation) and the learning environment, and it can hinder learning. Reducing the unproductive ECL can stimulate germane processing, which helps learners to deal with ICL (Paas & van Merriënboer, 2020). It should be noted here that in earlier CLT research germane processing was treated as a load type of its own, namely germane cognitive load (GCL).
There was a long debate on how many load types are necessary to explain cognitive processing of information (for an overview see de Jong, 2010; Kalyuga, 2011), resulting in a redefinition of GCL as having a 'redistributive function from extraneous to intrinsic aspects of the task rather than imposing a load in its own right' (Sweller et al., 2019, p. 264). Another development of CLT concerns the incorporation of the evolutionary account of educational psychology suggested by Geary (2008). In CLT, this means that acquiring biologically primary knowledge, like learning to speak, happens independently of working memory limitations. As a result, learners may use biologically primary knowledge and skills during learning of biologically secondary knowledge, like reading and mathematics, to reduce ECL and optimize ICL (Paas & Sweller, 2012; Sweller, 2016). With this in mind, Choi et al. (2014) extended the classic CLT model, in which the learner, the task and their interaction are causal factors for cognitive load levels (Kirschner, 2002; Paas & Van Merriënboer, 1994), by including the physical learning environment as a causal factor in its own right. This means that environment-task, environment-learner, task-learner and environment-task-learner interactions may affect cognitive load (Choi et al., 2014, p. 229).
With regard to the learning tasks, empirical research has identified several principles for handling cognitive load during instruction, for example, the split-attention effect, the worked-example effect and the guidance-fading effect (Paas & van Merriënboer, 2020). These principles have implications for instructional design in general as well as for learning with multimedia technology. For example, presenting information in an integrated format is known in CTML as the temporal and spatial contiguity principle (Ayres & Sweller, 2014). The use of technology, here especially AR, to segment learning lessons and provide just-in-time information is also recommended within the 4C/ID framework (Mayer & Pilegard, 2014; Van Merrienboer & Kester, 2014).
It is also relevant in research on CLT to focus on learners' characteristics, like affective factors, which can reduce cognitive load and make learners more willing to engage with the given task or multimedia instruction (Mayer, 2019a; Paas & van Merriënboer, 2020). Collaborative and embodied learning are currently gaining more and more attention in research on how to manage learners' cognitive load. In terms of collaborative learning, it is stated that individual working memory limitations can be overcome through a collective working memory effect (Janssen et al., 2010; Kirschner et al., 2009a, 2009b, 2011). This so-called collaborative cognitive load theory (CCLT) is also important for multimedia learning considering computer-supported collaborative learning (CSCL; Janssen & Kirschner, 2020; Kirschner et al., 2018) and mixed reality learning environments (Johnson-Glenberg et al., 2014). The same applies to embodied learning, which is grounded in embodied cognition theory. From a CLT perspective, human gestures and movements may reduce cognitive load and foster germane processing through outsourcing information processing to another modality, that is, physical embodiment (for an overview see Sepp et al., 2019, p. 299f). In multimedia research, this effect is also known as enactment and was shown to be effective, for example, during video instruction (Fiorella et al., 2017).
The impact of the learning environment on cognitive load primarily comprises factors which can distract from or promote learners' engagement with the learning material or task. Choi et al. (2014) mention cognitive, physiological and affective effects of the physical learning environment: for example, a noisy learning environment increases ECL, while a classroom perceived as emotionally positive can foster germane processing. Evidence for the environmental impact of co-actors in a learning situation was recently found by Skuballa et al. (2019). With regard to multimedia learning, it was found that an immersive virtual reality (IVR) lab for science learning can distract learners, resulting in poorer learning outcomes compared to traditional digital slides (Makransky et al., 2019). In contrast, Kyza (2017, 2018) demonstrated that location-based AR-induced immersion is linked to cognitive engagement and thus learning gain.
To measure how these factors might affect learning, CLT researchers suggest three assessment aspects: mental load, mental

| METHOD
Does AR increase or decrease cognitive load? To answer this question, we analysed available studies on this controversial issue based on a systematic review of pertinent research. For our systematic review, we had to define a search string and inclusion/exclusion criteria and decide where to search to identify relevant studies (Gough et al., 2012; Newman & Gough, 2020). Regarding the databases, we decided to search in ERIC, Web of Science, Scopus and PsycINFO because these databases are recognized as relevant sources in the field of educational research (Newman & Gough, 2020, p. 9). Results from these databases can be downloaded and contain the information necessary for a systematic review, that is, the abstract, which allows further processing on a computer. To define our search string, we first conducted a preliminary search and identified the most used keywords in AR studies as well as cognitive load studies. In addition to the different terms for AR, we found many studies that dealt with AR but used virtual reality in the keyword section. Therefore, we also included the term virtual reality together with the most common terms used to describe AR in the final search string presented in Table 1. Regarding the search terms for cognitive load, we also found different terms used in previous studies. Again, the most common ones were used in the final search string (Table 1). The final search was done in October 2019.
We included published journal articles, conference papers and book chapters that were written in English and reported empirical results as primary studies including an experimental and a control group. No limitations with regard to time span were set.
The search initially resulted in 2008 references, of which 10 were first excluded because they were duplicates. Of the remaining 1998 sources, the titles and abstracts of 20 were initially screened against the inclusion and exclusion criteria by both the first and the second author. The two authors agreed on whether to include or exclude the study in 90% of these cases. If it was unclear whether an article should be included or not, the abstract was read together again, and a decision was made collectively. The remaining sources were divided between the first author and the second author and screened using titles and abstracts. After this process, 126 sources remained for full text screening, resulting in 54 publications containing 58 studies which were included in the review (see Figure 1; publications included in the systematic review are indicated with an asterisk in the reference list).
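The screening counts reported above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original review procedure: the class `ScreeningFlow`, the function `percent_agreement` and the two example rater lists are hypothetical names and data introduced for illustration only. The counts themselves are those stated in the text, and the exclusion totals are derived by subtraction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningFlow:
    """Counts as reported in the Method section."""
    retrieved: int = 2008      # references returned by the database search
    duplicates: int = 10       # removed before screening
    full_text: int = 126       # sources kept for full text screening
    publications: int = 54     # publications included in the review
    studies: int = 58          # studies contained in those publications

    @property
    def screened(self) -> int:
        # Sources screened by title and abstract
        return self.retrieved - self.duplicates

    @property
    def excluded_on_abstract(self) -> int:
        # Derived value: not stated explicitly in the text
        return self.screened - self.full_text

    @property
    def excluded_on_full_text(self) -> int:
        # Derived value: not stated explicitly in the text
        return self.full_text - self.publications


def percent_agreement(decisions_a, decisions_b):
    """Simple percent agreement between two raters' include/exclude decisions."""
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("both raters must judge the same non-empty set")
    matches = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return 100 * matches / len(decisions_a)


flow = ScreeningFlow()
print(flow.screened, flow.excluded_on_abstract, flow.excluded_on_full_text)

# Hypothetical decisions for the 20 jointly screened abstracts:
# 18 matching judgements correspond to the reported 90% agreement.
rater_1 = [True] * 5 + [False] * 15
rater_2 = [True] * 4 + [False] + [True] + [False] * 14
print(percent_agreement(rater_1, rater_2))
```

Percent agreement is used here because it is the figure the text reports; chance-corrected measures would need the full decision data, which is not given.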
The coding of the studies was also a joint negotiation process involving all three authors of this paper. Every included study was considered together, and a code was assigned only after all three researchers had agreed.
We decided to set different priorities to answer the research question in the best possible way. First, from an instructional design perspective, we extracted the purpose of AR usage (e.g., the support of assembly work via an AR system). All categories can be found in Section 4.1.
Further, we examined which research type the studies use to explore the relationship between AR and cognitive load. Here, we followed the differentiation made by Mayer (2019a, 2019b), distinguishing media comparison and value-added studies. Media comparison studies compare one technology with another, usually more traditional, approach. Most often, these studies assume that a 'new' medium should result in higher learning gains or improve motivation. Table 2 provides an overview of media comparison studies included in this review. Value-added studies, on the other hand, compare one educational technology, for example, a digital game, in two or more versions. In Pilegard and Mayer (2016), an experimental group had to fill out a worksheet while playing a computer game, whereas a control group played the game without another task. In our systematic review, we looked at value-added studies that compare the use of AR technology in several experimental conditions or supplemented with a specific feature.
Furthermore, we analysed studies investigating different AR types as a factor that may affect cognitive load during learning (Table 3). In our sample of studies, we found the following types of AR:
• Mobile vision-based AR, that is, augmented digital content is superimposed onto a trigger or marker and visible on the display of a mobile device.
• Vision-based AR, that is, augmented digital content is superimposed onto a trigger or marker and visible on a monitor with webcam.
• Location-based AR, that is, augmented digital content is visible on the display of a mobile device based on GPS data.
• Spatial AR, that is, augmented digital content is superimposed onto the object of interest via a camera or projector.
• See-through AR, that is, augmented digital content is shown in the point of view of the user via AR glasses.
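The five AR types listed above act as a coding scheme for the reviewed studies. The sketch below shows, in Python, how such a scheme could be represented and queried. It is an illustration under assumptions, not the authors' coding instrument: the names `ARType`, `CodedStudy` and `studies_with` are hypothetical, and the two example entries are taken from the comparison of AR types reported later in the article.

```python
from dataclasses import dataclass
from enum import Enum


class ARType(Enum):
    """The five AR types found in the sample, with their defining feature."""
    MOBILE_VISION_BASED = "content on a trigger or marker, shown on a mobile device display"
    VISION_BASED = "content on a trigger or marker, shown on a monitor with webcam"
    LOCATION_BASED = "content shown on a mobile device display based on GPS data"
    SPATIAL = "content superimposed onto the object via a camera or projector"
    SEE_THROUGH = "content shown in the user's point of view via AR glasses"


@dataclass(frozen=True)
class CodedStudy:
    """One coded study (hypothetical record layout)."""
    reference: str
    ar_types: tuple   # a study may compare more than one AR type
    task: str


def studies_with(ar_type, studies):
    """Return the references of all studies that used the given AR type."""
    return [s.reference for s in studies if ar_type in s.ar_types]


# Two example entries, as described in the article's AR type comparison.
sample = [
    CodedStudy("Alves et al. (2019)",
               (ARType.SPATIAL, ARType.MOBILE_VISION_BASED),
               "assembly tasks"),
    CodedStudy("Chen et al. (2009)",
               (ARType.VISION_BASED, ARType.SEE_THROUGH),
               "chemistry learning"),
]

print(studies_with(ARType.SPATIAL, sample))
```

Storing the AR types as a tuple reflects that several studies in the sample compare two or more types within one experiment.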
To visualize the results of the studies, we have grouped them into larger categories. Among others, we have summarized those studies that found a lower or equal cognitive load with higher performance in the AR group, those that found no differences and those that reported negative results. The detailed presentation of all results follows in the next section.

| RESULTS
In this section, we outline results from the analysed studies by providing a representation of (1) the purpose of the AR usage, (2)

| Cognitive load and performance by purpose of AR
In analysing the corpus of studies in our sample, we reached agreement for the classification of the following six categories (see also Buchner et al., 2021):
• AR-guided assembly tasks (15 studies).
• AR spatial ability training (one study).
• AR collaborative problem solving (one study).
In the following, we present the results of cognitive load and performance measures separately for these categories.

| AR-guided assembly tasks
Fifteen studies were identified that use AR to guide assembly tasks, of which eight compare AR with two or more other conditions (see Table 4). In Funk et al. (2016), participants had to complete a manual assembly task on a Lego Duplo plate, supported either by a printed manual, a manual presented on a tablet, an AR glass or a spatial AR system. The aim of these studies is to investigate whether AR guidance can improve performance while at the same time keeping the cognitive load low or even reducing it.
In 9 out of 14 comparisons in which a certain type of AR is compared to another technology, this aim was supported empirically. Three studies found no differences between guidance with AR or other media, and two studies provide evidence for higher cognitive load levels within the AR condition resulting in poorer performance outcomes.
In five studies, the spatial AR type proved to be superior in terms of cognitive load and performance in assembly compared to mobile vision-based AR or see-through AR. In Gross et al. (2018), no differences were found when using two different see-through AR glasses.
Five studies applied the same type of AR but added a specific feature. Alves et al. (2019) demonstrate that when using mobile vision-based AR it is beneficial to attach a handle compared to mounting the mobile device on a tripod. For see-through AR it seems to be superior to use visual cues instead of written text, as has been shown in two studies. In addition, Lampen et al. (2019) have been able to show that a simulation conducted by a human demonstrating a task is better in terms of performance as well as cognitive load levels.

| AR task assistance
In this category, we summarize studies that use AR to support tasks in medical education, during surgery, navigation, driving or flying and everyday duties ( paper-based or monitor-displayed drawings. The AR group outperformed the two other groups and reported lower cognitive load.
In contrast, three other studies found an increased cognitive load together with higher performance associated with AR. In these studies, participants worked on a dual task while navigating or walking, which contributed to a higher cognitive load.
In summary, AR can compensate for the demands of a secondary task, resulting in higher performance compared to traditional display-based navigation tools (Wen et al., 2014

| AR collaborative problem solving
Wang and Dunston (2011) developed a see-through AR system which enables workers to collaboratively search and solve errors spotted in industrial construction sites. Performance results showed that the participants in the AR condition outperformed the control groups working on the task with paper-based materials or the standard software used on a desktop computer. In terms of cognitive load, the AR system led to lower or equal ratings on the NASA-TLX.

| Cognitive load and performance by media comparison
In 45 studies, AR is compared to one or more other instructional media (48 comparisons), with display-based and paper-based materials being most common. Three studies used audio and one used IVR as comparative media. Table 7 gives an overview of the studies and their results. In 27 studies, the AR system or instruction used led to lower or at least equal cognitive load ratings compared to other media. In these studies, performance, that is, learning outcome, time on task or accuracy of the solution, was also higher when using AR. In another three studies, AR lowered cognitive load in comparison to non-AR, while performance measurements are missing. In three more studies, cognitive load was higher in the AR group than in the control group, but performance was also higher. Eight studies did not find differences in their comparisons; two of those compared AR to more than one other instructional condition. Accordingly, the studies in Bellucci et al. (2018) and Funk et al. (2016) appear in our overview twice. Bellucci et al. (2018) found differences between AR and paper-based materials, but not for the display condition. Funk et al. (2016) found exactly the opposite: no differences when AR is compared to printed instruction, but differences in comparison to a display condition. AR can even lead to the worst performance, as shown in the study by He et al. (2019). Participants had to make a coffee and were guided by audio, paper, display or see-through AR instructions. The latter was the most cognitively demanding, and the time to brew the coffee was longest. This is just one of the seven studies that report varying results regarding cognitive load together with performance levels equal to those of a more traditional approach or, as illustrated above, even the worst performance. The study in Wenk et al. (2019) also appears twice in the overview: once, AR is compared to a screen condition, where the results favour AR, and once, participants performing with IVR outperform participants in the AR condition.

TABLE 7

| Study result | Description | Reference |
|---|---|---|
| AR condition lower/equal CL, performance better | Mobile vision-based AR supports a computer inspection task | Polvi et al. (2018) |
| | See-through AR and spatial AR support reaction-based tasks | Baumeister et al. (2017) (1) |
| | See-through AR and spatial AR support reaction-based tasks | Baumeister et al. (2017) (2) |
| | Spatial AR supports assembly tasks with low or high complexity | Yang et al. (2019) (2) |
| | See-through AR supports lane changing driving tasks | Young et al. (2016) |
| | See-through AR supports a ureterovesical anastomosis task | Chowriappa et al. (2015) |
| | Vision-based AR supports endoscopic skull base procedures | Dixon et al. (2011) |
| | Mobile vision-based AR supports learning of science concepts | Lai et al. (2019) |
| | Mobile location-based AR supports inquiry-based learning during a field trip | Chiang et al. (2014) |
| | See-through AR and spatial AR support an assembly task | Funk et al. (2016) |
| | See-through AR supports motor and cognitive tasks | Wenk et al. (2019) |
| | Spatial AR supports assembly tasks | Yang et al. (2019) (1) |
| | Two see-through AR approaches support assembly task | Biocca et al. (2007) |
| | Vision-based AR supports tele-operated crane tasks | |
| | See-through AR supports collaborative problem-solving task | Wang & Dunston (2011) |
| | Mobile vision-based AR supports physical computing task | Bellucci et al. (2018) |
| AR condition lower CL, performance not measured (6.3%) | Vision-based AR supports designing a room with furniture | Chandrasekera & Yoon (2015) |
| | See-through AR supports a heat conduction experiment | Strzys et al. (2019) |
| | Vision-based AR supports endoscopic sinus surgery | Dixon et al. (2012) |
| AR condition higher CL, performance better (6.3%) | Mobile vision-based AR supports a scene imagination task | Shin et al. (2013) |
| | Mobile location-based AR supports navigation tasks | Wen et al. (2014) (2) |
| | Mobile location-based AR supports navigation tasks | Wen et al. (2014) (3) |
| No differences in CL and performance (16.7%) | Mobile vision-based AR supports physical computing tasks | Bellucci et al. (2018) |
| | Vision-based AR supports an erection task with tele-operated crane | Chi et al. (2012) |
| | Spatial AR supports driving tasks via crash warning information presentation | Kim et al. (2013) |
| | Mobile location-based AR supports navigation tasks | Wen et al. (2014) (1) |
| | See-through AR supports picking tasks with a forklift | Gross et al. (2018) |
| | Vision-based AR visualized 3D forms in geometric classroom | Lin et al. (2015) |
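The outcome counts reported for the media comparison studies can be turned into shares of the 48 comparisons. The following Python sketch does this with half-up rounding to one decimal place. The group labels are shorthand chosen here for illustration, and the function name `shares` is hypothetical. Only the counts come from the text, so a computed share may differ in the last digit from a percentage printed in the overview table.

```python
from decimal import Decimal, ROUND_HALF_UP

# Number of comparisons per outcome group, as stated in the text.
outcome_counts = {
    "lower/equal CL, higher performance": 27,
    "lower CL, performance not measured": 3,
    "higher CL, higher performance": 3,
    "no differences": 8,
    "varying CL, performance equal or worse": 7,
}


def shares(counts):
    """Return the total and each group's percentage, rounded half-up to 0.1."""
    total = sum(counts.values())
    result = {}
    for label, n in counts.items():
        pct = Decimal(100 * n) / Decimal(total)
        result[label] = pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    return total, result


total, pct = shares(outcome_counts)
print(total)
for label, value in pct.items():
    print(f"{label}: {value}%")
```

`Decimal` with `ROUND_HALF_UP` is used instead of the built-in `round`, which rounds exact halves such as 6.25 to the nearest even digit and would therefore print 6.2 rather than 6.3.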

| Comparison of cognitive load and performance by AR type
In six studies, a type of AR is compared to one or more other AR types (Table 8). Most commonly, spatial AR is contrasted with see-through AR. Spatial AR seems beneficial regarding cognitive load and performance measures, with four of the six studies reporting results that support this conclusion. Compared to mobile vision-based AR, spatial AR also seems to be superior, as shown in Alves et al. (2019). Only one study in this section found no differences between two different AR types when applied to chemistry learning (Chen et al., 2009).

| Cognitive load and performance in value-added AR studies
We identified 20 studies examining just one type of AR but under different conditions. Here, the research questions are more extensive: they go beyond the question of whether one technology or type of a specific technology is better or less demanding than the other. These studies search for strategies to improve AR as a learning technology; an overview can be found in Table 9.
One effective feature found in the studies can be summarized as visualization cues. For example, 3D representations of a phenomenon seem to improve learning and are less cognitively demanding compared to 2D when used in, especially, see-through and spatial AR applications. It is worth mentioning that, as shown in Lampen et al. (2019), a human actor demonstrating a task is superior to all other forms of visual cueing, for example, arrows or colours which outline where or what the next steps are to fulfil the task.

TABLE 7 (Continued)

| Study result | Description | Reference |
|---|---|---|
| No differences in CL and performance (16.7%) | See-through AR and spatial AR support an assembly task | Funk et al. (2016) |
| | Mobile vision-based AR supports the completion of a science project work | Chang & Hwang (2018) |
| AR condition equal/lower/higher CL, performance not better/worst (14.5%) | See-through AR supports the operation of a coffee machine to make an espresso | He et al. (2019) |
| | See-through AR supports navigation tasks during walking | Kawai et al. (2010) |
| | See-through AR shows the Stroop-test as a secondary task during walking | Sedighi et al. (2018) |
| | See-through AR supports simple and complex assembly task | Deshpande & Kim (2018) |
| | See-through AR supports picking tasks | Friemert et al. (2019) |
| | Mobile vision-based AR game supports English vocabulary learning | Pu & Zhong (2018) |
| | See-through AR supports motor and cognitive tasks | Wenk et al. (2019) |

TABLE 8 Studies comparing one AR type to another

| Study result | Description | Reference |
|---|---|---|
| Spatial AR lower/equal CL, higher performance (83.3%) | Spatial AR and two mobile vision-based AR approaches assist assembly tasks | Alves et al. (2019) |
| | See-through AR and spatial AR support reaction-based tasks | Baumeister et al. (2017) (1) |
| | See-through AR and spatial AR support reaction-based tasks | Baumeister et al. (2017) (2) |
| | See-through AR and spatial AR support an assembly task | Funk et al. (2016) |
| | See-through AR and spatial AR support touching tasks on a study platform | Hochreiter et al. (2018) |
| No differences (16.7%) | Vision-based and see-through AR supports chemistry learning | Chen et al. (2009) |
Learning activities, like note taking or self-testing, can also contribute effectively when combined with different approaches of AR. Interestingly, in the study by Ferdous et al. (2019), students preferred paper-based note taking over taking notes on a tablet when learning with spatial AR.
Navigating and moving objects in see-through AR still seem to be challenging tasks. Two studies report on newly designed possibilities that outperform the more traditional ones, for example, when using a novel AR pointer (Ro et al., 2019).
Other features that may be helpful are using a handle for holding a mobile device for vision-based AR learning (Alves et al., 2019), the usage of colour within AR to provide feedback (Loup-Escande et al., 2017) or gamification (Hsu, 2019).
See-through AR can reduce cognitive load when working on a dual task. However, two studies report that AR is not able to improve task performance (Woodham et al., 2016;Young et al., 2016).

| DISCUSSION
Some researchers have argued that an increase of cognitive load when learning with AR seems inevitable, and the risk of overloading students' working memory is high. Others propose that AR can keep cognitive load constant or reduce it while performance increases. In this research, we have summarized studies on AR and cognitive load with the aim to contribute to this debate.
In our systematic review, we first analysed for which purpose AR has been applied in the studies of our sample. Procedural knowledge, as in assembly, navigation or flying tasks, has been of interest to a larger extent than declarative knowledge. For the categories guided assembly and task assistance, the majority of studies report higher performance for participants in the AR condition, while no evidence for cognitive overload compared to other conditions was found. This was also the case when AR was applied to support collaborative problem solving, to train spatial ability and to provide feedback during a coding activity.
From the CLT perspective, these results are consistent with the assumption from the split-attention effect. The split-attention effect states that the integrated presentation of information contributes to better learning and performance. This can be achieved by means of AR, for example, if the necessary steps are presented directly on the components when performing an assembly task. The learner then no longer has to shift his attention between, for example, a paper manual and the real components. Additionally, this is in line with the spatial and temporal contiguity principle of CTML. Like for the split-attention effect, many empirical investigations proved that presenting information in a spatial and temporal integrated format is superior to non-integrated presentation formats (Mayer & Fiorella, 2014). Furthermore, these results support the claims made by AR researchers that using AR technology is an effective way to provide learners with integrated formats, which in turn positively affect performance and cognitive load (Goff et al., 2018; Sommerauer & Müller, 2014; Tang et al., 2003).

TABLE 9 Studies comparing added values to a certain AR type

| Study result | Added value | Reference |
| --- | --- | --- |
| Value-added condition equal/lower CL, higher performance (55%) | Mobile vision-based AR with assessment mechanic | Chu et al. (2019) |
| | Spatial AR with raised surface | Boyce et al. (2019) |
| | Spatial AR and paper-based note taking | Ferdous et al. (2019) |
| | Spatial AR with 3D visualizations instead of 2D | Fischer et al. (2016) |
| | See-through AR with 'attention funnel' | Biocca et al. (2007) |
| | See-through AR with human simulation | Lampen et al. (2019) |
| | See-through AR with visual cues instead of written text | Murauer et al. (2018) |
| | See-through AR with novel interaction tools | Ro et al. (2019) |
| | See-through AR with novel interaction tools | Tsai & Huang (2018) |
| | See-through AR with 3D visualizations instead of 2D | Cheung, McKinley, et al. (2015) |
| | See-through AR with 3D visualizations instead of 2D | Cheung, Craig, et al. (2015) |
| No differences (20%) | See-through AR monocular or binocular | Gross et al. (2018) |
| | Different AR glasses | Kawai et al. (2010) |
| | Position of slim bar icons on a spatial AR windshield | Kim et al. (2013) |
| | Combination of location-based AR map and display-based map | Wen et al. (2014) (1) |
| Value-added condition lower CL, no effect on performance (25%) | Handle supports holding during mobile vision-based AR instruction | Alves et al. (2019) |
| | Vision-based AR and the use of coloured feedback | Loup-Escande et al. (2017) |
| | Intrinsic load lower through collective game-based learning approach | Hsu (2019) |
| | See-through AR to compensate a secondary task | Young et al. (2016) |
| | See-through AR to compensate a secondary task | Woodham et al. (2016) |
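The category percentages reported in Table 9 follow from simple counts of studies per outcome category. As a minimal sketch, assuming the counts are transcribed correctly from the rows of Table 9 (11, 4 and 5 of the 20 value-added studies), the shares can be recomputed as follows. The names `TABLE_9_OUTCOMES` and `category_shares` are hypothetical and serve illustration only; they are not part of the original analysis.

```python
from collections import Counter

# Outcome category of each of the 20 value-added studies.
# Assumption: counts transcribed by hand from the rows of Table 9.
TABLE_9_OUTCOMES = (
    ["equal/lower CL, higher performance"] * 11
    + ["no differences"] * 4
    + ["lower CL, no effect on performance"] * 5
)


def category_shares(outcomes):
    """Return each category's share of all studies in percent (one decimal)."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


shares = category_shares(TABLE_9_OUTCOMES)
for category, share in shares.items():
    print(f"{category}: {share}%")
```

The same helper could be applied to the six AR type comparison studies in Table 8, where five studies in one category and one in the other would give the reported 83.3% and 16.7%.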
In the case of collaborative problem solving, the further develop-

In more traditional educational settings, AR is used as an instructional tool that generates 3D objects from 2D images and thus contributes to a better illustration. However, in these studies it remains largely unclear why this should reduce or otherwise affect cognitive load. From the CLT perspective, it would be necessary to include other factors in the study design, such as measurements of working memory capacity (Anmarkrud et al., 2019), or at least to compare two different AR instructional materials that either incorporate or violate principles from CLT or CTML. To date, it seems that the potential of visualization dominates in studies in which AR is used to promote declarative knowledge. However, referring again to CTML, the combination of text and pictures is also beneficial in more easy-to-use instructional materials like videos or even textbooks. Hence, more work is needed to fully understand how to use the characteristics of AR described in Azuma et al. (2001) to also boost AR-enriched learning environments aiming to promote declarative knowledge. One example from our lab is adding generative learning strategies (Fiorella & Mayer, 2016) to learning environments that use AR as an instructional medium, for example, engaging learners in a self-explaining activity while interacting with the AR materials (Buchner, 2021).
A more detailed analysis seems necessary regarding see-through AR using special glasses where generated information is presented directly in the visual field of learners. Three studies compared such a system with other conditions to assist tasks like coffee making or navigating; they found negative effects for cognitive load as well as performance. When compared to other types of AR, like spatial AR, participants using see-through AR performed worse in assembly tasks.
Here, a look at the value-added studies can help practitioners who still want to use see-through AR. Visual and attention-guiding cues, such as human simulations, seem to be particularly suitable in such systems. These results are also in line with theoretical assumptions made in CLT, according to which the learning environment, here the device used, affects the cognitive load (Choi et al., 2014). The positive effects of visual cues are again in line with the split-attention effect and the spatial and temporal contiguity principle, in that the information necessary to perform a certain task is presented in the learners' field of view.
Additionally, the effect of visual cues can be explained by the signalling principle of CTML: Visuals can guide the learner's attention to the most relevant information during learning, which contributes to the reduction of ECL, resulting in more working memory capacity to deal with the intrinsic load of a task (Mayer & Fiorella, 2014).

However, it should be noted that media comparison studies are not free of criticism in the field of research on educational technology, for example, because of the problem of providing exactly the same content and instructional method for the experimental and control groups (Mayer, 2019a). Therefore, the findings in this section must be interpreted with care. Another limiting factor of media comparison studies is that they reproduce a technology- or thing-oriented view of learning. This view focuses on the technology and its effect on learning outcomes rather than considering the complexity of teaching and learning. Richard Clark, Richard Mayer and colleagues therefore discourage researchers from continuing to do media comparisons and instead recommend applying alternative research designs that examine, for example, learner characteristics or the influence of a technology-enhanced learning environment on the learning process (Clark, 1983; Hodges et al., 2020; Mayer, 2020; Reeves & Reeves, 2015). We also discussed this issue in detail in our mapping study.
However, media comparison studies can help to identify the effectiveness of a technology, in particular by showing which tasks seem to benefit from a certain novel technology (Parong & Mayer, 2018). For AR this can be, as mentioned, providing just-in-time information or providing temporal and spatial contiguity during a task.
As the results of our systematic review also show, the positive effect of AR on performance and cognitive load is higher for procedural knowledge than declarative knowledge facilitation.
Third, relatively few studies compare different types of AR. Current studies clearly demonstrate that spatial AR is superior to see-through AR. This is in line with new assumptions of CLT according to which primary biological knowledge familiar to humans can reduce the cognitive load. In the case of spatial AR, this means that learners can perform tasks using gestures and movements they are familiar with, rather than having to learn new gestures that are necessary for control in see-through AR. One possibility to overcome this issue is to provide learners with pre-training before using AR glasses. Pre-training is a well-known and empirically robust principle of CTML aiming to prepare learners for upcoming tasks by providing relevant knowledge and skills. As a result, ICL is reduced, and learners can use their cognitive capacity to engage in essential processing (Mayer & Pilegard, 2014).

Adding generative learning strategies to AR-enriched learning environments can foster germane processing resulting in meaningful learning beyond knowledge retention (Fiorella & Mayer, 2016). In this review, we found just one study that integrated a learning strategy together with AR. The results of the study are promising as the learners reported lower cognitive load and higher performance compared to a control group without the learning strategy (Ferdous et al., 2019). From an instructional design perspective, combining AR and generative learning strategies is an interesting approach, also for the promotion of declarative knowledge. Here, generative learning strategies can be used in combination with the 3D objects or other virtual illustrations, engaging learners in making sense of the information delivered through the visualizations.
Another aspect is the result of higher cognitive load associated with higher performance found in assembly task studies. These studies use a secondary task to examine whether AR can compensate for the additional cognitive demand, which has been demonstrated by three studies. Participants reported higher cognitive load caused by the secondary task, but performance on assembly tasks did not suffer. In the task assistance category, these results were not confirmed, as two studies show contrary findings. For example, Young et al. (2016) conclude that AR is not yet able to overcome the cognitive burden occurring during a driving simulation. These contradictory results are worthy of further investigation, as one of the main advantages of AR reported in the literature is to help learners and users keep their focus of attention (Radu, 2014). Interestingly, AR could support learners in processing structured assembly tasks even while handling a secondary task. However, for more unstructured everyday tasks this compensating effect was not found.

| LIMITATIONS AND SUGGESTIONS FOR FUTURE RESEARCH
The interpretation of the results presented in this systematic review is limited due to the studies selected on the basis of the inclusion and exclusion criteria. An additional limiting factor is how research on AR and cognitive load has been conducted to date. Here, we found a lot of media comparison studies that tend to highlight the medium, not the underlying factors which contribute to learning performance. We encourage researchers to focus more on the skills and knowledge acquisition process by designing studies that clearly describe how AR is used to support learning and training. Therefore, questions about which learning strategies or activities contribute to the improvement, or may enhance it further, are worth including in future research.
Studies are also lacking on whether individual or collaborative learning with AR affects cognitive load and performance. The outlined value-added studies are a recommendable starting point, as are the AR type comparison studies. More of these are necessary due to the fact that technological advancement will soon bring new possibilities of AR application to the market. Especially in terms of see-through AR, authors noted that the high cognitive load is a result of limitations attributable to technical problems. Also, no studies evaluated newer types of AR like markerless or web AR applications; hence, research regarding these is recommended to unravel the differential potentials of the rich and still emergent variants of AR for learning.

| CONCLUSION
This systematic review on the impact of AR on cognitive load and performance has above all shown how complex this interaction is. However, the goal of such an analysis is to systematically process the results, and therefore we take the opportunity to draw the following conclusions from our findings: First, with respect to the many variants of AR in the different studies and a range of results, we find evidence to conclude that a majority of studies report lower or equal cognitive load with higher performance when compared to more traditional conditions like display-based or paper-based instruction. However, it should also be noted that contradictory results are found, for example, when cognitive load and performance are high or when the AR condition showed the worst performance due, among other factors, to cognitive load.
It is also necessary to point out that these results are based on comparative media studies that have been criticized for a very long time. In such comparative studies, impact is usually attributed to a technology or medium, although it is the activities and learning requirements that influence cognitive load and other factors.
Second, our review shows that few studies have yet addressed the question of whether a particular AR type exerts an influence on cognitive load and performance. Only six studies could be found on this. Among them, it can be stated that spatial AR proves superior to see-through AR for both variables. No differences were found between vision-based AR and see-through AR.
Third, it is worth considering the question of how learning and training with an AR system can be sustainably improved. For this purpose, value-added studies are available, of which we have found 20. Such studies compare the same AR systems but add special features to one or both. Here, different functions have proven to be helpful, which should be further investigated in future studies. Only then can we expect to obtain reliable empirical findings. In summary, visual cues for attention guidance and activating learning strategies have proven to be particularly beneficial. The use of different AR glasses or simultaneous use of AR and non-AR did not show any differences.

ACKNOWLEDGMENT
Open Access funding enabled and organized by Projekt DEAL.

CONFLICT OF INTEREST
The authors declare that there is no conflict of interest.

DATA AVAILABILITY STATEMENT
All references included in the review are available in the paper.