
ORIGINAL ARTICLE

Repeating virtual assembly training facilitates memory for coarse but not fine assembly steps

Katharina Sebastian

German Research Center for Artificial Intelligence (DFKI), Germany

TU Kaiserslautern, Germany

Markus Huff

Corresponding Author

E-mail address: huff@die‐bonn.de

Department of Research Infrastructures, German Institute for Adult Education, Germany

Correspondence

Markus Huff, Department of Research Infrastructures, German Institute for Adult Education, Heinemannstraße 12‐14, Bonn 53175, Germany.

Email: huff@die‐bonn.de

First published: 10 July 2018

Abstract

Humans segment naturalistic actions into meaningful units. These are hierarchically organized: multiple fine events are part of a coarse event. We investigated how repeated practice of a virtual sequential assembly task influences learning of coarse and fine assembly steps. In Experiment 1, participants (N = 10) parsed the task into coarse and fine events. This determined the task's hierarchical structure. In Experiment 2, domain experts (N = 18) and novices (N = 19) practiced the task in a virtual environment three times with intermittent memory testing. We found a hierarchical level effect: memory improved with repeated virtual training for coarse but not for fine steps. Further, experts' memory for fine assembly steps was higher compared with novices' memory. The hierarchical level effect could be explained by higher saliency (and lower similarity, respectively) of the coarse compared with the fine steps. We discuss implications for virtual training that incorporates hierarchical structure into task presentation.

Lay Description

What is already known about this topic?

  • Virtual trainings offer a safe environment for repeated practice in which learners can handle challenging activities.
  • Changes in the working life (e.g., “industry 4.0”) require workers to learn new dynamic tasks quickly.
  • Humans perceive dynamic tasks in terms of coarse and fine events, which are hierarchically structured.

What this paper adds?

  • Repeated virtual training improves memory for coarse but not fine events.
  • Domain experts recognize fine events more successfully than novices.

Implications for practice and/or policy?

  • Virtual trainings considering the underlying event structure are useful for training dynamic assembly tasks.
  • Knowledge tests of virtual trainings should be based on the event structure.

1 INTRODUCTION

Manual assembly tasks play an important role in the industrial production of complex products, such as cars and electronic devices. In this manuscript, we investigate how existing research in event cognition can help develop better learning environments or techniques for assembly task workers. We identify a gap in research, that is, the lack of empirical data on memory performance during repeated practice and on the effect of expertise. Repeated practice is common when both expert and novice workers acquire a new assembly skill. New assembly sequences require workers to regularly complete extensive training on specially built hardware prototypes (Hermawati et al., 2015). Additionally, workers may practice new assembly tasks in virtual environments (Malmsköld, Örtengren, Carlson, & Svensson, 2007).

According to the Event Segmentation Theory, complex tasks consist of two conceptually different parts, that is, coarse higher level conceptual changes and fine lower level, less salient changes representing “ongoing activity” (Zacks, Speer, Swallow, Braver, & Reynolds, 2007). An open question in this domain is how repeated practice influences this hierarchy. Based on the evidence from event cognition, we predict that learning curves will differ for the coarse and the fine assembly steps. This is because the coarse steps of a task are distinct and salient and will therefore be learned more successfully than the less salient fine steps, which are more similar and confusable. In turn, the similarity of the steps of a task can influence long‐term memory during task acquisition through repeated practice. This is the main research topic of this manuscript.

In the following paragraphs, we elaborate on how sequential knowledge is acquired through training, how an ongoing activity is parsed into constituent meaningful parts, and describe how the similarity influences long‐term memory. Finally, we describe experiments designed to investigate the above‐mentioned question.

1.1 Acquisition of sequential knowledge through training

In the automotive domain, introduction of new car models and variants requires frequent adaptation of work procedures on the production line. For preparation, workers go through repeated practice sessions acquiring upcoming procedures with the help of preseries hardware prototypes (Hermawati et al., 2015). Disadvantages of hardware‐based training are effortful assembly and reassembly, as well as material wear‐off (Hermawati et al., 2015). This problem can be solved by using realistic virtual simulations instead of hardware‐based training (Malmsköld et al., 2007; Moskaliuk, Bertram, & Cress, 2013). More importantly, virtual training offers an alternative suitable for learning, that is, repeated practice in a safe environment in which learners can handle challenging activities and receive immediate feedback (Lipsey & Wilson, 1993; McGaghie, Issenberg, Petrusa, & Scalese, 2010; Salas, Tannenbaum, Kraiger, & Smith‐Jentsch, 2012).

Whole‐task trainings have been shown to promote trainees' learning success more effectively than part‐task trainings (Wightman & Lintern, 1985).

1.2 Perception of procedural activities

According to the Event Segmentation Theory, observers divide an ongoing activity automatically into discrete events. In working memory, the current event is represented as a working model (Zacks et al., 2007) prompting expectations about following actions. When predictions fail due to the occurrence of unexpected changes, observers update the working model and, consequently, perceive a transition—an event boundary—between an old and a new event. At the event boundaries, attention for new information increases (Zacks et al., 2007). Thus, event boundaries are strategically and conceptually important time points.

Event perception is, furthermore, hierarchically structured. Cognitive activities are perceived as hierarchically structured event units, that is, several fine event boundaries are perceived between two coarse event boundaries. This event structure can be assessed by instructing participants to segment activities into both fine‐ and coarse‐grained event units (Kurby & Zacks, 2008; Newtson, 1973). The perception of fine event boundaries is due to bottom‐up processes, for example, lower level changes in movement (Zacks, Kumar, Abrams, & Mehta, 2009) and brief increases in prediction error (Radvansky & Zacks, 2014; Zacks et al., 2007). Coarse event boundaries are perceived due to higher level conceptual changes (Zacks et al., 2009), represent larger changes in goals (Zacks, Tversky, & Iyer, 2001), involve more physical changes (Hard, Recchia, & Tversky, 2011), and go along with more sustained increases in prediction error (Radvansky & Zacks, 2014; Zacks et al., 2007). Coarse event boundaries represent one or more salient alterations, that is, of location, character, action, or time (Huff, Meitz, & Papenmeier, 2014; Magliano & Zacks, 2011). In assembly, attaching a major object represents a coarse event and fine events depict further actions with this object, such as orienting and attaching it with the help of screws (Daniel & Tversky, 2012). The relation between fine and coarse events follows hierarchical alignment and enclosure, respectively (Hard, Lozano, & Tversky, 2006; Zacks & Tversky, 2001). That is, several fine events represent subordinate events, which precede a common coarse event boundary.

Both observers' goals and prior knowledge influence event segmentation processes. To understand these potential influences, it is important to know that event segmentation occurs at different levels. Besides the above‐mentioned fine and coarse event boundaries, there is also a “naturally grained” segmentation level. The size of naturally grained events lies between those of the fine‐ and coarse‐grained events (e.g., Mura, Petersen, Huff, & Ghose, 2013). In contrast to fine‐ and coarse‐grained events, for “naturally grained” event segmentation tasks, the participants do not receive any specific instruction during the task but they are asked to indicate the boundaries between two meaningful units.

Influences of observational goals and prior knowledge on event segmentation were only observed for this naturally grained and not for the fine‐ and coarse‐grained segmentation levels. First, the observational goals are important for naturally grained event segmentation because they determine the selection of goal‐relevant schemas, which are the basis for subsequent perceptual processing of the observed behaviour (Cohen & Ebbesen, 1979). Participants who had the observational goal of forming an impression of the observed actor segmented at a finer grain than participants with the observational goal of remembering the details of the observed task. Second, prior knowledge about an upcoming event was shown to influence the natural segmentation pattern. Massad, Hubbard, and Newtson (1979) provided their participants with two different kinds of information about an upcoming event. Because the resulting segmentation patterns differed substantially, the authors concluded that the participants' prior knowledge determined subsequent information selection. Similarly, Graziano, Moore, and Collins (1988) showed that, compared with novices, experts segmented familiar material more coarsely (for an overview, see Schwan & Garsoffky, 2008).

Importantly, such influences were not found when participants were asked to segment on a fine and a coarse level (e.g., Hard, Tversky, & Lang, 2006). In their study, Hard et al. (2006) compared segmentation performance for familiar and unfamiliar animations. Although participants interpreted familiar films as more intentional than backward films, hierarchical segmentation did not differ between the two versions.

To sum up, top‐down influences on event segmentation were only found on naturally grained segmentation. If segmentation instructions focus participants' attention on fine and coarse events (i.e., the hierarchical structure) during segmentation, then there are no influences observed.

1.3 Long‐term memory for events

Memory is better for event boundaries than for non‐event boundaries both in free recall and recognition (Huff et al., 2017; Lassiter & Slaw, 1991; Newtson & Engquist, 1976; Schwan & Garsoffky, 2004a; Zacks, Speer, Vettel, & Jacoby, 2006; for an overview, see Swallow, Zacks, & Abrams, 2009). Because event boundaries go along with increased attention (Huff, Papenmeier, & Zacks, 2012), they are more likely to be encoded into long‐term memory (Radvansky & Zacks, 2014). Deletions, delays, or disturbances at the points of event boundaries are more detrimental for memory as compared with time points within event boundaries (Boltz, 1992; Schwan & Garsoffky, 2004a). Thus, instructions considering event boundaries are more beneficial (Spanjers, van Gog, & van Merriënboer, 2010; Zacks & Tversky, 2003). In addition, learning success depends on prior knowledge. For instance, Spanjers, Wouters, van Gog, and van Merriënboer (2011) demonstrated that novices learned more efficiently from segmented animations, whereas experts learned equally well from segmented and nonsegmented material. Hence, communicating event boundaries and incorporating individual's prior knowledge are important for successful memory of events.

Furthermore, the hierarchical structure of complex tasks matters. Memory for fine events is more fragile than for coarse events in written and pictorial narratives (Bransford, Barclay, & Franks, 1972; Gernsbacher, 1985; Johnson‐Laird & Stevenson, 1970; Treisman & Tuxworth, 1974). Memory performance is better for coarse information, but more effort is required to recall it compared with fine information (Franklin & Bower, 1988; Zacks & Tversky, 2001). For instance, participants who were asked to memorize a previously read text answered more slowly when they integrated coarse compared with fine events suggesting deeper processing for coarse information. Furthermore, fine events may be more similar and less distinct compared with coarse events (Hard et al., 2011; Radvansky & Zacks, 2014; Zacks et al., 2001; Zacks et al., 2009), which might lead to differences in memory representations.

Memory for materials differing in similarity and distinctiveness has been the focus of recent research. Reagh and Yassa (2014) compared conceptually different versus similar material and found that repetition enhanced discrimination only for conceptually different pictures. In their basic research study, participants were more likely to correctly detect a target picture when they saw it three times compared with only once; however, pictures that were similar to the targets but were not presented in the study phase (distractors) were more likely to be falsely identified as the target after three repetitions than after one repetition. Another negative effect of repetition was shown by Jacoby, Jones, and Dolan (1998) who instructed participants to detect words that they had read but not heard in previous study phases. Increasing the number of reading repetitions increased the difficulty in correctly rejecting a word that indeed was read but not heard. In this case, repeated presentation increased the familiarity of the read word making it more difficult to disentangle if it was additionally heard or not heard. Thus, repeated presentation may be beneficial or disadvantageous for learning depending on stimuli properties, such as similarity.

1.4 Experimental overview

Long‐term acquisition of finely and coarsely segmented information has not been in the focus of research until now. Studies exploring memory for dynamic events presented the stimulus material only once (Lassiter & Slaw, 1991; Newtson & Engquist, 1976; Swallow et al., 2009; Zacks, Swallow, et al., 2006). A number of studies indicated that the memory for coarse events is better than for fine events (Bransford et al., 1972; Gernsbacher, 1985; Johnson‐Laird & Stevenson, 1970; Treisman & Tuxworth, 1974). Yet as repeated presentation of stimulus material is crucial for learning (Ericsson, Krampe, & Tesch‐Romer, 1993) and repeated presentation changes basic memory processes (e.g., Reagh & Yassa, 2014), it is an important but unanswered question whether repetition will benefit memory for coarse events more than fine events or vice versa. We postulate that repetition will be initially more beneficial to the memory for coarse than for fine events, and we refer to this result as hierarchical level effect. We base this hypothesis on previous research, which indicated that processing coarse information involves more effort than fine information (Franklin & Bower, 1988) and that effort is an important prerequisite to successful learning (Ericsson et al., 1993). In addition, lower distinctiveness and higher similarity of fine compared with coarse events (Radvansky & Zacks, 2014; Zacks et al., 2007) should hamper the memory for fine events after repetition (Reagh & Yassa, 2014).

Furthermore, results on the influence of familiarity with an activity on event segmentation are mixed and presumably depend on segmentation grain; reliable effects were only found for naturally grained segmentation instructions (Graziano et al., 1988; Hard et al., 2006; Jarodzka, Scheiter, Gerjets, & van Gog, 2010; Zacks et al., 2001). It is therefore possible that memory processes after repeated training differ between domain experts and novices. In our study, we included a novice group consisting of students from a middle school at a stage just before a potential entry‐level job in the automotive industry. Our expert group consisted of production workers with multiple years of work experience in the automotive industry. As stated before, our main hypothesis was the hierarchical level effect: memory benefits from repeated practice only for coarse, but not for fine events. However, because fine events are action‐based (Daniel & Tversky, 2012), they are expected to be affected more by prior experiences with assembly actions. Therefore, experts who executed multiple assembly actions themselves should show per se a higher level of memory performance for fine events than novices.

The purpose of Experiment 1 was to determine whether the video of the assembly task used as our stimulus is indeed hierarchically perceived as a sequence of coarse and fine events. Therefore, participants performed an event segmentation task while watching a video of the assembly task. The resulting event boundaries were used to design the memory test for the following experiment. In Experiment 2, experts and novices practiced the assembly task three times in a virtual environment and we tested their memory after each repetition. We assessed memory for coarse or fine events by stopping the video and asking for the correct next event. The memory for coarse events was assessed by stopping the viewing of the video shortly before a coarse event boundary and that for fine events by stopping it shortly after a coarse event boundary (Section 3.1.2).

2 EXPERIMENT 1: EVENT SEGMENTATION

The goal of Experiment 1 was to determine the underlying hierarchical event structure of the chosen video of an assembly task from the automotive domain.

2.1 Participants, material, and procedure

Ten students (five females and five males; age: M = 24.6, SD = 4.6) from the University of Kaiserslautern and University of Tübingen performed an event segmentation task (Newtson, 1973). All participants completed the fine and coarse segmentation instruction conditions in a counter‐balanced order.

The procedural assembly task used in the following experiments consisted of assembling different parts of a car door to the rack of the same door in a given sequence (see Figure 1). The task contained typical manual operations from the production line of a German automotive company (e.g., picking up a workpiece and screwing) and consisted of 38 single operations of the car door assembly. To empirically determine the structure of the task with respect to its coarse and fine events, we created a video of the door assembly task. The original hardware parts were made available for the purpose of the current experiments. The video was shot from a point‐of‐view perspective using a head‐mounted camera. We presented the video, which was 7 min and 16 s long, without sound.

Figure 1. Real door setup: The two upper pictures depict the car door rack from front and back side, respectively; the lower picture shows the main objects, screws, and tools

Overall, participants saw the video depicting the procedural assembly task three times; first, they watched it without instruction, and then they had to segment it both in fine‐ and coarse‐grained units. The order of fine and coarse segmentation was counterbalanced across participants. Participants watched the video of the car door assembly and pressed the space bar key whenever they thought that one meaningful (fine/coarse) event ended and another one began.

2.2 Results and discussion

For analysing event segmentation data, we modelled each person's perception of an event boundary as a kernel density function around the person's key press (Papenmeier, 2014). Then, we summed up all participants' individual distributions. The respective event structure plots for fine and coarse segmentations are displayed in Figure 2. Furthermore, bootstrapping methods were applied to check which of the resulting peaks were significant at a 90% confidence level. We found seven meaningful event boundaries in the coarse condition, as indicated by the vertical (green) lines in the upper plot of Figure 2. They correspond to the seven main objects that had to be assembled successively onto the car door rack. Between those coarse units, participants perceived several fine steps, that is, positioning the current object, inserting screws, and fixing the screws with the help of a tool. These additional fine event boundaries are indicated by the vertical (green) lines in the lower plot of Figure 2.

Figure 2. Event structure of the car door assembly task: Upper plot shows coarse, bottom plot displays fine‐grained segmentation. Significant event boundaries (significance level of 0.10) are displayed as vertical (green) lines based on their exceeding of critical cut‐offs (horizontal red line). Coarse segmentation clearly yielded the main steps corresponding to the seven main objects that were assembled. The peaks resulting from the fine segmentation represent substeps that lie in between coarse events and depict detailed actions
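
The analysis steps described above (one kernel per key press, summation across participants, bootstrapped cut‐off) can be illustrated with a minimal Python sketch. It is not the implementation of Papenmeier (2014): the Gaussian kernel, the bandwidth `sigma`, the null model that places key presses at uniformly random times, and all key press values are assumptions made for illustration only.

```python
import math
import random

def gaussian(t, mu, sigma):
    """Gaussian density at time t for a key press at time mu."""
    return math.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def segmentation_magnitude(presses_by_participant, grid, sigma=1.0):
    """Sum the kernels of all key presses of all participants over a time grid."""
    return [
        sum(gaussian(t, press, sigma)
            for presses in presses_by_participant
            for press in presses)
        for t in grid
    ]

def bootstrap_cutoff(presses_by_participant, grid, duration, sigma=1.0,
                     n_boot=200, level=0.90, seed=0):
    """Critical cut-off: the `level` quantile of the maximum summed magnitude
    when every key press is moved to a uniformly random time (assumed null model)."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_boot):
        shuffled = [[rng.uniform(0, duration) for _ in presses]
                    for presses in presses_by_participant]
        maxima.append(max(segmentation_magnitude(shuffled, grid, sigma)))
    maxima.sort()
    return maxima[min(int(level * n_boot), n_boot - 1)]

def significant_peaks(magnitude, grid, cutoff):
    """Grid times that are local maxima of the summed curve and exceed the cut-off."""
    return [grid[i] for i in range(1, len(grid) - 1)
            if magnitude[i] > cutoff
            and magnitude[i] >= magnitude[i - 1]
            and magnitude[i] > magnitude[i + 1]]

# Hypothetical key presses (in seconds) of three participants for a 60 s clip.
presses = [[10.2, 30.1, 50.3], [9.8, 29.7, 49.9], [10.5, 30.4, 41.0]]
grid = [i * 0.5 for i in range(121)]  # 0.0, 0.5, ..., 60.0

magnitude = segmentation_magnitude(presses, grid)
cutoff = bootstrap_cutoff(presses, grid, duration=60.0)
peaks = significant_peaks(magnitude, grid, cutoff)
```

In this sketch, a peak counts as a meaningful event boundary only when more participants agree on its timing than would be expected from randomly placed key presses.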

Furthermore, participants perceived the activity according to hierarchical alignment (Zacks et al., 2001) and enclosure (Hard et al., 2006). Participants observed more temporal closeness between fine and coarse event boundaries (M = 2.69 s, SD = 1.11) than expected by chance (M = 4.85 s, SD = 1.15), t(9) = −5.24, p < 0.001. In addition, the proportion of nearest fine event boundaries preceding its respective coarse event boundary was 0.81 (SD = 0.24) and significantly higher than a proportion of 0.50, t(9) = 4.12, p = 0.003.
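
As a rough illustration of the two measures reported above, the following Python sketch computes, for one hypothetical participant, the mean temporal distance between each coarse boundary and its nearest fine boundary (alignment) and the proportion of coarse boundaries whose nearest fine boundary precedes them (enclosure). The function names and boundary times are made up for this example, and the estimation of the chance level is not shown.

```python
def nearest(value, candidates):
    """Return the candidate time closest to value."""
    return min(candidates, key=lambda c: abs(c - value))

def alignment_and_enclosure(coarse, fine):
    """Return (mean distance to the nearest fine boundary,
    proportion of coarse boundaries whose nearest fine boundary precedes them)."""
    distances = []
    preceding = 0
    for c in coarse:
        f = nearest(c, fine)
        distances.append(abs(c - f))
        if f < c:
            preceding += 1
    return sum(distances) / len(coarse), preceding / len(coarse)

# Hypothetical boundary times (in seconds) for a single participant.
coarse = [60.0, 125.0, 190.0]
fine = [20.0, 41.0, 58.5, 80.0, 101.0, 123.0, 150.0, 191.0]

mean_distance, enclosure = alignment_and_enclosure(coarse, fine)
```

Computed per participant, such values could then be compared against a chance distance and against a proportion of 0.50 with one‐sample tests, as in the analysis reported above.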

Moreover, the results of this experiment provide important segmentation data showing that the perception of the car door assembly task is hierarchically structured like in previous work on assembly (Daniel & Tversky, 2012; Mura et al., 2013; Zacks et al., 2001; Zacks & Tversky, 2003).

3 EXPERIMENT 2: PRACTICING AN ASSEMBLY TASK

The goal of Experiment 2 was to test the effects of repeated presentations on memory for coarse and fine events. We expected that the memory for coarse but not fine events will benefit from repeated presentation. Because expertise was shown to be important in event cognition, this experiment involved two groups of participants, that is, with and without experience in automotive assembly.

3.1 Method

3.1.1 Participants and design

Middle school students from the Neues Gymnasium in Rüsselsheim, Germany (N = 19; Mage = 14.9 years, SDage = 0.3), and production workers from the Volvo® Trucks plant in Gothenburg, Sweden (N = 18; Mage = 42.2 years, SDage = 7.8), served as novice and expert participants, respectively. The experience of working in production for an average of 17.2 years (SD = 7.3) made the workers experts compared with the students with no work experience. The experts had significantly higher self‐reported manual skills compared with novices, t(32) = −3.73, p < 0.001, but they did not outperform novices in spatial ability, as tested by a mental rotation task, t(29.9) = 1.19, p = 0.242 (see Table 1). Both groups had no prior knowledge of the car door task (note that this group of Volvo® workers usually assemble trucks, not cars).

Table 1. Differences between novices and experts

                      Experts        Novices
                      M (SD)         M (SD)         t        p
Age (years)           42.2 (7.8)     14.9 (0.3)     14.8     <0.001
Manual skills (a)     1.5 (0.45)     2.2 (0.65)     −3.73    <0.001
Spatial ability (b)   0.98 (0.04)    0.95 (0.07)    1.19     0.242

  • (a) Average score with 1 and 5 indicating highest and lowest self‐reported manual skills, respectively.
  • (b) Sensitivity A′ (Pollack & Norman, 1964) based on hits and false alarms in a mental rotation test.

We adopted a within‐subject design in which all participants executed three training repetitions each followed by the above‐mentioned memory test based on the coarse event boundaries (see Section 3.1.2). Volvo® Gothenburg and Neues Gymnasium Rüsselsheim compensated participants' absence from work or school, respectively.

3.1.2 Material

Virtual training system

The virtual training setup is shown in Figure 3. Participants saw the 3D simulation of the car door assembly on a monitor approximately 2 m in front of them. Their task was to move an object shown on the screen to the correct assembly position using their hand motion tracked through a Microsoft Kinect. The correct assembly position was highlighted by a semi‐transparent blue area that was shaped like the object in question (see Figure 4a), for example, a door part, a screw, or a tool. Red, orange, and green colours were given as visual feedback, respectively, in order to indicate how close the object was located with respect to its target position. When participants positioned the object correctly, they confirmed this assembly step by pressing the button on their Wii Mote controller. Then, they saw the next object. We used this so‐called “easy mode” for training the participants in executing the door assembly task.

Figure 3. Virtual training system setup: Flat screen for visualization (52 inches), PC on which software was running, Microsoft Kinect for motion tracking, Nintendo Wii Mote as controller. PC: personal computer

Figure 4. Graphical user interface: (a) Target position of door handle is shown by a half‐transparent (blue) area; its correct placement is indicated by (green) borders. (b) In advanced mode, the participant must select the appropriate part from a pop‐up circular menu

The virtual training system incorporated a more difficult mode (“advanced mode”) as well. We used this mode as an additional final performance measure, which will be described in Section 3.1.4. In this mode, participants were asked to choose the correct subsequent object on their own by using a circular menu (Figure 4b). A blue highlighted area that indicated the shape and target position of the subsequent object could be used as a hint for selecting the correct part. If a wrong part was selected, an error message appeared followed by the circular menu with the correct object in the foreground.

In the present experiment, participants used a gesture‐based user interface to train on the assembly of the door. Thus, participants were required to actively use gesture during learning. This had several advantages. First, gesturing is assumed to change basic cognitive processes and knowledge representations by introducing “action into one's mental representation” (Beilock & Goldin‐Meadow, 2010, p. 1609). Second, because such a learning environment is actually used in the automobile industry, we wanted to design a learning environment that closely resembles the environment in which the learners apply their knowledge in the real world. In contrast to working with a computer mouse and with a rather small desktop monitor, in our learning environment, participants were required to stand, to use their hands and arms, and they saw approximately full‐sized automotive objects. This study‐test congruence is important for measuring learning success (Godden & Baddeley, 1975).

In sum, the door assembly execution within the virtual training system involved 38 assembly steps representing an imitation of the real assembly sequence introduced in Figure 2.

Memory test based on coarse event boundaries

In order to test memory performance for the correct assembly sequence, we created a test based on the video of the real door assembly used in Experiment 1. This resembled the approaches of Swallow et al. (2009) and Zacks, Kurby, Eisenberg, and Haroutunian (2011). The video was stopped at time points associated with the coarse event boundaries of the door assembly task; either before a coarse event boundary—this tested memory for coarse events—or after a coarse event boundary—this tested memory for fine events.

The video clips in the coarse events condition began two fine steps from the respective coarse event boundary and stopped shortly before it, that is, shortly before the person was just about to turn back to the table in order to take the next part. The videos in the fine events condition began when the person in the video turned towards the table and stopped when she had gripped the main object from the table, that is, shortly after the new coarse event began (see an overview in Figure 5). The memory test contained 14 video clips (seven coarse event items and seven fine event items).

Figure 5. Memory test illustration: We schematically sketched the memory test using three consecutive coarse event boundaries (EB) from the assembly task, that is, EB1, EB2, and EB3. Video clips were stopped either before or after a coarse EB. The video clip that stopped before the exemplary EB 2 required predicting the next coarse event. Its respective target picture was “EB 2” and distractor picture was “EB 3.” The video clip that stopped after the exemplary EB 2 required predicting a fine event. Its target picture was “1st step after EB 2” and distractor picture was “3rd step after EB 2”

Immediately after the video stopped, participants saw a static picture frame depicting either the correct (target) or wrong next step (distractor) taken from the video. Target pictures depicted a screenshot of the next step. Distractor pictures in the fine events condition depicted the assembly operation two fine steps ahead; distractor pictures in the coarse events condition depicted one coarse step ahead. Figure 5 illustrates the memory test with an exemplary coarse event boundary.

Each video clip was used for testing memory twice, once with a target and the other time with a distractor item. The order of presentation was chosen at random. Participants indicated via key press whether the shown picture was the correct next step (“old” response) or not (“new” response), respectively. The test was created by using PsychoPy software (Peirce, 2007), and participants executed it on a conventional notebook personal computer within approximately 15 min.
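
The structure of the memory test described above (seven coarse event boundaries, one coarse and one fine clip per boundary, each clip probed once with a target and once with a distractor, random order) can be sketched as a trial list in Python. This is a hypothetical reconstruction for illustration, not the PsychoPy script used in the study; the function name and dictionary keys are assumptions.

```python
import random

def build_trials(n_coarse_boundaries=7, seed=None):
    """Build a randomized trial list: every clip appears once with a target
    picture and once with a distractor picture."""
    rng = random.Random(seed)
    trials = []
    for boundary in range(1, n_coarse_boundaries + 1):
        for item_type in ("coarse", "fine"):
            for probe in ("target", "distractor"):
                trials.append({
                    "boundary": boundary,
                    "item_type": item_type,
                    "probe": probe,
                    # "old" is correct for targets, "new" for distractors
                    "correct_response": "old" if probe == "target" else "new",
                })
    rng.shuffle(trials)
    return trials

trials = build_trials(seed=1)
```

With seven boundaries this yields 14 clips and 28 trials, half of them target trials.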

3.1.3 Procedure

Data collection of the experts was conducted in Volvo® Trucks Factory in Gothenburg, Sweden; novices participated 1 month later in the Neues Gymnasium in Rüsselsheim, Germany. Experimenters were previously trained at the German Research Center for Artificial Intelligence in Kaiserslautern, Germany. The production workers at Volvo® signed an informed consent right before the experiment started; the student participants brought a consent form signed by their parents.

The procedure as summarized in Table 2 was the same for all participants. Each participant filled in a demographic questionnaire followed by the experimenter's oral introduction about the overall project. Each participant was then calibrated within the virtual training system. In order to get familiar with the usage, all participants performed a tutorial consisting of seven practice assembly steps at the front spoiler of a car. After this practice trial, they performed the virtual training for the door assembly task three times in the easy mode (see Section 3.1.2) and we recorded the time required to complete the task with the help of a stopwatch. After each training repetition, participants were asked to execute the memory test. Next, they performed a final virtual assembly training session in the advanced mode (see Figure 4b) and the experimenter noted errors in choosing the next object. The expertise assessment consisted of a questionnaire on manual skills including six items on a 5‐point Likert scale (e.g., “I find it hard to assemble furniture by myself”) and a 5‐min computer mental rotation test programmed in PsychoPy software (Peirce, 2007). On a conventional notebook monitor, we showed a letter (“R” or “G”) either in mirrored or normal view. Additionally, the letter could be rotated. Participants had to indicate by button press as fast as possible if the letter was mirrored or not.

Table 2. Procedure of Experiment 2: within‐subject design
1. Tutorial
2. Virtual assembly training and testing (repeated three times)
a. Easy mode
b. Memory test
3. Virtual assembly training: advanced mode
4. Expertise assessment
a. Self‐reported manual skills
b. Mental rotation test

3.1.4 Data analysis

We analysed the data from the memory test using signal detection theory (Swets & Pickett, 1982), which is based on the hit rate (i.e., the proportion of “old” responses to target items) and the false alarm rate (i.e., the proportion of “old” responses to distractor items). Both rates were used to calculate the sensitivity, that is, the actual cognitive ability to detect a picture as the target or distractor in an old/new recognition test.

We calculated the nonparametric value for sensitivity (Pollack & Norman, 1964), that is, A′, ranging from 0.5 (no ability to distinguish between target and distractor) to 1 (perfect performance). Sensitivity values less than 0.5 may arise from sampling error or response confusion with the minimum value being 0. Furthermore, we analysed response times of memory test answers.
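To make the measure concrete, the following minimal Python sketch computes A′ from a hit rate and a false alarm rate. It uses the widely cited computational formula for A′, including its mirrored form for below-chance performance. The text does not state which formula or software routine was actually used, so the function name and the example rates are illustrative assumptions.

```python
def a_prime(hit_rate: float, fa_rate: float) -> float:
    """Nonparametric sensitivity A' from hit rate and false alarm rate.

    Assumption: the common computational formula, mirrored for the
    below-chance case (hit rate lower than false alarm rate).
    """
    h, f = hit_rate, fa_rate
    if h == f:
        return 0.5  # no ability to tell targets from distractors
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))

# Hypothetical participant: "old" on 8 of 10 targets and on 3 of 10 distractors.
hit_rate = 8 / 10
fa_rate = 3 / 10
print(round(a_prime(hit_rate, fa_rate), 3))
```

With these formulas, perfect performance (hit rate 1, false alarm rate 0) gives A′ = 1, equal rates give 0.5, and fully reversed responding (hit rate 0, false alarm rate 1) gives 0, matching the range described above.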

We assessed the need for linear mixed effects analysis by fitting two models, that is, one with a constant intercept for all participants and another allowing intercepts to vary across participants (Field, Miles, & Field, 2012). If the comparison of fit indices indicated significant random effects, we performed a linear mixed effects analysis; if random effects were absent, we computed analyses of variance. We used R (R Core Team, 2012) for all statistical analyses and the additional R package lme4 (Bates, Maechler, Bolker, & Walker, 2012) for the linear mixed effects analyses.
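For orientation, the two models being compared can be written as follows. The notation is an assumption added for illustration: y_ij denotes observation i of participant j, u_j the participant-specific deviation from the common intercept. The likelihood ratio statistic in the last line is one common way of comparing such nested models; the text states only that fit indices were compared.

```latex
% Model 0: one intercept shared by all participants
y_{ij} = \beta_0 + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Model 1: intercepts vary across participants (random intercept u_j)
y_{ij} = \beta_0 + u_j + \varepsilon_{ij}, \qquad u_j \sim \mathcal{N}(0, \tau^2)

% One possible comparison of fit (assumption): likelihood ratio statistic
\Delta = -2 \bigl( \ln L_0 - \ln L_1 \bigr)
```

A clearly better fit of Model 1 indicates that participants differ systematically in their baseline level, which is the situation in which a mixed effects analysis is preferred over an analysis of variance.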

3.2 Results and discussion

3.2.1 Memory performance

Sensitivity

First, we analysed the influence of the fixed effects repetition (1, 2, 3), expertise (experts, novices), and item type (coarse, fine) on sensitivity A′. The descriptive data and the significant post hoc comparisons are shown in Figure 6. Because our test for random effects revealed individual participant as a random factor, we calculated a linear mixed effects model (Table 3 displays its summary). In post hoc analysis, we found a significant interaction effect between repetition and item type, F(1, 35) = 15.96, p < 0.001. This suggests that there is an improvement in memory with increasing training repetition, but only for the coarse and not for the fine events.

Figure 6. Learning curves after virtual training: Experts and novices show an improvement only for coarse event items, not for fine event items. Error bars reflect standard errors; significant trends are indicated by * (95% confidence level).
Table 3. Results of the mixed effects model for sensitivity A′ in the memory test
Predictor                             b       SE b    95% CI          t       p
Baseline A′                           0.75    0.08    0.57, 0.86      9.50    <0.001
Repetition                            −0.00   0.03    −0.07, 0.06     −0.24   0.811
Expertise                             −0.17   0.03    −0.38, 0.04     −1.59   0.121
Item type                             −0.43   0.10    −0.63, −0.23    −4.16   <0.001
Item type * repetition                0.18    0.05    0.09, 0.28      3.81    <0.001
Item type * expertise                 0.30    0.15    0.02, 0.58      2.08    0.040
Expertise * repetition                0.03    0.05    −0.06, 0.12     0.59    0.554
Item type * repetition * expertise    −0.09   0.07    −0.22, 0.04     −1.37   0.172

Further, there was a significant interaction effect between expertise and item type, F(1, 35) = 4.47, p = 0.042. Post hoc t tests indicate that experts performed marginally better in the fine event condition (M = 0.79, SD = 0.20) compared with the novices (M = 0.59, SD = 0.22), p = 0.096. In contrast, memory for the coarse event condition did not differ between experts (M = 0.64, SD = 0.26) and novices (M = 0.64, SD = 0.24), p = 0.500.

The experts' better memory performance for fine events suggests prior knowledge that is cued by particular automotive‐related objects.

Response time

Using a linear mixed effects model with individual participant as a random effect, we analysed whether the response time to the items in the memory test depended on the fixed effects item type, expertise, repetition, and sensitivity A′ (see Table 4 for the model summary). Response time (in seconds) decreased with repetition, M1 = 5.77 (SD = 3.00), M2 = 4.58 (SD = 1.94), and M3 = 3.99 (SD = 1.96), F(2, 68) = 25.77, p < 0.001. In addition, coarse events required longer response times than fine events, Mcoarse = 4.99 (SD = 2.84) and Mfine = 4.57 (SD = 1.99), F(1, 108) = 4.34, p = 0.040.

Table 4. Results of the mixed effects model for response time in the memory test
Predictor                             b       SE b    95% CI          t       p
Baseline                              6.26    1.04    4.27, 8.25      12.43   <0.001
Repetition                            −0.52   0.25    −1.00, −0.05    −2.10   0.037
Expertise                             −0.43   1.36    −3.11, 2.24     −0.32   0.751
Item type                             3.56    1.15    1.36, 5.76      3.10    0.002
Sensitivity A′                        0.08    1.15    −2.13, 2.28     0.07    0.945
Item type * repetition                −1.46   0.40    −2.22, −0.70    −3.69   <0.001
Item type * expertise                 −4.02   1.57    −7.04, −1.01    −2.56   0.011
Expertise * repetition                0.22    0.35    −0.46, 0.89     0.62    0.535
Item type * repetition * expertise    0.95    0.54    −0.09, 1.99     1.75    0.083

The significant interaction between item type and expertise, F(1, 102) = 9.02, p = 0.003, revealed that experts differed in response time for coarse versus fine events (Mcoarse = 6.30, SD = 3.31 and Mfine = 5.27, SD = 2.09), whereas novices did not (Mcoarse = 3.70, SD = 1.36 and Mfine = 3.86, SD = 1.62). Finally, response times decreased more clearly with repetition for coarse events than for fine events, F(2, 102) = 6.78, p = 0.002.

Initially, the responses for the coarse events were slower than for the fine events, but they improved rapidly with repetition. This indicates that the memory task was more difficult for the coarse events in the beginning, but then became easier with repeated training.

Virtual training performance

We also analysed performance in the virtual training itself. We performed two analyses of variance, separately for novices and experts, with the number of repetitions as the independent variable and execution time as the dependent variable. We report post hoc t tests comparing single conditions.

The time for a single virtual training session decreased significantly with repetition for the novices, F(1, 18) = 15.39, p < 0.001. Post hoc tests show a significant speed‐up from the second to the third training session, t(18) = 3.25, p < 0.01. We did not find such an effect for the experts, F(1, 17) = 0.21, p = 0.649.

In the final advanced mode training session, experts and novices made M = 3.9 (SD = 1.5) and M = 4.5 (SD = 1.9) errors, respectively, t(33) = −0.92, p = 0.36, when selecting the correct part from a virtual menu (Figure 4b). Qualitative inspection of the type of errors made by the participants revealed problems with choosing the correct screw, that is, a fine event. No participant failed to select the correct main object. Again, the novices were faster (M = 413.5 s, SD = 61.5) than the experts (M = 588.8 s, SD = 157.4) in the concluding virtual training, t(22) = 4.42, p < 0.001.
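The last comparison can be retraced from the reported summary statistics. The sketch below assumes that the test was Welch's unequal-variance t test and that all 18 experts and 19 novices contributed data; neither is stated explicitly in this section, and the function name is illustrative.

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, standard deviations, and sample sizes."""
    var_a = sd_a ** 2 / n_a  # squared standard error of mean A
    var_b = sd_b ** 2 / n_b  # squared standard error of mean B
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df

# Experts vs. novices, completion time in seconds (sample sizes assumed).
t, df = welch_t(588.8, 157.4, 18, 413.5, 61.5, 19)
print(f"t = {t:.2f}, df = {df:.1f}")  # t = 4.42, df = 21.8
```

Under these assumptions the sketch reproduces the reported t(22) = 4.42, which suggests that the degrees of freedom were adjusted for the clearly unequal variances of the two groups.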

4 GENERAL DISCUSSION

4.1 Theoretical implications

Our study showed that repeated, virtual step‐by‐step assembly training enhances the acquisition of new procedural knowledge. Most importantly, there was a hierarchical level effect: With repeated practice, the participants showed increasing memory performance for the coarse events but not for the fine events. When learning a new complex task, trainees appear to focus initially on the coarse events. This coarse information may serve as a frame for learning, in a subsequent step, the fine events that belong to it.

The hierarchical level effect adds to the event cognition literature by providing evidence that repeated practice influences memory for the hierarchical structure of activities. We conjecture that it can be explained by differences in saliency between successive fine events and successive coarse events. Consecutive fine events resemble each other because they share the same main object, whereas the main object differs between two coarse events. Because conceptual changes are bigger at coarse than at fine event boundaries, coarse boundaries attract more attention and are more likely to be encoded in long‐term memory. Thus, repeated practice strengthens the advantage of coarse event boundaries in long‐term memory representation.

The hierarchical level effect may also be explained by the assembly task's nature and the training design. First, decreased discrimination for fine events across repetitions indicated that they were more likely to be confused in the memory test. Objects marking coarse event boundaries were distinctive enough to be easily recognized in the video. Fine events involved smaller, less distinctive objects that were more similar to each other (e.g., screws). These objects competed with each other during the memory test, yielding memory interference (Radvansky & Zacks, 2014, p. 37). Second, the virtual training required going through the door assembly task in a fine‐grained step‐by‐step manner without pointing to the hierarchical organization. Participants elaborated fine events in a segmented way, which inhibited chunking of details (Zacks, Speer, et al., 2006; Zacks, Swallow, et al., 2006). The repeated execution might have reinforced confusion. Third, the virtual training setup is likely suitable for communicating declarative knowledge, that is, higher level concepts like main assembly steps primarily represented by coarse events. However, virtual simulations cannot teach detailed manual operations and motoric skills as successfully as training based on real hardware prototypes (Malmsköld et al., 2007).

Expertise had a positive effect on memory for fine assembly steps likely due to experts' previous experiences with several automotive objects and their respective assembly over years of professional life. However, we acknowledge that there is a confound between expertise and age, despite our best efforts in planning this study. Therefore, differences between expertise levels have to be interpreted with caution.

4.2 Practical implications

The results have implications for upcoming virtual training design and the optimal use of training time. If trainees initially learn the sequence of coarse events, the main focus of a virtual training setup should be on coarse information.

In virtual training, coarse information can be stressed by highlighting the main object continuously over a group of fine assembly steps (Baggett & Ehrenfeucht, 1991; Gobet et al., 2001; Zacks & Tversky, 2003). In addition, clear breaks at coarse event boundaries support communication of the higher and lower level content (Spanjers, van Gog, Wouters, & van Merriënboer, 2012). Furthermore, a less fine‐grained step‐by‐step training may promote a comprehensive, overall picture (Zacks, Speer, et al., 2006; Zacks, Swallow, et al., 2006). However, it has to be kept in mind that concise summaries may mislead viewers into adhering to even more summarized accounts (Schwan & Garsoffky, 2004b). We suggest taking the level of expertise into account when deciding between segmented and more summarized accounts. Because experts showed enhanced memory performance for fine events from the very beginning, they could especially benefit from more condensed virtual training procedures.

Another way of communicating hierarchical structure in virtual training environments is to present all fine events that belong together at the same time. Support for simultaneous rather than sequential presentation of similar material comes from perceptual learning (Gibson, 2000). For instance, the diagnostic feature‐detection hypothesis (Wixted & Mickes, 2014) states that simultaneous presentation of similar objects is advantageous because it offers an opportunity for stimulus comparison and discrimination.

4.3 Future research

The present results show pronounced differences between coarse and fine assembly steps. Learning success increases with each repetition for coarse but not for fine assembly steps. Future research is needed to explore how the learning success for fine assembly steps can be increased. Coarse assembly steps represent conceptual changes and are, thus, more salient than the fine assembly steps. Saliency of fine assembly steps, such as orienting and positioning an object or inserting and fixing screws, can be increased, for instance, through a voice instruction explicitly stressing the sequential number of screws to be inserted (see the verbal facilitation effect; Huff & Schwan, 2012). Another way of increasing attention for fine assembly steps is by using realistic animations of how to execute them in detail. Furthermore, implementation of intermediate knowledge tests may raise early awareness for the importance of fine assembly steps (Campbell, Trelle, & Hasher, 2014).

4.4 Conclusion

We investigated how repeated training and a task's hierarchical structure influence memory performance in both experts and novices. The three‐time virtual training of an automotive assembly task yielded successful acquisition of the sequence of the coarse events, but the memory for fine events did not improve. We refer to this observation as the hierarchical level effect. We conclude that event cognition research improves the understanding of how hierarchically structured activities are learned and can help streamline virtual training material.