
ORIGINAL ARTICLE

The role of scaffolding in improving information seeking in videos

Salomé Cojean

Corresponding Author

E-mail address: salome.cojean@univ‐rennes2.fr

Psychology of Cognition, Behavior and Communication Laboratory (LP3C), University of Rennes 2 Upper Brittany, Rennes, France

Eric Jamet

Psychology of Cognition, Behavior and Communication Laboratory (LP3C), University of Rennes 2 Upper Brittany, Rennes, France

First published: 04 September 2018

Abstract

Information seeking (IS) has become a critical activity in video‐based environments. Up to now, the effects of support on information seeking (i.e., scaffolding) have seldom been assessed. The twofold aim of the current study was to (a) assess the effects of scaffolding on IS in videos and (b) determine the characteristics of the users' mental models after an IS activity with or without scaffolding. We divided 50 participants into two groups that either did or did not benefit from initial scaffolding during an IS task. Both groups then had to perform a localization task without any further access to scaffolding. Results showed that scaffolding the video by providing a table of contents and markers on a timeline helped students to engage in highly efficient IS, but they had less accurate mental representations of the video than those without scaffolding. The hypothesis that scaffolding provides a usable but external model was therefore supported.

Lay Description

What is already known about this topic:

  • Videos are increasingly used in learning.
  • Searching for information in videos may be a complex activity.
  • So far, very few studies have been conducted on the potential effect of scaffolding videos during an information‐seeking (IS) task.

What this paper adds:

  • In this study, we put in place the scaffolding of the video by adding a table of contents and markers on the timeline.
  • Structuration and segmentation have positive effects on the performance in IS, in terms of response success, time spent on each search, relevance, and perceived difficulty.
  • Scaffolding has a negative impact on the users' internal representations of the video.

Implications for practice:

  • Incorporating a table of contents and a structured timeline into a video facilitates the search activity.
  • Without scaffolding, the search activity is longer and cognitively more costly.
  • Paradoxically, users have a poorer representation of the video (i.e., poorer mental model) after the IS task when they were given scaffolding.
  • Future studies should focus on the benefits of providing structuring and segmentation in learning tasks.

1 INTRODUCTION

1.1 The information‐seeking activity

The development of the Internet means that a huge mass of information is now easily accessible (Sharit, Hernández, Czaja, & Pirolli, 2008). As a result, people generally use the Internet to search for information (Lazonder & Rouet, 2008). According to Guthrie and Mosenthal (1987), people spend more time looking for information than reading the document they are working on. These authors defined the information‐seeking (IS) activity as relying on the ability of users to locate one particular item of information among others, in order to achieve an explicit goal. Some authors claim that this IS activity is becoming increasingly common in both professional and personal spheres of life (Wopereis, Brand‐Gruwel, & Vermetten, 2008) and in schools, when students do their homework, for example (Tsai, 2009). Furthermore, IS may be involved in information processing and thus be useful for learning (Merkt & Schwan, 2014; Puustinen & Rouet, 2009; Rieh, Collins‐Thompson, Hansen, & Lee, 2016; Rouet & Coutelet, 2008). The increasing importance of video‐based environments for educational outcomes (Delen, Liew, & Willson, 2014; Giannakos, 2013; Kay, 2012) means that IS must be taken into account when designing such environments (Dinet, Chevalier, & Tricot, 2012).

1.2 Improving information seeking in video‐based environments

According to Kuhlthau (1991), when individuals look for information in a document, they have to make a series of choices as to which information to retain and where to search. In educational contexts, students' ability to accurately locate information is a valuable skill (Lazonder, Biemans, & Wopereis, 2000; Lazonder & Rouet, 2008). In order to facilitate IS, some authors (Rieh et al., 2016) recommend that delivery systems should include tools that support users' exploration of those systems. One major constraint in videos is the transient nature of the information: The information that is presented rapidly disappears, to be replaced by new information (Wong, Leahy, Marcus, & Sweller, 2012). It requires costly and continuous processing in working memory (Hasler, Kersten, & Sweller, 2007; Merkt, Weigand, Heier, & Schwan, 2011). At present, however, the concept of support is not applied to video‐based environments, and no guidelines or templates are available to the creators of pedagogical documents (Chen & Wu, 2015; Ilioudi, Giannakos, & Chorianopoulos, 2013).

1.2.1 Scaffolding

One way of guiding and helping users to explore a video is to scaffold it. Scaffolding means providing tools that increase users' comprehension (Azevedo & Hadwin, 2005), but it can also improve their planning (Reiser, 2002, 2004), one of the steps involved in the IS process, as described below. Depending on the specific features of the video, there are two levels of information processing that can be scaffolded.

First, it is widely recommended in the literature that users be given some control over the information flow, by adding play, pause, rewind, or fast‐forward buttons, for example (Delen et al., 2014; Mayer & Chandler, 2001; Schwan & Riempp, 2004; Wouters, Tabbers, & Paas, 2007). This is said to enable users to adapt the information flow to their cognitive needs and abilities during learning tasks (Merkt et al., 2011; Wouters et al., 2007). Moreover, of particular relevance to the IS activity, videos can be segmented into relevant sections, allowing them to be navigated in a more relevant way (Biard, Cojean, & Jamet, 2018; Hasler et al., 2007; Spanjers, van Gog, Wouters, & Van Merriënboer, 2012; Wouters et al., 2007). Because they enhance information processing at a local level, activities such as play, pause, forward, rewind, and segmentation are called microlevel activities (Merkt et al., 2011) and promoting this type of activities is called microscaffolding.

Second, being aware of how a document is structured may promote users' IS activity (Puustinen & Rouet, 2009), but it still requires considerable cognitive resources (Sanchez, Lorch, & Lorch, 2001). Accordingly, showing users the structure of the document, by adding a table of contents identifying the different chapters, for example, provides better access to specific parts of the video (Zhang, Zhou, Briggs, & Nunamaker, 2006) and allows information to be located more easily (Guthrie & Mosenthal, 1987; Lorch, Lemarié, & Grant, 2011; Yussen, Stright, & Payne, 1993). Structuring enhances navigation at a more general level than microlevel activities do and therefore goes under the name of macrolevel activities (Chun & Plass, 1996; Merkt et al., 2011). Promoting this type of activity is known as macroscaffolding.

The hypothesis that scaffolding has a positive effect on IS in a video‐based environment has so far only been tested in a few studies (e.g., Cojean & Jamet, 2017; Merkt & Schwan, 2014). Merkt and Schwan (2014) examined the effects of microscaffolding and the combination of micro‐ and macroscaffolding on learning and IS. They defined microscaffolding as allowing participants to stop and browse the video and macroscaffolding as adding a table of contents and an index. Combining the two levels was beneficial for participants, in terms of the amount of information they found. More recently, Cojean and Jamet (2017) tested four conditions to assess the impact of each level of scaffolding on IS and the potential benefit when they were used in combination. Participants were asked nine questions whose answers were contained in a video. Results indicated specific effects of microscaffolding (i.e., markers on the video's timeline) on perceived control by the user during the task and of macroscaffolding (i.e., table of contents) on recall of the video's chapters and success on the task. Concerning response times, users in the two‐level scaffolding condition performed far better than the others, and their performance remained stable throughout, whereas users in the other three conditions improved over time, eventually reaching the point where they performed just as well as those in the two‐level scaffolding condition. Perceived difficulty in these three conditions was 72% greater than in the two‐level scaffolding condition. This study supported the hypothesis that micro‐ and macroscaffolding provide relevant segmentation and structuring and serve as conceptual models for users. When conceptual models are provided to users by the designer, they act as organizers, thereby promoting users' comprehension of the system and their performances while using it (Staggers & Norcio, 1993). The challenge is to promote the construction of a relevant mental model for a given IS activity.

1.2.2 Mental models

A mental model is a mental representation (Storey, Fracchia, & Müller, 1999) of a system's structure and internal relationships (Borgman, 1986). One of its major strong points is that it helps users to predict and anticipate the system's behaviour before acting on it (Borgman, 1986; Norman, 1983; Rowe & Cooke, 1995; Staggers & Norcio, 1993). Norman (1983) made a distinction between mental and conceptual models, claiming that a mental model is internal to the user and cannot be directly observed. Mental models contribute to the success of IS activity (He, Erdelez, Wang, & Shyu, 2008; Zhang & Chignell, 2001) and should even be regarded as subtending successful localization activity. Indeed, according to Sharit et al. (2008), the searching process can be broken down into three subprocesses: elaboration of a mental representation, planning, and execution. An appropriate representation (i.e., a relevant mental model) is therefore key to fast and accurate IS.

According to the literature, conceptual models can serve as a basis for users to construct their own internal mental models (Borgman, 1986; Seel, 2003). Conceptual models enhance users' comprehension (Staggers & Norcio, 1993) of a given system (e.g., video), such that the resulting mental models are more relevant than those constructed without any help (Norman, 1983). However, most studies so far have involved learning tasks. We therefore have yet to ascertain how conceptual models can benefit mental models during an IS task. The study conducted by Cojean and Jamet (2017) highlighted a positive impact of scaffolding (i.e., external conceptual model) on IS performance in a video‐based environment. Because the basis for a relevant IS is a mental representation of the video, there are two possible explanations for the positive effect of scaffolding. First, during an IS task, the external conceptual model may promote the construction of a relevant internal mental model, as described in the literature on learning tasks. Second, the scaffolding may simply act as an external aid, providing a temporary representation of the video—a representation that is not internalized by the user.

1.3 The current study

The current study was designed to (a) assess the effects of scaffolding on IS efficiency and (b) characterize users' internal mental models after performing this activity with or without scaffolding. We hypothesized that scaffolding in video‐based environments promotes IS by providing an external, usable conceptual model, whereas users who are not given any scaffolding have to construct their own mental models. These act as internal representations but are costly and time‐consuming to construct.

The first aim of the study was to confirm the positive effect of scaffolding on IS performances recently observed by Cojean and Jamet (2017). In their study, microscaffolding took the form of segmentation, with markers along the video's timeline to give users control over the information flow and aid their local navigation. Macroscaffolding involved structuring the video by adding a table of contents to promote users' general navigation. In the current study, we only compared two experimental conditions: without scaffolding and with scaffolding. In order to be able to carry out a temporal analysis, we asked participants a series of nine questions whose answers could all be found in the video. We formulated the following series of predictions:

Hypothesis 1. Success: We predicted that participants provided with scaffolding in the form of a conceptual model would perform the task better than participants with no scaffolding. The latter's performances would, however, improve across the questions (i.e., over time), reflecting the gradual construction of a relevant mental model.

Hypothesis 2. Response time: We predicted that participants provided with scaffolding in the form of a conceptual model would spend less time seeking information than participants with no scaffolding. The amount of time the latter spent on each question would decrease over time, for the same reason that performances would improve (see Hypothesis 1).

Hypothesis 3. Precision of the first click: We predicted that participants provided with scaffolding in the form of a conceptual model would make more precise first clicks (i.e., nearer the target segment) for each search question than participants with no scaffolding. The latter's first clicks would become more precise over time, reflecting the gradual construction of a mental model.

Hypothesis 4. Perceived difficulty: We predicted that, because participants without scaffolding would have no conceptual model to guide their activity or to use as a basis for constructing a relevant mental model, they would perceive the task to be more difficult than participants with scaffolding.

The second aim of the study was to examine the quality of the users' mental models after an IS activity, depending on whether or not they had received scaffolding. A relevant mental model is one that helps users predict what will happen when they interact with the system (Borgman, 1986; Norman, 1983; Staggers & Norcio, 1993). Providing scaffolding to users during an IS activity should be regarded as providing an external conceptual model they can rely on to construct a relevant mental representation of the document. This representation should promote IS. In the study by Cojean and Jamet (2017), participants in conditions where scaffolding was either totally or partially absent achieved similar results (i.e., they reported considerable perceived difficulty and their performances improved over time), supporting the hypothesis of the construction of an internal mental model during the task. This construction seemed to be cognitively costly but was ultimately just as efficient as scaffolding by the end of the task, in terms of performance. When micro‐ or macrolevel scaffolding was provided, users relied on it as a relevant external conceptual model and performed extremely well in terms of IS from the start of the activity. However, perceived difficulty only decreased when the two levels of scaffolding were combined. If the presence of a conceptual model (e.g., scaffolding) allows for the construction of a mental model, we would have expected perceived difficulty to undergo a gradual decrease across conditions (no scaffolding > one level of scaffolding > two levels of scaffolding). The authors concluded that when the conceptual model is sufficiently relevant to reduce the task's cognitive cost, it acts as an external representation and therefore does not support the construction of an internal mental model. In the present study, we administered a localization task after the IS task, to assess the quality of each participant's mental model of the video content. 
A relevant mental model should make a system's behaviour easier to predict (e.g., what information is likely to appear if I click here?). This qualitative model should therefore allow participants to find the requested information in the video, without relying on either scaffolding or feedback from the video.

Hypothesis 5. Localization: We predicted that as they would have constructed an internal mental model during the IS task, participants who had not received any scaffolding would perform better on the localization task than participants who had become accustomed to relying on scaffolding as an external conceptual model.

2 METHOD

2.1 Participants

A total of 50 students (43 women and seven men) from the University of Upper Brittany (France) took part in the study. Their mean age was 19.31 years (SD = 1.11). The experiment was conducted in accordance with the principles of the Declaration of Helsinki. All participants received a cinema ticket for their participation.

2.2 Materials and experimental design

2.2.1 IS task

We designed a specific learning environment to display the video (taken from the Canal U website; http://www.canal‐u.tv/). This video was about water in the universe (Doressoundiram, 2012) and lasted about 13 min. As on the website, the video was divided into 12 chapters, each with a different theme. In our design (see Figure 1), participants could browse the video with the mouse as much as they wanted, using a timeline below the video. A pause button allowed them to halt the video whenever they wanted to. Participants were evenly and randomly distributed across two experimental conditions: with or without scaffolding. In the condition with scaffolding (n = 25), a table of contents was displayed next to the video, and the timeline displayed 12 markers corresponding to the video's 12 chapters. Users could gain access to specific parts of the video by clicking on the corresponding chapter in the table of contents. In the condition without scaffolding (n = 25), there were no markers and no table of contents—only the video and the timeline below. Questions for the IS activity were displayed on the screen, to the left of the video (below or instead of the table of contents). A timer indicated how much time there was left to answer. Participants had 5 min to answer each question. Once they had typed an answer, they could move to the next question. If they did not answer, the following question automatically appeared. The nine questions were presented in random order.

Figure 1. Screenshots of the video‐based environment for the information‐seeking task in the conditions with (a) and without (b) scaffolding

2.2.2 Localization task

The second step of the current study was identical for all participants. Nobody received any scaffolding, and instead of a video, there was just a black screen and the timeline (see Figure 2). There were no markers on the timeline and no table of contents. Questions (e.g., “How far from the Sun is the frost line?” and “On which satellite of Jupiter is there an ocean of liquid water?”) were displayed on the left side of the screen, and participants had 5 min to answer, indicated by a timer. Participants were asked to answer eight questions (four were questions that had already been asked in the first IS step, and four were new ones). They had to indicate on the timeline where they thought the answer was in the video. The eight questions were presented in random order.

Figure 2. Screenshot of the video‐based environment for the localization task

2.3 Measures

2.3.1 Interest in the topic and perceived competence (control variables)

Before starting the IS task, participants had to rate their prior interest in the topic of the video (i.e., water in the universe) and their perceived competence in this topic, by answering two questions: “On a scale of 0 to 10, how interested are you in this topic?” and “On a scale of 0 to 10, how competent do you feel on this topic?” The aim was to ensure that participants were evenly distributed across the conditions on these two variables.

2.3.2 Successful responses

For a response to be considered correct, participants not only had to provide the right information but also had to indicate where it appeared in the video (i.e., noting the time when the information was explicitly presented). Concerning the time, we deemed a response to be correct if it was coherent (the information was delivered at a specific point, but participants sometimes gave the time for the end of what could be quite a lengthy sentence). When at least one of these two parts of the response was missing or wrong, the response was deemed to be incorrect.

2.3.3 Response times

Response times were calculated from when participants started searching (i.e., first click) to when they found the information (i.e., pressing the pause button or starting to type the answer). They were given 5 min to find the answer to each question. When participants failed to answer within 5 min, their response time was automatically counted as 300 s. A screen recorder was used to measure the duration of each information search.
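The timing rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, its arguments, and the use of `None` to mark an unanswered question are assumptions and are not taken from the study's materials.

```python
TIME_LIMIT_S = 300.0  # 5-min limit per question, as stated in the text


def response_time(first_click_s, found_s):
    """Return the search duration in seconds, capped at the time limit.

    first_click_s: timestamp of the first click (start of the search).
    found_s: timestamp at which the answer was found (pause or first
             keystroke), or None if no answer was given in time
             (hypothetical convention).
    """
    if found_s is None:
        # Unanswered questions are counted as the full 300 s.
        return TIME_LIMIT_S
    return min(found_s - first_click_s, TIME_LIMIT_S)
```

For example, `response_time(12.0, 57.5)` gives 45.5 s, whereas an unanswered question is counted as 300 s.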

2.3.4 Precision of the first click

The screen recorder was also used to note where participants made the first click on the timeline (or on the table of contents when available) during each search. As there were 12 segments (corresponding to the 12 chapters) on the video's timeline (present in the scaffolding condition but invisible without scaffolding), we counted the number of segments between the first click and the target segment (i.e., where the response was). The resulting score (distance score) reflected the distance of the first click from the target segment, with a short distance being synonymous with high precision.
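As a minimal sketch of how such a distance score could be computed, the Python code below maps a timestamp to a chapter and takes the absolute difference between chapter indices. Two assumptions are made purely for illustration: the 12 chapters are treated as equally long (65 s each over a 13-min video), which the article does not state, and a click inside the target chapter is taken to score 0.

```python
import bisect

# Hypothetical chapter start times: 12 equal chapters over 780 s (13 min).
# The real chapter boundaries are not given in the article.
CHAPTER_STARTS = [i * 65.0 for i in range(12)]


def segment_index(time_s, chapter_starts_s):
    """Return the 0-based index of the chapter containing time_s."""
    return bisect.bisect_right(chapter_starts_s, time_s) - 1


def distance_score(click_s, target_s, chapter_starts_s=CHAPTER_STARTS):
    """Number of chapters separating the click from the target chapter."""
    click_seg = segment_index(click_s, chapter_starts_s)
    target_seg = segment_index(target_s, chapter_starts_s)
    return abs(click_seg - target_seg)
```

Under these assumptions, a first click at 10 s for a target at 200 s falls three chapters away from the target, so the distance score is 3.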

2.3.5 Perceived difficulty

Three items of perceived difficulty (e.g., “I found this information‐seeking activity difficult” and “Searching for information was easy”) were adapted from previous studies (e.g., Kraft, Rise, Sutton, & Røysamb, 2005; Trafimow, Sheeran, Conner, & Finlay, 2002). Participants indicated how far they agreed with each item on a 7‐point Likert‐like scale. Cronbach's α was 0.88 for the perceived difficulty items used in this study.
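For readers unfamiliar with the reliability coefficient, the sketch below computes Cronbach's α from item scores using its standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The reverse-coding helper reflects an assumption that positively worded items (such as "Searching for information was easy") were reversed before scoring; the article does not describe this step, and the function names are illustrative.

```python
from statistics import variance


def reverse_code(score, scale_min=1, scale_max=7):
    """Reverse a rating on a 7-point scale (assumed preprocessing step)."""
    return scale_max + scale_min - score


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: list of k lists, each holding one item's scores across
           participants (same participant order in every list).
    """
    k = len(items)
    item_var_sum = sum(variance(scores) for scores in items)
    # Total score per participant = sum of that participant's item scores.
    totals = [sum(vals) for vals in zip(*items)]
    return k / (k - 1) * (1 - item_var_sum / variance(totals))
```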

2.3.6 Localization accuracy

To assess the quality of participants' mental models, we administered a localization task, in line with previous works conducted by Marchionini (1989) and He et al. (2008). This task measured the accuracy of the participants' predictions, as mental models have a predictive role. Participants were thus asked to infer what information could be delivered after an action in the system (i.e., the video). For each of the eight questions, we calculated a localization score, similar to the score for the precision of the first click. The resulting score (distance score) reflected the distance (in chapters) between the response given by the participant and the actual location of the information on the video's timeline.

2.4 Procedure

Three participants could perform the tasks at the same time. They were each installed at an isolated desk with headphones to listen to the video later. They began by answering the pretask questionnaire about their interest in the topic of the future video (i.e., water in the universe) and their perceived competence. The experimenter then described how the experiment would unfold, the instructions for searching for information in the video, and all the measures that would be made during the tasks. After that, the screen recorder was launched. Participants could click to start the task with the first IS question. A 5‐min timer indicated how much longer participants had to find the answer. After they had answered the nine questions, a new web page on the screen asked them to inform the experimenter, and they then gained access to a posttask questionnaire about perceived difficulty during the task. After this questionnaire, the experimenter gave participants the instructions for the localization task and let them start whenever they wanted to. For each of the eight questions (four were taken from the previous IS step, and four were created especially for this purpose), participants had 5 min to answer and could move on to the next question once they had indicated a response. After this second part of the study, participants each received a cinema ticket for their participation.

3 RESULTS

3.1 Control variables

Control variables corresponded to the two questions asked in the pretask questionnaire. The aim was to ensure that participants in the two experimental conditions (without or with scaffolding) did not differ in their interest and perceived competence in the video topic (ratings on a scale from 0 to 10). Analyses of variance revealed no significant differences between the without scaffolding and with scaffolding conditions on either prior interest (M = 5.76, SD = 1.92 vs. M = 5.88, SD = 1.76), F(1, 48) = 0.053, p = 0.819, partial η² = 0.00, or perceived competence (M = 2.4, SD = 1.63 vs. M = 3.04, SD = 1.86), F(1, 48) = 1.672, p = 0.202, partial η² = 0.03.
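The reported F values and effect sizes can be approximately recovered from the group means and standard deviations alone. The Python sketch below does this for a one-way comparison of two groups; it assumes 25 participants per condition (as stated in the Method) and works from the rounded descriptive statistics, so small discrepancies with the published values are to be expected. The function name is illustrative.

```python
def anova_two_groups(m1, sd1, n1, m2, sd2, n2):
    """One-way ANOVA for two groups from summary statistics.

    Returns (F, partial eta squared). With a single factor, partial
    eta squared equals SS_between / (SS_between + SS_within).
    """
    df_error = n1 + n2 - 2
    grand_mean = (n1 * m1 + n2 * m2) / (n1 + n2)
    ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
    ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    f_value = ss_between / (ss_within / df_error)  # df_between = 1
    partial_eta_sq = ss_between / (ss_between + ss_within)
    return f_value, partial_eta_sq


# Prior interest: without vs. with scaffolding.
f_interest, eta_interest = anova_two_groups(5.76, 1.92, 25, 5.88, 1.76, 25)
# Perceived competence: without vs. with scaffolding.
f_comp, eta_comp = anova_two_groups(2.4, 1.63, 25, 3.04, 1.86, 25)
```

With these inputs, the computed F values are close to the reported 0.053 and 1.672, and the effect sizes round to 0.00 and 0.03.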

3.2 Data analysis

During the IS activity, participants had to answer nine questions. Response success, time, and precision were measured for each of these questions, the aim being to track changes in performance over time during the task. In the second step, each participant had to answer eight localization questions. As all the questions were put to each participant, the independence assumption was violated (Field, Miles, & Field, 2012). We used linear mixed models (Gueorguieva & Krystal, 2004) to take the nonindependence of the data into account. This statistical method allowed us to test the effect of a variable by comparing nested models (i.e., one with and one without the variable; e.g., Baayen, Davidson, & Bates, 2008). We applied a chi‐square test to measure the difference between the two nested models (i.e., comparison of their deviances). The significance threshold for p values was set at α = 0.05. Every model included random effects of question and participant.
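The model comparison described above amounts to a likelihood ratio test: the difference between the deviances of two nested models is referred to a chi-square distribution whose degrees of freedom equal the number of added parameters. The Python sketch below covers only the one-parameter case (df = 1), for which the p value has the closed form erfc(√(χ²/2)). It does not fit the mixed models themselves, and the function names and the deviance values in the usage line are hypothetical.

```python
import math


def chi2_sf_df1(chi2):
    """Upper-tail probability of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(max(chi2, 0.0) / 2.0))


def likelihood_ratio_test(deviance_reduced, deviance_full):
    """Compare two nested models that differ by one parameter.

    Returns (chi-square statistic, p value).
    """
    chi2 = deviance_reduced - deviance_full
    return chi2, chi2_sf_df1(chi2)


# Hypothetical deviances chosen so that the difference is 4.156.
chi2, p = likelihood_ratio_test(100.0, 95.844)
```

As a check, a statistic of 4.156 on 1 df gives p ≈ 0.041 and a statistic of 7.917 gives p ≈ 0.005, which agrees with the values reported in the Results.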

3.3 IS activity

For each dependent variable, where we used linear mixed models (i.e., response success, response time, and precision of the first click), we assessed the effects of condition (i.e., scaffolding) and question rank (i.e., change over time). We therefore compared the effect of a baseline model (M0) with a model that included the condition variable (M1). We then compared the most satisfactory model (M0 or M1) with a model that included the additive effect of condition and question rank (M2) and a model that included the interaction effect of condition and question rank (M3).

3.3.1 Response success

Answers were coded either 0 (wrong answer) or 1 (right answer). In contrast to response times and precision, where we applied linear regression, here, we used logistic regression (Field et al., 2012), as the data were binomial. Results showed a significant effect of condition on success rate, χ²(1, N = 450) = 11.607, p < 0.001, but neither an additive effect, χ²(1, N = 450) = 0.327, p = 0.568, nor an interaction effect, χ²(1, N = 450) = 0.045, p = 0.833, of condition and question rank. Model M1 was therefore considered to be the best one. Descriptive statistics showed that participants seemed to have greater success when they were given scaffolding (see Figure 3).

Figure 3. Diagram showing response success according to question rank in the two experimental conditions

3.3.2 Response times

Results showed a significant effect of condition on response times, χ²(1, N = 450) = 30.173, p < 0.001, as well as both an additive effect, χ²(1, N = 450) = 12.315, p < 0.001, and an interaction effect, χ²(1, N = 450) = 7.917, p = 0.005, of condition and question rank. Model M3 was therefore considered to be the best one. Descriptive statistics showed that participants who were not given scaffolding spent more time on each question than participants who were given it, but also that the difference between the two conditions diminished over time (see Figure 4).

Figure 4. Diagram showing response times (in seconds) according to question rank in the two experimental conditions

3.3.3 Precision of the first click

Results showed a significant effect of condition on the distance score for the first click, χ²(1, N = 450) = 29.807, p < 0.001, as well as both additive, χ²(1, N = 450) = 21.872, p < 0.001, and interaction, χ²(1, N = 450) = 5.100, p = 0.024, effects of condition and question rank. Model M3 was therefore considered to be the best one. Descriptive statistics showed that participants who were not given scaffolding made less accurate first clicks than participants who were given scaffolding, but also that the difference between the two conditions became smaller over time (see Figure 5).

Figure 5. Diagram showing distance score according to question rank in the two experimental conditions

3.3.4 Perceived difficulty

An analysis of variance revealed a significant effect of condition, F(1, 48) = 4.742, p = 0.034, partial η² = 0.09 (see Table 1 for descriptive statistics). The data showed that participants who were not given scaffolding perceived the task to be more difficult than participants who were given scaffolding during IS.

Table 1. Descriptive statistics for perceived difficulty
Condition            M     SD
With scaffolding     2.53  0.98
Without scaffolding  3.19  1.14

3.4 Localization task

The localization score was a margin of error score rather than a precision one. Analyses of the localization task showed a significant effect of condition, χ²(1, N = 400) = 4.156, p = 0.041, though neither an additive, χ²(1, N = 400) = 0.045, p = 0.832, nor an interactive, χ²(1, N = 400) = 0.035, p = 0.852, effect of question rank. Model M1 was therefore considered to be the best one. Descriptive statistics showed that participants' localization seemed to be more accurate when they had not previously been given scaffolding (i.e., smaller margins of error; see Table 2).

Table 2. Descriptive statistics for mean localization score (margin of error)
Condition            M     SD
With scaffolding     1.97  0.53
Without scaffolding  1.59  0.57

Lastly, we calculated the correlation between the total duration of exposure to the video during the first (IS) step (i.e., total duration of response times) and the mean localization score (distance in chapters) during the second step. This correlation was significant, but the coefficient indicated only a weak relationship between the variables, r = −0.334, p = 0.018.
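To see why a significant correlation can still be described as weak, it helps to convert r into a t statistic and into a proportion of shared variance. The Python sketch below does both, assuming that all 50 participants entered the correlation (the article does not state the n for this analysis explicitly). The function name is illustrative.

```python
import math


def correlation_summary(r, n):
    """Return (t statistic, degrees of freedom, r squared) for Pearson's r."""
    df = n - 2
    t = r * math.sqrt(df / (1 - r ** 2))
    return t, df, r ** 2


t_stat, df, r_squared = correlation_summary(-0.334, 50)
```

Under this assumption, t(48) ≈ −2.46, which is consistent with a p value just under 0.02, while r² ≈ 0.11 indicates that exposure time shares only about 11% of its variance with the localization score.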

4 DISCUSSION

The aim of the current study was to confirm previous findings of a positive effect of scaffolding on IS in video‐based environments and to investigate the characteristics of users' mental models in this context. Our main hypothesis was that scaffolding enhances IS but does not allow the external conceptual model that is provided to be internalized as a mental model. We tested the assumption that without scaffolding, the construction of an internal mental model is long and cognitively costly, but performances gradually improve as the IS questions are answered.

Concerning the first part of the study (i.e., IS activity), in accordance with Hypothesis 1, results showed an effect of condition on response success, indicating that participants who were given scaffolding found more correct responses in the video than participants who were not given it. The absence of an interaction effect indicated that the performances of participants without scaffolding remained below those of participants with scaffolding, even at the end of the task. Concerning response times, results showed that participants in the no scaffolding condition spent more time searching for information than participants with scaffolding, although this difference between conditions tended to decrease over time, validating Hypothesis 2. Scaffolding helped participants to make more precise first clicks, but the difference between conditions again tended to decrease over time, consistent with Hypothesis 3. At the end of the task, participants with scaffolding reported significantly less perceived difficulty than participants without scaffolding did, thus confirming Hypothesis 4. All these results were consistent with previous studies of IS activity, which reported a positive effect of scaffolding (e.g., Merkt & Schwan, 2014) and changes over time in the performances of users who were not given micro and/or macroscaffolding (Cojean & Jamet, 2017).

In the second part of the study, we tested the hypothesis of a noninternalized mental representation with scaffolding and the construction of an internal mental model in the absence of scaffolding. Results on perceived difficulty in the first part were consistent with this hypothesis. Furthermore, results on the localization task showed that participants who were not given scaffolding provided more precise responses (i.e., smaller margin of error) than participants who were given scaffolding during the prior IS activity, thus supporting Hypothesis . The absence of an effect of question rank may have been due to the absence of feedback in the localization task, which precluded the possibility of learning. The results of this study cannot be equated with those in the literature on learning tasks. During IS, the conceptual model is used not so much as a basis for constructing a more relevant mental model but as a temporary and external representation of the video‐based environment. When the scaffolding was removed, participants became less able to predict the future behaviour of the system (i.e., the video‐based environment) than participants who had never been given scaffolding. This result can be interpreted in terms of desirable difficulty: When scaffolding is missing, users have to make greater efforts to perform well and therefore engage in more relevant mental activities. This could explain why, by the end of the IS task, they had a better mental model of the video content (E. Bjork, Little, & Storm, 2014; R. Bjork, 1994; Dobson, 2011).

One possible drawback to this interpretation is that the more precise localization responses of participants who had not been given any scaffolding in the first (IS) step may have been due to the extra time spent viewing the video to find the information (see response time score) and not solely to the construction of a mental model. We tried to exclude this possibility by calculating the correlation between the total exposure time to the video in the first step and the mean localization score in the second step. This correlation was statistically significant but weak (r = −0.334, p = 0.018). The amount of time spent on the video therefore did not satisfactorily explain the quality of the mental model, a result that favoured our main hypothesis.
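To make the size of this relationship concrete, the reported coefficient can be converted into a proportion of shared variance. The short Python sketch below does this for r = −0.334. It also computes the corresponding t statistic, under the assumption (not stated in this section) that all 50 participants mentioned in the abstract entered the correlation. The variable names are illustrative only.

```python
import math

# Value reported in the text: Pearson correlation between total exposure
# time (IS step) and mean localization score (localization step).
r = -0.334

# Assumption: all 50 participants entered the correlation; the text does
# not state the sample size used for this particular test.
n = 50

# Coefficient of determination: the share of variance in localization
# scores that is linearly associated with exposure time.
shared_variance = r ** 2

# t statistic for testing r against zero, with n - 2 degrees of freedom.
df = n - 2
t_stat = r * math.sqrt(df / (1 - r ** 2))

print(f"r^2 = {shared_variance:.3f}")
print(f"t({df}) = {t_stat:.2f}")
```

Under these assumptions, exposure time accounts for roughly 11% of the variance in localization scores, leaving most of the variance unexplained by viewing time alone.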

A further limitation was the actual measure of the mental model. As a mental model is not directly observable, its presence has to be inferred from performances on a different measure (He et al., 2008; Rowe & Cooke, 1995; Staggers & Norcio, 1993; X. Zhang & Chignell, 2001). Authors have used several different ways of measuring mental models, relying, for example, on the verbalization of strategies or knowledge about a given topic (e.g., Azevedo, Cromley, & Seibert, 2004; Chi, Leeuw, Chiu, & LaVancher, 1994; Slone, 2002). In the current study, we drew on a study by Marchionini (1989), where participants were asked to predict what would happen in a specified situation without being given any relevant feedback.

Finally, video‐based environments are used not only for IS tasks but also in learning contexts. Future studies should therefore focus on the benefits of scaffolding for learning. IS and learning are closely linked, as information processing generally begins with the localization of the relevant information. Merkt and Schwan (2014) showed that information that is easy to find is recalled better in learning tests. The presence of scaffolding in an IS task helps to improve users' performances. The participants who were given scaffolding in our study responded to the first question with a degree of efficiency that participants without scaffolding only achieved for the ninth question. We can therefore assume that scaffolding has a positive impact on learning, although the current study also demonstrated that when IS is scaffolded, users do not construct an internal mental model of the video. There is considerable evidence that deep learning is based on efficient mental models (e.g., Hegarty, 2014; Mayer, Mathias, & Wetzell, 2002). We should therefore focus on the reasons why scaffolding (i.e., proposed conceptual model) in video‐based environments is not well internalized during an IS activity and how it can be used to improve learning. Mayer (2004) reminds us that to build coherent and organized knowledge, students must engage in an active process of learning. We can assume that when there is no learning instruction and no need to build a mental model (e.g., to perform an IS activity), the scaffolding is not actively internalized. By contrast, when learning is explicitly required, the scaffolding is presumably internalized and has a positive effect on learning outcomes. Future studies should also focus on the video's content and how it may influence users' performances. This would show whether the results of the present study can be generalized. In the literature, research has been conducted on ways of helping users deal with the content's complexity (e.g., Mayer, 2008) or on the interaction between presentation format and user characteristics (e.g., expertise; Khacharem, Zoudji, Kalyuga, & Ripoll, 2013).

5 PRACTICAL IMPLICATIONS

In an IS task performed in a video‐based environment, the goal is to find the relevant information in an efficient and effective way. This type of environment therefore needs to be tailored to users' needs. The results of the present study clearly showed that scaffolding the video by providing a table of contents and markers along the timeline is a highly efficient solution. Results in this condition were better than in the condition without scaffolding, in terms of response success, time spent on each search, precision, and perceived difficulty. When scaffolding is missing, users are more likely to develop a relevant mental model of the video content. However, it should be emphasized that, from a practical point of view, students are rarely asked to find answers to nine IS questions within the same video. In classic conditions, therefore, where there is less extensive interaction with the video content, they are unlikely to be able to construct a relevant mental model of the video. We therefore recommend that the designers of video‐based environments consider adding scaffolding. In the case of learning goals, more research is needed to investigate the role of scaffolding further.

ACKNOWLEDGEMENTS

This work was supported by the CominLabs laboratory of excellence funded by the French National Research Agency (Ref. ANR‐10‐LABX‐07‐01) and by the Brittany Region (France).

The authors certify that there was no financial or personal interest that could have influenced their objectivity in this study.

NOTE

1 We also conducted an analysis from which failures (i.e., failing to answer a search question within 5 min, coded 300 s) were excluded. Results were comparable, showing a significant effect of condition, χ²(1, N = 425) = 32.139, p < 0.001, on response times, and both an additive effect, χ²(1, N = 425) = 10.664, p = 0.001, and an interaction effect, χ²(1, N = 425) = 6.524, p = 0.011, of condition and question rank. Model M3 was therefore considered to be the best one.