Toward an Instructionally Oriented Theory of Example-Based Learning



Learning from examples is a very effective means of initial cognitive skill acquisition. There is an enormous body of research on the specifics of this learning method. This article presents an instructionally oriented theory of example-based learning that integrates theoretical assumptions and findings from three research areas: learning from worked examples, observational learning, and analogical reasoning. This theory has descriptive and prescriptive elements. The descriptive subtheory deals with (a) the relevance and effectiveness of examples, (b) phases of skill acquisition, and (c) learning processes. The prescriptive subtheory proposes instructional principles that make it possible to fully exploit the potential of example-based learning.

1. Introduction

This article presents a theory of example-based learning that integrates the theoretical assumptions from three research areas: learning from worked examples (WE), observational learning (OL), and analogical reasoning (AR). The present theory is mainly rooted in research on learning from WEs (e.g., Atkinson, Derry, Renkl, & Wortham, 2000; Cooper & Sweller, 1987; Paas & van Gog, 2006; VanLehn, 1998). Such examples consist of a problem statement and its solution. A typical WE also includes solution steps leading to the final solution. Such examples are commonplace in well-structured domains such as mathematics or physics. Fig. 1 provides an illustrative WE. Note that the notion of a WE does not refer to examples without solutions (e.g., illustrating just a concept) or to pure problem solving of exemplary tasks.

Figure 1.

A worked example in a “double sense”: Screenshot from an example-based learning program for teachers on how to design effective WEs for high-school students (Hilbert, Renkl, Schworm, Kessler, & Reiss, 2008b).

We include OL and AR to enhance our theory because these two approaches and WE research share striking commonalities (for relations between OL and WE research, see also van Gog & Rummel, 2010). All three approaches emphasize the relevance (a) of example cases for learning, (b) of providing multiple examples, (c) of connecting the example cases to underlying principles, and (d) of facilitating learner activities directed toward connecting example cases and principles. It is worth noting that the selection of research areas to be considered is a “bandwidth-fidelity dilemma.” Including many research traditions may make the theory more integrative, but it can also lead to a more superficial treatment of each approach. We have thus decided to disregard other research traditions that may occur to the reader, for example, case-based reasoning (e.g., Kolodner, 1993) or cognitive apprenticeship (Collins, Brown, & Newman, 1989). A major reason for excluding the latter two research areas is also that they have not yielded a substantial body of well-controlled (experimental) studies on human example-based learning processes and instructional principles at the granularity of the present theory.

2. Three perspectives on example-based learning

Before presenting the integrative theory, we discuss the three research areas that form the theory's basis. Each discussion focuses on basic assumptions on the relevance and effectiveness of examples, phases of skill acquisition, and learning processes (see also Table 1).

Table 1. Overview of the main assumptions on example-based skill acquisition in three research traditions

Initial skill acquisition

  Worked examples: Examples as important source of learning in initial skill acquisition
  Observational learning: Models as important source of learning in initial skill acquisition
  Analogical reasoning: Solved problem cases as important source of learning in initial skill acquisition

Relevance of examples

  Worked examples: Problem solving in contrast to examples: overloading working memory
  Observational learning: Learning by doing in contrast to models: slow, error-prone, and inefficient
  Analogical reasoning: Analogical examples help out when learners have little domain knowledge

Effectiveness of examples

  Worked examples: Reducing the learning-irrelevant cognitive load; affordance for self-explanation and schema construction
  Observational learning: Humans have advanced capacity for observational learning from exemplary models
  Analogical reasoning: Relying on analogical examples as the core of human cognitive capabilities and, in particular, of effective learning

Phases of skill acquisition

  Worked examples, Anderson et al.:

  1. Analogy
  2. Abstract declarative rules
  3. Proceduralized rules
  4. Retrieval

  Worked examples, VanLehn:

  1. Principle encoding
  2. Learning to solve problems and repairing knowledge gaps
  3. Automation

  Observational learning, Schunk & Zimmerman:

  1. Observational
  2. Emulative
  3. Self-controlled
  4. Self-regulated

  Analogical reasoning, Holyoak (subcomponents of initial skill acquisition phases):

  1. Encoding source examples
  2. Activation and selection of source examples
  3. Mapping source examples to transfer problems
  4. Schema construction

Central learning processes

  Worked examples: Self-explanation (earlier stages); schema construction (earlier stages); practice for automation (later stages)
  Observational learning: Attention; retention; reproduction; motivation
  Analogical reasoning: See above: The phases are also conceived as processes

2.1. Learning from worked examples

Learning from WEs means that learners study (usually several) problems for which the solution is given (Renkl, 2005) before they are confronted with problem-solving demands. Learners should acquire a basic understanding of domain principles while studying examples, which provides a basis for later meaningful problem solving.

2.1.1. Relevance and effectiveness of examples

Sweller and Cooper (1985) conducted seminal research on WEs in initial cognitive skill acquisition, arguing that the usual method of problem solving directs attention to search processes but not to aspects directly relevant to schema acquisition (i.e., learning). WEs, in contrast, enable learners to acquire knowledge about problem states, operators, and the consequences of the application of operators integrated into schemas that are applicable for later problem solving. As Sweller and Cooper anticipated motivational problems when only those WEs are provided that do not induce activity, they used (isomorphic) example-problem pairs: If learners expect a similar problem to be solved, they are motivated to process the example.

Sweller and Cooper (1985) compared conventional problem-solving conditions (i.e., introductory WE plus problems to be solved) with WE conditions (i.e., introductory example plus a series of example-problem pairs) in algebra learning. The WEs reduced learning time as well as the number of errors and the time required for working on isomorphic test problems (see also Cooper & Sweller, 1987). Note that learning from WEs in that sense means that several examples are employed, in contrast to the more traditional approach of providing just one example. In other words, the use of multiple examples is inherent to this learning approach.

The findings on favorable learning outcomes from studying WEs have often been replicated (e.g., Carroll, 1994; Cooper & Sweller, 1987; Crippen & Earl, 2007; Hilbert & Renkl, 2009; Paas & van Merriënboer, 1994), even when learning from WEs was experimentally compared to well-supported learning by problem solving (e.g., Schwonke et al., 2009b). In a field study by Zhu and Simon (1987), a 3-year mathematics curriculum was taught in 2 years by using example-based materials for the standard curriculum (2 years of algebra and 1 year of geometry), with an achievement level that slightly exceeded even that under the standard instructional routine. Moreover, several more rigorously controlled laboratory studies have demonstrated an efficiency advantage of learning from WEs (i.e., less learning time for the same outcome; Salden, Koedinger, Renkl, Aleven, & McLaren, 2010).

Although factors moderating the effectiveness of example-based learning will be discussed in detail later on (see the theory's prescriptive part on instructional principles), one crucial issue should already be mentioned here. The classic study by Chi, Bassok, Lewis, Reimann, and Glaser (1989) and many follow-up studies (e.g., Atkinson, Renkl, & Merrill, 2003; Bielaczyc, Pirolli, & Brown, 1995; Hausmann & VanLehn, 2007; Renkl, 1997) demonstrated that the effectiveness of WEs depends on the learners' self-explanation activities. These activities refer to elaborations, whereby the learners explain the rationale of example solutions to themselves. Sophisticated spontaneous self-explanations (Renkl, 1997) as well as prompted or trained self-explanations (Atkinson et al., 2003; Renkl, Stark, Gruber, & Mandl, 1998) lead especially to a superior transfer to novel problems (e.g., Atkinson et al., 2003; Hilbert & Renkl, 2009). Learners are sometimes incapable of providing productive and correct self-explanations (e.g., Berthold & Renkl, 2009; Renkl, 2002). Help in the form of instructional explanations therefore makes sense (Wittwer & Renkl, 2010). However, as long as learners can explain worked solutions on their own, they should do so (Schworm & Renkl, 2006).

Even if there is strong evidence of a WE effect, there are exceptions to the main set of findings. However, the absence of positive effects can usually be explained very well based on research findings. For example, a metacognitive condition outperformed a WE condition in Mevarech and Kramarski (2003). However, the example condition provided the student with just one example (per problem category) followed by four practice problems. Such a condition corresponds to the control condition (i.e., problem-solving condition) in the studies by Sweller and Cooper (1985). Hence, Mevarech and Kramarski did not implement learning from examples as defined in the present article. Kalyuga, Chandler, Tuovinen, and Sweller (2001) found no example effect in more experienced learners, which is not surprising because WEs aim to support just initial, not advanced skill acquisition.

The classic explanation for the effectiveness of WEs stems from cognitive load theory (e.g., Paas & van Gog, 2006; Sweller, van Merriënboer, & Paas, 1998). When beginners solve problems, they typically search for solutions via means-ends analysis. They focus their attention on specific features of the problems to reduce the difference between current problem states and goals rather than on schema-relevant principles. Moreover, the task of reducing the difference between problem states requires learners to maintain subgoals and consider different solution options, which can result in a heavy cognitive load. Accordingly, problem solving induces extraneous cognitive load (here: load irrelevant to schema construction) and might even contribute to cognitive overload. WEs reduce extraneous load because there is no need to search for specific solutions. Hence, enough working memory capacity can be devoted to constructing a schema for later problem solving. For this reason, WEs foster initial cognitive skill acquisition.

Renkl and Atkinson (2007) argued similarly, but with less emphasis on means-ends analysis. If “premature” problem solving is required, the learners typically lack understanding of the domain principles and their application. Hence, they have to use shallow strategies, for example, a key word strategy (i.e., selecting a procedure by a key word in the cover story of a problem), a copy-and-adapt strategy (i.e., copying the solution procedure from a presumably similar problem and adapting just the numbers), or a means-ends analysis focusing on superficial problem features. Due to their lack of understanding, they are unable to employ the principles to be learned in their problem-solving efforts (cf. VanLehn et al., 2005). However, employing shallow strategies for problem solving does not deepen domain understanding and can therefore be classified as extraneous activities with respect to gaining understanding. WEs free the learners from such extraneous activities. They leave cognitive resources for self-explanation, that is, for explicating the rationale of solutions, especially in reference to underlying principles. Once the learners have understood the domain principles and their applications, they should be encouraged to solve problems requiring the application of these principles.

2.1.2. Phases of skill acquisition

There are two phase models with special emphasis on learning from WEs (Anderson, Fincham, & Douglass, 1997; VanLehn, 1996). Anderson et al. (1997) proposed four phases within their ACT-R framework. First, learners solve problems by analogy. They refer to known examples and relate them to the problem to be solved. Second, learners develop abstract declarative rules that guide their problem solving. After practicing, they move to the third phase, in which performance becomes smooth and rapid without relying on many attentional resources; proceduralized rules are formed. In the fourth phase, learners have many examples in long-term memory. They are thus often able to retrieve a solution directly from memory. These phases overlap because of the learners' flexibility in using different methods (e.g., analogy or abstract rule), depending upon the specific problem's familiarity.

VanLehn (1996) distinguished three phases. During the “early phase,” learners attempt to gain basic understanding of the domain without necessarily striving to apply the acquired knowledge. This phase corresponds to the study of instructional materials about domain principles. During the “intermediate phase,” learners become acquainted with how to solve problems, for example, by studying worked solutions. Here, learning is focused on how abstract principles are used to solve concrete problems. One potential outcome of this phase is that flaws in the knowledge base—such as the lack of certain elements or relations as well as misunderstandings—are corrected. When learning from WEs, learners first study a sample of examples before turning to problem solving in this phase. It is important to note that the construction of a sound knowledge base is not a quasi-automatic by-product of studying examples or solving problems. In fact, learners have to reason about the rationale of the solutions and the structure of examples. Hence, self-explanations are crucial for effective learning (e.g., Chi et al., 1989). In the “late phase,” practice enhances speed and accuracy. Now problem solving as opposed to self-explaining examples is crucial. As in the Anderson et al. (1997) model, the phases have no precise boundaries, especially when acquiring a complex cognitive skill involving multiple subcomponents. In this case, learners may be entering the late phase in the acquisition of one subskill while operating in the intermediate phase with respect to another subskill.

A smooth transition from example study to problem solving in later phases of skill acquisition can be implemented in several ways. For example, learners can be invited to first study solution steps and then imagine their performance as a first step toward their own problem solving (Cooper, Tindall-Ford, Chandler, & Sweller, 2001). Another option is to make the WEs more and more incomplete so that the learners take over an increasing share of the problem-solving demands (Renkl & Atkinson, 2003).
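The idea of making WEs more and more incomplete can be sketched in a few lines of code. The following is only an illustration under stated assumptions: the function names `fade` and `fading_sequence` are hypothetical, and blanking the steps from the end of the solution first is an assumption made here for simplicity, not a procedure specified in the text above.

```python
def fade(steps, n_omitted):
    """Return a worked solution in which the last n_omitted steps are left blank.

    Assumption: steps are blanked from the end of the solution; the source text
    only states that examples become more and more incomplete.
    """
    n_omitted = max(0, min(n_omitted, len(steps)))
    n_shown = len(steps) - n_omitted
    return steps[:n_shown] + ["____"] * n_omitted


def fading_sequence(steps):
    """Return versions ranging from a complete example to a problem to be solved."""
    return [fade(steps, n) for n in range(len(steps) + 1)]


if __name__ == "__main__":
    solution = ["2x + 4 = 10", "2x = 6", "x = 3"]
    for version in fading_sequence(solution):
        print(version)
```

The first version in the sequence is a complete WE; the last one leaves all steps to the learner and thus corresponds to a conventional problem.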

2.1.3. Learning processes

Cognitive load theory—as a major theory in WE research—assumes that learning from examples is effective because it fosters the construction of schemas that can be used to solve related problems (e.g., Sweller et al., 1998). During further learning, these schemas become automated. This schema assumption is shared with many prominent and tried-and-tested theories on learning and problem solving (e.g., Gick & Holyoak, 1983; Reed, 1993). Schemas refer to abstract, skeleton-like knowledge structures induced from concrete experiences, for example, from several WEs of a certain problem category (e.g., probability problems with “order relevant” and “without replacement”). A schema prescinds from the surface features of problems (e.g., number, cover story) that do not determine the appropriate solution procedure. Instead, it contains the structural features of problems (e.g., “order relevant” and “without replacement”) which determine the correct solution procedure. Beyond these basic assumptions, cognitive load theory is not very specific about the learning processes that lead to schema construction and automation.
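The distinction between structural and surface features can be made concrete with the probability example mentioned above. The following sketch is merely illustrative: the function names, the dictionary layout, and the two cover stories are assumptions introduced here. It shows that only the structural features ("order relevant," "with/without replacement") select the counting procedure, whereas the cover story plays no role.

```python
from math import comb, perm


def count_outcomes(n, k, order_relevant, with_replacement):
    """Count the possible outcomes when k items are drawn from n distinct items.

    The two Boolean arguments are the structural features; they alone
    determine which counting procedure applies.
    """
    if order_relevant and with_replacement:
        return n ** k
    if order_relevant:
        return perm(n, k)          # n! / (n - k)!
    if with_replacement:
        return comb(n + k - 1, k)  # multisets of size k
    return comb(n, k)


def solve(problem):
    """Apply the schema: the surface feature 'cover_story' is ignored."""
    return count_outcomes(
        problem["n"],
        problem["k"],
        problem["order_relevant"],
        problem["with_replacement"],
    )


# Two hypothetical problems that differ only in their surface features.
race = {"cover_story": "first three places among 8 runners",
        "n": 8, "k": 3, "order_relevant": True, "with_replacement": False}
code = {"cover_story": "three-key code from 8 keys, no key used twice",
        "n": 8, "k": 3, "order_relevant": True, "with_replacement": False}

if __name__ == "__main__":
    print(solve(race), solve(code))
```

Both problems yield the same count because they share the same structural features, although their cover stories differ.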

Anderson et al. (1997) do not emphasize schemas, but rather the formation of mental “if-then” rules. The “then” part refers to an action (i.e., operator, solution procedure) and the “if” part specifies when this action is executed. Note, however, that Anderson et al. also assume schema formation if the if-parts are made more abstract (e.g., Anderson & Lebiere, 1998). Another aspect added to cognitive load theory is the assumption that specific exemplars are stored in memory; they are used to solve identical problems in the future.
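The two ideas in this paragraph, "if-then" rules and stored exemplars, can be illustrated with a small sketch. It is not an implementation of ACT-R; the class `Rule`, the function `solve`, and the geometry problems are hypothetical and serve only to show how an "if" part selects an action and how a stored exemplar allows an identical problem to be answered by retrieval.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # the "if" part: when the action applies
    action: Callable[[dict], float]    # the "then" part: the solution procedure


def solve(problem: dict, rules: List[Rule], memory: Dict[tuple, float]) -> Tuple[float, str]:
    """Answer by retrieval if an identical problem is stored, else by rule."""
    key = tuple(sorted(problem.items()))
    if key in memory:
        return memory[key], "retrieval"
    for rule in rules:
        if rule.condition(problem):
            answer = rule.action(problem)
            memory[key] = answer  # store the solved problem as an exemplar
            return answer, rule.name
    raise ValueError("no rule applies to this problem")


rules = [
    Rule("rectangle-area",
         lambda p: p["shape"] == "rectangle",
         lambda p: p["width"] * p["height"]),
    Rule("triangle-area",
         lambda p: p["shape"] == "triangle",
         lambda p: 0.5 * p["width"] * p["height"]),
]

if __name__ == "__main__":
    memory: Dict[tuple, float] = {}
    problem = {"shape": "rectangle", "width": 3, "height": 4}
    print(solve(problem, rules, memory))  # solved by applying a rule
    print(solve(problem, rules, memory))  # identical problem: solved by retrieval
```

In this sketch, the first encounter with a problem is handled by a rule, and the stored result makes a second encounter with the identical problem a matter of direct retrieval.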

Important processes when learning from WEs are described by self-explanation models. WEs are usually incomplete, especially with respect to the solution rationale (Chi et al., 1989). Hence, learners need to self-explain the solution to gain profound understanding (Nokes, Hausmann, VanLehn, & Gershman, 2011). In addition, part of the self-explanation effect seems to be due to the fact that learners actively generate these explanations (Hausmann & VanLehn, 2007). Without self-explanations, examples are just stored in a “verbatim,” superficial manner (Reimann, 1997). When underlying principles are considered in self-explanation, the examples are stored according to their structural, solution-relevant aspects (i.e., underlying principles; Chi, 2000, assumes additional functions of self-explanations; however, they are mainly justified in reference to text learning).

The learners' self-explanations can be indirectly fostered or hindered by the type of examples presented. For example, making single steps and their meanings salient can elicit self-explanation on these steps (Catrambone, 1996). Sets of to-be-compared examples can be used so that the learners' self-explanations are directed toward critical features (e.g., different solution of mathematics word problems despite (almost) the same cover story; see Guo, Pang, Yang, & Ding, 2012). Example sets sometimes include erroneously worked WEs to show the learners typical errors so that they can be avoided later (Booth, Lange, Koedinger, & Newton, 2013; Große & Renkl, 2007). Finally, WEs lose their effectiveness if two information sources (e.g., text and pictures) are difficult to interrelate so that extraneous load is induced, which hinders self-explanations (split-attention effect; e.g., Tarmizi & Sweller, 1988).

In summary, WE research has provided convincing evidence of the relevance and effectiveness of learning from WEs. Models of skill acquisition have been proposed and important learning processes (i.e., self-explanations) identified. Nevertheless, there are gaps in this approach, as will be shown in the following two sections.

2.2. Observational learning from models

Observational learning refers to learning by observing other persons' behavior (i.e., models). Bandura (1986) assumes in his well-known sociocognitive learning theory that for learning to occur, the learner has to pay attention to relevant behavior (attention), encode and remember the demonstrated activity (retention), and be able to reproduce that type of behavior (reproduction); in addition, the learner needs to be motivated to reproduce the behavior (motivation). Bandura described different observational effects (e.g., learning effects, response facilitation effects, and arousal effects) in a variety of areas (e.g., language acquisition, learning related to clinical disorders, and motor skills).

Here, the focus is on academic learning by abstract modeling, in which people are typically enacting or demonstrating the target skill. The term model does not primarily refer to the specific person demonstrating a skill, but to the example the person provides when solving a problem. Abstract modeling refers to the acquisition of cognitive skills based on underlying (abstract) rules or principles exemplified by models—in contrast to models that can be more or less mimicked, that is, copied in a “verbatim” manner without considering underlying rules (e.g., motor skills). We consider primarily theory elements related to academic learning. Note that two types of models are employed: Mastery models show only a correct, smooth performance; coping models also show impasses and how to overcome them.

2.2.1. Relevance and effectiveness of examples

In his book Psychological Modeling, Bandura (1971) wrote: “This volume is principally concerned with learning by example” (p. 1). On the basis of—at that time—surprising research findings and theoretical considerations, Bandura (1986) concluded: “Fortunately, most human behavior is learned by observation through modeling. By observing others, one forms rules of behavior, and on future occasions this coded information serves as a guide for action” (p. 47). Bandura (1986) stated “fortunately” because he assumed that learning from (the effects of) doing would be very slow, error-prone, and inefficient. Accordingly, many learning effects (e.g., language acquisition) cannot be entirely explained by learning by doing. This argumentation parallels cognitive load theory. However, Bandura is not primarily interested in the analysis of the cognitive learning processes elicited by different instructional approaches. The claim that learning by modeling is paramount and very powerful can be considered one of the “axioms” of Bandura's social-cognitive learning theory. Bandura (1999) claims that “…human beings have evolved an advanced capacity for observational learning…” (p. 25). Hence, it is a “natural” conclusion to rely on this learning mechanism for fostering the acquisition of academic cognitive skills (e.g., Braaksma, Rijlaarsdam, & van den Bergh, 2002; Harris, Santangelo, & Graham, 2008; Schunk & Zimmerman, 2007).

Also in accord with WE research, Bandura pleads for the use of multiple examples or models (e.g., Bandura, 1986). However, beyond making it possible to extract general principles from diverse modeled responses (Bandura, 1986), he argues that multiple models provide greater opportunity for the observer to identify with at least one of the models and to learn effectively from it (Schunk, Hanson, & Cox, 1987). Although the majority of OL studies investigated and compared different variants of modeling and relevant context variables (e.g., model-observer similarity), there is also evidence of the positive effects of abstract modeling when acquiring academic skills. Children who find mathematics difficult can be better supported by cognitive modeling while acquiring division skills than by didactic instruction (Schunk, 1981). Modeling fosters college students' writing skills more than working on practice problems (Zimmerman & Kitsantas, 2002). Video models on productive cooperation can be more effective for later cooperation output than learning by “scripted” doing (Rummel, Spada, & Hauser, 2009). In a nutshell, a number of findings have shown positive effects of OL in academic learning.

2.2.2. Phases of skill acquisition

Schunk and Zimmerman (1997, 2007) have formulated a social-cognitive model of the development of self-regulation in academic areas (e.g., writing skills) with four levels: observational, emulative, self-controlled, and self-regulated. Over these phases, there is a shift from social sources (observational, emulative) to self-sources of regulation (self-controlled, self-regulated). Especially in the early phase, students benefit from observing models explaining and demonstrating a skill. At this observational level, students learn the major features of skills cognitively without necessarily performing the skills. At the emulative phase, the learner's performance approximates the model's performance, in the sense that it is not a copy but rather an emulation of the model's general pattern or style. Emulative learning also includes performance. These two phases are primarily social because learners need to be exposed to models. The skill's internalization has just begun. At the third phase (self-controlled), the learners demonstrate the skill independently when performing similar tasks. The learner's mental representation of the skill is still structured after the model. Learners have not yet developed a mental representation independent of the observed model. During the final self-regulated phase, learners adapt their skills to changes in contextual conditions. Now learners can initiate use of the skill and incorporate adjustments in various problem-solving contexts.

2.2.3. Learning processes

Bandura (1986) emphasizes four types of OL processes: attention, retention, reproduction, and motivation. With respect to attentional processes, it is important to note that typically “enacted” models are employed, meaning that a person demonstrates how to solve a problem. Such complex and perceptually rich models should attract and sustain attention more than verbal or written models (Bandura, 1986). However, it is difficult to induce rules from complex models because they must be abstracted from model responses that otherwise differ in many irrelevant aspects (see also van Gog & Rummel, 2010). Encouraging learners to identify underlying rules of model behavior fosters transfer (e.g., Decker, 1980). Interestingly, Bandura claimed, as does cognitive load theory, that learners need practice between model observations. The differences between what the model shows and what the learners can already do reveal what learners should attend to in subsequent models to correct their deficits.

Retention processes are necessary to transform transitory modeling experiences into stable memory traces. This process transforms and restructures what has been observed into symbolic representations (Bandura, 1986). Whereas the complexity of models cannot be fully remembered, succinct symbols represent the core features of model behavior. In abstract modeling, the displayed activities are abstractly represented as conceptions and rules. Cognitive rehearsal can strengthen such memory traces and fine-tune the skill. Reproduction processes convert the learners' representations of the abstract rules into appropriate action. Practice can improve these processes. However, if the observer has acquired merely a fragmentary sketch of the modeled behavior, reproduction processes are hindered. In addition, lacking skills in executing an action might also prevent production. A required behavior may also include subskills that must be coordinated, which initially might be somewhat difficult (Bandura, 1986). Motivational processes are relevant to Bandura's (1986) distinction between acquisition and performance. Learners must be motivated to perform what they have learned. For example, they need the self-efficacy to assume that they can perform the skill on their own. Such self-efficacy is greater if the observing learners perceive the model as resembling themselves (Schunk & Hanson, 1985).

Several studies have connected OL and self-explanation conceptually (e.g., Chi, Roy, & Hausmann, 2008; Craig, Chi, & VanLehn, 2009; Gholson & Craig, 2006). Interestingly, these studies do not focus on observing a performing model, as is usual in OL. They mainly analyze learning from watching a learning tutee while interacting with a human or computer-based tutor (Chi et al., 2008). In a certain sense, such models are coping models, as they do not show smooth, expert performance, but they do demonstrate their actions in acquiring certain skills. To optimize learning from observing tutorial dialogues, the (watching) learners need support for active processing, for example, in the form of self-explanation prompts (Gholson & Craig, 2006), the insertion of deep-level questions (Gholson et al., 2009), or collaboratively observing, discussing, and solving problems (Chi et al., 2008). Although these articles do not explicitly relate these activating procedures to the four processes in Bandura's theory (1986), they are linked in particular to attentional and retention processes.

In summary, OL research affirms WE research with respect to the relevance and effectiveness of examples, the use of multiple examples, and the need to abstract from concrete examples. In addition, the social and motivation-inducing nature of the example/model is emphasized, which opens new possibilities to structure the learning setting (i.e., showing coping models). In addition, it is validly argued that after initial mastery of a skill, both automation and making skills more flexible and applicable in new contexts are important.

2.3. Analogical reasoning

Analogical reasoning refers to reasoning on specific exemplars or cases. In this process, knowledge about one exemplar is used to infer knowledge about another exemplar (Gentner, 2003; Holyoak, 2005). Only parts of AR research are directly relevant to the present purpose. We focus on analogies in problem solving, largely neglecting other analogies (e.g., between stories or personal relations). We have also left out important subfields such as computational modeling work (Holyoak, 2005, 2012).

2.3.1. Relevance and effectiveness of examples

Some researchers regard AR (i.e., the reliance on already-familiar examples or cases) as fundamental to achievements such as scientific discovery, problem solving, and identifying causal relations (e.g., Holyoak, 2005; Holyoak & Cheng, 2011). Some authors even regard AR as the core of human cognitive capabilities and, in particular, of effective learning (e.g., Gentner, 2010; Gust & Kühnberger, 2006). AR is also considered as one solution to the learning paradox (Bereiter, 1985): If a learner does not understand something, then she cannot learn it because she does not know enough to even begin. Analogy from a different domain might provide a schema as an initial template that can boost understanding (Chi & Ohlsson, 2005). Holyoak (2012) postulates that analogy is a “strong ‘weak’ (i.e., domain-general) method” that is very powerful if knowledge about an analog is available.

Much of the analogy literature has analyzed basic questions of cognition and learning. There is hardly any research comparing instruction based on AR to other approaches (for an exception, see Nokes-Malach et al., 2012). However, there is indirect evidence: In Hong Kong and Japan, countries with high levels of mathematics achievement, teachers provide much more AR support (e.g., a familiar source problem to be compared to a target analog) than is provided in “standard instruction” in the U.S., where average mathematics achievement is relatively poor (Richland, Zur, & Holyoak, 2007).

AR research provides explanations as to why example-based learning is effective. Transfer is achieved when both the abstract principles and exemplars are provided, and when learners relate the exemplars to abstract principles (e.g., Fong & Nisbett, 1991; Ross & Kilbane, 1997). In other words, transfer requires that both abstract and concrete knowledge be encoded and interconnected (Reeves & Weisberg, 1994). Example-based learning is thus effective because it provides affordances for interrelating abstract and concrete knowledge.

2.3.2. Phases of skill acquisition

Analogical reasoning research has not provided a stage model for an entire skill-acquisition process. However, it has provided a very fine-grained account of different stages within the phases of skill acquisition as described in WE and OL research (cf. the analogy phase by Anderson et al., 1997; intermediate phase, VanLehn, 1996). Typically, four core stages are postulated (Holyoak, 2005, 2012; see also Gentner, 2003). First, the examples—in some cases together with the underlying principles—are presented as sources of transfer. They are encoded in memory. A schema including an abstract principle might already be constructed (stage 1). When encountering a transfer problem, potentially relevant analogical examples—encoded in stage 1—are activated and selected (stage 2). The transfer problem is mapped to the analog; this mapping process can be regarded as the core process of AR (stage 3). Finally (stage 4), induction of an abstract schema can arise from this mapping process (or a schema modification provided it had been constructed in stage 1).

2.3.3. Learning processes

As just mentioned, in the first stage, the examples presented as transfer sources are encoded, and an abstract schema, including the solution principle, might already be constructed. There are two main instructional options in this stage: providing examples from which the principle must be induced (embedded principle method; Ross & Kilbane, 1997) or providing the principle up-front (abstract principle method), the former method being more typical in AR. In the embedded-principles approach, learners do not automatically go beyond the given surface features of the source examples (e.g., cover stories) and induce schemas (Reeves & Weisberg, 1994). Abstraction is fostered when providing multiple examples, particularly when the learners are simultaneously instructed to compare these examples (Holyoak, 2005; Reeves & Weisberg, 1994). The more specific such instructions are, the more likely schemas will be abstracted (e.g., Gentner, Loewenstein, & Thompson, 2003). In the abstract principle approach, the principles and examples must be actively interrelated (e.g., Reed, 1989, 1993; Ross, 1987, 1989). It makes sense to integrate the principle's presentation and an initial example so that learners establish tight connections between them (Ross & Kilbane, 1997). Even if learners encode examples by referring to abstract principles, concrete problem features are usually stored as well (Reeves & Weisberg, 1994; Ross, 1989). Learners also represent contextual features of the learning situation in their memories. For example, a similar context (source example and target problem presented by the same person) facilitates transfer better than a dissimilar context (different persons; Spencer & Weisberg, 1986).

In the second stage, when a transfer problem is to be solved, potentially relevant analogs are (a) activated and (b) selected. These processes are especially difficult for learners (Holyoak, 2005). Either they fail to notice that a known example solution applies to a problem, or they select an incorrect example. The latter problem arises from the fact that learners primarily use their knowledge traces of surface features (e.g., cover stories) to retrieve analogs (Ross, 1987, 1989), which can lead to the retrieval of analogs that do not fit. Note that even when an abstract schema has been formed, learners still have the concrete examples encoded in long-term memory (e.g., Reeves & Weisberg, 1994) and they are reminded of related problems by superficial similarities (Ross, 1987, 1989). Such reminders occur even when learners have encoded an example several days before (e.g., Fong & Nisbett, 1991; Keane, 1987). Knowledge traces of concrete examples seem to be relatively stable. Even advanced learners “first” rely on superficial features (Novick, 1992). However, compared to less advanced learners, they are better able to decide whether the retrieved example actually fits or whether they should seek another example. In any case, reliance on surface features can provide useful cues for analogs and corresponding principles because structural and surface features are often correlated in real life (Bassok, 1996; Blessing & Ross, 1996; Novick, 1992). Hence, surface similarity often paves the way for principle-based analogy. For example, the verb “share” is typically a valid cue that division must be applied in a mathematics word problem, although this is not necessarily the case (the word “share” does not correspond to a structural feature; see, 2013).

In the third stage, the problem to be solved is mapped to the analog. The learners determine the correspondences between the features of the known example(s) and those of the problem at hand. Note that for productive AR, not just surface features are considered. The structure mapping theory by Gentner and colleagues (e.g., Gentner & Markman, 1997) emphasizes that AR involves mapping on the basis of structural features that consist of the relations between elements and not (necessarily) on the basis of the elements. Although the relations and not the objects are crucial for successful mapping, learners also rely on surface features when trying to map two problems. For example, when the roles of objects (e.g., cars) and persons (e.g., mechanics) were reversed in source examples and transfer problems, learners often erroneously mapped the corresponding objects to each other, leading to incorrect solutions (e.g., Ross, 1987, 1989). Nevertheless, as mentioned, relying on surface features may also facilitate structural mapping (Bassok, 1996; Reeves & Weisberg, 1994). Moreover, if there are multiple ways to map an example and a problem, the multi-constraint theory (Holyoak, 2012; Holyoak & Thagard, 1997) postulates that a good mapping procedure (i.e., helping in the pragmatic contexts) maximizes correspondences between (surface) elements.

In the final fourth stage, some schematic abstraction might arise from a mapping process. Such a first-schema abstraction may not necessarily lead to a highly generalized representation enabling transfer to any problem with the same deep structure; learners tend to be conservative in this respect (Reeves & Weisberg, 1994). For further abstraction, a multitude of examples and deliberate comparison processes are necessary. If a first schema has already been constructed in stage 1, it may be modified and made more abstract in this stage. Sometimes, however, learners may follow previous examples “verbatim” (e.g., VanLehn, 1998) without relying on abstract principles. In these cases, learners do not induce a schema. They need to engage in elaboration and analogical mapping during problem solving so that abstractness is added to the concrete representations. Overall, AR research assumes that there are two “opportunities” for schema abstraction processes. First, an initial schema might be constructed when (multiple) examples are encoded (stage 1); second, when the knowledge gained from studying examples is applied to a new problem, further abstraction can take place (stage 4).

Interestingly, working memory resources are crucial in AR (cf. cognitive load theory). The comparison and mapping of two analogical problems place heavy demands on working memory resources, especially when the problems are complex (Holyoak, 2005, 2012). Hence, sufficient working memory resources are crucial for effective analogical transfer. Restricted capacity leads to reliance on surface features rather than structural features (e.g., Waltz, Lau, Grewal, & Holyoak, 2000). Relying on “directly” perceivable surface features requires less capacity than focusing on relations that must typically be inferred (Holyoak, 2012). Working memory constraints come into play especially (a) when there are multiple relations (instead of just one) to be mapped (complex comparison and mapping demands) and (b) when “seductive” features (e.g., misleading surface similarity) must be inhibited, which is usually attributed to the executive function of working memory (e.g., Cho, Holyoak, & Cannon, 2007; Viskontas, Morrison, Holyoak, Hummel, & Knowlton, 2004). In addition, the performance of learners with restricted working memory capacity, that is, older learners (Viskontas et al., 2004), distracted learners (e.g., Waltz et al., 2000), or young learners whose executive functions are not yet fully developed (Richland, Morrison, & Holyoak, 2006), is particularly hampered by multiple relations and by correspondences to be inhibited. Reducing working memory demands should therefore foster the acquisition of effective problem-solving schemas and, thereby, transfer (Richland, Stigler, & Holyoak, 2012). At first glance, these assumptions on the importance of working memory capacity seem to parallel cognitive load theory. However, there are also important differences, which we discuss later on.

In summary, like the WE and OL approaches, AR research regards examples as an important source for the construction of abstract problem-solving schemas. All three approaches also assume that merely encountering examples is insufficient for effective learning. Instead, learners must actively compare and map examples to each other and to principles. An obvious difference concerns grain size. Whereas research on WEs and on OL analyzes the acquisition of cognitive skills at a relatively coarse grain size, AR research analyzes the use of examples in great detail.

3. Integrative theory of example-based learning: Descriptive part

In this section, the three perspectives are synthesized into a theory focusing on descriptive aspects of example-based learning. We use “markers” to show which perspective supports each proposition: research on WE, OL, and AR. Note that this theory addresses only the cases when the principles are presented first (abstract principle method; Ross & Kilbane, 1997). When learners must abstract the principles on their own, in part other learning processes and relevant instructional principles come into play whose discussion is beyond the scope of this theory.

3.1. Relevance and effectiveness of examples

When examining the various findings on example-based learning, there can be little doubt that this learning method plays an important role in human cognition (WE, OL, AR). Various studies show that example-based skill acquisition is very effective as compared to various other learning methods such as learning by problem-solving or inquiry learning (see also Eysink et al., 2009). The main reasons why example-based learning in initial cognitive skill acquisition is so effective are that examples relieve learners from problem solving that can be—especially during initial skill acquisition, when learners lack understanding—slow, error-prone, and driven by superficial problem-solving strategies (WE, OL). Especially, the latter strategies fail to deepen understanding (WE). High problem-solving demands in the beginning of skill acquisition may even overburden working memory capacities and thereby strengthen learners' surface orientation (WE, AR). When studying examples, in contrast to problem solving, the learners have enough working memory capacity to gain understanding by self-explanation in the sense of example elaborations (WE) and comparisons that focus on structural features (AR). This argument is true for processing an initial example (WE) and when studying further examples that can be related to previous ones (AR). Hence, multiple examples, as employed in WE research, facilitate both initial encoding and comparing because they circumvent high working memory demands (WE, AR).

Elaborating and comparing examples are particularly important because abstract principles and concrete exemplars become interrelated (WE, AR), and learners gain understanding on how to apply principles in problem solving (WE, OL, AR). An important learning mechanism is the construction of generalized schemas, when learners select an appropriate example and map the problem at hand in terms of structural features; an abstracted schema is formed that can guide further problem solving (AR). Example-problem pairs as used in many studies on WEs provide an opportunity for such abstraction processes (AR, WE), if the learner can access or remember the previous example. In addition, when using sequences of example-problem pairs, difficulties during problem solving can make the learners aware of what to focus on in the next example in order to close knowledge gaps (OL). Furthermore, example-based learning allows for the encoding of concrete exemplars (AR). If future problems share features (on the superficial or structural level), it is easier to retrieve examples encountered earlier to analogically solve the problem at hand and to refer to the abstract underlying principle (AR).

3.2. Phases of skill acquisition

Again, the four phase assumptions of skill acquisition below only apply when employing example-based learning or other “traditional” instructional approaches. If instead, for example, problem-oriented or inquiry learning (e.g., Loyens & Rikers, 2011) is employed, the course of skill acquisition might be quite different (e.g., learners might not start with basic declarative domain knowledge but first acquire episodic knowledge about cases).

In the first phase (principle encoding), learners acquire basic declarative knowledge about a domain, particularly about the domain principles that should guide later problem solving (WE, OL). For example, they learn about principles of scientific argumentation (Schworm & Renkl, 2007). The learner does not yet know how to apply the principles (i.e., no rules for acting) (WE, OL). Note that in this phase, a principle is typically just encoded as fact and not as an abstract schema. For example, the learner has not yet understood how the different elements of a principle (e.g., p(A) from the multiplication rule in probability: p(A and B) = p(A) × p(B)) can be used as schematic, abstract slots to be “filled” with the concrete entities of a problem to be solved (Renkl, 2012).

In the second phase (relying on analogs), learners turn their attention to problem solving (WE, OL). They may first encounter examples printed in a textbook—as is often the case in mathematics textbooks after the basic introduction to a topic—or presented “live” by a teacher or, in a classroom, by a learning peer student (OL). Especially in the case of a peer student, the model might not perform smoothly; it may reveal impasses and how to overcome them. Written examples and enacted models tend to differ with respect to how motivating and attention-getting they are. To take up the example of argumentation, the learner observes instantiations, that is, exemplary models of scientific argumentation. These examples are encoded. The quality of this encoding depends on how well the learner processes them, that is, on their self-explanation and example-comparison activities. If a problem or just part of it (e.g., in the case of partially WEs; cf. van Merriënboer, 1990) must be solved, this is primarily done by analogy (AR). This does not mean that the learners disregard principles, but they are nevertheless first guided by analogs that are checked for suitability; analogs can then hint at the underlying principles if the analog is encoded accordingly (AR). Successful problem solving with reference to analogs contributes to the construction of abstracted schemas (AR). Hence, both initial example encoding and later retrieval for problem solving can foster schema formation (AR). To take up the example of the multiplication rule, a student learns in this phase that such elements as “p(A)” can correspond to different entities in different problems.
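The idea that elements such as “p(A)” act as abstract slots can be illustrated by a minimal Python sketch. It is only an illustration under stated assumptions: the names Event and multiplication_rule as well as the two problems and their probabilities are hypothetical and do not stem from the cited studies. The same rule is applied to two problems with different surface features; only the entities filling the slots change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A concrete problem entity that can fill an abstract slot such as A or B."""
    description: str
    probability: float


def multiplication_rule(a: Event, b: Event) -> float:
    """Schema p(A and B) = p(A) * p(B); applies only if A and B are independent."""
    return a.probability * b.probability


# Hypothetical problem 1 (cover story: coin tosses).
heads_first = Event("heads on first toss", 0.5)
heads_second = Event("heads on second toss", 0.5)

# Hypothetical problem 2 (cover story: independent machine parts).
part_x_fails = Event("part X fails", 0.1)
part_y_fails = Event("part Y fails", 0.2)

# Same structural rule, different entities in the slots A and B.
p_two_heads = multiplication_rule(heads_first, heads_second)
p_both_fail = multiplication_rule(part_x_fails, part_y_fails)
```

In this sketch, the cover stories differ (coins versus machine parts), whereas the structural feature that determines the solution (two independent events occurring together) stays constant.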

When learners develop declarative rules of action in a certain content area (WE), they have entered the third phase (forming declarative rules), which means they have acquired “verbalizable” if-then rules on how to act or to solve (parts of) problems (WE). In particular, knowledge about the application conditions of principles is added to learners' knowledge. Note also that these application conditions can be schematic in that they contain “abstract slots” (e.g., “two independent events” in case of probability problems) that can be filled in with a problem's specifics. For example, a learner having observed some argumentation models might be able to state the rule “if considering counterarguments against my position, I will (try to) disprove them.” Ideally, they are even embedded in broader schemas that allow for differentiating problems from related categories irrespective of their superficial features (WE, AR). In both the second and third phases, learners correct their declarative knowledge when they encounter an impasse while using their suboptimal knowledge structures (WE).

In the fourth and final phase (fine tuning: automation and flexibilization; WE, OL), learners have already learned to solve structurally identical problems by applying the acquired schemas. There are, however, two ways to optimize the acquired skill. First, single solution steps can be chunked, and the skill might be automated so that the performance accelerates and working-memory demands are minimized in problem solving (WE). This means that “proceduralized” rules are formed. If certain problems recur, the solution can also be directly retrieved from memory (WE). Second, learners might adapt their skill to changes in contextual conditions or even changes in the structural features of problems to be solved. Learners gain flexibility (OL). These two aspects of improvement (i.e., automation and flexibility) are not independent. If working memory resources are saved by automation, more capacity remains for engaging in reasoning processes that render a skill more flexible (WE).

These phases have no strict boundaries (cf. Anderson et al., 1997; Schunk & Zimmerman, 1997, 2007; VanLehn, 1996). Especially when acquiring complex skills, learners might be in an early phase regarding some subskills, whereas others might already be automatized. Example-based learning comes into play in the second phase (relying on analogs). In this phase, learners are shown how principles are applied; concrete examples are encoded and ideally related to principles (WE, OL, and AR). In the next phase (forming declarative rules), example-based learning remains relevant when a learner is acquiring declarative rules on when to apply a certain principle (e.g., in the case of a certain problem category), and when not to (e.g., in the case of a related, but different problem category) (WE). In advanced skill acquisition, when automation and flexible application in problem solving are the main goals, examples are not considered to play a major role (WE, OL). Nevertheless, there might be cases in which high-performing models contribute to optimizing the skill in the final phase when the goal is flexibility and transfer to new contexts. Of course, one can rely on problem solving alone instead of studying examples (e.g., solving isomorphic problems by analogy and forming declarative rules by reflections on problem solving). However, example-based learning is usually superior.

3.3. Learning processes: Typical suboptimalities

We have already discussed learning processes. However, the aforementioned descriptions focused mainly on the “positive” case of successful skill acquisition. We now report the main findings on suboptimal learning processes, thereby providing the basis for instructional principles to be discussed in detail later on. For each principle, we present only a selected set of references to exemplary work (Table 2).

Table 2. Overview of instructional principles for example-based learning
| Instructional Principle | Short Characterization: Learning outcomes are superior if … | Main Qualification |
| --- | --- | --- |
| Self-explanation and comparison principle | … students are encouraged to compare examples and explain their underlying rationale to themselves | No effect if learners are already heavily loaded by the learning task's complexity |
| Explanation-help principle | … students get help in the form of instructional explanation | Positive effects just with respect to conceptual knowledge, in mathematics, or when presented “by default” in contrast to learner demand |
| Model-observer similarity principle | … observers perceive the model as resembling them | Unclear which features of similarity are key |
| Example-set principle | … students get sets of examples that make important and critical aspects salient | No effect if learners are not encouraged to compare examples |
| Easy-mapping principle | … when students are supported in relating different information sources | No principled guideline on how to select different instructional procedures (e.g., integrated format or color coding) |
| Meaningful building blocks principle | … the “sense” of single solution steps is made salient | Effect “only” well established in fields involving mathematical solution procedures |
| Studying errors principle | … correct and incorrect worked examples are presented | No such effect if learners with low prior knowledge get no additional support for studying errors |
| Imagery principle | … learners are instructed to imagine the solution steps of already studied examples | No effect if learners have too little prior knowledge to imagine solution steps |
| Interleaving by fading principle | … problem-solving steps are increasingly introduced into example study | Effect “only” well established in fields involving mathematical solution procedures |

Many learners do not spontaneously self-explain (e.g., Renkl, 1997) or compare sets of examples (e.g., Catrambone & Holyoak, 1989). They therefore do not acquire abstract schemas. These deficits might be due to learner passivity or lacking prerequisites (e.g., prior knowledge). If it is the latter cause (e.g., Berthold & Renkl, 2009), it makes sense to help by providing instructional explanations (explanation-help principle; e.g., Renkl, 2002). To overcome passivity, prompts can be employed to activate learners to self-explain or compare examples (self-explanation and comparison principle; e.g., Renkl, 2005; Gentner et al., 2003). A further possibility is that learners are shown models similar to them so that they can identify with the models, which leads to closer attention and deeper processing (model-observer similarity principle; e.g., Bandura, 1986). Another possible approach is to compose sets of examples so that learners' attention and processing are guided toward structural or other important features (example-set principle, e.g., Quilici & Mayer, 1996).

Another potential problem arises when learners are cognitively overloaded by the learning materials, which induces a surface orientation. This is a particular danger when multiple representations and information sources are used, as is typically the case in many multimedia learning environments. Facilitating the integration of diverse information sources, for example, by coding corresponding information in the same color, can help (easy-mapping principle; Renkl, 2005). Another way to reduce cognitive load is to avoid complex “holistic” solution formulas (at least in the beginning; Fig. 1); instead, they can be split up into smaller meaningful units (meaningful building blocks principle; e.g., Gerjets, Scheiter, & Catrambone, 2006).

WEs are particularly conducive to “illusions of understanding” (Chi et al., 1989). They do not require learners to do something that is followed by feedback. Hence, some knowledge deficits that are ideally corrected in the second and third phases of skill acquisition may go unnoticed. It may therefore make sense to have learners study WEs with typical errors or watch models that overcome initial faults (studying errors principle; e.g., Große & Renkl, 2007).

Another problem can arise from a poor match between the actual state of skill development and the provision of examples or problems for learning. Examples are best for gaining understanding, but not for automation (expertise reversal effect; Kalyuga, 2007). One means of moving further toward problem solving in later stages of skill acquisition is to instruct the learner to first study an example and then to imagine the solution (imagery principle; e.g., Cooper et al., 2001). Another tried-and-tested method is to fade out worked steps according to individual learning progress (interleaving by fading principle; Renkl & Atkinson, 2003).

4. Integrative theory of example-based learning: Prescriptive part

The instructional design of example-based learning environments must take into account the many factors that moderate the effectiveness of this learning method (e.g., Renkl, 2005). We have coped with the complexity of research findings as follows: (a) we synthesized the diversity of findings by clustering related effects under a certain instructional principle; (b) we considered only findings that do not come from just one research group (relying on one experimental paradigm) and/or that were backed up by research on AR or OL. Note that although the following instructional principles have been well researched in controlled experiments, there is a general lack of classroom studies addressing longer learning periods.

It should be acknowledged that important and interesting findings are not included in the instructional principles, for example, cooperative versus individual study of examples (Krause, Stark, & Mandl, 2009; Retnowati, Ayres, & Sweller, 2010) or informing learners how to process multiple representations included in WEs (Schwonke, Berthold, & Renkl, 2009). However, it is “too early” to derive instructional principles based on these findings. Overall, nine instructional principles are presented. Table 2 provides an overview.

4.1. Self-explanation and comparison principle

In their seminal study, Chi et al. (1989) found that successful learners studied examples longer and explained them more actively to themselves (“self-explanation effect”). They tried to figure out the rationale of solution procedures. A self-explanation effect is evident even when the example “study time” is kept constant (Renkl, 1997). Learners' self-explanations are crucial for fully exploiting the potential of example-based learning. Besides self-explanations, comparing examples is the second pathway to attain principle-based understanding (Gerjets, Scheiter, & Schuh, 2008; Nokes-Malach et al., 2012), as highlighted by AR research.

4.1.1. Self-explanation

Learners engage primarily in two types of self-explanations to assign meaning to examples (Conati & VanLehn, 2000). Principle-based explanations relate solution steps to abstract principles (e.g., mathematics theorem; e.g., Renkl, 1997). An example of an elaborated principle-based self-explanation is illustrated in the learner statement: “In this step, the probabilities are multiplied because the events occur together and are independent: multiplication rule.” Meanwhile, the term “principle-based self-explanations” is also used when elaborating examples that do not provide single solution steps. For example, Schworm and Renkl (2006) analyzed student teachers' elaborations while studying both well-designed and poorly designed examples in order to learn about example design. Principle-based explanations were in this case statements that related the well- and poorly designed WEs to instructional design principles (for a corresponding prompt see Fig. 1, lower left corner, above the note box). Many studies from diverse domains have meanwhile shown that principle-based explanations foster learning and, in particular, transfer to novel problems (e.g., Atkinson et al., 2003: probability; Conati & VanLehn, 2000: Newton physics; Hilbert, Renkl, Kessler, & Reiss, 2008a: proof finding in mathematics; Schworm & Renkl, 2007: scientific argumentation).

Analogical reasoning research views the processes relating presented examples to an abstract schema that contains the underlying solution principle as a crucial learning process (cf. Holyoak, 2005, 2012; Ross & Kilbane, 1997). Reeves and Weisberg (1994) emphasize that learners must learn how abstract principles are used to solve concrete problems. Hence, although AR research uses other terms, the importance of what we call “principle-based explanations” is also stressed. Bandura (1986) emphasizes the relevance of principle-based self-explanations as well, by stating that model behaviors must be encoded conceptually, that is, related to the underlying rules (e.g., Decker, 1980, 1984).

Goal-operator elaborations are a means by which learners assign meaning to operators via identifying the subgoals achieved by those operators (e.g., in a probability example: “By subtracting the probability of red items from 1, we obtain the probability of non-red items”). This activity fosters the representation of (sub-) goals and of knowledge about operators for achieving these (sub-) goals. Such elaborations have usually been investigated when the examples have discrete solution steps, showing that they foster transfer to novel problems (e.g., Catrambone, 1996; Chi et al., 1989; Conati & VanLehn, 2000; Renkl et al., 1998). Such elaborations can also foster knowledge about the goal structure of certain problem categories, that is, the sequence of goals to be reached for the final solution (see Reimann & Neubert, 2000).
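The two types of self-explanation can be made concrete with a short worked example. The following steps are only a sketch: the urn problem and its numbers are hypothetical and are not taken from the cited studies. The first step is annotated with a goal-operator elaboration, the second with a principle-based explanation.

```latex
% Hypothetical problem: an urn holds 3 red and 7 non-red items.
% Two items are drawn with replacement. What is the probability
% that neither of the two items is red?
\begin{align*}
p(\text{non-red}) &= 1 - p(\text{red}) = 1 - 0.3 = 0.7
  && \text{subgoal: probability of a non-red item} \\
p(\text{non-red twice}) &= p(\text{non-red}) \times p(\text{non-red}) = 0.7 \times 0.7 = 0.49
  && \text{principle: multiplication rule (independent events)}
\end{align*}
```

The annotation of the first step names the subgoal that the subtraction achieves, whereas the annotation of the second step relates the operation to the abstract principle that justifies it.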

4.1.2. Example comparison

Comparing examples can induce an abstract schema including the general principle (e.g., Gick & Holyoak, 1983; Holyoak, 2005, 2012; Reeves & Weisberg, 1994). In further problem solving, such schemas can be used for transfer problems. Studies in the Bandura tradition also investigated example comparison. Braaksma et al. (2002; Braaksma, Rijlaarsdam, van den Bergh, & van Hout Wolters, 2006) provided good and poor models with the instruction to compare them. Bandura (1986) recommended such contrasting to make the important aspects of good performance more salient and to show what should be avoided. Self-explanation and example comparison can actually serve the same function. For example, principle-based self-explanations relate concrete examples to abstract principles; the same effect might result from comparing two (or more) examples and noting that they instantiate the same principle (Nokes-Malach et al., 2012).

Example comparisons are typically within-category comparisons (Gerjets et al., 2008). A category refers to a set of problems solvable by applying the same (set and sequence of) principle(s). When learners compare examples from the same category, they can learn to differentiate between (constant) structural features and varying surface features. They observe that the structural, not the surface features, determine the solution procedure. Beyond correlational evidence showing that example comparisons lead to superior transfer (e.g., Catrambone & Holyoak, 1989; Gick & Holyoak, 1983), there are several studies on the effectiveness of corresponding prompts (i.e., request to identify commonalities and differences) or aids (e.g., more scripted procedures; e.g., Catrambone & Holyoak, 1989; Gentner et al., 2003; Gerjets et al., 2008; Richland & McDonough, 2010).

In addition, there are types of comparisons that can be labeled as critical-feature comparisons. In these approaches, the usual rationale is to use examples that share all but one critical feature so that that one difference stands out and is encoded (cf. also Bransford & Schwartz, 1999; Ross & Kilbane, 1997). The instructional purpose might be rather different, depending on the specific critical feature. For example, Rittle-Johnson, Star, and Durkin (2009) guided their learners to explain the difference between two worked solution methods to the same problem and the conditions which must be met for the more parsimonious method to be applied. Flexible problem solving was taught in this manner. Overall, current findings support the assumption that instructing learners to compare examples or worked solutions methods with respect to critical features actually leads to the expected positive learning effects.

4.1.3. Fostering self-explanations and comparisons

There are two main ways to foster self-explanation: training and prompting. Renkl et al. (1998) developed a training approach tailored to example-based learning. Their short intervention (10–15 min.) in the area of interest calculation included the following components: (a) information on the importance of self-explanations, (b) modeling self-explanations with one WE from interest calculation, and (c) coached practice with another example. This intervention had a strong effect on self-explanation activities and learning outcomes (Stark, Mandl, Gruber, & Renkl, 2002; see also Bielaczyc et al., 1995). Gentner et al. (2003) developed and successfully tested a short training intervention for example comparison. Prompting interventions were employed in many studies designed to test self-explanation effects (e.g., Atkinson et al., 2003). When computer-based environments are used, learners must usually type their self-explanations into text boxes. Sometimes self-explanations are supported by menus providing a list of potential principles or goals (e.g., Conati & VanLehn, 2000). Although the “generative” aspect is restricted in such menu-based self-explanations (Hausmann & VanLehn, 2007), they nevertheless foster learning. Typical prompts that worked well in within-category comparisons ask for commonalities and differences and, in some cases, for an overall principle that can be applied (e.g., Thompson, Gentner, & Loewenstein, 2000). Prompts for critical-feature comparisons are formulated quite diversely and are, of course, tailored to the specific critical feature to be identified.

A qualification of the self-explanation and comparison principle arises from studies that detected no positive effects (e.g., Gerjets et al., 2006; Große & Renkl, 2007; Mwangi & Sweller, 1998). Self-explanation and comparison prompts may impose too many processing demands (i.e., overload) when cognitive load is already high due to complex learning tasks in relation to learners' prior knowledge (i.e., intrinsic load) and/or to suboptimal instructional design (e.g., Kalyuga, 2010; Sweller, 2006).

In summary, learning outcomes are superior when students are encouraged to compare examples and to explain their underlying rationale to themselves. Such learning activities can be fostered by prompting or by training, except when learners are working on very complex tasks. In this case, learners may be overburdened.

4.2. Explanation-help principle

A typical model in the OL tradition includes help in the form of instructional explanations: “In the early phases of learning a skill, students benefit from observing models explain and demonstrate the skill” (Schunk & Zimmerman, 2007, p. 12). Actually, a recent meta-analysis by Wittwer and Renkl (2010) showed that on average instructional explanations added to examples have positive effects. However, this average effect was relatively weak (effect size = .16). In AR research, Reed (1989) found no positive effects of principle-related explanations. It thus seems better if learners self-explain a model that only demonstrates the skill (Rummel et al., 2009). Prompted self-explanations are usually superior to instructional explanations (e.g., Schworm & Renkl, 2006). However, learners are sometimes unable to self-explain productively. Help in the form of instructional explanations is then sensible. Wittwer and Renkl's (2010) meta-analysis suggests several qualifications with respect to this principle. Instructional explanations added to examples have been shown to be effective only when the domain is mathematics, when the main goal is conceptual understanding, when the explanations focus on operators, or when they are presented “by default” (in contrast to learner demand). Berthold and Renkl (2010) recently demonstrated that prompts to further process instructional explanations added to examples also facilitate learning. In summary, learning outcomes are superior if students receive help in the form of instructional explanations. This holds at least when students cannot adequately self-explain in situations in which they should, for example, gain conceptual understanding in mathematics.

4.3. Model-observer similarity principle

Model-observer similarity is one of the classic moderators of model effects in OL (e.g., Bandura, 1986; Gresham, 1985; Owens & Ascione, 1991; Schunk, 1987, 1999; Schunk & Zimmerman, 1997, 2007). There are two primary mechanisms leading to similarity effects: First, if models are too dissimilar, observers do not identify with them and thus do not imitate them. Second, if dissimilar models are too advanced, observers assume that they cannot demonstrate the appropriate behavior on their own, that is, they lack self-efficacy (Schunk & Hanson, 1985). For example, Ryalls, Gul, and Ryalls (2000) found that 14- to 18-month-old children learned three-step sequences better from peer models than from adult models. Braaksma et al. (2002) provided both competent and non-competent models to learn argumentative writing. In one condition the learners were instructed to focus on the competent and in another condition on the non-competent model when comparing both models. Weak students profited more from focus on the non-competent model, and stronger students learned best when focusing on the competent model. This pattern was interpreted as a similarity effect. The coping model effect has also been interpreted as a similarity effect in OL research (e.g., Schunk, 1999). However, it is unclear whether the similarity or the shown errors (and strategies to overcome them) are crucial. The findings of Schunk and Hanson (1985) suggest that similarity does play a role. Students learning subtraction skills profited more from peer models, either coping or mastery, than from an adult teacher model (mastery). Nevertheless, one qualification of this principle is that it is unclear which similarity features are crucial. In summary, learning outcomes are superior when the observers perceive the model as resembling them.

4.4. Example-set principle

This principle goes beyond using “just” multiple examples. Sets of examples are often used to guide learners' attention to important aspects (e.g., Bransford & Schwartz, 1999). The specific learning goals of such example sets can be quite different (e.g., Braaksma et al., 2002: differentiating useful and less useful writing strategies; Rittle-Johnson et al., 2009: conditions for applying simpler solution methods). The “structure-emphasizing example sets” by Quilici and Mayer (1996) combine examples in a way that (a) each problem category is exemplified by a set of different cover stories (i.e., surface), and (b) the same set of cover stories is used across the problem categories. Learners observe that cover stories and structure do not necessarily co-vary, and that relying on surface features can mislead when trying to find the correct solution. Paas and van Merriënboer (1994) compared pairs of isomorphic geometry WEs with an intermixed presentation mode (i.e., high variability condition) so that the pairs present related but not isomorphic problems. The three different problem categories varied with respect to the types of values to be determined and the locations of axes in a coordinate plane. High variability fostered comparison processes on subsequent examples with respect to relevant and irrelevant features and, thus, transfer. A qualification is that example-set effects are unstable if example comparison is not explicitly fostered (Scheiter, Gerjets, & Schuh, 2003). Only active example comparison assures good learning outcomes (e.g., Catrambone & Holyoak, 1989; Gentner et al., 2003; Gerjets et al., 2008; Richland & McDonough, 2010). In addition, when comparing, more support (as compared to less) leads to better outcomes (e.g., Gentner et al., 2003). Against this background, some researchers see both “elements” (i.e., sets with multiple examples and prompts) as a treatment package (e.g., Braaksma et al., 2002, 2006; Gentner et al., 2003; Rittle-Johnson et al., 2009). In summary, learning outcomes are superior when students receive sets of examples that make the to-be-learned aspects stand out. Example-comparison processes must be prompted to ensure positive effects.
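
The crossing of problem categories and cover stories that defines a structure-emphasizing example set can be sketched in a few lines of Python. This is only an illustration of the design logic: the category and cover-story labels are hypothetical placeholders, not the materials used by Quilici and Mayer (1996), and the helper names are assumptions introduced here.

```python
from itertools import product

# Hypothetical labels, chosen only for illustration.
CATEGORIES = ["category A", "category B", "category C"]  # structural features
COVER_STORIES = ["story 1", "story 2", "story 3"]        # surface features


def structure_emphasizing_set(categories, cover_stories):
    """Cross every problem category with every cover story."""
    return [
        {"category": category, "cover_story": story}
        for category, story in product(categories, cover_stories)
    ]


def categories_for_story(examples, story):
    """Collect all problem categories that appear with one cover story."""
    return {e["category"] for e in examples if e["cover_story"] == story}


example_set = structure_emphasizing_set(CATEGORIES, COVER_STORIES)

# Each cover story occurs with every category, so the surface of an
# example carries no information about its structure.
for story in COVER_STORIES:
    print(story, sorted(categories_for_story(example_set, story)))
```

Because every cover story is paired with every category, a learner who sorts the examples by cover story ends up with structurally mixed piles, which is what makes surface features visibly unreliable in such a set.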

4.5. Easy-mapping principle

The positive effects of WEs are lost when learners have difficulty relating individual information sources to each other (e.g., Tarmizi & Sweller, 1988). Visual search processes require so much cognitive capacity (i.e., induce extraneous load) that self-explanations are more or less blocked. Facilitating the mapping between representations, for example, by physically integrating them, frees cognitive resources for self-explanations. WEs thus regain efficacy. The easy-mapping principle integrates several effects that Sweller (e.g., Sweller et al., 1998) and Mayer (e.g., Mayer & Moreno, 2003) observed: split-attention effect versus integrated format (e.g., Tarmizi & Sweller, 1988), modality effect (i.e., providing some information in an aural and other information in a visual modality; Ginns, 2005), color-coding, and flashing effects (e.g., using the same color or simultaneous flashing of corresponding elements; Berthold & Renkl, 2009). An animated agent supporting mapping by gaze and gestures can also enhance example-based learning (Atkinson, 2002). Similarly, Bandura (1986) recommends attention cueing when narration is accompanied by complex behavior. One qualification is that it is an open question when to best use which procedure to facilitate mapping (e.g., integrated format or dual mode). In summary, learning outcomes are superior when students are supported in interrelating different information sources. Several instructional procedures (e.g., integrated format, color coding) accomplish this function.

4.6. Meaningful building blocks principle

If solution procedures are just encoded as a “fixed chain” of steps leading to the solution, the transfer to problems for which a modified solution procedure is necessary will likely fail. The “chain” cannot be broken up into meaningful building blocks that can be flexibly reassembled. To do so, learners must encode individual steps as meaningful building blocks (e.g., a certain type of subgoal is achieved by a certain operator). Hence, it is beneficial to present examples so that meaningful building blocks are easily identifiable. Catrambone (1996) has repeatedly shown that the ability to assemble new procedures can be fostered by making subgoals in worked solutions salient, either by visually isolating them (e.g., by circles) or by assigning a label (see also Spanjers, van Gog, & van Merriënboer, 2012). Salient subgoals lead to self-explanations about what these steps accomplished. A further possibility to make (sub-) goals salient is to use a step-by-step presentation of worked solutions (Atkinson & Derry, 2000; Schmidt-Weigand, Hänze, & Wodzinski, 2009).

Sometimes learners are barely able to identify meaningful building blocks of solutions because the instructional materials use complex formulas that are computationally efficient but opaque (“molar” formulas). Beginning learners cannot understand such complex formulas or, if necessary, reconstruct them later. For example, the problem in Fig. 1 can also be efficiently solved by the formula shown on the left side. However, the solution's rationale is much easier to understand from the solution on the right side. Breaking “molar” solutions into “modular” units (see Fig. 1) leads to better transfer performance (e.g., Atkinson et al., 2003; Gerjets et al., 2006). The meaningful building blocks principle is also supported by Bandura (1986), who claims that subdividing models and highlighting the constituent skill components leads to better learning because it is easier to focus and hold attention when part skills (and not highly complex skills) must be mastered. In the latter case, crucial elements may get lost. A qualification of the meaningful building blocks principle refers to the learning domain. Up to now, the really convincing evidence has originated mainly from studies on mathematics. In summary, learning outcomes are superior when the “sense” of single solution steps is made salient. This function can be achieved by instructional means such as assigning labels to subgoals or using modular solutions.
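
The contrast between a “molar” and a “modular” solution can be illustrated with a small Python sketch. The problem is a hypothetical one chosen for illustration (the probability of correctly guessing the first three finishers, in order, among seven runners); it is not the problem shown in Fig. 1, and the function names are assumptions introduced here.

```python
from fractions import Fraction
from math import perm


def molar_solution(n: int, k: int) -> Fraction:
    """One compact but opaque formula: 1 / (n! / (n - k)!)."""
    return Fraction(1, perm(n, k))


def modular_solution(n: int, k: int):
    """Solve step by step; each step is one meaningful building block."""
    # Step i: probability of guessing the next finisher correctly
    # among the runners who have not yet been placed.
    steps = [Fraction(1, n - i) for i in range(k)]
    total = Fraction(1)
    for step_probability in steps:
        total *= step_probability  # combine the individual events
    return steps, total


steps, total = modular_solution(7, 3)
print("modular steps:", [str(s) for s in steps])
print("modular result:", total)
print("molar result:", molar_solution(7, 3))
```

Both routes yield the same value, but only the modular version exposes intermediate results (one probability per placement) that a learner could reuse if the problem were modified, for example if the order of the finishers no longer mattered.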

4.7. Studying errors principle

Errors can be a productive element in learning (cf. impasse-driven learning; VanLehn, 1999). Actually, Siegler and Chen (2008) found that self-explaining correct and incorrect solutions is more beneficial than self-explaining correct solutions only; explaining incorrect solutions helps to avoid these errors later (see also Durkin & Rittle-Johnson, 2012). Several OL studies compared mastery models, which display smooth performance, with coping models, which initially also make errors and show how these can be overcome. Learners usually profit more from coping than from mastery models (Kitsantas, Zimmerman, & Cleary, 2000; Schunk et al., 1987; Zimmerman & Kitsantas, 2002). Note, however, that coping and mastery models differ in more than whether errors are shown, as coping models also demonstrate strategies to overcome difficulties. Nevertheless, these findings are in line with the aforementioned studies on the effects of errors in examples. A qualification of the present effect is that only learners with sufficient prior knowledge might profit from studying errors in examples (Große & Renkl, 2007). Providing errors in WEs too early in the learning process overwhelms learners. Weaker learners need additional support by explicitly marking errors (Große & Renkl, 2007) or by expert explanations as to why certain moves were correct or not (Stark, Kopp, & Fischer, 2011). In summary, learning outcomes are superior if correct and incorrect WEs are presented. However, learners with low prior knowledge need support when processing faulty examples.

4.8. Imagery principle

Sweller and colleagues identified positive effects of imagery instructions (e.g., Cooper et al., 2001; Ginns, Chandler, & Sweller, 2003): Learners were to read a worked solution, turn away from the screen, and imagine performing the solution procedure (in the control conditions, the learners went on to study the solution). OL research contributes evidence supporting the imagery principle (e.g., Corriss & Kose, 1998; Hall et al., 2009; Vogt, 1995). Mental imagery can have effects similar to those apparent when actually performing the task. A qualification refers to the fact that imagining is ineffective when working memory load is high due to two visual information sources that must be integrated and mentally manipulated (e.g., Tindall-Ford & Sweller, 2006). In addition, learners with a lack of prior knowledge simply cannot imagine the solution when looking away from the example (Cooper et al., 2001; Ginns et al., 2003). It thus makes sense to first provide an example for study and second an example for imagery (Ginns et al., 2003). In summary, learning outcomes are superior when learners imagine the solution steps of previously studied examples. This is at least true when learners have sufficient prior knowledge to imagine the solution steps.

4.9. Interleaving by fading principle

Classic WE studies (e.g., Sweller & Cooper, 1985) used isomorphic example-problem pairs as an “example condition.” The problems were thought to motivate example processing, meaning that example study and problem solving were combined. Trafton and Reiser (1993) claim that positive effects from example-problem pairs result from the opportunity to form proceduralized rules when directly applying what is learned from the examples. AR research assumes that solving a problem by referring to an isomorphic preceding analog, as given in example-problem pairs, fosters schema induction, provided that the analog can be re-inspected or remembered by the learner. Such use of a previous analog is the main route of schema construction in AR models (Holyoak, 2005, 2012; Ross, 1989). Whereas AR research mainly proposes benefits via a “backwards process” (i.e., referring back to an analogical problem), Bandura postulates a “forward process”: Practice between observations leads to more focused attention on yet-to-be-learned aspects of subsequent models.

Recent findings have raised doubts about the general superiority of using example-problem pairs as compared to examples only (e.g., van Gog & Kester, 2012; van Gog, Kester, & Paas, 2011). The safer way to foster learning by interleaving is to gradually fade worked solution steps (e.g., Atkinson et al., 2003). In such a fading procedure, a complete example is presented first. Second, an isomorphic example is presented in which a single step has been omitted. After trying to complete the faded step, the learner receives feedback about the correct solution. Then, in the following examples, the number of blanks increases until only the problem formulation is left, that is, a problem to be solved. Such a fading procedure has proven to be more effective than example-problem pairs (e.g., Kissane, Kalyuga, Chandler, & Sweller, 2008; Renkl & Atkinson, 2003). Optimally, an adaptive fading procedure is used (e.g., Kalyuga & Sweller, 2004). Salden, Aleven, Renkl, and Schwonke (2009) faded a specific worked step when the learner provided correct self-explanations on a preceding isomorphic step, thereby indicating understanding of the respective knowledge component (Koedinger, Corbett, & Perfetti, 2012). Note that such fading on the knowledge-component level leads to a kind of interleaving on the problem level. If learners approach a WE with one step to be determined (and two steps worked out), they will again encounter worked steps after the first problem-solving demand.
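
The fading procedure described above can be sketched as a short Python routine. This is a minimal illustration under stated assumptions, not the implementation used in any of the cited studies: the function names are hypothetical, and the choice to fade the last step first is an assumed default, since the description above does not prescribe the order in which steps are omitted.

```python
from typing import List

BLANK = "____"  # placeholder shown instead of a faded (omitted) step


def fading_sequence(steps: List[str], backward: bool = True) -> List[List[str]]:
    """Return one version of the solution per example in the sequence.

    The first version is fully worked out; each following version has one
    more step blanked until only the problem (all blanks) is left.
    backward=True (an assumption) fades the last step first.
    """
    n = len(steps)
    sequence = []
    for n_blanks in range(n + 1):
        if backward:
            faded = set(range(n - n_blanks, n))
        else:
            faded = set(range(n_blanks))
        sequence.append([BLANK if i in faded else s for i, s in enumerate(steps)])
    return sequence


def adaptive_fade(worked: List[bool], explained_correctly: List[bool]) -> List[bool]:
    """Adaptive variant: a step stays worked out in the next isomorphic
    example only if it is currently worked out and the learner did not yet
    self-explain it correctly."""
    return [w and not ok for w, ok in zip(worked, explained_correctly)]


for version in fading_sequence(["step 1", "step 2", "step 3"]):
    print(version)

# The learner explained only the second step correctly, so only that step is faded.
print(adaptive_fade([True, True, True], [False, True, False]))
```

With three solution steps, the fixed procedure produces four isomorphic tasks, from a complete example to a problem to be solved. The adaptive variant can leave a worked step after a faded one, which corresponds to the interleaving on the problem level noted above.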

One main difference between fading and example-problem pairs is constant versus growing problem-solving demands over time. From the expertise-reversal effect perspective (Kalyuga, 2007), slowly increasing problem-solving demands can be considered a main reason for the effectiveness of fading. A qualification of the present principle refers to the fact that up to now the fading effect has only been well established in mathematical solution procedures. How fading can be implemented effectively for less well-structured skills is an open issue. In summary, learning outcomes are superior when problem-solving steps are increasingly introduced into example study. Adaptive fading to individual progress is especially effective.

5. Added value of the integrative theory

First of all, research on OL and on AR confirms many existing assumptions in WE research. Although such confirmation might not seem “spectacular” at first glance, it is important for at least three reasons: (1) Many findings and theories in fields such as cognitive science, psychology, and education “suffer” from fragile and inconsistent evidence; having well-established findings and theories backed up by diverse research areas stands in positive contrast to the state of the art in many other research areas. (2) There have been several recent efforts to synthesize the multitude of often-inconsistent findings on learning and instruction in order to provide practice guidelines for teachers (e.g., Cromley & Byrnes, 2012; Ontario Ministry of Education, 2011; Pashler et al., 2007); this is why it is crucial to select and rely on those findings that are backed up by many studies and diverse research approaches. (3) Although example-based learning is effective, it is not a learning method in line with the Zeitgeist of (socio) constructivist approaches (see Tobias & Duffy, 2009). For example, most (German) teachers seem to think that example-based learning is old-fashioned and just leads to superficial learning (e.g., Hilbert et al., 2008a). It is thus important to have strong and converging evidence to convincingly argue for example-based learning in practice. Beyond this convergence, there are six “complementary” benefits of the present integrative perspective for WE research (although a thorough discussion of what the OL and AR approaches can learn is beyond the scope of this article):

Learning not only during initial example encoding. The WE and, in particular, cognitive-load approaches remain unclear about what the specific processes are when knowledge gained from examples is used in later problem solving. From AR research, we know that there are three important processes beyond example encoding, which is the focus of WE research (for exceptions see Chi et al., 1989; VanLehn, 1998): searching and selecting, mapping, and schema construction. The classic WE approach (Sweller & Cooper, 1985) assumes that problems given after examples mainly heighten the motivation to study the examples (encoding). AR research shows that such an arrangement is also fruitful because students refer to the preceding examples when trying to solve problems, map the problems to examples, and thereby construct generalized schemas for later problem solving. As the construction of generalized schemas can occur during example encoding as well as during subsequent problem solving, the two explanations complement each other. In addition, Bandura (1986) argued that when examples and problems are interleaved, experienced deficits provide a productive guideline for the learners as to what to focus on when studying the next example.

Working memory limitations also relevant when using knowledge gained from examples. Both WE and AR research emphasize the relevance of working memory limitations. The arguments are similar: The construction of appropriate schemas is hindered when learners are focusing on the surface features of the problems at hand due to heavy working memory load (Holyoak, 2012; Sweller et al., 1998). However, the two approaches address different phases in which working memory limits hinder learning. WE approaches focus on example encoding, whereas AR research mainly considers working memory limitations during searching and selecting, as well as mapping. Hence, it is also important to reduce working memory load when applying knowledge gained from examples.

Relevance of surface features. WE research primarily considers (abstracted) schemas that are constructed from examples and later applied. However, AR research shows that learners use their knowledge traces of concrete examples and their surface features to retrieve and use analogs for problem solving, and even advanced learners rely on surface features in their first attempts. Although reliance on such surface features can be detrimental, more advanced learners notice when they have been initially misguided (Novick, 1992). In many real-life situations, reliance on surface features can provide useful cues for analogs and corresponding principles (e.g., Bassok, 1996; Blessing & Ross, 1996). In a nutshell, the WE approach underestimated the relevance of surface features for problem solving.

Self-explaining and comparing examples. In WE research, self-explanation is regarded as the “silver bullet” of example processing. Example comparison plays a minor role in different variants of the self-explanation concept (Chi et al., 1989; Renkl, 1997). Self-explanation refers mainly to elaborating on single examples (e.g., by explicating underlying principles). AR research shows that example comparison is also a very important way to gain understanding and construct abstracted schemas (Holyoak, 2012). There is unfortunately next to no research comparing both possibilities, with the one exception being Nokes-Malach et al. (2012), who found that both pathways lead to about the same level of conceptual understanding, although self-explanation revealed slight transfer advantages (see also Ngu & Yeung, 2012). In any case, WE research has so far largely ignored the potential of example comparison. It is worth noting that these two types of processing do not exclude each other.

Enacted models and their possibilities. Whereas WEs are typically written materials, the OL approach shows that perceptually rich, “enacted” models attract learners' attention. When they can identify with the models (Bandura, 1986), they feel motivated to employ what has been modeled. Such social and motivational aspects have been largely neglected in WE research thus far. Furthermore, coping models can show which strategies can be employed to overcome potential impasses during performance (cf. van Gog & Rummel, 2010). Overall, enacted models, as compared to traditional WEs, open additional instructional possibilities.

Gaining flexibility in later phases of skill acquisition. WE phase models perceive the main accomplishment in the later phase in gaining speed, accuracy, and the direct retrieval of solutions. However, the optimization of complex skills, which include non-recurrent aspects that cannot be fully automated (e.g., composing good arguments), also includes gaining flexibility and broadening the applicability of skills to new contexts (see Schunk & Zimmerman, 1997, 2007). In that sense, good examples or models can also help in the fourth phase when learners observe how skills can be further refined or extended to new areas of application.

6. Summary and outlook

In this article, we have proposed a theory of example-based learning informed by three research traditions: WE, OL, and AR. This theory has a descriptive and prescriptive part (for a summary see Table 3). As it would be redundant to reiterate the contents of Table 3, in the remaining discussion we provide examples of how the OL and AR traditions could profit from taking a look at WE research. Do note, however, that a profound discussion on the potential lessons to be learned by OL and AR approaches is beyond the scope of the article.

Table 3. Summary of the main theoretical propositions

Phases of skill acquisition
  Four overlapping phases:
  1. Principle encoding
  2. Relying on analogs
  3. Forming declarative rules
  4. Fine tuning: automation and flexibilization
Relevance and effectiveness of examples
  • High in initial cognitive skill acquisition
  • Superior to learning by doing, even if well supported
  • Active self-explaining and comparing foster transfer to novel problems
  • Low in later phases, especially when automation is the main goal
Explanations of effectiveness
  • Reducing superficial strategies and cognitive load irrelevant for schema construction
  • Affordance for self-explaining and comparing examples, resulting in schema construction
  • Affordance for relating abstract principles with concrete episodes
  • Provision of potentially useful “episodic” cases
Design principles
  • Self-explanation and comparison principle
  • Explanation-help principle
  • Model-observer similarity principle
  • Example-set principle
  • Easy-mapping principle
  • Meaningful building blocks principle
  • Studying errors principle
  • Imagery principle
  • Interleaving by fading principle

In OL arrangements, examples usually include instructional explanations (Schunk & Zimmerman, 2007). This combination is often suboptimal, as traditional WE research has shown. Supporting the self-explanation of examples is usually the better option (Schworm & Renkl, 2006). In some modeling studies, such as that of Braaksma et al. (2006), students were actually encouraged to deeply process and compare two examples, which is both a self-explanation and a comparison activity. However, it is unclear what effects the induction of self-explaining and comparing had in these studies because there were no corresponding control conditions. In a nutshell, although self-explanation activities and corresponding instructional procedures are very important factors in determining the effects of example-based learning, they have thus far played a minor role in OL research and models.

Analogical reasoning research focuses mainly on example comparison and relating examples to abstract principles, which is, without a doubt, extremely important. However, WE research shows that the use of earlier analogs is determined by many important factors, in particular, by the elaboration of single examples and example-design features (e.g., easy mapping, meaningful building blocks). Hence, AR research would profit greatly from considering the instructional principles detailed in this article, both from an instructional perspective and because applying or neglecting these principles may moderate AR processes.

In conclusion, there is little doubt that example-based learning is one of the best researched learning methods. The present theory integrates the most important findings on example-based learning and has led to a number of well-founded theoretical propositions. We hope that this integrative theory will stimulate fruitful research that further advances our understanding of example-based learning, one of the most powerful learning methods we know.


Thanks to Carole Cürten for correcting and refining my English.