There is a strong tendency in cognitive psychology to compartmentalize research into different areas, such as memory, learning, categorization, or decision making. As a result, there has been little contact between these fields, which has led to notable blind spots. We will focus on one particular example of mutual blindness, namely research on causal and category induction, which traditionally have been treated as separate learning phenomena.
Typically, studies on causal learning present learners with precategorized potential causes (e.g., presence or absence of fertilizers) that could be potentially related to preclassified effects (e.g., presence or absence of blooming). The categories referring to causes and effects have been treated as unproblematic givens; thus, the only remaining task was to learn about the existence and strength of the causal relations (e.g., De Houwer & Beckers, 2002; Shanks, Holyoak, & Medin, 1996).
Research on categorization has largely neglected the role of causal information for category induction. This is particularly clear for theories that solely focus on the role of similarity in the formation of categories (see Murphy, 2002). However, even within the paradigm of theory-based categorization (Murphy & Medin, 1985), the main focus has been on internal category structure, that is, the causal and functional relations that link features within categories. For example, disease categories can often be represented as common-cause models with the category features representing causes (e.g., viruses) and effects (e.g., symptoms). It can be shown that the type of causal model linking separate features within such categories influences learning, typicality judgments, and inductive inferences (Rehder, 2003a,b; Rehder & Hastie, 2001, 2004; Waldmann, Holyoak, & Fratianne, 1995; Waldmann, 1996, 2000, 2001). However, the interrelation between learning categories of cause and effect and the induction of the causal relations linking the category members with other events has been neglected.
1.1. The tight coupling of category and causal induction
In a seminal study, Lien and Cheng (2000) explored the relationship between category learning and causal induction. In their learning experiments, participants were presented with a set of uncategorized cause exemplars, which could be classified at different hierarchical levels of abstraction. No category labels were provided. Instead participants only observed which cause exemplars (different types of substances) generated a specific causal effect (blooming of flowers). The question was how participants would categorize the cause events in the absence of any explicit information on category structure. The results of the experiments showed that learners categorized the exemplars at the hierarchical level that was most predictive for the effect. Lien and Cheng (2000) interpreted this as evidence for their maximal contrast hypothesis: People tend to induce categories that maximize predictiveness.
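The idea of selecting the maximally predictive level of abstraction can be illustrated with a minimal sketch. It assumes that predictiveness is measured as the contrast ΔP = P(effect | category) − P(effect | not category); the exemplar data, the level names `hue` and `family`, and the function names are hypothetical illustrations and are not taken from Lien and Cheng's (2000) materials.

```python
# Hypothetical cause exemplars described at two levels of abstraction.
# "effect" is 1 if the substance made the flower bloom, 0 otherwise.
EXEMPLARS = [
    {"hue": "red", "family": "warm", "effect": 1},
    {"hue": "red", "family": "warm", "effect": 1},
    {"hue": "orange", "family": "warm", "effect": 1},
    {"hue": "orange", "family": "warm", "effect": 1},
    {"hue": "blue", "family": "cool", "effect": 0},
    {"hue": "blue", "family": "cool", "effect": 0},
    {"hue": "green", "family": "cool", "effect": 0},
    {"hue": "green", "family": "cool", "effect": 0},
]


def delta_p(observations, category):
    """Contrast P(effect | category) - P(effect | not category)."""
    inside = [e for label, e in observations if label == category]
    outside = [e for label, e in observations if label != category]
    if not inside or not outside:
        return 0.0  # contrast is undefined without both reference classes
    return sum(inside) / len(inside) - sum(outside) / len(outside)


def best_level(exemplars, levels):
    """Score each level by its largest absolute contrast; return the winner."""
    scores = {}
    for level in levels:
        obs = [(ex[level], ex["effect"]) for ex in exemplars]
        categories = {label for label, _ in obs}
        scores[level] = max(abs(delta_p(obs, c)) for c in categories)
    return max(scores, key=scores.get), scores


level, scores = best_level(EXEMPLARS, ["hue", "family"])
print(level, scores)
```

With these illustrative data the more abstract level (`family`) yields a contrast of 1.0, whereas no single `hue` category exceeds 2/3, so the abstract level would be preferred under this reading of the hypothesis.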
More recently, Marsh and Ahn (2009) have reported converging results. In their studies, participants were presented with exemplars that varied on a continuous dimension (e.g., high, intermediate, low height of bacteria). Marsh and Ahn manipulated the assignment of the exemplars to a binary effect (e.g., presence or absence of a protein). For example, in one condition only exemplars with high and intermediate values, but not those with low values, caused the effect. This was contrasted with a condition in which only exemplars with high values caused the effect, but not exemplars with intermediate or low values. The results showed that learners tended to categorize the cause exemplars according to the boundaries entailed by the effect. Thus, the ambiguous intermediate value was classified together with the high value when both caused the effect; otherwise it was classified with the low value. These findings support Lien and Cheng’s (2000) theory by showing that learners attempt to categorize causes according to their effects and create categories which are maximally predictive. In sum, previous research has shown that subjects categorize cause exemplars according to their effects. Here, we will investigate whether effect exemplars are categorized according to their causes (i.e., whether people form cause-based categories).
Whereas Lien and Cheng (2000) and Marsh and Ahn (2009) showed that people categorize exemplars according to the features that maximize predictiveness, Kemp, Goodman, and Tenenbaum (2010) were interested in whether people categorize nondiscriminable objects according to their causal power. In their experiments, they presented subjects with perceptually indistinguishable blocks that either activated a machine or did not. In Experiment 1, Kemp and colleagues manipulated the grouping of the blocks with respect to their causal power. For example, in one condition four blocks never activated the machine and four blocks activated the machine half of the time. The machine was never active in the absence of a block. It was expected that learners would induce two classes of otherwise indistinguishable blocks which differ with respect to their causal power. To test this prediction, learners were confronted with single trials of novel test blocks. Although subjects saw the test blocks either activating or not activating the machine only once, they were capable of predicting the effects of a test block in a hypothetical setting in which the block would be placed inside the machine multiple times. For example, when the test block activated the machine, subjects inferred in the condition described above that the test block probably belongs to the category of blocks with intermediate causal power, whereas it probably belongs to the other category when it fails to activate the machine. Thus, learners used previously acquired category knowledge to make the inductive inference. Although learners had difficulties in a condition in which both categories had probabilistic causal power (0.1 vs. 0.9), additional experiments clarified that this difficulty can be overcome if additional feature information aiding the categorization process is made available.
Another interesting finding was that learners tended to abandon the previously induced categories and induce new ones when the behavior of the test object seemed to be inconsistent with the previously observed categories. For example, when a new test block activated the machine often, whereas previously observed blocks did not, learners tended to conclude that they were observing an example from a new category. This finding shows that learners may use bottom-up statistical knowledge about causal power to decide whether a novel exemplar belongs to a previously seen or a new category.
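The single-trial inference described for the condition with causal powers of 0 and 0.5 can be made concrete with a simple application of Bayes' rule. The following is only a minimal sketch, not the hierarchical model proposed by Kemp et al. (2010): it assumes that the two categories and their causal powers are already known, that both categories are equally likely a priori (four blocks each), and the function names are hypothetical.

```python
def posterior_over_categories(activated, powers, priors):
    """Posterior probability of each category after one observed trial.

    powers[i] is the assumed probability that a block of category i
    activates the machine; priors[i] is its assumed prior probability.
    """
    likelihoods = [p if activated else 1.0 - p for p in powers]
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    total = sum(joint)
    if total == 0:
        raise ValueError("observation is impossible under all categories")
    return [j / total for j in joint]


def predicted_activation_rate(activated, powers, priors):
    """Expected activation rate of the test block on future trials."""
    posterior = posterior_over_categories(activated, powers, priors)
    return sum(w * p for w, p in zip(posterior, powers))


POWERS = [0.0, 0.5]   # causal powers of the two block categories
PRIORS = [0.5, 0.5]   # assumption: four blocks were observed per category

print(posterior_over_categories(True, POWERS, PRIORS))
print(posterior_over_categories(False, POWERS, PRIORS))
print(predicted_activation_rate(False, POWERS, PRIORS))
```

Under these assumptions, a single activation assigns the test block to the 0.5 category with certainty, whereas a single failure favors the 0 category with probability 2/3. With powers of 0.1 and 0.9, the same computation yields a posterior of only 0.9 after one activation, so a single trial is less diagnostic in that condition.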
1.2. Category transfer across multiple causal relations
The studies reported in the previous section examined the interplay between category and causal learning within single cause-effect relations. While maximal predictiveness can easily be defined when only a single cause-effect relation is considered, the situation becomes more complex when the category members are involved in multiple causal relations. In such a situation, each causal relation may in principle entail a different maximally predictive categorical scheme. These schemes may potentially be conflicting.
In the present research, we studied categories involved in multiple causal relations. In particular, we presented participants with causal chains linking three entities,¹ A→B→C. Assume, for example, that an initial cause A, radiation, influences an intermediate entity B, viruses, which in turn may cause a swelling of the spleen (i.e., splenomegaly) (event C). Whereas A and C are both presented as precategorized binary events (radiation vs. no radiation, and splenomegaly vs. no splenomegaly), B is a set of uncategorized exemplars (viruses). Thus, the intermediate entity B is part of two causal relations, each of which could be used as a basis of causally motivated categorization. Using one of the two relations to induce categories yielding maximal contrasts will lead to optimal categories for the respective single relation, but these categories may not necessarily be optimal for the other relation.
If both causal relations were learned at once, the computational problem would be to create categories for entity B that are globally optimal for predicting both related events, although they may not be locally optimal with respect to either. Alternatively, one could try to induce two category schemes from the causal information, which would not necessarily have to overlap. In the present research, we will not study learning situations in which information about multiple, interconnected causal links is simultaneously presented. We will instead focus on a case that seems more frequent in real-world learning. We rarely learn about all relations of a causal model at once but rather acquire causal knowledge in fragments, which later are integrated into more complex causal models (Fernbach & Sloman, 2009; Lagnado, Waldmann, Hagmayer, & Sloman, 2007; Waldmann, Cheng, Hagmayer, & Blaisdell, 2008; Waldmann, Hagmayer, & Blaisdell, 2006). For example, people may first learn about the causal relation A→B and use A to categorize B (i.e., induce a cause-based category). In a later learning context, they may be presented with the causal relation B→C. The main question of interest in the present research is under which conditions the categories for B (induced in the initial learning phase; A→B) will be transferred to learning about B→C, at the possible cost of reduced predictiveness for the second causal relation. Alternatively, learners could abandon the previously acquired category scheme for B and induce new, effect-based categories for B better suited for predicting event C.
Waldmann, Meder, von Sydow, and Hagmayer (2010) presented a first set of studies investigating this type of category transfer within causal chains. In one experiment, we used the three events mentioned above, radiation (A), viruses (B), and splenomegaly (C), within a two-phase causal learning paradigm. Importantly, no feedback about category labels of the events in B was provided. Learners observed the virus exemplars (entity B) along with information about their causes (event A) in the first causal learning phase, and the virus exemplars with information about their effects (event C) in the second causal learning phase. The general finding was that participants indeed tended to stick to the initially acquired virus categories and used this category scheme to learn about the second causal relation. Interestingly, this was the case even though a more predictive classification with respect to the final effect of the chain existed. Thus, learners did not form different category schemes for different causal relations but tended to favor category parsimony over flexibly recategorizing the same entities with respect to the learning context.
1.3. Boundary conditions of category transfer
In the present set of studies, we are interested in exploring the boundary conditions of category transfer. Although Waldmann et al. (2010) generally found evidence for category transfer, this may not universally be the case. In fact, previous experiments (Waldmann & Hagmayer, 2006) suggest that there may be circumstances in which learners might be reluctant to transfer categories. In these studies, a two-phase learning paradigm was used in which a supervised category learning phase was followed by a causal learning phase involving the same or similar exemplars. Thus, unlike in the studies discussed above, explicit feedback about category labels was given in the initial category learning phase. For example, participants first learned to categorize fictitious viruses into two mutually exclusive classes (e.g., allovedic vs. hemovedic viruses). In the subsequent causal learning phase, the exemplars were presented along with information about the presence or absence of a causal effect (splenomegaly). In these studies, participants also revealed a strong tendency to continue to use old conceptual schemes rather than inducing new ones. However, category transfer depended on the relevance of the categories for the causal effect. Whenever the category labels suggested natural kinds that could be plausibly related to the causal effect, transfer was observed. But when the categories were arbitrary, or could not be semantically linked to the causal effect, learners abandoned the categories and induced a novel set of categories that mirrored the category boundaries entailed by the causal effect. This was demonstrated in Waldmann and Hagmayer’s (2006) Experiment 3, where participants first learned that exemplars could be classified into two types of viruses. In the subsequent causal learning phase, the same virus exemplars were introduced as blueprints for aesthetic patterns used in interior design. Participants’ task was to learn which virus patterns people liked. Learners abandoned the previously acquired categories and induced a new scheme that was maximally predictive for the causal relation.
One possible interpretation of these findings is that they support psychological essentialism (Medin & Ortony, 1989), which is claimed to underlie the naïve representation of natural kinds from childhood on (see also Ahn et al., 2001; Gelman, 2003; Rehder, 2007; Rehder & Kim, 2006). According to this theory, people tend to ascribe stable hidden essences to natural kind categories, such as viruses, which may cause various visible features. Once learners believe that their categories refer to something real and stable in the world, they should be reluctant to change these categories even when the categories support only weak probabilistic relations in later causal learning. For example, many people treat gender or race as a natural kind category and are perfectly willing to accept weak probabilistic relations instead of looking for more predictive categorizations of people (see Hirschfeld, 1996). Thus, transfer should be observed when learners believe that the same essentialist natural kind categories are causally relevant in the two presented causal relations. Viruses are certainly plausible generators of splenomegaly so that learners may have felt that the virus categories are causally relevant in the causal learning phase, and should therefore be reused. However, in Experiment 3 of Waldmann and Hagmayer (2006), the cover stories suggested that the virus categories from the first learning phase were not relevant for the relation between the viruses and the aesthetic assessment in the causal learning phase, which may have led to the tendency to abandon the initial categories and recategorize the stimuli.
1.4. The unbroken mechanism hypothesis
The results of Waldmann and Hagmayer (2006), however, are theoretically ambiguous. The observed category transfer may only occur when people believe that the hidden (essential) features of natural kind categories are linked to the effect presented in the causal learning phase. Although we believe that psychologically essential features do play an important role in eliciting category transfer, category transfer in causal induction need not be restricted to the use of essential features. Instead, category transfer may depend on people’s causal theories, whether essences play a role or not.
We have developed the unbroken mechanism hypothesis, which posits that it is the causal relevance of the involved features and people’s assumptions regarding the involved causal mechanisms that drive category transfer. This hypothesis is inspired by the idea that people have strong intuitions that statistical contingencies arise from the operation of (often unobservable) causal mechanisms that specify how the cause events generate (or inhibit) the effect events (see Ahn & Kalish, 2000; Ahn, Kalish, Medin, & Gelman, 1995; Griffiths & Tenenbaum, 2009; Waldmann, 1996). These intuitions need not be very precise or correct; they may be vague or even faulty (Rozenblit & Keil, 2002). Nevertheless, they might play an important role in the way we perceive, interpret, and represent data.
The central idea behind the unbroken mechanism hypothesis is that transfer of causally induced categories to further learning episodes depends on whether learners assume a continuous causal mechanism connecting different categories of events. Whenever learners assume an unbroken causal mechanism, we expect participants to induce a coherent category scheme comprising all involved causal relations. Thus, for transfer it is not sufficient that the same entities are involved in multiple relations to form a complex causal model; rather the same properties or features of the entities must be causally relevant for both causal relations so that the same reference classes are picked out for the two causal relations. If different, causally unconnected features of the same entities are involved in multiple causal relations, learners might opt for inducing new, more predictive classification schemes. In this case, different sets of categories may be used for the two causal relations.²
The simplest variant of an unbroken mechanism is the case in which the same features of the category members are causally relevant. Assume the causal chain’s initial event is radiation which affects the DNA of a set of uncategorized virus exemplars (second event). The viral DNA, in turn, might determine whether a given virus exemplar does or does not cause a disease (final event). In this case, the same features (i.e., DNA) of the category members (i.e., viruses) are causally relevant for both relations. By contrast, if the chain’s initial event affects the surface features of the viruses while the viruses’ DNA is responsible for the final effect in the causal chain, different features of the entity would be relevant for the two causal relations. In this case, different category sets may be induced for each of the relations; hence, no category transfer should be observed.
There are cases in which different properties of the category members are involved in the two causal relations, but nevertheless an unbroken mechanism exists. For example, if in the first relation the DNA of the category members is involved, and the second causal relation is triggered by surface features, then an unbroken mechanism might still be in place if learners assume that the surface features of the category members are caused by their DNA. Although different causal features are involved in the two relations, the internal causal structure of the category links the different features with each other. This causal model entails that categories describing the status of the DNA are viewed as direct causes of the categories summarizing the surface features. Therefore, the DNA is viewed as an indirect cause of the target effect. Thus, this is again an example of an unbroken mechanism, and transfer is predicted.
1.5. Preview of experiments
In the present set of experiments, we studied the role of unbroken versus broken mechanisms in causal chains containing three entities A→B→C. The learning input was kept constant across conditions, while learners’ assumptions regarding the underlying hidden mechanisms causally connecting the three entities were manipulated. In particular, learners’ assumptions about the causal relevance of different features of the intermediate entity and its internal causal structure were manipulated.
There are a number of ways in which the two causal links constituting a causal chain model can be linked to each other. The range of these possibilities is illustrated in Fig. 1. One possibility, tested in Experiment 1a, is that the initial entity A affects hidden features of the intermediate entity B. These hidden features are not directly observable for the subjects but can be inferred from visible surface features which are caused by the hidden features. For example, different types of microbes (entity A) might affect hidden features of protozoa’s DNA (property of entity B), thereby systematically altering the protozoa’s visible appearance (second property of entity B). If the hidden features are assumed to be causally responsible for the final effect (a swelling of the spleen), the causal mechanism underlying the observable correlations is unbroken. This is because the feature (i.e., the DNA) affected by the initial cause also affects the final effect (Causal Model I in the left column of Fig. 1). For this scenario, the unbroken mechanism hypothesis predicts category transfer.
We contrasted this case in Experiment 1b with an example of a broken mechanism in which the two causal relations were not linked within entity B. In particular, we investigated a situation in which the initial cause entity A affects the visible surface features of the intermediate entity but the hidden features of this intermediate entity are causally responsible for the final effect C (Causal Model II in the left column of Fig. 1). Here, different features of entity B are causally relevant for the two relations and there is no causal link between hidden and surface features. In contrast to essentialism, the unbroken mechanism hypothesis predicts that the initially induced categories of B would not be used when learning about the second causal relation, because there is no causal link between the surface and the hidden features of B.
The goal of Experiment 2 was to further explore the boundary conditions of category transfer. The experiment consists of a set of four closely related studies, in which participants were presented with different causal models containing different types of causal mechanisms. Whereas the unbroken causal mechanism in Experiment 1 is mediated through the hidden features of entity B, there are further cases of unbroken mechanisms which could link an initial cause A with a final effect C (Fig. 1, right column). A second possibility for an unbroken causal mechanism is that the initial cause A affects the hidden features of the intermediate entity B, which in turn influence its visible surface features (Causal Model II in the right column of Fig. 1). The surface features of the entity B are then causally relevant for the final effect C. This is a different version of an unbroken mechanism linking the components of the causal chain. A further case of an unbroken mechanism is a situation in which the initial cause A directly affects the surface features of entity B, and these features in turn cause the final effect C (Causal Model IV in the right column of Fig. 1). Note that no hidden (or essential) features are involved in this case. Nevertheless, transfer is predicted due to the presence of a continuous series of mechanisms. Such a finding would challenge the position that transfer should only be observed when essential features of natural kinds are involved.
Experiment 2 investigated all three kinds of unbroken mechanisms (Causal Models I, II, and IV in Fig. 1, right column) and contrasted them with a case in which the causal mechanism is broken. In this case, the initial cause A affects hidden properties of entity B, but the final effect C is only causally dependent on surface features of B with no causal relation linking the hidden properties with the causally relevant surface features (Causal Model III in Fig. 1, right column). Although the essential features are linked to the first cause, no category transfer is predicted by the unbroken mechanism hypothesis.
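The predictions for the four causal models of Experiment 2 can be summarized in a small sketch. It rests on a simplifying assumption that is ours for illustration only: a mechanism counts as unbroken if and only if a directed path leads from the initial cause A through features of entity B to the final effect C. The node names `hidden` and `surface`, the edge lists (reconstructed from the verbal descriptions of Fig. 1, right column; the hidden-to-surface link in Model I is assumed by analogy with Experiment 1a), and the function name `has_path` are hypothetical.

```python
def has_path(edges, start, goal):
    """Return True if a directed path leads from start to goal."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False


# Edge lists for the causal models in the right column of Fig. 1,
# reconstructed from the text (an illustrative assumption).
MODELS = {
    "I": [("A", "hidden"), ("hidden", "surface"), ("hidden", "C")],
    "II": [("A", "hidden"), ("hidden", "surface"), ("surface", "C")],
    "III": [("A", "hidden"), ("surface", "C")],  # no hidden-surface link
    "IV": [("A", "surface"), ("surface", "C")],  # no hidden features involved
}

# Under the assumption above, transfer is predicted iff A reaches C.
PREDICTED_TRANSFER = {
    name: has_path(edges, "A", "C") for name, edges in MODELS.items()
}
print(PREDICTED_TRANSFER)
```

In this encoding, Models I, II, and IV contain a path from A to C and therefore predict transfer, whereas Model III does not, even though the hidden features are linked to the initial cause.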
In summary, the central goal of both experiments was to test a novel hypothesis about the boundary conditions for category transfer in sequential causal learning. Whereas previous accounts, inspired by psychological essentialism, assume that transfer is primarily governed by the causal role of hidden, essential features of natural kinds, we propose a more general hypothesis. According to this hypothesis, people’s beliefs in the connectedness of the causal mechanisms underlying a causal chain drive category transfer, rather than the involvement of essential features. Thus, the hypothesis predicts that no transfer will be observed if the mechanism is broken even when the natural kinds’ essential features are involved (either as causes or as effects). Conversely, category transfer is predicted in situations involving an unbroken mechanism even when no essential features of the categorized entities are causally relevant.