A recent, and very useful, book, the Cambridge Handbook of Computational Psychology (Sun, 2008), attempts to characterize the current state of the art in the field. The book begins with a section called “Cognitive Modeling Paradigms” and contains chapters on connectionist models (Thomas & McClelland, 2008), Bayesian models (Griffiths, Kemp, & Tenenbaum, 2008), dynamical systems approaches (Schöner, 2008), declarative/logic-based models (Bringsjord, 2008), and cognitive architectures (Taatgen & Anderson, 2008). I prefer the terms “framework” and “approach” to “paradigm.” I will comment separately on each of the first four approaches, then turn attention briefly to cognitive architectures and hybrid approaches (Sun, 2002).
Attempts to compare and contrast approaches have often been framed in terms of levels (e.g., Marr, 1982). However, the question of levels does not fully capture the different commitments represented by the alternative approaches described above. For example, it is widely argued that connectionist/parallel distributed processing (PDP) models address an implementational level of analysis (Broadbent, 1985; Kemp & Tenenbaum, 2008; Pinker & Prince, 1988; Smolensky, 1988), while other approaches focus on the level of algorithms and representations or on Marr’s highest “computational” level. My colleagues and I have argued instead that the PDP approach exploits alternative representations and processes to those used in some other approaches (Rogers & McClelland, 2008; Rumelhart & McClelland, 1985), and that the framework takes a different stance at the computational level than the one taken by, for example, Kemp and Tenenbaum (in press), in their structured probabilistic models (Rogers & McClelland, 2008). Similar points can be made in comparing the other indicated approaches (for a more extended discussion, see Sun, 2008).
Each approach has attracted adherents in our field because it is particularly apt for addressing certain types of cognitive processes and phenomena. Each has its core domains of relative advantage, its strengths and weaknesses, and its zones of contention where there is competition with other approaches. In the following I consider each of the approaches in this light. The material represents a personal perspective, and space constraints prevent a full consideration.
4.1. Connectionist/PDP models
Connectionist/PDP models appear to provide a natural way of capturing a particular kind of idea about how many cognitive phenomena should be explained while also offering alternatives to once-conventional accounts of many of these and other phenomena. The natural domain of application of connectionist models appears to be those aspects of our cognition that involve relatively automatic processes based on an extensive base of prior experience; among these aspects I would include perception, some aspects of memory, intuitive semantics, categorization, reading, and language.
It cannot be denied that connectionist modelers take inspiration from the brain in building their models; but the importance of this is often overplayed. Most connectionist cognitive scientists find this inspiration appealing not for its own sake, but for its role in addressing computational and psychological considerations (Anderson, 1977; McClelland, Rumelhart, & Hinton, 1986). For example, in the interactive activation model (McClelland & Rumelhart, 1981), Rumelhart and I wanted to explore the idea that the simultaneous exploitation of multiple, mutual constraints might underlie the human ability to see things better when they fit together with other things in the same context (Rumelhart, 1977). To capture this idea, we found that a connectionist/neural network formulation was particularly useful. The computational motivation clearly preceded and served as the basis of our enthusiasm for the use of a connectionist network. Similarly, the motivation for the development of our past tense model (Rumelhart & McClelland, 1986) was to explore the idea that people might exhibit regularity and generalization in their linguistic (and other) behavior without employing explicit rules. Further motivation came from the belief that linguistic regularities and exceptions to them are not categorical as some (Jackendoff, 2002, 2007; Pinker, 1991) have suggested but fall instead on a continuum (Bybee & Slobin, 1982; McClelland & Bybee, 2007; McClelland & Patterson, 2002). Again, the inspiration from neuroscience—the use of a model in which simple processing units influence each other through connections that may be modifiable through experience—provided a concrete framework for the creation of models that allow the implications of these ideas to be effectively explored.
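The mutual-constraint idea can be made concrete with a toy sketch. To be clear, this is my own minimal construction, not the interactive activation model itself: the units, weights, and update rule below are invented for illustration. A word unit and two competing letter hypotheses interact; top-down support from the word resolves an ambiguous bottom-up input, so a letter is "seen" better in a context it fits.

```python
# Toy sketch of mutual constraint satisfaction (illustrative only; not
# the interactive activation model's actual units, parameters, or rule).
# An ambiguous letter equally supports "A" and "H" in the frame C_T;
# "CAT" is a word, so the word unit boosts "A" -- context resolves it.

def step(act, inputs, excite, inhibit, rate=0.2):
    """One synchronous update of unit activations, clipped to [0, 1]."""
    new = {}
    for u in act:
        net = inputs.get(u, 0.0)
        net += sum(act[v] * w for v, w in excite.get(u, []))
        net -= sum(act[v] * w for v, w in inhibit.get(u, []))
        new[u] = min(1.0, max(0.0, act[u] + rate * net))
    return new

excite = {
    "CAT": [("A", 1.0)],   # letter evidence supports the word...
    "A":   [("CAT", 1.0)], # ...and the word feeds activation back down
}
inhibit = {
    "A": [("H", 0.5)],     # competing letter hypotheses inhibit each other
    "H": [("A", 0.5)],
}
inputs = {"A": 0.3, "H": 0.3, "CAT": 0.3}  # ambiguous letter; context present
act = {"CAT": 0.0, "A": 0.0, "H": 0.0}
for _ in range(30):
    act = step(act, inputs, excite, inhibit)
print(act["A"] > act["H"])  # True: word context settles the ambiguity
```

The same bottom-up evidence reaches both letter units; only the recurrent support from the word unit separates them, which is the sense in which multiple simultaneous constraints do the perceptual work.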
Both benefits and potential costs have come from the neural network inspiration for connectionist models. On the down side, some have criticized connectionist models from concerns about their fidelity to the actual properties of real neural networks. Although I have at times shared this concern, I now think debate about this issue is largely misguided. The capabilities of the human mind still elude our most sophisticated models (including our most sophisticated connectionist models). Our efforts to seek better models should be inspired, and indeed informed, by neuroscience, but not, at this early stage of our understanding, restricted by our current conception of the properties of real neural networks.
In a similar vein, I do not wish to give the impression that relying on a neural inspiration somehow constitutes support for a connectionist model. Just as it seems to me improper to rule out a model because it does not seem biologically plausible enough, we would not want to think that a model deserved special credence at the psychological level because it adopted some specific idea from neuroscience as part of its inspiration.
In sum, PDP/connectionist models in cognitive science are offered primarily for what their protagonists see as their usefulness in addressing certain aspects of cognition. The inspiration from neuroscience has provided a heuristic guide in model development, but it should not be used either as a privileged basis for support or as a procrustean bed to constrain the further development of the framework.
4.2. Rational and Bayesian approaches
Rational approaches in cognitive science stem from the belief, or at least the hope, that it is possible to understand human cognition and behavior as an optimal response to the constraints placed on the cognizing agent by a situation or set of situations. A “rational” approach would naturally include Bayesian ideas, which can provide guidance on what inferences should be drawn from uncertain data, but in general it would also incorporate cost and benefit considerations. The costs include time and effort, which clearly place bounds on human rationality (Simon, 1957).
It certainly makes sense to ask what would be optimal in a given situation, thereby providing a basis for judging whether what people actually do is or is not optimal, and for focusing attention on any observed deviations. I am fully on board with this effort, and it seems clear that it can inform all sorts of modeling efforts.
There are also cases in which a rational analysis has correctly predicted patterns of data obtained in experiments. As one example, Geisler and Perry (in press) carried out a statistical analysis of visual scene structure to determine the relative likelihood that two line segments sticking out from behind an occluder are part of a single continuous edge or not, as a function of the two segments’ relative position and orientation. We can intuit perhaps that approximately collinear segments are likely to come from the same underlying edge; Geisler and Perry went beyond this to uncover the details of the actual scene statistics. In subsequent behavioral experiments, Geisler and Perry found that human observers’ judgments of whether two segments appeared to be from the same line conformed to the relative probabilities derived from the scene statistics. Neither the scene statistics nor the judgments matched predictions based on a number of other prior proposals. Here is a clear case in which knowing what is optimal led to interesting new predictions, subsequently supported by behavioral data, and indicating that, in this case at least, human performance approximates optimality. Similarly, rational (Bayesian) approaches have been useful in predicting how humans’ inferences concerning the scope of an object or category label are affected by the distributional properties of known exemplars (Shepard, 1987; Xu & Tenenbaum, 2007).
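The kind of inference such an analysis licenses is just Bayes’ rule applied to measured likelihoods. The sketch below uses invented numbers, not Geisler and Perry’s actual scene statistics, purely to show the form of the computation:

```python
# Hypothetical sketch of contour-grouping inference (numbers invented;
# not the measured statistics). Scene statistics supply the likelihood
# of a given relative position/orientation under each hypothesis; Bayes'
# rule converts these into a posterior that the segments share one edge.

def posterior_same_edge(p_geom_given_same, p_geom_given_diff, prior_same=0.5):
    """P(same edge | observed geometry) by Bayes' rule."""
    num = p_geom_given_same * prior_same
    den = num + p_geom_given_diff * (1.0 - prior_same)
    return num / den

# Nearly collinear segments: common for a single edge, rare for two
# unrelated edges -> high posterior that they belong together.
print(posterior_same_edge(0.08, 0.01))   # ≈ 0.89
# Sharply misaligned segments: the reverse.
print(posterior_same_edge(0.01, 0.08))   # ≈ 0.11
```

Matching human grouping judgments to posteriors of this form, with the likelihoods fixed by independently measured scene statistics rather than fit to the behavior, is what gives the result its force.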
One reaction to some of this work is that it explores Bayesian inference in a very specific inferential context. For example, in Kemp and Tenenbaum (in press) the goal of learning in a cognitive domain is construed as being to discover which of several preexisting types of knowledge structure best characterizes observations in a particular domain. While the authors eschew any commitment to the particular procedures used for calculating which alternative provides the best fit, their approach specifies a goal and a set of alternative structures to consider. Theirs is not, therefore, simply a Bayesian model but a particular model with a number of properties one might or might not agree with, quite apart from the question of whether the mind conforms to principles of Bayesian inference. What separates some of the models my collaborators and I have explored (e.g., Rogers & McClelland, 2004) from these approaches may not be whether one is Bayesian and the other connectionist, but whether one views learning in terms of the explicit selection among alternative prespecified structures.
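Schematically, learning-as-structure-selection amounts to scoring the observed data under each prespecified structural form and comparing posteriors. The sketch below uses invented log-likelihood values and a uniform prior; the actual Kemp and Tenenbaum models compute these scores from real data under each candidate form.

```python
import math

# Schematic sketch of selection among prespecified structure types
# (log-likelihoods invented for illustration).
log_likelihood = {"tree": -10.2, "chain": -14.7, "ring": -13.1}  # log P(D|S)
log_prior = {s: math.log(1.0 / 3.0) for s in log_likelihood}     # uniform P(S)

# Posterior over structures: P(S|D) proportional to P(D|S) P(S),
# computed in log space for numerical stability.
log_post = {s: log_likelihood[s] + log_prior[s] for s in log_likelihood}
z = max(log_post.values())
unnorm = {s: math.exp(v - z) for s, v in log_post.items()}
total = sum(unnorm.values())
posterior = {s: u / total for s, u in unnorm.items()}

best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))   # tree 0.938
```

The substantive commitment lies less in the Bayesian arithmetic than in the stipulated menu of candidate structures ("tree", "chain", "ring", ...), which is exactly the property one might or might not accept.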
Considering the “rational models” approach more broadly, one general problem is that what is “rational” depends on what we take to be an individual’s goal in a given situation. For example, if a human participant fails to set a hypothesized response threshold low enough to maximize total reward rate in a reaction time experiment, we cannot judge that the participant is behaving nonoptimally, because we may have missed an important subjective element in the participant’s cost function: to the participant, it may be more important to be accurate than to earn the largest possible number of points in the experiment. If we allow a free parameter for this, we unfortunately can then explain any choice of threshold whatsoever, and the appeal to rationality or optimality may lose its force, unless we can find independent evidence to constrain the value of the free parameter. Another problem is that what is rational in one situation may not be rational in another. We then confront a regress in which the problem for rational analysis is to determine how we should construe a given situation.
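The free-parameter worry can be made concrete with a deliberately stylized sketch. All functional forms below are invented (not a fitted model of any experiment); the point is only that once a subjective accuracy weight enters the utility function, very different thresholds become "optimal":

```python
# Stylized sketch (invented forms): with a free parameter weighting
# subjective accuracy against earned points, widely different response
# thresholds can each be rationalized as optimal.

def subjective_utility(threshold, accuracy_weight):
    accuracy = threshold / (threshold + 1.0)   # accuracy rises with threshold
    mean_rt = 0.3 + 0.5 * threshold            # but responding slows
    reward_rate = accuracy / mean_rt           # points per second
    return reward_rate + accuracy_weight * accuracy

def optimal_threshold(accuracy_weight, grid=None):
    grid = grid or [i / 100 for i in range(1, 501)]
    return max(grid, key=lambda t: subjective_utility(t, accuracy_weight))

print(optimal_threshold(0.0))   # pure reward-rate maximizer: low threshold
print(optimal_threshold(5.0))   # accuracy-weighted: much higher threshold
```

Unless independent evidence pins down the accuracy weight, any observed threshold can be declared optimal under some value of it, which is precisely how the appeal to rationality loses its force.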
Appeals to evolution as an optimizing force are often brought up in defense of optimality analysis. The difficulties with this were clearly spelled out by Gould (1980) in The Panda’s Thumb: Evolution selects locally for relative competitive advantage, not for a global optimum. The forces operating to determine relative competitive advantage over the course of evolutionary history are themselves very difficult to determine, and the extent to which whatever was selected for will prove optimal in a given situation confronting a contemporary human is also difficult to judge. It is possible to agree fully that selection shapes structure and function while stopping short of a conclusion that the result is in any way optimal, much less optimal in the contemporary world, which is so different from the one we evolved in.
Thus, as with seeking one’s inspiration from the properties of biological neural networks, seeking one’s inspiration from a rational analysis is no guarantee of success of a cognitive model. In my view, we should evaluate models that come from either source for their usefulness in explaining particular phenomena, without worrying too much about whether it was neural inspiration or rational analysis that provided the initial motivation for their exploration.
4.3. Dynamical systems approaches
The dynamical systems approach begins with an appeal to the situated and embodied nature of human behavior (Schöner, 2008). Indeed, it has mostly been applied either to the physical characteristics of behavior (Does a baby exhibit a walking gait when placed in a particular posture? Thelen & Smith, 1994), or to the role of physical variables in determining behavior (Does the physical setting of testing affect the presence of the A-not-B error? Thelen, Schöner, Scheier, & Smith, 2001). The presence or absence of the behaviors in question has previously been taken by others to reflect the presence or absence of some cognitive or neural mechanism; thus, dynamical systems approaches have clearly brought something new into consideration. For example, babies held in a standing position exhibit a stepping gait at a very early age but do not do so when they are a little older. A dynamical systems approach explains this not in terms of the maturation of top-down inhibitory mechanisms, but in terms of the greater weight of the legs of the older baby (Thelen & Smith, 1994).
The idea that dynamical systems ideas from the physical sciences might usefully apply to cognitive and psychological modeling is still a fairly new one—partly because the analysis of physical dynamical systems is itself relatively new. While there can be little doubt that the approach has led to some interesting models of aspects of performance in physical task situations, the idea that it will be possible to build an interesting theory of cognition within this approach, as Schöner (2008) proposes, remains open at this stage.
To date at least, the dynamical systems approach has also been relatively silent about the processes that give rise to changes in behavior in response to experience and over the course of development. Continuous underlying change in some parameter is often assumed to account for developmental differences, and the occurrence of discontinuous changes that can result from such changes has been demonstrated using simulations based on this approach. But the source of change often comes from outside the model, as an externally imposed “control variable.” Finding ways to allow the changes in these variables to arise as a result of experience and behavior will greatly enhance the dynamical systems framework. Extending the framework to deal with less obviously concrete aspects of behavior will also be an interesting challenge for the future. The importance of gesture, at least as a window on more abstract mental process, and perhaps even as a factor in these processes (Alibali & Goldin-Meadow, 1993), suggests that the extension may well have considerable value.
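The claim that continuous change in a control variable can produce discontinuous behavioral change has a simple canonical illustration, the pitchfork bifurcation. The sketch below is a generic textbook case, not any published developmental model:

```python
# Minimal sketch (generic, not a published model): a smoothly varied
# "control variable" a drives dx/dt = a*x - x**3 through a pitchfork
# bifurcation, so continuous parameter change yields a discontinuous
# qualitative change in the system's rest state.

def settle(a, x0=0.01, dt=0.01, steps=20000):
    """Integrate dx/dt = a*x - x^3 by Euler steps until (near) rest."""
    x = x0
    for _ in range(steps):
        x += dt * (a * x - x ** 3)
    return x

for a in [-0.5, -0.1, 0.1, 0.5]:   # smooth sweep of the control variable
    print(a, round(settle(a), 3))
# Below a = 0 the only rest state is x = 0; above it, the rest state
# jumps away from zero, settling near sqrt(a).
```

Note, in line with the concern raised above, that the change in a here is imposed from outside the model; nothing in the dynamics themselves explains why the control variable moves.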
4.4. Symbolic and logic-based approaches
The notion that thought is essentially the process of deriving new propositions from given propositions and rules of inference lies at the foundation of our philosophical traditions, and thus it is no accident that a focus on this type of process played a prominent role in the early days of cognitive modeling. Fodor and Pylyshyn (1988) articulated the view that the fundamental characteristic of thought is its ability to apply to arbitrary content through the use of structure- rather than content-sensitive rules.
It may well be that some of the supreme achievements of human intelligence have been the creation of inference systems of just this sort. These systems, once developed, have allowed such things as proofs of very general theorems and construction of beautiful systems for mathematical reasoning such as geometry, algebra, and calculus. It is therefore perhaps only natural that many cognitive theorists have sought a basis for understanding human thought as itself being essentially such a formal system. The acquisition of formal systems (arithmetic, algebra, computer programming) is a central part of modern education, and many of the errors people make when they use these systems can be analyzed as reflecting the absence (or perhaps the weakness) of a particular structure-sensitive rule (for an analysis of the errors children make in multicolumn subtraction, see Brown & VanLehn, 1980). Therefore, the framing of many cognitive tasks in terms of systems of rules will surely continue to play a central role in the effort to model (especially the more formal) aspects of human thinking and reasoning.
It seems at first glance very natural to employ this type of approach to understanding how people perform when they are asked to derive or verify conclusions from given statements and to provide explicit justifications for their answers (Bringsjord, 2008). Surely, the field of cognitive science should consider tasks of this sort. Yet there seem to me two central questions about the approach. To what extent does this approach apply to other aspects of human cognition? And to what extent, even in logic and mathematics, are the processes that constitute insightful exploitation of any formal system really processes that occur within that system itself?
The claim that the approach is essential for the characterization of language was explicitly stated by Fodor and Pylyshyn but has been challenged both by linguists (Bybee, 2001; Kuno, 1987) and by cognitive scientists (Elman, in press; McClelland, 1992). Fodor and Pylyshyn do not take the position that the structure-sensitive rules are accessible to consciousness—only that the mechanism of language processing must conform to such rules as an essential feature of its design. Indeed it has been thought that the supreme achievement of linguistics has been to determine what these rules are, even though they must be inferred from what is and what is not judged to be grammatical (Chomsky, 1957). This is not the place to review or try to settle this debate, but only to remind the reader of its existence, and perhaps just to note the agreement about the implicitness of the knowledge in this case—here, it is not assumed that native speakers can express the basis for their linguistic intuitions. Other domains in which such questions have been debated include physical reasoning tasks such as Inhelder and Piaget’s balance scale task (Schapiro & McClelland, in press; Siegler, 1976; van der Maas & Raijmakers, 2009). Here the question of explicit justifications is much more complex. People often do provide an explicit justification that accords with their overt behavior, but this is far from always true, making the question of whether the rule is used to guide the behavior or only to justify it a very real one (McClelland, 1995).
The idea that a logic-based approach might be applicable to explicit logical inference tasks (those in which conclusions are to be derived and verified along with an explicit justification for them) seems almost incontrovertible, and yet one may still ask whether these rules themselves really provide much in the way of explanation for the patterns of behavior observed in such task situations. The pattern of performance on the Wason Selection Task (Wason, 1966) is perhaps the best-known challenge to this framework. As Wason found, people are very poor at choosing which of several cards to turn over to check for conformity to the rule “If a card has a vowel on one side, then it has an even number on the other.” Performance in analogs of this task is highly sensitive to the specific content rather than the logical form of the statement (e.g., “If the envelope is sealed, then it must have a 20 cent stamp on it”), and attempts to explain the pattern of human performance in this task are not generally framed in terms of reliance on abstract rules of logical inference (Cosmides, 1989; Oaksford & Chater, 1996).
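For the abstract version of the task, the normatively correct choices follow from elementary logic and can be spelled out in a few lines (using the standard four-card set):

```python
# The abstract Wason task, spelled out. The rule "if a card has a vowel
# on one side, it has an even number on the other" can only be falsified
# by a card that might hide a counterexample: a vowel card (its hidden
# number could be odd) or an odd-number card (its hidden letter could be
# a vowel). The logically correct picks are 'A' and '7'; most people
# instead pick 'A' and '4', though '4' can never falsify the rule.

def must_turn(visible):
    """True if the visible face could conceal a falsifying case."""
    if visible.isalpha():
        return visible in "AEIOU"      # a vowel might hide an odd number
    return int(visible) % 2 == 1       # an odd number might hide a vowel

cards = ["A", "K", "4", "7"]
print([c for c in cards if must_turn(c)])   # ['A', '7']
```

The logic is trivial to state; what the rules do not explain is why performance tracks content (envelopes and stamps) rather than this logical form.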
The approach described in Bringsjord (2008) is one variant of a symbolic approach, and it differs in many ways from other symbolic approaches. Most such approaches are not limited to modeling explicit reasoning tasks and indeed are not generally construed as necessarily conforming to principles of logic. While the best-known current framework for symbolic cognition (ACT-R, Anderson & Lebiere, 1998) employs explicit propositions and production rules that capture explicit condition and action statements, these propositions and productions are not constrained to apply only to explicit reasoning tasks and, even when they do, can make reference to problem content and context, thus avoiding the problems facing an essentially logic-based approach. Furthermore, the content of production rules is not generally viewed as directly accessible, allowing for dissociations between the processes that actually govern performance and the verbal statements people make in explaining the basis for their performance.
Furthermore, both the declarative content and the production rules in contemporary production system models have associated strength variables, making processing sensitive not only to the content and context but also to frequency and recency of use of the information and to its utility. Such models generally can and often do exhibit content- as well as structure-sensitivity, contra the strong form of the idea that human thought processes are well captured as an instantiation of an abstract formal reasoning framework (Marcus, 2001). The inclusion of graded strengths and mechanisms for adjusting these strengths can make it difficult to define empirical tests that distinguish production system models from connectionist models or models arising in other frameworks. It seems likely that further developments will continue to blur the boundaries between these approaches.
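A toy sketch can show how strength variables of this kind work. The mechanics below are invented for illustration (they are not ACT-R's actual utility-learning equations): matching productions compete through noisy utilities, and utilities are nudged by experienced reward, so selection comes to track the frequency and recency of success as well as rule content.

```python
import random

# Toy sketch of utility-weighted production selection (invented
# mechanics; not ACT-R's actual equations). Two productions match the
# same subtraction context; reward-driven strength adjustment lets the
# correct one ("borrow") come to dominate the buggy one.

class Production:
    def __init__(self, name, condition, utility=0.0):
        self.name, self.condition, self.utility = name, condition, utility

def select(productions, context, noise=0.1, rng=random.Random(0)):
    """Pick the matching production with the highest noisy utility."""
    matching = [p for p in productions if p.condition(context)]
    return max(matching, key=lambda p: p.utility + rng.gauss(0, noise))

def learn(p, reward, rate=0.2):
    p.utility += rate * (reward - p.utility)   # running average of payoff

rules = [
    Production("borrow", lambda c: c["top"] < c["bottom"]),
    Production("smaller-from-larger", lambda c: c["top"] < c["bottom"]),
]
ctx = {"top": 3, "bottom": 7}
for _ in range(50):                    # only the correct rule earns reward
    p = select(rules, ctx)
    learn(p, 1.0 if p.name == "borrow" else 0.0)
print(select(rules, ctx, noise=0.0).name)   # "borrow" now wins on utility
```

Because selection depends on graded, experience-tuned strengths rather than on rule content alone, behavior of this kind can look much like that of a connectionist network, which is one concrete sense in which the boundary between the frameworks blurs.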
4.5. Cognitive architectures and hybrid systems
The fact that each of the four approaches discussed above has its own relative strengths appears to underlie both hybrid systems (Sun, 2002) and contemporary integrated cognitive architecture-based approaches (Jilk, Lebiere, O’Reilly, & Anderson, 2008). The motivation for the cognitive architectures approach is to try to offer an instantiation of a complete human-like cognitive system. Newell (1994) explicitly advocated broad coverage at the possible expense of capturing data within each domain in all of its details, perhaps reflecting his desire to achieve a usable engineered system, in contrast to the goal of accurately characterizing the properties of human performance within a single domain.
The building of cognitive architectures seems especially natural if, as some argue, our cognitive systems really are composed of several distinct modular subsystems. In that case, to understand the functions of the system as a whole one needs not only to understand the parts but also to have an understanding of how the different parts work together.
It seems evident that there is some specialization of function in the brain, and that characterizing the different roles of the contributing parts, and the ways in which these parts work together, is worthy of consideration. Indeed, I am among those who have offered proposals along these lines (McClelland, McNaughton, & O’Reilly, 1995). I would argue, however, that some hybrid approaches take too much of an either–or approach, assigning some processes to one module or another, rather than actually considering how the components work together. I see this as an important area for future investigations. A small step in the right direction may be the ACT-R model of performance of the balance scale task (van Rijn, van Someren, & van der Maas, 2003), which uses declarative knowledge as a scaffold with which to construct production rules. I would urge consideration of models in which differentiated parts work together in a more fully integrated fashion. One example is Kwok’s (2003) model of the roles of hippocampus and cortex in episodic memory for meaningful materials. In this model, reconstructing such a memory is a mutual constraint satisfaction process involving simultaneous contributions from both the neocortex and the hippocampus. Sun (2008) also stresses the importance of synergy between components in his hybrid CLARION architecture.
To return to the main point of this section, it seems clear to me that each approach that we have considered has an appealing motivation. As each has its own domains of relative advantage, all are likely to continue to be used to capture aspects of human cognition.