Representation, inference, and transcendent encoding in neurocognitive networks of the human brain


  • Marsel Mesulam MD

    Corresponding author
    1. Cognitive Neurology and Alzheimer's Disease Center, Northwestern University Feinberg School of Medicine, Chicago, IL
    • Cognitive Neurology and Alzheimer's Disease Center, Northwestern University Feinberg School of Medicine, 320 East Superior Street, Chicago, IL 60611

  • Potential conflict of interest: Nothing to report.


The anatomical basis of conscious experience has traditionally been linked to sensory-fugal (inward) pathways that convey sensory information to progressively “higher” association cortices. Current thinking is emphasizing the importance of sensory-petal pathways that run in the opposite (outward) direction. According to emerging views, many aspects of cognition may represent an iterative neural dialogue between sensory-fugal connections, which reflect the physical nature of ambient events, and sensory-petal connections, which infer the nature of the stimulus based on empirical accounts of past experience. These reciprocal pathways, embedded within the internally generated oscillations of the brain, are further modulated by top-down projections from high-order association cortices, most prominently located in prefrontal cortex. This set of top-down projections has the capacity to transcend experience-based representations and to insert internally generated priorities into the interpretation of ongoing events. The characteristically human capacity for resisting stimulus-bound responses and favoring novel interpretations may be linked to the influence of these top-down projections. The reciprocal sensory-processing pathways and their top-down modulations collectively define the conscious interpretation of experience. Ann Neurol 2008;64:367–378

A central aim of behavioral neurology is to explore the ways in which sensory stimuli become incorporated into the matrix of consciousness. How does the brain transform sensory experiences into memories, auditory and visual word-forms into meaning, and spatial locations into coordinates for orientation? In addition to their intrinsic scientific interest, these questions also guide the evaluation of patients who display clusters of deficits collectively known as disconnection syndromes. Disconnection syndromes such as prosopagnosia, pure alexia, word deafness, optic anomia, and conduction aphasia have occupied pivotal positions in the intellectual history of behavioral neurology.1, 2 Although this review does not focus on any one of these individual syndromes, it summarizes the evolution of modern concepts related to sensory processing pathways in the human brain and some of the current reformulations that are likely to be shaping future thinking in this field.

The second half of the 20th century witnessed dramatic advances in the identification of axonal pathways that underlie the elaboration of sensory information.3, 4 Although the emerging information was anything but simple, it did tend to conceptualize the central nervous system as a collator of sensory inputs first into incrementally more complex percepts and then into multimodal concepts. The critical flow of information underlying this process was directed from the external world into progressively “higher” levels of internal representations. Consciousness appeared to be constructed from bottom up, one sensation at a time, through a process mirroring the progression from “simple” to “complex” and “hypercomplex” visual neurons delineated by Hubel and Wiesel.5 The frontal lobes were occasionally placed at the apex of this pyramid, as an obligatory portal into consciousness, reminiscent of the Cartesian theater where information from multiple sources was ushered into the pineal gland to be apprehended by an immaterial soul.6, 7

It did not take long to identify the limitations of this linear model and shift emphasis to the more parallel, probabilistic, and interactive nature of the process. The visual system was the first to illustrate the computational advantage of parallel over serial processing, allowing basic parameters such as color and motion to be segregated so that they could remain available for multiple combinatorial rearrangements before losing their identity through convergence.8 Parallel pathways were soon shown to play a central role in all major neurocognitive networks, allowing vast informational landscapes to be surveyed rapidly as the network settled into a state of least conflict determined by the goals and constraints of the task at hand.9 But even these developments tended to overemphasize a bottom-up incremental elaboration of sensory events into behavior, cognition, and consciousness.

These views are also changing. The emphasis is now shifting to the influence of the apex on the base of the pyramid. The brain is no longer viewed as a transformer of ambient sensations into cognition, but as a generator of predictions and inferences that interprets experience according to subjective biases and statistical accounts of past encounters. This top-down, inside-out account of brain activity is not new but is becoming increasingly important to recent thinking in cognitive neuroscience. Reality, according to this approach, appears to be built not one sensation at a time but as an emergent product of the iterative dialectic between externally generated inputs and internally generated hypotheses, expectations, and interpretations. Subjectivity is back on the agenda, as is the innate human capacity to transcend appearance in favor of significance.

Cortical Zones and Their Interconnectivity

The human cerebral neocortex contains approximately 30 billion neurons spread over 2,500 cm² of surface area.10, 11 Each neuron establishes at least 1,000 contacts with other neurons, putting the number of cortical synapses at 30 trillion or more, much greater than the total number of stars in the Milky Way. This immense synaptic space provides a matrix for the orderly transfer of information along serial and parallel pathways so that simple sensations and muscle contractions can generate subjective experiences and purposeful actions. The cerebral cortex can be divided into five principal cortical zones: primary sensory-motor, modality-selective (formerly known as unimodal), heteromodal, paralimbic, and limbic, each characterized by a distinctive cytoarchitectonic arrangement and connectivity pattern.12

Primary sensory areas provide obligatory portals for the entry of sensory events into cortical circuitry, whereas the primary motor areas provide a final common pathway for programming movements needed to manipulate the environment and shift locations within it. At the opposite pole of this continuum, limbic areas have functional affiliations most closely associated with the internal milieu rather than the extrapersonal space. These two poles are bridged by the modality-selective, heteromodal, and paralimbic zones, which collectively enable the needs of the internal milieu to be pursued or suppressed according to learned expectations and perceived context. Two of these zones, the modality selective and heteromodal, are mostly involved in perceptual elaboration and motor planning, whereas the paralimbic zone plays a more central role in channeling emotion and motivation to action and perception. Heteromodal, paralimbic, and limbic areas are also collectively known as transmodal cortices because their neuronal responses are not segregated according to the modality of sensory input.

Pathway tracing and single-unit recordings in monkeys indicate that this plan of organization is based on a core synaptic hierarchy. Such experiments have shown that modality-selective areas can be further divided into upstream components, which receive their major inputs from the corresponding primary area, and downstream components, which receive their major inputs from upstream modality-selective areas. At the next synaptic stage, heteromodal areas receive convergent projections from multiple modality-selective areas and send massive projections to paralimbic areas. In turn, paralimbic areas are the principal sources of cortical projections to limbic areas. Limbic areas (eg, amygdala, hippocampus) and paralimbic areas (insula, orbitofrontal, cingulate, parahippocampal, temporopolar) are the only parts of the cortex that have substantial connections with the hypothalamus, a region of the brain that functions as a pivotal ganglion for the homeostatic, autonomic, and endocrine aspects of the internal milieu.

Neuroanatomical experiments have shown that lateral projections, interconnecting upstream and downstream components of one modality with those of another, are much sparser than lateral projections among transmodal areas.13 Consequently, the “job description” of an area becomes increasingly heterogeneous and difficult to characterize with increasing synaptic distance from primary areas. For example, the insula and anterior cingulate gyrus show functional magnetic resonance imaging activations in a large number of tasks, whereas the functional activation of the fusiform gyrus, a visual modality-selective area, is limited to a handful of tasks, most of which involve face/object identification or lexical processing.

Functional Implications of Sensory Hierarchies and Group Encoding

The sensory-fugal “hierarchy” noted earlier can be described as a set of overlapping spheres of influence where individual zones (with the exception of primary areas) have fuzzy boundaries. Although the hierarchy may be soft, its role in the transformation of sensation into cognition permeates all major domains of mental activity. In the process of reading, for example, the shape of a word (defined by its font) is encoded in upstream visual association areas, its lexical identity (independent of its font) in downstream visual association cortex, and its meaning in heteromodal association areas.14 In the auditory modality, the identification of tones and pitch is mediated by primary cortex (A1), the representation of phonemic sequences (even of nonwords) by upstream association cortex of the superior temporal gyrus, and the encoding of meaningful words by downstream auditory association cortex of the middle temporal gyrus.15, 16 In general, the sensory-fugal hierarchies within the language and face/object recognition networks lead from the encoding of perceptual features (first two synaptic levels in Fig 1) to the generic and specific identification of the entity (levels 3 and 4 in Fig 1), then to the conceptual and experiential (ie, subjective) recognition of the specific item (levels 5 and 6 in Fig 1).

Figure 1.

Each concentric ring represents a different synaptic “level.” Any two consecutive levels are separated by at least one unit of synaptic distance. Level 1 is occupied by the primary sensory cortex, levels 2 to 4 by modality-selective cortices, and levels 5 to 6 by transmodal cortex. Within modality-selective areas, level 2 is most upstream (close to the source of input) and level 4 is most downstream. Small open circles represent macroscopic cortical areas, or “nodes,” one to several centimeters in diameter. Nodes at the same synaptic level are reciprocally interconnected by the black arcs of the concentric rings. Colored lines represent monosynaptic connections from one synaptic level to another. Visual pathways are shown in green, auditory pathways in blue, and transmodal pathways in red. The anatomical identity of many of the nodes is not specified. This review is guided by the hypothesis that these types of anatomical interconnections and functionally specialized nodes, experimentally established in the monkey brain, also exist in the human brain, even though, in many instances, their exact location has not yet been determined. The dashed lines interconnecting visual and auditory pathways in the first four synaptic levels indicate the scarcity of monosynaptic connections between sensory hierarchies belonging to different modalities. This diagram more closely reflects the hierarchical arrangement of sensory-fugal (forward) pathways than the more complex and less hierarchical sensory-petal (backward) pathways. 
A1 = primary auditory cortex; f = area specialized for face encoding; L = hippocampal-entorhinal or amygdaloid components of the limbic system; P = heteromodal posterior parietal cortex; Pf = lateral prefrontal cortex; s = area specialized for encoding spatial location in the auditory (blue) and visual (green) modalities; T = heteromodal lateral temporal cortex; v = area specialized for identifying individual voice patterns; V1 = primary visual cortex; V2, V3, V4, V5 = additional visual areas; W = Wernicke's area; wr = area specialized for encoding word-forms in the auditory (blue) and visual (green) modalities.

One of the most important principles of sensory representation is that of group encoding. In monkeys, single-unit recordings have shown that representations of faces are established through group encoding within inferotemporal visual association areas that occupy the fourth synaptic level of sensory-fugal processing (“f” in Fig 1). The tuning is broad and coarse: One neuron may be activated by several faces, and the same face may excite several neurons.17 Face-responsive neurons are selectively tuned to category-specific canonical features such as intereye distance, distribution of hair, facial expression, and direction of gaze.18 Neurons tuned to similar object features form vertical columns that measure approximately 0.4 mm in diameter, and several adjacent columns responsive to similar effective features may be linked to form larger “patches” or modules that measure several millimeters.19, 20 Interpatch connections can be strengthened by experience so that the identification of one canonical feature makes the set more responsive to (or predictive of) the others.21 This arrangement enables a rapid transition from the detection of a single canonical feature to the identification of the entire categorical pattern as a “face.”

Total columnar activation may encode generic properties, whereas the activity pattern of individual cells within the column may help to encode distinguishing and specific (subordinate, proprietary) features of objects.22 In response to a face, for example, a small subset of neurons in a given column can fire maximally and set constraints to guide and restrict the interpretation of subordinate detail encoded by less active (or higher threshold) neurons within the same ensemble.23 Identification can thus start by detecting the coarse or generic features needed to establish the presence of a face, and can then focus on finer, subordinate detail needed to characterize sex, age, expression, identity, and so on. The further linkage of this visual information to transmodal areas of the temporal lobe (at levels 5 and 6 in Fig 1) enables the recognition of the stimulus by the observer not only as a face but also as a specific face associated with a unique set of subjective experiences. This last step for linking perceptual to experiential associations becomes interrupted in patients with the syndrome of prosopagnosia.

Within the context of this system, the identity of an individual face is not encoded by fixed rates of firing of specific neurons but by the relative firing frequencies and interneuronal correlations across the entire ensemble.24 The processing parsimony offered by this organization is substantial.25 A large number of faces can be encoded by a small number of neurons, recognition can be graded and triggered by partial information, the same information can be probed through multiple associations, generalizations based on few common features (or analysis based on differences) can be achieved rapidly, the progression from categorical to subordinate identification can proceed smoothly, and damage or refractory states in a subset of neurons within the ensemble can lead to a graceful, partial degradation of function. These general principles guiding the visual identification of faces are also likely to apply to the encoding of other complex entities such as tools, animals, and words.
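The logic of group encoding can be illustrated with a toy sketch. It assumes, purely for illustration, that each face corresponds to a pattern of firing rates across a small ensemble and that identity is read out from the relative pattern (here, cosine similarity) rather than from the absolute rate of any one neuron. The template values, the ensemble size, and the names `TEMPLATES`, `cosine`, and `decode` are hypothetical and are not taken from the studies cited above.

```python
import math

# Hypothetical firing-rate patterns (spikes/s) across an 8-neuron ensemble.
# The numbers are illustrative only, not recorded data.
TEMPLATES = {
    "face_A": [30, 5, 22, 8, 40, 3, 15, 12],
    "face_B": [6, 28, 9, 35, 4, 20, 11, 25],
    "face_C": [18, 17, 30, 2, 10, 33, 5, 21],
}


def cosine(u, v):
    """Cosine similarity: compares relative patterns, ignoring overall rate."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def decode(response, silenced=()):
    """Return (best matching face, all scores), skipping silenced neurons."""
    keep = [i for i in range(len(response)) if i not in silenced]
    scores = {
        name: cosine([response[i] for i in keep], [rates[i] for i in keep])
        for name, rates in TEMPLATES.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# A noisy response resembling face_A, decoded with and without a "lesion"
# that removes the two most active neurons for that face.
noisy_a = [28, 7, 20, 10, 37, 5, 17, 10]
print(decode(noisy_a)[0])
print(decode(noisy_a, silenced={0, 4})[0])
```

In this sketch, doubling every firing rate leaves the decoded identity unchanged, and silencing a subset of neurons lowers the match score without abolishing recognition, which is one simple way to picture the graceful degradation described above.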

Multimodal Integration through Binding versus Convergence

Everyday experiences unfold in multiple modalities. The establishment of a durable record of experience and the associative incorporation of this record into the matrix of consciousness necessitate multimodal integration. The desirability of such integration had been articulated and its presence postulated on multiple occasions by philosophers and neurologists ranging from Descartes to Wernicke. The first demonstration of an anatomical substrate that could be mediating such integration was achieved in the 1960s and 1970s through a series of experiments based on the tracing of Wallerian degeneration and axonal transport in monkeys.3, 4, 26 These experiments identified an orderly chain of monosynaptic connections from primary sensory to first, second, and third order sensory association areas, which, in turn, sent convergent projections to higher-order multimodal association areas. Although these pathways generated considerable enthusiasm as the long-awaited fulfillment of predictions made by the associationists of the 19th century, the limitations of such a serial and convergent organization of knowledge also started to be recognized.9, 27, 28 Two of these objections are most relevant to this review. First, if a convergent representation of entity “X” were to be encoded in a unique location of cortex, the brain would have to resolve the cumbersome problem of conveying X-related information in all relevant modalities to this highly specific address. Second, within the context of such an organization, the modality-specific attributes of entity X would succumb to cross-modal contamination during the process of convergence, and the sensory fidelity of the experience would be lost within a short time.

Both objections can be addressed by assuming that the role of heteromodal, paralimbic, and limbic areas (ie, transmodal cortices) is not only to enable convergent multimodal synthesis but also to bind distributed modality-specific fragments into coherent experiences, memories, and thoughts.9, 13, 29 Transmodal areas (levels 5 and 6 in Fig 1) can thus enable the binding of modality-specific information at lower synaptic levels into multimodal representations that have distributed and convergent components. Examples include the pivotal role of midtemporal and temporopolar cortices for face and object recognition (“T” in Fig 1), Wernicke's area in the left temporoparietal junction for lexical associations (“W” in Fig 1), hippocampal-entorhinal components of the limbic system for episodic memory (“L” in Fig 1), and posterior parietal cortex for spatial orientation (“P” in Fig 1).13 Transmodal areas provide critical gateways for accessing the relevant distributed information but also become “neural bottlenecks” in the sense that they constitute regions of maximum vulnerability for lesion-induced deficits in the pertinent cognitive domain, causing syndromes such as object agnosia, aphasia, amnesia, hypoemotionality, and hemispatial neglect.

Backward (Inside-Out) Pathways: Are There Any Unimodal Areas in the Brain?

The synaptic hierarchy underlying multimodal convergence is based on connections that originate in primary sensory areas and that run in the sensory-fugal direction. The process of binding, in contrast, requires axonal projections that run in the opposite (sensory-petal) direction, from transmodal to modality-selective areas. In a pivotal review of the literature focusing on the importance of such reciprocal pathways, Karl Friston pointed out that forward connections have sparse axonal bifurcations, that they originate in supragranular layers, and that they terminate largely in the inner granular layer (layer 4), the same layer that receives inputs from modality-specific thalamic nuclei.30 Backward connections, in contrast, have abundant axonal bifurcations, a more sprawling territory of distribution, prominent infragranular origins, and terminations that favor agranular layers. Backward connections are also less hierarchical and tend to jump synaptic levels. In the sensory-fugal direction, for example, primary visual cortex (V1) obeys the synaptic hierarchy by sending no direct connections of any significance to downstream visual association cortex (areas TEO and TE) or to the amygdala, whereas both the amygdala and downstream visual association areas sidestep the hierarchy by projecting directly to V1.31, 32

Forward connections elicit obligatory responses and, therefore, have a transforming influence on their downstream targets. As messengers of extrapersonal reality, they cannot be ignored. Backward connections, in contrast, tend to be modulatory. They influence the responses elicited by forward projections according to learned biases and contextual singularities. Friston's review points out that forward connections are mediated through fast α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) and GABA receptors with decay times of a few milliseconds, whereas backward connections may be more closely associated with slower N-methyl-D-aspartate receptors (approximately 50-millisecond decay). The N-methyl-D-aspartate receptors are also particularly important for the long-term modifications of synaptic strength according to the Hebbian rule of transsynaptic coactivation, where a change in synaptic strength ($\Delta w_{ij}$) between neurons $i$ and $j$ can be expressed as a product of their correlated/simultaneous firing rates ($r_i$ and $r_j$): $\Delta w_{ij} = k\,r_i r_j$, where $k$ is a learning rate constant.33, 34 Through this property, backward (sensory-petal) connections can sculpt an experience-based template for interpreting incoming stimuli according to past encounters and global context.
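The Hebbian rule stated above can be written out as a short sketch. It applies the product rule to every pair of distinct neurons in a small population; the function name `hebbian_update`, the list-of-lists weight matrix, and the example values are assumptions made for illustration only.

```python
def hebbian_update(weights, rates, k=0.01):
    """Return a new weight matrix after one Hebbian step.

    Each weight between distinct neurons i and j grows by k * r_i * r_j,
    the product of their simultaneous firing rates. Self-connections
    (i == j) are left unchanged; that choice is an assumption of this sketch.
    """
    n = len(rates)
    return [
        [
            weights[i][j] + (k * rates[i] * rates[j] if i != j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Neurons 0 and 1 fire together; neuron 2 is silent.
w0 = [[0.0] * 3 for _ in range(3)]
w1 = hebbian_update(w0, rates=[1.0, 2.0, 0.0], k=0.5)
print(w1)
```

Only the connection between the two coactive neurons is strengthened. In this bare form the rule lets weights grow without bound under repeated coactivation, so practical models usually add some normalization or decay term, which is omitted here.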

The characterization of backward connections helps to address some of the criticism that has been directed at the hierarchical account of cortical connectivity, especially its core assumption that the first few synaptic stages of sensory-fugal processing take place in areas dedicated to a single sensory modality. One review addressing this issue questioned the presence of any modality-specific areas and concluded that “presumptive unisensory areas are in fact multisensory in nature.”35 To support this conclusion, the authors cite somewhat unexpected findings in monkeys showing that primary and upstream association areas in the visual modality receive direct monosynaptic input from auditory and somatosensory association cortices.36, 37 Further evidence against the existence of truly unimodal areas comes from experiments in which event-related somatosensory potentials have been obtained in auditory association areas and in which audiovisual voice-face integration has been detected in the auditory association cortex of monkeys and humans.35, 38

There are several reasons why these results should not be overinterpreted. First, cross-modal inputs into areas thought to be unimodal are quantitatively quite sparse. Furthermore, they tend to have anatomical characteristics of “backward” or modulatory projections.36, 37 From the vantage point of neuronal physiology, an area could therefore display unimodal features in the forward or dominant mode of neurotransmission but not necessarily in the physiologically more latent backward or modulatory mode. In fact, damage to areas traditionally considered to be modality specific does not lead to observable clinical deficits in tasks guided by other sensory modalities. Nonetheless, the new evidence showing the cross-modal connections noted earlier has led me to abandon the “unimodal” designation that I have used in the parcellation of the cerebral cortex12 and to replace it with the “modality-selective” designation, acknowledging that strict dedication to a single modality may not exist when the totality of connections is considered.

Internal Rhythms of the Brain: A Platform for Change with Stability

The earlier account may give the misleading impression that the interplay between forward and backward connections unfolds on a resting baseline. However, anyone who has looked at an electroencephalogram knows that the living brain is perpetually active, and that it can spontaneously generate rhythmic oscillations at frequencies ranging from delta to high gamma. In the evocative terminology of György Buzsáki, external stimuli “calibrate” this internally generated activity into what we identify as conscious experience.39 These oscillations provide the equivalent of a carrier wave modulated by signals transmitted along sensory-processing pathways. They could also offer neurophysiological advantages analogous to the mechanical advantages offered by a gyroscope, namely, dynamics that dampen the effects of sudden change, that allow rapid but transient responses to external forces, and that promote flexible returns to one of several semistable states.

The internally generated oscillations of the brain reflect hardwired properties of thalamocortical circuitry, architecture of inhibitory synapses, and biophysics of neuronal membranes. The dynamics of the self-generated rhythms and their response to stimulation involve neural loops where relationships among constituents are nonlinear, dependent on history, and where predictability is limited according to the chaos theory of complex systems.39 The architecture and distribution of these oscillations provide the physiological background for the interactions shown in Figure 1. There are, therefore, two internally generated influences that can modulate the response to external sensory events: the inside-out (backward) sensory-petal pathways and the state-dependent oscillations. As I show later in this review, a third distinctive type of internal control emanates from circuits centered in prefrontal cortex.

Generation of Inference through Predictive Encoding in Sensory Hierarchies

The reciprocal forward-backward connections, the dynamics of oscillatory activity, the availability of parallel pathways, and the prominence of group encoding collectively enhance computational capacity and flexibility. Through these mechanisms, complex brains have evolved the abilities for the proactive seeking and dynamic interpretation, rather than mere registration, of sensory experience. These processes, guided by past experiences and current context, can be formulated quantitatively by Bayesian statistical theory.40 In its simplest form, Bayes' theorem can be stated as $P(A \mid B) = P(B \mid A)\,P(A)/P(B)$, where $P(A)$ is the “prior” probability of event A irrespective of its relationship to event or context B, $P(A \mid B)$ is the conditional or “posterior” probability of A dependent on a specific value of B, $P(B \mid A)$ is the conditional probability (likelihood) of B given A, and $P(B)$ is the prior probability of B.
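A small worked example may make the theorem concrete. Suppose, hypothetically, that A is "the stimulus is a face" and B is "an eye-like feature has been detected." The probabilities below are invented for illustration, and the function name `posterior` is an assumption of this sketch; the probability of B is obtained by summing over the two cases in which A is true or false.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Probability of A given B, computed with Bayes' theorem.

    P(B) is expanded over the two alternatives:
    P(B) = P(B given A) * P(A) + P(B given not A) * (1 - P(A)).
    """
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    return p_b_given_a * prior_a / p_b


# Hypothetical numbers: faces make up 20% of stimuli, an eye-like feature
# is detected in 90% of faces and in 10% of non-face stimuli.
print(posterior(prior_a=0.2, p_b_given_a=0.9, p_b_given_not_a=0.1))

# The same evidence in a context where faces are more common.
print(posterior(prior_a=0.6, p_b_given_a=0.9, p_b_given_not_a=0.1))
```

The same sensory evidence yields a higher posterior probability when the prior is higher, which is the sense in which past experience and context bias the interpretation of an identical input.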

A model that Friston30 proposed explains how the interaction between the forward and backward axonal projections described earlier can enable the brain to approach its environment in a Bayesian fashion. The dynamics of the cerebral cortex, according to this view, are poised to minimize the uncertainty triggered by a sensory stimulus. Association areas function to dampen this uncertainty by inferring the causes of their sensory inputs according to Bayes' law and based on patterns established through Hebbian principles of coactivation in the course of past exposures to similar events. The process starts with sensory input emanating from the extrapersonal event. The information is first conveyed along sensory-fugal (forward) projections to the next higher level of the hierarchy shown in Figure 2, where it triggers a set of probabilistic predictions as to its cause. These predictions are then conveyed through backward (inside-out) connections to the lower synaptic level so that the interpretation of the extrapersonal event can be guided by contextual and experiential inferences emanating from the higher (more abstract) level of representation.41 This guidance (or bias) conveyed by sensory-petal projections from higher to lower levels of the hierarchy in Figure 2 is likely to be probabilistically layered, the front runners representing the most typical, likely, generic, and frequent of previously experienced causes. According to Friston,30 when prediction is incomplete or incompatible with the representation in a lower area, a prediction error signal is generated through forward connections and prompts the synaptically higher area to generate different or more specific hypotheses, triggering iterative interactions until reconciliation is achieved by settling into a state of least conflict or best fit.30
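The iterative exchange described above can be caricatured with two levels and a single number. In the sketch below, a higher level holds a hypothesis about the cause, sends a prediction backward, and revises the hypothesis in proportion to the error returned by the lower level, until the mismatch falls below a tolerance. This is a deliberately minimal gradient-style loop, not Friston's model; the linear mapping, the parameters `gain`, `rate`, and `tol`, and the function name `settle` are all assumptions made for illustration.

```python
def settle(sensory_input, prior_cause, gain=2.0, rate=0.1,
           tol=1e-6, max_iter=1000):
    """Iterate prediction and error exchange until the two levels agree.

    Returns the final estimate of the cause and the absolute prediction
    error recorded at each iteration.
    """
    cause = prior_cause                       # higher-level hypothesis
    errors = []
    for _ in range(max_iter):
        prediction = gain * cause             # backward: predicted input
        error = sensory_input - prediction    # lower level: mismatch
        errors.append(abs(error))
        if abs(error) < tol:                  # reconciliation reached
            break
        cause += rate * gain * error          # forward: error revises hypothesis
    return cause, errors


# The same input, approached from a poor prior and from a good prior.
cause_poor, errors_poor = settle(sensory_input=4.0, prior_cause=0.0)
cause_good, errors_good = settle(sensory_input=4.0, prior_cause=1.9)
print(len(errors_poor), len(errors_good))
```

With these settings the error shrinks by a constant factor on every pass, so both runs settle on the same interpretation, but the run that starts from an accurate expectation needs fewer iterations to get there.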

Figure 2.

Diagrammatic representation of sensory-processing pathways showing the iterative and sequential interactions that lead to the interpretation of sensory events. Sensory information from the extrapersonal world is conveyed to sequentially higher synaptic levels (red arrows). At each higher level beyond the primary sensory area, an inference concerning the nature of the incoming information is generated. This inference, conveyed by backward projections to the lower synaptic level, is based on the record of past experience encoded according to Hebbian principles of coincident activation and Bayes' theorem of empirical predictions. If the inference does not match the input, the lower level generates an error message that keeps the iteration open until a reconciliation is achieved. In this hierarchy, the lower synaptic levels encode surface sensory properties of stimuli, whereas successively higher levels of modality-selective cortex mediate the perceptual, generic, and specific identification of the stimulus. Through the mediation of transmodal areas, sensory representations of objects, faces, or words can elicit multimodal and experiential association. Top-down pathways arising from prefrontal cortex insert internally guided modulations that can transcend surface appearance and the empirical record of past experiences. This sort of top-down modulation gains increasingly greater strength in downstream modality-selective and transmodal areas. For the sake of simplicity, only vertical projections among synaptic levels are shown. There are also numerous lateral connections within a given synaptic level. The entire set provides an architecture where a serial hierarchical core is embedded within a network that also allows parallel processing. 
There is physiological evidence showing that each node is continually passing information to the others rather than fulfilling its part of the processing and then transmitting a completed product to the next station.64 Although there is considerable physiological support for the existence of Hebbian synapses, the existence of Bayesian circuits currently rests on speculation.

Accordingly, many aspects of cognition represent a reciprocal neural dialogue between sensory-fugal (inward or forward) connections, which reflect the physical nature of external events, and sensory-petal (outward or backward) connections, which insert individual biases and expectations into the interpretation of these events. Each level of the hierarchy in Figure 2 imposes constraints on the representation conveyed forward by the earlier synaptic level. The constraints generated by higher synaptic levels during this process can have varying degrees of complexity that reflect the range of past experiences, current expectations, and the evaluation of the prevailing context. A further source of complexity is contributed by the self-generated oscillations that provide the platform on which the forward and backward projections interact. Interactive sensory-petal and sensory-fugal processes in the neocortex can thus enable sensation to shape cognition at the same time that cognition constrains perception.

According to this scenario, predictable (and therefore accurately inferred) events should engage fewer neuronal resources than unexpected events because the latter would take longer to reach the type of “reconciliation” described earlier. In fact, the P300 potential, indicative of neuronal activation, is generated in response to a deviation from routine (ie, when expectation is violated), and the N400 occurs when a noun is followed by contextually incongruous words or by the picture of an object belonging to a different semantic category. The converse of this phenomenon is known as “repetition suppression.” This term is used to designate the attenuation of neural responses to repeated stimuli, presumably because recognition of the first stimulus decreases the prediction error generated on exposure to the second. Repetition suppression is sensitive not only to sensorial properties but also to complex expectations determined by context. This was demonstrated in an experiment where subjects were shown pairs of faces, with each member of the pair presented in rapid succession. In one condition, the two members of a pair were most frequently identical; in the other, they were most frequently different. The results showed that suppression of responses to the second member of a pair of identical faces was much less pronounced during the condition in which nonidentical pairs dominated.42 It was as if the second stimulus, despite identical sensorial properties, had lost its “sameness” in a context where the subject expected the second member of the pair to be different.
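The context dependence of repetition suppression can be illustrated with a small sketch in which the response to the second face is assumed to scale with its surprise, that is, the negative logarithm of its expected probability. The probabilities below are hypothetical; the experiment is described here only as having conditions in which identical or nonidentical pairs were "most frequent." The surprise rule itself is an assumption of this sketch rather than a result of that study.

```python
import math

# Hypothetical probability that the second face repeats the first, per condition.
P_REPEAT = {
    "repeats_dominate": 0.75,
    "alternations_dominate": 0.25,
}


def surprise(probability):
    """Prediction error expressed as negative log probability (an assumed rule)."""
    return -math.log(probability)


def response_to_repeat(context):
    """Modelled response to a second face that is identical to the first."""
    return surprise(P_REPEAT[context])


# The identical second face evokes a smaller modelled response (more suppression)
# when repetition is expected than when alternation is expected.
strongly_suppressed = response_to_repeat("repeats_dominate")
weakly_suppressed = response_to_repeat("alternations_dominate")
```

The stimulus is physically the same in both calls; only the expectation attached to the context changes the size of the modelled response.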

As mentioned earlier, the activation of auditory cortex during silent lipreading is often cited to support the contention that cortical areas traditionally considered to be unimodal can support multimodal integration.38 The process outlined earlier offers an alternative explanation. It suggests that the visual stimulus of moving lips is likely to evoke the experientially established expectation of phonation, triggering predictive signals directed to auditory cortex. In this case, the interactions would be cross-modal and initiated by visual forward projections that first reach transmodal cortex and then trigger backward predictive signals directed to the auditory system. Such modulatory signals might normally help to constrain the auditory interpretation of the words according to visual patterns of lip movements.

The earlier account is most relevant to the predictive encoding of discrete entities such as faces, objects, and words. An equally important goal of sensory processing is to infer the likely outcome of observed biological motion. Interpreting the purpose of movements made by conspecifics, for example, becomes important for predicting likely consequences, regulating social interactions, and observational learning. A premotor “mirror neuron” network displays the startling property of inferring the intended goal of transitive actions performed by others.43 Its organization may be analogous to the organization of sensory-processing hierarchies described earlier.44

From Confirmation to Innovation: A Phylogenetic Perspective

These considerations support the emerging view of the neocortex as a computational device that can use partial information to infer the most likely nature of ambient events, based on the record of past experience. One advantage of this arrangement is to diminish noise, according to the principle, enunciated by Pasteur, that chance favors only the prepared mind. One potential disadvantage is that it may lead to a state of self-fulfilling prophecy where neural processing becomes driven by the intrinsic goal to confirm the past and reinforce the ordinary. This “disadvantage” protects the safety of simple brains that cannot afford to be innovative, but it becomes a stumbling block for more complex brains that have the computational power to transcend the status quo in search of novel and risky but potentially more advantageous solutions to common challenges. How do complex brains capitalize on the advantages of predictive encoding without suffering its stifling impact on prospects for innovation?

The biological reality of a drive for innovation becomes most plausible when approached from a phylogenetic vantage point. Insects, amphibians, reptiles, and even birds are not particularly fond of unfamiliar stimuli; simple mammals are attracted to new objects and thrive in enriched environments; nonhuman primates work hard just for the pleasure of peeking out a window; and humans become so addicted to variety that they experience hallucinations when subjected to sensory deprivation.13, 45–48 It is also worth keeping in mind that even the most advanced primates such as the bonobo and gorilla have not displayed much of an inclination for innovation. For millions of years, they have stuck to the same dietary preferences and used the same means of communication, except in those cases where experimenters have spent fortunes to teach them a few symbolic gestures. In stark contrast, the human brain has generated a bewildering diversity of approaches to common challenges, even in identical habitats, as exemplified by the thousands of different languages designed to express the same messages and the thousands of different cuisines to satisfy the same hunger. In comparison with the eons of sameness experienced by all other species, the tools and mores of even a few hundred years ago would be hopelessly outdated in human societies. The species-specific history of the human race revolves around a powerful drive to abandon the safety of established patterns in search of new approaches, sometimes for the sheer pleasure of satisfying curiosity. What aspect of brain activity underlies this novelty-seeking drive? How does it interact with the sensory hierarchies described earlier?

Transcendent Encoding and Contingent Routing

A prerequisite for innovation is to transcend the surface properties of an event and the prepotent response tendencies it elicits in order to explore alternative actions and interpretations: What else could an event signify? What would it imply in a different context? How might it evolve in the future? How would it appear from the perspective of another observer? How would it react to a different response? In this account, I will use the term transcendent encoding for representations of events that are not confined to the sensory data associated with the experience, and contingent routing for the ability to link one stimulus to multiple responses. Observations in the neurology clinic and functional imaging laboratory suggest that networks centered in prefrontal cortex play a special role in these aspects of neural activity. Patients with frontal lobe damage become indifferent to boredom, cannot distinguish appearance from significance, are unable to entertain nonegocentric points of view, and become prone to stimulus-bound perseverative responses.45, 49

Functional mapping experiments have started to show potential physiological bases for the prefrontal contribution to these aspects of neural activity. For example, novel stimuli strongly activate prefrontal areas, the magnitude of prefrontal activation predicts the amount of time that the stimulus will attract attention, and prefrontal neurons are more strongly engaged by circumstances where the outcome is ambiguous rather than certain.49, 50 In one functional magnetic resonance imaging experiment, subjects were shown pictures of faces, houses, or cars. There were two types of sessions. In one session, subjects were instructed to determine whether the stimulus was a face, and in another, whether it was a house. A portion of the inferior temporal lobe known as the fusiform face area (FFA) responded more to faces than houses regardless of instruction type, whereas the medial prefrontal cortex (MFC) was more sensitive to the nature of the instruction and showed greater responsivity during the face detection sessions, regardless of the nature of the stimulus that was being viewed. Furthermore, the differential response of the FFA to faces versus houses increased during sessions when the subjects had been instructed to look for faces. A dynamic causal model of activations showed that forward connections from lower visual areas to the FFA were strengthened when responding to faces compared with the other stimuli regardless of instruction type. In contrast, backward connections from the MFC to the FFA (but not the reciprocal forward connections from the FFA to MFC) were strengthened during the sessions when the subject was instructed to attend to faces regardless of stimulus type.51 One conclusion arising from this experiment is that MFC adds an instruction-based, cognitively generated expectation on top of the experientially generated (Bayesian) predictive encoding to further facilitate the identification of events fitting the instruction.
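One simple way to picture an instruction-based expectation layered on top of an experience-based prior is as an extra multiplicative gain applied before the posterior is normalized. This is a sketch under stated assumptions, not the dynamic causal model used in the experiment: the likelihoods, the flat prior, the gain value, and the function name `posterior` are all hypothetical.

```python
def posterior(likelihood, prior, instruction_gain=None):
    """Combine sensory evidence, an experiential prior, and an optional
    instruction-driven gain (all values are illustrative assumptions)."""
    gain = instruction_gain or {}
    unnormalised = {
        category: likelihood[category] * prior[category] * gain.get(category, 1.0)
        for category in prior
    }
    total = sum(unnormalised.values())
    return {category: value / total for category, value in unnormalised.items()}


# An ambiguous stimulus that is equally compatible with a face or a house.
likelihood = {"face": 0.4, "house": 0.4, "car": 0.2}
prior = {"face": 1 / 3, "house": 1 / 3, "car": 1 / 3}

without_instruction = posterior(likelihood, prior)
with_instruction = posterior(likelihood, prior, instruction_gain={"face": 2.0})
```

The gain acts on the interpretation without changing the sensory evidence itself, which loosely parallels an influence carried by backward rather than forward connections.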

The directional asymmetry of the instruction-dependent change in effective connectivity reported in this experiment is particularly noteworthy. It suggests that the top-down projection from MFC may not be subject to modification by reciprocal error messages from FFA, and that it, therefore, has the potential to override the Bayesian constraints on the interpretation of the sensory data. This feature is a key prerequisite for transcendent encoding and contingent routing. Such a top-down projection can enable recognition on the basis of cognitively inferred rather than sensorially experienced input and increase the influence of context on response selection. One corollary outcome, the differential encoding of appearance versus significance, paves the way for the realization that “all that glitters is not gold.” In fact, single-unit recordings in monkeys show that orbitofrontal neurons can respond differentially to two presentations of the identical stimulus in ways that reflect changes in its motivational valence.52 Less advanced brains lack this feature and react to similar stimuli in identical fashion time after time, even under drastically different contexts. In the human brain, however, the same stimulus can elicit a large number of alternative responses, each alternative reflecting contextual peculiarities and individual idiosyncrasies. The psychological manifestation of this one-to-many mapping is experienced as choice, decision making, and free will.

Even in the absence of external stimuli, the top-down influence of prefrontal circuits on sensory hierarchies could potentially activate virtual representations of words, faces, and other entities in a manner that can promote mental imagery and thought, two of the most advanced manifestations of transcendent encoding.53 These comments should not be interpreted to imply that prefrontal cortex is the “seat” of behavioral flexibility, imagery, or thought. These aspects of human neural processing are ingrained at every level of the neuraxis, especially within transmodal association areas. The one unique property of prefrontal cortex is to have a much greater proportion of its synaptic space devoted to these functions.

Transcending the Constraints of Real Time: Working Memory

The iterative process that leads to the recognition of a specific stimulus along sensory-processing hierarchies can reach closure within hundreds of milliseconds. This rapidity is important because significant events in everyday life can succeed each other at great speed, and several events can arise simultaneously at different locations. However, because human attention is a resource of limited capacity,54 not all events can be apprehended simultaneously, and the attentional focus (the sphere of optimal processing) needs to be shifted from one event to the other. In the course of such serial sampling of the environment, there is rarely enough time to transfer information emanating from each significant event into off-line storage and retrieve it in time for seamless incorporation into the flow of awareness. Consciousness, therefore, risks resembling a string of individual beads stranded on the flow of time but lacking the ability to become integrated into a continuous whole. This limitation is overcome by the agency of working memory.

Working memory refers to the capacity for the temporary, on-line holding and mental manipulation of information for durations that fall between those of iconic memory and those of long-term, off-line storage. Working memory stretches the temporal impact of transient events. It enriches the texture of consciousness by transforming information access from a sequential and disjunctive process, where only one item can be heeded at any given instant, to a conjunctive pattern where multiple items become concurrently available. Working memory determines the number of parallel channels of information that can be handled on-line in a way that keeps them accessible, interactive, and transiently protected from interference. The span of working memory could be likened to the number of balls that a juggler can simultaneously hold in the air. In daily life, it supports the capacity for “multitasking.”

Numerous observations show the critical role of prefrontal cortex in working memory. Some of the key experiments in this field were done in monkeys performing delayed-response tasks. In such tasks, the animal is briefly shown a sample stimulus that then disappears from view for a delay that can last up to 20 seconds, after which a second stimulus appears and the animal must make a response based on whether the second stimulus matches the first. Responses are not allowed during the delay, and the task cannot be performed correctly by off-line storage alone because the same stimulus can be a correct match in one trial but not in another. The crucial component is the delay period during which the animal has to maintain a mental, on-line representation of a stimulus that is behaviorally relevant but no longer part of the ambient reality being encoded by the sensory-processing relays. Lateral prefrontal neurons emit sustained responses during delay periods as if prolonging the impact of the stimulus or anticipating its reappearance.55 These neurons also mediate the on-line integration of information belonging to different modalities. For example, in an experiment where delayed responses required the animal to retain the identity and location of an object, prefrontal neurons that retained object information in the initial delay switched to holding spatial information in the second.56 Prefrontal lesions can disrupt delay activity in sensory areas corresponding to the modality of stimulus presentation, indicating that prefrontal cortex exerts a top-down influence on working-memory activity within sensory-processing hierarchies.57 In humans, prefrontal lesions impair working memory, whereas working-memory tasks, including the ability to hold one goal in mind while pursuing another, lead to selective neural activation in prefrontal cortex.58, 59
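The logic of the delayed matching task can be made concrete with a toy comparison. The sketch assumes two hypothetical stimuli, "A" and "B," and four trials; it is not a simulation of the recordings cited above. A policy that holds the sample on-line across the delay is compared with every possible fixed association between the second stimulus and a response, which stands in for off-line storage alone.

```python
from itertools import product

# Each trial is (sample, probe); the stimuli "A" and "B" are hypothetical.
TRIALS = [("A", "A"), ("B", "A"), ("B", "B"), ("A", "B")]


def correct_answer(sample, probe):
    return "match" if sample == probe else "nonmatch"


def with_online_hold(sample, probe):
    """Policy that keeps the sample available across the delay."""
    held = sample  # stands in for sustained delay-period activity
    return "match" if probe == held else "nonmatch"


def accuracy(policy):
    hits = sum(policy(s, p) == correct_answer(s, p) for s, p in TRIALS)
    return hits / len(TRIALS)


def best_fixed_association():
    """Best accuracy of any fixed probe-to-response table (no on-line hold)."""
    best = 0.0
    for resp_a, resp_b in product(["match", "nonmatch"], repeat=2):
        table = {"A": resp_a, "B": resp_b}
        best = max(best, accuracy(lambda s, p: table[p]))
    return best


online_accuracy = accuracy(with_online_hold)
fixed_accuracy = best_fixed_association()
```

Because each probe is a match on one trial and a nonmatch on another, no fixed table can exceed chance on this trial set, whereas the policy with an on-line hold answers every trial correctly.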

The Combined Prefrontal Influence on Experience

Transcendent encoding and working memory collectively allow the contents of awareness to be chosen deliberately rather than set reflexively by the most conspicuous surface property of ambient events. They act as counterweights to premature closure, stimulus-bound behaviors, and the magnetism of prepotent responses. They can serve as circuit breakers to override Bayesian empirical inferences or, alternatively, to extend the iterative reconciliation process until uncommon interpretations are given a chance. One common denominator, if one can be found, is the insertion of a neural buffer between stimulus and response in a way that delays closure so that novel associations and alternative responses can be considered. The outcome is a mental relativism where each event can evoke multiple scripts and scenarios that can then compete for access to thought and behavior. These processes reflect a capacity for contingent rather than obligatory routing of responses, and promote the inferential, relativistic, and interpretive aspects of mental function. This high-order mapping of experience requires a conditional suppression of prepotent responses, perhaps through inhibitory GABAergic pathways. In fact, efferent prefrontal connections can directly synapse onto GABAergic interneurons in a way that could mediate putative gatekeeping and shunting functions.60 The potentially disadvantageous slowing of processing that may result from the insertion of this conditional shunting can be forestalled by the recruitment of the parallel pathways that are known to interconnect prefrontal cortex with other components of sensory-processing hierarchies.13

One could invoke a somewhat fanciful analogy from the field of sculpture to highlight the influence of prefrontal circuits in shaping our experience of the external world. Statues can start either as clumps of clay to be molded into the desired shape or as blocks of marble to be reshaped by chipping away the pieces that occlude the desired form. The sensory hierarchies and posterior cortices of the human brain serve the first approach as they synthesize veridical representations of external reality; prefrontal cortex promotes the second approach by chipping away at surface appearance until a deeper “meaning” is uncovered. In a figurative sense, it could be suggested that the dialectic tension between these representative and interpretive approaches to neural encoding helps to set the tone of human consciousness.

Overview and Selectively Distributed Processing

Within the context of the overall plan shown in Figures 1 and 2, sensory information undergoes extensive associative elaboration as it flows along a core synaptic hierarchy of primary sensory, upstream modality-selective, downstream modality-selective, and transmodal areas. Upstream sectors of modality-selective areas encode basic features of sensation such as color, motion, form, and pitch. More complex contents of sensory experience such as objects, faces, word-forms, spatial locations, and sound sequences become encoded within downstream sectors of modality-selective areas by groups of coarsely tuned neurons. The hierarchy of processing is more conspicuous along sensory-fugal (forward) than along sensory-petal (backward) connections. The backward projections provide the source of experientially established predictive encoding for guiding the identification of sensory stimuli. Top-down projections from prefrontal cortex introduce an internally generated influence that can transcend surface information and weaken prepotent response tendencies in ways that allow innovative interpretations and contingent responses. These synaptic interactions occur on a platform of state-dependent, large-scale oscillations that provide dynamic stability.

All cognitive processes are likely to arise from analogous associative transformations of similar sets of sensory inputs. The differences in the resultant cognitive operation are determined by the anatomical and physiological properties of the transmodal node that acts as the critical gateway for the dominant transformation. As described in other publications,9 interconnected sets of transmodal nodes provide computational epicenters for large-scale neurocognitive networks. Each epicenter of a large-scale network displays a relative specialization for a specific behavioral component of its principal neuropsychological domain. The destruction of transmodal epicenters causes global impairments such as multimodal anomia, neglect, and amnesia, whereas their selective disconnection from relevant sensory areas elicits modality-specific disconnection syndromes such as prosopagnosia, pure word blindness, or word deafness.

The human brain contains at least five anatomically distinct networks. The network for spatial awareness is based on transmodal epicenters in posterior parietal cortex and the frontal eye fields, the language network on epicenters in Wernicke's and Broca's areas, the explicit memory-emotion network on epicenters in the hippocampal-entorhinal complex and the amygdala, the face-object recognition network on epicenters in midtemporal and temporopolar cortices, and the working memory-executive function network on epicenters in prefrontal and perhaps inferior parietal cortices.

The computational architecture of large-scale networks remains unsettled. As mentioned earlier, there are many reasons for rejecting models based on serially arranged convergent pathways. Such models would necessitate unrealistically complex connectivity patterns and would be painfully slow. A strict parallel distributed processing model is equally unlikely because it overlooks anatomical segregation of distinct computations. An intermediary model is based on selectively distributed processing, according to which networks contain critical epicenters and other participating components.9, 61 In this model, anatomically segregated epicenters are interconnected not only with each other but also with each participating area. Consequently, whenever one epicenter receives a message from the other, it also receives an edited form of the same message from each participating area. This allows vast informational spaces to be surveyed rapidly until the network as a whole settles into a best fit such as finding the right word to express a thought, detecting a target in the environment, or reconstructing a past event.

A modification of selectively distributed processing can be described as a selectively distributed probabilistic processing model according to which each network component has a probabilistic linkage to its preferred domain. For example, although prefrontal cortex and inferotemporal cortex can both be recruited by tasks of working memory and object categorization, the probability of activation is greater in prefrontal cortex for working memory tasks and in inferotemporal cortex for object categorization.62 There are, therefore, no task-specific gatekeepers along sensory-processing pathways. Instead, any sensory input can potentially access all networks, but the responsivity it elicits is regionally biased by task demands. One example of this arrangement was demonstrated in a task where written words had to be processed phonologically or orthographically.63 Both tasks led to activation of Broca's area (a critical network node) and the fusiform gyrus (where visual word-forms are encoded). Task-specific activations were seen in parietal cortex for orthographic processing and lateral temporal cortex for phonological processing. Analysis by dynamic causal modeling showed that the effective connectivity of Broca's area and of the fusiform gyrus was higher with parietal cortex during the orthographic task and with temporal cortex during the phonological task. One interpretation of this experiment is that Broca's area, in its capacity as a critical network component, is sensitive to the cognitive goal and modulates the receptivity (or response probability) of the two participating (task-specific) network components so that identical word-form inputs from the fusiform area can lead to different outputs.
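The reading example can be sketched as goal-dependent routing, in which an identical word-form signal is distributed to two participating areas with coupling strengths that depend on the task. The coupling values and function names below are hypothetical and are not the effective-connectivity estimates from the experiment; they only preserve the reported direction of the effect.

```python
# Hypothetical coupling strengths set by the cognitive goal.
COUPLING = {
    "orthographic": {"parietal": 0.8, "lateral_temporal": 0.2},
    "phonological": {"parietal": 0.2, "lateral_temporal": 0.8},
}


def route(word_form_signal, task):
    """Distribute the same word-form input according to task-dependent coupling."""
    return {area: word_form_signal * weight for area, weight in COUPLING[task].items()}


def dominant_area(word_form_signal, task):
    """Participating area that receives the largest share of the input."""
    activity = route(word_form_signal, task)
    return max(activity, key=activity.get)


orthographic_target = dominant_area(1.0, "orthographic")
phonological_target = dominant_area(1.0, "phonological")
```

Both participating areas receive some input under either task, so the sketch has no gatekeeper; the bias is graded rather than all-or-none.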

The human brain is the single most complex device in the known universe. Truly detailed accounts even of its individual component processes such as color vision, object naming, or hand reaching would each deserve an entire volume dedicated to that subject. However, the neurologist who needs to diagnose impairments of language, memory, attention, or problem solving in clinical practice cannot afford to acquire a specialized knowledge of each component area. It is, therefore, useful to have “large-picture” overviews that provide context for the individual phenomena encountered at the bedside. Such overviews are, by their nature, bound to oversimplify matters. Their only justification is to provide plausible explanations that help to understand the nature of deficits encountered in the clinic and to place them within the larger context of cerebral organization. The account in this review has aimed to provide a plausible explanation, based on recent developments, of how we interpret sensory stimuli, and how the objective and subjective aspects of this information are incorporated into consciousness. Although this review does not focus on individual syndromes and functions, topics that have been addressed in previous publications,9, 13 it deals with aspects of neural function that permeate all cognitive domains and neurocognitive networks. The model outlined here will undoubtedly change as new evidence continues to strengthen and also challenge currently accepted principles of cerebral organization.


This work was supported by the Davee Foundation.

I gratefully acknowledge the critical readings of this manuscript by Drs T. Egner and S. Weintraub.