should be sent to Berit Brogaard, Departments of Philosophy and Psychology, University of Missouri, Saint Louis, MO 63121. E-mail: email@example.com
David Milner and Melvyn Goodale’s dissociation hypothesis is commonly taken to state that there are two functionally specialized cortical streams of visual processing originating in striate (V1) cortex: a dorsal, action-related “unconscious” stream and a ventral, perception-related “conscious” stream. As Milner and Goodale acknowledge, findings from blindsight studies suggest a more sophisticated picture that replaces the distinction between unconscious vision for action and conscious vision for perception with a tripartite division between unconscious vision for action, conscious vision for perception, and unconscious vision for perception. The combination excluded by the tripartite division is the possibility of conscious vision for action. But are there good grounds for concluding that there is no conscious vision for action? There is now overwhelming evidence that illusions and perceived size can have a significant effect on action (Bruno & Franz, 2009; Dassonville & Bala, 2004; Franz & Gegenfurtner, 2008; McIntosh & Lashley, 2008). There is also suggestive evidence that any sophisticated visual behavior requires collaboration between the two visual streams at every stage of the process (Schenk & McIntosh, 2010). I nonetheless want to make a case for the tripartite division between unconscious vision for action, conscious vision for perception, and unconscious vision for perception. My aim here is not to refute the evidence showing that conscious vision can affect action but rather to argue (a) that we cannot gain cognitive access to action-guiding dorsal stream representations, and (b) that these representations do not correlate with phenomenal consciousness. This vindicates the semi-conservative view that the dissociation hypothesis is best understood as a tripartite division.
David Milner and Melvyn Goodale’s dissociation hypothesis is commonly taken to state that there are two functionally specialized cortical streams of visual processing originating in striate (V1) cortex: a dorsal, action-related “unconscious” stream and a ventral, perception-related “conscious” stream (Goodale & Milner, 1992; Goodale, Milner, Jakobson, & Carey, 1991; Milner & Goodale, 1995, 2008). The ventral stream processes information about color and shape and relational properties of objects in allocentric (scene-based) space, whereas the dorsal stream computes information about absolute size and orientation and viewpoint-dependent properties of objects in egocentric space (Schenk, 2006). Whereas ventral stream processes often correlate with visual awareness, the dorsal stream normally operates in the absence of visual awareness.
Second, absolute properties of objects (reflectance properties, shape, and size) are computed in subcortical structures such as the dorsal lateral geniculate nucleus (LGN) and the superior colliculus, and viewpoint-dependent properties of objects are computed in striate cortical areas (Bullier, Hupé, James, & Girard, 2001). Both subcortical and striate areas have projections into the ventral and the dorsal streams.
Third, a dissociation between unconscious vision for perception and unconscious vision for action is supported by both recent and traditional blindsight studies. Blindsight is a kind of residual vision sometimes found in individuals with lesions to striate cortical areas (Weiskrantz, 1986, 2009; Weiskrantz, Barbur, & Sahraie, 1995; Weiskrantz, Warrington, Sanders, & Marshall, 1974). Individuals with blindsight can make above-chance predictions about aspects of stimuli in their blind field (e.g., color and location) (Stoerig & Cowey, 1992). But blindsight is not associated with any distinctly visual phenomenology. Blindsight studies suggest that information for unconscious vision for perception is computed in subcortical structures in the ipsilesional hemisphere (Marcel, 1998; Silvanto, Cowey, Lavie, & Walsh, 2007; Silvanto, Cowey, & Walsh, 2008; Weiskrantz, 2009). This information propagates to spared, but defective, striate areas and is then transmitted to extrastriate areas in the ventral stream. From here it can, through several feedback–feedforward cycles (Bullier et al., 2001), give rise to stabilization in working memory (e.g., orbitofrontal cortex in the prefrontal cortical areas—Padoa-Schioppa & Assad, 2008) and can be accessed and reported, but without a distinctly visual phenomenology.
Milner and Goodale (1995) originally argued that individuals with blindsight are able to guess the location of a stimulus in their blind field on the basis of unconscious cues from motor programming. Blindsight was thus thought to be primarily a result of dorsal stream processing. Milner and Goodale (2008) now acknowledge that findings from blindsight studies suggest a more sophisticated picture that replaces the distinction between unconscious vision for action and conscious vision for perception with a tripartite division between unconscious vision for action, conscious vision for perception, and unconscious vision for perception. The combination excluded by the tripartite division is the possibility of conscious vision for action. But are there good grounds for concluding that there is no conscious vision for action? There is now overwhelming evidence that illusions and perceived size can have a significant effect on action (Bruno & Franz, 2009; Dassonville & Bala, 2004; Franz & Gegenfurtner, 2008; Franz, Gegenfurtner, Bülthoff, & Fahle, 2000; McIntosh & Lashley, 2008). There is also suggestive evidence that any sophisticated visual behavior requires collaboration between the two visual streams at every stage of the process (Schenk & McIntosh, 2010).
I nonetheless want to make a case for the tripartite division between unconscious vision for action, conscious vision for perception, and unconscious vision for perception. My aim here is not to refute the evidence showing that conscious vision can affect action but rather to argue (a) that we cannot gain cognitive access to action-guiding dorsal stream representations, and (b) that these representations do not themselves correlate with phenomenal consciousness. This vindicates the semi-conservative view that the dissociation hypothesis is best understood as a tripartite division.
2. Evidence for dissociation
Research by Milner and Goodale and others has shown that vision is not a single fully integrated system that creates a single representation in the brain (Goodale & Milner, 1992; Goodale et al., 1991; Milner & Goodale, 1995, 2008). Instead, the visual system comprises two functionally specialized and anatomically differentiated processing streams: a ventral (occipito-temporal) stream responsible for computing information for object recognition and a dorsal (occipito-parietal) stream responsible for computing information for visuomotor functions. Both streams start in the early visual areas of the occipital lobe (V1, V2). The ventral stream runs into the temporal lobe and then connects to other temporal and frontal lobe structures that are responsible for declarative memory, decision making, and so on. The dorsal stream runs upward through the occipital lobe into the parietal lobe and continues until it makes contact with the primary somatosensory cortex and the primary motor cortex.
Milner and Goodale’s original studies indicated that lesions to the dorsal stream can impair visuomotor control while leaving visual perception intact, whereas lesions to the ventral stream can impair visual perception while leaving visuomotor control intact. Milner and Goodale conducted a series of neuropsychological studies on a patient, D.F., whose visual form agnosia, caused by carbon monoxide poisoning, involves damage to structures of the ventral stream (Goodale & Milner, 1992; Goodale et al., 1991). D.F. could consciously see some color and texture but was unable to recognize shapes and identify objects. Her dorsal stream was intact. She could accurately reach to and grasp objects. For example, she could post a card into a slot she could not describe and adjust her finger-thumb grip size perfectly to the width of a rectangular block she could not describe. However, action delays led to deterioration in performance, and she had difficulties grasping X-shaped objects and objects with holes in them (McIntosh, Dijkerman, Mon-Williams, & Milner, 2004).
Milner and Goodale originally argued that because D.F. had no visual awareness of the shape or identity of objects, she also had no working memory of the objects she could nonetheless reach to and grasp without difficulty. The information in the dorsal stream, they concluded, is not stored in working memory. Actions performed on the basis of working memory require the ventral stream.
Milner and Goodale’s hypothesis was further tested on optic ataxia patients. Optic ataxia is the mirror syndrome of visual agnosia. Data confirm that individuals with optic ataxia cannot adjust their handgrip to the size of objects without delay. Milner et al. (2001) found that optic ataxia patient I.G., who had a lesion in the posterior parietal cortex, showed a poor correlation between peak grip aperture and object size when the task was to grasp the object immediately upon seeing it. When the action was delayed, the correlation between grip aperture and object size improved, indicating that memory-guided action was controlled by I.G.’s ventral stream. There was even some evidence that I.G. learned to make use of information stored in the ventral stream when the object remained visible.
A third kind of evidence for the dissociation hypothesis comes from studies of optical illusions (e.g., the Ebbinghaus illusion, Fig. 1) (Aglioti et al., 1995). These studies have shown differential effects of optical illusions on perception and action.
Milner and Goodale originally concluded on the basis of these and other studies that our visually guided real-time (nondelayed) actions are not under the direct control of what we consciously see (Goodale & Milner, 1992; Goodale et al., 1991). The dorsal system, they reasoned, is dedicated to the rapid and accurate guidance of our movements and computes information about viewpoint-dependent properties (e.g., the object’s position relative to the body) and absolute properties (e.g., absolute size) required to accurately reach to the object and grasp it online, that is, as a result of a crude, fast, and automatic visuomotor transformation process. Dorsal stream information thus guides the programming and unfolding of real-time action needed when delayed action is counterproductive. However, while the dorsal stream mediates real-time action, under delayed movement conditions dorsal stream representations decay and action becomes mediated by the ventral stream.
The ventral system, on the other hand, is responsible for object recognition and classification. It codes abstract allocentric (scene-based) information for storage in and retrieval from memory (e.g., hippocampus), and it often correlates with visual awareness. The ventral stream furthermore allows us to plan actions off-line, that is, without acting immediately (e.g., by simulating them, imagining them, or calculating how to do them). The ventral stream is also utilized in carrying out familiar activities that are grounded in information stored in visual memory.
The ventral and dorsal streams do indeed interact in nontrivial ways in action simulation, immediate, and delayed action (Allen & Humphreys, 2009; Ball et al., 2009; Decety et al., 1997; Gallese, 2007; Grezes, Costes, & Decety, 1998; Helbig, Graf, & Kiefer, 2006; Jeannerod, 2006; McIntosh & Schenk, 2009; Milner & Goodale, 2008; Rogers, Smith, & Schenk, 2009; Schenk & McIntosh, 2010). For example, attempts to memorize actions in order to imitate them activate areas in the dorsal stream, whereas attempts to memorize actions in order to recognize them on a later occasion activate areas in the ventral stream. Decety et al. (1997) scanned the brains of individuals who were presented with videos of meaningful and meaningless actions and asked to observe the stimuli either with the aim of imitating the action or with the aim of recognizing it. The findings indicated that observation of action in order to imitate it was specifically associated with bilateral activation of the dorsal pathways, reaching the premotor cortex. The ventral pathway, on the other hand, was activated when the task was to observe in order to recognize the action after the scan (Decety et al., 1997).
In another study, Grezes, Costes, and Decety (1998) performed PET scans in two sessions using meaningless and meaningful actions. In the first session, subjects were asked to look at videos without any specific aim. In the second, they were asked to look at the videos with the aim of imitating the actions presented. In the first session, it was found that the pattern of brain activation depended on the nature of the movements presented. In the second session, the task representation triggered information processing in the dorsal system.
To demonstrate familiar-size effects on action, Robert McIntosh and Gavin Lashley (2008) asked subjects to reach to and grasp the standard large Swan Vestas match box and the standard small Scottish Bluebell match box in a series of baseline trials. In a series of perturbation trials, subjects were instructed to reach for a smaller replica of the Swan Vestas match box and a larger replica of the Scottish Bluebell match box (Fig. 2).
The findings demonstrated that the expected size of the match boxes affected both the preshaping of the hand and the amplitude of reaches to grasp them. The researchers hypothesized that the grasp effects could arise either because the retinal size of the targets was modified by familiar size or because familiar size contributed more directly to the programming of grasp formation.
More recently, Allen and Humphreys (2009) discovered that tactile stimuli can activate visual and temporal areas in visual agnosia patients. Allen and Humphreys used fMRI scans to test whether dorsal regions in the lateral occipital cortex are activated during touch. One of their patients, H.J.A., had suffered a stroke that led to visual form agnosia. He had bilateral lesions to the ventral occipito-temporal cortex but a spared lateral occipital cortex (LO). Surprisingly, however, while sight could not activate the visual brain regions for object recognition sufficiently to generate conscious awareness of shape, touch could. The authors reasoned that both touch and sight activate the LO when the selected objects are commonly touched or grasped. They speculated that this may be the case for “all regions selective to graspable objects.” As touch is importantly involved in reaching to and grasping objects, this is suggestive evidence that memory of shape through touch may play a role in memory-guided action.
Schenk and McIntosh (2010) argue that sophisticated visual behavior most likely always involves cooperation between the ventral and dorsal streams. They review data that show that whereas normal individuals can rely on a variety of contextual cues in order to accurately grasp an object, D.F.’s grasp behavior deteriorates when binocular and object-based cues are unavailable. They take this to suggest that the ventral stream contributes to action programming at every stage.
However, as Locklin and Danckert (2010) point out in response to McIntosh and Schenk, the data do not necessarily counter the dissociation hypothesis functionally understood. As Milner and Goodale (2008) emphasize, the ventral and dorsal streams are not independent systems but coworkers with different primary job functions. The two pathways are functionally specialized, not functionally isolated from each other.
There is also evidence countering the older idea that the dorsal stream computes absolute size and shape as well as viewpoint-dependent properties of objects in egocentric space, whereas the ventral stream computes information about properties of objects in allocentric space (Bullier et al., 2001; Marcel, 1998; Schenk & McIntosh, 2010).
Anthony Marcel (1998) carried out a number of studies on blindsight subjects G.Y. and T.P. Two of these show that information about absolute properties (e.g., shape) is available for further processing in either the dorsal stream or the ventral stream.
In the first study, cylinders and spheres whose perceivable shape invoked differential manual grasping postures were placed close to the subjects’ hand. The subjects were instructed to open their eyes when the researcher requested it and, without delay, reach and grasp the object as soon as they could without relying on guesswork. The results from this study showed that visual information both for three-dimensional location and for shape, size, and orientation guiding actions was remarkably accurate. This confirms Milner and Goodale’s hypothesis that information about absolute properties enters the dorsal stream where it can be processed for the guidance of action in real time.
However, a second study conducted by Marcel showed that information about absolute properties of objects can also enter the ventral stream and reach working memory, where it can be accessed and reported, but in blindsight subjects without distinctly visual awareness. It was found that G.Y. and T.P. could make above-chance predictions about the orientation of straight lines and curves and relationships between strokes. G.Y. and T.P. were presented with an array of letters. One row contained the target and a letter of a similar stroke composition (i.e., the orientation of straight lines and curves) and structural description (i.e., the relationships of strokes). Another row contained two letters with stroke types and orientations similar to the target but with different structural descriptions, and the third row contained two letters both of whose stroke types, orientations, and structures were dissimilar to those of the target (Fig. 3). The target stimulus was presented to the subject’s blind field, and the six choice letters were exposed in the subject’s sighted field for as long as needed. The subjects were asked to identify the correct letter and guess which row had letters with similar strokes.
Marcel found that choice of letter and choice of correct row (one containing the target letter and one containing letters with similar strokes and structural description) were above chance for both subjects after several training trials. As the information about the letters had to reach working memory in order for the subjects to make their reports, this study strongly indicates that information about direction and continuity can enter the ventral stream.
Earlier studies conducted by Larry Weiskrantz (1986) have shown that in blindsight, information about shape that normally enters the ventral stream is determined on the basis of information about stimulus orientation. Weiskrantz found that blindsight subject D.B. could make above-chance predictions about orientation and use this information to discriminate among some forms, for example, X and O. He did not fare as well with respect to letters composed of elements that had the same direction. For example, he could not discriminate rectangles with different relations of long-to-short side. Given that D.B. could not distinguish shape when he could not use orientation cues, it is likely that information about orientation is computed in subcortical structures such as the dorsal lateral geniculate nucleus (LGN) or the superior colliculus but that shape processing occurs at later stages of visual processing and depends on the availability of information about orientation. Marcel’s second study confirmed this hypothesis.
Together Marcel’s and Weiskrantz’s studies indicate that information about absolute properties is processed in subcortical structures. This information can then propagate to extrastriate cortical areas in the ventral stream or to parietal areas in the dorsal stream.
In normal subjects, information about absolute properties is further processed in the ventral stream to yield an allocentric (scene-based) representation of the stimulus. Ventral stream areas compare elements of the scene in terms of their color, adjusting for differences in illumination, and join line segments together to form integrated contours.
According to Milner and Goodale’s original hypothesis, information about viewpoint-dependent properties is computed in the dorsal stream (Goodale & Milner, 1992; Goodale et al., 1991). Because online action reduces the viewing time and changes viewing conditions drastically, online action primarily makes use of information that relates the object to the perceiver. Shape, size, and color constancies are less important. Goodale and Milner put the hypothesis as follows:
For the purposes of identification, form information learning and distal (e.g. social) transactions, visual coding often … needs to be ‘object-centred’; i.e. constancies of shape, size, colour, lightness, and location need to be maintained across different viewing conditions. The above evidence from behavioral and physiological studies supports the view that the ventral stream of processing plays an important role in the computation of such object-specific descriptions. In contrast, action upon the object requires that the location of the object and its particular disposition and motion with respect to the observer is encoded. For this purpose, coding of shape would need to be largely ‘viewer-centred’ with the egocentric coordinates of the surface of the object or its contours being computed each time the action occurs. We predict that shape-encoding cells in the dorsal stream should predominantly have this property. Nevertheless, certain constancies, such as size, would be necessary for accurate scaling of grasp aperture, and it might therefore be expected that the visual properties of the manipulation cells … in the posterior parietal region would have this property. (Goodale & Milner, 1992, p. 23)
Milner and Goodale’s empirical data confirm that viewpoint-dependent properties are required for the guidance of online action.
Viewpoint-dependent properties, however, are not irrelevant to conscious perception (Brogaard, 2010; Peacocke, 1983; Tye, 1996). We perceive objects not just in relation to other objects in a scene but also in relation to an egocentric frame of reference. We locate and attribute properties to objects relative to our own location: Objects are experienced as near or far, to the right or to the left, relative to the location of our bodies. We experience spatiality insofar as we can relate locations to our own perspective as perceivers. I perceive the clock as being over there, the coffee mug as being to the right of me, the ceiling as being above me, and the floor as being below me. We furthermore perceive objects as having viewpoint-dependent shapes, sizes, and colors. In addition to consciously perceiving a tilted bracelet as circular, we also normally can consciously see it as having elliptical cross-sections (Fig. 4).
Michael Tye puts it as follows: The [bracelet] is represented as having boundaries “which would be occluded by an elliptical shape placed in a plane perpendicular to the line of sight of the viewer … In this sense, the [bracelet] is represented as being elliptical from here. But it is also simultaneously represented as being at an angle and as being itself circular. This is why the tilted [bracelet] both does, and does not, look like the same [bracelet] held perpendicular to the line of sight” (Tye, 1996, p. 125, fn 10).
Taken together, empirical evidence and theoretical considerations suggest that subcortical structures process information about absolute properties, whereas cortical areas shared between the ventral and dorsal pathways process information about viewpoint-dependent properties. From here this information can propagate through either the ventral or the dorsal stream. In the ventral stream, the information is further processed to form allocentric representations. In the dorsal stream, the information is further processed to form action-guiding representations.
But the two sets of data, while countering specific details of Milner and Goodale’s original proposal, do not undermine the dissociation hypothesis when correctly stated as a division of labor in the visual brain (Clark, 2009; Jacob & Jeannerod, 2003; Jeannerod & Jacob, 2005). The two visual streams interact—perhaps at every stage of sophisticated action programming, as suggested by Schenk and McIntosh (2010). The egocentric/allocentric distinction is genuine but orthogonal to the vision for action/vision for perception distinction (Bullier et al., 2001). Both sets of data—arguing for stream interaction and awareness of egocentric properties—are thus consistent with the hypothesis that the ventral stream is highly specialized for vision for perception, whereas the dorsal stream is highly specialized for vision for action.
4. Results from two blindsight studies
While the question of whether there is dissociation between vision for perception and vision for action has received much attention in the literature, the question of whether there is dissociation between unconscious vision for perception and unconscious vision for action has received comparatively little attention. Milner and Goodale originally suggested that when blindsight subjects are prompted by researchers to make a guess about aspects of a stimulus in their blind field, they utilize unconscious cues from motor programming (Milner & Goodale, 1995). Hence, blindsight, it was originally suggested, is primarily a result of dorsal stream processing. Milner and Goodale now acknowledge that perception can be unconscious, and that unconscious perception is dissociated from unconscious action. In a recent review, they write:
Most psychologists would accept the notion that perceptual processing does not always achieve consciousness, despite the fact that at some level the mental representations of conscious and unconscious percepts, and presumably their neural correlates, are qualitatively similar. There is much theoretical speculation about what distinguishes conscious from unconscious percepts, but for our purposes both can be seen as gaining their content from common mechanisms in the ventral stream. (Milner & Goodale, 2008, p. 775)
However, skepticism about unconscious visual perception is unwarranted. It has been known for some time that structural encoding associated with ventral stream processes need not correlate with visual awareness. For example, it has been shown that the later parts of the ventral stream (the inferior temporal complex) can compute object representations which include information about relationships between the object’s parts but not about relationships between objects in allocentric space (Cooper & Schacter, 1992; Gordon & Irwin, 1996). This information need not correlate with visual awareness.
However, the most convincing evidence for unconscious visual perception comes from blindsight studies that demonstrate that color and shape information gives rise to visual awareness in blindsight subjects only when it is further processed in the contralesional hemisphere. Two studies stand out in this regard. One is Marcel’s (1998) classical study in which conscious after-images and illusions of shapes spanning the blind field were induced in two blindsight subjects. The other is a recent study conducted by Juha Silvanto et al. (2007, 2008) in which colored phosphenes were induced in the blind field of a blindsight subject.
In Marcel’s first experiment, a flash gun built into a black box with shaped slits (semi-circles or lines) was used to produce after-images in blindsight subjects G.Y. and T.P. After dark-adapting for 3 min, G.Y. and T.P. were asked to fixate on a fixation light, and the flash gun was fired. After removal of the stimulus and turning on the light, the subjects were asked to draw what they saw, including the relative position of the fixation light. They were instructed not to guess. They were then told to blink, stare at a white surface, and draw what they saw. This was repeated with a black surface. The results are summarized in Fig. 5.
The data presented in Fig. 5 show that when only the sighted hemifield was exposed to the flash, an after-image representing the shape appeared in the sighted hemifield. When only the blind hemifield was exposed, only the fixation light was seen. When both the sighted and the blind hemifields were exposed to a stimulus that formed a coherent figure, an after-image which represented the figure in both hemifields appeared. When both the sighted and the blind hemifields were exposed to an inappropriate shape, an after-image which spanned only the sighted hemifield and which represented the shape shown on that side would appear for nonclosed figures, and an after-image which spanned both the sighted hemifield and the blind hemifield and which represented a symmetric shape would appear for closed figures.
Upon completion of the flash gun study, G.Y. and T.P. were then seated facing a screen eight feet away, on which Kanizsa figures were front-projected. A fixation point was shown until each trial began. The stimuli would in some cases form a triangle but would in other cases be inappropriate (see Fig. 6). The subjects were asked what they saw and to guess if necessary.
Both subjects said that they never saw the inducing black pacman figures but only white triangles (if anything). G.Y. reported seeing a triangle in nearly all cases where the inducing figures were appropriate and reported not seeing a triangle when the inducing figures were inappropriate. He was not sure about the phenomenology and reported in some cases that both halves of the triangle seemed to be located in his sighted hemifield. T.P. was less accurate than G.Y. but usually reported seeing a triangle “out there on the screen” when the inducing figures were appropriate but not when they were inappropriate.
In a third experiment, G.Y. and T.P. were exposed to stimuli forming different Kanizsa figures. The third experiment was introduced to test G.Y. and T.P.’s abilities to determine which shape they saw. Two kinds of stimuli were used. In one case, the inducing figures were supposed to induce the illusion of a hexagon. In the other case, they were supposed to induce the illusion of a pentagon (Fig. 7). The experimental strategy was the same as in the previous experiment except that the exposure time was longer.
G.Y. and T.P. reported that they never saw the inducing black pacman figures in their blind field but only white shapes (if anything). G.Y. and T.P. were less accurate than in the previous trials but in most cases they reported accurately when the stimulus formed an appropriate figure. When the figure was inappropriate, they reported that they did not see it.
Marcel took the results from the three studies to show that “there must have been a veridical non-conscious representation of the stimulus in the blind field, which depended on the content of contralateral field percepts only for its becoming conscious” (Marcel, 1998, p. 1582). Conscious vision in the blind field required that the stimuli in the two fields formed a symmetrical figure or had joint figural elements and formed a good figure. In the case of the after-images, the subjects saw veridically what was in their blind field as long as the whole figure made a good figure. Marcel thus reasoned that the visual process in G.Y. and T.P. cannot be described as a case of completion, where what is seen in the normal hemifield is completed in the blind hemifield. The oblique lines in the blind field were not seen when presented alone. But they were veridically seen when they were accompanied by information that would make a good figure. In the two subsequent studies, what was shown in the blind field affected what appeared as an illusion in the blind field.
Marcel concluded that visual awareness of contour requires separating parts of the scene into figure and ground. One element has to take the role of ground. In the Kanizsa inducing experiments, the black stimuli had to be processed as grounds for the triangles. The triangles had to be seen as lying on top of the inducing stimuli. Marcel hypothesized that “the non-conscious representation of the stimulus takes the form of a perceptual hypothesis, which is confirmed by its consistency with perceptual information in the other field. To the extent that it is confirmed by the presence of such consistency it becomes conscious” (Marcel, 1998, p. 1583). Below I will argue that a more interesting conclusion can be drawn from the data. The data, I will argue, help to establish the primary visual cortex as a neural correlate of shape percepts. First, however, let us look at a more recent study conducted by Juha Silvanto and colleagues.
Silvanto’s team applied transcranial magnetic stimulation (TMS) to induce colored phosphenes in blindsight subject G.Y. and three control subjects (Silvanto, 2008; Silvanto et al., 2007, 2008). When TMS was applied to G.Y.’s normal hemisphere, a colorless phosphene appeared. With visual adaptation to a uniformly colored stimulus, both unilateral stimulation of G.Y.’s normal hemisphere and stimulation of one of the hemispheres in the control subjects induced phosphenes in the corresponding area of the visual field. The color of the phosphene corresponded to the color of the stimulus.
Bilateral stimulation without exposure to a color stimulus induced a white arc spanning both hemifields in both G.Y. and control subjects. With exposure to a color stimulus, bilateral stimulation induced a phosphene spanning both hemifields, and the phosphene’s color corresponded to the color of the stimulus in both G.Y. and control subjects.
When exposure to color was restricted to one hemifield in the control subjects, the component of the induced phosphene overlapping the adapted hemifield was colored and the component overlapping the unadapted hemifield was colorless. In G.Y., stimulation gave rise to a bilateral phosphene that appeared uniformly colored when the adaptation was restricted to the sighted field but appeared colorless when the adaptation was restricted to the blind field. With a bi-colored stimulus, control subjects experienced bi-colored phosphenes. In G.Y., the color of the phosphene depended exclusively on the color of the stimulus to which the normal hemifield had been adapted.
The researchers took these findings to counter an earlier hypothesis to the effect that feedback to striate cortex from higher visual brain regions in one of the hemispheres is required for visual awareness in the visual field corresponding to that hemisphere (Silvanto, Cowey, Lavie, & Walsh, 2005). Silvanto concluded that the findings from the TMS studies suggest a neural correlate of consciousness (Silvanto, 2008). Activation in extrastriate areas (e.g., V4 areas) of the ipsilesional hemisphere in G.Y. was consistently required in order for a colored phosphene to appear in G.Y.’s blind field. None of the areas in the contralesional hemisphere sufficed for inducing colors in the blind field. The lack of a functional striate cortex apparently weakens neural responses in extrastriate areas such as V4 and V5/MT. Hence, extrastriate areas appear to be closely correlated with color experience.
5. Discussion of the blindsight studies
Marcel (1998) concluded on the basis of his experiments that visual awareness arises in the blind field of blindsight subjects only if the stimulus to which both hemifields had been exposed constitutes a coherent figure. However, this conclusion fails to answer the crucial question of what kind of information about shape is processed in ipsilesional regions in blindsight.
Marcel’s after-image study showed that an after-image spanning both hemifields arose only when the stimuli to which the two hemifields had been exposed were processed together. If a significant amount of information about shape in the blind field was further processed in striate regions in the ipsilesional hemisphere, then whether an after-image spanning both hemifields was produced would not depend exclusively on whether a coherent figure could be generated. This suggests that no significant amount of shape information was further processed in striate regions of the ipsilesional hemisphere.
Hence, a plausible interpretation of Marcel’s data, which is also consistent with the data gathered in Silvanto’s study, is that information about stimuli in the blind field can enter the contralesional hemisphere where it can be compared to information about stimuli to which the sighted field has been exposed. Information representing a coherent figure is generated if the stimuli to which the two hemifields were exposed form a coherent figure. If they do not form a coherent figure, only the stimulus in the sighted hemifield is processed to give rise to a semi-shape in the normal hemifield.
Together Marcel and Silvanto’s studies indicate that the only visual information processed in the ipsilesional hemisphere in blindsight is information about wavelength, light-intensity, or the location and direction of small line-elements of the scene, which does not suffice for awareness of contour or color. Call this kind of information “quantitative information.” The underlying mechanisms for quantitative information processing in blindsight are still largely unknown. Empirical findings suggest that quantitative information is processed either in the retina and the LGN or in the superior colliculus in the midbrain.
An alternative hypothesis about quantitative color processing in blindsight is that information from the retina propagates to the LGN and the primary visual cortex via the standard retina-geniculate-striate pathway but that the neural signals in ipsilesional striate regions are too sluggish to trigger proper activation of higher brain regions (Gazzaniga, Fendrich, & Wessinger, 1994; Silvanto, 2008). Both the retina and the LGN contain red–green, blue–yellow, and black–white opponent cells, which compute differences in cone firing (Kentridge, Heywood, & Weiskrantz, 2007; Lennie, 2000; Stoerig & Cowey, 1992). They can, for example, detect differences in the activity of the red and green cones and hence determine whether the wavelength of light reflected from one region of the scene is in the red or the green region of the color spectrum. However, they are unable to compare neighboring regions of the scene and adjust for differences in illumination (Conway & Livingstone, 2006; Kentridge et al., 2007). As neurons in the retina and the LGN do not have the mechanisms for color contrasting across the scene, and color contrasting is necessary for color experience, the hypothesis that blindsight uses the standard visual pathway would explain why blindsighters lack color experience in the blind field but nonetheless can detect wavelengths in experimental settings.
This hypothesis was confirmed by Robert Kentridge et al. (2007). When they exposed blindsight subject D.B. to pairs of same-colored dots against a color gradient background, they found that D.B. was unable to make use of color contrast to determine color in his blind hemifield. He relied exclusively on the wavelength of light emitted from the dots. In contrast, normal individuals and achromatopsics, who can only see black, white, and shades of gray, determine color on the basis of color contrast rather than physical wavelength.
Similar conclusions have been drawn from studies of shape perception. Empirical findings suggest that direction and local texture are processed in subcortical structures (Li, Piëch, & Gilbert, 2006). One hypothesis is that shape opponent cells in the retina and the LGN determine which neighboring elements belong together. However, while neurons in striate cortical areas are capable of joining elements across a larger area of a scene, processes in the retina and LGN can only link neighboring elements together to determine direction and local texture.
Quantitative color and shape information is thus most likely computed in the retina, the LGN, or the superior colliculus. However, quantitative shape and color information does not suffice for color and shape awareness because color and shape awareness requires a considerable amount of recoding of the quantitative information yielding qualitative information (Chalmers, 1990).
Qualitative color information can be understood as the output of the process of comparing color inputs of elements of a scene to adjacent elements in terms of wavelength and adjusting for variations in illumination. This process is partially responsible for the constancy of perceived colors irrespective of the light used to illuminate them, also known as “color constancy.” Color constancy illusions arise when neurons interpret two elements with the same color as being differently colored owing to a difference in perceived illumination conditions. For example, despite the fact that the light reflected from A has the same physical properties as the light reflected from B in Fig. 8, we initially judge that A and B have different achromatic colors.
According to Lotto and Purves (2000), we get fooled by color illusions like the checker shadow illusion because we determine stimulus color on the basis of the combination of wavelengths and illumination that usually would have produced this stimulus in the past. In past real-world scenarios, squares A and B would have had the same perceived color under the depicted illumination conditions despite their different surface-spectral reflectance properties. The illusion thus occurs because we hypothesize on the basis of past experience that the squares must be differently colored.
Though the processes necessary for awareness of chromatic colors (e.g., red) are located further along the visual hierarchy, in the V4/V8 color complex, recent research has shown that double-opponent cells in V1 have a receptive field structure that can contribute significantly to color constancy (Conway, 2001; Conway & Livingstone, 2006; Kentridge et al., 2007).
Double-opponent cells are not sensitive to absolute cone responses (Conway, 2001; Conway & Livingstone, 2006). Rather, they compare differences in cone responses across cone type (cone opponency) and across space (spatial opponency). Double-opponent cells can, via spatial opponency processes, determine which scene elements have the same color by adjusting for differences in registered illumination. Spatial opponency processes thus partially explain why squares A and B in Fig. 8 are perceived as having different achromatic colors.
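The functional difference between cone opponency alone and combined cone-plus-spatial opponency can be illustrated with a toy sketch. This is not a physiological model; the function names, cone values, and the 0.5 dimming factor are invented for the example, and the point is only that a spatially opponent comparison preserves a relative color judgment under a uniform change in illumination, whereas an absolute opponent signal does not.

```python
# Toy sketch (illustrative, not a physiological model): cone responses
# are given as (L, M) pairs; a uniform illumination change scales all
# responses by the same factor.

def cone_opponent(l, m):
    """Red-green opponent signal: difference in L- and M-cone activity."""
    return l - m

def double_opponent(center, surround):
    """Spatial opponency: compare the opponent signal in a central
    region against the opponent signal in its surround."""
    return cone_opponent(*center) - cone_opponent(*surround)

# Cone responses for a reddish patch against a greenish surround
# (values are made up for the example).
center, surround = (0.8, 0.4), (0.3, 0.6)

# Dim the illumination uniformly by half: all cone responses scale.
dim = lambda resp: tuple(0.5 * r for r in resp)

# The absolute opponent signal changes with illumination...
assert cone_opponent(*center) != cone_opponent(*dim(center))

# ...but the sign of the spatial comparison ("center is redder than
# surround") is invariant, a crude analogue of discounting the
# illuminant in color constancy.
assert (double_opponent(center, surround) > 0) == \
       (double_opponent(dim(center), dim(surround)) > 0)
print("relative color judgment survives a uniform illumination change")
```

On this toy picture, a cell that only computes `cone_opponent` corresponds to the retinal and LGN opponent cells discussed above, which cannot discount the illuminant; `double_opponent` corresponds to the spatial-opponency contribution attributed to V1.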
Color awareness requires spatial opponency processing. As this kind of processing does not take place in the ipsilesional hemisphere in blindsight subjects, blindsight subjects normally lack color awareness. In Silvanto’s study, G.Y. experienced a colored phosphene in his blind field but this phenomenon depended on information processing in the contralesional hemisphere. This suggests that G.Y.’s ipsilesional hemisphere was unable to compute qualitative color information on the basis of color exposure in the blind field. The information necessary for color awareness was computed in contralesional striate and extrastriate cortical regions on the basis of inputs from both eyes.
Qualitative shape information, as I shall construe the term here, is information computed in striate cortex which represents salient contours that result when elements of a scene are joined to other elements of the scene to form figures that stand out from the ground. The process of generating qualitative shape information is also known as “contour integration” (Li et al., 2006). Contour integration is governed by a principle that instructs us to combine segments that form a good continuation relative to global context (Li & Gilbert, 2002; see also Stevens, 2004).
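The good-continuation rule governing contour integration can likewise be given a minimal sketch: two local line elements are linked into one contour when the direction from one element to the other roughly matches both elements' orientations. This is a toy illustration only; the tolerance value and function names are assumptions, not the actual cortical algorithm described by Li and Gilbert.

```python
import math

def good_continuation(p1, theta1, p2, theta2, tol=math.radians(20)):
    """Toy good-continuation test: True if the element at p1 (with
    orientation theta1) and the element at p2 (orientation theta2)
    plausibly form one smooth contour, i.e., both orientations are
    close to the direction of the line linking the two positions."""
    link = math.atan2(p2[1] - p1[1], p2[0] - p1[0])

    def ang_diff(a, b):
        d = abs(a - b) % math.pi  # orientations are identical mod 180 degrees
        return min(d, math.pi - d)

    return ang_diff(theta1, link) < tol and ang_diff(theta2, link) < tol

# Two collinear horizontal elements link into one contour...
assert good_continuation((0, 0), 0.0, (1, 0), 0.0)
# ...as do aligned diagonal elements...
assert good_continuation((0, 0), math.pi / 4, (1, 1), math.pi / 4)
# ...but a perpendicular element breaks the continuation.
assert not good_continuation((0, 0), 0.0, (1, 0), math.pi / 2)
```

The sketch captures only the local pairwise rule; actual contour integration, as the cited work emphasizes, also weighs global context in deciding which linkings yield a figure that stands out from the ground.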
In Marcel’s studies, whether a shape percept was present in the blind hemifield depended on whether the stimuli to which the two hemifields were exposed formed a good figure. If no coherent figure could be formed, nothing consciously appeared in the blind field. This suggests that the ipsilesional hemisphere was unable to compute qualitative shape information on the basis of exposure to shapes. Qualitative shape information was computed in contralesional striate regions via contour integration processes that integrate information from both hemifields.
The upshot is this: Marcel and Silvanto’s studies suggest that when a colored phosphene or an after-image representing a shape appears in a blindsighter’s blind field, contralesional striate areas are involved in processing information from subcortical structures to yield the qualitative color and shape information necessary for color and shape experience. Under normal conditions, blindsight subjects do not have color or shape experiences because the information processed in subcortical structures is not properly processed in ipsilesional striate areas. However, the information from subcortical structures can nonetheless reach stabilization in working memory, given proper prompting, training, or cueing (Danckert & Goodale, 2000).
This interpretation of Marcel and Silvanto’s findings confirms a dissociation between conscious and unconscious vision for perception. In the case of unconscious vision for perception, quantitative information about wavelength and light-intensity, or the location and direction of small line elements in the scene, is computed in the retina, the LGN, or the superior colliculus. From here it normally propagates to extrastriate brain regions (V4 and V5 regions) and can reach working memory (prefrontal cortex) and hippocampus (storage memory) via the temporal lobe. But quantitative information does not normally correlate with color or shape experience because color and shape experience requires contour integration and the computation of color contrast across the scene. While specific hues and complex shapes are determined in higher cortical areas (V4/V8), contour integration and color contrasting appear to take place in the primary visual cortex.
Visual awareness of color and shape does sometimes occur in the blind field of blindsight subjects in experimental conditions. However, in order for this to happen, quantitative information computed in subcortical structures must enter the contralesional hemisphere via callosal pathways and be processed together with information from the sighted hemifield. Qualitative information computed in the contralesional hemisphere must furthermore re-enter the ipsilesional hemisphere via callosal pathways, and from here it must reach stabilization in working memory in prefrontal cortex via extra-striate and temporal lobe areas.
6. Why are there no conscious action-guiding representations?
The data reviewed suggest a more sophisticated picture that replaces the distinction between unconscious vision for action and conscious vision for perception with a tripartite division between unconscious vision for action, conscious vision for perception, and unconscious vision for perception. This leaves out only conscious vision for action.
Some of the evidence presented against the unconscious nature of the dorsal stream comes from cases of neglect. Vittorio Gallese (2007) argues that the inferior parietal lobule (IPL) is best seen as part of what he calls the “ventral-dorsal stream.” The ventral-dorsal stream (and the IPL in particular), Gallese says, plays a crucial role in our visual spatial awareness.2 He presents a range of cases in support of the claim that spatial and body representations required for action are conscious representations. Vision, sound, and action, he argues, “are parts of an integrated system; the sight of an object at a given location, or the sound it produces, automatically triggers a ‘plan’ for a specific action directed toward that location. What is a ‘plan’ to act? It is a simulated potential action” (Gallese, 2007, p. 7). In support of his view, Gallese cites studies of neglect brought about by lesions to dorsal stream recipient parieto-premotor circuits in monkeys and humans. On an influential view of neglect, neglect is a disorder of attention that interferes with what the subject is able to pay attention to (Driver, Mattingley, Rorden, & Davis, 1997). Evidence for this view comes from cases of extinction where bilateral stimulation leads to neglect of ipsilesional input. The theory here is that competition for attention leads to a failure to attend to the contralesional input. A complete loss of attention quite plausibly leads to a loss of awareness.
Patients with neglect brought about by lesions in parietal areas appear not only to fail to attend to objects located contralateral to the lesion, they also fail to reach to objects in contralesional peripersonal space, the space within which we can grasp objects. The same patients often have no trouble reaching to and grasping objects in extrapersonal space. Gallese concludes on the basis of these and other empirical data that Head and Holmes’s (1911) and Schilder’s (1935) distinction between an unconscious body schema and a conscious body image is too simplistic. Because neglect leads to both a failure to reach into and a lack of awareness of peripersonal space, he says, brain regions crucially involved in governing action are partially responsible for computing conscious spatial representations.
However, the evidence cited by Gallese does not quite show what he says it does. Gallese may well be right that the IPL plays a role in visual spatial awareness, and the studies he cites indicate that damage to the IPL can disturb both visual spatial awareness and action, but they do not demonstrate that parietal representations of peripersonal space are conscious. Numerous other hypotheses explain the data cited by Gallese equally well (Bullier et al., 2001; Jacob & Jeannerod, 2003; Jeannerod & Jacob, 2005; Rossetti, 1998). One hypothesis is that the IPL transmits information to the ventral stream, perhaps via feedback to striate cortex, and that this feedback of information is required in order for ventral stream processing to give rise to conscious spatial representations. This hypothesis is consistent with Jean Bullier et al.’s (2001) suggestion to the effect that feedback from the dorsal stream to striate cortex can influence ventral stream processing. On this view, the two visual streams interact via extrastriate-striate or parietal-striate feedback.
The question of whether action-guiding dorsal stream representations sometimes are conscious has interesting consequences for enactive theories of perception. At least some enactive theories of perception appear to be inconsistent with a strong dissociation between vision for perception and vision for action. On Alva Noë’s (2004) enactive theory of perception, perceptual experience is the exercise of sensorimotor know-how. But Milner and Goodale’s dissociation hypothesis presents a prima facie challenge to this view. If the exercise of sensorimotor know-how is exhausted by action-guiding dorsal-stream representations, and Milner and Goodale are right that action-guiding dorsal-stream representations are unconscious, then perceptual experience is unconscious. But perceptual experience essentially is conscious. So, Noë must hold either that action-guiding dorsal-stream representations can reach consciousness or that sensorimotor know-how is not exhausted by action-guiding dorsal-stream representations.
In the introductory remarks of his study, Noë (2004) states that “the basis of perception, on our enactive, sensorimotor approach, is implicit practical knowledge of the ways movement gives rise to changes in stimulation” (p. 8). This characterization of the enactive view seems to implicate know-how-storing cerebellar areas as a constituent, or supervenience basis, of perceptual experience. But stored know-how is integrated in action-guiding representations upon neural discharge. So, if Noë does indeed hold that know-how is the basis of perception, then the most viable response to Milner and Goodale’s hypothesis that action-guiding dorsal-stream representations never reach consciousness may be to reject it.
I want to make a case for the tripartite division between unconscious vision for action, conscious vision for perception, and unconscious vision for perception. This excludes the possibility that action-guiding dorsal stream representations can reach consciousness and thus challenges the enactive view of perception. My aim here is not to refute the evidence showing that conscious vision can affect action and that the ventral and dorsal streams interact but rather to show that we cannot have cognitive access to dorsal stream action-guiding representations. Lack of conscious access, I will argue, entails lack of phenomenal consciousness.
One highly influential opponent of the hypothesis that a lack of cognitive access entails a lack of phenomenal consciousness is Ned Block. Block (1995) makes a case for distinguishing two kinds of consciousness that sometimes are confused in the literature, viz. “access consciousness” or “A-consciousness” and “phenomenal consciousness” or “P-consciousness.” Block (1995) stipulates that for information to correlate with A-consciousness, it must be poised for the direct rational control of action. For it to correlate with P-consciousness, on the other hand, there must be something it is like for us to be in a state with that information content. Information in working memory is accessible, and hence correlates with A-consciousness but it need not correlate with P-consciousness.
Block (2007) sets forth a new definition of “access consciousness.” According to Block (2007), information is cognitively accessible if it is actually in working memory. Cognitive accessibility, he says, cannot be understood in a broader sense that includes, for example, information stored in visual areas or the hippocampus. One reason for not allowing the broader notion of accessibility, he argues, is that a completely unconscious representation can be potentially in working memory. If attention were shifted slightly, the information would be in working memory. So, Block argues, on the broader notion of cognitive accessibility, an unconscious representation could be A-conscious, which he thinks would be undesirable.
In support of his own hypothesis, Block draws on data presented by Marisa Carrasco and colleagues (Carrasco, 2009; Carrasco, Ling, & Read, 2004). In one study, an attention-attracting dot was presented on one side of a screen before a pair of gratings with different contrasts and a certain orientation. The researchers asked participants to report the orientation of whichever grating in the pair seemed to them to have the higher contrast. They found that attention could make a grating that was lower in contrast than the comparison seem higher in contrast. Similar results were found in color saturation studies (Carrasco, 2009). Block takes Carrasco’s findings to show that a shift in attention can affect phenomenal consciousness to a degree that suffices for making the content of phenomenal consciousness accessible from working memory.
However, there is some reason to be skeptical about Block’s claim that unconscious representations can be potentially in working memory because they would be there if attention were shifted. Selective attention is a top–down factor that modulates representation. Attention can affect the representational phenomenal character of a mental state (this, in fact, is the most natural reading of the findings published in Carrasco et al., 2004). For example, the clock now in the periphery of my visual field is presented to me in a way that leaves out information. I am aware of its round shape and some smeared pattern in the middle. Were I to attend to it, I would also be aware of the numbers and their colors. So, attention can change the representational phenomenology of my experience. In a phrase: Attention affects representation.
The findings in Carrasco’s studies thus do not support the hypothesis that attention can enhance cognitively inaccessible representations to the point of consciousness. At best they show that attending to the meaning of a previously inaccessible state may generate a new, conscious representation with a different content. It should be fairly uncontroversial that completely inaccessible representations cannot become accessible regardless of how much they are enhanced by selective attention (Clark, 2007, 2009).
Recent empirical findings support the hypothesis that access consciousness understood as information that is potentially in working memory correlates with phenomenal consciousness. Nathan Cashdollar et al. (2009) offer compelling evidence that a sharp distinction cannot be drawn between the neural correlates of short-term and intermediate-term memory. The most frequently cited data on memory have come from work on amnesiacs. Amnesiacs are unable to form long-lasting memories, but they are fully capable of keeping information active in their head for a short amount of time. Cashdollar and colleagues found that epileptics with damage to the hippocampus behaved like amnesiacs in these two respects, but they differed from amnesiacs in one crucial way. Unlike amnesiacs, the epileptics could not recall details of pictures they had just seen (e.g., whether the table was located to the left or the right of the chairs). These findings suggest that there are two kinds of short-term memory: One that can store details of a scene in the hippocampus and another that can keep fewer and less detailed items of information active in other brain regions, for example, the prefrontal cortex.
If short-term memories of details of scenes and events are stored in the hippocampus in normal individuals as well, then not all of the content of short-term memory can be identified with what is actually in working memory. This observation creates trouble for Block’s hypothesis. If you have a visual experience of a table located to the left of two chairs, then there is something it is like for you to have that experience, and what it is like for you to have that experience is different from what it would be like for you to have an experience of the table as located to the right of the two chairs. So, the table’s location makes a difference to the phenomenology of your visual experience. As visual experiences are both temporally extended and rich in terms of the information they convey, some of the information that makes a difference to the phenomenology of your visual experience is not actually in working memory but is briefly stored in the hippocampus. However, the information stored in the hippocampus could have been in working memory. So, information that is potentially but not actually in working memory can make a difference to phenomenal consciousness. What matters for phenomenal consciousness is, among other things, whether the information is cognitively accessible. If the information is not cognitively accessible, it does not contribute to phenomenal consciousness.
I grant that A-consciousness and P-consciousness are distinct concepts. A philosophical zombie can be A-conscious of a visual stimulus but, by definition, a philosophical zombie has no P-consciousness (Chalmers, 1996). However, the conceptual possibility of creatures like us with A-consciousness but no P-consciousness does not undermine my point that cognitive accessibility and phenomenal consciousness actually occur in conjunction. In fact, my argument depends only on the much weaker claim that if a process in the human brain is phenomenally conscious, then the information it represents is cognitively accessible: The information can be retrieved from short-term memory sites such as the hippocampus or the prefrontal cortex.
The question remains whether action-guiding dorsal-stream representations are cognitively accessible. There is good reason to think that they are not. Action requires discharging potentiation (the transmission efficacy at the synapses) in the sites hosting procedural, or routine-based “how-to,” memory, viz., the basal ganglia (the striatum) and the cerebellum (Cavaco, Anderson, Allen, Castro-Caldas, & Damasio, 2004). Neither the basal ganglia nor the cerebellum generates direct conscious outputs.
Furthermore, unlike vision for object recognition, vision for action is multi-modal (Gentilucci et al., 1995; Jacob & Jeannerod, 2003). Though vision is the dominant sensory modality in vision for action, action-guiding representations integrate a variety of nonvisual stimuli, including tactile, kinesthetic, and proprioceptive input. Even when subjects are not consciously aware of changes in object size, they still change their hand apertures to fit the object (Gentilucci et al., 1995). Tactile and proprioceptive information from the hand about the object’s size determines the kinematics of reach-to-grasp movements. This finding by itself indicates that action-guiding representations are not cognitively accessible, as cognitive access to kinesthetic and proprioceptive information is much weaker than cognitive access to visual stimuli.
Other data confirm the hypothesis that the multimodal action-guiding representations are cognitively inaccessible—perhaps partially in virtue of being multimodal (MacKenzie & Iberall, 1994, chap. 5; Paulignan, MacKenzie, Marteniuk, & Jeannerod, 1991). Calculating reaching- and grasping-behavior requires visual representations of object and object location as well as proprioceptive representations of hand and arm. If an object suddenly changes location, corresponding adjustments in arm velocity and trajectory are made in less than 100 ms, which is not enough time for the human brain to consciously represent the change in object location or the corresponding change in velocity and trajectory (Paulignan et al., 1991). Studies have shown that when subjects are asked to use a minimally demanding vocal response (“Tah!”) to signal their awareness of a change in object location, correction of movement occurs significantly faster than the vocal response. Corrections of trajectory and hand aperture occurred within 100 ms, whereas the vocal response happened after 420 ms (Castiello & Jeannerod, 1991; Castiello, Paulignan, & Jeannerod, 1991).
Studies of pointing and saccadic eye movement further indicate that subjects can correct saccadic eye and pointing movements faster than they can consciously perceive a change in object location (Goodale, Pelisson, & Prablanc, 1986; Pelisson, Prablanc, Goodale, & Jeannerod, 1986). In one study, subjects were asked to point as fast and accurately as possible to stimuli occurring in the dark (Pelisson et al., 1986). In the first series of trials, the target leaped from an initial position to a randomly selected position. In the second series, the target made a second jump in the same direction as the initial jump. Subjects reported that they were unaware of the second jump, and they were unable to predict its direction, but while saccadic eye and pointing movements were initially aimed at the target’s position after the first jump, both were immediately adjusted to fit the target’s new location. Even though the participants had no conscious awareness of the two jumps, they were clearly seeing and acting on both jumps. The findings indicate that the subjects updated movement trajectory and target location without conscious awareness of the update.
Similar results were reported by L. S. Jakobson and Goodale (1989). They first showed that subjects could not detect a three-degree shift in vision through wedge prisms. They then monitored the subjects’ movements. Despite no reported conscious awareness of the shift in vision, the shift generated a modified hand-path curvature.
Together these findings indicate that subjects rely on various sources of inaccessible information in grasping, reaching, and pointing behavior. This is strong evidence in favor of the hypothesis that the action-guiding dorsal-stream representations that guide grasping, reaching, and pointing behavior are inaccessible to consciousness.
Perhaps the strongest argument for thinking that action-guiding dorsal-stream representations are cognitively inaccessible comes from work on visual agnosia patients. Goodale, Jakobson, and Keillor (1994) showed that while D.F. had no apparent difficulties reaching toward and grasping objects she had visual access to, she experienced difficulties when the object was removed 2 s prior to the onset of her movement toward the object. The data indicate that the dorsal stream has no memory capacity of its own but must rely on the memory capacities of the ventral stream if grasping occurs without immediate visual access to the object.
Goodale, Jakobson, and Keillor’s results provide a compelling argument for thinking that we cannot cognitively access action-guiding dorsal-stream representations. When we access and manipulate information, we rely on our ability to keep information in our head for a short period of time (Miller, 1956). Information that cannot enter a “working memory” is inaccessible for cognitive manipulation. The very plausible hypothesis that the dorsal stream has no memory of its own thus suggests that action-guiding dorsal-stream representations cannot enter a “working memory” from which they can be accessed.
The possibility remains that action-guiding representations are stored in the ventral stream via feedback from the parietal lobe to striate cortical areas. We already know that ventral-stream information influences action-guiding dorsal-stream representations in delayed action and maybe even in online action, as the research on optic ataxic patient I.G. shows.
However, it is unlikely that action-guiding representations can be stored in visual or temporal areas. David Westwood and Goodale (2003) asked participants to estimate the size of rectangles located next to larger, smaller, or same-sized rectangular "flanking" objects. The participants consistently judged that the rectangles, when accompanied by the larger flanking objects, were smaller than they in fact were. In one series of no-vision grasping trials, the rectangles were not visible immediately prior to action. In a second series of prior-vision trials, the rectangles remained visible until the onset of action, after which further visual input was blocked. A robust "distorting" effect of the flanking objects on grip aperture was found only in the no-vision trials. Westwood and Goodale inferred that the time immediately prior to the onset of action is important in generating accurate grasping behavior. When the object is not visible immediately prior to the onset of action, the dorsal stream must retrieve perceptual representations of the object from memory storage facilities in the ventral stream.
As Westwood and Goodale point out, this is suggestive evidence that the dorsal stream does not generate an action-guiding representation until the exact time at which action is needed. This, in turn, indicates that action-guiding representations are not stored in memory but are continuously generated anew. There is no “short-term memory” for action-guiding representations.
A hypothesis defended by Jacob and Jeannerod (2003: chap. 6) provides a plausible explanation of why there is no short-term memory for action-guiding representations. Action-guiding dorsal-stream representations are coded in egocentric coordinates, whereas ventral-stream representations are coded primarily in an allocentric frame of reference. Information coded in egocentric coordinates cannot be stored in its original form but must be recoded in allocentric coordinates in order to enter storage space in the ventral stream (this recoding and decoding likely takes place in the primary visual cortex—Bullier et al., 2001). The recoding hypothesis also explains the finding by Schenk and Milner (2006) that visual agnosia patient D.F. appears to have "some degree of access" to dorsal stream information, which she can use as a cue in form discrimination.3 The hypothesis is furthermore consistent with the finding by Keira Ball et al. (2009) that while both allocentric and egocentric cues can be used in short-term spatial priming, egocentric cues seem more effective. It is quite plausible that a recoding of egocentric cues in allocentric coordinates undergirds the appearance that egocentric information can persist for several seconds.
Now, cognitive access to information likely requires working memory storage of some form or other (Westwood & Goodale, 2003). As there is no memory that can store action-guiding representations, action-guiding representations are not consciously accessible. As I argued earlier, information that is not accessible from working memory does not give rise to phenomenal consciousness. So action-guiding dorsal-stream representations do not correlate with phenomenal consciousness.
These considerations speak in favor of the semi-conservative view that there is a tripartite division between unconscious vision for action, conscious vision for perception, and unconscious vision for perception. This leaves out the fourth possibility of cognitive access to action-guiding representations. Our conscious ventral-stream representations of the world are transparent. The world reveals itself to us in our experiences, as it were. How our brains are planning to guide our hand through space toward an object and adjust hand aperture relative to object size is not transparent to us. Action-guiding representations are genuinely inaccessible to experience and cannot be brought to consciousness by shifting attention. In some sense we are never consciously aware of exactly what we are going to do.4
Note that while drawing requires motor control, it is not the case that the information available to the blindsight subjects was processed entirely by dorsal stream areas. The drawing was delayed, and as Milner and Goodale point out, delayed action requires input from the ventral stream (Milner & Goodale, 2008).
Jacob and Jeannerod (2003: 252–255) also present an interesting view of IPL and its role in visual awareness. They do not hold that there are conscious action-guiding representations but rather that information specified in egocentric coordinates requires recoding in order for it to enter higher ventral stream areas.
Strictly speaking, it was not form discrimination but form reporting. When D.F. was asked to report the shape of a visually presented object, she did better when she was also reaching for the object (Schenk & Milner, 2006). Since verbal reporting requires access to working memory, it seems plausible that information from the dorsal stream had entered the ventral stream via feedback processing and had undergone recoding.
I am grateful to three reviewers and David Chalmers for their helpful comments on an earlier version of this paper.