In this study we report four main findings related to the initial response component (~40 ms) elicited by viewing forms. First, the response amplitude for lines is stronger than that for rhomboids (Fig. 5A). Second, latencies for lines and rhomboids do not differ significantly (Table 1). Third, estimated sources for both lines and rhomboids are distributed in both striate and prestriate cortices (Table 3). Fourth, DCM shows that a multi-input model is the best fit to our data for both lines and rhomboids (model 6 in Fig. 4). In earlier evoked response studies, it was widely thought that the N75 response, with a latency of ~75 ms after stimulus onset, is the earliest visual evoked response component in the cortex (Nakamura et al., 1997; Tobimatsu & Celesia, 2006); however, some previous MEG studies have shown that there is a very early response, ~40–50 ms after stimulus onset, in V1 (P50m, Nakamura et al., 1997; 37M, Inui et al., 2006) and V5 (ffytche et al., 1995). In terms of latency and source locations, the initial response we detected may in fact correspond to the 37M response reported by Inui et al. (2006). Thus our findings shed light on the initial stage of form perception in the visual cortex.
These results speak in favour of a parallel strategy within one of the parallel processing systems of the visual brain, the form system. Our results gain strength from previous studies on the latency of activation for various visual areas, which have shown considerable overlap between different visual areas, including between areas V1 and V2, thus casting doubt on strict hierarchical processing (Schmolesky et al., 1998; Schroeder et al., 1998; Nowak et al., 1995). However, these earlier studies used flash stimuli, which may not always be the optimal stimuli for activating, with short latencies, areas that have concentrations of cells with particular and exigent requirements. An interesting example here is that of V5, which is heavily involved with visual motion (Zeki, 1974; Watson et al., 1993; Orban et al., 1995). The latency of activation in that area is 28–32 ms after onset of a fast-moving stimulus (> 10 °/s) and 74 ms after onset of a slow-moving stimulus (< 5 °/s; ffytche et al., 1995); hence V5 is activated before V1 with the former and after it with the latter, a finding that flashed stimuli would not have revealed. This has led to the suggestion that there is dynamic parallelism in activation of V5, depending upon the speed configuration of the stimulus (ffytche et al., 1995). Moreover, the work of Schoenfeld et al. (2003) shows that latency can be modulated by other factors such as attention. In the present study, we confined ourselves to stimuli composed of lines, which are known to activate OS cells in V1, V2 and V3 (Hubel & Wiesel, 1962, 1965; Zeki, 1978b; Yacoub et al., 2008; Aspell et al., 2010; Freeman et al., 2011; Tong et al., 2012), the three visual areas in which we principally wished to compare directly the latency of activation produced by the same two stimuli.
The results showed that simple forms (lines) produced a stronger initial response than complex ones (rhomboids), with little difference in the latency of that response (~40 ms), a finding that cannot be accounted for by feedback, which of course is known to play an important role in regulating the properties of cells in V1 (Lamme & Spekreijse, 2000; Murray et al., 2002). That this was not due to a failure of the MEG technique to detect differences is shown by its ability to detect a latency difference in the main response for nasal quadrant stimulation (see Table 2). Moreover, the DCM analysis suggests a strong preference for the parallel model in form perception, involving feed-forward as well as feedback connections between striate and prestriate cortex, with both areas receiving primary visual input.
Our results lead us to conclude that the perceptual hierarchy of forms is not mirrored by a sequential temporal hierarchy. This of course does not imply that a hierarchical strategy is not used within each area, as apparently is the case in V1 and V2 (see, for example, Alonso & Martinez, 1998; Martinez & Alonso, 2001) although even here a parallel operation may be at work, reflected in the fact that there is also little or no difference in onset and offset latencies for two categories of cell, the simple and complex ones, in the hierarchical chain (Bair et al., 2002). But our results here, as well as previous studies, suggest that if a hierarchical strategy is used, it must be used in parallel in each of the three areas, at least in the context of the stimuli that we have used.
There are three potential limitations to this study. (i) We only identified an initial response, at ~40 ms, in about half of our measurements (Table 1). Previous studies have also shown that the early components of visual responses are not always identified: Shigeto et al. (1998) reported the N75m response in 75% of cases, and Nakamura et al. (1997) reported that they could detect P50m in only a few cases, even with the use of powerful stimuli such as black-and-white checkerboard pattern reversals. This low detection ratio naturally raises suspicions about the response. However, we successfully estimated the sources in appropriate locations in a group-level (between-subjects) analysis, which has a higher sensitivity than sensor-level analysis. Furthermore, we identified the laterality between hemispheres and amplitude differences between forms at the sensor level across subjects. In addition, the probability of observing an effect at P = 0.05 (i.e. greater than average + 2 SD) in 10 out of 20 individuals is P = 1.3 × 10⁻⁸; in other words, the finding of an effect, even in only half of the subjects, is very unlikely to have happened by chance. These findings indicate that there is a very early (initial) response even though the peak detection rate in individual subjects was just over 50%. (ii) Inui et al. (2006) detected their very early response (37M) in all subjects in spite of using flash stimulation. They used a 37-channel axial-type first-order biomagnetometer rather than a whole-head system, and the intensity of their stimuli was 370 lux at eye position. Non-whole-head sensors allow for shorter distances between visual cortex and sensors, while flash stimulation (which is very bright) activates a larger number of neurons in occipital cortex. These more favourable conditions might have allowed them to detect the very early response at sensor level at a higher rate than we did.
(iii) To resolve any doubt that the initial response might be a filter artifact produced by P100m after using the 13-Hz high-pass filter, we also used a forward filter, which does not produce such an artifact before P100m. This filter erased the artifact just before P100m (~70 ms) but the initial response did not disappear. In addition, if the initial response were a filter artifact produced by the P100m response, we would expect a larger P100m to produce an ‘initial response’ more often than a smaller P100m. But there was no correlation (P = 0.47) between P100m amplitude and ‘initial response’ occurrence in our data; mean amplitude of P100m was 1.87 × 10³ fT when we detected the initial response and 2.03 × 10³ fT when we did not. Furthermore, contour maps for the initial response and the main response were different (Fig. 2). We conclude that the initial response is not a mere artifact of filtering related to P100m.
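The chance-level argument made under limitation (i) can be illustrated with a simple binomial calculation. The sketch below is not the computation used in this study; it assumes that subjects are independent and that each has a 0.05 probability of showing a spurious "effect", and the variable and function names are purely illustrative. The exact value obtained depends on these assumptions (for example, whether one counts exactly 10 or at least 10 detections), so it need not match the figure quoted above to the last digit.

```python
from math import comb


def binomial_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Illustrative assumptions (not taken from the study's analysis code):
n_subjects = 20   # number of individuals
n_detected = 10   # individuals showing the effect
alpha = 0.05      # assumed per-subject probability of a chance "effect"

# Probability of exactly 10 chance detections out of 20
p_exact = (
    comb(n_subjects, n_detected)
    * alpha**n_detected
    * (1 - alpha) ** (n_subjects - n_detected)
)

# Probability of 10 or more chance detections out of 20
p_tail = binomial_tail(n_subjects, n_detected, alpha)

print(f"P(exactly {n_detected} of {n_subjects}) = {p_exact:.2e}")
print(f"P(at least {n_detected} of {n_subjects}) = {p_tail:.2e}")
```

Under these assumptions both probabilities are of the order of 10⁻⁸, which is the same order of magnitude as the value given in the text and supports the conclusion that detection in half of the subjects is very unlikely to be a chance occurrence.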
Despite these potential limitations, our findings strongly suggest the existence of parallel processing streams in the visual form system.