Dissociated modulations of multivoxel activation patterns in the ventral and dorsal visual pathways by the temporal dynamics of stimuli

Abstract
Introduction: Previous studies suggested temporal limitations of visual object identification in the ventral pathway. Moreover, multivoxel pattern analyses (MVPA) of fMRI activation have shown reliable encoding of various object categories, including faces and tools, in the ventral pathway. By contrast, the dorsal pathway is involved in reaching a target and grasping a tool, and is quicker in processing the temporal dynamics of stimulus change. However, little is known about how activation patterns in both pathways may change according to the temporal dynamics of stimulus change.
Methods: Here, we measured fMRI responses to two consecutive stimuli with varying interstimulus intervals (ISIs), and we compared how the two visual pathways respond to the dynamics of stimuli by using MVPA and information‐based searchlight mapping.
Results: We found that the temporal dynamics of stimuli modulate responses of the two visual pathways in opposite directions. Specifically, slower temporal dynamics (longer ISIs) led to greater activity and better MVPA results in the ventral pathway. However, faster temporal dynamics (shorter ISIs) led to greater activity and better MVPA results in the dorsal pathway.
Conclusions: These results are the first to show how the temporal dynamics of stimulus change modulate multivoxel fMRI activation patterns. Such temporal dynamic response functions in different ROIs along the two visual pathways may shed light on the functional relationship and organization of these ROIs.


| INTRODUCTION
It is widely assumed that the visual system consists of two distinct pathways. The ventral pathway, which projects from the primary visual cortex (V1) to the inferior temporal lobe, is involved in object identification (Ungerleider & Mishkin, 1982).
By contrast, the dorsal pathway projecting from V1 to the posterior parietal lobe is concerned with visually guided action, such as reaching a target and grasping a tool (Goodale & Milner, 1992).
However, relationships between the two pathways have also been proposed. For example, implied motion is perceived when observers have recognized animate objects in static pictures (Kourtzi & Kanwisher, 2000;Lorteije et al., 2006). As the dorsal pathway typically processes motion information to guide action, the perception of implied motion would involve both the ventral and dorsal pathways.
Indeed, visual implied motion was found to be encoded in the dorsal pathway, suggesting dynamic interactions between the two visual pathways (Lu, Li, & Meng, 2016). Similarly, object recognition sometimes relies on perceiving structure from motion (Kourtzi, Krekelberg, & Van Wezel, 2008; Murray, Olshausen, & Woods, 2003). Several studies found brain regions in both the ventral and dorsal pathways involved in structure-from-motion processing (Kourtzi, Bülthoff, Erb, & Grodd, 2002; Paradis et al., 2000; Wang et al., 1999). The integration of structure recognition and motion processing again reflects functional interactions between the two visual pathways.
Although the two visual pathways interact, the temporal dynamics of these interactions are unknown.
Several studies suggested more rapid processing in the dorsal pathway than in the ventral pathway: responses to visual stimuli with high temporal dynamics were found primarily in the dorsal pathway (Liu & Wandell, 2005; Stigliani, Jeska, & Grill-Spector, 2017), and perceptual integration may be formed quickly in the dorsal pathway (Liu, Wang, Zhou, Ding, & Luo, 2017).
By contrast, studies of implied motion suggested that the ventral pathway processes "what" information first, indicating that temporal processing in the ventral pathway would be faster than in the dorsal pathway.
Moreover, a few fMRI studies estimated how much information can be processed in a unit of time in the two visual pathways. For example, the univariate averaged BOLD response of the FFA peaked at a temporal rate of 4-5 items per second, suggesting a capacity limit of temporal processing (McKeeff, Remus, & Tong, 2007; Stigliani, Weiner, & Grill-Spector, 2015). Although another recent fMRI study examined how brain activity in the dorsal pathway is modulated by the temporal frequency of stimuli, the relationship between the two visual pathways in the capacity limit of temporal processing remains largely unclear (Liu & Wandell, 2005; Stigliani et al., 2017). Here, fMRI activity corresponding to watching images of faces and tools was measured to examine the temporal processing capacities of brain areas within the two visual pathways (e.g., FFA and SPL). Unlike previous fMRI studies, which analyzed only univariate averaged BOLD responses to investigate temporal capacity, our study employed multivoxel pattern analysis (MVPA). Compared with MVPA, univariate analysis may reveal object category encoding poorly (Chen, Garcea et al., 2017; Guo & Meng, 2015). Multivoxel activity patterns are also known to comprise faster temporal dynamics than univariate averaged BOLD responses (Kohler et al., 2013).
In addition, motivated by time-resolved fMRI studies (Carlson, Grol, & Verstraten, 2006; Dux, Jason, Asplund, & René, 2006; Formisano & Goebel, 2003; Ogawa et al., 2000), we examined the modulation by the temporal dynamics of stimuli by manipulating the interval between two stimulus images. By repeatedly sampling brain activity while participants repeatedly perform a task with temporal jitter, it is possible to discern the duration of a neurophysiological process. For example, the dynamic neural basis underlying dual-task limitation was investigated by using two stimulus-onset asynchronies (SOAs): the SOA between the two tasks was either 300 ms or 1,560 ms (Dux et al., 2006). It was hypothesized that the two tasks would interfere more in the short SOA condition than in the long SOA condition.
Because the response time to the second task became increasingly longer as the SOA decreased, it was deduced that the responses of brain regions whose temporal profile of activation tracked the time course of dual-task processing should be modulated by the varying SOA. Consistent with this notion, a neural network of frontal lobe areas was found to be a temporal processing bottleneck for multitasking (Dux et al., 2006). Similarly, the temporal dynamics of inferotemporal cortex activity in visual object recognition (Carlson et al., 2006), posterior parietal cortex activity in mental imagery (Formisano et al., 2002), and primary visual area activity in flash visual stimulation (Ogawa et al., 2000) were effectively estimated. Closely related to this idea, rapid serial visual presentation (RSVP) has been used to estimate the rate at which the visual system can process a series of objects (McKeeff et al., 2007; Robinson, Grootswagers, & Carlson, 2019; Stigliani et al., 2015).
Specifically, we investigated fMRI responses corresponding to participants watching two stimulus images that were serially presented. The interstimulus interval (ISI) between the first and second stimulus images varied at four levels (33, 67, 133, and 267 ms). Both univariate analysis and MVPA were conducted to evaluate the effect of ISI in the FFA (ventral pathway) and SPL (dorsal pathway).
Results of previous studies suggested that the capacity limit of temporal processing in the FFA is about 4-5 items per second (McKeeff et al., 2007; Stigliani et al., 2015). If the capacity limit of temporal processing were higher in the SPL than in the FFA, responses in the SPL might peak at shorter ISIs (i.e., 33 or 67 ms) rather than at longer ISIs (i.e., 133 or 267 ms). However, if responses in the SPL peaked at a rate similar to that of the FFA, this would suggest no dissociation of the capacity limit of temporal processing in the two visual pathways. Because the FFA and SPL were localized on the basis of object category selectivity (faces vs. tools), category selectivity and pathway could be confounded; to rule out this confound, we additionally performed searchlight mapping to identify brain regions in which responses decreased or increased as a function of ISI.

| Participants
Eighteen right-handed participants (8 male; ages 20-40) with normal or corrected-to-normal visual acuity participated in the experiment. Data from one participant were excluded from further analyses due to anatomical abnormalities revealed by structural MRI. The study was approved by the human subjects review committee of South China Normal University. All participants provided written informed consent.

| Functional localizer
Regions of interest (ROIs) in both visual pathways were functionally localized with separate scan runs by contrasting brain activation corresponding to an independent set of face images versus tool images that were not used in the main experimental runs. These ROIs included the FFA and SPL, which were preselected on the basis of the literature to avoid "double-dipping" analyses (Nikolaus, Kyle, Bellgowan, & Baker, 2009). To localize the functional ROIs, each participant was asked to complete two 336 s localizer runs. Each run consisted of 10 stimulus blocks interleaved with 11 fixation blocks. The stimulus blocks consisted of five face-image blocks and five tool-image blocks that were presented in a random order. In each stimulus block, there were 16 visual stimuli, and each visual stimulus was presented at the center of the screen for 500 ms, followed by a 500 ms fixation-only interval. Four of the stimulus images in each block could be presented repeatedly. To ensure that participants attended to the stimuli, they were asked to report whether each presented image was new.
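As a quick consistency check on the localizer timing, the stated run length can be recovered from the block structure. The sketch below uses only the numbers given in the text, plus one assumption that the text does not state: fixation blocks are taken to last as long as stimulus blocks. All variable names are illustrative.

```python
# Localizer run timing, using the numbers stated in the text.
N_STIM_BLOCKS = 10        # five face blocks + five tool blocks
N_FIX_BLOCKS = 11         # fixation blocks interleaved with the stimulus blocks
STIMULI_PER_BLOCK = 16
STIM_MS = 500             # image on screen
GAP_MS = 500              # fixation-only interval after each image

# Duration of one stimulus block in seconds.
stim_block_s = STIMULI_PER_BLOCK * (STIM_MS + GAP_MS) / 1000

# Assumption (not stated in the text): fixation blocks have the same duration.
fix_block_s = stim_block_s

run_s = N_STIM_BLOCKS * stim_block_s + N_FIX_BLOCKS * fix_block_s
print(f"stimulus block: {stim_block_s} s, full run: {run_s} s")
```

Under that assumption the 21 blocks of 16 s each add up to the 336 s reported for each localizer run.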

| Main experiment
Each participant performed ten main experiment runs. A slow event-related design was used. There were four experimental conditions: (a) a face was shown first, followed by a face (Face-Face: FF); (b) a tool was shown first, followed by a tool (Tool-Tool: TT); (c) a face was shown first, followed by a tool (Face-Tool: FT); (d) a tool was shown first, followed by a face (Tool-Face: TF). In each trial, two 100 ms images were presented in quick succession, as shown in Figure 1. Participants were asked to report whether the second stimulus image of each trial was a face or a tool. Critically, the ISI between the first and second image varied at four levels (33, 67, 133, and 267 ms). Thus, in total, each run consisted of 16 experimental trials (4 conditions × 4 ISIs). A blank display with a fixation point at the center of the screen was shown after the second stimulus image to make each trial 14 s long, and each run began with a 14 s period of such fixation-only display.
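The factorial structure of a run can be sketched as follows. This is an illustration only: the shuffled trial order and the function name `build_run` are assumptions, and the resulting run length is derived from the stated trial and lead-in durations, assuming that nothing follows the last trial.

```python
import itertools
import random

CONDITIONS = ["FF", "TT", "FT", "TF"]   # category of first image, then second image
ISIS_MS = [33, 67, 133, 267]
IMAGE_MS = 100                          # duration of each image
TRIAL_S = 14                            # trial length, padded with fixation
LEAD_IN_S = 14                          # fixation-only period at the start of a run


def build_run(seed):
    """Return one run: every condition x ISI pairing exactly once."""
    trials = [
        {"condition": cond, "isi_ms": isi}
        for cond, isi in itertools.product(CONDITIONS, ISIS_MS)
    ]
    # Assumption: trial order is randomized within a run (not stated in the text).
    random.Random(seed).shuffle(trials)
    for index, trial in enumerate(trials):
        trial["onset_s"] = LEAD_IN_S + index * TRIAL_S
        # The second image appears after the first image and the ISI have elapsed.
        trial["second_image_onset_ms"] = IMAGE_MS + trial["isi_ms"]
    return trials


run = build_run(seed=0)
run_duration_s = LEAD_IN_S + len(run) * TRIAL_S
print(len(run), "trials,", run_duration_s, "s per run")
```

With 16 trials of 14 s and a 14 s lead-in, this sketch gives 238 s per run; the text itself does not report the run length.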

| Preprocessing
Preprocessing was conducted by using AFNI (Cox, 1996). All EPIs were corrected for head movement, spatially smoothed with a 4 mm full width at half maximum (FWHM) filter, and corrected for linear drift to remove baseline drifts. Slice timing correction was conducted for the main experimental EPIs. All data were then transformed according to the Talairach template into normalized coordinates (Talairach & Tournoux, 1988).

| ROIs localization
To define functional ROIs, a whole-brain general linear model (GLM) analysis was performed. The right FFA in 15 out of the 17 participants was individually localized as a cluster of 20 or more contiguous voxels that showed significantly stronger activation for faces than for tools (p < 10^-3, uncorrected) in the right fusiform gyrus (Kanwisher et al., 1997). The right FFA for the other two participants was defined by comparing the face-image blocks with fixation blocks (p < 10^-30, uncorrected, cluster size >20 voxels). Similarly, the left SPL in 11 participants was individually localized as a cluster of 20 or more contiguous voxels that showed significantly stronger activation for tools than for faces (p < 10^-3, uncorrected) (Chao & Martin, 2000; Kristensen et al., 2016). The left SPL for the other six participants was defined by comparing the tool-image blocks with fixation blocks (p < 10^-7, uncorrected, cluster size >20 voxels).
The mean Talairach

| Univariate analysis
The time courses of BOLD signals were extracted by averaging percent signal change (PSC) across all voxels in each ROI. To calculate the PSC of each trial, the baseline was defined as the average of the activity at the last TR before trial onset and at the first TR of the trial. Consistent with previous studies, the PSC peaked at the third TR after stimulus onset (Aguirre, Zarahn, & D'Esposito, 1998; Lu et al., 2016; Miezin, Maccotta, Ollinger, Petersen, & Buckner, 2000). The PSC peak at the third TR was then used in subsequent multivariate analysis.
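A minimal sketch of the per-trial PSC computation described here is given below. The function name, the list-based time course, and the reading of "third TR after stimulus onset" as three TRs after the first TR of the trial are assumptions made for illustration, not details confirmed by the text.

```python
def percent_signal_change(roi_timecourse, onset_tr, peak_offset=3):
    """Peak PSC of one trial relative to its own baseline.

    roi_timecourse: ROI-averaged signal, one value per TR.
    onset_tr: index of the first TR of the trial.
    peak_offset: TRs between trial onset and the assumed response peak.
    """
    # Baseline: mean of the last TR before the trial and the first TR of the trial.
    baseline = (roi_timecourse[onset_tr - 1] + roi_timecourse[onset_tr]) / 2
    peak = roi_timecourse[onset_tr + peak_offset]
    return 100 * (peak - baseline) / baseline


# Toy time course (arbitrary units) with a trial starting at TR index 2.
toy_signal = [100.0, 100.0, 100.0, 101.0, 102.0, 103.0, 101.0, 100.0]
toy_psc = percent_signal_change(toy_signal, onset_tr=2)
print(toy_psc)
```

In the toy series the baseline is 100 and the value three TRs after onset is 103, so the function returns a PSC of 3 percent.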

| Multivariate pattern analysis
Multivariate pattern analysis (MVPA) was performed by using PyMVPA (Hanke et al., 2009), and the PSC peak values of all trials were employed. Through a leave-one-trial-out cross-validation procedure, pattern classification of FF versus TT (same category) was performed with linear support vector machines (SVMs). The prediction for each trial in the FF and TT conditions was exported from the classification as FF or TT and was then used to calculate classification accuracy for these two conditions at each of the four ISIs. The same analysis was also conducted for the different category conditions (FT and TF).

FIGURE 1 Slow event-related experimental design. Each trial was 14 s long. Left: In each trial, the stimuli were presented for 100 ms per image in succession. Critically, the ISI between the first and second stimulus varied at four levels (33, 67, 133, and 267 ms). Right: Four main stimulus conditions with the two successively presented stimulus images belonging to either the same category (FF or TT) or different categories (FT or TF)
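The leave-one-trial-out logic of this section can be illustrated with a small, dependency-free sketch. Note the substitution: a nearest-centroid classifier stands in for the linear SVM used in the study, and the toy two-voxel patterns, labels, and function names are all invented for illustration. The point is only how each trial is predicted by a model that never saw it, and how accuracy is then split by ISI.

```python
def centroid(patterns):
    """Voxel-wise mean of a list of activation patterns."""
    return [sum(column) / len(patterns) for column in zip(*patterns)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def leave_one_trial_out(patterns, labels):
    """Predict each trial from a classifier trained on all other trials."""
    predictions = []
    for held_out in range(len(patterns)):
        train = [
            (p, lab)
            for i, (p, lab) in enumerate(zip(patterns, labels))
            if i != held_out
        ]
        # Nearest-centroid stand-in for the linear SVM (an assumption of this sketch).
        centroids = {
            label: centroid([p for p, lab in train if lab == label])
            for label in sorted({lab for _, lab in train})
        }
        test_pattern = patterns[held_out]
        predictions.append(
            min(centroids, key=lambda lab: squared_distance(test_pattern, centroids[lab]))
        )
    return predictions


def accuracy_by_isi(predictions, labels, isis):
    """Classification accuracy computed separately for each ISI level."""
    result = {}
    for isi in sorted(set(isis)):
        idx = [i for i, value in enumerate(isis) if value == isi]
        result[isi] = sum(predictions[i] == labels[i] for i in idx) / len(idx)
    return result


# Toy data: two "voxels", FF trials load on the first, TT trials on the second.
toy_patterns = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9],
                [1.1, 0.0], [0.0, 1.1], [0.8, 0.3], [0.3, 0.8]]
toy_labels = ["FF", "FF", "TT", "TT", "FF", "TT", "FF", "TT"]
toy_isis = [33, 67, 33, 67, 133, 133, 267, 267]

toy_predictions = leave_one_trial_out(toy_patterns, toy_labels)
toy_accuracy = accuracy_by_isi(toy_predictions, toy_labels, toy_isis)
print(toy_accuracy)
```

Because the toy patterns are cleanly separable, every held-out trial is classified correctly; real ROI patterns would of course give accuracies between chance and ceiling.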

| Multivariate searchlight analysis
A whole-brain searchlight analysis was employed by using PyMVPA and MATLAB (Kriegeskorte, Goebel, & Bandettini, 2006). For each participant, activity patterns were extracted from a spherical searchlight with a two-voxel radius (33 voxels in each searchlight including the central voxel) that traversed all gray matter voxels. Then, MVPA was performed by using linear SVMs for each searchlight ROI corresponding to a central voxel (i.e., each voxel across the whole gray matter mask). To ensure independence between training and testing, cross-validations were performed using the leave-one-trial-out procedure, and then classification accuracy of the FF versus TT conditions was calculated. To further understand the dissociated modulations of the activation patterns between the two visual pathways, planned linear trend analyses of the effect of ISI were conducted. Then, the slope of the linear trend for classification accuracy as a function of ISI was calculated for each participant. After spatial smoothing (4 mm FWHM), statistical analysis (t test) across all participants was performed for each voxel. Finally, the slope for each searchlight ROI was mapped by using SUMA (AFNI surface mapper; Saad & Reynolds, 2012).
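The stated searchlight size can be checked by enumerating the voxel offsets that fall within two voxels of a center. The sketch below is not the PyMVPA implementation; the function names and the representation of the gray matter mask as a set of coordinate tuples are assumptions for illustration.

```python
import itertools


def sphere_offsets(radius_vox):
    """Integer voxel offsets whose Euclidean distance from the center is <= radius."""
    reach = int(radius_vox)
    axis = range(-reach, reach + 1)
    return [
        (dx, dy, dz)
        for dx, dy, dz in itertools.product(axis, repeat=3)
        if dx * dx + dy * dy + dz * dz <= radius_vox ** 2
    ]


def searchlight_voxels(center, mask, offsets):
    """Voxels of the sphere around `center` that lie inside the mask.

    mask: set of (x, y, z) tuples, a hypothetical stand-in for a gray matter mask.
    """
    cx, cy, cz = center
    return [
        (cx + dx, cy + dy, cz + dz)
        for dx, dy, dz in offsets
        if (cx + dx, cy + dy, cz + dz) in mask
    ]


offsets = sphere_offsets(2)
print(len(offsets))

# Toy mask: a 5 x 5 x 5 cube of voxels.
toy_mask = set(itertools.product(range(5), repeat=3))
full_sphere = searchlight_voxels((2, 2, 2), toy_mask, offsets)
corner_sphere = searchlight_voxels((0, 0, 0), toy_mask, offsets)
```

A two-voxel radius yields 33 offsets including the center, matching the figure in the text. Spheres centered near the edge of the mask contain fewer voxels, as the corner example shows.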

| Univariate averaged BOLD activity
To evaluate whether the results described above were driven by averaged BOLD responses of the ROIs, conventional univariate fMRI analyses were conducted for the averaged amplitudes of  Figure 2. The averaged BOLD activity of FF in the FFA increased as a function of ISI, with the response to the longest ISI (267 ms) being the peak activity. By contrast, the averaged BOLD activity of TT in the SPL decreased as a function of ISI, with the response to the shortest ISI (33 ms) being the peak activity. First, we conducted a three-way ANOVA to evaluate the effects of ROI (right FFA vs. left SPL), ISI (33, 67, 133, 267 ms), and category (FF vs. TT). The interaction between ROI and category was highly significant (F(1,32) = 41.439, p < 10^-6, ηp² = 0.564), and the main effect of ROI was marginally significant (F(1,32) = 3.452, p = .072, ηp² = 0.097), while all other effects and interactions were not significant (Fs < 1.096, n.s.). These results merely replicate that the FFA responded strongly to the FF condition whereas the SPL responded strongly to the TT condition.
For the same category conditions, two-way ANOVAs were con-

| Faster temporal processing capacity in the SPL than the FFA revealed by MVPA
The ROI-based MVPA results are shown in Figure 3. In the FFA (ventral), classification accuracy of the FF versus TT conditions significantly increased as a function of ISI (F(3,48) = 3.171, p < .05, ηp² = 0.165). By contrast, in the SPL (dorsal), classification accuracy of the FF versus TT conditions significantly decreased as a function of ISI (F(3,48) = 3.496, p < .05, ηp² = 0.179). These results suggest that the maximum temporal processing capacity of the FFA would correspond to 367 ms (267 ms ISI plus 100 ms stimulus display) or longer per stimulus, and that of the SPL to 133 ms (33 ms ISI plus 100 ms stimulus display) or shorter.
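The onset-to-onset intervals quoted here follow directly from the ISI plus the 100 ms stimulus display. The short sketch below reproduces that arithmetic; the conversion into an equivalent presentation rate in items per second is added only for illustration and is not a quantity reported in the text.

```python
STIMULUS_MS = 100
ISIS_MS = [33, 67, 133, 267]

# Onset-to-onset interval: stimulus display plus the following ISI.
soa_ms = {isi: isi + STIMULUS_MS for isi in ISIS_MS}

# Equivalent presentation rate (illustrative conversion, not reported in the text).
rate_per_s = {isi: 1000 / soa for isi, soa in soa_ms.items()}

for isi in ISIS_MS:
    print(f"ISI {isi:>3} ms -> onset-to-onset {soa_ms[isi]} ms, "
          f"{rate_per_s[isi]:.2f} items/s")
```

The shortest and longest ISIs give the 133 ms and 367 ms intervals cited for the SPL and FFA, respectively.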
For BA17, the effect of ISI was not significant (F(3,48) = 2.744, p > .05, ηp² = 0.146), suggesting that our results may not be driven by low-level stimulus properties. For comparison, classification accuracy of the FT versus TF conditions was neither significantly above chance level (ts < 1.758, ps > 0.098) nor modulated by ISI (Fs < 1.761, ps > 0.167). Presumably, the sluggish BOLD signal would lead to temporal mixing of the responses to face and tool stimuli.
Therefore, the classification accuracy of the FT versus TF conditions failed to reach statistical significance. However, previous studies suggested that temporal processing capacity could be assessed by BOLD signals corresponding to RSVP stimuli that belonged to a
For the same category conditions, to compare the FFA and SPL, which represent the two visual pathways respectively, we conducted a within-subject repeated-measures ANOVA to evaluate the effects of ROI (FFA vs. SPL) and ISI (33, 67, 133, 267 ms). The main effect of ROI was significant (F(1,16) = 6.828, p < .05, ηp² = 0.299), while the main effect of ISI was not significant (F(3,48) = 0.472, p > .05, ηp² = 0.029). Critically, the interaction between ROI and ISI was significant (F(3,48) = 5.795, p < .01, ηp² = 0.266), suggesting a marked dissociation of temporal processing capacity between the FFA and SPL.
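The partial eta squared effect sizes reported alongside the F statistics can be recovered from each F value and its degrees of freedom through the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error). The sketch below applies that identity to four of the values reported in this section; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported effect size) taken from the text above.
reported = [
    (3.171, 3, 48, 0.165),   # ISI effect in the FFA
    (3.496, 3, 48, 0.179),   # ISI effect in the SPL
    (6.828, 1, 16, 0.299),   # main effect of ROI
    (5.795, 3, 48, 0.266),   # ROI x ISI interaction
]

recomputed = [partial_eta_squared(f, df1, df2) for f, df1, df2, _ in reported]
for (f, df1, df2, stated), value in zip(reported, recomputed):
    print(f"F({df1},{df2}) = {f}: recomputed {value:.3f}, reported {stated}")
```

All four recomputed values agree with the reported ones to three decimal places.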

| Dissociation of temporal processing capacity along the two visual pathways revealed by results in other ROIs and multivariate searchlight analysis
Given that the most interesting findings of the present study were in in the ventral pathway. The positive linear trend indicated that the accuracies increased as a function of ISI in the fusiform tool area, unlike the results in the SPL/IPL (which respond more strongly to tools) in the dorsal pathway. Therefore, we think it is unlikely that category and pathway were confounded.
To further explore the temporal dynamics of object processing across the ventral and dorsal pathways, a multivariate searchlight analysis was performed on the main experiment data. It is worth noting that the localization of the FFA/SPL was independent of the searchlight analysis and that we had decided to select these ROIs before the searchlight analysis; thus, our selection of ROIs was not biased by the searchlight results. Figure 4 shows the map of brain areas with significant (p < .01) linear trends for the classification accuracy as a function of ISI. The dissociation between the two visual pathways is evident. Significant negative linear trends (blue colors) were found in the dorsal pathway, whereas significant positive linear trends (orange and yellow colors) were found mainly in the ventral pathway.
In addition, clusters (cluster size >40 voxels) with significant linear trends (p < .01) in the ventral and dorsal pathways are presented in Tables 1 and 2, respectively. Seven ROIs with significant positive linear trends were found in the ventral pathway, and four ROIs with significant negative linear trends were found in the dorsal pathway.
Taken together, these results suggest that the dorsal pathway would process rapidly presented stimuli more efficiently than slowly presented stimuli, whereas the ventral pathway would be the opposite and therefore slower than the dorsal pathway for processing the stimuli.
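The sign convention behind these linear trends can be made concrete with a small sketch: a least-squares slope of classification accuracy against ISI for each participant, followed by a one-sample t statistic on the slopes across participants. Regressing on the ISI values in milliseconds (rather than on equally spaced contrast weights) is an assumption of this sketch, and the toy numbers are invented.

```python
import math
import statistics

ISIS_MS = [33, 67, 133, 267]


def linear_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def one_sample_t(values):
    """t statistic testing whether the mean of `values` differs from zero."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))


# Toy participant whose accuracy rises with ISI (a "ventral-like" positive slope).
toy_accuracy = [0.5 + 0.001 * isi for isi in ISIS_MS]
slope = linear_slope(ISIS_MS, toy_accuracy)

# Toy group of per-participant slopes for one searchlight location.
toy_group_slopes = [0.001, 0.002, 0.003]
t_value = one_sample_t(toy_group_slopes)
print(slope, t_value)
```

A positive slope means accuracy improves at longer ISIs, as in the ventral clusters, and a negative slope means accuracy improves at shorter ISIs, as in the dorsal clusters.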

| DISCUSSION
We used MVPA and ISI manipulation to overcome the temporal delay of BOLD responses, and compared how the two visual pathways respond to the dynamics of visual stimuli. The MVPA results suggest that the temporal dynamics of stimuli led to dissociated modulations of activation patterns in the two pathways. Specifically, shorter ISIs (33 ms, 67 ms) led to better decodability for the FF versus TT conditions in the dorsal pathway. By contrast, longer ISIs (133 ms, 267 ms) led to better decodability for the FF versus TT conditions in the ventral pathway. In comparison, the effect of ISI was not significant for decoding the FT versus TF conditions, which is consistent with sluggish BOLD responses and suggests that our results were not driven by artifacts due to the variation in presentation time of the second stimulus.
Instead, our time-resolved approach revealed only the dynamic interaction between repeatedly presented stimuli.
Previous studies suggested a temporal limitation of object processing capacity in the ventral pathway by using RSVP (Gauthier, Eger, Hesselmann, Giraud, & Kleinschmidt, 2012; McKeeff et al., 2007; Stigliani et al., 2015). For example, face-selective areas and place-selective areas showed peak tuning at about 4-5 items per second. Consistent with the notion of limited temporal capacity in the ventral pathway, electrophysiological studies have revealed that neural responses were stronger for slower image presentation rates during RSVP (Keysers & Perrett, 2002; Keysers, Xiao, Földiák, & Perrett, 2001). However, the limitation of temporal capacity in the dorsal pathway was unclear. Moreover, previous studies analyzed only univariate averaged BOLD responses to investigate the temporal processing capacity in the ventral pathway, while univariate averaged BOLD responses are known to comprise slower temporal dynamics than MVPA (Kohler et al., 2013). Therefore, few studies were able to assess the temporal processing capacity in the dorsal pathway, assuming it would be much faster than that of the ventral pathway. Indeed, our results suggest that the dorsal pathway is most sensitive to temporal interactions between two rapidly presented stimuli at the shortest ISI (33 ms) we tested. Future studies are needed to examine how the dorsal pathway may respond to even faster stimuli with ISIs shorter than 33 ms.
Specifically, it has been proposed that while the ventral pathway
Most importantly, our whole-brain results (Figure 4) suggest that not only the left SPL but also four other ROIs (Table 1) in the dorsal pathway would process rapidly presented stimuli more efficiently than slowly presented stimuli, whereas not only the right FFA but also seven other ROIs (Table 2) in the ventral pathway would show the opposite pattern and would therefore be slower than the dorsal pathway in processing the stimuli. It is worth noting that, unlike the FFA and SPL, these seven ROIs in the ventral pathway and four ROIs in the dorsal pathway showed no significant category selectivity (faces vs. tools, p < 10^-4). This further indicates that category and pathway were not confounded.

In summary, our study found that the temporal dynamics of stimuli led to a dissociation of fMRI activation patterns in the two visual pathways. Given that activation patterns may reflect population responses of neuron ensembles, temporal encoding in the dorsal pathway appears to be faster than in the ventral pathway. These findings may shed light on the functional relationship and organization of the two visual pathways. Methodologically, shortening the ISI enables us to assess the temporal profile of object category decoding, which has significant potential for studying fMRI response patterns in the subsecond dynamic range. Future work can adopt a similar time-resolved paradigm in combination with MEG or EEG and further investigate the fine-scale temporal profile of object encoding.

ACKNOWLEDGMENTS
We

CONFLICT OF INTEREST
None declared.