Correspondence should be sent to Janet H. Hsiao, Department of Psychology, University of Hong Kong, Room 627, 6/F, The Jockey Club Tower, Centennial Campus, Pokfulam Road, Hong Kong. E-mail: firstname.lastname@example.org
Through computational modeling, here we examine whether visual and task characteristics of writing systems alone can account for lateralization differences in visual word recognition between different languages without assuming influence from left hemisphere (LH) lateralized language processes. We apply a hemispheric processing model of face recognition to visual word recognition; the model implements a theory of hemispheric asymmetry in perception that posits low spatial frequency biases in the right hemisphere and high spatial frequency (HSF) biases in the LH. We show two factors that can influence lateralization: (a) Visual similarity among words: The more similar the words in the lexicon look visually, the more HSF/LH processing is required to distinguish them, and (b) Requirement to decompose words into graphemes for grapheme-phoneme mapping: Alphabetic reading (involving grapheme-phoneme conversion) requires more HSF/LH processing than logographic reading (no grapheme-phoneme mapping). These factors may explain the difference in lateralization between English and Chinese orthographic processing.
Research on visual word recognition has consistently reported a left hemisphere (LH) lateralization effect in processing words in alphabetic languages. For example, a classical right visual field (RVF)/LH advantage in processing English words or letter strings has been consistently reported in various tachistoscopic word recognition tasks, such as tachistoscopic identification of letters/nonsense words (e.g., Bryden & Rainey, 1963; Fontenot, 1970), word naming (e.g., Bradshaw & Gates, 1978; Brysbaert & d'Ydewalle, 1990; Leehey & Cahn, 1979), and lexical decision tasks (e.g., Faust, Babkoff, & Kravetz, 1995; Babkoff, Genser, & Hegge, 1985; Barry, 1981; Brand, van Bekkum, Stumpel, & Kroeze, 1983; Measso & Zaidel, 1990; see also Chiarello, Nuding, & Pollock, 1988). This RVF/LH advantage has also been observed in the recognition of words in other alphabetic languages, such as in word matching tasks in German (e.g., Hausmann, Güntürkün, & Corballis, 2003; Hausmann, Durmusoglu, Yazgan, & Güntürkün, 2004), lexical decision in Spanish (e.g., Hernandez, Nieto, & Barroso, 1992), tachistoscopic word naming (e.g., Bentin, 1981) and lexical decision in Hebrew (e.g., Koriat, 1985), and lexical decision in Japanese phonetic script kana (e.g., Hanavan & Coney, 2005). Consistent with the behavioral data, fMRI studies have shown a region inside the left fusiform area (the Visual Word Form Area) that responds selectively to visual words and pseudo-words that follow orthographic regularities in English; this area has been argued to reflect expertise in abstract visual word form perception and to involve processes that are separate from higher order linguistic processing (e.g., McCandliss, Cohen, & Dehaene, 2003).
Event-related potential (ERP) studies also show that words elicit a larger early visual component N170 in the LH than the right hemisphere (RH), in tasks such as repetition detection (Maurer, Zevin, & McCandliss, 2008; Maurer, Brandeis, & McCandliss, 2005), target detection (i.e., an oddball task; Bentin et al., 1981), and even a word orientation judgment task, which involved little phonological/semantic processing (e.g., Rossion, Joyce, Cottrell, & Tarr, 2003). Consistent with these findings, Magnetoencephalography (MEG) studies also showed that early visual processing of word or letter string stimuli is associated with responses at around 150 ms in the left inferior occipito-temporal cortex (Helenius, Tarkiainen, Cornelissen, Hansen, & Salmelin, 1999; Tarkiainen, Cornelissen, & Salmelin, 2002).
This RVF/LH advantage in visual word recognition in alphabetic languages has been argued to be due to the LH lateralization in language processing (e.g., Voyer, 1996; Maurer & McCandliss, 2007). Nevertheless, this claim has been challenged by at least one counterexample, that is, the recognition of Chinese characters. In contrast to the RVF/LH advantage in the recognition of English words, the recognition of Chinese characters, a logographic writing system, has been shown to have a left visual field (LVF)/RH advantage in tachistoscopic character identification/naming (e.g., Tzeng, Hung, Cotton, & Wang, 1979; Cheng & Yang, 1989) and lexical decision tasks (e.g., Leong, Wong, Wong, & Hiscock, 1985). In addition, Hsiao and Cottrell (2009) showed that Chinese readers (experts) have a perceptual bias toward the left side of characters (from the viewer's perspective) when judging character similarities, whereas non-Chinese readers (novices) do not have any bias; a similar effect was also observed in face perception (Fig. 1): A chimeric face made from two left-half faces is usually judged more similar to the original face than the one made from two right-half faces (Gilbert & Bakan, 1973). This left-side bias effect suggests the RH involvement in both face and expert Chinese character processing (e.g., Burt & Perrett, 1997). Yang and Cheng (1999) contrasted orthographic and phonological processing of Chinese character recognition, and showed that in a tachistoscopic character matching task, when the orthographic similarity of two alternative items for choice was manipulated, there was a LVF/RH advantage; in contrast, when the phonological similarity of two alternative items for choice was manipulated, there was a RVF/LH advantage (see also Leong et al., 1985). This result suggested RH lateralization in orthographic processing and LH lateralization in phonological processing of Chinese characters.
In short, Chinese character recognition has been shown to involve less LH processing/more RH processing compared with word recognition in alphabetic languages, in particular in orthographic/visual word form processing. Consistent with the behavioral data, although most of the existing fMRI studies on Chinese character recognition used character pronunciation or semantic tasks, they generally showed more RH-lateralized activation relative to English reading, particularly in the visual areas (e.g., Tan et al., 2000, 2001). In contrast to Chinese character recognition, the recognition of Chinese two-character words has been shown to have a RVF/LH advantage in tachistoscopic word identification tasks (e.g., Tzeng et al., 1979; Cheng & Yang, 1989; note, however, that Keung & Hoosain, 1989, reported a RH advantage in processing high stroke number low frequency two-character Chinese words with short exposure time and low luminance). Nevertheless, fMRI studies of Chinese word pronunciation also in general showed more bilateral or RH-lateralized activation in the occipitotemporal system, in contrast to the typically LH-lateralized occipitotemporal activation in English word reading (according to a meta-analysis by Tan, Laird, Li, & Fox, 2005; see also Fu, Chen, Smith, Iversen, & Matthews, 2002; Chen, Fu, Iversen, Smith, & Matthews, 2002; Kuo et al., 2001). In ERP studies of Chinese character pronunciation, source localization data have also shown more RH-lateralized or bilateral processing in the occipitotemporal region, in contrast to the LH lateralization observed in English word pronunciation (e.g., Liu & Perfetti, 2003; Hsiao, Shillcock, & Lee, 2007).
The RH advantage in orthographic processing of Chinese characters has been argued to reflect the RH superiority in handling holistic pattern recognition (Tzeng et al., 1979). Nevertheless, findings in later studies do not support this claim. For example, Cheng and Yang (1989) showed no laterality effect in the tachistoscopic identification of non-characters and pseudo-characters, suggesting that this RH advantage may be related to lexical knowledge of Chinese characters or learning experience. Also, in contrast to Tzeng et al.'s claim, Hsiao and Cottrell (2009) showed a reduced holistic processing effect in Chinese character perception in Chinese readers compared with non-Chinese readers.1 Thus, it remains unclear why Chinese character recognition differs from the recognition of English words in terms of hemispheric lateralization, particularly in the visual system.
Here, we aim to examine potential visual and task characteristic factors that may influence hemispheric lateralization in visual word recognition, in order to better understand why word recognition in different language systems involves different hemispheric lateralization in the visual system. More specifically, we hypothesize that the laterality of language processing is not necessary to account for hemispheric lateralization in visual word recognition, and that differences in visual characteristics and in the word-to-sound mapping task alone can account for lateralization differences in visual word recognition between languages. We adopt a computational modeling approach to elucidate the mechanism underlying the potential visual and task characteristic effects, as modeling gives us better control over variables that are difficult to tease apart in human subject studies; for example, it allows us to simulate a brain without language lateralization and thus to examine visual and task characteristic effects without the influence of language lateralization. Our model is a trainable, generic learning model that follows known observations about visual anatomy and neural computation. We introduce our computational model below.
Anatomical evidence shows that our visual field is split along the vertical midline, with the two visual hemifields initially contralaterally projected to different hemispheres (e.g., Nolte, 2002). Hsiao, Shieh, and Cottrell (2008b) conducted a hemispheric modeling study of face recognition, aiming to account for the left-side bias effect in face perception. The left-side bias effect in face perception refers to the phenomenon that a chimeric face made from two left-half faces is usually judged to be more similar to the original face than the one made from two right-half faces (e.g., Gilbert & Bakan, 1973; Fig. 1); it has been argued to be an indication of the RH involvement in face processing (e.g., Burt & Perrett, 1997). Hsiao et al. (2008b) showed that their hemispheric processing model was able to account for the left-side bias effect in face perception. The model incorporates several known observations about visual anatomy and neural computation: Gabor filters (linear filters that have spatial frequency and orientation information of an input image) are used over the input images to simulate neural responses of cells in the early visual system (Lades et al., 1993); principal component analysis (PCA), a generic linear compression technique for dimensionality reduction that has been suggested to be biologically plausible (Sanger, 1989), is used to simulate possible information extraction processes beyond the early visual system. This PCA representation then is used as the input to a two-layer neural network (Fig. 2). In addition, the model implements a theory of hemispheric asymmetry in perception, double filtering by frequency (DFF) theory, proposed by Ivry and Robertson (1998; see also Sergent, 1982). We describe the background of this theory below.
In visual perception, how we process the global form and local features of a stimulus has been extensively studied. Navon (1977) proposed that the global form is unavoidably processed before the local features (i.e., the “global precedence hypothesis”). This effect was later shown to depend on hemispheric differences in processing local and global features; more specifically, it has been consistently shown that there is a RVF/LH advantage for responses to local features and a LVF/RH advantage for responses to global features of a stimulus (e.g., Sergent, 1982; Ivry & Robertson, 1998; Van Kleeck, 1989; Delis, Robertson, & Efron, 1986; Robertson & Delis, 1986; Robertson, Lamb, & Zaidel, 1993; Martinez et al., 1997; Proverbio, Minniti, & Zani, 1998; Han et al., 2002; Weissman & Woldorff, 2005; Flevaris, Bentin, & Robertson, 2010). For example, Sergent (1982) presented hierarchical letter patterns, that is, a big letter pattern that is composed of many small letters (Navon, 1977) in participants’ LVF or RVF very briefly and asked them to detect a target letter regardless of whether the target letter was the big letter (the global form) or the small letters (i.e., the local form). She showed that participants were faster at detecting a target letter in the global form than the local form when the stimulus was presented in the LVF/RH, and vice versa when the stimulus was in the RVF/LH. She thus concluded that the global precedence phenomenon was a characteristic of the RH but not the LH. In addition, she referred to the global and local form processing as having different spatial frequency content: low spatial frequency (LSF) for the global form, and high spatial frequency (HSF) for the local form. 
Accordingly, the hemispheric asymmetry in processing global and local features is related to the hemispheric difference in processing low- and high-frequency information: There is a RH advantage in processing low-frequency information, and a LH advantage in processing high-frequency information. This hemispheric asymmetry in spatial frequency processing has been supported by several follow-up studies (see Ivry & Robertson, 1998) using different tasks, including face recognition (e.g., Keenan, Whitman, & Pepe, 1989), spatial frequency identification (e.g., Kitterle, Christman, & Hellige, 1990), and grating discrimination (e.g., Proverbio, Zani, & Avella, 2002), as well as by fMRI (e.g., Han et al., 2002) and electroencephalogram (EEG) studies (e.g., Flevaris et al., 2010).
Nevertheless, some studies using grating detection or contrast sensitivity tasks failed to show this hemispheric asymmetry effect in spatial frequency processing (e.g., Kitterle et al., 1990; Rijsdijk, Kroon, & Van der Wildt, 1980; Di Lollo, 1981; Peterzell, Harvey, & Hardyck, 1989; Fendrich & Gazzaniga, 1990; Peterzell, 1991). Thus, some researchers proposed that this hemispheric asymmetry in spatial frequency processing must emerge at a higher stage of perceptual processing beyond the sensory level (e.g., Sergent, 1982; Hsiao, Shahbazi, & Cottrell, 2008a; Hsiao, Cipollini, & Cottrell, in press; Heinze, Hinrichs, Scholz, Burchert, & Mangun, 1998). Consistent with this argument, Heinze et al. (1998) showed in an ERP study that the hemispheric asymmetry effect in processing global and local features was observed in the N2 component but not in the earlier, sensory-evoked P1 component. fMRI studies also showed that the locus of this asymmetry is in the inferior parietal lobe/superior temporal gyrus region (e.g., Weissman & Woldorff, 2005; Robertson, Lamb, & Knight, 1988; Fink et al., 1997) and the occipital/occipitotemporal region (Han et al., 2002; Martinez et al., 1997).
Accordingly, Ivry and Robertson (1998) proposed the DFF theory and argued that information coming into the brain goes through two frequency filtering stages: The first stage involves attentional selection of a task-relevant frequency range. At the second stage, the LH amplifies high-frequency information, whereas the RH amplifies low-frequency information. This differential frequency bias in the two hemispheres is implemented in Hsiao et al.'s (2008b) model using two sigmoid weighting functions to assign different weights to the Gabor responses of different frequency scales in the two hemispheres (Fig. 2).
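As a rough illustration of the second filtering stage, the sketch below assigns a sigmoid weight to each of five Gabor frequency scales, with one curve favoring high frequencies (LH) and its mirror image favoring low frequencies (RH). The slope, the midpoint, and the function name `sigmoid_weights` are assumptions made for illustration; the parameters actually used by Hsiao et al. (2008b) are not given here.

```python
import math

def sigmoid_weights(n_scales, slope=1.0, favor_high=True):
    """Return one weight per frequency scale (index 0 = lowest frequency).

    slope and the centred midpoint are illustrative assumptions, not the
    values used in the original model.
    """
    mid = (n_scales - 1) / 2.0          # centre the sigmoid on the middle scale
    sign = 1.0 if favor_high else -1.0  # flip the curve for an LSF bias
    return [1.0 / (1.0 + math.exp(-sign * slope * (s - mid)))
            for s in range(n_scales)]

lh_weights = sigmoid_weights(5, favor_high=True)   # LH: amplifies HSF scales
rh_weights = sigmoid_weights(5, favor_high=False)  # RH: amplifies LSF scales
```

Each Gabor response would then be multiplied by the weight of its scale before further processing; in the unbiased case all weights would simply be equal.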
Here, we apply Hsiao et al.'s (2008b) hemispheric processing model to visual word recognition, to examine whether visual and task characteristics of a writing system alone are able to account for differences in hemispheric lateralization between different languages, without assuming the influence of language processing being LH lateralized (as this is not implemented in the model). We hypothesize that at least two factors other than language lateralization may influence hemispheric lateralization in visual word recognition; we manipulate these two factors during the model training and examine the resulting effects:
(1) Visual similarity among words in the lexicon
The more similar words look visually to one another in the lexicon, the more HSF information is required to recognize them; this leads to more LH lateralization. We hypothesize that at least two factors may influence visual similarity among words in the lexicon:
(i) Number of letters shared among words in the lexicon: The more letters are shared among words in the lexicon, the more similar the words look visually to one another. This factor is influenced by the ratio between alphabet size (i.e., the number of letters in the alphabet) and lexicon size (i.e., the number of words in the lexicon); that is, given a fixed lexicon size, the smaller the alphabet size is, the more letters may be shared among words in the lexicon, and thus the more similar the words look visually to one another. Similarly, given a fixed alphabet size, the larger the lexicon size is, the more letters may be shared among words, and thus the more similar the words look visually.
(ii) Similarity among letters in the alphabet: As letters are components of words, the more similar the letters look visually to one another, the more similar the words look visually to one another, even when they do not share common letters. This factor may be influenced by the number of letters in the alphabet; that is, given a fixed representation space for all possible letters, when we gradually increase the number of letters in the alphabet, it becomes more likely that some letters will look similar to each other (i.e., close to each other in the space).
According to these two factors, we predict that with a fixed lexicon size, when we gradually increase the alphabet size, the model will first exhibit more and more LSF reliance since the words in the lexicon will share fewer and fewer common letters (factor (i)); when the letters in the alphabet start to look visually similar to one another because of the alphabet size increase, the model will start to exhibit reduced LSF reliance (factor (ii)). In other words, we expect that there will be an inverted U-shaped curve in LSF reliance/RH lateralization of the model when we gradually increase the alphabet size given a fixed lexicon size.
(2) Requirement to decompose a word into its constituent graphemes for grapheme-phoneme mapping
Maurer and McCandliss (2007) proposed the phonological mapping hypothesis to account for the difference in ERP N170 lateralization between faces and words: N170 has been found to be larger in the RH compared with the LH in face recognition, whereas in the recognition of English words, it is larger in the LH compared with the RH (e.g., Rossion et al., 2003). They argued that given phonological processes are typically LH lateralized (e.g., Rumsey et al., 1997), specialized processing of visual words in visual brain areas also becomes LH lateralized. Accordingly, the LH lateralization of N170 may be specifically related to the influence of grapheme-phoneme conversion established during learning to read. According to this hypothesis, this phonological modulation should be less pronounced in logographic scripts such as Chinese (Maurer & McCandliss, 2007). Nevertheless, it remains unclear why grapheme-phoneme conversion will lead to more LH phonological processing than logographic reading, which involves mapping from logograms to corresponding pronunciations at the syllable level. Thus, in contrast to the phonological mapping hypothesis, here we test the hypothesis that the LH lateralization in word recognition in alphabetic languages is due to the requirement to decompose a word into graphemes (either sequentially or in parallel, although the model assumes parallel processing), which is not a requirement for logographic reading, without assuming phonological processes being LH lateralized.
Here, we test these hypotheses through two simulations. In the first simulation, we contrast the effect of two mapping tasks, word identity mapping and letter identity mapping tasks, with the input stimuli controlled by using the same English pseudo-words in both tasks. In the word identity mapping task, the model learns to distinguish different words (similar to logographic reading), whereas in the letter identity mapping task, the model learns to identify the constituent letter in each letter position of the input word (similar to alphabetic reading). We predict that in the word identity mapping task, with a fixed lexicon size, when we gradually increase the alphabet size, the model will first rely more on LSF information, since words become visually more dissimilar to one another; this LSF reliance will then decrease when words start to become more similar to one another due to increased letter similarity in a large alphabet. In addition, we predict that the letter identity mapping task will require more HSF information (LH lateralization) compared with the word identity mapping task, because words may be differentiated by word outline shapes (LSF information) without letter identification; in contrast, letter identity mapping involves decomposition of a word into its constituent letters for identification, and thus relatively higher spatial frequency information is required.
In the second simulation, instead of mapping word image input to either word or letter identities, we model visual word pronunciation by mapping word images to pronunciations. Two pronunciation conditions are created: In the alphabetic reading condition, each grapheme of a word maps to a consonant or vowel in the pronunciation systematically, whereas in the logographic reading condition, each word maps to a pronunciation (at the syllable level) randomly without a systematic relationship between its graphemes and the phonemes in the pronunciation. As in the first simulation, we use the same stimuli, English pseudo-words, in both conditions to control for the difference in input when comparing the two mapping tasks. We expect that the alphabetic reading condition will require more HSF information (LH lateralization) compared with the logographic reading condition, due to the requirement to decompose a word into its graphemes to map them to corresponding phonemes in the pronunciation.
2. Methods and results
To test our hypotheses, we applied the intermediate convergence model proposed by Hsiao et al. (2008b) to visual word recognition. In the model, the input word images were first filtered with a rigid grid of overlapping 2D Gabor filters (Daugman, 1985). Gabor filters are linear filters that can be used for edge detection in image processing, and the frequency and orientation representations of Gabor filters have been shown to be similar to neural responses of cells in the early visual system (e.g., Lades et al., 1993; Ringach, 2002). At each grid point, we used Gabor filters of eight orientations and a fixed number of frequency scales. The number of scales used depended on the task-relevant frequency range, which was determined according to the smaller dimension of the images; the highest frequency scale did not exceed the smaller dimension of the images (following Hsiao et al., 2008b). In the current simulations, the dimensions of the English pseudo-word images used were 35 × 100 (Fig. 3a). Thus, the number of scales used was five, corresponding to 2–32 (2¹–2⁵) cycles along the shorter side of the images (the sixth scale would have 64 cycles along the shorter side, and hence one cycle would cover less than one pixel). We applied the Gabor filters to a 5 × 18 grid of points on each word image. Thus, each image was transformed into a 3600-element perceptual representation (5 × 18 sample points × 8 orientations × 5 scales).
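The arithmetic in this paragraph can be checked with a short sketch. It assumes that scale k has 2^k cycles along the shorter image side and that a scale is kept only while its cycle count does not exceed that side in pixels; this is one reading of the rule described above, and the variable names are illustrative.

```python
image_height, image_width = 35, 100
short_side = min(image_height, image_width)

# Scale k has 2**k cycles along the shorter side; keep scales whose cycle
# count does not exceed the shorter side (so one cycle spans >= 1 pixel).
scales = []
k = 1
while 2 ** k <= short_side:
    scales.append(2 ** k)
    k += 1

grid_rows, grid_cols = 5, 18   # sample points on each word image
n_orientations = 8             # Gabor orientations per grid point
n_features = grid_rows * grid_cols * n_orientations * len(scales)

print(scales)       # → [2, 4, 8, 16, 32]
print(n_features)   # → 3600
```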
After obtaining the Gabor response representation, two conditions were created: the baseline condition and the biased condition. In the baseline condition (the unbiased, control condition), Gabor responses in different frequency scales were given equal weights (i.e., no frequency bias), whereas in the biased condition, we implemented the second stage of the DFF theory by using a sigmoid weighting function to give more weights to the LSF scales in the Gabor responses of the left-half word (RH), and to give more weights to the HSF scales in the Gabor responses of the right-half word (LH) (Fig. 2). The perceptual representation of the left- and right-half words was compressed using PCA into a 50-element representation each (100 elements in total, following Hsiao et al., 2008b)2. This PCA representation then was used as the input to a two-layer neural network; thus, the input layer of the neural network had 100 nodes (Fig. 2); nodes in each layer were fully connected with the next layer. The hidden layer contained 20 nodes (see Hsiao et al., 2008b, for more simulation details). The output layer of the neural network contained corresponding output representation of the model, such as word identity for a word recognition task, in which each output node corresponds to a word identity (the number of nodes in the output layer depended on the task the model performed). The network was trained to map each input representation to the corresponding output by adjusting the connection weights.
We trained our neural network model to recognize the word images until either the mean squared error was less than 0.00001, or the number of training epochs reached 10,000. In all simulations reported here, after training the performance on the training set reached 100% accuracy. The network responded correctly if the correct output node was more active than any other output nodes. The training algorithm was gradient descent with an adaptive learning rate (implemented by the Matlab Neural Network Toolbox): The learning rate was increased by 5% if the performance improvement (measured by summed squared errors) between the two most recent runs was smaller than the previous one, and decreased by 30% if the performance improvement was larger than the previous one (gradient descent is a generic optimization algorithm for finding a local minimum of a function, in which one takes a small step each time along the local gradient of the function; the size of the step is determined by the learning rate). The initial learning rate was 0.1. We used lateralized input to examine hemispheric lateralization effects (Fig. 3b). Lateralized input images were made by setting one half of the PCA representation to zero, so that when mapping these images to the corresponding output, only the representation from the left- or right-half word was available for recognition. The left-side bias effect (RH advantage) was thus measured as the accuracy difference between recognizing a left-half word (which carried only LSF/RH information in the biased condition) as the original word and recognizing a right-half word (which carried only HSF/LH information in the biased condition) as the original word (cf. Hsiao et al., 2008b).
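The lateralized-input procedure and the left-side bias measure can be sketched as follows. The sketch assumes that the first half of the PCA vector holds the left-half-word (RH) components and the second half the right-half-word (LH) components, and that `predict` is any trained classifier mapping a vector to a label; both the layout and the function names are hypothetical.

```python
def lateralize(pca_vector, keep):
    """Zero the half of a PCA vector (a list of floats) that is not kept.

    Assumed layout: first half = left-half word (RH),
    second half = right-half word (LH).
    """
    half = len(pca_vector) // 2
    if keep == "left":
        return pca_vector[:half] + [0.0] * (len(pca_vector) - half)
    if keep == "right":
        return [0.0] * half + pca_vector[half:]
    raise ValueError("keep must be 'left' or 'right'")

def accuracy(predict, inputs, labels):
    """Proportion of inputs whose predicted label matches the true label."""
    correct = sum(1 for x, y in zip(inputs, labels) if predict(x) == y)
    return correct / len(labels)

def left_side_bias(predict, inputs, labels):
    """Accuracy on left-lateralized input minus accuracy on right-lateralized input."""
    left = [lateralize(x, "left") for x in inputs]
    right = [lateralize(x, "right") for x in inputs]
    return accuracy(predict, left, labels) - accuracy(predict, right, labels)
```

A positive value indicates that the classifier recognizes words better from the left-half representation, which corresponds to a RH advantage in the terms used above.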
2.1. Simulation one: Word versus letter identity mapping tasks
2.1.1. Word identity mapping task
We first used images of six-letter English pseudo-words to examine how visual similarity among words in the lexicon influences lateralization in visual word recognition. To counterbalance the information carried in the two halves of the words so that there would not be any input information asymmetry to influence lateralization in the model, we used palindrome pseudo-words as the stimuli (Fig. 3a); in addition, mirror images of the stimuli were used in half of the simulation runs.3 We created artificial lexicons with an increasing alphabet size, ranging from 3 to 20. In each lexicon, letters in the alphabet were randomly chosen from the English alphabet, and 26 palindrome words were randomly chosen from all possible combinations of letters in the alphabet. The model then was trained to recognize words in the lexicon, with each output node corresponding to a word identity (thus, the output layer had 26 nodes). We ran each model with a different alphabet size 20 times and analyzed their lateralization effects.
For example, in a lexicon with letters “a,” “b,” and “c,” there were 27 possible combinations: aaaaaa, aabbaa, aaccaa, abaaba, etc.4 The randomly chosen 26 words looked very similar to one another since they shared a lot of common letters. When we increased the alphabet size to 5, the number of possible combinations became 125, and the randomly chosen 26 words became more dissimilar visually to one another. In other words, the larger the alphabet size was, the lower the visual similarity was among words in the lexicon. Nevertheless, the visual similarity among words may start to increase when letters in the alphabet started to look similar to each other with increasing alphabet size. Here, we examined how the model's lateralization changed when we gradually increased the alphabet size with a fixed lexicon size.
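The lexicon construction described above can be sketched in a few lines. Because a six-letter palindrome is fully determined by its first three letters, an alphabet of n letters yields n³ candidate words (27 for n = 3, 125 for n = 5). The function name, the seed parameter, and the use of Python's `random` module are illustrative assumptions, not the procedure used in the original simulations.

```python
import itertools
import random
import string

def palindrome_lexicon(alphabet_size, lexicon_size=26, seed=0):
    """Sample six-letter palindrome pseudo-words over a random alphabet."""
    rng = random.Random(seed)
    alphabet = rng.sample(list(string.ascii_lowercase), alphabet_size)
    # A six-letter palindrome is fixed by its first three letters.
    stems = ["".join(p) for p in itertools.product(alphabet, repeat=3)]
    chosen = rng.sample(stems, lexicon_size)
    return [stem + stem[::-1] for stem in chosen]

words = palindrome_lexicon(alphabet_size=3)
```

With an alphabet of three letters, 26 of the 27 possible palindromes are drawn, so the sampled words necessarily share many letters; larger alphabets leave far more candidates to draw from.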
In each lexicon, we used eight different fonts for each word, with four of them used as the training data, and the other four used as the testing data (counterbalanced across simulation runs). Thus, in both the training and testing data sets, each word had four images of different fonts (Hsiao et al., 2008b). Different fonts were used for training and testing to ensure that the model's performance generalized to words in new fonts; human readers, who can typically read words in different fonts, also possess this generalization ability.
After training, the model had 100% accuracy on the training data, and 95% accuracy on average on the testing data. The results are shown in Fig. 4a. The RH/LSF preference measure was defined as the difference in the left-side bias effect between the biased condition and the baseline condition (the left-side bias effect was measured as the accuracy difference between recognizing a left-lateralized word as the original word and recognizing a right-lateralized word as the original word; it reflects RH lateralization); it reflected how much the model in the biased condition preferred the RH/LSF-biased representation over the LH/HSF-biased representation compared with the baseline condition when no frequency bias was applied (Hsiao et al., 2008b). As shown in Fig. 4a, when the alphabet size was small, the model had low RH/LSF preference. When we increased the alphabet size, the RH/LSF preference became larger, and then decreased after the peak at around alphabet size 6 (i.e., an inverted U-shape in Fig. 4a). There was a significant quadratic correlation between alphabet size and RH/LSF preference (r² = .148, p < .001). Thus, consistent with our hypothesis, with a fixed lexicon size, when gradually increasing the alphabet size, the visual similarity among words decreased, and the model relied more on LSFs to distinguish the words. When the alphabet size kept increasing, more and more letters with similar shapes might be used in the alphabet (e.g., “c” and “o,” “b” and “h,” “m” and “n”), and the visual similarity among words in the lexicon started to increase; as a result, the model required more HSFs to distinguish the words.5
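The preference measure defined above is a difference of differences, which the following sketch makes explicit. The function names are hypothetical, and the accuracies in the usage lines are made-up numbers for illustration only, not results from the simulations.

```python
def left_side_bias(acc_left, acc_right):
    """Accuracy on left-lateralized input minus accuracy on right-lateralized input."""
    return acc_left - acc_right

def rh_lsf_preference(biased, baseline):
    """Left-side bias in the biased condition minus that in the baseline.

    biased, baseline: (left-half accuracy, right-half accuracy) pairs.
    """
    return left_side_bias(*biased) - left_side_bias(*baseline)

# Made-up accuracies, purely for illustration.
example = rh_lsf_preference(biased=(0.80, 0.60), baseline=(0.70, 0.65))
```

A positive value means that applying the frequency bias shifted the model toward the RH/LSF-biased representation relative to the unbiased baseline; a negative value means a shift toward the LH/HSF-biased representation.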
To further explore the relationship between alphabet size, word similarity, and RH/LSF preference, we examined the correlations among these three variables. We considered the Gabor response representation of each word as a point in a high-dimensional space. The single linkage (shortest distance) method was used to calculate the shortest distance between the Gabor responses of a word and all the other words, as a measure of how similar this word was to the other words in a lexicon (e.g., Mardia, Kent & Bibby, 1980). As shown in Fig. 4(b), when the alphabet size increased, the word dissimilarity increased first, and then leveled off gradually; there was a significant quadratic correlation between alphabet size and word dissimilarity (r² = .746, p < .001). In addition, there was a significant quadratic correlation between word dissimilarity and RH/LSF preference of the model in the word identity mapping task (r² = .140, p < .001). As shown in Fig. 4(c), there was an inverted U-shaped relationship between word dissimilarity and RH/LSF preference of the model: The model preferred using more HSF information when words were either very similar or very dissimilar.6
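The shortest-distance measure can be sketched as a nearest-neighbor computation over the word representations. The sketch assumes Euclidean distance and averages the per-word distances into one lexicon-level score; neither choice is specified in the text, and the function names are illustrative.

```python
import math

def nearest_neighbor_distances(vectors):
    """For each vector, return the distance to its closest other vector."""
    distances = []
    for i, v in enumerate(vectors):
        nearest = min(math.dist(v, w)
                      for j, w in enumerate(vectors) if j != i)
        distances.append(nearest)
    return distances

def lexicon_dissimilarity(vectors):
    """Mean nearest-neighbor distance across all words (an assumed summary)."""
    d = nearest_neighbor_distances(vectors)
    return sum(d) / len(d)

# Toy two-dimensional "representations" standing in for Gabor responses.
toy = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0)]
print(nearest_neighbor_distances(toy))  # → [5.0, 1.0, 1.0]
```

A small nearest-neighbor distance means a word has at least one close visual competitor in the lexicon, which is the sense of similarity used in the analysis above.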
Thus, the inverted U-shaped relationship between alphabet size and RH/LSF preference of the model observed in Fig. 4a may be due to two factors. The first was the inverted U-shaped relationship between alphabet size and word dissimilarity (Fig. 4b). When we increased the alphabet size, the dissimilarity among words increased first, since words in the lexicon shared fewer and fewer letters. With further alphabet size increase, some letters in the alphabet started to have similar shapes, and thus the dissimilarity among words started to decrease. The second was an inverted U-shaped relationship between word dissimilarity in the lexicon and RH/LSF preference of the model: More HSF information was preferred when words were either very similar or very dissimilar to one another (Fig. 4c). These two factors contributed to the observed inverted U-shaped relationship between alphabet size and RH/LSF preference in the word identity mapping task (Fig. 4a). This result supports our hypothesis that visual similarity among words in the lexicon may account for differences in hemispheric lateralization between different languages.
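The word dissimilarity measure used in this analysis (the shortest distance from each word to any other word in the representation space) can be sketched as follows. This is an illustration under stated assumptions: Euclidean distance is assumed as the metric, the toy two-dimensional vectors stand in for the Gabor responses, and averaging over words to obtain a lexicon-level value is also an assumption.

```python
import math


def nearest_neighbor_distances(features):
    """For each word vector, return the distance to its closest other word vector."""
    distances = []
    for i, a in enumerate(features):
        # Shortest (single linkage) distance from word i to any other word.
        distances.append(
            min(math.dist(a, b) for j, b in enumerate(features) if j != i)
        )
    return distances


# Toy stand-ins for Gabor response vectors (hypothetical values).
toy_features = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0)]
nearest = nearest_neighbor_distances(toy_features)

# Assumed lexicon-level summary: mean nearest-neighbor distance across words.
mean_dissimilarity = sum(nearest) / len(nearest)
print(nearest, mean_dissimilarity)
```

A larger value means that words lie farther from their closest neighbor, that is, the lexicon is visually more dissimilar.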
2.1.2. Letter identity mapping task
When reading words in alphabetic languages, although readers could theoretically map visual word forms directly to word meanings, they typically map visual word forms to word pronunciations first and then from pronunciations to meanings; this is because there is usually a systematic mapping between graphemes and phonemes, and it is relatively easier to map word pronunciations to meanings due to prior spoken language experience (e.g., Van Orden, Johnston & Hale, 1988; Lesch & Pollatsek, 1993; Lukatela, Lukatela & Turvey, 1993; Lukatela & Turvey, 1994a,b). Thus, reading words in alphabetic languages typically involves decomposing visual word input into its constituent graphemes and mapping them to corresponding phonemes. This decomposition may require more details of the word image and thus rely more on HSF information compared with the word identity mapping task, since it requires identifying letter boundaries and recognizing each letter; in contrast, LSF information such as word outline shape may be sufficient for word identity mapping without letter identification. Here, we examined lateralization effects in a letter identity mapping task using the same stimuli (English pseudo-words) as used in the word identity mapping task. Instead of learning to map word images to word identities (Fig. 2), the model was trained to map a word image to its constituent letter identities. The output layer of the model was divided into three parts corresponding to the first three letter positions in a word (the last three letters mirrored the first three, since the words were palindromes). The number of nodes in each part was equal to the alphabet size; each node corresponded to a letter in the alphabet (see an example in Fig. 5a).
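The output layout described above (three parts, one per letter position, each with one node per letter) can be sketched as a target vector. The four-letter alphabet, the example stem, and the use of a one-hot code within each part are assumptions made for illustration.

```python
def letter_identity_target(stem, alphabet):
    """Concatenate one block per letter position; each block has one node per letter."""
    target = []
    for letter in stem:
        block = [0] * len(alphabet)
        block[alphabet.index(letter)] = 1  # switch on the node for this letter
        target.extend(block)
    return target


alphabet = ["a", "b", "c", "d"]   # hypothetical alphabet of size 4
stem = "cab"                      # hypothetical three-letter stem
palindrome = stem + stem[::-1]    # six-letter palindrome used as the input image
target = letter_identity_target(stem, alphabet)
print(palindrome, target)
```

With an alphabet of size 4, the target has 3 × 4 = 12 nodes, in contrast to the word identity mapping task, where each word has a single output node.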
After training, the model had 100% accuracy on the training data, and 99% accuracy on average on the testing data. Fig. 5b shows the results: no significant lateralization effect was observed in most of the cases we tested. More importantly, compared with the word identity mapping task, the letter identity mapping task required more LH/HSF (i.e., less RH/LSF) information: An analysis of variance (ANOVA) with alphabet size (3 to 20) as a between-subject variable and mapping task (word-identity or letter-identity) as a within-subject variable showed a strong effect of mapping task (F(1, 324) = 657.206, p < .001); post hoc t-tests showed that the mapping task effect was significant in all lexicons with different alphabet sizes (Fig. 5b). Thus, the modeling results support our hypothesis that the decomposition of words into letters for letter-sound mapping in alphabetic language reading requires more HSF information and results in stronger LH lateralization compared with logographic language reading.
2.2. Simulation two: alphabetic versus logographic reading (pronunciation) tasks
Here, we examine hemispheric lateralization difference between logographic and alphabetic language reading in a pronunciation task. We created both logographic and alphabetic reading conditions with the same orthography (artificial lexicons with English pseudo-words) as the input. In the alphabetic reading condition, each letter in a word mapped to a phoneme in the pronunciation systematically; in contrast, in the logographic reading condition, each word had a randomly assigned pronunciation without a systematic mapping between its letters and the phonemes in the pronunciation. The pronunciations had a consonant-vowel-consonant structure. In contrast to Simulation One, here we fixed the alphabet size, and examined the lateralization effect when we gradually increased the lexicon size.
In the artificial lexicons we modeled, the lexicon size varied from 26 to 60. In the alphabet of the artificial lexicon, there were in total 26 letters (“a” to “z” in the English alphabet), with 13 letters corresponding to a consonant in the pronunciation and 13 letters corresponding to a vowel in the alphabetic reading condition; the vowel and consonant letters were randomly chosen in each lexicon. The pronunciation of each word had a consonant–vowel–consonant format. Thus, in the output layer of the model, there were three parts, each corresponding to a position in the consonant–vowel–consonant format; each node in a part corresponded to a consonant/vowel in that position (Fig. 6a). In addition, to counterbalance the visual information carried in the two sides of the pseudo-words in the lexicon, six-letter palindrome words were created as input images from the three-letter words in the artificial lexicons, and mirror images of the pseudo-words were used in half of the simulation runs. In the alphabetic reading condition, each letter was systematically mapped to either a vowel or a consonant in the pronunciation, whereas in the logographic reading condition, each word was mapped to a randomly assigned pronunciation without a systematic letter-phoneme mapping. Similar to Simulation One, in the data sets there were eight different fonts for each word, with four fonts used during training and the other four fonts used for testing; the fonts used for training and testing were counterbalanced across simulation runs. We ran each model with a different lexicon size 16 times and analyzed their lateralization effects.
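The contrast between the two reading conditions can be sketched as follows. This is a minimal illustration, not the procedure used in the simulations: the phoneme labels (C0, V0, and so on), the function name, and the choice to create the logographic condition by shuffling the alphabetic pronunciations across words are all assumptions.

```python
import random
import string


def build_reading_conditions(lexicon_size, seed=0):
    """Return (alphabetic, logographic) word-to-pronunciation mappings."""
    rng = random.Random(seed)
    letters = list(string.ascii_lowercase)
    rng.shuffle(letters)  # consonant and vowel letters are chosen at random
    consonant_letters, vowel_letters = letters[:13], letters[13:]
    # Hypothetical phoneme labels, one per letter.
    consonant_of = {l: "C%d" % i for i, l in enumerate(consonant_letters)}
    vowel_of = {l: "V%d" % i for i, l in enumerate(vowel_letters)}

    # Sample distinct three-letter stems in consonant-vowel-consonant format.
    words = set()
    while len(words) < lexicon_size:
        words.add(rng.choice(consonant_letters)
                  + rng.choice(vowel_letters)
                  + rng.choice(consonant_letters))
    words = sorted(words)

    # Alphabetic reading: every letter maps systematically to one phoneme.
    alphabetic = {w: (consonant_of[w[0]], vowel_of[w[1]], consonant_of[w[2]])
                  for w in words}
    # Logographic reading (assumed scheme): the same pronunciations,
    # reassigned to words at random.
    pronunciations = list(alphabetic.values())
    rng.shuffle(pronunciations)
    logographic = dict(zip(words, pronunciations))
    return alphabetic, logographic


alphabetic, logographic = build_reading_conditions(26)
```

Both conditions share the same input words and the same set of pronunciations; they differ only in whether a word's letters predict its phonemes.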
After training, the model had 100% accuracy on the training data, and 88% on average on the testing data. Fig. 6b shows the results. As shown in the figure, in most cases no significant lateralization was observed in the logographic reading condition, whereas a strong LH (HSF) lateralization (i.e., negative RH/LSF preference) was observed in the alphabetic reading condition in all cases tested. More importantly, the RH/LSF preference measure in the logographic reading condition was significantly larger than that in the alphabetic reading condition: ANOVA with reading condition (logographic vs. alphabetic) as a within-subject variable and lexicon size (26–60) as a between-subject variable showed a strong reading condition effect (F(1, 270) = 354.827, p << .001). This effect interacted with lexicon size (F(2, 270) = 1.810, p = .027); post hoc t-tests showed that the reading condition effect was significant in all of the lexicon sizes larger than 26 (paired t-tests, t(15) > 2.2, p < .05). In addition, the RH/LSF preference measure decreased with increasing lexicon size in both the alphabetic reading condition (r² = .147, p < .001) and the logographic reading condition (r² = .014, p = .042). This effect corresponded to the rising part of the quadratic relationship between the model's RH/LSF preference and word similarity in Simulation One (see Fig. 4c): With a fixed alphabet size, when we gradually increased the lexicon size, words in the lexicon shared more common letters and became more visually similar to one another, and thus more HSF/LH (i.e., less RH/LSF) information was required for recognition.
The result thus suggests that logographic reading requires more LSF information compared with alphabetic reading. This result is consistent with the visual word recognition literature, which shows more RH lateralization in reading logographic languages such as Chinese compared with alphabetic languages such as English. We have summarized the results of both Simulation One and Two in Table 1.
Table 1. Summary of the simulation results
Simulation One: Word images were mapped to individual nodes representing word/letter identity.
Lexicon size: fixed to 26 words
Alphabet size: ranging from 3 to 26 letters
Word identity mapping: In general, RH (LSF) lateralization was observed; the RH lateralization was weaker with either a very small or very large alphabet size (i.e., either very high or very low visual similarity among words).
Letter identity mapping: In general, no lateralization was observed. In all cases the letter identity mapping task showed significantly less RH (LSF) preference than the word identity mapping task.
Simulation Two: Word images were mapped to pronunciations with a consonant-vowel-consonant structure.
Lexicon size: ranging from 26 to 60 words
Alphabet size: fixed to 26 letters
Logographic reading: In general, no lateralization was observed; the RH (LSF) preference became weaker with increasing lexicon size (i.e., increasing visual similarity among words).
Alphabetic reading: In general, LH (HSF) lateralization was observed; the LH lateralization increased with increasing lexicon size (i.e., increasing visual similarity among words).
Visual word form processing in alphabetic languages such as English has been reported to be LH lateralized, and this has been argued to be due to the LH lateralization of language processes. Nevertheless, an RH/LVF advantage has been reported in orthographic processing of Chinese characters (a logographic language), which is a counterexample to this claim. Here, we applied Hsiao et al.'s (2008b) hemispheric processing model of face and object recognition to visual word recognition and showed that visual and task characteristics alone are able to account for differences in hemispheric lateralization in visual word recognition between different languages without assuming the influence from LH-lateralized language processes. In other words, the laterality of language processing is not necessary for accounting for hemispheric lateralization in visual word recognition.
We first showed that visual similarity among words in the lexicon could influence lateralization in visual word recognition. We used artificial lexicons with the same number of words and word length, but with different alphabet sizes, and trained the model to map word images to word identities. The results showed an inverted U-shaped pattern (Fig. 4a): When the alphabet size increased, the model initially relied more on the RH/LSF information; with further increase, the model's RH/LSF reliance started to decrease. Our further analysis showed that this inverted U-shaped pattern might be due to two factors:
(a) A quadratic correlation between alphabet size and visual dissimilarity among words in the lexicon: When the alphabet size increases, words in the lexicon become more dissimilar to each other; when the alphabet size is so big that some letters start to have similar shapes, visual dissimilarity among words starts to decrease.
(b) A quadratic correlation between visual dissimilarity among words in the lexicon and RH/LSF preference of the model: The model relies more on LH/HSF information when words are either very dissimilar or very similar to each other. Recent research has suggested that whether a recognition task requires HSF or LSF information depends on the diagnostic information for the task (e.g., Oliva & Schyns, 1997; Schyns, 1998; Schyns & Oliva, 1999). Thus, when words are very dissimilar to each other, there may be more featural differences among them than overall configuration differences so that HSF information is more diagnostic to distinguish them. On the other hand, when words are very similar to each other, they may have very similar shapes and configurations that cannot be distinguished by LSF information, and thus HSF information is also more diagnostic. This speculation is consistent with the literature on object and face recognition. For example, the discrimination of visually dissimilar categories (such as cups and books) relies more on part-based, featural processing compared with the identification of individual faces, which requires configural, LSF information (Dailey & Cottrell, 1999; see also Tanaka & Gauthier, 1997; Farah, Wilson, Drain & Tanaka, 1995). On the other hand, when the stimuli are visually so similar to each other that the diagnostic information for the task lies mainly in the HSF range, HSF information becomes important (e.g., Schyns, 1998).
Thus, hemispheric lateralization in visual word recognition can be influenced by visual similarity among words in the lexicon, or more specifically, the diagnostic information for distinguishing different words in the lexicon.
We then showed that the requirement to decompose a word into its constituent letters could also influence lateralization in visual word recognition. We used the same artificial lexicons but trained the model to perform a letter identity mapping task and found that the task requires more HSF (LH) information than the word identity mapping task. This result is consistent with Grainger and Jacobs’ dual read-out model in accounting for word context effects in letter perception (Grainger & Jacobs, 1994); their model assumes that word representation units reach critical activation levels for correct recognition with shorter presentation duration as compared with letter representation units, suggesting that letter perception involves sharper visual distinction than visual word processing. Also, in a more recent study, Grainger and Ziegler (2011) proposed a dual-route approach to orthographic processing, which posits a coarse-grained route that uses minimal information to identify a word (similar to the word identity mapping task in the model), and a fine-grained route that involves chunking of graphemes for grapheme-phoneme conversion (similar to our letter identity mapping), consistent with our results.
In addition, in Simulation Two, we modeled word pronunciation by mapping each word to a pronunciation with a consonant-vowel-consonant format. In contrast to Simulation One, we fixed the alphabet size to be 26 and gradually increased the lexicon size from 26 to 60. In the alphabetic reading condition, each letter in a word systematically mapped to a phoneme in the pronunciation, whereas in the logographic reading condition, each word had a randomly assigned pronunciation without a systematic mapping between letters and phonemes. Consistent with our hypothesis, the results showed that logographic reading requires more LSF information than alphabetic reading, resulting in more RH lateralization.
In contrast to the phonological mapping hypothesis (Maurer & McCandliss, 2007), which argues that specialized processing of visual words in visual brain areas becomes LH lateralized because phonological processing is typically LH lateralized, in our modeling we did not assume any lateralization of phonological processing. The difference in lateralization between alphabetic and logographic reading emerged naturally due to the difference in the mapping task requirement, or more specifically, the frequency content of the input image that is diagnostic to the task. Computational modeling allows us to tease apart these factors that may be difficult to examine in human subject studies.
The two proposed factors related to visual and task characteristics of a writing system (visual similarity among words and grapheme-phoneme mapping) are able to account for lateralization differences in visual word processing between different languages observed in human data. For example, it has been shown that orthographic processing in English word recognition is lateralized to the LH (e.g., McCandliss et al., 2003), whereas that in Chinese character recognition is more bilateral or lateralized to the RH (e.g., Tzeng et al., 1979). Regarding visual similarity among words (if we do not consider morphology or orthographic regularities), compared with Chinese, words in the English lexicon may look more similar to one another because English has a smaller alphabet size (only 26 letters) and a much larger lexicon size (about 20,000 base words for a university-educated native speaker; Nation & Waring, 1997). In contrast, Chinese has a smaller lexicon size (about 4,500 characters for a native speaker), but a much larger “alphabet”: about 200 basic stroke patterns are defined in Cangjie, a Chinese transcription system developed by Chu in 1979 (according to a database analysis by Hsiao & Shillcock, 2006). Basic stroke patterns, or single bodies, are recurrent orthographic units of Chinese characters that cannot be further decomposed into other units; they are the smallest functional orthographic units for Chinese character recognition (Chen, Allport & Marshall, 1996). If we take the ratio between alphabet size and lexicon size in the two languages, and compare them with the modeling data in Fig. 4a, they will be at the rising part of the curve (in the modeling data the lexicon size was 26 and the alphabet size to lexicon size ratio ranged from 3/26 to 26/26). Accordingly, reading Chinese characters may require more LSF information compared with reading English words because of the higher visual dissimilarity among words in the lexicon (Fig. 4a).
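The ratio comparison above can be worked through with the approximate figures quoted in the text. The variable names are illustrative, and the counts are the rough estimates cited above rather than exact values.

```python
# Approximate counts quoted in the text (estimates, not exact values).
english_ratio = 26 / 20000    # letters / base words
chinese_ratio = 200 / 4500    # basic stroke patterns / characters

# Range of alphabet-size to lexicon-size ratios covered by the simulations.
model_min_ratio = 3 / 26
model_max_ratio = 26 / 26

print(round(english_ratio, 4), round(chinese_ratio, 4))
print(round(model_min_ratio, 4), round(model_max_ratio, 4))
```

Under these figures the Chinese ratio is the larger of the two, and both ratios lie at or below the low end of the range covered by the simulations.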
There may also be other factors that influence visual similarity among words in the lexicon, such as features of letters/stroke patterns and configurations of words/characters. For example, Chinese stroke patterns usually have more strokes and are visually more complicated than English letters; also, in Chinese, stroke patterns are arranged into a square shape to form characters, whereas in English, letters are arranged in a serial manner. To examine the difference in visual similarity among English words and Chinese characters/words, we randomly selected 1000 6-letter English words (taken from Brysbaert & New, 2009) and 1000 Chinese characters (taken from Ho, 1998) and compared the visual similarity among the English words and among the Chinese characters (i.e., distances in the Gabor representation space; Chinese character images were 60 × 60 pixels, whereas English word images were 36 × 100 pixels; thus they could be compared in the same representation space). We ran this comparison 10 times, and in every case the visual similarity among English words was significantly higher than that among Chinese characters (t-test over 10 runs, t(18) = 135.518, p < .001). This effect was also found when we randomly selected 1000 Chinese two-character words (taken from Taiwan Ministry of Education, 1997; both Chinese and English word images were 36 × 96 pixels): the visual similarity among English words was significantly higher than that among Chinese words (t-test over 10 runs, t(13.982) = 180.457, p < .001). In another test, we directly compared the 1000 most frequent English words (according to Brysbaert & New, 2009; the image size was 18 × 200; the word length ranged from 1 to 15) and the 1000 most frequent Chinese characters (according to Ho, 1998; the image size was 60 × 60; the number of strokes ranged from 1 to 25). We calculated the visual similarity among the 1000 English words/Chinese characters four times, each with a different font. We found that in all four cases, the 1000 most frequent English words had much higher visual similarity among one another than the 1000 most frequent Chinese characters (t-test, t(6) = 13.735, p < .001).
As for the requirement to decompose a word into graphemes for grapheme-phoneme mapping, English is an alphabetic language and thus reading English requires this decomposition, whereas Chinese is a logographic language and reading Chinese does not have this requirement.⁷ Thus, Chinese logographic reading may require more LSF information, which leads to more RH lateralization compared with English alphabetic reading.⁸ Our results also suggest that this difference in orthography to phonology mapping (alphabetic vs. logographic reading) may be the main factor that accounts for the lateralization difference between them (see Figs. 5b and 6b).
To examine whether our model is able to account for the RH lateralization in Chinese character recognition, we have recently conducted simulations with Chinese pseudo-characters (characters that are formed by combining real stroke patterns in Chinese) using the same model (Hsiao & Cheung, 2011a). We created artificial lexicons of pseudo-characters that consisted of a semantic radical (component) and a phonetic radical (i.e., phonetic compound characters). Each lexicon contained 100 characters, randomly selected from all possible combinations of 15 semantic and 15 phonetic radicals. The character image size was 60 × 60 pixels. We created two mapping conditions: (a) Regular mapping, in which characters with the same phonetic radical had the same pronunciation (in total 15 pronunciations, corresponding to the 15 phonetic radicals), similar to the phonetic compound characters in the real Chinese lexicon. (b) Logographic mapping, in which the pronunciation of a character was randomly assigned from the 15 possible pronunciations. The results showed that in general, all models with Chinese characters as the stimuli showed a strong LSF/RH lateralization effect; this LSF/RH lateralization effect was generally stronger than in the simulations with English pseudo-words reported here, which typically showed either a weak LSF/RH bias, no bias, or an HSF/LH bias. This phenomenon is consistent with the literature showing that English word recognition typically involves stronger LH lateralization than Chinese character recognition in visual processing (e.g., Tan et al., 2005; Tzeng et al., 1979; Cheng & Yang, 1989; Yang & Cheng, 1999). In addition, the regular mapping condition demonstrated a stronger LH lateralization effect than the logographic condition. This effect may be because in the regular mapping condition, the phonetic information was biased to the location of the phonetic radical, and this encouraged the model to decompose a character into radicals, resulting in greater demands on HSF information/LH processing. This effect is consistent with the literature on Chinese character recognition showing that phonetic compound processing involved stronger LH lateralization than the processing of integrated characters, which do not have a phonetic radical (Weekes & Zhang, 1999). Future studies will compare English word and Chinese character processing more closely by examining their difference in visual similarity among the basic written units (i.e., letters and basic stroke patterns) and in the relationship between number of basic written units, number of words, and hemispheric lateralization/spatial frequency bias.
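The construction of the pseudo-character lexicons and the two mapping conditions described above (Hsiao & Cheung, 2011a) can be sketched as follows. Radicals and pronunciations are represented here by integer indices, and the function name and sampling details are assumptions for illustration rather than the procedure used in that study.

```python
import random
from itertools import product


def build_character_lexicon(n_characters=100, n_radicals=15, seed=0):
    """Sample pseudo-characters and assign pronunciations under both mappings."""
    rng = random.Random(seed)
    # Every (semantic radical, phonetic radical) pair is a possible character.
    combinations = list(product(range(n_radicals), range(n_radicals)))
    characters = rng.sample(combinations, n_characters)

    # Regular mapping: the phonetic radical determines the pronunciation.
    regular = {c: c[1] for c in characters}
    # Logographic mapping: pronunciation drawn at random from the same 15 options.
    logographic = {c: rng.randrange(n_radicals) for c in characters}
    return characters, regular, logographic


characters, regular, logographic = build_character_lexicon()
```

In the regular mapping the pronunciation is predictable from one part of the character, which is the property argued above to encourage decomposition into radicals.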
In contrast to the RH lateralization of Chinese character processing, the processing of Chinese two-character words has been shown to be LH lateralized in tachistoscopic word identification tasks (e.g., Tzeng et al., 1979). Consistent with this finding, in an ERP study, Maurer et al. (2008) reported that the recognition of Japanese two-character kanji words in a one-back repetition detection task also showed an LH-lateralized N170. It has been suggested that the LH lateralization of Chinese two-character word processing is due to the LH superiority in handling sequential and analytic tasks, as characters need to be put together to retrieve word meaning (Tzeng et al., 1979). In contrast, our results suggest that this lateralization difference between Chinese single character and two-character word processing may be accounted for by the fact that reading two-character words requires decomposition of a word into its constituent characters in order to map them to the pronunciation at the syllable level, and thus involves more HSF/LH processing.
Sequence knowledge, such as orthographic letter bigrams/trigrams, has also been suggested to play an essential role in visual word recognition in alphabetic languages such as English (e.g., Seidenberg & McClelland, 1989; Plaut, McClelland, Seidenberg & Patterson, 1996). In our model, in which we used word images as the input, orthographic sequence representation such as letter bigrams may be implicitly encoded by the spatial frequency range that optimally represents the identity of the letter bigrams learned by the model through training. Thus, our modeling results suggest that the link between sequential/analytic processing and LH lateralization (e.g., Bradshaw & Nettleton, 1981) may be due to the requirement to decompose a stimulus into components for sequential mapping/processing, such as the grapheme-phoneme conversion established during learning to read English words.
Our modeling results also have important implications for word processing in alphabetic languages with different orthographic depths. Orthographic depth refers to the deviation of a language from having a consistent grapheme-phoneme correspondence. For example, languages such as Italian and Welsh have a shallow orthography because of high consistency in the grapheme-phoneme mapping, whereas languages such as English and French have a deep orthography. It has been suggested that in reading languages with a shallow orthography, because the regular grapheme-phoneme correspondence can be conveniently used, readers tend to rely more on a phonological decoding (sublexical) strategy over a whole-word (direct lexical) approach compared with readers of deep orthographies (e.g., Wimmer & Goswami, 1994; Ellis & Hooper, 2001; Spencer & Hanley, 2003; Hanley, Masterson, Spencer & Evans, 2004). Since phonological processing is typically LH lateralized, languages with a shallow orthography consequently tend to have stronger LH lateralization in word processing than those with a deep orthography. For example, Beaton, Suller and Workman (2007) found that Welsh-English bilinguals had a significantly stronger RVF/LH advantage in naming Welsh words than naming English words (Welsh has a shallow orthography), and the lateralization effect was unaffected by which language was learned first or by age of second language acquisition (see also Workman, Brookman, Mayer, Rees & Bellin, 2000). In contrast to the argument that the difference between languages with a deep or shallow orthography is due to the use of LH-lateralized phonological processing, our modeling results suggest that this difference may instead be due to the difference in spatial frequency content required for performing a consistent or inconsistent grapheme-phoneme mapping (e.g., our alphabetic and logographic reading conditions in Simulation Two), since our model does not assume phonological processing being LH lateralized.⁹
Also, although we have not tested it in the model yet, it is possible that when exception and regular words coexist in a lexicon, such as in English, regular word processing may involve more LH/HSF processing than exception words due to the use of the fine-grained (sublexical) route (Grainger & Ziegler, 2011), especially for low frequency words. In the literature of English word recognition, mixed results have been obtained regarding whether there is an interaction between word regularity and presented visual field/hemisphere. For example, in a word naming task, Scott and Hellige (1998) manipulated presented visual field, word frequency, word regularity, and word orientation (vertical vs. horizontal), and reported that while there was a significant three-way interaction between visual field, word regularity, and word orientation, the interaction between visual field and word regularity did not reach significance in either orientation condition. In contrast, in a lexical decision task, Weems and Zaidel (2005) found that the interaction between word regularity and word frequency was significant in the LVF/RH, but not in the RVF/LH. Also, in a word naming task, in which words were presented vertically, Parkin and West (1985) observed a significant interaction between presented visual field and word regularity: the regularity effect was significant in the RVF/LH, but not in the LVF/RH (or in other words, regular words involved a stronger RVF/LH advantage than irregular words). Future work will examine whether our model is able to address this possible interaction between hemisphere and word regularity.
Note that this study did not rule out possible influence from the LH lateralization of phonological processing on the lateralization of visual word recognition. It is possible that visual and task characteristic factors interact with the lateralization of phonological processing. This speculation is consistent with the finding that Chinese character recognition has usually been found to be RH lateralized in orthographic processing tasks (e.g., Tzeng et al., 1979), and LH lateralized in phonological processing tasks (in particular for characters that have a phonetic component, or radical; e.g., Weekes & Zhang, 1999). Hsiao and Liu (2010) recently contrasted the processing of a dominant type of Chinese characters, SP characters, which have a semantic radical on the left and a phonetic radical on the right, and a minority type of character, PS characters, which have the opposite arrangement. They showed that in Chinese character naming, the processing of SP characters had LH lateralization, whereas that of PS characters did not have a significant lateralization effect. They argued that this effect may be due to the dominance of SP characters in the lexicon that makes readers opt to obtain phonological information from the right of the characters, and thus the automaticity of phonological processing in SP character recognition is superior to that in PS character recognition, resulting in more LH phonological modulation (Maurer & McCandliss, 2007).¹⁰ This result is thus consistent with the claim that the lateralization of phonological processing may also influence the lateralization of visual word recognition.
In conclusion, through computational modeling, here we show that visual and task characteristics of a writing system alone can account for lateralization differences in visual word recognition between different languages, without assuming any influence from the lateralization of language/phonological processes. Specifically, these characteristics are (a) visual similarity among words in the lexicon, and (b) the requirement to decompose a word into letters for letter-sound mapping during learning to read.
We are grateful to the Research Grants Council of Hong Kong (project code: HKU 744509H and HKU 745210H to J.H. Hsiao). We thank Mr. Kit Cheung for his help on the word/character dissimilarity analysis, and Mr. Chee Fung Cheung for his help on preliminary studies for this work. We also thank Professor Garrison Cottrell for his help and comments on an earlier version of the study. We thank the Editor, Professor Marc Brysbaert, and two anonymous reviewers for their helpful comments.
Janet H. Hsiao and Sze Man Lam, Department of Psychology, the University of Hong Kong, Hong Kong.
Holistic processing refers to the phenomenon of viewing a visual stimulus as a whole instead of various parts; it has been reported in face recognition consistently. Holistic processing is usually assessed through the composite paradigm (Young, Hellawell & Hay, 1987). In this paradigm, two stimuli are presented briefly, either sequentially or simultaneously. Participants attend to either the top or bottom halves of the stimuli and judge whether they are the same or different. In congruent trials, the attended and irrelevant halves lead to the same response, whereas in incongruent trials, they lead to different responses. Holistic processing is indicated by interference from the irrelevant halves in matching the attended halves; it can be assessed by the performance difference between the congruent and incongruent trials.
In a separate simulation, we found that using 100 components each made the representation noisier and degraded the model's performance.
In research on reading, it remains controversial whether the foveal representation (about the central 2° of the visual field) is bilaterally projected to both hemispheres (e.g., Huber, 1962; Stone, Leicester & Sherman, 1973; Jordan & Paterson, 2009), or split along the vertical midline with the two halves initially contralaterally projected to different hemispheres (e.g., Brysbaert, 2004; Lavidor & Walsh, 2004; Ellis & Brysbaert, 2010). Although our model assumed a split along the vertical midline of the visual field, including the foveal region, in our simulations the two sides of the input stimuli always carried the same amount of information about word identity (e.g., palindrome words), since our aim was to examine effects of hemispheric processing differences rather than input asymmetry; thus, a similar result would be obtained if the model assumed a bilateral projection (since the input was always symmetric). Note that examining the difference between a split and a non-split, bilateral representation is beyond the scope of this study. Note also that in English words, word beginnings (i.e., the left half of words) are usually more informative than word endings. The current simulations did not reflect this asymmetry, since here we aimed to examine the influences of visual similarity and mapping task demands on lateralization in visual word form processing with the asymmetry in visual stimuli (and lateralization of phonological processing) controlled. Another study (Hsiao, 2011) showed that information asymmetry in visual word stimuli can also account for asymmetry in word processing behavior.
Words in natural languages usually follow certain orthographic regularities; for example, some components/letters appear only at certain locations in a word. In this simulation, we aimed to examine word similarity effects, and thus words in the artificial lexicons were randomly chosen without any orthographic regularity. Some orthographic regularity was used in simulation two (i.e., the consonant-vowel-consonant format) to more realistically model visual word pronunciation in natural languages.
Here, we did not manipulate the similarity of the letters and the size of the alphabet separately, and thus the effect of alphabet size here may be influenced by the similarity of the letters included in the alphabet (i.e., Figs. 4a, 4b, and 5).
In Fig. 4c, to better demonstrate the quadratic relationship, we added 20 data points by running the model with 10 new lexicons (twice for each lexicon: once with the original images and once with mirror images) that had very low word dissimilarity (by using an alphabet size of 3 and visually very similar letters, such as letters “a,” “c,” and “e”); these points formed a cluster in the bottom left corner of the figure (i.e., the data points in gray; the quadratic correlation after adding these points, r² = .156, p < .001). Note that the relatively small r² value suggests that there may be other factors modulating the lateralization effect. This requires further examination. The steep increase in Fig. 4a is likely a result of the way we created the artificial lexicons (i.e., we fixed the lexicon size at 26 and gradually increased the alphabet size), which produced a steep change in word dissimilarity when the alphabet size was small before it leveled off gradually (as can be seen in Fig. 4b); this allowed us to better explore the dissimilarity space. In our simulation two, in contrast, we fixed the alphabet size at 26 and gradually increased the lexicon size. This gave us a more gradual change in word dissimilarity: as shown in Fig. 6b, when the lexicon size increased, words in the lexicon became more similar to each other since they shared more common letters, and the model relied more on HSF (LH) information. This change corresponded to the rising part of the quadratic relationship shown in Fig. 4c.
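For readers unfamiliar with the statistic, a quadratic r² of the kind reported in this note can be obtained by fitting a second-order polynomial by least squares and comparing the residual and total sums of squares. The sketch below is a generic illustration under that assumption; it is not the analysis code used in the study, the function names are hypothetical, and the example values are synthetic rather than taken from Fig. 4c.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Normal equations (X^T X) beta = X^T y for design columns 1, x, x^2.
    rows = [[1.0, x, x * x] for x in xs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    # Gaussian elimination with partial pivoting on the augmented 3x4 matrix.
    m = [ata[i] + [aty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (m[i][3] - tail) / m[i][i]
    return tuple(beta)


def r_squared(xs, ys, coeffs):
    """Proportion of variance in ys explained by the fitted quadratic."""
    a, b, c = coeffs
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x + c * x * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic, noise-free points generated from y = 1 + 2x - 0.5x^2.
dissimilarity = [0.0, 1.0, 2.0, 3.0, 4.0]
lateralization = [1.0, 2.5, 3.0, 2.5, 1.0]
coeffs = fit_quadratic(dissimilarity, lateralization)
r2 = r_squared(dissimilarity, lateralization, coeffs)
```

With noisy data the value falls below 1; a small r² means that the quadratic trend accounts for only a modest share of the variance.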
Note that although some Chinese characters have a phonetic radical that has information about the character pronunciation, phonetic radicals can usually be stand-alone characters, and the mapping from a phonetic radical to its pronunciation is at the syllable level (logographic mapping), in contrast to the grapheme-phoneme correspondence in alphabetic languages.
Using the composite paradigm (Young et al., 1987), a commonly used method for assessing holistic processing in visual perception, we have recently shown that expert Chinese character recognition involves RH lateralization (i.e., the left-side bias effect) and reduced holistic processing, suggesting that RH lateralization and holistic processing (assessed by the composite paradigm) do not always go together (Hsiao & Cottrell, 2009; see also Hsiao & Cheung, 2011b, for a related modeling study). Consistent with this finding, recent research showed that holistic processing could be modulated by writing/drawing experience in Chinese character/face perception, whereas the left-side bias effect (in Chinese character perception) could not (see Tso, Au & Hsiao, 2011; Zhou, Cheng, Zhang & Wong, 2012). Thus, the logographic mapping condition in our simulation may be better characterized by the whole-word coarse-grained route in Grainger and Ziegler's (2011) dual-route model of orthographic processing than by the holistic processing assessed by the composite paradigm.
Note, however, that Paulesu et al. (2000) showed that when Italian and English readers were naming non-words, English readers had greater activation in the left posterior inferior temporal gyrus and anterior inferior frontal gyrus than Italian readers. This phenomenon may be related to the activation of orthographic neighbors in the lexicon for assigning pronunciations to non-words, as in Italian there are 33 graphemes representing 25 phonemes, in contrast to the 1,120 possible graphemes for 40 phonemes in English (Paulesu et al., 2000). Whether our model is able to address this effect requires further examination.
Note however that our recent modeling data suggest the difference in lateralization between SP and PS character processing may also be due to higher visual similarity among characters of a dominant type than that among characters of a minority type (Hsiao & Cheung, 2011a).