A Computational and Empirical Investigation of Graphemes in Reading


Correspondence should be sent to Conrad Perry, Faculty of Life and Social Sciences (Psychology), Swinburne University of Technology, John Street, Hawthorn, Vic. 3122, Australia. E-mail: conradperry@gmail.com


It is often assumed that graphemes are a crucial level of orthographic representation above letters. Current connectionist models of reading, however, do not address how the mapping from letters to graphemes is learned. One major challenge for computational modeling is therefore developing a model that learns this mapping and can assign the graphemes to linguistically meaningful categories such as the onset, vowel, and coda of a syllable. Here, we present a model that learns to do this in English for strings of any letter length and any number of syllables. The model is evaluated on error rates and further validated on the results of a behavioral experiment designed to examine ambiguities in the processing of graphemes. The results show that the model (a) chooses graphemes from letter strings with a high level of accuracy, even when trained on only a small portion of the English lexicon; (b) chooses a similar set of graphemes as people do in situations where different graphemes can potentially be selected; (c) predicts orthographic effects on segmentation which are found in human data; and (d) can be readily integrated into a full-blown model of multi-syllabic reading aloud such as CDP++ (Perry, Ziegler, & Zorzi, 2010). Altogether, these results suggest that the model provides a plausible hypothesis for the kind of computations that underlie the use of graphemes in skilled reading.