Moderating role of reading comprehension in children's word learning with context versus pictures



| INTRODUCTION
Vocabulary is an important predictor of reading comprehension abilities across primary school, and reading comprehension in turn is an important skill for vocabulary learning (e.g., Verhoeven, Van Leeuwe, & Vermeer, 2011). As reading comprehension is highly relevant for general future educational success, there is a high need of effective vocabulary instructions at school (Biemiller, 2006). Word learning during text reading becomes more efficient when word meanings are explicitly taught (Marulis & Neuman, 2010). Besides vocabulary breadth (quantity of words in the mental lexicon), specifically vocabulary depth (how much is known) is a significant and unique predictor of reading comprehension (e.g., Cain & Oakhill, 2014). To foster such deeper word knowledge, extended direct vocabulary instruction with richer semantic information has been shown to be effective (e.g., Coyne, McCoach, Loftus, Zipoli, & Kapp, 2009). Such instruction may be especially helpful for children with lower reading comprehension abilities because they have problems in inferring word meanings from context during reading (Cain & Oakhill, 2011). To provide richer semantic information during word learning, often context sentences or pictures are added to definitions, albeit that the effectiveness of the latter is less clear for abstract words (e.g., National Reading Panel, 2000;Sadoski, 2005). So far, the unique benefits of context sentences or pictures over and above definitions on children's word learning have not yet directly been compared with word concreteness and reading comprehension differences being taken into account. In the present study, we, therefore, investigated how adding context sentences or pictures to definitions supports Dutch fourth graders in their direct learning and retention of concrete and abstract words in relation to their reading comprehension.
Children learn new words by inferring their meanings from their daily input without explicitly being taught the meaning (Webb & Nation, 2017). For each word, a phonological, orthographical, and semantic representation is stored in memory, known as lexical representation (Perfetti, 2007, 2017). The quality of lexical representations is determined by the strength of each informational node and their interconnections. In terms of semantic representations, the quality increases with the degree to which new word meanings are integrated in the network of already existent and semantically related lexical representations (Read, 2004). Deep word knowledge contains all semantic features with which words from the same semantic field can be distinguished and includes information about the syntactical and morphological use of a word together with its use in the context of other words (Perfetti, 2007, 2017). The acquisition of deep semantic knowledge is an incremental long-term process, as frequent exposure and variation in context are relevant to construct successively deeper semantic knowledge (Webb & Nation, 2017). Without frequent and salient input along with active processing, children quickly forget newly encountered words (Vlach & Sandhofer, 2012). The complementary learning systems model of memory distinguishes two cognitive stages (Davis & Gaskell, 2009). Directly after the first encounter, the word representation is held as an episodic trace in the hippocampal system, which is bound to the specific occurrence and is still isolated from the already existent knowledge. Next, a more generalized and abstract form is gradually integrated into long-term memory. Such lexical integration, also termed consolidation, requires time and is supported by repeated rehearsal or overnight sleep, as has been evidenced in children (van der Ven, Takashima, Segers, & Verhoeven, 2017).
Explicit teaching of word meanings during reading helps children to learn more words than if word meanings have to be inferred from context (Marulis & Neuman, 2010). A higher intensity of such instruction strengthens this effect not only in terms of the number of learned words but also in the acquired semantic depth and retention (e.g., Beck & McKeown, 2007;Coyne et al., 2009;Laufer & Rozovski-Roitblat, 2015). Instructions for vocabulary breadth rather provide a brief definition, whereas instructions focusing on vocabulary depth involve more frequent encounters in various contexts, activities with a deeper interaction with the word meanings and forms besides discussions about semantic features and the relationship to the already existent knowledge (e.g., Coyne et al., 2009). Examples are the rich vocabulary instruction approach of McKeown and Beck (2004) or the semantic feature analysis described by Pittelman, Heimlich, Berglund, and French (1991).
Providing definitions during explicit vocabulary instruction can be considered the default training form, which is often semantically enriched by adding context sentences and pictures to facilitate the development of deeper semantic knowledge (e.g., National Reading Panel, 2000;Sadoski, 2005). Due to the role of context in word learning, context sentences may naturally provide deeper semantic information than definitions alone (Beck, McKeown, & Kucan, 2013). They are examples for the use of the new word in the context of other words and convey information on the co-occurrence of words as well as morphological and syntactical features (e.g., Bullinaria & Levy, 2007). Associated words will be activated in the mental lexicon while reading the context sentence, and this activation spreads to semantically related concepts (Collins & Loftus, 1975). This may help in linking the new word to the already existent knowledge and in recalling the information (McKenzie & Eichenbaum, 2011). Aligned with this, several studies showed that the combination of definitions and context sentences in explicit first language vocabulary instruction with children or college students led to better learning than providing definitions (Kolich, 1991;Stahl, 1983;Stahl & Fairbanks, 2006) or context sentences only (Bolger, Balass, Landen, & Perfetti, 2008). The former was also the case for children with low vocabulary skills for retention (Nash & Snowling, 2006). In contrast, this contextual benefit was not found in other studies with college students on first language learning (Smith, Stahl, & Neil, 1987) or with adults on foreign language learning (e.g., Golonka et al., 2015;Webb, 2007a).
Besides context sentences, adding pictures to definitions may improve children's word learning according to the cognitive theory of multimedia learning (Mayer, 2014). This theory is based on the idea of dual coding, which states that all knowledge is stored in memory via a verbal and nonverbal code (Paivio, 1991; Sadoski, 2005). Pictures may support mental imagery for word meanings and, therefore, can support word learning and retrieval when added to definitions. Within the cognitive theory of multimedia learning, it is assumed that visual and verbal (auditory) information are processed in different systems of the human working memory and that the amount of information that can be processed within one system is limited (Baddeley, 1999; Sweller, 1994). If information is perceived via text and pictures, the visual and verbal working memory channels can process different information in parallel. Integrating the information from these different sources can lead to better and deeper learning than if information is presented by text or pictures only. This is known as the multimedia effect. Processing all information only via the visual or verbal system may lead to a cognitive overload and may disturb learning.
According to the cognitive theory of multimedia learning, adding pictures to definitions may foster word learning more than adding context sentences to definitions. The benefit of multimedia implementation in instruction for children is widely supported (Bus & Neuman, 2009;Zhou & Yadav, 2017). Also, several studies on word learning with children or college students (Kim & Gilman, 2008;Smith et al., 1987) and adults (Plass, Chun, Mayer, & Leutner, 2003;Shen, 2010) provided evidence for a superior word learning effect, if verbal and visual information were combined during instruction of words compared with when only verbal or visual information was available. This benefit from multimedia use in word learning seems to account especially for children or adults with lower language abilities (Silverman & Hines, 2009;Yun, 2011). Not all studies showed, however, a benefit of adding pictures to verbal information during children's word learning (Acha, 2009;Cohen & Johnson, 2011). Pictures can also distract from storing the word form, as was observed for kindergarteners practicing sight word reading of semantically familiar words (Elliott & Zhang, 1998). To truly enrich the semantic information of definitions, pictures need to provide information of additive value (Carney & Levin, 2002;Kalyuga & Sweller, 2014).
In the reviewed studies on the pictorial support effect, pictures were added to text in a combined form of definitions and context sentences. Furthermore, they were conducted with older children and adults or focused on word learning in a foreign language. The unique benefit of either adding context sentences or pictures to definitions has not yet directly been compared for young children who are learning new words in their first language. Based on the strong evidence for superior learning if text and pictures are combined than from text alone (Bus & Neuman, 2009;Mayer, 2014), adding pictures to definitions may lead to better learning than adding context sentences to definitions or from definitions alone.
The above mentioned studies have not taken individual and word level characteristics into account when they investigated contextual and pictorial support effects in word learning, although this is particularly relevant to develop efficient vocabulary instructions for struggling readers (Elleman, Steacy, Olinghouse, & Compton, 2017). It has been shown that higher reading comprehension abilities led to better word learning from context and definitions (e.g., Bolger et al., 2008;Cain, Oakhill, & Elbro, 2003;Cain, Oakhill, & Lemmon, 2004;Perfetti, Wlotko, & Hart, 2005). This link was explained by lower vocabulary along with lower integration and inference abilities of participants with problems in reading comprehension, which are highly relevant for word learning from context and definitions (Fukkink, 2005;Stafura & Perfetti, 2017). Children who struggle with reading comprehension lack strategic knowledge about how to make use of context and seem to construct less stable memory traces for newly encountered words (Nation & Snowling, 1999;Perfetti et al., 2005). Therefore, pictures might be especially useful to support word learning of low comprehenders.
In terms of word level differences, vocabulary learning with context or pictures may also depend on word concreteness because the representation of concrete and abstract words in the mental lexicon is conceptually different (Hoffman, 2016; Mestres-Missé, Münte, & Rodriguez-Fornells, 2014). Concrete meanings are directly linked to sensory experiences (e.g., shape, taste, and sound) and are based on specific semantic properties. In contrast, abstract meanings are difficult to imagine, are fuzzier, rely more on semantic associations, and vary more across contexts. The dual coding theory (Paivio, 1991; Sadoski, 2005) has specified that only concrete words are stored with a verbal and visual code, whereas abstract meanings are only represented verbally. Therefore, pictures are often not considered useful for abstract word learning. It has been shown, however, that pictures support learning of foreign abstract words in adults, whereas this was not the case for concrete words (Shen, 2010). For concrete words, adults were assumed to activate their own mental pictures. Pictures would then have been redundant for concrete but not for abstract words (e.g., Kalyuga & Sweller, 2014). In terms of the contextual support effect, stronger benefits may be assumed for learning of abstract words than of concrete words based on the context availability theory (e.g., Schwanenflugel & Stowe, 1989). According to this theory, semantic (or contextual) information of abstract words is retrieved more slowly than that of concrete words due to the larger variety of abstract word meanings in different contexts and due to the resulting wider spread of the semantic network. Concrete words are related to fewer semantic concepts, and their contextual information can directly be retrieved from the mental lexicon, which leads to a processing advantage (concreteness effect). This could be eliminated when abstract words were embedded in context sentences (e.g., Schwanenflugel & Stowe, 1989). The contextual benefit was even stronger for abstract than for concrete words in some studies (Hoffman, Jefferies, & Lambon Ralph, 2010), but this was not consistently found (Mestres-Missé et al., 2014).
In all, the unique benefit of context sentences or pictures over and above definitions on children's word learning is not yet clear, especially not when taking word concreteness and reading comprehension differences into consideration. In the present study, we investigated how adding context sentences or pictures to definitions supports primary school children in their first language direct word learning as well as retention of the knowledge for concrete and abstract words. We examined if reading comprehension functioned in these respects as a moderator. Dutch fourth graders learned a set of previously unknown Dutch concrete and abstract words. All participants read a written definition per word once. The context group also read a context sentence, and the picture group additionally received a picture. The control group only saw the word definition. The ability to define the words was measured directly after learning and 1 day later to assess consolidation.
Based on the cognitive theory of multimedia learning (Mayer, 2014), we first expected highest learning gains and retention for the picture group, followed by the context group, and then the control group. Second, we assumed that there were differences between concrete and abstract words across the groups driven by the theory that concrete knowledge is represented by a dual code (verbally and visually), whereas abstract knowledge is coded only verbally and can be facilitated more by contextual information (Paivio, 1991;Sadoski, 2005;Schwanenflugel & Stowe, 1989).
Therefore, we hypothesized that the above mentioned pattern of group differences only holds for concrete words. For abstract words, we expected learning gains and retention to be higher in the context group than in the picture and control groups because abstract meanings are difficult to represent with pictures, whereas the picture and control groups were not expected to differ from each other. Participants with lower reading comprehension abilities are considered to have problems in making use of verbal information in word learning such as context sentences and definitions (e.g., Bolger et al., 2008; Cain et al., 2004). In contrast, information from pictures can directly be retrieved without reading. Therefore, our third hypothesis was that towards lower levels of reading comprehension, the pictorial support effects become stronger, whereas the contextual support effects weaken. In other words, children with lower reading comprehension abilities were hypothesized to perform best in the picture group while performing similarly in the context and control groups.

| Participants
This study was conducted in spring 2017 with 191 Dutch-speaking fourth graders from seven different schools (nine classes) in urban and suburban south-eastern regions of the Netherlands. Two participants were excluded because no data were available from the experiment or on reading comprehension. The final sample of 189 participants had a mean age of 9 years and 10 months (SD = 5.58 months). Participants were allocated randomly within classrooms to one experimental group: context group (males = 32, females = 31), picture group (males = 28, females = 38), or control group (males = 30, females = 30). For some participants, teachers reported use of another or an additional home language besides Dutch (n = 10), diagnosed dyslexia or other language-related problems (n = 17), and developmental problems (n = 8). Those participants were equally spread over the groups and were not excluded from the analysis because this reflects an authentic classroom in the Netherlands and as such provides ecological validity to our study. Passive parental consent was received. Ethical approval was obtained from the Ethics Committee of the Faculty of Social Sciences of our university with the number ECSW2016-2811-448.

| Items
We tested the knowledge of fifth graders on a set of infrequent nouns from a Dutch corpus (Tellings, Hulsbosch, Vermeer, & Bosch, 2014) to choose words that were most likely unfamiliar to fourth graders in our study. This was supplemented by words from an earlier study (van der Ven et al., 2017). The final set consisted of 10 concrete (e.g., fortress) and 10 abstract nouns (e.g., charisma) with a low written lemma frequency (M = 3.45, SD = 5.60, range 0-17, Tellings et al., 2014). Word concreteness was rated dichotomously by two native-speaking authors of this study based on how easily the words could be represented with pictures, as concreteness and imageability are highly correlated (Connell & Lynott, 2012). The concreteness scores retrieved from a large Dutch database (Brysbaert, Stevens, De Deyne, Voorspoels, & Storms, 2014) were significantly higher for concrete than for abstract words, t(13) = −7.33, p < .001. Five concrete words were not listed in this database because we aimed to select very unfamiliar words based on van der Ven et al. (2017). Per word, a written definition was formulated. Additionally, a context sentence and a picture were designed to support the definition (see Appendix A). The context sentence aimed to provide a natural context at sentence level.
The definition and context sentences were based on information provided in an online dictionary (Van Dale, 2017) or a lexicon for children (Winkler Prins, 2012). The length of the definitions (M = 6.45 words, SD = 0.88 words) and context sentences (M = 9.00 words, SD = 1.07 words) were kept equal between items. The pictures were chosen from free online picture sources, the website DigiWak (Stichting Digiwak, 2017), or a child lexicon (Winkler Prins, 2012). For the context group, 2 s after the definition was provided, a green-coloured context sentence was presented below the definition in the same font and size. For the picture group, 2 s after the definition was provided, a picture (800 × 600 pixels) appeared below the definition. In the control group, no additional semantic information was provided. Figure 1 provides a translated example per group.

| Vocabulary experiment
The experimenter instructed the children to learn and remember the meaning of the words because they would be asked to provide it afterwards in their own words. The experiment was preceded by a practice item for a well-known word. The children could go through the words at their own pace and were made aware that it was not possible to go back to restudy a word. The words were taught in two steps of 10 words each with an intermediate break. During the break, children conducted individual tasks to avoid interim discussions about the word meanings. To avoid difficulty differences between the words of each step, the items were divided into two comparable blocks by controlling for concreteness, written lemma frequency, orthographic regularity, and syllable number. The presentation of blocks was counterbalanced across the two steps of the experiment (93 participants started the experiment with word Block 1 and 96 participants with word Block 2). The words within blocks were presented in one of 50 random orders.
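The counterbalancing described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the 50 orders per block are generated once with a seeded shuffle and that participants alternate in which block they start with; the study does not state how orders or starting blocks were assigned, and the function names and placeholder word labels are hypothetical.

```python
import random


def make_random_orders(words, n_orders=50, seed=0):
    """Generate n_orders shuffled copies of a word block (assumed procedure)."""
    rng = random.Random(seed)
    return [rng.sample(words, k=len(words)) for _ in range(n_orders)]


def plan_for_participant(index, orders_block1, orders_block2, rng):
    """Return [step 1 order, step 2 order] for one participant.

    Assumption: even-indexed participants start with Block 1, odd-indexed
    participants with Block 2. Each block order is drawn from its pool.
    """
    if index % 2 == 0:
        first, second = orders_block1, orders_block2
    else:
        first, second = orders_block2, orders_block1
    return [rng.choice(first), rng.choice(second)]


# Placeholder labels standing in for the two blocks of 10 target words.
block1 = [f"word{i}" for i in range(1, 11)]
block2 = [f"word{i}" for i in range(11, 21)]
orders1 = make_random_orders(block1, seed=1)
orders2 = make_random_orders(block2, seed=2)
plan = plan_for_participant(0, orders1, orders2, random.Random(3))
```

Strict alternation would give near-equal numbers of participants per starting block; the reported split of 93 versus 96 suggests the actual assignment was not perfectly balanced.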

| Procedure
In the first session, testing on the control measures and the prior semantic knowledge on the target words took place individually for approximately 20 min. In the second session, participants studied 20 Dutch word meanings via a computerized experiment in groups of at most eight children. To reduce the memory load, the words were taught in two steps of 10 words each with a short intermediate break. Directly after each step, the acquired semantic knowledge was tested (posttest). All children were tested again in class 1 day after the experiment (retention test). The second and third sessions each took about 20 min. Each session was conducted on a separate day, and all sessions took place within a maximum of 2 weeks. Data were collected by three experimenters.

Visual memory
The Corsi block-tapping task, as described in Kessels, Van Zandvoort, Postma, Kapelle, and De Haan (2000), was conducted to assess visual memory. The experimenter tapped on wooden blocks in a certain order with an increasing number of blocks ranging from two to nine.

Verbal memory
A digit span task, taken from the Dutch translation of the Wechsler Intelligence Scale for Children (WISC-III NL, Kort et al., 2005), was used to measure verbal memory. Participants had to repeat auditorily presented sequences of digits of increasing length in the same or reversed order. Per digit span, ranging from two to nine, two trials were conducted. The test was stopped after two incorrect trials of the same length. Cronbach's alpha of the task was on average .64, which indicated a low but acceptable reliability considering the low number of items (Kort et al., 2005; Pallant, 2013).

Reading fluency
An isolated Dutch word reading task (1-min test, Brus & Voeten, 1979, Form B) was used to indicate reading fluency. Children had to read words of different length from a list as quickly and accurately as possible. The test score represented the total number of words that were read correctly within 1 min. As the nature of the test does not allow calculating values of internal reliability, Brus and Voeten (1979) provided alternative measures. The correlation between parallel test forms (Forms A and B) at the same measurement moment was .92 to .94, and the test-retest reliability was .85 to .87 for the same form or parallel forms.

FIGURE 1 Example, translated into English, of the semantic information provided during the vocabulary experiment in each group for a concrete word (left) and an abstract word (right). The original font size and distances between units as presented on the computer screen were adjusted for this illustration.

Vocabulary
The schools provided us with vocabulary scores of a nationally standardized test for 122 participants (Reading Vocabulary; Verhoeven & Vermeer, 1999). Those tests are used by 95% of Dutch schools to monitor learning progress (Cito, 2018a, 2018b). For the remaining 67 children, we administered a standardized vocabulary subtest from the Dutch test battery Language Test for Allochthone Children (Reading Vocabulary; Verhoeven & Vermeer, 1993). In both tests, children had to choose from four options the correct definition or synonym for a target word, which was embedded in a sentence. The tests consisted of the same items but were scored on a different scale. The predicted passive vocabulary breadth scores from the test manuals, therefore, were used as a common vocabulary measure for the whole sample.

| Reading comprehension
We received reading comprehension scores from nationally standardized tests from the schools (Feenstra, Kleintjes, Kamphuis, & Krom, 2010; Staphorsius & Krom, 1998). In those tests, children had to read short texts and answer multiple choice questions. The scores were derived from two different test editions widely used in the Netherlands for monitoring students' learning progress (Cito, 2018a, 2018b). Their scores were comparable because they were based on the same ability scale with item response theory (Tomesen, Weekers, Hiddink, & Jolink, 2017). For both test versions, MAcc was higher or equal to .89 (Staphorsius, Krom, Kleintjes, & Verhelst, 2004; Tomesen et al., 2017).

| Semantic knowledge
At pretest, we checked the prior knowledge of children on the items from the experiment. The experimenter read the words aloud at a speed of one word per second, while the participants listened and read along silently. In this way, children were familiarized with the phonological word form, and influences of decoding problems were kept to a minimum. If participants knew a word meaning, they rang a bell, explained the meaning, and wrote it down on paper. Beforehand, a practice item was conducted to explain how words can be defined.
At posttest, directly after each word block, children wrote down definitions on a sheet of paper where all word forms of the respective word block were listed besides a practice item. The same procedure was followed at retention test, except that words were not separately pre- and for the retention test .70, which is considered an acceptable reliability (Pallant, 2013). To establish an interrater reliability measure, the three experimenters rated independently from each other the provided word definitions at posttest for a subsample of 40 participants.
The interrater reliability was considered high because the inter-item correlation between the raters ranged from .90 to .93 and the intraclass correlation was .97 (Field, 2009).
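As a rough illustration of the inter-rater correlations reported above, the sketch below computes a Pearson correlation for every pair of raters. Treating the reported inter-item correlation as a Pearson correlation is an assumption, the intraclass correlation is not reproduced here, and the rater labels and scores are made-up placeholders rather than study data.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long score lists (non-constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def pairwise_rater_correlations(ratings):
    """Map each pair of rater names to the correlation of their scores."""
    return {
        (r1, r2): pearson(ratings[r1], ratings[r2])
        for r1, r2 in combinations(sorted(ratings), 2)
    }


# Hypothetical sum scores given by three raters to the same five children.
example = {
    "rater_a": [4, 7, 2, 9, 5],
    "rater_b": [5, 7, 2, 8, 5],
    "rater_c": [4, 6, 3, 9, 6],
}
correlations = pairwise_rater_correlations(example)
```

With three raters this yields three pairwise values, which matches the form in which a range such as .90 to .93 would be reported.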

| Data analysis
We used the sum scores of the semantic knowledge at each measurement for the analyses. As they deviated from the normal distribution (skewness or kurtosis divided by their standard errors exceeding the absolute value of 2), we used square root transformed sum scores for the analyses (Field, 2009; Tabachnik & Fidell, 2013). After the transformation, the posttest and retention test scores were normally distributed, but this could not be reached for the pretest scores. The strong deviation arose because some children knew a few words at pretest, whereas most children knew none. We, therefore, decided to use gain scores from pretest to posttest in the analyses of learning gains (normally or close to normally distributed) to control for prior knowledge. The change in the semantic knowledge from posttest to retention test was analysed separately as forgetting effect. To answer each research question separately, we conducted several analysis steps with the relevant variables taken into account, as described below. When the moderating role of reading comprehension was investigated, reading fluency was added as covariate to control for decoding abilities. If not further specified, assumptions of the statistical analyses were met. Reading comprehension and reading fluency were centred for the analyses. The analyses were conducted in SPSS Version 23, and figures were created in R with the package jtools (Long, 2018).
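The preprocessing steps named above (normality screening, square root transformation, gain scores, and centring) can be sketched as follows. This is a minimal illustration, not the SPSS procedure itself: the skewness and standard error formulas are the common sample-based ones, the function names are hypothetical, and the text does not say whether gains were computed on raw or transformed scores, so `gain_scores` simply subtracts whatever values it is given.

```python
import math


def skewness_z(scores):
    """Sample skewness divided by its standard error."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
    skew = n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in scores)
    se = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew / se


def sqrt_transform(scores):
    """Square root transformation for non-negative sum scores."""
    return [math.sqrt(x) for x in scores]


def gain_scores(pre, post):
    """Posttest minus pretest score per participant."""
    return [after - before for before, after in zip(pre, post)]


def centre(values):
    """Subtract the sample mean so that the centred values average zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical pretest sum scores: most children know none of the words.
pretest = [0, 0, 0, 0, 1, 9]
needs_attention = abs(skewness_z(pretest)) > 2  # criterion used in the text
```

A floor-heavy distribution like the placeholder pretest scores is the kind of pattern that a square root transformation typically cannot normalize, which is consistent with the decision to analyse gain scores instead.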

| Descriptive statistics and group differences on the control variables
The participants in the context, picture, and control group did not dif-  Table 1 presents an overview of the mean sum scores of the semantic knowledge.

| Research Question 1: Group differences in the learning gain and forgetting effect
To investigate group differences in the learning gain, we conducted a one-way analysis of variance (ANOVA) with the gain scores of all words and group as between-subject factor (context, picture, and control). The results are presented in Table 2. The groups differed in learning gain, as there was a significant, albeit small, main effect of group.
Bonferroni corrected post hoc tests revealed that participants in the picture group had higher learning gains than participants in the context group (see Appendix C, Figure C1). Other group differences were not significant.
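For readers who want to see what the one-way ANOVA on gain scores computes, the sketch below derives the F ratio and eta squared from sums of squares. It is an illustrative re-implementation with made-up numbers, not the SPSS output; p values and the Bonferroni corrected post hoc tests are omitted, and the function name is hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared).

    groups maps a group label to the list of gain scores in that group.
    """
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_between = 0.0
    ss_within = 0.0
    for scores in groups.values():
        group_mean = sum(scores) / len(scores)
        ss_between += len(scores) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in scores)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_ratio, df_between, df_within, eta_squared


# Hypothetical gain scores, three children per group.
example = {
    "context": [1.0, 2.0, 3.0],
    "picture": [3.0, 4.0, 5.0],
    "control": [2.0, 3.0, 4.0],
}
f_ratio, df_b, df_w, eta_sq = one_way_anova(example)
```

In a one-way design, eta squared and partial eta squared coincide, because there are no other effects whose variance could be partialled out.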
Second, we examined the forgetting effect of all words with a repeated measures ANOVA with time as within-subject factor (posttest and retention test) and group as between-subject factor (context, picture, and control). The forgetting effect did not differ between the groups, as the interaction effect between time and group was not significant (see Table 2).

| Research Question 2: Differences in learning concrete or abstract words
A repeated measures ANOVA was performed with the gain scores of concrete and abstract words to further specify how the previously found main effect of group for the learning gain of all words varied across word concreteness. Group functioned as between-subject factor (context, picture, and control) and word concreteness as within-subject factor (concrete and abstract). The learning gain did not differ between concrete and abstract words as the interaction effect between group and word concreteness was not significant. The main effect of group and respective group differences remained the same.
The results are presented in Table 3.
To investigate group differences in the forgetting effect for concrete and abstract words, we conducted a repeated measures ANOVA with time (posttest and retention test) and word concreteness (concrete and abstract) as within-subject factors. Group was used as between-subject factor (context, picture, and control). The forgetting effect of concrete or abstract words did not differ between the groups as the three-way interaction effect between time, group, and word concreteness was not significant (see Table 3). The interaction effect between time and group became significant when taking word concreteness into account.
To better understand the significant interaction effect between time and group after taking word concreteness into account, we tested the interaction effect between time and group for each group contrast separately. This revealed that participants in the context group retained the semantic knowledge better than participants in the picture group and control group. Those were small to medium effects (partial η² = .05 to .07). No differences were observed between participants in the picture and control group. Levene's tests for homogeneity of variance were significant for the comparison of the posttest

Note. n items = 20, containing 10 abstract words and 10 concrete words.

Note. Gain scores are the difference between pretest and posttest scores. The forgetting effect is the change of the semantic knowledge from posttest to retention test. Levels of time: posttest and retention test; levels of group: context group, picture group, and control group. The significance of group contrasts was tested with Bonferroni corrected post hoc tests for main effects and with follow-up tests per group contrast for interaction effects. n.a., post hoc tests not available or not applied when the main or interaction effects were not significant.

| Research Question 3: Moderating role of reading comprehension
Finally, we examined the moderating role of reading comprehension for the learning gain and forgetting effect of the total semantic knowledge by controlling for reading fluency. First, we conducted a univariate general linear model for the gain scores of all words with group as between-subject factor (context, picture, and control). We added reading comprehension as a main effect and the interaction between group and reading comprehension to the model. The results are presented in Table 4. Reading comprehension did not moderate the learning gain, as the interaction effect between group and reading comprehension was not significant. The main effect of group and respective group differences remained the same.
Second, to examine the forgetting effect, we conducted a repeated measures ANOVA with time as within-subject factor (posttest and retention test) and group as between-subject factor (context, picture, and control). We tested specifically for the interaction effect between time, group, and reading comprehension and entered the underlying main and two-way interaction effects to the model. Reading comprehension moderated the forgetting effect, as the interaction effect between time, group, and reading comprehension was significant (see Table 4).
To better understand the moderating role of reading comprehension for the forgetting effect, we examined the interaction effect between time and group (context, picture, and control) separately for the two extreme reading comprehension groups (the 33% of participants with the lowest reading comprehension test scores and the 33% with the highest scores), with otherwise only the underlying main effects and reading fluency in the model. A similar approach was followed by Plass et al. (2003). We considered participants with reading comprehension scores lower than or equal to 159 as low comprehenders (context group n = 20, picture group n = 19, and control group n = 22) and participants with scores higher than or equal to 181 as good comprehenders (context group n = 20, picture group n = 21, and control group n = 28). The interaction effect between time and group was only significant for low comprehenders. Low comprehenders in the control group forgot more than low comprehenders in the context group and more than their peers in the picture group. The group differences were medium to large (ηp² = .11 to .15). The forgetting effect did not differ between low comprehenders in the context and picture group.
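The extreme-groups split described above can be sketched as follows. The two cutoffs (159 and 181) are the values reported in the text; the function names, the data layout, and the example participants are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

LOW_CUTOFF = 159   # scores <= 159: low comprehenders (cutoff reported in the text)
GOOD_CUTOFF = 181  # scores >= 181: good comprehenders (cutoff reported in the text)


def comprehension_level(score):
    """Return 'low', 'good', or None for the middle range that is left out."""
    if score <= LOW_CUTOFF:
        return "low"
    if score >= GOOD_CUTOFF:
        return "good"
    return None


def split_extreme_groups(participants):
    """Group child ids by (comprehension level, instruction group).

    `participants` is an iterable of (child_id, group, score) tuples;
    children in the middle range are dropped.
    """
    cells = defaultdict(list)
    for child_id, group, score in participants:
        level = comprehension_level(score)
        if level is not None:
            cells[(level, group)].append(child_id)
    return dict(cells)


# Hypothetical participants (illustration only).
example = [
    ("c01", "context", 150),
    ("c02", "picture", 170),  # middle range, excluded
    ("c03", "control", 181),
    ("c04", "context", 159),
]
cells = split_extreme_groups(example)
```

Each resulting cell corresponds to one of the subgroup sizes listed above, and the time by group interaction would then be tested within the low and the good comprehenders separately.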

| DISCUSSION
This study investigated how adding pictures or context sentences to definitions during explicit teaching of concrete and abstract word meanings benefitted the direct word learning and retention of Dutch fourth graders. We further examined the role of reading comprehension as a moderator of these effects.

Note. Gain scores are the difference between pretest and posttest scores. The forgetting effect is the change of the semantic knowledge from posttest to retention test. Levels of time: posttest and retention test; levels of group: context group, picture group, and control group; levels of word concreteness: concrete words and abstract words. The significance of group contrasts was tested with Bonferroni corrected post hoc tests for main effects and with follow-up tests per group contrast for interaction effects. n.a., post hoc tests not available or not applied when the main or interaction effects were not significant.
The picture group showed higher learning gains from pretest to posttest than the context group, but neither experimental group differed from the control group. The pictorial support effect disappeared over time, as the groups no longer differed significantly 1 day after learning. When word concreteness was taken into account, it was revealed that participants in the context group retained the meaning of both concrete and abstract words better from posttest to retention test than participants in the picture and control group. Reading comprehension did not affect the learning gain but did impact the forgetting effect. Both the pictorial and contextual support effect for the knowledge retention became stronger towards lower levels of reading comprehension as compared with the control group. No differences were observed for children with higher reading comprehension abilities.

Note. Gain scores are the difference between pretest and posttest scores. The forgetting effect is the change of the semantic knowledge from posttest to retention test. Levels of time: posttest and retention test; levels of group: context, picture, and control. Reading comprehension and reading fluency were continuous variables and were centred for the analyses. The significance of group contrasts was tested with Bonferroni corrected post hoc tests for main effects and with follow-up tests for interaction effects per group contrast. n.a., post hoc tests not available or not applied when the main or interaction effects were not significant. Abbreviations: RC, reading comprehension; RF, reading fluency. Low RC, low comprehenders (context group n = 20, picture group n = 19, and control group n = 22) with reading comprehension scores lower than or equal to 159. Good RC, good comprehenders (context group n = 20, picture group n = 21, and control group n = 28) with reading comprehension scores higher than or equal to 181.

FIGURE 2
Mean sum scores of the semantic knowledge at posttest and retention test (square root transformed) differentiated for participants with good reading comprehension (left) and low reading comprehension (right). *p < .05

| Pictorial support effects
That adding pictures to definitions in contrast to adding context sentences to definitions benefitted the direct learning gain of all learners can be explained by the cognitive theory of multimedia learning (Mayer, 2014). Aligned with this theory and the descriptive results, it could be speculated that context sentences tended to have a distracting effect, which would, however, require further research.
Similar pictorial support effects were reported by Smith et al. (1987) and by Kim and Gilman (2008) for word learning when pictures were added to definitions and context sentences.
In contrast to the studies of Smith et al. (1987) and Kim and Gilman (2008), the pictorial support effect did not persist over time. This might be related to the low teaching intensity because the children only went through the definitions once, whereas restudying was possible in the studies of Smith et al. (1987) and Kim and Gilman (2008). It has also been shown that overconfidence in the mnemonic power of pictures may undermine pictorial support effects, as undergraduate students only learned new foreign words better with pictures than with translations when this overconfidence was eliminated (Carpenter & Olson, 2012). Furthermore, the studies of Smith et al. (1987) and Kim and Gilman (2008) were conducted with older participants or on foreign language learning. Older participants may differ in their learning abilities, memory, and retrieval capacities from our younger participants, whose working memory system is still under development (Baddeley, Gathercole, & Papagno, 1998; Mann, Newhouse, Pagram, Campbell, & Schulz, 2002). It is also widely supported that foreign or second language learning is not comparable with first language learning (e.g., Choi, 2016; Meisel, 2011).
The finding that the pictorial support effect on knowledge retention persisted over time for participants with low reading comprehension abilities, but not for those with good reading comprehension abilities, is in line with previous studies that concluded that multimedia instruction is especially useful for low ability learners (Silverman & Hines, 2009; Yun, 2011). Good comprehenders might already activate broader semantic information when reading the definitions, so that pictures probably did not provide sufficient additional semantic information (Kalyuga & Sweller, 2014). Thus, the effect may disappear more quickly over time. Our results did not support the assumption of Plass et al. (2003) that learners with lower verbal abilities lack cognitive resources to integrate the visual and verbal information. In contrast to the task used by Plass et al. (2003), a complex integration of verbal information during word learning was not strictly required in our self-paced experiment (Mayer, 2014; Sweller, 1994).

| Contextual support effect
Although there is plenty of evidence for contextual support effects on children's word learning (e.g., Stahl, 1983; Stahl & Fairbanks, 2006), context effects in our study did not reach significance for the direct word learning and retention effect. It has been shown that more repetitions or exposure to various context sentences resulted in better word learning (Bolger et al., 2008; Stahl & Fairbanks, 2006; Webb, 2007b) and that context effects may only become evident in the long term (Nash & Snowling, 2006). Other explanations could be that the context sentences did not sufficiently enrich the semantic information of the definitions or that the participants did not actively process the information. Because the definitions of the words were provided simultaneously with the context sentences, meaning inferences were not required. Previous studies that found a context effect required inferences from context or involved other meaning related activities (e.g., Coyne et al., 2009; Nash & Snowling, 2006; Stahl, 1983). Also, Bolger et al. (2008) found that the variability of context sentences did not matter for word learning when definitions were simultaneously available.
We found a contextual support effect for participants with low reading comprehension abilities, despite their problems in making use of contextual information in word learning after controlling for decoding abilities. This, however, only applied to the knowledge retention and not to the direct learning effect. Therefore, context sentences might have supported low comprehenders in lexical integration or retrieval rather than in the enrichment of the provided semantic information. Contextual information spreads the activation to related semantic concepts, which may foster consolidation (Collins & Loftus, 1975; McKenzie & Eichenbaum, 2011 Nash and Snowling (2006). Related to the complementary learning systems model (Davis & Gaskell, 2009), low comprehenders might also have difficulties in integrating a more generalized representation into their mental lexicon, and therefore, context sentences may have facilitated retrieval. Their representations might in general be fuzzier, which is why additional contextual information helps in processing, as is assumed for abstract words in the context availability theory (Schwanenflugel & Stowe, 1989). It could also be speculated that low comprehenders used the context sentences as a mnemonic for retrieval by spontaneously imagining pictures resembling the context sentences during the learning phase.
In one word learning strategy, the so-called keyword strategy, pictures or the creation of mental images serve as a mnemonic by representing the link between the meaning of a new word and an already familiar, similar sounding word (e.g., Mastropieri, Scruggs, & Mushinski Fulk, 1990). Previous studies showed that children were able to imagine pictures for word meanings via context sentences (Cohen & Johnson, 2011; Elliott & Zhang, 1998). In those studies, however, the children received instructions.
It remains unclear why context sentences enhanced neither the direct learning nor the retention effect of good comprehenders. Possibly, they activated related words automatically on their own when reading the definitions and did not consider context sentences as relevant additional information, which could have masked the contextual support effect (Kalyuga & Sweller, 2014). Previous research also showed that richer contextual information makes it easier to guess the word meaning during learning, which may cause less thorough processing and thereby worse retention (e.g., Bjork & Kroll, 2015; Golonka et al., 2015; Mondria & Wit-de Boer, 1991).

| Effect of word concreteness
Taking word concreteness into account further specified the results. First, the pictorial support effect for direct word learning seemed to be valid for both concrete and abstract words. Second, there was a general contextual support effect for the retention of the meanings of all words, as the context group forgot less from posttest to retention test than participants in the picture or control group.
That the pictorial support effect did not differ between concrete and abstract words in the direct learning gain contradicts the assumption that concrete words are better taught with pictures and abstract meanings better verbally (e.g., Sadoski, 2005). Other studies with children that used mixed types of words also found a benefit of pictures in word learning (Kim & Gilman, 2008; Smith et al., 1987). Aligned with the findings of Shen (2010) for adults, the fact that there is a similar benefit for abstract words may result from the additional visual support, which is naturally not evoked when implementing new lexical representations for abstract word meanings (Hoffman, 2016; Mestres-Missé et al., 2014). Similarly, children with learning disabilities also learned concrete and abstract words better with pictures via the keyword strategy than with definition rehearsal (Mastropieri et al., 1990). In contrast to the adults in the study of Shen (2010) on foreign language learning, children may not automatically imagine pictures for concrete and abstract words as the words were unfamiliar and not linked to a translation, which is why pictures were beneficial for both types of words.
Word concreteness did not influence the group differences in the retention of the knowledge. Taking word concreteness into account, however, revealed better retention in the context group than in the picture or control group for both concrete and abstract word meanings. As argued for low comprehenders, context sentences may also have benefitted the consolidation or retrieval of newly acquired representations from the mental lexicon or could have been used as a mnemonic. Although concrete words were overall learned and retained better than abstract words in our study, no differences were observed in the contextual support effect. Following the context availability theory (Schwanenflugel & Stowe, 1989), additional contextual information might have benefitted retrieval because the representations of both types of words were still impoverished after the short instructional period. In terms of the mnemonic function of context sentences, it could be speculated that context sentences may require the inherent activation of pictures for both concrete and abstract words, whereas real pictures only supported direct learning of abstract words. As a result, the effort during learning may not differ between concrete and abstract words, leading to better retention for both (Bjork & Kroll, 2015).

| Limitations and future research suggestions
The first limitation of our study is the low teaching intensity, which might have masked group differences. The effects were small, possibly as a result of the low learning gains, and need to be confirmed by future investigations. Second, we did not control the time each child spent on studying the words. Therefore, it is unclear whether each child actively perceived all the provided information. Finally, we did not explicitly draw the attention of the participants to the link between the meaning and the word forms, which is, however, relevant for successful word learning (Perfetti, 2017; Perfetti et al., 2005). In this respect, the similarity between the word forms of some items could have contributed to confusion. Future investigations should take the above-mentioned aspects into account, as well as other word and child characteristics such as differences between nouns, verbs, and adjectives or differences between learning styles (e.g., Blazhenkova, Becker, & Kozhevnikov, 2011; Elleman et al., 2017). Although this study explored the moderating role of reading comprehension, future research should consider the specific group of children with low reading comprehension abilities but intact decoding skills (poor comprehenders) as well as the heterogeneity within this group to develop efficient vocabulary instructions (e.g., Colenbrander, Kohnen, Smith-Lock, & Nickels, 2016). More repetitions, greater learner involvement, and a smaller word set could intensify the experimental conditions, which could lead to larger effects (Beck & McKeown, 2007; Coyne et al., 2009; Laufer & Rozovski-Roitblat, 2015). To facilitate contextual support effects, the learner involvement and effort could be increased by asking participants to infer the meaning from context prior to the provision of the definition (Bjork & Kroll, 2015; Golonka et al., 2015; Laufer & Rozovski-Roitblat, 2015). This could be further supported by offering various context sentences (Bolger et al., 2008).
To differentiate between overnight learning and long-term retention, future studies should include longer follow-up periods. With respect to the depth of the acquired semantic representations, transfer to reading comprehension should also be considered (e.g., Cain & Oakhill, 2014). It would be interesting to further investigate contextual support effects when children are instructed to imagine pictures resembling the context sentences. This ability might differ between children with various reading comprehension abilities as well as between concrete and abstract words.

| CONCLUSION AND IMPLICATIONS
Adding pictures to definitions enhanced the direct word learning of children with various reading comprehension abilities in the short term more than adding context sentences did. Although this effect disappeared over time, pictures fostered the retention of the acquired knowledge for low comprehenders over and above definitions. Context sentences were not supportive during direct word learning but seemed to help all learners to at least retain what they had acquired during the experiment, particularly children with low reading comprehension abilities.
Although a pictorial support effect over and above definitions was only found for low comprehenders in the retention of the knowledge, the fact that pictures supported direct word learning better than context sentences underlines the usefulness of providing pictures during explicit teaching of new vocabulary in a first language learning setting and in classrooms with large individual differences between children. This also highlights the broad utility of multimedia instruction at school.
Our study results should encourage teachers to provide rich semantic information during explicit vocabulary instruction at school, which is especially relevant for children with low reading comprehension abilities due to their problems in learning words naturally from context.

DATA ACCESSIBILITY STATEMENT
Data available on request from the authors.

Little dolls often used as talisman in the Netherlands

Spant / Timber
Balken van een dak- of scheepsgeraamte / Bar of the timberwork from a roof or ship
Het oude schip had houten spanten en een zeil. / The old ship had wooden timber and a rope.
Timberwork of a boat

Majuskel / Majuscule
Name of the capital letters in old scripts
In het oude boek waren vooral de majuskels versierd. / In the old book, mainly the majuscules were ornate.

Folding seat in a vehicle
Neerklapbare of vouwbare zitting in een rijtuig / Seat in a vehicle that can be folded or collapsed
In de bus was alleen een plek op een strapontijn vrij. / In the bus, was only a place on a folding seat available.
Folding seats in a bus

Contour / Contour
Geschetste vorm of omtreklijn van iets / Approximated form or outline shape of something
Hij ziet de contour van zijn lichaam in de schaduw. / He sees the contour of his body in the shadow.
Contour line of a dancer

Flank / Flank
Zijkant van een voorwerp of levend wezen / Side of an object or creature
Hij drukte zijn hak in de flank van het paard. / He pressed his heels in the flank of the horse.
Horse from side perspective
Note. The English translation is provided in italics.

PARTICIPANT INFORMATION AND COMPARISON OF GROUPS (CONTEXT, PICTURE, AND CONTROL)
Note. Standard deviations are presented within parentheses. Maximum test scores are presented within square brackets. The group differences were tested with one-way ANOVAs (F). If the assumptions of one-way ANOVAs were not met, we used Kruskal-Wallis tests (χ²; Field, 2009).
a Test from Verhoeven and Vermeer (1999) as implemented in the Dutch schools' pupil and instruction monitoring system (Cito, 2018a, 2018b).
b Dutch standardized test from Verhoeven and Vermeer (1993), which is similar to the test described in a.
c Predicted from vocabulary tests a and b, see test manuals for further details.
d Test editions from Feenstra et al. (2010) and Staphorsius and Krom (1998) as implemented in the Dutch schools' pupil and instruction monitoring system (Cito, 2018a, 2018b).

APPENDIX C. ADDITIONAL FIGURES

FIGURE C1
Mean sum scores of the semantic knowledge (square root transformed) for each group over time