Protocol: Preschool Predictors of Later Reading Comprehension Ability: A Systematic Review

screening – February 2015
Full text screening – March 2015
Data extraction – May 2015


BACKGROUND
In today's technical and knowledge-driven society, it is paramount to be able to read well enough to acquire school-related knowledge and, later in life, to obtain and maintain a job. Longitudinal studies that follow typical children's language and reading skills over time can contribute to our knowledge of children's development, including the influence of preschool language skills on later reading ability. Such findings are of practical significance, as they have direct implications for how to best prepare children from an early age for later reading instruction.
Preventive school-based efforts must build upon insights into developmental variation in language acquisition. With this knowledge, we have the potential to recognize the signs of delayed or divergent development. When a child shows early signs of poor language development, we can, with more certainty, put in additional and focused efforts to help prevent later reading struggles. To identify such struggles, it is essential to research indications of different language and environmental markers that can be predictors of later reading skills. Although there is a relatively well-documented understanding of the different language skills underlying children's abilities to learn to read, there is still a need for further research to both support and challenge findings in similar longitudinal studies.

A simple and augmented view of reading
The goal of school-based reading instruction is reading fluency and comprehension. Gough and Tunmer (1986) describe a "simple view of reading" as two equally important abilities that are needed to comprehend what is read: decoding and linguistic comprehension. Linguistic comprehension and decoding are two distinct and necessary processes that simultaneously affect and are dependent on one another for positive reading development (Bloom & Lahey, 1978). For this simple view, Hoover and Gough (1990) defined decoding as efficient word recognition: "the ability to rapidly derive a representation from printed input that allows access to the appropriate entry in the mental lexicon, and thus, the retrieval of semantic information on the word level" (p. 130). Linguistic comprehension is defined as "the ability to take lexical information (i.e., semantic information at the word level) and derive sentence and discourse interpretations" (Hoover & Gough, 1990, p. 131). Reading comprehension involves the same ability as linguistic comprehension, but it also relies on graphic-based information arriving through the eye (Hoover & Gough, 1990). Individual differences in reading achievement are often understood as the product of these two parameters: decoding and linguistic comprehension (Gough & Tunmer, 1986). It is important to note that this "simple view" does not deny that capacities such as phonemic awareness, vocabulary knowledge, or orthographic awareness are important to reading; rather, it suggests that they are sub-skills of decoding and/or linguistic comprehension (Conners, 2009). Because the two parameters (decoding and linguistic comprehension) and the underlying factors simultaneously affect one another, fully disentangling the two skills is problematic. These skills are highly interrelated (Clarke, Truelove, Hulme, & Snowling, 2014).
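The product formulation of the simple view can be written compactly. As a sketch following Gough and Tunmer (1986), let R denote reading comprehension, D decoding, and C linguistic comprehension, each scaled from 0 (nullity) to 1 (perfection):

```latex
R = D \times C, \qquad 0 \le D \le 1, \quad 0 \le C \le 1
```

Because the terms multiply rather than add, R = 0 whenever D = 0 or C = 0. This is the sense in which both abilities are necessary and neither is sufficient on its own.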
Although there is support for the "simple view of reading," there are also researchers who argue that additional components are needed in this model (Chen & Vellutino, 1997; Conners, 2009; Hoover & Gough, 1990). Longitudinal studies provide support for an augmented model (Geva & Farnia, 2012; Johnston & Kirby, 2006; Oakhill & Cain, 2012), derived from the remaining variation in reading ability that cannot be explained within the simple view model. In general, the model is augmented through the inclusion of additional domain-general cognitive skills. These cognitive processes make significant contributions to reading comprehension beyond word recognition and linguistic comprehension. The augmented view of reading suggests that there needs to be a wider perspective on reading development that explores how different linguistic and cognitive processes affect and have longitudinal contributions to reading comprehension. Although there is a relatively well-documented understanding of the different language skills underlying children's abilities to learn to read, there is still a need for further research to both support and challenge findings in comparable studies.
The next section will further address what earlier research has found to be the most influential predictors of these three main dimensions: decoding, linguistic comprehension, and domain-general cognitive skills. As stated above, these components and predictors are to a large degree interrelated, which makes examining the predictors of these three dimensions separately somewhat problematic. For instance, some of the predictors may have an influence on more than one factor related to later reading. Furthermore, the three constructs (decoding, linguistic comprehension, and domain-general cognitive skills) are organized conceptually, but also in a way that simplifies the structure to fit in the model that is employed. In addition, this simple structure also works best for analyzing these important relationships empirically. We hope to further explore these issues in the analysis and subsequently in the final report.
It is also important to consider the longitudinal aspect of reading. Different factors and abilities make significant contributions at different times in the development process. In the beginning, when the child learns to match sounds to letters, phonological awareness, letter knowledge and naming speed have been shown to be important. Later, when decoding has become automatized, capacities are freed up for the linguistic comprehension components. The present review will include studies that have measured reading comprehension abilities at different ages. Some studies may have assessed reading comprehension in second grade, while others have assessed it in tenth grade. Thus, decoding ability may, to varying degrees, be a factor, depending on the children's exposure to and amount of experience with reading.

Preschool predictors of decoding
Before children learn to decode, there are three key components (precursors) that are of particular importance. Phonological awareness, letter knowledge and rapid automatized naming (RAN) all play a key role when children try to figure out the alphabetical code, matching the corresponding sound (phoneme) to the letter. In a two-year large-scale longitudinal study, Lervåg, Bråten and Hulme (2009) found that phoneme awareness, letter-sound knowledge, and non-alphanumeric RAN, which were measured four times, beginning 10 months before reading instruction began, each made a unique contribution to the prediction of the growth of word recognition skills in the early stages of development.
The strong connection between phonological awareness and reading development has been established among researchers (Hatcher, Hulme & Snowling, 2004; Høien & Lundberg, 2000; Lundberg, Frost & Petersen, 1988; Melby-Lervåg & Lervåg, 2011). In many ways, letter knowledge is one of the main components of alphabetical reading. For instance, Muter, Hulme, Snowling and Stevenson (2004) reported that letter knowledge measured at school entry was a powerful longitudinal predictor of early decoding ability. Together with phonological awareness, letter knowledge assessed at school entry explained a total of 54% of the variance in decoding ability one year later. Consequently, if a child struggles with these skills, it may be an early sign of reading difficulties. Later, when the alphabetical code is solved, the child will learn to recognize frequent patterns of letters in words. Generally, this pattern recognition will contribute to the faster decoding of known words and clusters of letter combinations and thus help the child achieve a faster reading speed (Johnston & Kirby, 2006).
RAN, or naming speed, refers to the speed at which one can identify known symbols, numbers or letters. Wolf, Bowers and Biddle (2000) argue that naming speed "represents a demanding array of attentional, perceptual, conceptual, memory, lexical, and articulatory processes" (p. 19). The hypothesis raised by Wolf and colleagues is that the ability to name symbols rapidly contributes to a quicker recognition of the orthographical patterns in a text. Johnston and Kirby (2006) found that, although the unique contribution of naming speed was relatively small, naming speed contributed primarily in terms of word recognition. They also acknowledge that once the word-recognition component is included, naming speed has little more to contribute to reading comprehension. Lervåg and Hulme (2009) argued that variations in non-alphabetic naming speed, phonological awareness, and letter knowledge measured before school entry are strong predictors of variations in later reading fluency. Although there is support for these connections, there is still uncertainty as to how these different parameters are interrelated with each other and with reading comprehension. Accordingly, the present systematic review will include all three abilities as predictors, because each makes a unique contribution to predicting later decoding abilities.

Preschool predictors of linguistic comprehension
Broad language skills are paramount to good reading comprehension (Carroll, 2011). To comprehend what one reads, one has to understand the language in its spoken form (Cain & Oakhill, 2007). This relationship changes with age. Cain and Oakhill (2007) refer to longitudinal studies that show that correlations between reading and linguistic comprehension (e.g., listening comprehension) are generally low in beginning readers, but these correlations gradually increase when decoding differences are low. Cain and Oakhill (2007) point to vocabulary and grammar as aspects of language that are likely to influence reading development. First, vocabulary knowledge is likely to have an impact both on learning to recognize individual words and on text comprehension skills (Cain & Oakhill, 2007). Second, grammatical abilities may also aid word recognition through the use of context, thus contributing to the development of reading comprehension (Cain & Oakhill, 2007).
A child's vocabulary consists of the words that the child is familiar with in the language. Vocabulary is the dimension of language that correlates the strongest with reading comprehension and has been the focus of much research (Biemiller, 2003; Dickinson & Tabors, 2001; Ouellette, 2006; Walley, Metsala, & Garlock, 2003). A child's early vocabulary predicts later reading development, especially reading comprehension development (Biemiller, 2003, 2006; Lervåg & Aukrust, 2010; National Reading Panel, 2000). The considerable contribution of vocabulary to reading development emphasizes the need for studies with a special focus on vocabulary and reading comprehension development. Although there is support for this strong connection, there is still uncertainty about how decoding and vocabulary are interrelated with reading comprehension (Ouellette, 2006). Systematic reviews that explore findings across an array of studies from different countries and different languages contribute to a broader picture of the coherence of this relationship. In a meta-analysis performed by the National Early Literacy Panel (2008), the early literacy or precursor literacy skills related to oral language measures of grammar, definitional vocabulary, and listening comprehension were generally significantly stronger predictors than were measures of vocabulary. The results from this meta-analysis must be interpreted with the knowledge that the outcome measure (reading comprehension) was measured in kindergarten and preschool. It is common to think that vocabulary has more of an influence on reading comprehension later, after the child acquires the initial alphabetical code and reads with more fluency.
Is the linguistic comprehension component one nested construct that ultimately taps and loads onto one core language comprehension dimension? In a longitudinal study of 216 children who were followed from age 4 to age 6, Klem et al. (2015) identified one unidimensional language latent factor consisting of sentence repetition, receptive vocabulary knowledge and grammatical skills that showed a good fit with the data and a high degree of longitudinal stability. As Klem et al. (2015) state, a child's understanding of a sentence that is read to him or her will, in turn, depend on semantic skills, including vocabulary knowledge and grammatical skills. Furthermore, a growing body of evidence supports the notion of a strong long-term stability of individual variation in core language skill throughout childhood (Bornstein, Hahn, Putnick, & Suwalsky, 2014; Melby-Lervåg et al., 2012). The present review will thus include both vocabulary and grammar as the main components in the linguistic comprehension construct.

Preschool predictors of domain-general cognitive skills
The role of memory in explaining individual differences in reading comprehension is one aspect that this review aims to explore. Text comprehension is a complex task that draws on many different cognitive skills and processes (Cain, Oakhill, & Bryant, 2004). Two different memory functions are often considered: short-term memory, i.e., "the capacity to store material over time in situations that do not impose other competing cognitive demands" (Florit, Roch, Altoè, & Levorato, 2009, p. 936) and working memory, i.e., "the capacity to store information while engaging in other cognitively demanding activities" (Florit et al., 2009, p. 936). In a longitudinal study by Cain et al. (2004), working memory and component skills of comprehension predicted unique variance in reading comprehension. Florit et al. (2009) refer to previous studies that suggest that reading comprehension depends in part on the capacity of working memory to maintain and manipulate information. Cain et al. (2004) note that working memory appears to have a direct relationship with reading comprehension over and above short-term memory, word reading, and vocabulary knowledge. Working memory has an impact on reading development because of the need to store items for later retrieval and to partially store information demands related to several levels of text processing (Swanson, Howard, & Sáez, 2007). Furthermore, Swanson, Howard and Sáez (2007) argue that working memory plays a paramount role because it holds recently processed information to make connections with the latest input and maintains the key elements of information for the construction of an overall representation of the text.
As previously noted, higher-level linguistic and cognitive processes have also been shown to contribute to explaining variance in reading comprehension. The goal of reading is to understand, meaning that, beyond the child's decoding ability, different comprehension processes are also acquired at the word, sentence and text level. When reading, skilled readers make use of the background knowledge they have about the topic in addition to their reasoning skills and thus can make inferences about what is to come. This ability is also often embedded in the instruments used to assess reading comprehension ability. A child has to go beyond the meaning of words and sentences and reason when asked about something that is not explicitly written in the text.
The present review will thus include components of domain-general cognitive skills.

Model
Models that display components related to reading development can be helpful tools for researchers when they explore the different aspects of a theory by unpacking and manipulating its parameters. The simple view is often used to frame a study by applying the researchers' own data to the parameters of the model. Earlier studies have provided a large body of research that supports the simple view and discusses the parameters and underlying abilities that this model includes and excludes.
On this page, our hypothesized model of interrelations is provided. These relations will be tested with use of the primary studies and the methods of analysis provided in the methods section. Examples of indicators are listed on the left side in the figure.

Definitions
To be precise about the terminology, we use the following section to provide a description of the predictor terms observed in the model.

Predictors of decoding:
- Phonological awareness: "the ability to detect, manipulate, or analyze the auditory aspects of spoken language (including the ability to distinguish or segment words, syllables, or phonemes), independent of meaning" (NELP, 2008, p. vii).
- Letter knowledge: "knowledge of the names and sounds associated with printed letters" (NELP, 2008, p. vii).
- Rapid automatized naming (RAN): "the ability to rapidly name a sequence of repeating random sets of pictures of objects (e.g., 'car,' 'tree,' 'house,' 'man') or colors, letters, or digits" (NELP, 2008, p. vii).

Predictors of linguistic comprehension:
- Vocabulary: the words with which one is familiar in a given language.
- Grammar-Syntax: knowledge about how words or other elements of sentence structure are combined to form grammatical sentences.

Domain-general cognitive skills:
- Working memory: "a brain system that provides temporary storage and manipulation of the information necessary for complex cognitive tasks" (Baddeley, 1992, p. 556).
- Nonverbal ability: tasks that are not based on language skills, i.e., tasks with figures.

Previous systematic reviews
Our review will differ from prior reviews in several important ways. Although there are novel analyses planned for the current study, there are certain elements that will be comparable to the aforementioned reviews. The systematic reviews conducted by the National Early Literacy Panel (NELP, 2008) and García and Cain (2013) included published studies retrieved from searches conducted in two databases: PsycINFO and the Educational Resources Information Center (ERIC). Additionally, supplementary studies located through, for instance, hand searches of relevant journals and reference checks of past literature reviews were utilized in the NELP (2008) review. The same databases are expected to be used in this study. In keeping with the guidelines of a Campbell review, our review will also include a systematic search for unpublished reports (to avoid publication bias). This search is one of the strengths of the present study, as such a search was not utilized in the other two reviews, which only included studies published in refereed journals.
In addition, the NELP (2008) review team coded the following early literacy skills or precursor literacy skills: alphabetic knowledge, phonological awareness, rapid automatized naming (letters or digits and objects or colors), writing or writing name, phonological memory, concepts about print, print knowledge, reading readiness, oral language, visual processing, performance IQ, and arithmetic. The outcome variables in that meta-analysis were decoding, reading comprehension, and spelling. In the current meta-analysis, we will have reading comprehension as the outcome, and the predictor variables will be decoding, phonological awareness, letter knowledge, naming speed, syntax, short-term memory, working memory, and nonverbal intelligence. The review by García and Cain (2013) assessed the relationship between decoding and reading comprehension, and they restricted their review to include these measures.
In contrast to our review, the García and Cain (2013) review studied the concurrent relationships between the included variables, i.e., the measures used to calculate the correlations were taken at the same time point. Our review will assess the longitudinal correlational relationships between the predictor variables in preschool and reading comprehension at school age, after reading instruction has begun.
Additionally, the NELP (2008) review only reported on reading comprehension in kindergarten and preschool, while our review will examine reading comprehension during formal schooling. If the included studies report on a number of reading comprehension time points in school, the last time point will be preferred. In the early stages, reading development is largely dependent on the child's decoding skills (Hoover & Gough, 1990; Lervåg & Aukrust, 2010). Later, after this process has become more automatized and fluent, there is greater opportunity to study other influential factors, for instance, vocabulary.
If possible, the review team will also code the number of years of reading instruction at the time the outcome measure is assessed in the included studies. This coding can help answer the following question: To what degree do other assumed influential variables (e.g., age, test types) contribute to explaining the differences between the included studies?
Although the NELP (2008) review does not state that it restricted the included samples to typical monolingual children, the García and Cain (2013) review excluded bilingual children and those who were learning English as a second language, which the present review will also have as a criterion. García and Cain (2013) stated that studies conducted with special populations were discarded if they did not include a typically developing control sample. The only exception for this criterion was if the study included participants with reading disabilities. In the NELP (2008) review, the sample criterion was children who represented the normal range of abilities and disabilities that would be common to regular classrooms. In this regard, these reviews will differ from our review, as the planned review will only include typical children, i.e., it will not include children with a special group affiliation, for instance, children with reading disabilities. Furthermore, several years have passed since the NELP (2008) review was undertaken, and this review is the most comparable to ours. The most recent study included in the NELP (2008) review was published in 2004. We suspect that there are a substantial number of new longitudinal studies that have been conducted and published since the last search was conducted.
The findings in this planned review are of practical significance, as they have direct implications for how to best prepare children for reading instruction. Additionally, with early knowledge of the patterns of normal development, we have the potential to recognize the signs of delayed or divergent development. When a child shows signs of poor language development, we can, with better certainty, focus additional efforts that will help prevent later struggles with reading. To recognize when a child displays signs of struggles with language risk factors or early reading ability, it is essential for research to give us important indications of the different language and environmental markers that can be predictors of later reading skills.

OBJECTIVES
The objective for this systematic review is to summarize the best available research on the correlation between reading-related preschool predictors and later reading comprehension ability.
The review aims to answer the following questions:
1) To what extent do phonological awareness, rapid naming, and letter knowledge correlate with later decoding and reading comprehension skills?
2) To what extent do linguistic comprehension skills in preschool correlate with later reading comprehension abilities?
3) To what extent do domain-general skills in preschool correlate with later reading comprehension abilities, and does this correlation have an impact beyond decoding and linguistic comprehension skills?
4) To what extent do preschool predictors of reading comprehension correlate with later reading comprehension skills after concurrent decoding ability has been considered?
5) To what degree do other possible influential moderator variables (e.g., age, test types, SES, language, country) contribute to explaining any observed differences between the studies included?

I. Characteristics of the Studies Relevant to the Objectives of the Review
The primary studies should report on a longitudinal non-experimental design that follows a group of mainly monolingual children from preschool and over into school. Thus, the included studies must report at least two assessment time points: one at preschool age, before formal reading instruction has begun (predictors), and one at school age, after formal reading instruction has been implemented (outcome: reading comprehension).
Brief description of representative study:

II. Inclusion and Exclusion Criteria
Five criteria will be used to identify eligible studies:

1-Primary study designs:
The review will include longitudinal non-experimental studies that follow a cohort of children from preschool into formal schooling, after reading instruction has begun. Because there are different traditions concerning the start of formal reading instruction, preschool refers to testing of predictor variables before reading instruction has begun, ranging from 3 to 6 years of age. Moreover, because some countries start formal reading instruction earlier than others, the predictor assessment is also included if it was conducted no later than 6 months after the onset of reading instruction. The minimum length of duration between the first and second waves is one year (assessments conducted in the fall and spring of the same school year are accepted). In addition, control or comparison groups from experimental studies can be included if they are non-treatment control groups.

2-Population:
The review will include studies conducted with samples of unselected, mainly monolingual typical children, i.e., children without a special group affiliation (e.g., a special diagnosis or second language learners). If wholly selected samples (for instance, bilingual or special diagnosis) are included in the review, this could mean an overrepresentation of children with a risk of reading difficulties, which would not reflect the overall purpose of this review to understand typical reading development.

3-Qualifying outcomes:
Eligible studies must report data on (1) at least one of the predictors (vocabulary, grammar, phonological awareness, letter knowledge, RAN, memory, nonverbal intelligence) and (2) one reading comprehension measure as measured by standardized or researcher-designed tests.

4-Quantitative information:
Studies that report a Pearson's r correlation between the linguistic comprehension measure in preschool and the reading comprehension test in school will be included. If the correlation is not reported in the study, we will contact the author to see if it can be retrieved.

5-Language of publication:
Studies conducted in any country and in any language of instruction are eligible for inclusion. However, studies must be reported in English (or be accompanied by an English translation of the full text of the report) to be included in the analyses.
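The five criteria above can be read as a checklist that every candidate study must pass. The following is a minimal sketch of that checklist, not the review team's actual screening form: the class, field names and criterion labels are hypothetical, and judgment-based items (such as whether a sample is "typical") are reduced to booleans a screener would set by hand. The one-year minimum interval between waves is left out because the protocol also accepts fall and spring assessments within the same school year.

```python
from dataclasses import dataclass, field

# Predictor labels taken from criterion 3; the spelling of the labels is an assumption.
ELIGIBLE_PREDICTORS = {
    "vocabulary", "grammar", "phonological awareness", "letter knowledge",
    "RAN", "memory", "nonverbal intelligence",
}


@dataclass
class CandidateStudy:
    # All field names are hypothetical; they are not taken from the protocol's forms.
    longitudinal_non_experimental: bool    # or an untreated control group
    predictor_age_years: float             # age at the first predictor assessment
    months_after_instruction_onset: float  # 0 or negative = before instruction began
    typical_mainly_monolingual: bool
    predictors: set = field(default_factory=set)
    has_reading_comprehension_outcome: bool = False
    pearson_r_available: bool = False      # reported, or obtained from the author
    full_text_in_english: bool = False


def failed_criteria(study):
    """Return the labels of the criteria that the study does not meet."""
    failed = []
    design_ok = (
        study.longitudinal_non_experimental
        and 3 <= study.predictor_age_years <= 6
        and study.months_after_instruction_onset <= 6
    )
    if not design_ok:
        failed.append("1-design")
    if not study.typical_mainly_monolingual:
        failed.append("2-population")
    has_predictor = bool(study.predictors & ELIGIBLE_PREDICTORS)
    if not (has_predictor and study.has_reading_comprehension_outcome):
        failed.append("3-outcomes")
    if not study.pearson_r_available:
        failed.append("4-quantitative")
    if not study.full_text_in_english:
        failed.append("5-language")
    return failed
```

A study is eligible when `failed_criteria` returns an empty list; returning the labels rather than a single boolean keeps a record of why a study was excluded.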

III. Search Strategy
Studies will be collected using multiple approaches:
1) Studies included in previous reviews, including the NELP (2008) review and the García and Cain (2013) review, will be collected first.
2) A manual review of the Tables of Contents will be conducted for key journals:
-Journal of Educational Psychology
-Developmental Psychology
-Scientific Studies of Reading (The Official Journal of the Society for the Scientific Study of Reading)
3) Studies will be located using electronic searches through PsycINFO, ERIC (Ovid), Linguistics and Language Behavior Abstracts, Web of Science, ProQuest Digital Dissertations, Open Grey and Google Scholar.
4) Unpublished reports, such as dissertations, technical reports, and conference presentations will be located through searches on OpenGrey.eu, ProQuest Dissertations and Theses, and Google Scholar.
The following search strategy was developed in close collaboration with the subject specialist librarians at the Oslo University Library. The search words that will be used and then combined are located in Appendix A.

Procedure:
To organize the complete search result from the seven different databases, the candidate studies will be imported from their respective databases to EndNote. From there, the references will be imported into the internet-based software DistillerSR (Evidence Partners, Ottawa, Canada). Once the references have been imported, the duplication detector application will be used to eliminate duplicates (i.e., the same reference from different databases). These duplicates will be quarantined in DistillerSR. Before starting title and abstract screening, a form will be made in DistillerSR. The abstract and title screening form will include five questions that will determine the relevance of each reference. If the answer to any of the five questions is "No", the reference will be excluded at this stage. There will be two screeners at this stage. The first and second authors will first double screen 25% of the references in order to establish coder reliability at this stage. The fifth question will act as the inter-rater reliability question. If the inter-rater reliability for inclusion or exclusion, as indicated by Cohen's kappa, is satisfactory (above .80), the remaining references will be split in half and screened by either the first or second coder. If the inter-rater reliability is below .80, the two screeners will go through their conflicts and agree on the criteria before continuing screening. Any disagreements will be resolved through discussions and by consulting the original paper. If the abstracts do not provide sufficient information to determine inclusion or exclusion (i.e., "can't tell" on the aforementioned questions), the reference will be advanced to the next stage (full text screening) so that the decision can be based on the information given in the full text.
In order to begin the full text screening, the full text will be located either by downloading it via the journal online or by finding the full-text reports in the paper version at the University library. The library staff will help locate candidate references that cannot be found at the University library. Once the full texts are available, a form will be made in DistillerSR.
The following questions will be answered to evaluate the relevance of the full texts. If the answer to any of the seven questions (except question 3) is "No", the reference will be excluded at this stage. There will be two screeners at this stage. The first and second authors will independently double screen 25% of the full texts using DistillerSR, in order to establish coder reliability at this stage. The seventh question will act as the inter-rater reliability question. If the inter-rater reliability for inclusion or exclusion, as indicated by Cohen's kappa, is satisfactory (above .80), the remaining references will be screened by either the first or second author. If the inter-rater reliability is below .80, the two screeners will go through their conflicts and agree on the criteria before continuing screening. Any discrepancies will be discussed and resolved through consensus.
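The duplicate-removal step described above can be illustrated with a small sketch. How DistillerSR's duplication detector actually matches references is not stated in the protocol, so the matching rule here (normalised title plus publication year) is an assumption, as are the function names and the dictionary keys.

```python
import re


def reference_key(ref):
    """Build a matching key from a normalised title and the publication year."""
    # Lower-case the title and collapse punctuation and spacing differences.
    title = re.sub(r"[^a-z0-9]+", " ", ref["title"].lower()).strip()
    return (title, ref.get("year"))


def split_duplicates(references):
    """Keep the first copy of each reference; quarantine later copies."""
    seen = set()
    unique, quarantined = [], []
    for ref in references:
        key = reference_key(ref)
        if key in seen:
            quarantined.append(ref)  # set aside for inspection, not deleted
        else:
            seen.add(key)
            unique.append(ref)
    return unique, quarantined
```

Quarantining rather than deleting mirrors the procedure in the text: a wrongly matched reference can still be inspected and restored.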
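Both screening stages use the same reliability check: Cohen's kappa on the double-screened 25% of references, with .80 as the threshold. The sketch below computes kappa from two screeners' decisions using the standard formula, observed agreement corrected for the agreement expected by chance. The function names and the "include"/"exclude" labels are illustrative assumptions; in practice the statistic would presumably come from the screening software or a statistics package.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


def reliability_is_satisfactory(kappa, threshold=0.80):
    """Screening may be split between coders only if kappa is above the threshold."""
    return kappa > threshold
```

For example, if two screeners agree on three of four references and chance agreement is .50, kappa is (.75 - .50) / (1 - .50) = .50, which is below the threshold, so the screeners would reconcile their conflicts before continuing.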

IV. Data Extraction and Study Coding Procedures
After the candidate studies have been screened and the eligible studies have been selected, the first two authors will code the studies following the guidelines described in the coding scheme below:

Statistics:
- Bivariate correlations between the predictors and reading comprehension (in accordance with the presented figure). In addition, correlations between the predictors will be coded.
The following section describes the included measures and the coding procedures (i.e., composites vs. single test scores). The selection is made in order to be able to construct latent variables. Here, the descriptions given in the NELP (2008) review are included when appropriate.
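The statistic to be coded from each primary study is the bivariate Pearson correlation. Primary studies will normally report r directly; the sketch below only illustrates what the coefficient is, computed from paired scores such as a preschool predictor and a later reading comprehension score. The function name and the example scores are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)
```

Squaring r gives the proportion of variance shared by the two measures, which is how statements such as "explained 54% of the variance" relate to correlations.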

Measure Description
Reading comprehension
"Measures of comprehension of meaning of written language passages. Typically measured with standardized test, such as the Passage Comprehension subtest of the Woodcock Reading Mastery Test" (NELP, 2008, p. 43).
Tests designed for both passage comprehension and sentence comprehension will be coded. If the primary study reports a composite score of reading comprehension, this will be coded in a separate category.
The type of test will be reported to control for the sensitivity of the measures:
- Whether the child is corrected when he or she decodes incorrectly
- Whether it is a test of silent or aloud reading
- Whether comprehension is measured by asking control questions, by a multiple-choice test, or by retelling
If the primary study includes several follow-ups, the last assessment will be coded.

Decoding
Decoding ability will be coded the first time it is assessed in the primary study (which may be after the predictors are assessed) and concurrently with the outcome measure. If a study includes decoding of both single words and nonwords, both will be coded. In addition, if the primary study reports a composite score of decoding (i.e., a mix of real words and nonwords), this will be coded in a separate category.

Vocabulary
Preschool vocabulary can include standardized or research-designed measures of vocabulary. Tests that tap receptive and/or expressive vocabulary and vocabulary composites will be coded. If the included studies have several assessment time points, the first time point in preschool will be coded. Vocabulary is typically assessed with a standardized test, such as the Peabody Picture Vocabulary Test (receptive).

Grammar/syntax
Grammar tests, which assess the child's knowledge about how words or other elements of sentence structure are combined to form grammatical sentences, will be coded. Tests that tap receptive and/or expressive grammar and composites will be coded. If the included studies have several assessment time points, the first time point in preschool will be coded. Grammar is typically measured with a standardized test, such as the Test for Reception of Grammar (TROG) (receptive).

Phonological awareness
"Ability to detect, manipulate, or analyze components of spoken words independent of meaning. Examples include detection of common onsets between words (alliteration detection) or common rime units (rhyme detection); combining syllables, onset rimes, or phonemes to form words; deleting sounds from words; counting syllables or phonemes in words; or reversing phonemes in words. Often assessed with a measure developed by the investigator, but sometimes assessed with a standardized test, such as the Comprehensive Test of Phonological Processing" (NELP, 2008, p. 42).
In the present study, tests that tap rhyme awareness, phoneme awareness, and composites will be coded.
If the included studies have several assessment time points, the first time point in preschool will be coded.

Letter knowledge
"Knowledge of letter names or letter sounds, measured with recognition or naming test. Typically assessed with measure developed by investigator" (NELP, 2008, p. 42).
If the included studies have several assessment time points, the first time point in preschool will be coded.

Rapid automatized naming
Rapid naming of sequentially repeating random sets of pictures of objects, colors, letters or digits. Typically measured with researcher-created measure (NELP, 2008). If the primary study includes several measures, two composite scores will be calculated: one for alphanumeric RAN (letters and digits) and one for non-alphanumeric RAN (symbols and colors). In cases where RAN ability is reported in the correlation matrix as one composite, this will be coded in a separate category.
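The protocol does not state how a composite will be formed when a primary study reports only separate correlations for each RAN measure. One common option, shown here purely as an assumption, is the correlation between the outcome and a unit-weighted sum of the standardized measures, which can be derived from the reported correlations alone. The function name and the example values are hypothetical.

```python
import math
from itertools import combinations


def composite_outcome_correlation(r_with_outcome, r_between_measures):
    """Correlation between an outcome and a unit-weighted composite.

    r_with_outcome: correlation of each component measure with the outcome.
    r_between_measures: dict mapping index pairs (i, j), i < j, to the
        correlation between component measures i and j.
    """
    k = len(r_with_outcome)
    pairs = list(combinations(range(k), 2))
    missing = [p for p in pairs if p not in r_between_measures]
    if missing:
        raise ValueError(f"Missing inter-measure correlations for pairs: {missing}")
    # Covariance of the summed z-scores with the standardized outcome.
    numerator = sum(r_with_outcome)
    # Variance of the summed z-scores: k plus twice the sum of inter-correlations.
    variance = k + 2 * sum(r_between_measures[p] for p in pairs)
    return numerator / math.sqrt(variance)


# Hypothetical alphanumeric RAN example: letters and digits.
r_alphanumeric = composite_outcome_correlation(
    r_with_outcome=[0.40, 0.50],          # RAN letters, RAN digits vs. reading comprehension
    r_between_measures={(0, 1): 0.60},    # RAN letters vs. RAN digits
)
print(round(r_alphanumeric, 3))
```

Because the formula needs the correlations among the component measures, this approach only works when the primary study reports the full correlation matrix.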

Memory
Short-term memory: "Ability to remember spoken information for a short period of time. Typical tasks include digit span, sentence repetition, and nonword repetition from both investigator-created measures and standardized tests" (NELP, 2008, p. 43).
Working memory: "the capacity to store information while engaging in other cognitively demanding activities" (Florit et al., 2009, p. 936). Examples of tests include sentence span tests.
These tests measure the ability to store and process sentences, numbers, and nonwords and to recall them. Both STM and WM will be coded. A composite will not be computed; instead, single test scores will be used, since these measures are often not highly correlated.

Nonverbal intelligence
"Scores from nonverbal subtests or subscales from intelligence measures, such as the Wechsler Preschool and Primary Scales of Intelligence or Stanford-Binet Intelligence Scale" (NELP, 2008, p. 43).
As long as a nonverbal component is included in the measure, the measure will be included (e.g., full-scale IQ).

Moderator coding: To examine variables that could help explain potential disparities between studies, we will perform a series of moderator analyses. The moderator variables are intended to account for differences between studies related to, for instance, study quality (e.g., the sample size). Divergent correlations across studies may be influenced by systematic differences related to the following:

Age
Age in months at testing, for the predictors and for the reading comprehension outcome.
Time between the predictor and outcome assessments.

Country
Country where the study has taken place.

Formal reading instruction
Number of years of reading instruction at the time of reading comprehension assessment.

Socioeconomic status
Indicators of how socioeconomic status is assessed in the study. Examples:
- Parental education level
- Free/reduced-price lunches

V. Criteria for determination of independent findings
If the authors find primary studies that do not reference previous reports but have an equal number of participants, report exactly the same statistics, and use the same instruments, the possibility that they describe the same sample will be explored further. The concern is that the same sample would otherwise be coded twice. Reports on the same study will be collected and treated as one collective report.
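The check for overlapping samples could be supported by a simple automated flag before manual follow-up. The sketch below assumes each coded report is stored as a dictionary with hypothetical keys (`id`, `n`, `instruments`, `correlations`); it only flags candidate pairs and does not decide whether two reports describe the same study.

```python
from itertools import combinations


def possible_duplicates(reports):
    """Return pairs of report ids that share sample size, instruments and statistics."""
    flagged = []
    for a, b in combinations(reports, 2):
        same_n = a["n"] == b["n"]
        same_instruments = set(a["instruments"]) == set(b["instruments"])
        same_statistics = a["correlations"] == b["correlations"]
        if same_n and same_instruments and same_statistics:
            flagged.append((a["id"], b["id"]))
    return flagged


# Hypothetical coded reports; all values are invented for illustration.
coded_reports = [
    {"id": "A", "n": 66, "instruments": ["PPVT-R", "TROG"], "correlations": [0.41, 0.35]},
    {"id": "B", "n": 66, "instruments": ["TROG", "PPVT-R"], "correlations": [0.41, 0.35]},
    {"id": "C", "n": 120, "instruments": ["PPVT-R"], "correlations": [0.38]},
]
print(possible_duplicates(coded_reports))
```

Flagged pairs would still need to be examined by the authors, since identical values can occasionally arise in independent samples.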

VI. Details of study coding categories (including methodological quality or risk of bias coding)
The bivariate correlations between the predictors and reading comprehension will be coded in CMA. If provided, sample size, mean age at the time of testing, country and language, and the mean socioeconomic status of the sample will be coded.
Study quality will be considered in terms of the following:
- Sampling: convenience vs. random
- Instrument quality: standardized vs. research-designed
- Whether alpha test reliability is reported
- Published or unpublished study
- Whether there are any floor or ceiling effects on any measure: a measure has a floor effect if the mean value minus the standard deviation falls below 0. Moreover, a measure has a ceiling effect if the mean value plus the standard deviation exceeds the maximum possible value on the measure. Information regarding the number of items on a measure must be provided in the article (or in the manual) in order to establish the presence of a ceiling effect.
- The percentage of study attrition between the two time points
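The floor, ceiling, and attrition criteria can be expressed as small checks. This sketch assumes that a floor effect means the mean minus one standard deviation falls below the minimum possible score (taken to be 0) and that attrition is computed relative to the sample size at the first time point; the function names and example values are hypothetical.

```python
def has_floor_effect(mean, sd, minimum=0.0):
    """Floor effect: mean minus one SD falls below the minimum possible score."""
    return mean - sd < minimum


def has_ceiling_effect(mean, sd, maximum):
    """Ceiling effect: mean plus one SD exceeds the maximum possible score."""
    return mean + sd > maximum


def attrition_percent(n_first, n_last):
    """Percentage of participants lost between the first and last time point."""
    if n_first <= 0:
        raise ValueError("The sample size at the first time point must be positive.")
    return 100.0 * (n_first - n_last) / n_first


# Hypothetical measure with 20 items (maximum score 20).
print(has_floor_effect(mean=2.0, sd=3.0))                 # mean - SD is below 0
print(has_ceiling_effect(mean=18.0, sd=3.0, maximum=20))  # mean + SD is above 20
print(attrition_percent(n_first=80, n_last=66))
```

The ceiling check needs the maximum possible score, which is why the number of items must be available from the article or the test manual.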

Inter-rater reliability:
In the third stage of the review process, data extraction, a coding scheme with all the relevant variables will be made by the first author. Excel and the CMA software will be used to extract data on study characteristics, study quality, and correlations. The first and second authors will be the primary coders. To ensure reliable coding, 25% of the included studies at this stage will be double coded. Inter-coder correlation (Pearson's r) will be calculated for the main outcomes and continuous moderator variables, in addition to the rate of percentage agreement. Cohen's kappa will be calculated for categorical moderator variables. Disagreements will be resolved through discussion and by consulting the original paper. If the inter-rater reliability is below .80, the two coders will go through their conflicts and agree on the criteria before continuing coding. Given an acceptable inter-rater reliability established by the double coding, the first and second authors will code the remaining studies.
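For the continuous variables, the two agreement indices named above can be computed as follows. This is a minimal sketch using only the standard library; the function names and the double-coded values are hypothetical, and exact equality is used as the agreement criterion, which is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two coders' values for the same studies."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("Both coders must provide the same number (>= 2) of values.")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("Pearson's r is undefined when a coder's values do not vary.")
    return sxy / math.sqrt(sxx * syy)


def percent_agreement(x, y):
    """Percentage of studies for which both coders entered exactly the same value."""
    if len(x) != len(y) or not x:
        raise ValueError("Both coders must code the same non-empty set of studies.")
    return 100.0 * sum(a == b for a, b in zip(x, y)) / len(x)


# Hypothetical correlations extracted by two coders from four double-coded studies.
coder_1 = [0.45, 0.30, 0.52, 0.61]
coder_2 = [0.45, 0.30, 0.25, 0.61]
print(round(pearson_r(coder_1, coder_2), 3), percent_agreement(coder_1, coder_2))
```

Reporting both indices is useful because a single transcription error, such as 0.25 entered for 0.52, lowers the percentage agreement and the correlation in different ways.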

VI. Statistical procedures and conventions
The aim is to perform meta-analytic structural equation modeling (MASEM), provided that there is a sufficient number of studies and that the data align with the hypothesized theoretical model. By applying meta-analytic techniques to the series of correlation matrices reported in the primary studies, we can create a pooled correlation matrix, which can then be analyzed using structural equation modeling (SEM) (Cheung & Chan, 2005). The planned statistical modeling will be conducted using the program Mplus (Muthén & Muthén, 1998-2012). Cheung and Chan (2005) describe a two-stage process that integrates meta-analytic techniques and SEM into a unified framework. They propose using multiple-group analysis in SEM, whereby the first stage synthesizes the correlation matrices and the second stage fits the hypothesized structural models to the pooled correlation matrix. This approach addresses one of the pitfalls of MASEM, the assumption that the correlation matrices being pooled are homogeneous, by testing homogeneity and fitting the hypothesized model only when the correlation matrices are shown to be homogeneous. Another advantage of this approach is that it allows the total sample to be utilized, thus retaining information about sampling variation in the pooled correlations.
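As a rough illustration of what pooling correlations and testing their homogeneity involves, the sketch below pools a single correlation across studies using Fisher's z transformation with fixed-effect weights and computes Cochran's Q statistic. It is a simplified univariate stand-in: it is not the multiple-group SEM procedure of Cheung and Chan (2005) and not Mplus code, and the function name and example values are hypothetical.

```python
import math


def pool_correlation(correlations, sample_sizes):
    """Fixed-effect pooling of one correlation across studies.

    Returns (pooled_r, q, df), where q is Cochran's Q for homogeneity and
    df = number of studies - 1. Under homogeneity, q approximately follows
    a chi-square distribution with df degrees of freedom.
    """
    if len(correlations) != len(sample_sizes) or not correlations:
        raise ValueError("Provide one sample size per correlation.")
    if any(n <= 3 for n in sample_sizes):
        raise ValueError("Each study needs more than 3 participants.")
    # Fisher's z transformation; the sampling variance of z is 1 / (n - 3).
    z_values = [math.atanh(r) for r in correlations]
    weights = [n - 3 for n in sample_sizes]
    total_weight = sum(weights)
    z_pooled = sum(w * z for w, z in zip(weights, z_values)) / total_weight
    # Weighted squared deviations of each study from the pooled estimate.
    q = sum(w * (z - z_pooled) ** 2 for w, z in zip(weights, z_values))
    return math.tanh(z_pooled), q, len(correlations) - 1


# Hypothetical vocabulary-reading comprehension correlations from three studies.
pooled_r, q_statistic, df = pool_correlation([0.42, 0.35, 0.50], [66, 120, 90])
print(round(pooled_r, 3), round(q_statistic, 3), df)
```

In the two-stage approach the same logic is applied to whole correlation matrices at once, and the second-stage model is fitted only if the homogeneity test does not indicate that the matrices differ.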

VII. Treatment of qualitative research
The studies that have used qualitative methods will not be eligible for inclusion in this review, as they do not have the measures required to fit the scope of this review.

ROLES AND RESPONSIBILITIES
There is substantial expertise within the review team both in terms of content and methodology. The contributors in this review are all working in the field of language and reading comprehension. Professor Monica Melby-Lervåg has extensive experience with conducting meta-analyses and has the required statistical analysis competence. The first and last authors have also completed a two-day course on meta-analysis with Michael Borenstein (October, 2013), using "Comprehensive Meta-Analysis version 3." In addition, the review team has experience with electronic database retrieval and coding and has access to library support staff when needed.

PLANS FOR UPDATING THE REVIEW
A new search will be conducted every other year. The first (Hjetland) and last author (Melby-Lervåg) will be responsible for updating the review.

AUTHORS' RESPONSIBILITIES
By completing this form, you accept responsibility for preparing, maintaining and updating the review in accordance with Campbell Collaboration policy. The Campbell Collaboration will provide as much support as possible to assist with the preparation of the review.
A draft review must be submitted to the relevant Coordinating Group within two years of protocol publication. If drafts are not submitted before the agreed deadlines, or if we are unable to contact you for an extended period, the relevant Coordinating Group has the right to de-register the title or transfer the title to alternative authors. The Coordinating Group also has the right to de-register or transfer the title if it does not meet the standards of the Coordinating Group and/or the Campbell Collaboration.
You accept responsibility for maintaining the review in light of new evidence, comments and criticisms, and other developments, and updating the review at least once every five years, or, if requested, transferring responsibility for maintaining the review to others as agreed with the Coordinating Group.

Publication in the Campbell Library
The support of the Coordinating Group in preparing your review is conditional upon your agreement to publish the protocol, finished review, and subsequent updates in the Campbell Library. The Campbell Collaboration places no restrictions on publication of the findings of a Campbell systematic review in a more abbreviated form as a journal article either before or after the publication of the monograph version in Campbell Systematic Reviews. Some journals, however, have restrictions that preclude publication of findings that have been, or will be, reported elsewhere, and authors considering publication in such a journal should be aware of possible conflict with publication of the monograph version in Campbell Systematic Reviews. Publication in a journal after publication or in press status in Campbell Systematic Reviews should acknowledge the Campbell version and include a citation to it. Note that systematic reviews published in Campbell Systematic Reviews and co-registered with the Cochrane Collaboration may have additional requirements or restrictions for co-publication. Review authors accept responsibility for meeting any co-publication requirements.
I understand the commitment required to undertake a Campbell review, and agree to publish in the Campbell Library. Signed on behalf of the authors:

Figure 1: Predictors of reading comprehension

Roth, Speece, and Cooper (2002) followed a group of normally developing kindergarten children over 3 years: in kindergarten, first grade, and second grade. The sample included 66 native English speakers. The mean age at initial testing was 5 years and 6 months. The test battery consisted of multiple domains pertaining to the present review: phonological awareness, linguistic comprehension (vocabulary and grammar), domain-general cognitive skills, decoding, reading comprehension, and socioeconomic status. Phonological awareness was assessed with blending and elision tasks. Within linguistic comprehension, different aspects were assessed: receptive vocabulary (PPVT-R), expressive vocabulary (Oral Vocabulary subtest from TOLD-P:2 and the Boston Naming Test), receptive grammar (Test of Auditory Comprehension of Language-Revised), and expressive grammar (Formulating Sentences subtest from the CELF-R). Nonverbal intelligence was measured with RAVEN. Decoding of both real words and nonwords was assessed using, respectively, Letter-Word Identification and Word Attack from Woodcock Johnson. Reading comprehension was measured in first and second grade using the Passage Comprehension subtest from Woodcock Johnson. Socioeconomic status was reported via free/reduced-price lunch status. Correlations between the kindergarten measures and reading comprehension in second grade are provided.
1. Does the reference appear to be a longitudinal non-experimental study (or with a non-treatment control group)? Answer either: Yes/No/Can't tell
2. Does the reference appear to include a study of mainly monolingual typical children (i.e., not included because of a special group affiliation)? Answer either: Yes/No/Can't tell
3. Does the reference appear to have data from both preschool and school? Answer either: Yes/No/Can't tell
4. Does the reference appear to include data on at least one of the predictors and later reading comprehension? Answer either: Yes/No/Can't tell
5. Should this reference be included at this stage? Answer either: Yes/No
Does it include data on at least one of the predictors (i.e., phonological awareness, letter knowledge, RAN, vocabulary, grammar, working memory, nonverbal IQ)? Answer either: Yes/No/Can't tell
Does it include data on reading comprehension? Answer either: Yes/No/Can't tell
Study features:
- Phonological awareness: name of test, type of test (alliteration detection; rhyme detection; combining syllables, onset rimes, or phonemes to form words; deleting sounds from words; counting syllables or phonemes in words; or reversing phonemes in words), standardized or researcher made.
- Letter knowledge: name of test, type of test (recognition or naming, sounds or names), standardized or researcher made.
- RAN: name of test, type of test (objects, colors, letters or digits), standardized or researcher made.
- Vocabulary: name of test, type of test (receptive or expressive, depth or breadth), standardized or researcher made.
- Grammar: name of test, type of test (receptive or expressive, morphology, syntax), standardized or researcher made.
- Memory: name of test, type of test (sentence repetition, digit span, nonword repetition), standardized or researcher made.
- Nonverbal intelligence: name of test, type of test (block design, matrix).
- Decoding: name of test, type of test (single word reading, nonword reading), standardized or researcher made.

"Decoding words: Use of symbol-sound relations to verbalize real words or use of orthographic knowledge to verbalize sight words (e.g., 'have,' 'give,' 'knight')" (NELP, 2008, p. 42). Typically assessed with a standardized measure, such as the Word Identification subtest of the Woodcock Reading Mastery Test or the Sight Word Efficiency (SWE) subtest, Form A, of the Test of Word Reading Efficiency (TOWRE).