Early manifestations of intellectual performance: Evidence that genetic effects on later academic test performance are mediated through verbal performance in early childhood

Intellectual performance is highly heritable and robustly predicts lifelong health and success but the earliest manifestations of genetic effects on this asset are not well understood. This study examined whether early executive function (EF) or verbal performance mediate genetic influences on subsequent intellectual performance, in 561 U.S.-based adoptees (57% male) and their birth and adoptive parents (70% and 92% White, 13% and 4% African American, 7% and 2% Latinx, respectively), administered measures in 2003–2017. Genetic influences on children's academic performance at 7 years were mediated by verbal performance at 4.5 years (β = .22, 95% CI [0.08, 0.35], p = .002) and not via EF, indicating that verbal performance is an early manifestation of genetic propensity for intellectual performance.

and learning environments. We use a parent-offspring adoption design to examine two likely candidates for this early manifestation: Executive function (EF) and verbal performance in early to middle childhood. Our results are the first to document whether early EF or verbal performance have a mediating role, linking genetic influences on later intellectual performance in middle childhood and possibly also in adulthood. By identifying which of these, EF or verbal performance, serve as a principal manifestation of genetic influences, our results pave the way for investigations into how children's interactions with parents and teachers from early childhood onwards amplify or diminish these favorable outcomes.
Intelligence and academic performance are powerful predictors of psychological wellbeing, health, longevity, years of education, income, and employment status (Deary et al., 2010; Hummer & Hernandez, 2013; Kosik et al., 2018). Conversely, lower intellectual performance is associated with all-cause mortality and clinically important increases in the severity of psychopathology (Deary et al., 2010; Kosik et al., 2018; Yew & O'Kearney, 2013). Furthermore, there is evidence to suggest that academic performance in adolescence may have a negative causal connection with internalizing and externalizing problems in emerging adulthood (Wolchik et al., 2016). Consequently, promotion of intellectual performance in childhood may have broad effects across development, including improving educational, occupational and health outcomes, and diminishing the likelihood of some psychiatric problems. As a result, research aimed at understanding the processes involved in the early development of intellectual performance is crucial and may help uncover mechanisms that can be modified, not only to promote intellectual development, but also to promote a wide range of positive life outcomes and reduce the risk of psychopathology.
Intelligence and academic test performance have been reported to be highly heritable, especially as children get older, rising from 20%–60% in childhood and adolescence to 50%–80% in adulthood (Bouchard & McGue, 1981; Haworth et al., 2010; Kovas et al., 2013). Consequently, some have argued that environmental factors must play only a minor role in intellectual development (Plomin, 2018). However, twin and adoption studies provide evidence that environmental factors can have notable main effects and moderating effects on intellectual outcomes (Capron & Duyme, 1989; Kendler et al., 2015; Neiss & Rowe, 2000; Tucker-Drob & Bates, 2016). There is also evidence from the recent surge of literature using measured genotypes to examine genetic nurture (Bates et al., 2018; Kong et al., 2018; Wertz et al., 2020), including studies that have combined polygenic scores with the adoption design (Cheesman et al., 2020; Domingue & Fletcher, 2020), suggesting that parents influence children's educational outcomes not only through direct genetic transmission but also through environmentally mediated pathways. An additional, unheralded, mechanism is that the environment may have an amplifying effect on genetic influences, through evocative gene-environment correlation (rGE). This occurs when an individual's genetically influenced characteristics systematically evoke responses from their environment that, in turn, enhance or "canalize" genetic influences (Scarr & McCartney, 1983). As these evoked environmental conditions correlate with genetic influences, their influence could be entirely masked by estimates of genetic main effects. Dickens and Flynn (2001) explore in detail the possibility that this process of amplification operates in the context of cognitive abilities across generations to account for rising levels of intelligence in successive cohorts of children and adults.
While there is some evidence from phenotypic, twin and polygenic score research of evocative rGE in infant and early childhood cognitive development (Lugo-Gil & Tamis-LeMonda, 2008; Tucker-Drob & Harden, 2012; Wertz et al., 2020), the evidence base is small and the Dickens and Flynn hypothesis has never been robustly tested across the span of development within a generation. For these environmental amplification effects to be examined in detail, it is important to know at which developmental periods they may exert their influence on intellectual outcomes. For influences occurring early in development, it is crucial to identify the earliest manifestations of genetic advantage because these are likely the features that elicit the favorable environmental responses that amplify genetic effects.
In spite of the great importance of identifying the early manifestations of genetic influences on lifespan intellectual performance, the evidence base is small with regard to what these early manifestations might be. There is some indication that childhood scholastic performance from 6 to 7 years old onwards may be an early indicator of genetic advantage for intellectual performance in adulthood. For example, higher genome-wide polygenic scores of total years of education achieved by adulthood (EA PGS) predict stronger reading and math test performance at 6, 7, 12, and 16 years (Allegrini et al., 2019; Armstrong-Carter et al., 2020; Belsky et al., 2016; Selzam et al., 2017). This is supported by evidence that adopted children's math and reading performance at age 7 years is partially predicted by their birth parents' reading and math test performance (Borriello et al., 2020; Cioffi et al., 2021). These associations are not confounded by direct caregiving effects because adopted children and birth parents share genes, but birth parents do not provide the postnatal rearing environment. Furthermore, as the birth parent outcomes were measured in adulthood, the observed phenotypic associations between birth parents and children are akin to an "instant longitudinal study" from childhood to adulthood (Plomin, 1986) because, although these studies do not include longitudinal data from childhood to adulthood, they identify genetic factors accounting for the association between academic test performance in childhood and intellectual performance of biological relatives in adulthood. It remains less well understood whether there are earlier markers of genetic effects on lifespan intellectual performance than academic test performance from age 6-7 years onwards.
There is mixed evidence from one longitudinal study (the Dunedin Study): Children in the sample with higher EA PGS began talking earlier, based on parent ratings of developmental milestones at 3 years old, but did not score any better in the Peabody Picture Vocabulary Test at 3 years old (Belsky et al., 2016). However, from 5 years old and onwards, children in the study with higher EA PGS scored higher on tests of intelligence. Additionally, there is some evidence from adoption studies that birth parent intelligence in adulthood predicts adopted children's EF, verbal and nonverbal intelligence from 1 to 3 years old (Leve, DeGarmo, et al., 2013; Plomin et al., 1997), indicating that these early abilities may be markers of genetic effects on adult intelligence. This is consistent with evidence, firstly, that EF and verbal performance in early childhood are partially heritable, including from as early as 2 years old, at which point the heritability of both is fairly low (around 20%; Gagne et al., 2020; Galsworthy et al., 2000), and throughout early and middle childhood, by which point the heritability of EF and verbal performance appears to be approximately 60% (Davis et al., 2009a; Polderman et al., 2007). Secondly, these findings are consistent with evidence that early childhood EF and verbal performance predict subsequent intellectual performance (Duncan et al., 2007; McClelland et al., 2013; Yu et al., 2018). For example, there is longitudinal evidence from six studies that reading, verbal performance, and attention at school entry robustly predict later school math and reading test performance (Duncan et al., 2007). It seems likely, based on these converging bodies of research, that EF and verbal performance in early and middle childhood are early manifestations of genetic effects on later intellectual performance.
However, no research has used an adoption design to combine these streams of evidence and investigate whether early (and apparently heritable) EF and verbal performance mediate genetic influences on later intellectual outcomes.
Early manifestations of genetic effects on intellectual performance are important to understand, first, because they may directly influence the development of later intellectual performance and, second, because they likely have an indirect influence on intellectual development through interaction with caregiving and learning environments that plausibly sustain and amplify these early manifestations. However, as there is limited evidence of what the early manifestations are and precisely when they manifest, research is not yet in a position to rigorously explore hypotheses about evocative effects of genetic influences underlying intellectual development on caregiving and learning environments. A critical first step is to identify the very early expressions of genetic advantage in intellectual performance using a longitudinal parent-offspring adoption study.
We examine the hypothesis that genetic effects linked to adult intellectual performance have their impact on child reading and math test performance at 7 years of age through two early-appearing pathways: via EF and verbal performance from 27 months to 6 years. Although this is the first research to bring together several streams of evidence to address the question of whether early EF and verbal performance mediate genetic influences on later intellectual outcomes, on the continuum from exploratory to confirmatory research, our hypothesis is largely confirmatory because it is directional and grounded in robust and converging bodies of literature. We address our hypothesis in two steps. First, we examine at what age or ages, between 27 months and 6 years, there is evidence of genetic effects on EF and verbal performance. Second, if the first set of analyses confirms our expectation that there will be evidence of genetic effects on early EF and verbal performance, we test for mediation of genetic effects on reading and math test performance at 7 years old via each of these pathways. Our expectation is that early EF and verbal performance will mediate genetic effects on intellectual performance, indicating that they are early manifestations of genetic influences on intellect. We employ birth mother general intellectual performance, captured using a latent composite of intelligence, reading, and math test performance, as a proxy for genetic influences. As adopted children and their birth mothers share genes but birth mothers do not provide the postnatal rearing environment, the adoption design eliminates the influence of birth mothers on the postnatal environment. Phenotypic associations between adopted children and their birth mothers will thus be taken to imply genetic effects. However, correlations between birth mothers and their adopted offspring can represent a combination of genetic and intrauterine effects.
Birth fathers, who play an equal role to mothers in contributing to the child's genotype, provide an estimate of genetic effects that is not confounded by intrapartum effects. Consequently, we use a smaller subsample of birth fathers for replications of the birth mother analyses. Although birth parents tend to correlate on measures of intelligence (Bouchard & McGue, 1981) and birth father replications can only be considered quasi-independent rather than fully independent replications, broadly speaking, they provide convergent evidence regarding genetic (as opposed to intrauterine) effects on children's intellectual performance.

METHOD

Participants
Participants were drawn from the Early Growth and Development Study (EGDS), a U.S.-based, longitudinal, prospective adoption study of 561 linked sets of adopted children and their birth mothers (n = 554), birth fathers (n = 210) and adoptive parents (562 adoptive fathers and 569 adoptive mothers; numbers do not sum to 561 adoptive mothers and fathers because the sample includes 41 same-sex parent families and 15 additional adoptive parents who entered the family after the original couple adopted the child; Leve et al., 2019). EGDS data were collected in two cohorts, recruited through 45 adoption agencies in 15 states across the United States (Leve et al., 2019): The first, a sample of 361 adopted children and their birth and adoptive families and, the second, a sample of 200. While some of the variables used in the analysis were collected in both cohorts, others were only collected in one cohort. For a breakdown of the number of participants by each variable, see Figure 1 and Table 1. The variables used in the present analyses were collected in 2003-2013 (cohort I) and in 2007-2017 (cohort II).

Ethics
Ethical approval was obtained from institutional review boards at the University of Oregon (Protocol number: 0304201400) and The Pennsylvania State University (Submission ID: CR00007591). Informed consent was obtained from all adult participants ahead of research participation and assent was obtained from children at age 7 years.

Measures
Using structural equation models, incorporating confirmatory factor analysis (CFA), we created the latent variables (displayed in Figure 1) across each of the domains outlined below. Prior to hypothesis testing, we ran longitudinal measurement models, assessing the fit of individual domains across all timepoints. Model fit was good in all of these models, supporting the use of latent variables.
Birth parent general intellectual performance
As displayed in Figure 1, we created a latent variable of birth parent general intellectual performance, with the indicators of intelligence, reading, and math test performance listed below, as a proxy for genetic influences on children. Latent measurement drawing on a diverse range of indicators was justified by the internal consistency (birth mother α R = .84; birth father α R = .85) and bivariate correlations among measures of birth parent intelligence and academic test performance in the EGDS sample (Table 1), and the "generalist genes" literature which reports that approximately a third of the genetic variance of reading and math performance is in common with general intelligence (g; Davis et al., 2009b; Plomin & Kovas, 2005).

Wechsler Adult Intelligence Scale-III
We administered the 28-item Information subtest (Wechsler, 1997) to birth parents at 18 months postpartum. It loads onto the verbal comprehension index of the full measure and is considered to be a representative measure of g (g loading = .79; Kaufman & Lichtenberger, 1999). We used standardized scores, based on age.

Woodcock-Johnson Tests of Achievement III
At 4.5 or 7 years postpartum, we administered birth parents four subtests: (1) 76-item Letter-Word Identification, measuring reading decoding; (2) 32-item Word Attack, capturing decoding and phonetic coding; (3) 98-item Reading Fluency, measuring reading speed and semantic processing speed; (4) 160-item Math Fluency, indexing math and numerical performance (Woodcock et al., 2001). We used T-scores, standardized to have a mean of 50 and standard deviation of 10.

Stroop task
At 27 months, we administered the fruits-animals Stroop, modified by the EGDS team based on Kochanska et al. (2000). There were six trials, each scored on a scale from 1 to 3 (1 = incorrect on item and size; 2 = correct item, wrong size; 3 = correct item and size). The trials had strong internal consistency (α = .85) and were averaged to form a scale score. At 4.5 and 6 years, we administered the 16-trial day-night Stroop (Gerstadt et al., 1994), which has robust construct validity and internal and test-retest reliability (Montgomery & Koeltzow, 2010). Each trial had one point for a correct answer. Trials had strong internal consistency (α = .85) and were summed, resulting in a score between 0 and 16.

Gift delay task
At 27 months, children participated in a gift delay task similar to the one described by Kochanska et al. (1996). We coded the videotaped task based on how often the child (1) peeked, (2) touched the gift, and (3) used distraction strategies. In line with Leve, DeGarmo, et al. (2013) we averaged the three items to form a total score of inhibitory control, with higher scores indicating higher inhibition (α = .54; r = .08, .32, and .46 among items).

Guessing game
At 4.5 years old, children completed a task adapted from the Goldsmith and Rothbart (1999) laboratory assessment of temperament (Lab TAB) to measure their inhibitory control when told not to turn around or peek at hidden toys. The task was coded from 1 (not at all) to 5 (continually) on: "How often did the child keep their back turned around when asked to?".

Forbidden gift
We measured inhibitory control in the 4.5-year-olds using a forbidden gift task modified from the Lab TAB, which was videotaped and coded from 1 (very true) to 3 (not true) on whether: "The child asked for the gift".

Dinky toys
This inhibitory control task modified from the Lab TAB involved the 4.5- and 6-year-olds being asked to comply with rules about how to interact with a box of toys. We rated the task on: "The degree to which the child follows or violates instructions" from 1 (violates rules) to 3 (follows all instructions).

Go-NoGo
At 6 years, we administered a Go-NoGo task (Nosek & Banaji, 2001). In this 84-trial version, trials were divided into two blocks, the first of which contained only Go trials (when the child should press a button) and the second an equal combination of Go trials and NoGo trials (in which children are expected to inhibit their prepotent response by refraining from pressing a button). We measured selective attention and inhibition using the percentage of correct responses in the second block to both Go and NoGo stimuli.
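The scoring rule described above (percentage of correct responses to both Go and NoGo stimuli in the second block) can be sketched as follows. This is a minimal illustration, not the study's scoring script: the `Trial` record, its field names, and the toy trial list are hypothetical, and the source does not state how many of the 84 trials fall in each block.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    """Hypothetical trial record; field names are assumptions for illustration."""
    block: int      # 1 = Go-only block, 2 = mixed Go/NoGo block
    is_go: bool     # True for a Go stimulus, False for a NoGo stimulus
    pressed: bool   # whether the child pressed the button

def percent_correct_block2(trials):
    """Percentage of correct responses in block 2.

    A response is correct when the child presses on a Go trial
    or withholds the press on a NoGo trial.
    """
    scored = [t for t in trials if t.block == 2]
    if not scored:
        raise ValueError("no block-2 trials to score")
    correct = sum(t.pressed == t.is_go for t in scored)
    return 100.0 * correct / len(scored)

# Toy data: one block-1 trial (ignored) and four block-2 trials.
trials = [
    Trial(block=1, is_go=True, pressed=True),
    Trial(block=2, is_go=True, pressed=True),    # correct Go
    Trial(block=2, is_go=True, pressed=False),   # missed Go
    Trial(block=2, is_go=False, pressed=False),  # correct NoGo
    Trial(block=2, is_go=False, pressed=True),   # failed inhibition
]
score = percent_correct_block2(trials)
print(score)
```

Scoring Go and NoGo trials together, as the text describes, means the measure reflects both selective attention (responding to Go) and inhibition (withholding on NoGo).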

Language Development Scale
Adoptive parents separately completed a measure of child language development at 27 months, based on the number of words that the child is reported to use spontaneously from a list of 310 items (Achenbach & Rescorla, 2000). Reports from both parents were used as indicators in the verbal performance latent variable at 27 months. Using nationally standardized normed scores, we converted raw scores to percentiles that reflected the child's language performance relative to same-age peers (Achenbach & Rescorla, 2000). Language Development scale scores have moderate to high correlations (r = .66–.87) with scores on standardized vocabulary tests (Klee et al., 1998; Rescorla & Alley, 2001) and are reported to have the best predictive validity performance of the language screening tools (Sim et al., 2019).

Wechsler Preschool and Primary Scale of Intelligence III
We administered the vocabulary assessment to 6-year-olds, measuring learning, comprehension, and verbal expression of vocabulary (Wechsler, 2002). Raw scores from the 50-item measure were converted to standardized scores from 1 to 19, based on the responder's age.

Dynamic Indicators of Basic Early Literacy Skills
We administered four sets of procedures and assessments to 6-year-olds: (1) 16-item Initial Sound Fluency, measuring phonemic awareness; (2) Letter Naming Fluency, capturing proficiency in naming upper- and lower-case letters, using a list of 110 letters; (3) Phoneme Segmentation Fluency, assessing proficiency in fluently segmenting three- and four-phoneme words into their individual phonemes, using a list of 24 words; (4) Nonsense Word Fluency, testing understanding of the alphabetic principle, including letter-sound correspondence, using a list of 50 nonsense words (Good & Kaminski, 2002). Initial Sound Fluency and Letter Naming Fluency have good test-retest reliability (r = .88–.93) and robustly predict later reading performance (Kaminski & Good, 1996). Raw scores, which represent the number of items a child has answered correctly in 1 min, were converted to percentiles, reflecting verbal performance relative to same grade-level peers, based on nationally standardized normed scores (Good & Kaminski, 2002).

Child academic test performance
Justified by the high genetic correlations between reading and math performance in childhood (Davis et al., 2009b; Plomin & Kovas, 2005), and the internal consistency (α = .87) and bivariate correlations in the EGDS sample (Table 1), we created a latent variable to estimate child academic test performance at 7 years old, drawing on the same four indicators of reading and math performance that were administered to birth parents from the Woodcock-Johnson Tests of Achievement III (Woodcock et al., 2001; see Figure 1).

Covariates
We included adoption openness, sex of child, and prenatal risk as covariates. We used a mean standardized composite of birth mother and adoptive parent-reported adoption openness, using a four-item measure (Ge et al., 2008), averaged across ratings provided at 9, 18, and 27 months postpartum. We collected birth mother reports of maternal and pregnancy complications, labor and delivery complications, and neonatal complications at 5 months postpartum and scored them based on the 76-item McNeil-Sjostrom Scale for Obstetric Complications (McNeil et al., 1994). We used a weighted total prenatal risk score based on work by Marceau et al. (2016).

Data analysis
We conducted our primary analyses using birth mother and child data only and used data from a smaller subsample of birth fathers to carry out a quasi-independent replication. Although the birth father sample is the largest ever recruited in a prospective parent-offspring adoption study, it has reduced statistical power compared to the birth mother analyses. Thus, we anticipated that comparisons between results for birth mothers and birth fathers would focus on the magnitude of the path coefficients rather than p values or confidence intervals. We tested our hypothesis in two steps, in the lavaan package (Rosseel, 2012) in R 4.0.0, using structural equation modeling, which combines a measurement model (also known as CFA) with a structural model testing the proposed causal relations. First, we constructed longitudinal models examining: (1) whether EF and verbal performance were stable across 27 months, 4.5 years, and 6 years, and predicted academic test performance at 7 years; and (2) whether there were genetic effects on child EF, verbal performance, and academic test performance. Second, if the models were consistent with the mediation of genetic effects on academic test performance at 7 years through early EF or verbal performance, we ran mediation models examining the indirect effects of birth parent intellectual performance on child academic test performance at 7 years. We included the covariates in all of our models and we used bootstrapping with 5000 repetitions to test the indirect effect in the mediation models (Bollen & Stine, 1990). Based on recommendations by Hu and Bentler (1999), we used a combination rule, according to which model fit was considered adequate if SRMR < .09 and RMSEA < .06.
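To illustrate the logic of testing an indirect effect by bootstrapping, the following is a simplified sketch, not the lavaan models used in the study. It assumes observed (not latent) variables, ordinary least squares paths, simulated data, a percentile confidence interval, and 1000 rather than 5000 resamples to keep the run short; all function and variable names are hypothetical.

```python
import random
import statistics

def centered_sum(u, v):
    """Sum of cross-products of u and v around their means."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def indirect_effect(x, m, y):
    """Product a*b, where a is the slope of M on X and b is the
    slope of Y on M controlling for X (two-predictor OLS)."""
    sxx, smm = centered_sum(x, x), centered_sum(m, m)
    sxm, sxy, smy = centered_sum(x, m), centered_sum(x, y), centered_sum(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)
    return a * b

def bootstrap_ci(x, m, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the indirect effect,
    resampling cases (rows) with replacement."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    draws.sort()
    lower = draws[int((alpha / 2) * n_boot)]
    upper = draws[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Simulated stand-ins: x ~ parent score, m ~ early mediator, y ~ later outcome.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(300)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.4 * mi + 0.1 * xi + rng.gauss(0, 1) for mi, xi in zip(m, x)]

estimate = indirect_effect(x, m, y)
ci_low, ci_high = bootstrap_ci(x, m, y)
print(f"indirect = {estimate:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

Resampling whole cases keeps each child's predictor, mediator, and outcome together, so the interval reflects sampling variability in the product of the two paths rather than in either path alone.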
Variable sample sizes are reported in Table 1 and Figure 1. The primary source of missing data in child EF models using birth mother data was child EF measures at 4.5 years. In child verbal performance models using birth mother data, it was the Dynamic Indicators of Basic Early Literacy Skills Initial Sound Fluency subscale. In birth father and child models, it was missing information on birth father intellectual performance. The data used in the analyses were not missing completely at random [Little's MCAR χ²(4598) = 4884.36, p < .01]. We ran an attrition analysis using the Missing Value Analysis function in SPSS, which creates, for each variable that contains missing values, an indicator variable flagging the cases in which that variable is missing. This indicator is then used to compare group means on the other variables in the dataset, using the t-test procedure. The attrition analysis revealed that the patterns of missingness for the majority (69%) of study variables were related to the observed values of one or more other variables in the dataset. Full results from the attrition analysis are available from the authors on request. This analysis ruled out the possibility that the data were MCAR, which occurs when the probability of being missing is the same for all cases and there is no systematic association between the missingness of the data and any other values, observed or missing. It was not possible to rule out the possibility that the data were missing not at random, which is when the missingness of the data is systematically related to unobserved data. However, the associations found in the attrition analysis are consistent with the data being missing at random (MAR), which occurs when the missingness of a variable is systematically related to the observed but not the unobserved data. Full information maximum likelihood (FIML) and multiple imputation (MI) are both suitable for data that are MAR, and are of comparable performance (Allison, 2003).
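The indicator-variable comparison described above can be sketched in a few lines. This is an illustration on simulated data, not a reproduction of the SPSS procedure: the use of a Welch (separate-variance) t statistic, the missingness probabilities, and all names are assumptions made for the example.

```python
import math
import random
import statistics

def welch_t(group_a, group_b):
    """Welch t statistic for the difference in means (a minus b)."""
    se = math.sqrt(statistics.variance(group_a) / len(group_a)
                   + statistics.variance(group_b) / len(group_b))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / se

def missingness_t(target, other):
    """Split `other` by whether `target` is missing (None), then compare
    the mean of `other` in the missing group against the observed group."""
    missing = [o for t, o in zip(target, other) if t is None and o is not None]
    observed = [o for t, o in zip(target, other) if t is not None and o is not None]
    return welch_t(missing, observed)

# Simulated MAR-style mechanism: `target` is more often missing when the
# fully observed variable `other` is below zero.
rng = random.Random(1)
other = [rng.gauss(0, 1) for _ in range(400)]
target = [None if rng.random() < (0.6 if o < 0 else 0.1) else rng.gauss(0, 1)
          for o in other]

t_stat = missingness_t(target, other)
print(f"t = {t_stat:.2f}")
```

A large absolute t value indicates that missingness on one variable is related to observed values of another, which is evidence against MCAR but, as the text notes, cannot distinguish MAR from missing not at random.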
We prioritized FIML for the results that we report and, additionally, re-ran the models using MI, with 100 imputations. Overall, FIML and MI produced equivalent results. The few discrepancies between them are reported in Supporting Information. Full results from the models using MI are available from the authors on request.

Sensitivity analyses
We conducted sensitivity analyses to examine: (1) the impact of the removal of earlier time points on associations between birth parent general intellectual performance and child EF, verbal performance, and academic test performance; (2) whether the age at which birth parents were administered measures of intellectual performance was associated with their intellectual performance and, if so, whether birth parent age confounded the associations between birth parents and children; and (3) if the indirect association between birth parent intellectual performance and academic test performance at 7 years, via children's earlier verbal performance still held when the mediation models were re-computed using only the math subscale of the academic test performance measure at 7 years. The third sensitivity analysis was conducted as a robustness check to rule out the likelihood that mediated effects on academic test performance via verbal performance were simply due to the content overlap between the measures of early verbal performance and the reading subscales of the academic performance outcome measure at 7 years old. By way of comparison, the EF mediation models were also re-computed, using only the math subscale as the outcome, rather than the latent measure of academic test performance.

RESULTS
Means, standard deviations, sample sizes, and bivariate correlations between study variables are presented in Table 1.

Effects on math performance
As in the original model that was being re-computed, in the sensitivity analysis re-computing the mediation analysis with the latent academic test performance variable at 7 years old replaced with the math fluency subscale of the Woodcock-Johnson, the indirect effect of birth mother intellectual performance on math test performance at 7 years old, mediated through child EF at 27 months, was small and not statistically significant (β = .05, 95% CI [−0.29, 0.39], p = .754). The indirect effect was 36% of the total effect and half the size (50%) of the indirect effect in the original model. Model fit: χ²(52) = 160.94, p < .001, CFI = .83, RMSEA = .09, SRMR = .06.

Effects on math performance
As in the original model that was being re-computed, in the sensitivity analysis re-computing the mediation analysis using the math fluency subscale of the Woodcock-Johnson at 7 years old (rather than the latent measure of academic test performance), the indirect effect of birth father intellectual performance on math test performance at 7 years old, mediated through child EF at 27 months, was small and not statistically significant (β = .02, 95% CI [−0.13, 0.18], p = .768). The indirect effect was 5% of the total effect and 29% the size of the indirect effect in the original model. Model fit: χ²(36) = 49.98, p = .061, CFI = .96, RMSEA = .03, SRMR = .07.

FIGURE 2
Longitudinal structural equation model testing the main effects of birth mother intellectual performance on child EF and academic test performance. Note: Model fit: χ²(170) = 347.59, p < .001, comparative fit index = .90, root mean square error of approximation = .05, standardized root mean square residual = .06. Standardized estimates reported. Dashed lines represent parameters that are fixed to 1. Adoption openness, child sex, and obstetric risk were included as covariates in the model. BM, birth mother; DT, dinky toys; EF, executive function; FG, forbidden gift; GD, gift delay; GG, guessing game; G NG, Go NoGo; LW, letter-word association; MF, math fluency; RF, reading fluency; WA, word-attack; WAIS Info, Wechsler Adult Intelligence Scale-III Information Subscale; WJ, Woodcock-Johnson III. ns p ≥ .1; *p < .05; **p < .01; ***p < .001

Early verbal performance and later academic test performance

Effects on math performance
In the sensitivity analysis that re-computed the mediation analysis with the latent academic test performance variable at 7 years old replaced with the math fluency subscale of the Woodcock-Johnson, the findings were similar to those in the original model that was being re-computed. As in the original model, the indirect effect of birth mother intellectual performance on math test performance at 7 years old, mediated through child verbal performance at 4.5 years, was statistically significant (β = .14, 95% CI [0.03, 0.24], p = .011). The indirect effect was 88% of the total effect and just under two-thirds the size (64%) of the indirect effect in the original model. Model fit: χ²(46) = 149.43, p < .001, CFI = .87, RMSEA = .07, SRMR = .06.

Birth father effects
The model presented in Figure 4 was replicated in a sub-sample of children and their birth fathers. The model did not converge when the data at 27 months were included, so this timepoint was dropped from the model. As in the birth mother model, birth father intellectual performance significantly predicted child verbal performance at 4.5 years (β = .37, 95% CI [0.11, 0.62], p = .005; see Figure 5). Similar to the birth mother findings, there was no evidence of direct effects of birth father intellectual performance on verbal performance at 6 years (β = .08, 95% CI [−0.21, 0.38], p = .575) and the total effect was significant (β = .36, 95% CI [0.14, 0.60], p = .002). Unlike in the birth mother model, there was no evidence of direct effects of birth father intellectual performance on child academic test performance at age 7 years (β = .09, 95% CI [−0.13, 0.30], p = .433), although, as in the birth mother model, the total effect at 7 years was significant (β = .33, 95% CI [0.14, 0.51], p = .001). The model accounted for 18% of the variance in child verbal performance at 4.5 years, 63% of the variance in verbal performance at 6 years, and 50% of the variance in academic test performance at 7 years. A sensitivity analysis revealed that, as in the birth mother sample, when verbal performance at 4.5 years was removed from the model, effects of birth father intellectual performance carried forward to verbal performance at 6 years (Figure S3a). When verbal performance at 4.5 and 6 years was dropped from the model, the effect of birth father intellectual performance on academic test performance at 7 years became significant (Figure S3b). Similar to the birth mother mediation model, the total effect of birth father intellectual performance on academic test performance at 7 years was statistically significant (β = .32, 95% CI [0.13, 0.50], p = .001).
The direct effect of birth father intellectual performance on child academic test performance at 7 years old was not statistically significant (β = .12, 95% CI [−0.09, 0.32], p = .254) and the indirect effect, mediated through child verbal performance at 4.5 years was statistically significant (β = .20, 95% CI [0.04, 0.36], p = .016) and explained 63% of the total effect. The numerical estimates were similar to those in the birth mother model. Model fit: χ 2 (81) = 132.20, p < .001, CFI = .96, RMSEA = .04, SRMR = .07.
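The share of the total effect attributed to mediation can be checked by simple arithmetic: it is the indirect effect divided by the total effect. The following is a minimal sketch using only the rounded standardized estimates quoted above; the function name `proportion_mediated` is illustrative and is not taken from the authors' analysis code.

```python
def proportion_mediated(indirect: float, total: float) -> float:
    """Share of the total effect carried by the indirect (mediated) path."""
    if total == 0:
        raise ValueError("total effect must be non-zero")
    return indirect / total


# Rounded standardized estimates quoted in the text for the
# birth father mediation model (academic test performance at 7 years).
direct = 0.12
indirect = 0.20
total = 0.32

# With a single mediated path, direct + indirect should equal the total
# effect, up to rounding of the reported estimates.
assert abs((direct + indirect) - total) < 0.005

share = proportion_mediated(indirect, total)
print(f"Proportion mediated: {share:.1%}")
```

With these rounded inputs the ratio is 0.625, which agrees with the reported 63% once rounding of the published estimates is taken into account.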

Effects on math performance
In the sensitivity analysis re-computing the mediation model using only the math subscale at 7 years old, the effects of birth father intellectual performance continued to be mediated by verbal performance. As in the original model, there was a significant indirect effect of birth father intellectual performance on child math performance at 7 years old, mediated via child verbal performance at 4.5 years (β = .09, 95% CI [0.02, 0.16], p = .018). The indirect effect was 20% of the total effect and 45% the size of the indirect effect in the original model. Model fit: χ²(46) = 54.55, p = .181, CFI = .98, RMSEA = .02, SRMR = .07.
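The relative sizes quoted for the math-only sensitivity analyses (64% for birth mothers, 45% for birth fathers) follow from dividing each math-only indirect effect by the corresponding indirect effect in the original model. The sketch below reproduces that arithmetic from the rounded estimates reported in the paper; the helper `relative_size` and the dictionary layout are illustrative assumptions, not the authors' code.

```python
def relative_size(sensitivity_effect: float, original_effect: float) -> float:
    """Ratio of a sensitivity-analysis estimate to the original estimate."""
    if original_effect == 0:
        raise ValueError("original effect must be non-zero")
    return sensitivity_effect / original_effect


# (math-only indirect effect, original indirect effect), using the rounded
# standardized estimates reported in the paper for each birth parent model.
reported = {
    "birth mother": (0.14, 0.22),
    "birth father": (0.09, 0.20),
}

ratios = {
    parent: relative_size(math_only, original)
    for parent, (math_only, original) in reported.items()
}

for parent, ratio in ratios.items():
    print(f"{parent}: math-only indirect effect is {ratio:.0%} of the original")
```

The ratios come out at roughly 0.64 and 0.45, matching the percentages given in the text.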

DISCUSSION
Results were consistent with our hypothesis that effects of genetic influences on academic test performance at 7 years old are mediated by children's early verbal performance. Birth mother and birth father general intellectual performance each predicted child verbal performance from 4.5 years onwards, but not at 27 months, and genetic effects on academic test performance at 7 years of age were mediated through verbal performance at 4.5 years. This is consistent with the large literature on genetic influences on children's verbal performance (Stromswold, 2001) and extends the evidence by suggesting that verbal performance from 4.5 years old is an early manifestation of genetic influences on later intellectual performance. As the birth parent outcomes were measured in adulthood, the associations between birth mother or father intellectual performance and child verbal performance at 4.5 years are akin to "instant longitudinal" associations (Plomin, 1986), indicating that early verbal performance may be a marker of genetic effects, not only on academic test performance at 7 years, but also on general intellectual performance in adulthood.
The association between birth parent intellectual performance and child EF at 27 months, previously reported by Leve, DeGarmo, et al. (2013), was limited to this single occasion of measurement and did not reliably carry forward to 4.5 or 6 years in either the birth mother or birth father models. Additionally, the EF mediation models did not provide evidence of mediation of effects on academic test performance at 7 years through EF at 27 months.
FIGURE 3 Longitudinal structural equation model testing the main effects of birth father intellectual performance on child EF and academic test performance. Note: Model fit: χ²(170) = 347.59, p < .001, comparative fit index = .90, root mean square error of approximation = .05, standardized root mean square residual = .06. Standardized estimates reported. Dashed lines represent parameters that are fixed to 1. Adoption openness, child sex, and obstetric risk were included as covariates in the model. BF, birth father; DT, dinky toys; EF, executive function; FG, forbidden gift; GD, gift delay; GG, guessing game; G NG, Go NoGo; LW, letter-word association; MF, math fluency; RF, reading fluency; WA, word-attack; WAIS Info, Wechsler Adult Intelligence Scale-III Information Subscale; WJ, Woodcock-Johnson III. ns p ≥ .1; *p < .05; ***p < .001
Evidence that verbal performance from 4.5 years old may be an early manifestation of genetic influences on later intellectual performance converges with findings from the polygenic score literature. For instance, our findings are in line with results from the Born in Bradford study, reporting that genome-wide polygenic scores of total years of education achieved by adulthood (EA PGS) predicted a composite measure of academic test performance (including aspects of verbal performance) in 6- to 7-year-old school children (Armstrong-Carter et al., 2020). However, our results provide evidence in a younger age group: preschool children aged 4.5 years. The absence of effects, in our sample, of birth parent intellectual performance on verbal performance at 27 months is at odds with detection in the Dunedin Study of a positive association between EA PGS and age of first words spoken, reported by parents when their children were 3 years old (Belsky et al., 2016).
However, our findings are consistent with evidence from the same study (Dunedin) that, while there was no association between children's EA PGS and their scores in the Peabody Picture Vocabulary Test at 3 years old, from 5 years onwards higher EA PGS predicted higher scores of intelligence (captured by composite measures of verbal and nonverbal performance). Our findings are consistent with evidence that in infancy individual differences in verbal performance appear to be influenced to a greater degree by the shared environment than by genetic differences (Galsworthy et al., 2000) but that by middle childhood, heritability of verbal and nonverbal cognitive performance is higher and the shared environmental component reduces (Davis et al., 2009a). Our results are also in line with evidence that the cross-time correlations for genetic influences on cognitive outcomes are low in early childhood and increase substantially across childhood (Tucker-Drob & Briley, 2014), as well as with evidence that from middle childhood the same genetic influences on cognitive skills predominate, increasing in magnitude as children get older (Briley & Tucker-Drob, 2013). As noted by Briley and Tucker-Drob (2013), one possible explanation for higher heritability of verbal and nonverbal cognitive performance by the time children reach school age is that when children enter formal schooling, standardized educational practices somewhat equalize environmental differences between them, allowing genetic differences to have a greater influence on individual differences. An additional explanation (which is compatible with our findings, as well as with the reviewed literature on the increasing heritability of cognitive performance throughout childhood and increasing stability of genetic influences as children age) is that transactional mechanisms of gene-environment interplay amplify genetic effects through processes such as evocative and active rGE (Scarr & McCartney, 1983).

Limitations and future directions
It remains unclear whether the inconsistency of EF effects reflects a lack of effects of birth parent intellectual performance on child EF at later timepoints, and the absence of mediation of genetic effects on intellectual performance via EF, or a failure to operationalize EF sufficiently reliably at these occasions of measurement. Although the EF measures used in the present study were less internally consistent than the measures of verbal performance, the use of latent variables corrected for attenuation by error and the temporal stability of the EF latent variables was high. Compared to the temporal stability of verbal performance, the temporal stability of EF was higher from 27 months to 4.5 years and equivalent at 4.5-6 years. It is also a possibility that EF was less predictive of later academic test performance than verbal performance due to high content overlap between indicators of verbal performance and the indicators of academic test performance that were included. However, this concern is somewhat mitigated by the results from the sensitivity analyses examining effects on only the math indicator of academic performance; the effects of birth parent intellectual performance continued to be mediated via verbal performance at 4.5 years old. This implies that verbal performance from 4.5 years is an early marker of genetic influences on a wider range of scholastic outcomes in middle childhood than simply those that are verbally oriented.
As our aim was to identify the earliest manifestations of genetic influences on later intellectual outcomes, it was important to include measures of EF and verbal performance from as early as 27 months in some of our analyses. However, as the 27-month measures miss important variance that is likely influenced by genetic pathways, estimates of effects on later child outcomes in the models that control for EF and verbal performance this early are substantially prone to omitted variable bias. Models not controlling for the earliest timepoint (which are thus less prone to this bias) are presented in Supporting Information.
While our findings have the potential to aid the development of promotive and preventative interventions, they are unable to resolve uncertainty about whether early verbal performance is a liability index (i.e., there are shared genetic factors that influence both verbal performance and subsequent academic test performance) or a causal mediator of genetic effects on subsequent academic test performance (i.e., limited verbal development would block the development of the skills necessary to perform well in academic tests; Kendler & Neale, 2010). Each would have important but different implications for interventions in childhood. Although both suggest that low verbal performance is a risk factor for low academic test performance, the latter suggests that early intervention targeted at verbal performance might offset risk, whereas the former might be an indication in favor of more sustained support. Future research should be aimed at testing these alternatives, through longitudinal examination of academic test performance following interventions directly on early verbal performance.
It is a strength of the current analysis that we controlled for the influence of the prenatal environment, by including a measure of prenatal risk and through replicating the analyses in the birth father sample. However, the lack of statistical power to accurately estimate the influence of birth father genetic effects is a limitation. Sufficiently powered research is needed on the influence of birth father contributions to intellectual outcomes. Birth father models are not fully independent replications and almost all of the measures of birth mother and birth father intelligence and academic test performance were correlated, suggesting the possibility of assortative mating, confounding, and partner interaction effects. In spite of the potential issues with spousal concordance, the birth father data add strength to our study: fathers play an equal role to mothers in contributing to the child's genotype, provide a control for intrapartum effects, and are under-researched relative to mothers in developmental research. The role of birth fathers as a control for intrapartum effects is somewhat threatened by the potential for fathers to have indirect effects on fetal development through, for example, contributing to the family dynamics in the home, stress level of the mother, and material resources accessible to the mother. However, the likelihood of this confounding our results is diminished by the fact that the rates of birth parent cohabitation in the sample were low.
All behavior genetics findings represent "what is" in a particular sample and cultural context rather than what "could be" in a different context (Plomin et al., 2016). Consequently, it may be that there are features of the cultural milieu experienced by the U.S.-based adopted children in our sample that "transmit" low-level genetic differences into differences in academic test performance to a greater or lesser degree than other cultural contexts might. Investigations into the representativeness of the EGDS sample have found that participating adoptive families appear to be representative of the U.S. population. However, relative to the birth parents, they are of higher socioeconomic status (SES; Leve et al., 2019), which may bias findings. It cannot be assumed that the conclusions of this study hold for children reared in low SES environments, particularly as SES appears to moderate genetic effects on intellectual outcomes (Capron & Duyme, 1989;Tucker-Drob & Bates, 2016). There is evidence from the UK Biobank that EA PGS are more predictive of educational outcomes among nonadopted than adopted children, and that children in the lowest decile of polygenic score for educational attainment reach a significantly higher level of education if they are adopted than if they are not adopted (Cheesman et al., 2020). This converges with evidence from the United States that children with low preadoption IQ scores experience substantial IQ score gains when adopted into high-SES families (Duyme et al., 1999), as well as with evidence that adoptees tend to academically out-perform their nonadopted biological siblings (Kendler et al., 2015). Collectively, these results indicate that genetic influences on education may be mediated by rearing environments or the wider cultural contexts that are associated with different rearing environments.
Additionally, they suggest that estimates of direct genetic effects on academic outcomes may include mechanisms of rGE and interaction, pointing to the possibility that genetic differences correlate and interact with different environmental mechanisms in different sociocultural contexts. There is evidence to suggest that different ethnic groups in the United States and United Kingdom may exhibit different trajectories of verbal development (Saccuzzo et al., 1992;Zilanawala et al., 2016). For example, in the UK Millennium Cohort Study, the ethnic groups in the sample had different odds of being in high or low performing profiles of verbal development in early childhood and these observed differences were mediated by the home learning environment, family routines, and the psychosocial environment (Zilanawala et al., 2016). Such findings illustrate the nuances of verbal development in different contexts and suggest that our results might not hold in samples from different cultural and ethnic groups or socio-economic circumstances, within or outside of the United States. It remains unclear how mechanisms of gene-environment interplay influence the development of academic outcomes in a diverse range of cultural contexts. Most behavior genetics research, including the present study, is conducted in developed countries and majority White samples. Replication of these methods in other countries and sociodemographic groups is needed and until then it cannot be assumed that the present findings generalize to other cultural contexts. Our interest in identifying a mediator in the association between birth parent and adopted offspring intellectual performance stems, in part, from an overarching aim to understand how rearing and learning environments may amplify the early manifestations of genetic influences on intellectual performance.
However, it was not possible to form hypotheses about evocative effects of genetic influences underlying intellectual development without first identifying an early manifestation of genetic advantage that might elicit favorable and amplifying effects from parents. Now that we have identified early verbal performance as a likely mediator of genetic influences on lifespan intellectual outcomes, we can posit early caregiving and learning conditions that might amplify genetic advantage. Children's verbal performance predicts parenting quality (including dimensions of parenting such as sensitivity, positive regard, cognitive stimulation, and responsiveness), which in turn predicts reading performance (Lugo-Gil & Tamis-LeMonda, 2008;Tucker-Drob & Harden, 2012). Consequently, future research should explore whether these aspects of parenting amplify genetic advantage in verbal performance.

CONCLUSION
This is the first study to examine whether early EF or verbal performance mediates genetic effects on later intellectual performance. Effects of birth parent intellectual performance on child academic test performance at 7 years old were mediated through verbal performance at 4.5 years old but were not mediated by early EF. These findings suggest that early verbal performance may be a manifestation of genetic advantage for lifespan intellectual outcomes. Based on the importance of intellectual performance for lifelong health and adjustment, the apparent role of early verbal performance in intellectual development represents a critical finding.

ETHICS STATEMENT
Ethical approval was obtained from institutional review boards at the University of Oregon (Protocol number: 0304201400) and The Pennsylvania State University (Submission ID: CR00007591).