DNA methylation at birth and lateral ventricular volume in childhood: a neuroimaging epigenetics study

Background: Lateral ventricular volume (LVV) enlargement has been repeatedly linked to schizophrenia; yet, what biological factors shape LVV during early development remains unclear. DNA methylation (DNAm), an essential process for neurodevelopment that is altered in schizophrenia, is a key molecular system of interest.
Methods: In this study, we conducted the first epigenome-wide association study of neonatal DNAm in cord blood with LVV in childhood (measured using T1-weighted brain scans at 10 years), based on data from a large population-based birth cohort, the Generation R Study (N = 840). Employing both probe-level and methylation profile score (MPS) approaches, we further examined whether epigenetic modifications identified at birth in cord blood are: (a) also observed cross-sectionally in childhood using peripheral blood DNAm at age 10 years (Generation R, N = 370) and (b) prospectively associated with LVV measured in young adulthood in an all-male sample from the Avon Longitudinal Study of Parents and Children (ALSPAC, N = 114).
Results: At birth, DNAm levels at four CpGs (annotated to potassium channel tetramerization domain containing 3, KCTD3; SHH signaling and ciliogenesis regulator, SDCCAG8; glutaredoxin, GLRX) prospectively associated with childhood LVV after genome-wide correction; these genes have been implicated in brain development and psychiatric traits including schizophrenia. An MPS capturing a broader epigenetic profile of LVV (but not individual top hits) showed significant cross-sectional associations with LVV in childhood in Generation R and prospectively associated with LVV in early adulthood within ALSPAC.
Conclusions: This study finds suggestive evidence that DNAm at birth prospectively associates with LVV at different life stages, albeit with small effect sizes. The prediction of LVV by the MPS in a childhood sample and an independent male adult sample further underscores the stability and reproducibility of DNAm as a potential marker for LVV. Future studies with larger samples and comparable time points across development are needed to further elucidate how DNAm associates with this clinically relevant brain structure and risk for neuropsychiatric disorders, and what factors explain the identified DNAm profile of LVV at birth.


Introduction
Schizophrenia is an umbrella term encompassing a range of severe and highly heterogeneous psychiatric symptoms (Kahn et al., 2015). The heterogeneity of the disorder complicates our understanding of its potential pathophysiology. Intermediate phenotypes with a greater biological resolution, such as brain structures, may help to reduce this heterogeneity and improve our ability to identify underlying molecular pathways (Birnbaum & Weinberger, 2017). Enlargement of the lateral ventricles is one of the most replicated findings in schizophrenia (Kuo & Pogue-Geile, 2019; Olabi et al., 2011; van Erp et al., 2016), making it a particularly promising neural intermediate phenotype.
The lateral ventricles are the largest cavities of the cerebral ventricular system, taking shape during the embryonic development of the primitive neural tube (Orahilly & Muller, 1990). The development of these ventricles is thought to be shaped by both genetic and environmental influences beginning in utero. In particular, lateral ventricular volume (LVV) shows a twin and familial heritability ranging from 32% to 35% in childhood to 75% in adulthood, and an SNP-based heritability of 20% (Kremen et al., 2012; Vojinovic et al., 2018). Moreover, genetic risk for schizophrenia has been found to act jointly with environmental factors, such as birth complications or postnatal exposure to chronic stress, to predict ventricular enlargement in adolescence and young adulthood (Aas et al., 2013; Bolhuis et al., 2019; Cannon et al., 1993). Yet, how these influences come together at a molecular level to shape early LVV and downstream risk for schizophrenia, which typically manifests years later in adolescence and young adulthood, remains unclear.
Epigenetic processes, such as DNA methylation (DNAm), have emerged as a potential molecular system of interest, as they: (a) regulate gene activity in response to environmental influences beginning in utero (e.g. maternal stress; Kotsakis Ruehlmann et al., 2023) and (b) are partly under genetic control (Min et al., 2021). Furthermore, DNAm plays an important role in healthy neurodevelopment (Dall'Aglio et al., 2018), and disruptions in DNAm have been linked to psychiatric disorders, including schizophrenia (Chen et al., 2020; Richetto & Meyer, 2021; Smigielski, Jagannath, Rössler, Walitza, & Grünblatt, 2020). Most studies to date have associated DNAm with schizophrenia using postmortem brain tissues, which cannot inform on developmental processes in living individuals (Lancaster, Morris, & Connelly, 2018). Neuroimaging epigenetics can help to bridge this gap by combining DNAm from peripheral tissues with in vivo brain imaging techniques, such as structural magnetic resonance imaging (MRI; Walton et al., 2023; Wheater et al., 2020). This approach could provide novel insights into epigenetic processes contributing to brain features associated with schizophrenia risk, such as LVV enlargement. While making mechanistic inferences about the role of peripheral DNAm in brain-based phenotypes is challenging (e.g. due to cross-tissue variability in DNAm), research has increasingly demonstrated the potential of peripheral DNAm as a non-causal biological marker for disease prediction, stratification, and diagnosis (Cecil, Neumann, & Walton, 2023; Min et al., 2021).
To our knowledge, no study has yet examined epigenome-wide DNAm patterns associated with LVV. Although previous studies have linked peripheral DNAm to other brain regions in the context of schizophrenia, such as hippocampal volume, these have primarily used cross-sectional designs in clinical samples of adults (Chen et al., 2020; Wheater et al., 2020). As such, it remains unclear whether DNAm patterns prospectively associate with brain features in the general pediatric population, before the manifestation of later psychotic symptoms. To clarify the direction of DNAm-brain associations and minimize potential confounding, such as by the use of medication, prospective studies linking early DNAm with the developing brain are needed. Furthermore, the focus on DNAm measured at a single time point poses challenges, given that DNAm is highly developmentally dynamic (Mulder et al., 2021), and importantly, that associations with brain-based phenotypes can be temporally specific. For example, growing evidence shows that DNAm at birth is more strongly associated with certain neurodevelopmental outcomes, such as ADHD and social communication deficits (Neumann et al., 2020; Rijlaarsdam, Cecil, Relton, & Barker, 2021), compared with DNAm patterns measured concurrently in childhood (i.e. prospective > cross-sectional associations). Leveraging repeated measures of epigenetic data is thus key to characterizing how DNAm markers may associate with brain phenotypes at different stages of development. However, little research to date has embedded replication attempts, which is important to establish the reliability and generalizability of findings.
In light of these gaps, we used data from two independent population-based cohorts, the Generation R Study and the Avon Longitudinal Study of Parents and Children (ALSPAC), to characterize early epigenetic correlates of LVV as well as to test the generalizability of our findings across different developmental periods. Based on Generation R, we performed the first epigenome-wide association study (EWAS) examining prospective associations between DNAm at birth (cord blood) and LVV in childhood (age 10 years). To verify the relevance of LVV to psychotic outcomes in the general pediatric population, we also examined the prospective association between child LVV and psychotic-like experiences in adolescence. We then utilized both probe-level and methylation profile score (MPS) approaches to evaluate the generalizability of our findings by: (a) testing whether associations are temporally stable when using DNAm measured at 10 years in Generation R (i.e. cross-sectional associations with LVV), and (b) testing whether DNAm at birth continues to predict LVV in young adulthood (as opposed to childhood in the discovery analyses) using an independent sample of males from the ALSPAC cohort.

Participants
Primary analyses were conducted using data from the Generation R Study, a population-based prospective cohort from early fetal life onward, based in Rotterdam, the Netherlands. The design and sample characteristics of Generation R have been described in detail elsewhere (Kooijman et al., 2016). DNAm data were collected at birth for a subsample of 1,396 European singletons. Of these, 1,382 remained after quality control and harmonization, including 840 children (50.4% female) with available data on neuroimaging measured at a mean (SD) age of 10.1 (0.6) years and relevant covariates. We also included participants with information on self-reported psychotic-like experiences (hallucinations: N = 2,360; delusions: N = 1,899) at a mean (SD) age of 13.6 (0.4) years. Furthermore, 370 children had available data on DNAm (based on peripheral whole blood) at age 10 years, alongside neuroimaging and relevant covariates. A flowchart of sample selection is shown in Figure S1.
The generalizability analyses were performed in an independent population-based prospective cohort, the ALSPAC study in the UK (Boyd et al., 2013; Fraser et al., 2013), based on data from 839 European singletons with available cord blood DNAm at birth. Of these, 114 males had neuroimaging measures at a mean (SD) age of 19.5 (0.8) years, and 528 participants (40.9% male) completed a semi-structured interview of psychotic-like experiences at a mean (SD) age of 24.4 (0.7) years. Full cohort descriptions are provided in Appendix S1.

Measures
Full information on measures of DNAm, structural neuroimaging, and psychotic-like experiences within each cohort is provided in Appendix S1.
DNA methylation. Briefly, DNAm was measured in cord blood (Generation R and ALSPAC) and peripheral whole blood at age 10 years (only in Generation R). Samples were further processed with the Illumina Infinium HumanMethylation450 BeadChip (Illumina Inc., San Diego, CA, USA). Initial quality control was performed using the CPACOR workflow (Lehne et al., 2015) in Generation R and the meffil package (Min, Hemani, Davey Smith, Relton, & Suderman, 2018) in R in ALSPAC. To minimize cohort effects, we previously combined samples from the two cohorts and then normalized them as a single dataset (Mulder et al., 2020). Functional normalization (10 control probe principal components, slide included as a random effect) was performed with the meffil package in R.
Normalization took place on the combined Generation R and ALSPAC data for a total of 485,512 CpG sites. More detailed quality control and normalization steps for the combined dataset have been described previously (Mulder et al., 2020). Methylation levels below the lower quartile minus 3 × interquartile range or above the upper quartile plus 3 × interquartile range were identified as outliers and winsorized.
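The outlier rule above can be illustrated with a minimal Python sketch. It assumes that winsorizing means replacing each outlying value with the nearest fence (lower quartile minus 3 × IQR, or upper quartile plus 3 × IQR); the text does not state the replacement value, the quartile estimator is also an assumption, and the function name and toy values are hypothetical.

```python
from statistics import quantiles

def winsorize_iqr(values, k=3.0):
    """Clamp values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest fence.

    Assumption: outliers are replaced by the fence value itself.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]

# Toy methylation values at one CpG; the last value is an extreme outlier.
betas = [0.10, 0.12, 0.11, 0.13, 0.12, 0.11, 0.95]
cleaned = winsorize_iqr(betas)
```

In this toy example only the last value is pulled in to the upper fence; the other six values are left unchanged.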
Structural MRI. In Generation R, MRI was acquired at age 9-11 years on a 3-Tesla MRI system (MR750w, General Electric, Milwaukee, WI, USA) using an 8-channel head coil (White et al., 2018). In ALSPAC, MRI was acquired at age 18-21 years on a 3-Tesla General Electric HDx (GE Medical Systems) using an 8-channel head coil (Sharp et al., 2020). It should be noted that neuroimaging data in ALSPAC were available in male participants only, as they were generated as part of a specific project on the effect of pubertal testosterone on brain development. In both cohorts, automatic volumetric segmentations of the structural T1-weighted images were processed using the FreeSurfer package. Total LVV was standardized using Z-score transformation.

Psychotic-like experiences (PLE).
In Generation R, we used two measures of psychotic-like experiences which were self-reported by children at age 13-14 years, including hallucinations assessed with two items from the Youth Self-Report (Ivanova et al., 2007), and delusions assessed with six items from the Kiddie Schedule for Affective Disorders and Schizophrenia (Adriaanse, van Domburgh, Zwirs, Doreleijers, & Veling, 2015; Kaufman et al., 1997). Following previous research, the sum scores of hallucinations were categorized into three groups: no symptoms, mild, and moderate-to-severe symptoms (Bolhuis et al., 2018), whereas the sum score of delusions was used as a continuous variable (Adriaanse et al., 2015). In ALSPAC, psychotic-like experiences at age 24 were identified through the face-to-face, semi-structured psychosis-like symptom interview (PLIKSi; Zammit et al., 2013). Interviews were conducted by trained psychology graduates in assessment clinics. Total scores from the PLIKSi were recoded into a binary variable indicating none versus suspected or definite psychotic-like experiences that were not attributable to sleep or fever.
Covariates. Covariates for epigenetic association analyses included child sex, maternal age at delivery (in years), prenatal maternal smoking (binary categorization of 'no smoking/quit in early pregnancy' vs. 'smoked throughout pregnancy'), gestational age at delivery (in weeks), child age at MRI assessment (in years), total brain volume, technical covariates (i.e. batch effect in Generation R, surrogate variables in ALSPAC), and estimated cell-type proportions using a cord-blood-specific reference-based method (Gervin et al., 2019) in both cohorts and the Houseman method with the Reinius reference set (Houseman et al., 2012; Reinius et al., 2012) at 10 years in Generation R.

Statistical analysis
Epigenome-wide association study. Linear regression was applied to investigate the prospective association between neonatal DNAm and LVV in childhood. In our EWAS analyses, the DNAm β value at each CpG site was specified as the predictor and total LVV as the outcome of interest, adjusted for all relevant covariates as described above. Probes were annotated using the meffil R package (Min et al., 2018), based on genome build hg19. p-values were adjusted for multiple testing using false discovery rate (FDR) adjustment (Benjamini & Hochberg, 1995) and only sites with FDR-corrected q < .05 were considered statistically significant.
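As an illustration of the multiple-testing step, the Benjamini-Hochberg adjustment can be sketched in a few lines of Python. This is a generic sketch of the published procedure, not the code used in the study; the function name and the five p-values are hypothetical, standing in for the per-CpG regression p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the original order of p_values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

# Hypothetical p-values for five CpG sites.
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)
significant = [i for i, qi in enumerate(q) if qi < 0.05]
```

With these toy inputs, only the first two sites pass q < .05, even though four of the raw p-values are below .05.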
Due to the known sex differences in LVV (Trimarchi et al., 2013) and the availability of MRI data only in males within ALSPAC, we performed a sex-stratified EWAS as a sensitivity analysis to provide a more direct comparison with the replication sample.A flow chart overview of the analytical approach in this study is presented in Figure 1.
Follow-up analyses. Follow-up analyses were performed to further characterize LVV-associated CpGs (FDR q < .05) as follows: (a) We used two independent online tools to test whether top CpG sites show concordance between peripheral blood and brain, based on live brain tissue from 12 epilepsy patients (IMAGE-CpG; https://han-lab.org/methylation/default/imageCpG) and postmortem brain tissue from 71 to 75 adult participants (https://epigenetics.essex.ac.uk/bloodbrain/); (b) Gene expression profiles were probed across 54 human tissues and in the developing brain over the lifespan, using GTEx and BrainSpan data from the FUMA portal (https://fuma.ctglab.nl/); (c) Look-up of methylation quantitative trait loci (mQTL) was performed using the largest mQTL database to date to explore potential genetic influences on DNAm levels (https://mqtldb.godmc.org.uk/).
Given that DNAm of CpGs below the epigenome-wide significance threshold may also account for phenotype variation and improve prediction accuracy, LVV-associated CpGs with a suggestive p < 1 × 10⁻⁴ were taken forward to the following analyses. First, we assessed whether LVV-associated CpG sites are enriched for genetic liability to schizophrenia by performing colocalization analysis, a procedure that estimates the probability of a single shared variant. To this end, we first queried GoDMC (Min et al., 2021) to identify independent cis-mQTLs for the LVV-associated CpGs (p < 1 × 10⁻⁸, within 1 Mb from the CpG). When such mQTLs were identified, we extracted all available SNP associations within a 1 Mb radius from these index mQTLs from GoDMC, and intersected them with SNPs from a GWAS of schizophrenia (Lam et al., 2019). We excluded CpG sites with fewer than 10 mQTLs. On the resulting data, a Bayesian colocalization analysis was performed using the coloc.abf function with default skeptical priors from the coloc R package (Giambartolomei et al., 2018; Wallace, 2020). A posterior probability of a single shared variant (PP H4) ≥ .8 was considered sufficient evidence for colocalization (see Appendix S1 for further details). Second, pathway enrichment analysis was performed based on Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) using the gometh function of the missMethyl R package (Phipson, Maksimovic, & Oshlack, 2016). Independent pathways with FDR q < .05 were considered significant. Last, to verify the relevance of childhood LVV to psychotic outcomes in the pediatric population, we tested whether LVV at age 10 years was prospectively associated with hallucinations (categorical outcome using ordinal logistic regression analysis) and delusions (dimensional outcome using multivariable linear regression analyses) later in adolescence. Analyses were adjusted for sex, age at assessment of psychotic-like experiences, ethnicity, and total brain volume.
Generalizability of neonatal DNAm markers associated with LVV. Temporal stability from birth to childhood in Generation R: Using the repeated measures of epigenetic data in Generation R, we examined whether the DNAm sites identified at birth remained associated with childhood LVV when measured again using peripheral blood at age 10 years. To this end, we tested whether DNAm levels for top CpG sites (FDR q < .05) correlated between time points (birth and age 10), and whether they were cross-sectionally associated with LVV (i.e. both DNAm and LVV assessed concurrently at age 10 years) based on linear regression analyses.
In addition to testing single top CpGs, we also applied an MPS approach to examine associations using a broader epigenetic profile of LVV. Specifically, we first preselected LVV-associated CpGs at p < 1 × 10⁻⁴ from the cord blood EWAS. We then split the cord blood DNAm sample into training (80%, N = 672) and testing (20%, N = 168) datasets using the createDataPartition function in the caret R package (Kuhn, 2008), accounting for even distribution of LVV in the training and testing datasets.
The preselected CpGs were used to establish the predictive model and feature selection in the training dataset using elastic net regularization (ENR), a machine-learning approach. An advantage of ENR is that it selects informative features without compromising prediction accuracy (Zou & Hastie, 2005). Optimal combinations of the mixing (alpha) and shrinkage (lambda) parameters were determined via 10-fold cross-validation implemented in the cva.glmnet function of the glmnetUtils package (Friedman, Hastie, & Tibshirani, 2010).
CpGs with non-zero coefficients from the elastic net model with the optimal alpha and lambda values were extracted and used as external weights to construct MPS in the testing set, which enabled us to evaluate direct replication and prediction performance. Specifically, an ENR-weighted DNAm sum score for LVV (i.e. MPS LVV) was calculated by multiplying the methylation value at a given CpG by the ENR estimated weight, and then summing these values: MPS LVV = β1 × CpG1 + β2 × CpG2 + … + βi × CpGi. To assess the incremental utility of the MPS LVV over and above covariates, we estimated the incremental R² by comparing the predictive performance of the full model including the MPS LVV to that of the covariate-only model.
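The score construction and the incremental R² comparison can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the study's R pipeline: the weights, CpG names, covariate (age only), and outcome values are all hypothetical toy data, and R² is taken from ordinary least squares fits of the covariate-only and full models.

```python
def methylation_profile_score(betas, weights):
    """Weighted sum of methylation values over the CpGs that carry a weight."""
    return sum(w * betas[cpg] for cpg, w in weights.items())

def r_squared(X, y):
    """R^2 of an OLS fit of y on the columns of X (an intercept is added here)."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    coef = [A[i][p] / A[i][i] for i in range(p)]
    fitted = [sum(b * v for b, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical elastic-net weights for two illustrative CpGs (not from the study).
weights = {"cpg_a": 0.5, "cpg_b": -0.25}
betas = [
    {"cpg_a": 0.2, "cpg_b": 0.8}, {"cpg_a": 0.4, "cpg_b": 0.6},
    {"cpg_a": 0.6, "cpg_b": 0.5}, {"cpg_a": 0.8, "cpg_b": 0.1},
    {"cpg_a": 0.5, "cpg_b": 0.4}, {"cpg_a": 0.3, "cpg_b": 0.9},
]
age = [9.8, 10.1, 10.4, 9.9, 10.6, 10.2]   # toy covariate
lvv = [-0.9, -0.1, 0.3, 1.2, 0.4, -0.6]    # toy z-scored outcome

mps = [methylation_profile_score(b, weights) for b in betas]
r2_covariates = r_squared([[a] for a in age], lvv)
r2_full = r_squared([[a, s] for a, s in zip(age, mps)], lvv)
incremental_r2 = r2_full - r2_covariates
```

Because the weights are fixed before the score is computed in the held-out data, the incremental R² reflects out-of-sample prediction rather than in-sample fit; the toy data here are constructed so that the score tracks the outcome closely.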
Next, the ENR estimated weights for selected CpGs were used to construct MPS LVV_childhood using whole blood DNAm at 10 years in Generation R. We then examined the cross-sectional association between MPS LVV_childhood and LVV in childhood.
Generalizability of cord blood epigenetic associations with adult LVV in ALSPAC: Finally, we explored the generalizability of our findings in an independent all-male sample (N = 114) with complete data on cord blood DNAm at birth and MRI at age 18-21 years from ALSPAC. Again, we examined both the individual top CpGs as well as the MPS (constructed using cord blood DNAm in ALSPAC). In addition to the MPS LVV derived from the discovery EWAS at birth in the overall sample (Generation R, N = 840), the same ENR approach was repeated to construct an MPS LVV_males in ALSPAC, based on the all-male EWAS (Generation R males, N = 417), to maximize comparability with the characteristics of the replication sample. We estimated how much variance (incremental R²) in LVV and psychotic-like experiences was explained by the ENR-based MPS LVV_males and MPS LVV, respectively. Covariates similar to those in Generation R were included in the regression models (with differences such as the ages at outcome assessment and the use of surrogate variables in ALSPAC).
In addition, we fitted a logistic regression model to investigate whether LVV at age 20 is associated with psychotic-like experiences at age 24 in ALSPAC, after adjusting for total brain volume and age at outcome assessment.

Results
Sample characteristics are described in Table 1. We compared children included in our selected EWAS sample (N = 840) with children who had complete data on LVV and covariates but not on neonatal DNAm (N = 1,928). This showed that children in the analytical sample on average were born later, had mothers who were older, and had a larger total brain volume. No differences were found for child sex, maternal education, smoking during pregnancy, and LVV.

Epigenome-wide association study
At birth, four CpGs prospectively associated with childhood LVV after FDR correction (q < .05; Table 2 and Figure 2A), including (a) cg23923495, located in the gene body of KCTD3 (potassium channel tetramerization domain containing 3); (b) cg20995689, located in the gene body of SDCCAG8 (SHH signaling and ciliogenesis regulator); (c) cg08945340, located at an intergenic site that is not annotated to any genes; and (d) cg10949007, annotated to a promoter region of the GLRX gene (glutaredoxin). Suggestive associations (p < 1 × 10⁻⁵) were observed at 26 additional CpG sites (Table 2). There was no evidence of genomic inflation in the EWAS (λ = 1.16; see the quantile-quantile plot in Figure 2B).
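For readers unfamiliar with the genomic inflation factor (commonly written λ), a minimal Python sketch of one standard way to compute it from EWAS p-values follows. It assumes two-sided p-values and the usual definition as the median observed 1-df chi-square statistic divided by its expected median under the null; the function name and simulated p-values are hypothetical, and the study's own computation may differ in detail.

```python
from statistics import NormalDist, median

def genomic_inflation(p_values):
    """Median observed chi-square (1 df) divided by its expected null median."""
    nd = NormalDist()
    # A two-sided p-value maps to a 1-df chi-square via the squared normal quantile.
    chi2 = [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.75) ** 2  # about 0.455
    return median(chi2) / expected_median

# Evenly spaced p-values mimic an EWAS with no signal, so lambda should be near 1.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(null_p)
```

Values of λ close to 1 indicate that the bulk of the test statistics follow the null distribution; larger values indicate that test statistics are systematically larger than expected.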
One of these four probes, cg10949007, also remained significant after FDR correction in the all-male EWAS (N = 417). Two additional CpGs, cg22874802 and cg23737062, mapped to the EPHA2 and FBXL22 genes, were identified after FDR correction in the male sample. No CpGs reached genome-wide significance in females, despite using a comparable sample size (N = 423). Detailed results of the sex-stratified EWASs are shown in Table S1 and Figure S2.

Follow-up analyses
Blood-brain concordance. IMAGE-CpG using live brain tissue showed that DNAm in blood at cg08945340 was highly and positively correlated with DNAm in brain tissue (r = .62, Table S2), whereas two other sites showed negative correlations (cg10949007, r = −.30; cg20995689, r = −.40), suggesting moderate inverse methylation patterns between blood and brain. A weaker correlation pattern for all four CpGs was observed when using the Blood Brain DNA Methylation Comparison Tool (r_s = −.16 to .23; Table S2), with cg23923495 showing the highest correlation between blood and prefrontal cortex (r = .23).
Gene expression analysis across tissues and across brain developmental stages. Next, we assessed gene expression levels across 54 tissues including blood and several brain regions from public GTEx data (Figure S3a). Generally, KCTD3, SDCCAG8, and GLRX were expressed across multiple tissues, including all the brain regions. In particular, both KCTD3 and GLRX were highly expressed in the cerebellum, while SDCCAG8 showed similar levels of expression in all the brain regions. Data from BrainSpan further indicated that KCTD3 and SDCCAG8 were moderately to highly expressed during the prenatal period, decreasing in expression after birth. In contrast, GLRX showed higher expression in adulthood compared with other developmental stages (Figure S3b).
Methylation quantitative trait loci. Of the four significant CpGs identified at birth, cg10949007 in GLRX and cg08945340 were associated with 78 known mQTLs (p < 1 × 10⁻⁸, Table S3). All the identified associations were in cis.
Colocalization analysis. Using a total of 140 LVV-associated CpGs at p < 1 × 10⁻⁴, we tested for the presence of shared genetic variants with schizophrenia GWAS loci. We identified 98 independent cis-mQTLs at p < 1 × 10⁻⁸ from GoDMC. After intersecting with schizophrenia genetic variants within a 1 Mb radius, 57 CpG sites remained with a sufficient number of SNPs (≥10) for colocalization analysis (Table S4). One CpG site (cg08170519) annotated to IGSF9B showed strong evidence of colocalization (PP H4 = .98), suggesting a single shared variant affecting the two traits (i.e. LVV-associated DNAm and schizophrenia).

Pathway enrichment analysis. After removing all probes containing an SNP and cross-reactive probes, a total of 114 CpGs at p < 1 × 10⁻⁴ were annotated to 86 genes. GO and KEGG analyses yielded no significantly enriched common biological processes, cellular components, or molecular functions after FDR correction (q < .05). The top GO terms and KEGG pathways are included in Table S5a-b.

Generalizability of neonatal DNAm markers associated with LVV
We performed ENR to select informative CpG features for constructing the MPS of LVV. First, we preselected the 140 CpGs significant at p < 1 × 10⁻⁴ from the discovery EWAS based on the overall sample at birth. Cross-validation identified a panel of 125 out of 140 preselected CpGs with an optimal combination of alpha (0.027) and lambda (0.170) in the training set. An MPS LVV based on the ENR estimated weights was strongly predictive of LVV in the testing set (incremental R² = 0.22). Detailed results of all MPS analyses are shown in Table 3.
To construct the male-specific MPS LVV_males, a total of 136 CpGs were preselected at p < 1 × 10⁻⁴ from the all-male EWAS performed at birth. An optimal combination of alpha (0) and lambda (.276) included all 136 preselected CpGs in the training set (N = 336). The MPS LVV_males showed an excellent prediction of LVV in the testing set (incremental R² = .33).
These ENR-estimated nonzero coefficients were then used as external weights to construct: (a) an MPS LVV_childhood (using whole blood DNAm at 10 years) in Generation R to test cross-sectional associations with LVV, and (b) an MPS LVV and an MPS LVV_males (using cord blood DNAm at birth) in an independent sample of older males in ALSPAC, to test prospective associations with LVV in young adulthood.
Temporal stability from birth to childhood in Generation R. The correlations between DNAm levels at the four top sites across time points (i.e. at birth vs. childhood) were low, ranging from −0.04 to 0.18 (Table S6a), which suggests high variability over time. No nominally significant cross-sectional associations at 10 years were observed for these individual CpG sites (Table S6b). In contrast to the single probe analysis, the MPS LVV_childhood was significantly associated with LVV (b = .19, SE = .06, p = .001), explaining 2.6% of the variance of LVV in the childhood DNAm sample.
Generalizability of cord blood epigenetic associations with adult LVV in ALSPAC. As a final step, we tested the generalizability of our findings in ALSPAC using LVV measured at age 20 years. None of the four top CpG sites at birth reached statistical significance in the probe-level analysis, with three sites showing a consistent direction of associations (Table S7a,b). However, the male-specific MPS LVV_males significantly associated with LVV in this all-male adult sample, explaining 3.9% of the variance in LVV on top of all covariates (b = .34, SE = .15, p = .02). The model with MPS LVV (based on a discovery EWAS using both males and females) explained 3% of the variance in this phenotype (b = −.30, SE = .14, p = .04) but in a negative direction.
No significant association between the MPS LVV and psychotic-like experiences was observed in ALSPAC (b = −13.17, SE = 7.63, p = .08). We also did not find a prospective association between LVV and

Discussion
To our knowledge, this is the first study to investigate prospective associations between epigenome-wide DNAm at birth and LVV in childhood, a key brain feature implicated in risk for schizophrenia. We identified four CpGs at birth that were prospectively associated with variation in LVV at age 10 years, based on data from over 800 participants. These sites map to genes that have been implicated in brain development and psychiatric disorders including schizophrenia, pointing to interesting candidates for future investigation. None of these four CpGs showed temporally stable associations in childhood (using repeatedly assessed DNAm in Generation R) or were found to prospectively associate with LVV in an independent sample of young male adults (ALSPAC), likely due to the small effect sizes of these single CpGs in combination with the use of smaller sample sizes for these analyses. Notably, however, the use of a broader MPS of LVV including a much larger selection of CpGs led to improved predictive power, showing significant cross-sectional associations with LVV in childhood in Generation R (i.e. temporal stability) and significant prospective associations with LVV in young adulthood in ALSPAC (reproducibility). We also found evidence for the colocalization of genetic variants associated with LVV-related CpGs and schizophrenia. In the future, the use of larger multi-cohort analyses and replication efforts (Rijlaarsdam et al., 2016) with neuroimaging data collected at more comparable time points will be important in order to further elucidate the role of these DNAm patterns in LVV development and related risk for schizophrenia, and what factors explain the identified DNAm profile of LVV at birth.

DNA methylation patterns at birth prospectively associate with LVV in childhood
Our EWAS at birth identified four CpGs after genome-wide correction, three of which were annotated to genes. The top site (cg23923495) is located in the gene body of KCTD3, a gene encoding a member of the KCTD (potassium channel tetramerization domain) protein family that is a major regulator of cAMP signaling, which is essential for neuronal function (Muntean et al., 2022). KCTD3 is widely expressed in the brain and has been implicated in several neurocognitive and neurodevelopmental disorders, including cerebellar hypoplasia and autism (Marshall et al., 2008; Teng et al., 2019). SDCCAG8 (cg20995689), a centrosome-associated protein-encoding gene, is involved in neuronal migration in the developing cortex (Insolera, Shao, Airik, Hildebrandt, & Shi, 2014) and normal cilia formation and function (Flynn, Whitton, Donohoe, Morrison, & Morris, 2019). Studies in animals have shown that SDCCAG8 expression associates with enlarged lateral ventricles in mice (Insolera et al., 2014). SDCCAG8 is highly expressed in the human fetal brain during development (Figure S3b), and alterations in SDCCAG8 expression have been suggested as a candidate pathway underlying risk for schizophrenia and related cognitive deficits (Flynn et al., 2019). Furthermore, existing GWASs have linked genetic variation in SDCCAG8 with several brain-related phenotypes at genome-wide significant levels, including schizophrenia (Lam et al., 2019), educational attainment (Okbay et al., 2022), cognitive performance (Lam et al., 2017), and risk-taking behavior (Karlsson Linnér et al., 2019). A recent EWAS has also linked DNAm of SDCCAG8 with prenatal maternal stressful life events associated with risk of schizophrenia (Kotsakis Ruehlmann et al., 2023).
GLRX (cg10949007) is a gene encoding a member of the glutaredoxin family. This protein plays a protective role in nerve cell function in the presence of oxidative stress, by catalyzing the reversible reduction of glutathione (GSH), a major antioxidant. Previous studies found that lower GSH levels in the blood predict cognitive deficits and brain volume loss in children and adolescents with first-episode psychosis at 2 years of follow-up (Fraguas et al., 2012; Martínez-Cengotitabengoa et al., 2014). GLRX has been associated with autism, cognitive aging in healthy people, and amyloid-β neuropathology in Alzheimer's disease (Bowers et al., 2011; Harris et al., 2007; Wang et al., 2020). Altogether, these results suggest that the cord blood DNAm sites prospectively associated with childhood LVV are annotated to genes involved in brain function, neural development, and psychiatric risk, including schizophrenia.
Colocalization analyses were performed to test whether CpGs in cord blood associated with LVV potentially share genetic variants with schizophrenia. One CpG site showed significant colocalization: cg08170519 is located in IGSF9B, a gene involved in GABA neurotransmission/inhibitory synapse development and Vitamin D receptor pathway processes, which have been repeatedly implicated in schizophrenia (Cui, McGrath, Burne, & Eyles, 2021). Our findings on colocalization and mQTLs suggest that genetic factors might be implicated in the identified DNAm patterns. In future, integrating genetic influences with prenatal environmental factors may provide deeper insight into the links between early life epigenetics, LVV, and genetic liability to schizophrenia.

Cord blood epigenetic profile of LVV persists into childhood
Interestingly, none of the top hits identified at birth (i.e. prospective analysis) continued to associate with LVV when DNAm was measured concurrently with the MRI scans at 10 years (i.e. cross-sectional analysis). However, a broader MPS LVV_childhood containing 125 CpG sites selected based on ENR was found to cross-sectionally associate with LVV in childhood (2.6% of variance explained on top of covariates), supporting a degree of temporal stability in DNAm patterns associated with LVV.
Previous longitudinal epigenetic studies have found that DNAm at birth is more strongly predictive of certain neurodevelopmental outcomes, such as ADHD (Neumann et al., 2020), compared with DNAm patterns examined later in childhood. Our findings suggest that an MPS capturing broader DNAm patterns may be more stable and reproducible across different time points than single CpG sites.
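For readers less familiar with the approach, an MPS of this kind is typically a weighted sum of DNAm values across the selected CpGs. The following is a minimal sketch under that assumption, not the study's pipeline: the function name, the toy methylation values, and the weights are all hypothetical, whereas in the study the weights were estimated with ENR on preselected CpGs.

```python
from typing import List, Sequence


def methylation_profile_score(betas: Sequence[Sequence[float]],
                              weights: Sequence[float]) -> List[float]:
    """Return one score per participant: the weighted sum of CpG methylation values."""
    scores = []
    for row in betas:
        if len(row) != len(weights):
            raise ValueError("each participant needs one value per weighted CpG")
        scores.append(sum(w * b for w, b in zip(weights, row)))
    return scores


# Toy data (hypothetical): 3 participants x 4 CpGs, methylation beta values in [0, 1].
toy_betas = [
    [0.5, 0.25, 0.75, 0.5],
    [0.25, 0.5, 0.5, 1.0],
    [1.0, 0.0, 0.25, 0.5],
]
# Hypothetical weights; a weight of 0.0 mimics a CpG dropped by the elastic net penalty.
toy_weights = [2.0, -1.0, 0.0, 4.0]

toy_scores = methylation_profile_score(toy_betas, toy_weights)
```

The resulting per-participant score would then enter a regression on LVV alongside the covariates, in place of thousands of individual CpGs.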

An MPS at birth prospectively associates with LVV in young adulthood in an independent cohort
We further showed that a male-specific MPS LVV_males derived from a smaller but more comparable discovery EWAS (i.e. including only the 417 males in Generation R, not the females) prospectively associated with LVV measured later in life (at age 20 years instead of age 10 years in the discovery sample) in an independent sample of young adults from ALSPAC, explaining 3.9% of variance in this phenotype over and above covariates. Interestingly, prediction performed better when using this male-specific MPS LVV_males than when using an MPS LVV based on the larger discovery EWAS (i.e. the full sample of 840 individuals comprising both males and females), pointing to the importance of maximizing comparability between the characteristics of the discovery EWAS and the replication sample. Of note, we were not able to replicate individual top CpGs in ALSPAC, likely due to both the small effect sizes observed and the small size of the replication sample (with correspondingly larger standard errors). Overall, these findings further support the use of MPSs for capturing a broader epigenetic profile associated with LVV, with improved predictive power and reproducibility.
Of note, while we found that LVV in childhood prospectively associated with self-reported delusions later in adolescence in Generation R, this prospective association was not identified at an older age in ALSPAC, which could reflect the potential functional significance of LVV in early development or differences in the measurement of psychotic-like experiences (self-report vs. semi-structured interview). It should be noted that, to our knowledge, ALSPAC is the only other cohort (besides Generation R) with data on DNAm at birth and MRI collected in late childhood or adolescence. This prospective cohort allowed us to link cord blood-specific DNAm markers with LVV and related psychotic outcomes in early adulthood, another critical period for brain development and the onset of schizophrenia. Given that both DNAm and brain features vary substantially over time, it is crucial to evaluate the generalizability of epigenetic markers of LVV in a developmental context. More research using repeated measures of both DNAm and neuroimaging is needed to better understand their dynamic associations across development, as such datasets are currently extremely scarce.

Strengths and limitations
The Generation R Study is a large birth cohort with prospectively collected DNAm and neuroimaging data at the population level, which enabled us to examine prospective associations linking early epigenetic variation to later LVV. Other strengths of this study include the use of an epigenome-wide approach with a brain structure highly relevant for schizophrenia (i.e. LVV as an intermediate phenotype), the use of an independent prospective birth cohort to test the generalizability of our findings, and the use of MPSs based on a machine learning approach (i.e. ENR) to capture a broader epigenetic profile of LVV and help improve predictive power. In addition, the availability of DNAm at multiple time points enabled us to explore developmental aspects of the association between DNAm and LVV, although we cannot separate true temporal signals from other potential contributing factors such as cell-type composition differences in blood at these different time points.
Our findings must also be interpreted in the context of several limitations. First, the sample sizes and the measurement of psychotic-like experiences varied between the two cohorts examined, which limited our ability to test the generalizability of findings. Second, psychotic-like experiences were self-reported by adolescents in Generation R, which may be subject to reporting bias if questions were misinterpreted. Nevertheless, there is evidence from longitudinal population studies that self-report assessments of psychotic-like experiences during early adolescence are predictive of both psychotic-like experiences and clinical diagnosis later in adulthood (Healy et al., 2019; Isaksson, Angenfelt, Frick, Olofsdotter, & Vadlin, 2022). Third, the 450K array covers only 1.7% of the total number of CpGs in the human genome, compared with the EPIC array targeting approximately 850,000 sites (3% of total CpGs). Furthermore, the selected CpG sites are substantially biased toward promoter and gene body regions, which limits the capacity to detect potential associations between DNAm and LVV at other, unmeasured loci. Fourth, the study focused exclusively on DNAm. Other epigenetic processes (e.g. histone modifications or circulating microRNAs) are likely to also be important for brain phenotypes. Last, although we identified prospective associations between DNAm and LVV, it is not possible to establish causality.

Conclusion
We conducted the first EWAS of DNAm at birth with LVV in childhood, a brain feature robustly linked to schizophrenia risk by previous research and found to prospectively associate with adolescent self-reported delusions in our population-based cohort. We identified several genome-wide significant CpG sites that map to genes implicated in brain development and brain-related traits, including schizophrenia. An MPS based on the EWAS results at birth was found to cross-sectionally associate with LVV when measured in childhood (showing evidence of temporal stability), and to prospectively associate with LVV in young adulthood within an independent cohort of older males (showing evidence of reproducibility). Prediction in this independent sample was stronger and more robust when the MPS was based on male-specific EWAS results than when it was based on the sex-combined EWAS results, highlighting the importance of maximizing comparability between the characteristics of the discovery and replication samples. Future studies with larger samples, repeated measures of DNAm and MRI, more comparable measurement time points between studies, and better epigenomic coverage are needed to confirm our findings, and should integrate genetic and prenatal environmental factors to elucidate which factors explain variability in LVV-associated epigenetic patterns at birth and downstream risk for schizophrenia.

© 2023 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd on behalf of Association for Child and Adolescent Mental Health. doi:10.1111/jcpp.13866

Figure 1 Summary flow chart of the analytical approach

psychotic-like experiences in the all-male sample (b = .26, SE = .27, p = .33).

Figure 2 Manhattan (A) and quantile-quantile (B) plots showing genome-wide associations between DNAm at birth and LVV at age 10 years in Generation R (N = 840), λ = 1.16

Table 1
Study population characteristics. a The mean age of fathers at intake (30.38 years) was only available in the overall sample in ALSPAC. b N delusions = 1,899. c N hallucinations = 2,360 in Generation R.

Table 2
CpG sites in cord blood suggestively associated (p < 1 × 10⁻⁵) with LVV at age 10 years in Generation R (N = 840). NA, not available. The full model is adjusted for batch effects, estimated cell-type proportions, child sex, gestational age, maternal age at intake/delivery, maternal smoking during pregnancy, child age at MRI assessment, and total brain volume.

Table 3
Generalizability with methylation profile scores for LVV. Regression results of ENR-based MPS on LVV at 10 years in Generation R, and on LVV at 20 years and PLE at 24 years in ALSPAC. Regression models are corrected for relevant covariates as in the EWAS analysis. ENR-based MPS: using elastic net regularization, the MPS LVV was constructed based on ENR-estimated weights using preselected CpGs from the EWAS in the overall sample; MPS LVV_males was constructed based on those from the all-males EWAS. PLE, psychotic-like experiences. R²: adjusted R-squared. R² change was calculated by building a basic model with covariates only (without MPS) and then an MPS model that included the MPS as a predictor. The R² of the basic model was subtracted from that of the MPS model.
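The R² change described in this note can be sketched as follows. This is an illustrative calculation only: the standard adjusted R² formula is assumed, the MPS model is assumed to add exactly one predictor to the basic model, and the function names and input values are hypothetical rather than taken from the study.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R-squared for a linear model with n observations and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def r2_change(r2_basic: float, r2_mps: float, n: int, p_basic: int) -> float:
    """Adjusted R2 of the MPS model minus adjusted R2 of the covariates-only model.

    Assumption: the MPS model has the same covariates plus one extra predictor (the MPS).
    """
    return adjusted_r2(r2_mps, n, p_basic + 1) - adjusted_r2(r2_basic, n, p_basic)


# Hypothetical inputs: 11 participants, 5 covariates, unadjusted R2 of 0.50 and 0.75.
example_change = r2_change(r2_basic=0.5, r2_mps=0.75, n=11, p_basic=5)
```

Because the adjustment penalizes the extra predictor, the reported change is somewhat smaller than the difference in unadjusted R² between the two models.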