Longitudinal studies of child mental disorders in the general population: A systematic review of study characteristics

Abstract Introduction Longitudinal studies of child mental disorders in the general population (herein study) investigate trends in prevalence, incidence, risk/protective factors, and sequelae for disorders. They are time and resource intensive but offer life‐course perspectives and examination of causal mechanisms. Comprehensive syntheses of the methods of existing studies will provide an understanding of studies conducted to date, inventory studies, and inform the planning of new longitudinal studies. Methods A systematic review of the research literature in MEDLINE, EMBASE, and PsycINFO was conducted in December 2022 for longitudinal studies of child mental disorders in the general population. Records were grouped by study and assessed for eligibility. Data were extracted from one of four sources: a record reporting study methodology, a record documenting child mental disorder prevalence, study websites, or user guides. Narrative and tabular syntheses of the scope and design features of studies were generated. Results There were 18,133 unique records for 487 studies—159 of these were eligible for inclusion. Studies occurred from 1934 to 2019 worldwide, with data collection across 1 to 68 time points, with 70% of studies ongoing. Baseline sample sizes ranged from n = 151 to 64,136. Studies were most frequently conducted in the United States and at the city/town level. Internalizing disorders and disruptive, impulse control, and conduct disorders were the most frequently assessed mental disorders. Of studies reporting methods of disorder assessment, almost all used measurement scales. Individual, familial and environmental risk and protective factors and sequelae were examined. 
Conclusions These results summarize characteristics of existing longitudinal studies of child mental disorders in the general population, provide an understanding of studies conducted to date, encourage comprehensive and consistent reporting of study methodology to facilitate meta‐analytic syntheses of longitudinal evidence, and offer recommendations and suggestions for the design of future studies. Registration DOI: 10.17605/OSF.IO/73HSW.


INTRODUCTION
The pooled global prevalence of mental disorders among children and adolescents (herein children) has been estimated at 13.4% (Polanczyk et al., 2015), representing individual and social burden (Lim et al., 2008; Waddell et al., 2018), and posing a considerable public health concern (Kvalsvig et al., 2014). Longitudinal (Douglas, 1975), parental separation, adoption (Ely et al., 1999; Lipman et al., 1992), disrupted familial relationships (Almqvist et al., 1999), maternal anxiety (McClure et al., 2001; Spence et al., 2002), poor cognitive functioning, parental psychopathology and early aggression (Schonfeld et al., 1988). Longitudinal research has also identified factors that mediate associations between low socio-economic status and symptoms of disorder (Bor et al., 1997). Knowledge of risk and protective factors and sequelae related to child mental disorders expands our understanding of the development and impact of disorders and identifies targets for prevention, treatment, and intervention (Trentacosta et al., 2008).
This complexity makes it difficult to categorize and compare risk and protective factors across studies.
Studies that recruit samples of children from the general population are the focus of our review. Our definition of general population studies follows Polanczyk et al. (2015), as studies deliberately recruiting representative samples to produce findings that are generalizable to the general population. This is achieved through probability sampling from a sampling frame that is inclusive of the target population, ensuring that everyone in the target population has a known chance of being selected. General population studies are the focus for the following reasons. First, they provide a holistic representation of mental disorders by including children across the disorder spectrum as opposed to including only the most severe, and often more complex, cases (Jongerden et al., 2015). As comorbidity is higher in clinical samples, knowledge gained from them may not fully represent the disorder in the general population (McConaughy & Achenbach, 1994). Second, general population studies include symptomatic children whose parents/teachers or who themselves do not seek care, which is important for comparability when the reasons for help-seeking and access to care differ across countries and populations (Georgiades et (Doernberg & Hollander, 2016; Sleep et al., 2021). In this review, we define mental disorders broadly as "a clinically recognizable set of symptoms or behaviours associated in most cases with distress and with interference with personal functions" (WHO, 1992, p. 11).

Temporal characteristics
These include the year of study initiation, year of most recent follow-up, number of follow-ups, and whether studies are ongoing, which provide context on the objectives and study progression. A greater number of follow-ups may improve precision when assessing disorder patterns over time (Willett et al., 1998) but may also increase attrition due to increased burden on respondents. Study duration is important as longer duration allows for inclusion of more distal risk factors including inter-generational relationships (Trentacosta et al., 2008).
Age of the sample places the study into a developmental context as disorder trajectories can be age-dependent (Holmbeck et al., 2006;Oerlemans et al., 2020).

Sampling and sample attributes
These include target population, sampling frame, and sampling approach.
For children, appropriate sampling frames include birth records and registries, child tax benefit records, and household censuses. Schools may also be acceptable sampling frames in areas where practically all children attend (Polanczyk et al., 2015). Achieving high response rates further increases the probability that a sample is generalizable to the target population, in this case the general population (Weitzman et al., 2003). Researchers may also check the representativeness of their sample by comparing demographic factors such as sex between their sample and target population or by using sampling weights (Schulz et al., 2021). Attrition is a potential source of bias in longitudinal research and a major concern, with evidence indicating special populations (children with more severe symptoms, ethnic/racial minorities, and males) are more likely than their counterparts to be lost to follow-up (Gustavson et al., 2012). The use of incentives or gifts to participants has been shown to reduce attrition (Booker et al., 2011).
Researchers seeking to study rare disorders may oversample high risk individuals and special groups to increase the statistical power of

Information sources and content
We identified data sources, content and informants. Categories of

Objectives
We conducted a systematic review of the study characteristics (operational factors, temporality, sampling and sample, mental disorder measurement, and information sources and content) of longitudinal studies of child mental disorders in the general population to describe the range, scope, and nature of studies conducted to date. A search of JBI Evidence Synthesis, the Cochrane Database of Systematic Reviews, PubMed, Epistemonikos, figshare, OSF, and PROSPERO found no reviews of all available general population longitudinal studies that measure mental disorders in children and map risk and protective factors across a broad age span. With the large number of longitudinal studies conducted, this information is needed to: (1) understand which studies have been conducted to date; (2) provide a study inventory for researchers to identify evidence for comparison, harmonization, replication, meta-analysis, or application of machine learning approaches; and (3) inform the deliberate and strategic planning of new longitudinal studies to maximize the value of research investments. Bringing existing longitudinal evidence to bear on new research questions can consolidate cross-study findings and identify sources of stability or change in trends and associations (Ioannidis & Lau, 1999).

Study identification and eligibility
The target of the review is the study (the operational process of designing and conducting a primary research project, and collecting and reporting data) rather than an individual record (a paper, manuscript, or publication). Although study is commonly used to refer to a published manuscript, in this review it refers to the larger scale research name and records that did not. When a study name was found, all records referencing that study name were searched for and a group of records for that study was created in EndNote. A methodology paper was then searched for. If unavailable, a Google search of the study name was conducted to identify a study website or data user guide. If unavailable, the record group for a particular study was searched to find a paper documenting the prevalence of mental disorders. The study was screened for eligibility using the first available source and criteria listed in Appendix 2. Records without a study name were screened individually according to our eligibility criteria and, where possible, an information source was identified. In cases when eligibility was unclear, articles were reviewed and discussed with a senior author (LD). Eligible studies went through full text review for inclusion, in duplicate, by four trained reviewers (TB, WX, HM, BV), using one of the four specified sources, following inclusion/exclusion criteria. Discrepancies in the identification of eligible studies were resolved through discussion. When the study was potentially eligible for inclusion, but a study name was unavailable, the authors were contacted to obtain the name and an information source.

Data extraction
Data were extracted and coded independently by four reviewers in duplicate (TB, WX, HT, HM) using a Microsoft Excel data extraction tool and codebook (Appendix 3). Reviewers were trained and a pilot extraction study (n = 18) was conducted on randomly selected studies by two reviewers (TB, WX). Reviewers identified, discussed, and resolved disagreements to achieve consensus, and the data extraction tool was revised to add/remove fields based on rates of available data. Inter-rater agreement on coded data was calculated as a percentage.
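As an illustration of percentage agreement between two coders, the following minimal Python sketch counts the share of items on which both coders assigned the same code. The function name, the field being coded, and the example codes are hypothetical and are not taken from the review's extraction tool.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items on which two coders assigned the same code."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same items")
    if not coder_a:
        raise ValueError("At least one coded item is required")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical codes for one field (sampling approach) across five studies.
reviewer_1 = ["cluster", "stratified", "cluster", "total", "cluster"]
reviewer_2 = ["cluster", "stratified", "total", "total", "cluster"]

agreement = percent_agreement(reviewer_1, reviewer_2)
print(f"{agreement:.1f}%")
```

Simple percentage agreement does not correct for agreement expected by chance; it is shown here only because it matches the measure named in the text.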

Analytic approach
Data for included studies were entered into a database. Text was extracted on unique study features, target population and mental disorder measurement tools, which were grouped based on the tool name, taking into consideration different spellings and abbreviations.
Respondent and age-group version information for tools was not extracted. Tabular descriptive statistics were generated in Excel and narrative syntheses were generated for extracted text.
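Grouping tools by name across spellings and abbreviations can be sketched as a lookup against an alias table. The sketch below is an assumption about how such grouping might be done, not the procedure used in the review (which was carried out in Excel). The alias table, function names, and the two example instruments are illustrative only.

```python
import re
from collections import Counter

# Hypothetical alias table: normalized spellings and abbreviations -> canonical tool name.
ALIASES = {
    "cbcl": "Child Behavior Checklist",
    "child behavior checklist": "Child Behavior Checklist",
    "child behaviour checklist": "Child Behavior Checklist",
    "sdq": "Strengths and Difficulties Questionnaire",
    "strengths and difficulties questionnaire": "Strengths and Difficulties Questionnaire",
}


def normalize(name):
    """Lowercase, replace runs of non-alphanumeric characters with a space, and trim."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def canonical_tool(name):
    """Return the canonical tool name, or the normalized input if no alias is known."""
    key = normalize(name)
    return ALIASES.get(key, key)


# Illustrative extracted tool names with varying spelling, case, and whitespace.
extracted = [
    "CBCL",
    "Child Behaviour Checklist",
    "child behavior checklist ",
    "SDQ",
    "Strengths and Difficulties Questionnaire",
]
counts = Counter(canonical_tool(name) for name in extracted)
print(counts)
```

Names that match no alias fall through unchanged (after normalization), so unfamiliar tools stay visible for manual review rather than being silently merged.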

Study selection
Our search strategy resulted in 18,133 unique records of which 4852 met our inclusion criteria. Of these, 3735 represented groups of records for 487 named studies and 1117 related to nameless studies. From these, 11 records reporting methods or prevalence were identified, and nine authors were contacted for study name information and a data extraction source. In total, 487 studies were identified. Figure 1 presents the PRISMA diagram for study identification. Following eligibility screening, 159 studies were included. Appendix 4 lists the eligible studies and link or citation for the information source. Initial agreement on eligibility between screeners was 88.7%, with 100% consensus achieved after discussing disagreements. Initial agreement for the extracted, coded, study characteristics was 90.3%, and 100% after disagreements were resolved.

Operational factors
Of the 159 included studies, 94% had an identifiable study name (Table 1). Over half (54%) had a peer-reviewed methodology paper available, which was used to assess eligibility and for data extraction.
A quarter (25%) had a study website, 17% had a published mental health-related prevalence paper, and a study user guide was used for the remaining studies (5%). Only 39% of studies reported their data accessibility. Among those that did, most required external researchers to apply for data access and/or pay a modest fee. A third had restricted data access, with access limited to collaborators; only 2% reported having some portion of study data available through open access.
One study did not specify a geographic region. Studies were most often conducted at the town/city level (34%), followed by the national level (26%) and a jurisdiction between city/town and state/province (19%). Studies were less frequently conducted at the province/state (16%), international level (4%) or at a jurisdiction between national and provincial (1%).
Forty-three studies reported a unique design feature. The majority (n = 20) described a connection or linkage to another study that emerged out of the original study or allowed access to data on other family members, laboratory studies, or similar ('consortium') studies in other countries. Eleven described a subsample or some aspect of special sample targeting conducted as part of the study. Six studies noted something unique about their approach to measurement (e.g., sipping/tasting vs. drinking alcohol, fathers as respondents, sexual behaviours questions adapted for Muslim population, mental disorder case definitions). Special foci on system influences, large size, unique characteristics of study location, and the ability to use non-participants as controls were also described.

Temporal characteristics (Table 1)
Year of study initiation ranged from 1934 to 2019, with the median reported study duration and follow-up number being 10 years and four follow-ups. Seventy percent of the studies were ongoing.
Attrition, perhaps the primary challenge to generalizability in longitudinal studies, was not well reported. Using baseline sample size and sample size at last reported follow-up, we were able to calculate attrition for 73% of studies. For 5% of these, baseline sample size was smaller than follow-up due to sample augmentation or misreporting. The median attrition rate was 30%, and a post hoc analysis found a median rate of 22% among studies that reported using incentives.
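One plausible way to derive attrition from the two reported sample sizes is sketched below in Python. The formula (proportion of the baseline sample absent at the last reported follow-up) and the choice to set aside studies whose follow-up sample exceeds baseline are assumptions made for illustration; the sample sizes are hypothetical and are not data from the review.

```python
from statistics import median


def attrition_rate(baseline_n, last_followup_n):
    """Percentage of the baseline sample absent at the last reported follow-up.

    Returns None when the follow-up sample is larger than baseline (for example,
    after sample augmentation or misreporting), because the rate is then not
    interpretable as attrition.
    """
    if baseline_n <= 0:
        raise ValueError("baseline_n must be positive")
    if last_followup_n > baseline_n:
        return None
    return 100.0 * (baseline_n - last_followup_n) / baseline_n


# Hypothetical (baseline, last follow-up) sample sizes for four studies.
studies = [(1000, 700), (2000, 1560), (500, 620), (800, 400)]

rates = [attrition_rate(baseline, followup) for baseline, followup in studies]
usable = [rate for rate in rates if rate is not None]
median_attrition = median(usable)
print(rates, median_attrition)
```

Because this compares only the first and last reported waves, it does not distinguish permanent dropout from participants who miss a wave and later return.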

Sampling & sample attributes (Table 1)
The majority (87%) of studies had a stated target population. A quarter were pregnancy cohorts. Baseline sample sizes ranged from n = 151 to 64,136 children (median n = 2792). Of the 74% of studies reporting baseline response rates, participation ranged from 17% to 100% (median = 84%). The median age for study participants was 8.5 years. Half (50%) did not report study participants' sex at baseline. For those that did, the median proportion of males was 51%.
Of the 127 studies that reported sampling approaches (80%), cluster sampling was reported most (43%), followed by total population and stratified sampling (26%), often in combination with other sampling approaches. Sampling frames were reported by 70% of studies. Over half of studies used school lists and a fifth used maternal and/or child health records. Studies also reported using birth records, dwelling registries, samples from other studies or multiple frames. Less commonly used sampling frames were child benefit records and phone registries. A quarter of the studies reported oversampling high risk or special populations of interest for their studies and providing incentives to their participants.

Mental disorder measurement
Studies assessed children for mental disorder symptoms using various instruments and informants (Table 2). Internalizing disorder symptoms were assessed most, followed by disruptive, impulse control and conduct, neurodevelopmental and neurocognitive, and eating, elimination, and sleep. Some studies assessed children for substance use and other addictive disorder symptoms, somatic, psychotic and dissociative, obsessive compulsive, and trauma/stressor-related disorder symptoms. Few studies reported measuring personality disorder and gender dysphoria symptoms.
Measurement scales were used in almost all studies, far more frequently than diagnostic interviews; a quarter of studies used both assessment types. Researcher-developed instruments were also

Information sources and content
Information sources and types of information collected were coded as present if reported and assumed to be absent if not reported (Table 3). All studies reported at least one information type and five did not report a source. In addition to reports from multiple informants about child mental disorders, studies collected information from other sources. These most commonly included data from linkages to school or medical records, neurocognitive testing, or biological samples. Some studies included an observational assessment of the child, EMA assessment data, such as written or electronic diaries, app-based data, or physiological sensors, or interviewer assessments of the child, their family, or environment.
Risk and protective factors and/or sequelae of mental disorders were grouped according to whether their content was about the
The database search was limited to English language records for feasibility and resource reasons (costs of translators/translations), excluding otherwise eligible studies that did not publish records in English and leading to potential bias in our understanding (Amano et al., 2016). According to our review, the countries with the most longitudinal studies of child mental disorder are the United States, England, Canada and Australia. To understand the impact of omitting studies without English records, the excluded non-English search results are reported in Appendix 5. In our search, 886 out of 18,133 records were identified as non-English. The English titles and abstracts were screened and 94 identified as potentially eligible, corresponding to 57 studies. Of these, 23 also had English records and had already been captured by our review. Appendix 5 lists the 34 potentially eligible non-English studies. They include records in Chinese, German, Spanish, French, Japanese, Polish, Portuguese, Italian, and Russian, reporting studies from China (8), Germany (3), Spain (3), Japan (3), France (2), Brazil (2), Italy (1), Canada (1), Chile (1), and Taiwan (1) and studies with no country reported (9). If these studies proved eligible for inclusion following full text review, this would not change the finding that most studies were conducted in the United States (n = 33), but the representation of studies from China would increase, with China and England both having 14 studies, followed by Canada with 13. English records reporting six Chinese studies were included in our review. Apart from the underrepresentation of studies from China, we do not think that including non-English language records would have led to significantly different results.
Our Appendix Table A1 reports, where possible, the start year, end year, location, baseline sample size and age at baseline so that readers can judge for themselves.
With or without potentially eligible non-English studies, longitudinal studies were conducted in all continents, with European and North American countries being the most common and South American and African countries being the least common. This may have implications for meta-analytic research about longitudinal relationships between mental disorders and other variables at a global level, as we know that there are cultural differences in mental disorder perceptions, mental health literacy, and help-seeking behaviours in children (Ivanova et al., 2015, 2022; Pescosolido, 2007). Some studies noted connections to similar studies in other countries or mentioned being part of cross-national consortium studies. Studies were most often conducted at the city/town level, followed by the national and provincial/state level. The tendency towards a smaller geographic scope is important for meta-analytic research given almost all studies conducted at the city/town level took place in large, urban cities. There are known differences between urban and rural settings in prevalence and determinants of child mental disorders and while cities are most likely to be considered urban, towns could be urban or rural depending on size and density (Buttazzoni et al., 2022).
Most studies sought to measure a broad range of psychopathology. Fewer studies assessed disorders seen more rarely in children. This is likely because our search criteria included terms for the more common disorders, but also because general population samples are less appropriate for assessing rare disorders, as large sample sizes are needed to achieve adequate statistical power. Measurement scales were used more frequently than diagnostic interviews to assess disorder symptoms in children; some studies used both. Validated measurement scales are less burdensome on respondents and interviewers and produce valid and reliable assessments, increasing feasibility (Boyle et al., 2017). Most studies used multiple informants, most commonly child and/or parent/caregiver. Informant discrepancies in reports of mental disorder symptoms are known to occur, so including multiple respondents provides a comprehensive assessment of the occurrence of mental disorders (Brown et al., 2006). How to combine multi-informant data is not well understood, but multi-informant data can be used simultaneously in latent variables in structural equation models (Martel et al., 2017).
The use of different information sources has implications for study procedures. The frequent use of data linkage to other records and the collection of biological samples represent a challenge to data sharing as these data often need more secure storage and use than traditional survey data (Harron et al., 2017; Rychnovská, 2021). Data linkage can minimize respondent burden by providing an alternate method for assessing certain survey content.
Biological sample collection varies in respondent burden, is sometimes invasive and has data storage implications (International Society for Biological and Environmental Repositories, 2008). More importantly, there is some evidence from the UK that including a nurse visit for biological data collection in longitudinal studies has a negative effect on cooperation in the wave directly after the visit, although the effect is mostly short-term and these visits did not have a longer-term impact on subsequent wave participation (Pashazadeh et al., 2021). Some cognitive tests, observational and interviewer assessments require the presence of a human interviewer or assessor, requiring resources beyond those needed for an online or computer-administered survey. Ecological Momentary Assessments are well-suited to the increasing availability, affordability and use of electronic devices, technology and software programs. They can be used to administer neurocognitive tests; however, these assessments can be burdensome for participants and require substantial data monitoring from researchers to minimize the amount of missing and incorrect data. The technology associated with this form of data collection may also be expensive and require resourcing to monitor and maintain to avoid software malfunction and/or data loss (Heron et al., 2017).
Our review was limited by the reporting quality of eligible studies. While some studies had detailed records of their methodology, others did not, meaning we were unable to determine some study features. For studies where a mental disorder prevalence paper was used for data extraction, mental disorder risk and protective factors, and/or sequelae may have been measured but not reported because of their limited relevance to the focus of the paper. Using methodology papers to extract the characteristics of ongoing studies may have led us to report older study features because of the typically large time intervals between methodological publications on different study waves.
To standardize data extraction and ensure that data extraction sources contain sufficient information of interest, we selected study methodology papers, websites, data user guides or mental disorder prevalence papers as information sources. While this increased the amount and standardization of information extracted, requiring that a study have one of these sources available for it to be included in the review led to the exclusion of 53 records about nameless or otherwise unidentifiable studies which could not otherwise be excluded based on our eligibility criteria. Appendix 6 lists these records.
Studies not well described, easily identifiable or without consistent use of a name in published research findings are less likely to be represented in our review. Similarly, studies that address but have not published results about child mental disorders in the scientific literature are not represented.

Exclusion Criteria
Records that were not subject to peer review, including conference abstracts, presentations, and dissertations, or that did not involve primary data collection or data linkage, such as reviews and book chapters, were not included in our review. We also excluded studies that: (1) used adult samples, (2) recruited convenience samples, (3) assessed mental disorders using self-reports of diagnosis or administrative medical records, (4) assessed mental disorders in the context of evaluating a policy, treatment, or clinical practice, or (5) had a sample size below 50. Further, (6) animal studies, (7) A1.
studies of child mental disorders in the general population have advanced our understanding of trends in prevalence and incidence of disorders; risk and protective factors; and sequelae. Early studies such as the 1946 UK National Birth Cohort, the 1981 Finnish Birth Cohort, the 1983 Ontario Child Health Study, the 1981 Queensland Study of Pregnancy, the 1975 Dunedin Multidisciplinary Health and Development Study, and the 1958 US National Collaborative Perinatal Project (or Child Growth and Development Study) identified risk factors for mental disorders in children including long or repeated hospitalizations from birth to age five al., 2019; Costello et al., 2014; Hintzpeter et al., 2015; ten Have et al., 2009; Fekih-Romdhane et al., 2022). Children exhibiting externalizing behaviours may be more commonly referred when help-seeking is largely driven by parents and/or teachers who can observe externalizing behaviours more easily or perceive them as more of a nuisance (De Los Reyes et al., 2015; Splett et al., 2019). In addition, there may be social, financial and/or geographical barriers to help-seeking, such as living in places where beliefs that mental disorders do not need treatment are common, or where mental illnesses are not considered as urgent or dangerous as physical illnesses, and so not a resourcing priority (O'Brien et al., 2016; Radez et al., 2021). Third, general population studies produce the most widely generalizable results, making them useful for informing public funding and policy decisions at the population level. The Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2013) and the International Statistical Classification of Diseases and Related Health Problems (11th ed.; ICD-11; World Health Organization [WHO], 2019) are common classification systems of mental disorders but have limitations to accommodate variability in conceptualizations. We focus on assessment of symptoms to accommodate the multiple
measurement tools and methods developed in accordance with different classification frameworks and do not require conjoint assessment of impairment. Disorder selection is broad and includes

Key Points
• Longitudinal studies of child mental disorders investigate risk factors and sequelae of disorder.
• Comprehensive syntheses of the characteristics of existing longitudinal studies are needed to provide an understanding of studies conducted to date and to inform planning of new studies.
• This review synthesizes characteristics of 159 longitudinal studies of child mental disorders in the general population ranging in size, scope, location and duration.
• Our findings have implications regarding: (a) the usefulness of published methodology and the need for standardized reporting requirements; (b) meta-analytic syntheses of longitudinal evidence relating to child mental disorder; and (c) planning and methodological considerations for new studies.

Operational factors
These include study objectives, funding sources, and geographical location, as they have the potential to influence study methods. Data accessibility is a characteristic of interest based on the continued open data and sharing efforts in the scientific community to increase research efficiency, transparency, and replicability of published results (Mello et al., 2013). Better data access maximizes study value and potential output and removes cost and access barriers to obtaining longitudinal data, but also comes with logistical and resourcing challenges. Text about unique study features was also extracted.
their study and maximize resources (de la Osa et al., 2019; McGonagle & Sastry, 2015). Similarly, funding challenges may require investigators to create subsamples of general population studies (Carter et al., 2010). Pregnancy cohorts, or studies recruiting pregnant women and initiating data collection before children are born, do not target children as their study population. Nonetheless, certain prenatal and perinatal circumstances have been associated with increased risk of mental disorders later in childhood (Allen et al., 1998; Ståhlberg et al., 2020), and pregnancy cohorts are included in our review if they meet inclusion criteria.

Mental disorder measurement
This includes the name and type of assessment tool (standardized diagnostic interview, dimensional measure, researcher developed measure), disorder coverage (neuro-developmental or cognitive; psychotic and dissociative; mood or anxiety; obsessive compulsive; trauma and stressor-related; somatic; eating, elimination, or sleep; sex-related; gender dysphoria; disruptive, impulse control, or conduct; personality; and substance use and other addictive disorders in childhood) and informant.
study content were informed by our understanding of key risk and protective factors and are organized by collection mode.Data sources include: (1) data linkage to administrative health or government records relating to service use, medication use, or education which provides access to more detailed information with better reliability, less bias, and in a way that can reduce respondent burden (Harron et al., 2017); (2) ecological momentary assessment (EMA) which involves continual assessment of study participants in their natural environments using participant-reports or sensors increasing ecological validity, reducing recall bias and random error, and in some instances increasing sensitivity to change (Moskowitz & Young, 2006); (3) observational assessments, where study participants' behaviors are observed for specified characteristics while engaged in a task assessing study participants' interactions with their environment and other people (Floyd et al., 1998); and (4) biological samples which increase our understanding of biological determinants, progression and sequelae, and the genetic basis underlying psychopathology (Insel et al., 2010).

endeavor such as the National Longitudinal Survey of Youth, or the Millennium Cohort Study. The search strategy was developed to identify records documenting methodology, mental disorder prevalence or epidemiological research from longitudinal studies. These records were used to identify studies, screen for eligibility, and identify an appropriate information source about the study for data extraction. These could be: (1) a record documenting the study methodology; (2) a study website; (3) a study data user guide; and (4) a record documenting disorder prevalence. The research team determined source types based on those that are typically available and used to find out about study methodology. They were prioritized according to reliability and comprehensiveness (in order shown above). Having one consistent source of information would have been preferred, but no standard study reporting method exists. Records identified by our search were uploaded into EndNote (version 20.1) and duplicates were removed. One researcher (TB) independently reviewed the titles and abstracts of retrieved records and grouped them using study name into records that included a

2. Use a standard study name across publications to assist with future identification of studies
3. Assess completeness of sampling frame for representing general population
4. Report data accessibility and process for access
5. Report response and retention rates
6. Use incentives to reduce attrition
7. Consider opportunities to connect or link to other existing studies, embed in or add to an existing cohort, or join a study consortium

sampling can increase efficiency in large-scale studies (Latpate et al., 2021), but the representativeness of samples produced using this method can be challenged when many clusters (i.e., schools) refuse participation. While total population sampling, the second most frequent method, improves representativeness of the sample, it is expensive and is still subject to nonresponse. General population study response rates have declined over time (Ebert et al., 2018; Stedman et al., 2019). Response rates in this review had a wide range, suggesting that study response remains a challenge and researchers should continue to employ strategies to encourage response, particularly strategies focused on reducing barriers to participation, such as the provision of childcare, transportation, and parking services. The more important issue in longitudinal studies is retention of survey respondents over time and the bias that can result from attrition. In addition to monitoring and reporting retention rates, methods exist to reduce attrition (Abshire et al., 2017), such as use of incentives, and attrition can be accounted for in data analysis using suitable missing data methods or statistical adjustments (Enders, 2011; Nicholson et al., 2016; Schminkey et al., 2016). Retention rates were not well reported and, after piloting data extraction, were removed as a field for charting due to missing data. Sample sizes at baseline and at final reported follow-up were reported, allowing for sample retention calculations, but it is impossible to know how well this approximates actual retention. Longitudinal studies should monitor and report retention, and detail any sample supplements and methodological or statistical efforts to account for sample loss.
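The retention calculation mentioned above amounts to dividing the sample size at final reported follow-up by the baseline sample size. The sketch below shows that arithmetic; the function name and the numbers are hypothetical, not drawn from any included study. As the text cautions, this crude ratio only approximates actual retention, for example when a sample has been supplemented between waves.

```python
def crude_retention(baseline_n: int, final_n: int) -> float:
    """Crude retention: final follow-up sample size as a share of baseline.

    This does not distinguish original participants from later sample
    supplements, so it only approximates true retention.
    """
    if baseline_n <= 0:
        raise ValueError("baseline_n must be positive")
    if final_n < 0:
        raise ValueError("final_n must be non-negative")
    return final_n / baseline_n


# Hypothetical example: 1,000 children at baseline, 750 at final follow-up.
rate = crude_retention(1000, 750)
print(f"{rate:.1%}")  # prints "75.0%"
```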
Longitudinal studies of mental disorders in the general population of children vary substantially in their operational, temporal, sample, disorder assessment, information and content characteristics. They occur predominantly in European and North American countries, use school list sampling frames, and assess children for a range of internalizing and externalizing disorders, mostly using measurement scales. Risk and protective factors and sequelae are assessed most frequently at the individual level, followed by familial and then environmental. This review provides a comprehensive description and synthesis of longitudinal study characteristics, inventories existing studies, provides extracted data on their characteristics, raises questions about the impact of different study methods on efficiency, internal and external validity, and provides a foundation for future meta-analytic work. Our work recommends standardized reporting of study methodology, makes recommendations for new longitudinal studies, and identifies common methodologies that can be used when planning future longitudinal research involving the assessment of child mental disorders in the general population.

AUTHOR CONTRIBUTIONS
Theodora Bogdan: Conceptualization; data curation; formal analysis; investigation; methodology; writing-original draft; writing-review & editing. Weiyi Xie: Data curation; formal analysis; investigation; methodology; project administration; validation; writing-review & editing. Habeba Talaat: Data curation; formal analysis; writing-review & editing. Hafsa Mir: Data curation; formal analysis; writing-review & editing. Bhargavi Venkataraman: Data curation; formal analysis; writing-review & editing. Laura E. Banfield: Conceptualization; methodology; software; writing-review & editing. Katholiki Georgiades: Conceptualization; methodology; supervision; writing-review & editing. Laura Duncan: Conceptualization; funding acquisition; investigation; methodology; project administration; resources; software; supervision; validation; writing-review & editing.

How to cite this article: Bogdan, T., Xie, W., Talaat, H., Mir, H., Venkataraman, B., Banfield, L. E., Georgiades, K., & Duncan, L. (2023). Longitudinal studies of child mental disorders in the general population: A systematic review of study characteristics. JCPP Advances, 3(3), e12186. https://doi.org/10.1002/jcv2.12186

disorders/or anxiety disorders/or anxiety, separation/or "disruptive, impulse control, and conduct disorders"/or exp "feeding and eating disorders"/or mood disorders/or depressive disorder/or depressive disorder, major/or dysthymic disorder/or neurodevelopmental disorders/or "attention deficit and disruptive behavior disorders"/or attention deficit disorder with hyperactivity/or conduct disorder/or child behavior disorders/or exp substance-related disorders/
2. adhd.ti,ab,kf,kw.
3. addiction*.ti,ab,kf,kw.
4. ((mood or anxiety or depress* or attention-deficit hyperactivity or oppositional-defiant or conduct or affective or eating or "substance use") adj3 (disorder* or condition* or symptom* or assess* or measur*)).ti,ab,kf,kw.
5. ((neuropsy* or behavio?r* or emotion* or mental* or psychiatr* or internalizing or externalizing) adj2 (diagnos* or disorder* or ill or illness*)).ti,ab,kf,kw.
6. or/1-5
7. adolescent/or exp child/or exp infant/
8. (p?ediatric* or child* or adolescen* or youngster* or youth or teen* or boys or girls or neonat* or infan*).ti,ab,kf,kw.
9. 7 or 8
10. exp Cohort Studies/
11. ((longitudinal or follow-up or cohort or prospective or retrospective or panel) adj2 stud*).ti,ab,kf,kw.
12. 10 or 11
13. (prevalence or survey* or epidemiolog* or methodol*).ti,ab,kf, not (animals/not (humans/and animals?.mp.)) [mp = title, abstract, original title, name of substance word, subject heading word, floating sub-heading word, keyword heading word, organism supplementary concept word, protocol supplementary concept word, rare disease supplementary concept word, unique identifier, synonyms]
19. limit 18 to English
20. limit 19 to (adaptive clinical trial or address or autobiography or bibliography or biography or case reports or clinical study or clinical trial, all or clinical trial, phase i or clinical trial, phase ii or clinical trial, phase iii or clinical trial, phase iv or clinical trial, veterinary or clinical trials, veterinary as topic or clinical trial protocol or clinical trial protocols as topic or clinical trial or comment or congress or consensus development conference or consensus development conference, nih or controlled clinical trial or dataset or dictionary or directory or editorial or equivalence trial or festschrift or government publication or guideline or historical article or interactive tutorial or interview or introductory journal article or lecture or legal case or legislation or letter or news or newspaper article or observational study, veterinary or patient education handout or periodical index or personal narrative or portrait or practice guideline or pragmatic clinical trial or randomized controlled trial or randomized controlled trial, veterinary or technical report or twin study or video-audio media or webcast)
21. 19 not 20

PsycINFO-OVID
1. anxiety disorders/or generalized anxiety disorder/or separation anxiety disorder/
2. exp eating disorders/
3. major depression/or dysthymic disorder/or endogenous depression/or reactive depression/
4. disruptive behavior disorders/or conduct disorder/or oppositional defiant disorder/
5. attention deficit disorder/or attention deficit disorder with hyperactivity/
6. exp "substance use disorder"/
7. social anxiety/
8. adhd.ti,ab.
9. addiction*.ti,ab.
10. ((mood or anxiety or depress* or attention-deficit hyperactivity or oppositional-defiant or conduct or affective or eating or "substance use") adj3 (disorder* or condition* or symptom* or assess* or measur*)).ti,ab.
11. ((neuropsy* or behavio?r* or emotion* or mental* or psychiatr* or internalizing or externalizing) adj2 (diagnos* or disorder* or ill or illness*)).ti,ab.

27. limit 26 to english language
28. limit 27 to (adaptive clinical trial or address or autobiography or bibliography or biography or case reports or clinical study or clinical trial, all or clinical trial, phase i or clinical trial, phase ii or clinical trial, phase iii or clinical trial, phase iv or clinical trial, veterinary or clinical trials, veterinary as topic or clinical trial protocol or clinical trial protocols as topic or clinical trial or comment or congress or consensus development conference or consensus development conference, nih or controlled clinical trial or dataset or dictionary or directory or editorial or equivalence trial or festschrift or government publication or guideline or historical article or interactive tutorial or interview or introductory journal article or lecture or legal case or legislation or letter or news or newspaper article or observational study, veterinary or patient education handout or periodical index or personal narrative or portrait or practice guideline or pragmatic clinical trial or randomized controlled trial or randomized controlled trial, veterinary or technical report or twin study or video-audio media or webcast) [Limit not valid in APA PsycInfo; records were retained]
29. 27 not 28

Embase-OVID

criteria were as follows: (1) data were from epidemiological longitudinal studies, (2) assessing children aged 0 to 19, (3) for common childhood mental disorders (mood, depressive, anxiety (generalized, separation, social), attention, oppositional defiant, conduct, eating, addiction and substance use) or groupings of emotional, behavioral, internalizing and externalizing. Study information also needed to be available in English or published in an English language peer-reviewed journal in the form of a methods paper (e.g., protocol, cohort profile, design), prevalence paper (i.e., paper publishing disorder/symptom prevalence in the population of interest), online study website or data user guide to ensure studies included in the analysis could be adequately assessed for eligibility and data extraction.
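The eligibility criteria above can be read as a conjunction of checks applied to each candidate study. The sketch below is one hedged way to express that; the class `CandidateStudy`, its field names, and the choice to test the youngest age assessed are all assumptions made for illustration, not the review team's actual screening procedure.

```python
from dataclasses import dataclass
from typing import Set

# Disorders and groupings named in the eligibility criteria above.
ELIGIBLE_DISORDERS = {
    "mood", "depressive", "anxiety", "attention", "oppositional defiant",
    "conduct", "eating", "addiction", "substance use",
    "emotional", "behavioral", "internalizing", "externalizing",
}


@dataclass
class CandidateStudy:
    # All field names are hypothetical, chosen for this illustration only.
    is_epidemiological_longitudinal: bool
    youngest_age_assessed: int       # in years
    disorders_assessed: Set[str]
    english_source_available: bool   # methods paper, prevalence paper, website or user guide


def is_eligible(study: CandidateStudy) -> bool:
    """Return True only if every criterion is met."""
    return (
        study.is_epidemiological_longitudinal
        and 0 <= study.youngest_age_assessed <= 19
        and bool(study.disorders_assessed & ELIGIBLE_DISORDERS)
        and study.english_source_available
    )


example = CandidateStudy(True, 7, {"anxiety", "conduct"}, True)
print(is_eligible(example))
```

Screening in the review was done by researchers reading records, so a function like this is best seen as a way to document the decision rule rather than to automate it.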

of Parents and Children (ALSPAC) Golding, Pembrey, Jones, & Team, T. A. S. (2001). ALSPAC-The Avon Longitudinal Study of Parents and Children. Paediatric and Perinatal Epidemiology, 15(1), 74-87. https://doi.org/10.1046/j.1365-3016.2001.00325.x

FIGURE A1 PRISMA flow diagram of study identification.

APPENDIX
Systematic review, title and abstract screening of non-English studies and list of potentially eligible records about non-English studies

Non-English language studies resulting from our systematic review were extracted and duplicates removed, resulting in 886 unique records. One researcher (TB) independently reviewed the English titles and abstracts of non-English records and excluded ineligible records. Ninety-four records representing 64 studies were identified from the eligible records. Duplicates and studies already included in our review using English records were removed. The record abstract with the most study information was selected for charting. In total, 34 potentially eligible non-English studies were identified. Figure A1 shows the PRISMA flow diagram of study identification. Basic study characteristics, including location of study, start and end year, baseline sample size and age at baseline were extracted and are reported in Table

Study methods adhered to Cochrane (Higgins & Green, 2011) and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA: Moher et al., 2009) standards for systematic reviews. The review was registered with Open Science Framework

TABLE 1 Operational, temporal, and sample study characteristics.

used. Commonly used measurement scales include the Child Behaviour Checklist (CBCL; Achenbach, 1999), Strengths and Difficulties Questionnaire (SDQ; Goodman, 1999), Mood and Feelings Questionnaire (MFQ; Angold et al., 1995), Center for Epidemiologic Studies Depression Scale (CES-D; Radloff, 1977), Children's Depression Inventory (CDI; Saylor et al., 1984), and Youth Self Report (YSR; Ebesutani et al., 2011). The most frequently reported diagnostic interviews were the Kiddie Schedule for Affective Disorders and Schizophrenia (K-SADS; Kaufman et al., 1997) and Diagnostic Interview Schedule for Children (DISC; Shaffer et al., 2000). Most studies used child or caregiver reports of the child's mental disorder symptoms, while fewer studies also included teacher and service provider reports.

APPENDIX 3
Data extraction and coding

Study & Source Information
1. Study ID.
2. Primary information source, coded as methods paper, user guide,
10. Sampling approach, coded as stratified, cluster, simple random, total population, or multiple.
11. Sampling frame, copied directly from source.
12. Whether subpopulations were oversampled, coded yes/no.
13. Whether study participants received incentives/gifts for their participation, coded yes/no.
Informants of assessment: the child, their caregiver, teacher, or service provider, coded yes/no.
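A few of the coding fields listed above can be sketched as a small record type with validation. This is an illustrative structure only: the class name `StudyCoding`, its field names, and the rule that unlisted informants default to "no" are assumptions, and only the fields that survive in the list above are represented.

```python
from dataclasses import dataclass, field
from typing import Dict

# Allowed codes taken from the coding list above.
SAMPLING_APPROACHES = ("stratified", "cluster", "simple random", "total population", "multiple")
INFORMANTS = ("child", "caregiver", "teacher", "service provider")


@dataclass
class StudyCoding:
    # Field names are hypothetical; yes/no items are stored as booleans.
    study_id: str
    sampling_approach: str
    sampling_frame: str              # copied directly from the source
    subpopulations_oversampled: bool
    incentives_provided: bool
    informants: Dict[str, bool] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.sampling_approach not in SAMPLING_APPROACHES:
            raise ValueError(f"unknown sampling approach: {self.sampling_approach!r}")
        unknown = set(self.informants) - set(INFORMANTS)
        if unknown:
            raise ValueError(f"unknown informants: {sorted(unknown)}")
        # Assumption for this sketch: informants not mentioned are coded "no".
        self.informants = {name: bool(self.informants.get(name, False)) for name in INFORMANTS}


record = StudyCoding(
    study_id="S001",                 # hypothetical identifier
    sampling_approach="cluster",
    sampling_frame="school lists",
    subpopulations_oversampled=False,
    incentives_provided=True,
    informants={"child": True, "caregiver": True},
)
print(record.informants)
```

Validating codes at entry time keeps charted data consistent across extractors, which matters when several researchers code studies in parallel.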