Research Review: Internalising symptoms in developmental coordination disorder: a systematic review and meta‐analysis

Background: Developmental coordination disorder (DCD) affects 5%–6% of children. There is growing evidence that DCD is associated with greater levels of internalising symptoms (i.e. depression and anxiety). This is the first systematic review and meta-analysis to explore the magnitude of this effect, the quality of the evidence and potential moderators.
Methods: A systematic search was conducted to identify studies reporting a comparison between individuals with DCD/probable DCD and typically developing (TD) individuals on measures of internalising symptoms. A pooled effect size (Hedges' g) was calculated using random-effects meta-analysis. Study quality, publication bias and potential moderators of the effect were explored.
Results: Twenty studies, including a total of 23 subsamples, met the inclusion criteria, of which 22 subsamples were included in the meta-analysis (DCD: n = 1123; TD: n = 7346). A significant, moderate effect of DCD on internalising symptoms was found (g = 0.61). This effect remained robust after accounting for publication bias and excluding lower quality studies. The effect was significantly larger in studies utilising a cross-sectional design (vs. longitudinal), convenience sampling (vs. population screening) and a majority male sample.
Conclusions: The findings demonstrate that individuals with DCD experience greater levels of internalising symptoms than their peers. This highlights the importance of routine screening for emotional difficulties in DCD, raising awareness of the condition in mental health services and developing psychosocial interventions that extend beyond a focus on motor impairments. However, there is a need for higher quality, longitudinal studies to better understand the causal relationship between DCD and internalising symptoms.


Introduction
Developmental coordination disorder (DCD) is a neurodevelopmental disorder affecting between 5%-6% of children and is characterised by significant impairment to an individual's ability to perform everyday motor tasks (American Psychiatric Association [APA], 2013). This can include difficulties with self-care (e.g. tying shoelaces), academic tasks (e.g. handwriting) and leisure activities (e.g. catching a ball). A diagnosis of DCD is based on four criteria (APA, 2013): (a) performance in motor coordination tasks is substantially below expectation given the person's age and opportunities; (b) the motor coordination difficulties significantly interfere with activities of daily living or academic achievement; (c) difficulties began in the early developmental period; and (d) the difficulties cannot be attributed to an intellectual disability or neurological condition (e.g. cerebral palsy).
Despite its prevalence, DCD often goes unrecognised and is poorly understood among healthcare and education professionals (Gaines, Missiuna, Egan, & McLean, 2008; Wilson, Neil, Kamps, & Babcock, 2013). This is of concern given that DCD has been found to have a significant impact not just on an individual's motor abilities, but across a wide range of psychological, cognitive, physical and social domains (Zwicker, Harris, & Klassen, 2013). There is also evidence that the impact of DCD across these domains persists into adulthood (Cousins & Smyth, 2003; Hill, Brown, & Sorgardt, 2011).
One area that has received increasing attention is the impact of DCD on mental health, specifically internalising symptoms (i.e. depression and anxiety). There is growing evidence that individuals with DCD have elevated levels of internalising symptoms compared to their typically developing (TD) peers (Mancini, Rigoli, Cairney, Roberts, & Piek, 2016; Mancini, Rigoli, Roberts, & Piek, 2018). Research has also found associations between motor ability and internalising symptoms in community samples of TD children and adults (Poole et al., 2016; Rigoli, Piek, & Kane, 2012; Wilson, Piek, & Kane, 2013), an increased risk of psychiatric disorders in individuals with DCD (Rasmussen & Gillberg, 2000), and impaired motor ability in individuals with common psychiatric disorders (Damme, Simons, Sabbe, & van West, 2015).
Understanding the link between DCD and mental health has important implications for assessment and intervention with this population, including at schools, physical health services and mental health services. To date, several reviews have summarised the findings on internalising symptoms in DCD (Caçola, 2016; Mancini et al., 2016, 2018). However, they consist of narrative summaries only. There are also inconsistent findings, with some studies finding no significant effect (Davis, Ford, Anderson, & Doyle, 2007; King-Dowling, Missiuna, Rodriguez, Greenway, & Cairney, 2015) and others a large effect (Dewey, Kaplan, Crawford, & Wilson, 2002; Pratt & Hill, 2011). A systematic search, synthesis and critical appraisal of the evidence can provide a more rigorous understanding of this relationship (Mancini et al., 2016, 2018). A meta-analysis to pool the findings would provide a more accurate understanding of whether individuals with DCD do indeed experience greater internalising symptoms than their peers and would provide insight into the magnitude of this difference. Studies also vary greatly in their design, participants, measures and methodological quality. Whereas some studies have recruited participants with a confirmed diagnosis of DCD, others have included only those identified as 'probable DCD' (i.e. based on parent-report or performance-based screening measures without comprehensive assessment of all diagnostic criteria). Studies also differ in how well they controlled for confounding variables, whether they employed a longitudinal or cross-sectional design and whether they recruited participants through population-based screening or convenience sampling. These differing methodological factors could all impact on the quality of a study and, thus, the magnitude of the effect identified (Sanderson, Tatt, & Higgins, 2007).
There is some evidence to suggest the impact may be greater in adolescents compared to younger children (Piek et al., 2007; Skinner & Piek, 2001), in males (Sigurdsson et al., 2002) and in individuals with comorbid attention-deficit hyperactivity disorder (ADHD; Piek et al., 2007), although these factors are unlikely to explain all the variance in the relationship between DCD and internalising symptoms. Research has also highlighted differences between parent- and child-reported measures of internalising symptoms, with parents often under-reporting difficulties (Cantwell, Lewinsohn, Rohde, & Seeley, 1997). A meta-analytic approach allows for an investigation into the potential sources of heterogeneity across studies, which could help to guide future research and intervention.
The aim of this paper, therefore, was to answer the questions: do individuals with DCD experience significantly greater levels of internalising symptoms than TD individuals, and what is the magnitude of this difference? The specific objectives were to conduct a systematic review and meta-analysis of studies that compared individuals with DCD to TD individuals on measures of internalising symptoms; to appraise the quality of the evidence; and to explore which factors moderate the effect. The focus was on severity levels of internalising symptoms, as opposed to rates of actual diagnosis, given that most studies have adopted severity outcome measures, and given that data on diagnostic rates may obscure important differences in actual symptoms.

Method
The review was protocol-driven and carried out in accordance with recommended guidelines for systematic reviews (Moher, Liberati, Tetzlaff, Altman, & The PRISMA Group, 2009) and meta-analyses of observational studies (Stroup et al., 2000).

Eligibility criteria
In line with recommended guidelines, broad inclusion criteria were used with the aim to later explore the impact of specific design features. Articles were eligible if they: (a) included participants, of any age, with a confirmed diagnosis of DCD according to DSM-IV or DSM-5 criteria; or who were identified as having motor coordination difficulties consistent with DCD (i.e. 'probable DCD'); (b) included a comparison group of TD individuals, as defined by the absence of diagnosed or suspected developmental disorders at the time of the study; (c) measured levels of internalising symptoms (i.e. depression and/or anxiety) for each group using self-report, parent-report, teacher-report, direct observation or clinical interview; (d) reported statistics that could be transformed into a standardised mean difference; and (e) were available in full text in English. Studies involving participants with a comorbid diagnosis (e.g. ADHD) were eligible if the motor coordination difficulties were clearly described and used as the basis for group comparison. Studies were excluded if participants' motor difficulties were attributed to another developmental difficulty or medical diagnosis (e.g. cerebral palsy). For studies using the label 'dyspraxia', they were required to refer to overall motor impairment and not just oromotor difficulties and gesture. Studies were also excluded if the outcome only included rates of psychiatric diagnoses (i.e. they did not report on a measure of the severity of symptoms).
If separate studies included overlapping samples, priority was given to the study with the best control of important confounders (i.e. age and gender) or the study that allowed for the most detailed exploration of moderating factors (e.g. outcomes reported separately by gender or age group). Where the same participants were included but different subtests reported, data were combined (Borenstein, Hedges, Higgins, & Rothstein, 2009).

Search strategy
Studies were identified through a systematic search of Medline, PsycINFO, CINAHL, ERIC and Web of Science. Unpublished studies were searched using ProQuest Dissertations and Theses and Open Grey. The latest search was completed on 3rd March 2018. The search included terms related to DCD combined with terms related to internalising symptoms (see Appendix S1). Titles and abstracts were screened by one reviewer (SO), with 25% cross-checked by a second (AJ; per cent agreement = 97.5%, Cohen's kappa = .65). The full texts of all potentially relevant articles were then screened independently by two reviewers (SO & AJ), with disagreement resolved by consensus and discussion with a third (HL; per cent agreement = 94.5%, kappa = .87). The bibliographies of the included studies and relevant review articles were screened and their citations were tracked to identify additional studies. The first authors of the included articles were contacted to identify any further eligible studies or to clarify missing information.
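As an illustration only, the two agreement statistics quoted for screening (per cent agreement and Cohen's kappa) can be computed as in the sketch below. This is not the reviewers' own code: the function name `cohens_kappa` and the screening counts are hypothetical, chosen to show how a very high raw agreement can coexist with a more modest kappa when most records are excluded by both reviewers.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Return (proportion agreement, Cohen's kappa) for two raters' decisions."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of records given the same decision by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    if expected == 1.0:
        return observed, 1.0  # both raters used a single identical category throughout
    return observed, (observed - expected) / (1 - expected)


# Hypothetical screening decisions for 200 abstracts (NOT the review's data):
# 5 included by both, 3 by reviewer A only, 2 by reviewer B only, 190 excluded by both.
reviewer_a = ["in"] * 5 + ["in"] * 3 + ["out"] * 2 + ["out"] * 190
reviewer_b = ["in"] * 5 + ["out"] * 3 + ["in"] * 2 + ["out"] * 190
agreement, kappa = cohens_kappa(reviewer_a, reviewer_b)
print(f"agreement = {agreement:.3f}, kappa = {kappa:.2f}")
```

Because kappa corrects for the agreement expected by chance, it is sensitive to how rarely records are included; this is one plausible reason for the gap between the two statistics at the title and abstract stage.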

Data extraction
Two researchers (SO & AJ) performed data extraction for all included studies and inconsistencies were discussed until consensus was reached. Interrater reliability was good for both categorical (per cent agreement = 96%-100%; kappa = .89-1.00) and continuous (intraclass correlation coefficient = 1.00) data.
The following information was extracted: author, publication year, country, design (cross-sectional; longitudinal), population, sampling procedure (population-based screening; selective/convenience sample), criteria for DCD (confirmed DCD; probable DCD), criteria for TD, number of participants, gender (percentage male), age (mean and range), comorbid ADHD diagnosis (ADHD assessed and excluded from the sample; ADHD assessed and included; ADHD not assessed), measures of internalising symptoms, internalising construct (depression; anxiety; overall internalising), reporter (self-report; parent-report; teacher-report; clinician/researcher-report) and scores (means and standard deviations, other relevant statistics).
If multiple informants or measures were used to assess internalising symptoms, they were extracted separately so that they could be pooled. Preference was given to data adjusted for important confounders (e.g. gender, age) if not matched by design. However, where studies also adjusted for additional variables (e.g. intelligence), the unadjusted scores were preferred to ensure comparability across studies (Voils, Crandell, Chang, Leeman, & Sandelowski, 2011). Where findings were reported separately for subgroups (e.g. gender, age groups), these data were extracted separately as subsamples. Where separate groups were included for confirmed and probable DCD, only the confirmed DCD group was extracted. Where separate groups were included for comorbid DCD/ADHD and DCD-only, the DCD-only group was extracted.

Study quality
An adapted version of the Newcastle-Ottawa Scale (NOS) was used to assess study quality (Wells et al., 2011). Two reviewers (SO & AJ) conducted ratings independently, with disagreement discussed until consensus was reached (per cent agreement = 91%-100%; kappa = .82-1.00). Each study was rated on representativeness of the DCD group (i.e. population screening), selection of the control group (i.e. same population as DCD), ascertainment of DCD diagnosis (i.e. confirmed DCD); control for baseline internalising (for longitudinal studies), comparability of groups (i.e. control for confounders), measurement of internalising symptoms (i.e. validated measures), length of follow-up (for longitudinal studies) and completeness of follow-up (for longitudinal studies only).
For the DCD criteria to be rated as 'Confirmed DCD', the study must have assessed motor skills as being below the 15th percentile using performance-based measures (Criterion A; Blank, Smits-Engelsman, Polatajko, & Wilson, 2012), as having a significant impact on activities of daily living or academic achievement (e.g. questionnaires or interview; Criterion B), and ruled out intellectual disability and other neurological conditions (e.g. by interview, performance measures, medical reports; Criterion D). Alternatively, they could have cross-checked medical records for diagnosis. Given that many studies were published prior to publication of the DSM-5 and the introduction of Criterion C, it was not essential that studies established whether participants met this criterion (i.e. if difficulties began in the early developmental period). The confounding factors of age and gender were considered to be the most important for comparability of study groups (Twenge & Nolen-Hoeksema, 2002). The NOS satisfies relevant guidelines (Sanderson et al., 2007) and is recommended for systematic reviews of observational studies (Deeks et al., 2003).

Statistical analysis
The main analysis was performed using Review Manager 5.3 software (The Nordic Cochrane Centre, The Cochrane Collaboration, Copenhagen).
Summary effect. The standardised mean difference (SMD; Hedges' g) and its 95% confidence interval were calculated for each study (or subsample) separately. SMDs of around 0.2 can be considered small, 0.5 moderate and 0.8 large. Where studies reported on multiple measures of internalising, a pooled effect size and variance were calculated (Borenstein et al., 2009). Effect sizes were weighted according to the inverse of their variance to ensure that more precise estimates influence overall effect size most heavily and to reduce the effect of the upwardly biased estimates of smaller studies (Hedges & Olkin, 1985). Random-effects meta-analysis was used to calculate a summary effect for total internalising symptoms across all studies and its 95% confidence interval.
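The calculation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the software used in the review: the function names `hedges_g` and `random_effects` are hypothetical, the effect size and variance formulas are the textbook versions given in sources such as Borenstein et al. (2009), and the between-study variance is assumed to be estimated with the DerSimonian-Laird method. Review Manager's internal formulas may differ in detail.

```python
import math


def hedges_g(mean_dcd, sd_dcd, n_dcd, mean_td, sd_td, n_td):
    """Bias-corrected standardised mean difference (Hedges' g) and its variance."""
    df = n_dcd + n_td - 2
    pooled_sd = math.sqrt(((n_dcd - 1) * sd_dcd**2 + (n_td - 1) * sd_td**2) / df)
    d = (mean_dcd - mean_td) / pooled_sd
    var_d = (n_dcd + n_td) / (n_dcd * n_td) + d**2 / (2 * (n_dcd + n_td))
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    return j * d, j**2 * var_d


def random_effects(effects, variances):
    """Pooled effect, 95% CI and tau^2 (DerSimonian-Laird estimator assumed)."""
    k = len(effects)
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * g for wi, g in zip(w, effects)) / sum(w)
    q = sum(wi * (g - fixed) ** 2 for wi, g in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance to each study's variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2


# Hypothetical study-level summary statistics (NOT taken from the included studies).
studies = [
    hedges_g(12.0, 4.0, 30, 9.5, 4.0, 60),
    hedges_g(55.0, 10.0, 45, 50.0, 9.0, 200),
    hedges_g(8.0, 3.0, 20, 7.0, 3.5, 25),
]
pooled, ci, tau2 = random_effects([g for g, _ in studies], [v for _, v in studies])
print(f"g = {pooled:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, tau^2 = {tau2:.3f}")
```

Adding tau^2 to every study's variance flattens the weights relative to a fixed-effect analysis, so large studies dominate the summary estimate less when the studies disagree.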
Heterogeneity. Q-statistics were used to assess for heterogeneity and the I² statistic to quantify the proportion of the variance due to heterogeneity. Moderators were explored to identify potential sources of heterogeneity. The following moderators were explored: design (longitudinal vs. cross-sectional), gender (>50% male vs. ≤50% male in DCD group), age (included adolescents ≥12 vs. no adolescents), comorbid ADHD (assessed and excluded vs. not assessed, or assessed but included), sampling strategy (population screen vs. selective/convenience), selection of DCD group (confirmed vs. probable DCD), controlled for confounders (age and gender controlled vs. uncontrolled), reporter (self-report vs. informant-report) and type of internalising (overall internalising vs. depression vs. anxiety). The significance of moderators was tested using Q-statistics.
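For readers unfamiliar with these statistics, the sketch below shows one common way of computing Cochran's Q and the proportion of variance attributed to heterogeneity. The function names `cochran_q` and `i_squared` are hypothetical and the code is illustrative rather than the procedure implemented in the review's software. The two calls at the end use heterogeneity values reported in the Results of this review (47.84 across 22 subsamples; 5.95 across the ten higher quality studies).

```python
def cochran_q(effects, variances):
    """Weighted sum of squared deviations from the inverse-variance pooled effect."""
    weights = [1 / v for v in variances]
    pooled = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
    return sum(w * (g - pooled) ** 2 for w, g in zip(weights, effects))


def i_squared(q, k):
    """Percentage of total variation attributed to heterogeneity (truncated at 0)."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


print(round(i_squared(47.84, 22), 1))  # 56.1, in line with the reported 56%
print(i_squared(5.95, 10))             # 0.0, since Q is below its degrees of freedom
```

The truncation at zero is why a Q value smaller than its degrees of freedom is reported as 0% rather than as a negative percentage.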
Publication bias. Publication bias was assessed visually using a funnel plot. Egger's test was used to statistically check for publication bias. Duval and Tweedie's trim and fill procedure was used to compute an adjusted effect size by imputing the effect of smaller, unpublished studies (Duval & Tweedie, 2000). Finally, Rosenthal's (1979) Fail-safe N was calculated to determine the number of studies with an average effect size of 0 that would have to be included to produce a nonsignificant result. This number should exceed 5k + 10 (where k is the number of studies).
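Two of the checks described above lend themselves to a short illustration. The sketch below assumes the usual formulations: Rosenthal's fail-safe N computed from per-study standard normal deviates with a one-tailed critical value of 1.645, and Egger's bias taken as the intercept from an ordinary least squares regression of the standardised effect on precision. The function names and the example inputs are hypothetical, and the confidence interval and p value for Egger's test are omitted.

```python
def failsafe_n(z_scores, z_alpha=1.645):
    """Rosenthal's (1979) fail-safe N from per-study standard normal deviates."""
    k = len(z_scores)
    return max(0.0, sum(z_scores) ** 2 / z_alpha**2 - k)


def tolerance_level(k):
    """Rosenthal's benchmark that the fail-safe N should exceed."""
    return 5 * k + 10


def egger_intercept(effects, std_errors):
    """Intercept of the regression of (effect / SE) on (1 / SE)."""
    precision = [1 / se for se in std_errors]
    standardised = [g / se for g, se in zip(effects, std_errors)]
    n = len(precision)
    mean_x = sum(precision) / n
    mean_y = sum(standardised) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(precision, standardised))
    return mean_y - (sxy / sxx) * mean_x


print(tolerance_level(22))  # 120 for the 22 subsamples in the meta-analysis
# Hypothetical effects and standard errors in which smaller studies report larger effects.
print(egger_intercept([0.9, 0.7, 0.5, 0.45], [0.40, 0.30, 0.15, 0.10]))
```

An intercept close to zero indicates a symmetrical funnel plot, whereas a positive intercept reflects smaller, less precise studies reporting larger effects.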

Sensitivity analysis. A sensitivity analysis was conducted to calculate a pooled effect size excluding lower quality studies (i.e. those not meeting at least five criteria on the NOS).

Search results
The search identified 20 studies meeting the inclusion criteria, consisting of 23 eligible subsamples (two studies reported outcomes separately for males and females, one study for children and adolescents; hereafter treated as separate studies). The search process is summarised in Figure 1.
It should be noted that two articles reported outcomes at multiple time points for the same longitudinal study (Harrowell, Hollén, Lingam, & Emond, 2017; Lingam et al., 2012). The data from the latter time point were used (Harrowell et al., 2017) because separate outcomes were reported for males and females, allowing for better exploration of moderators. Two articles reported data on the same cross-sectional study (Pearsall-Jones, Piek, Rigoli, Martin, & Levy, 2011; Piek et al., 2007), so the larger sample was included (Piek et al., 2007). One eligible study reported unusually small standard deviations for the outcome, raising concerns about its accuracy, and was subsequently excluded (Tseng, Howe, Chuang, & Hsieh, 2007).

Characteristics of the included studies
The characteristics of the included studies are summarised in Table 1. A total of 8,469 participants were included (1,123 DCD, 7,346 TD). The studies were published between 1994 and 2018. Most were from developed countries, with one study from Taiwan. Three prospective cohort studies were identified that screened for DCD in a cohort and assessed their internalising symptoms at a later follow-up (Harrowell et al., 2017, male and female samples; Wagner, Jekauc, Worth, & Woll, 2016). The remaining 20 studies adopted a cross-sectional design.

Risk of bias
The risk of bias in the included studies is summarised in Table 2. Selection bias was variable. The use of a population screening method in 16 studies ensured the DCD sample was somewhat representative of the population studied. However, seven studies recruited the DCD sample from a selective group or convenience sample, which may be at greater risk of bias. This included children born very preterm or with extremely low birth weight (Davis et al., 2007), volunteer samples from DCD support groups (Crane et al., 2017; Hill & Brown, 2013; Pratt & Hill, 2011), monozygotic twin samples (Piek et al., 2007)   2006). In most studies, the TD group was drawn from the same community (e.g. school, geographical location) as the DCD group, except for four studies that were from a different source (Crane et al., 2017; Hill & Brown, 2013; Pratt & Hill, 2011; Wagner et al., 2012) and thus had an increased risk from selection bias. Additionally, while 10 studies confirmed diagnoses of DCD by independent assessment or clinical reports (Chen et al., 2009; Crane et al., 2017; Harrowell et al., 2017; van den Heuvel et al., 2016; Hill & Brown, 2013; Pratt & Hill, 2011; Wagner, 2017; Watson & Knott, 2006), 13 did not confirm key diagnostic criteria (Davis et al., 2007; Dewey et al., 2002; Francis & Piek, 2003; King-Dowling et al., 2015; Li et al., 2018; Piek et al., 2007, 2008; Schoemaker & Kalverboer, 1994; Skinner & Piek, 2001; Wagner et al., 2016). Caution should be taken, therefore, when attributing differences in internalising symptoms in these studies to DCD. Most studies were cross-sectional and, therefore, unable to account for internalising symptoms prior to the development of motor difficulties. Of the longitudinal studies, one controlled for baseline internalising symptoms (Wagner et al., 2016). The studies varied in the comparability of study groups and control for important confounders.
Age and gender were controlled in 11 studies through either matched groups (Francis & Piek, 2003; Piek et al., 2007; Schoemaker & Kalverboer, 1994; Skinner & Piek, 2001; Wagner et al., 2012; Watson & Knott, 2006) or adjusted analyses (Harrowell et al., 2017; Wagner et al., 2016). The study by Piek et al. (2007) adopted a monozygotic twin design which also controls for a wide range of genetic and shared environmental factors. The remaining studies failed to sufficiently control for age and gender. Although one such study did adjust for differences in gender in the analysis (Piek et al., 2008), it also included intellectual ability as a covariate, and so the unadjusted difference between the groups was used in the meta-analysis to ensure comparability (Voils et al., 2011).
All studies utilised outcome measures with established validity and reliability. It should be noted that three studies only reported specific narrow-band depression or anxiety subscales, as opposed to using the broad-band internalising scales (Piek et al., 2008). This might raise concerns around selective reporting.
Finally, only three studies included a longitudinal follow-up (Harrowell et al., 2017, male and female samples; Wagner et al., 2016). This was over a year in all three studies. However, there were high rates of attrition in all three.

Internalising symptoms
Since only one study based on an adult sample was identified, that study was excluded from the remaining analyses. Across the 22 studies with children and adolescents, those with DCD or probable DCD were found to have higher levels of internalising symptoms than TD controls with a medium effect size (g = 0.61; 95% CI: 0.48-0.74; see Figure 2 for forest plot). There was significant moderate heterogeneity among the studies (I² = 56%; χ² = 47.84; p = .0007).

Moderator analysis
The results of the moderator analyses are summarised in Table 3. The results revealed that the effect size was significantly larger in studies that utilised a cross-sectional design (vs. longitudinal), that included a majority male sample in the DCD group (vs. majority female) and that recruited a selective or convenience sample of participants (as opposed to population-based screening). There was also a trend (p = .05) towards a greater effect size in studies that did not control for important confounders and in studies that excluded individuals with a diagnosis of ADHD. No significant effect was found for age, confirmation of DCD diagnosis, outcome respondent or type of internalising measure.

Sensitivity analysis
A sensitivity analysis was conducted to include only those studies meeting five or more criteria on the NOS. Moderator analysis identified a significant difference between the high-quality and low-quality studies (Q = 4.42; p = .04). Analysis of the higher quality studies (k = 10) found a smaller, but still moderate, effect of DCD on internalising symptoms (g = 0.46; 95% CI: 0.34-0.58). There was also no evidence of significant heterogeneity among these ten studies (Q = 5.95; p = .55; I² = 0%).
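The between-group Q test used for this comparison, and for the moderator analyses more generally, can be sketched as follows. The code is an illustration under assumptions rather than the review's own procedure: the function names are hypothetical, subgroup summary effects are weighted by the inverse of their squared standard errors, standard errors are recovered from 95% confidence intervals assuming normality, and the p value helper is valid only for a comparison of two groups (one degree of freedom).

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)


def q_between(subgroup_effects, subgroup_ses):
    """Between-subgroup Q statistic and its degrees of freedom."""
    weights = [1 / se**2 for se in subgroup_ses]
    grand = sum(w * g for w, g in zip(weights, subgroup_effects)) / sum(weights)
    q = sum(w * (g - grand) ** 2 for w, g in zip(weights, subgroup_effects))
    return q, len(subgroup_effects) - 1


def p_value_df1(q):
    """Upper-tail chi-square probability; valid for 1 degree of freedom only."""
    return math.erfc(math.sqrt(q / 2))


# Reported values: the higher quality subgroup has g = 0.46 (95% CI 0.34 to 0.58),
# and the quality comparison gave Q = 4.42 with two groups, hence 1 degree of freedom.
print(round(se_from_ci(0.34, 0.58), 3))  # 0.061
print(round(p_value_df1(4.42), 2))       # 0.04, matching the reported p value
```

The summary effect for the lower quality subgroup is not given in the text, so the reported Q itself cannot be recomputed here; only its p value is checked.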

Publication bias
The funnel plot (see Figure S1) displayed some asymmetry, with smaller studies tending to report larger effect sizes, possibly indicative of publication bias. Egger's test was statistically significant, supporting the presence of publication bias (Egger's bias = 2.38; 95% CI: 0.20-4.56; p = .02). However, Duval and Tweedie's trim and fill procedure did not impute additional studies and, therefore, the effect size adjusted for publication bias was identical to the nonadjusted effect size. Rosenthal's Fail-safe N suggested that 1,076 studies with null results would have to be included to produce a nonsignificant combined effect size. This is substantially larger than the minimum required when applying Rosenthal's (1979) formula (i.e. 120).

Discussion
The present systematic review and meta-analysis indicate that children and adolescents with DCD or probable DCD experience greater levels of internalising symptoms compared to their TD peers. The magnitude of this difference suggests a moderate effect size, with individuals with DCD scoring over half a standard deviation higher. This moderate effect, although reduced slightly, remained robust after excluding lower quality studies. Methodological and participant factors that may moderate the magnitude of this effect have also been identified.

DCD and internalising symptoms
The findings are in line with the emerging consensus that DCD can have a significant impact on an individual's mental health (Caçola, 2016; Mancini et al., 2016, 2018). Notably, the magnitude of the effect identified is comparable to, if not greater than, that found in meta-analyses of a wide range of chronic physical health conditions (Pinquart & Shen, 2011a, 2011b). The environmental stress hypothesis is a framework that was introduced to account for the relationship between DCD and mental health (Cairney, Rigoli, & Piek, 2013). It suggests that the motor impairments in DCD can expose an individual to a variety of secondary stressors, which over time can lead to poorer mental health. Although potential mediators were not explored in this review, they have been outlined previously (Mancini et al., 2016, 2018). They include peer victimisation, reduced leisure activities (Raz-Silbiger et al., 2015), impaired social skills (Wilson, Piek et al., 2013), poorer self-esteem (Rigoli et al., 2012), physical inactivity (Li et al., 2018), reduced social support (Rigoli et al., 2017) and lower perceived academic competence (Lingam et al., 2012). Individuals with DCD may also experience impairments to various cognitive abilities, including executive function (Wilson, Ruddock, Smits-Engelsman, Polatajko, & Blank, 2013) and social cognition (Cummins, Piek, & Dyck, 2005). This may further impact on self-regulation and mental well-being (Lantrip, Isquith, Koven, Welsh, & Roth, 2016; Letkiewicz et al., 2014). Future meta-analyses regarding the magnitude of the effects for these potential mediators will be important, providing further opportunities for intervention.

Moderating factors
The present review identified several methodological factors that might moderate the degree to which DCD is associated with internalising symptoms. As expected, the effect size was likely overestimated in cross-sectional compared to longitudinal studies, and in convenience or clinic-referred samples compared to samples recruited via population-based screening. Such methodologies have less control of confounding factors and a less representative selection of DCD participants (e.g. more severe impairments in clinical samples). There was also a trend towards larger effect sizes in studies that failed to control for age and gender. Again, the magnitude of the effect sizes in these studies was likely inflated by confounding factors (Deeks et al., 2003). No significant effect was found for the DCD criteria used. This suggests that, although establishing all DCD diagnostic criteria is important for the quality of research in this area (Zwicker et al., 2013), failing to do this might not have a substantial impact on the results. This may be because it is specifically the motor difficulties, as tested by screening measures in studies of 'probable DCD', which affect internalising symptoms, rather than issues surrounding having a diagnosis. Population-based screening, longitudinal design and control for confounders should therefore take priority in future studies. Participant factors that might moderate the effect of DCD on internalising symptoms were also identified. The effect was larger in studies with a majority male sample. This would suggest that DCD has a greater impact on the mental health of males and is in line with the findings of Sigurdsson et al. (2002). This is particularly important given the prevalence of DCD may be greater in males (Kirby & Sugden, 2007; Missiuna, 1994).
It has been suggested that male children attribute greater value to physical activity and sports compared to females, which might account for the larger impact of DCD on their wellbeing (Poulsen, Ziviani, & Cuskelly, 2006; Poulsen, Ziviani, Cuskelly, & Smith, 2007). However, although significant, the difference in the magnitude of the effect sizes was minimal. Additionally, only two of the included studies reported within-study comparisons of the impact of DCD on males and females, with one suggesting no difference and the other suggesting a greater impact for females (Harrowell et al., 2017). Regardless of which gender experiences the larger effect, there is evidence that DCD can impact on the mental health of both genders and perhaps it is the mechanism by which this occurs that differs (Li et al., 2018).
There was also a trend towards larger effect sizes in studies that specifically excluded participants with ADHD. This contradicts what might have been expected from previous research (Martin, Piek, & Hay, 2006; Rasmussen & Gillberg, 2000). One possible explanation for this finding is that children with comorbid ADHD are more likely to be diagnosed and to subsequently receive support for their difficulties (Heath, Toste, & Missiuna, 2005; Rivard, Missiuna, Hanna, & Wishart, 2011). However, it should also be noted that there were only five studies included in the meta-analysis that specifically excluded participants with ADHD. All five studies were also of a lower quality. A more plausible explanation is that methodological limitations inflated their combined effect size. Contrary to what might be expected (Missiuna, Moll, King, King, & Law, 2007; Skinner & Piek, 2001), no significant moderating effect was found for age. However, only one study included in the present review reported separate outcomes for adolescents and children. The moderator categories for the meta-analysis were based on somewhat arbitrary criteria (i.e. studies that included adolescents in their sample, as opposed to studies with a pure adolescent sample) which may have prevented the detection of differences between the age groups.
Finally, no significant effect was found for the type of outcome measure or respondent, suggesting that DCD may be associated with elevated levels of depression, anxiety and overall internalising symptoms, regardless of the person rating it. However, there are limited within-study comparisons available despite previous research highlighting variability between parent- and self-reported internalising symptoms (Cantwell et al., 1997). As such, collecting information from multiple informants will maximise reliability in future research and clinical practice.

Strengths and limitations
There are several limitations of this review. First, the quality of the included studies was variable. Most studies were based only on cross-sectional data, which make it difficult to establish causality. Although three longitudinal studies were included, only one controlled for baseline measures of internalising symptoms and all reported a high rate of attrition. Many of the studies also failed to control for important confounders (i.e. age and gender) and to establish all DCD diagnostic criteria. The review also focused on studies that dichotomised participants into DCD and TD groups, whereas motor coordination can be understood as a continuum of ability. This dichotomy can miss the variation in motor skills that exists within each group, as well as changes over time and across different measures.
The moderator analysis should also be interpreted with caution. As outlined above, there was an insufficient number of studies within some of the moderator categories to reliably explore their impact (including age and comorbid ADHD). It is of note that many studies failed to measure ADHD symptomatology, despite evidence for high rates of comorbidity (Martin et al., 2006). Most of the studies were also conducted in western, developed countries and, therefore, generalisation to other countries is limited. Additionally, only one study with adults was identified; therefore, the extent to which elevated internalising symptoms persist into adulthood is unclear.
However, this review is the first attempt to systematically synthesise the evidence on internalising symptoms in individuals with DCD and to provide a pooled summary of the effect size. Publication bias is unlikely to have had a substantial impact on the findings. Additionally, despite methodological limitations of the included studies, potential moderating factors have been identified. The effect size also remained substantial, and heterogeneity was reduced, after excluding lower quality studies.
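To make the idea of a pooled effect size and its heterogeneity concrete, the following is a minimal sketch of random-effects pooling in Python. It assumes the DerSimonian-Laird estimator, which is a common choice but is not confirmed as the method used in this review. The function name `pool_random_effects` and all study values are hypothetical and are not the data from the included studies.

```python
import math


def pool_random_effects(effects, variances):
    """Pool effect sizes (e.g. Hedges' g) with a random-effects model.

    Assumes the DerSimonian-Laird estimator of between-study variance.
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * g for wi, g in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (g - fixed) ** 2 for wi, g in zip(w, effects))
    # Between-study variance tau^2, truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    # I^2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return {
        "g": pooled,
        "se": se,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical studies: (Hedges' g, sampling variance, lower quality?)
studies = [
    (0.95, 0.09, True),
    (0.40, 0.02, False),
    (0.70, 0.05, False),
    (1.20, 0.12, True),
    (0.55, 0.03, False),
]

all_studies = pool_random_effects([s[0] for s in studies], [s[1] for s in studies])
higher_quality = [s for s in studies if not s[2]]
sensitivity = pool_random_effects(
    [s[0] for s in higher_quality], [s[1] for s in higher_quality]
)
print(all_studies)
print(sensitivity)
```

Comparing the two results illustrates the kind of sensitivity analysis described above: the pooled estimate and I² are recomputed after dropping the studies flagged as lower quality.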

Implications and conclusion
The findings have important clinical and research implications. It can be concluded that individuals with DCD have an increased risk of developing elevated levels of internalising symptoms. The pooled difference between individuals with DCD and their peers (g = 0.61) exceeds half a standard deviation and could therefore be considered clinically important (Norman, Sloan, & Wyrwich, 2003). This would support the practice of routine screening for mental health difficulties in individuals with DCD and motor impairment. Given that DCD is poorly understood among professionals (Gaines et al., 2008; Wilson, Neil et al., 2013) and that families often report difficulties obtaining support (Stephenson & Chesson, 2008), such routine screening could be useful across a range of services (including schools, occupational therapy, physical healthcare and mental healthcare). The findings also highlight the need for professionals in mental health services to be aware of the disorder and how it impacts their patients. Additionally, the findings support the need for the development of psychosocial interventions for DCD with a focus on the secondary stressors that might mediate the link between motor difficulties and emotional well-being.
Future research should focus on high-quality longitudinal studies to better understand the causal link between DCD and internalising symptoms, including the role of important mediators. It is recommended that studies use probability sampling strategies and control for confounders and for the stability of internalising symptoms over time. This review has also highlighted the need for more research investigating mental health in adults with DCD, especially given that the impact of DCD has been found to continue into adulthood (Cousins & Smyth, 2003; Hill et al., 2011; Kirby, Williams, Thomas, & Hill, 2013). Research investigating the effectiveness of routine screening for mental health difficulties in DCD and of psychosocial interventions would also provide insight into the improved management of DCD. Given the major economic impact of poor mental health (Trautmann, Rehm, & Wittchen, 2016) and the increasing focus on improving psychological well-being in government policy (Department of Health, 2011), the need to identify and support those individuals most at risk of mental health difficulties is crucial.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article: Appendix S1. Summary of search terms used. Figure S1. Funnel plot of effect sizes and standard error.