Exploring causal hypotheses: breaking with long-standing research traditions



This commentary is on the original article by Faebo Larsen et al. on pages 1016–1022 of this issue.

Faebo Larsen et al.[1] investigate the early life determinants of developmental coordination disorder (DCD) in 7-year-old children, using data from the Danish National Birth Cohort. They explore associations between the outcome variable, DCD, and (amongst others) sex, gestational age, being small for gestational age (SGA), and walking attainment, reaffirming previously established determinants of DCD and adding a new one (SGA) in the process. Their main contributions are the sex-specific associations for SGA.

As such, the authors' approach fits a long-standing research tradition in which caution regarding causal inferences prevails and understanding progresses slowly and carefully. We would like to take this opportunity to point out some possibilities for formulating and expanding their results using causal graphs, since recent developments in statistical perspectives on causal inference have shown the former credo ‘correlation is not causation’ to be overly restrictive.[2] As pointed out previously in DMCN, causal graphs and their theory could also be of great use in the epidemiology of developmental disabilities in children.[3]

Causal graphs and their associated theory have bridged the gap between statistical associations and causal connections mathematically, providing ways to test statistically (parts of) hypothetical causal models using observational data.[4] When done with appropriate caution and obeying model assumptions – albeit no more than should be customary when applying any statistical tool – this can avoid the risk of over-extrapolating the available data.

Respecting constraints imposed by time and logic, the graph in Figure 1 represents one of the plausible causal mechanisms underlying the variables and as such it may form an indispensable addition to the research questions posed by Faebo Larsen et al. The arrows depict hypothesized causal effects between the variables in the model.[4] The causal hypothesis as depicted by this graph could then be evaluated with appropriate statistical tools (e.g. structural equation modelling or even logistic regression). Additionally, focussing on one specific effect, the graph and the associated theory can be used to identify those variables for which conditioning in the analysis is needed in order to obtain unconfounded effect estimates.[2, 4]

Figure 1. A causal graph depicting one of the plausible causal mechanisms underlying the variables investigated by Faebo Larsen et al.

For example, the graph indicates that, to examine the existence of a direct effect of being SGA on DCD (the dotted arrow in the graph), conditioning on sex, walking attainment, gestational age, and maternal background variables in the analysis would provide the correct estimate of just that effect. This estimate would be found as the (unreported) regression coefficient of SGA on DCD in the logistic regression model of the second column of Table IV in the paper by Faebo Larsen et al. Similar conclusions can be drawn for other (total, direct, or indirect) effects. If the total effect of being SGA (including the indirect effects on DCD through walking attainment and gestational age) had been of interest, walking attainment and gestational age should have been left out of the analysis. Conditioning on gestational age, maternal background variables, and sex, but not on walking attainment, would give an estimate of the direct effect of being SGA on DCD combined with the effect of being SGA on DCD through walking attainment (represented by the adjusted odds ratio for being SGA in the second column of Table III). If sex on its own had been of interest as a determinant, correction for maternal background variables would not have been necessary (assuming they do not affect the sex of the child).
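The reasoning above about which variables to condition on can be illustrated with a minimal sketch. Figure 1 is not reproduced here, so the edge list below is an assumption, chosen only so that it agrees with the conditioning sets described in the text; the variable names and helper functions are likewise hypothetical. The rule used (condition on the parents of the exposure for a total effect; add the mediators and their other parents for a direct effect) is a simplification that presumes all relevant variables are measured, not a general identification algorithm.

```python
# Hypothetical arrows (cause, effect). These are assumptions standing in for
# Figure 1, which is not available here.
EDGES = [
    ("maternal_background", "SGA"),
    ("maternal_background", "gestational_age"),
    ("maternal_background", "DCD"),
    ("sex", "SGA"),
    ("sex", "walking_attainment"),
    ("sex", "DCD"),
    ("SGA", "gestational_age"),
    ("SGA", "walking_attainment"),
    ("SGA", "DCD"),  # the direct effect under examination
    ("gestational_age", "walking_attainment"),
    ("gestational_age", "DCD"),
    ("walking_attainment", "DCD"),
]


def parents(node):
    """Direct causes of a node."""
    return {a for a, b in EDGES if b == node}


def children(node):
    """Direct effects of a node."""
    return {b for a, b in EDGES if a == node}


def _reach(node, step):
    """All nodes reachable from `node` by repeatedly applying `step`."""
    seen, stack = set(), [node]
    while stack:
        for nxt in step(stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def mediators(exposure, outcome):
    """Variables lying on a directed path from exposure to outcome."""
    return _reach(exposure, children) & _reach(outcome, parents)


def total_effect_set(exposure):
    """Conditioning set for the total effect: the parents of the exposure."""
    return parents(exposure)


def direct_effect_set(exposure, outcome):
    """Conditioning set for the direct effect: additionally block every
    mediated path by conditioning on the mediators and their parents."""
    meds = mediators(exposure, outcome)
    extra = set().union(*(parents(m) for m in meds)) if meds else set()
    return (parents(exposure) | meds | extra) - {exposure, outcome}


if __name__ == "__main__":
    print("total effect of SGA:", sorted(total_effect_set("SGA")))
    print("direct effect of SGA:", sorted(direct_effect_set("SGA", "DCD")))
    print("total effect of sex:", sorted(total_effect_set("sex")))
```

Under these assumed arrows, the sets returned correspond to the three cases discussed above: the total effect of SGA, the direct effect of SGA, and sex as a determinant on its own. A different graph would, of course, yield different sets.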

Such an approach, though admittedly bold in breaking with long-standing traditions, could help interpret the observed results. It could also give rise to new research questions to be evaluated in future research, thereby contributing to the development of the research field. In our view, these new developments and the possibilities they provide for analysis and interpretation of observational data deserve to be acknowledged and explored. The important and clinically relevant findings of the current study of Faebo Larsen et al. can then be interpreted in a more concise manner.