Investigating causality between liability to ADHD and substance use, and liability to substance use and ADHD risk, using Mendelian randomization

Attention-deficit hyperactivity disorder (ADHD) has consistently been associated with substance use, but the nature of this association is not fully understood. To inform intervention development and public health messages, a vital question is whether there are causal pathways from ADHD to substance use and/or vice versa. We applied bidirectional Mendelian randomization, using summary-level data from the largest available genome-wide association studies (GWAS) on ADHD, smoking (initiation, cigarettes/day, cessation, and a compound measure of lifetime smoking), alcohol use (drinks/week, alcohol problems, and alcohol dependence), cannabis use (initiation) and coffee consumption (cups/day). Genetic variants robustly associated with the ‘exposure’ were selected as instruments and identified in the ‘outcome’ GWAS. Effect estimates from individual genetic variants were combined with inverse-variance weighted regression and five sensitivity analyses (weighted median, weighted mode, MR-Egger, generalized summary-data-based-MR, and Steiger filtering). We found evidence that liability to ADHD increases likelihood of smoking initiation and heaviness of smoking among smokers, decreases likelihood of smoking cessation, and increases likelihood of cannabis initiation. There was weak evidence that liability to ADHD increases alcohol dependence risk, but not drinks/week or alcohol problems. In the other direction, there was weak evidence that smoking initiation increases ADHD risk, but follow-up analyses suggested a high probability of horizontal pleiotropy. There was no clear evidence of causal pathways between ADHD and coffee consumption. Our findings corroborate epidemiological evidence, suggesting causal pathways from liability to ADHD to smoking, cannabis use, and, tentatively, alcohol dependence. Further work is needed to explore the exact mechanisms mediating these causal effects.


Introduction
Individuals who have been diagnosed with attention deficit hyperactivity disorder (ADHD) are more likely to be (heavy) substance users compared to those without ADHD 1 . Around 5.9-7.1% of children and adolescents and 5.0% of adults are thought to meet the diagnostic criteria for ADHD 2 , and genetic studies support the notion that a clinical diagnosis represents the extreme end of a continuum of impulsivity and/or attention problems in the general population 3,4 . Both ADHD diagnosis and higher levels of impulsivity and attention problems are associated with higher levels of cigarette smoking 5,6 , cannabis use 7,8 , alcohol use 7,9 and caffeine consumption 10,11 . The exact nature of these associations is not fully understood, which hampers the development of evidence-based interventions and public health messages.
Several explanations have been posited as to why ADHD and substance use are correlated.
First, there are risk factors that increase susceptibility to both. These could be environmental factors that have been shown to be risk factors for ADHD and substance use, such as trauma exposure or other adverse early life events 12,13 , or these could be genetic influences with pleiotropic effects on both ADHD and substance use. Family studies have shown that ADHD and substance use are moderately to highly heritable, and indicate shared genetic risk factors 14,15 . Overlap in genetic risk has also been examined in recent genome-wide association studies (GWAS) of ADHD and substance use 4,[16][17][18][19] . Substantial genetic correlations were found for ADHD with ever versus never smoking (rg=0. 48, p=4.3e-16), number of cigarettes smoked per day (rg=0.45, p=1.1e-05), alcohol dependence (rg=0.44, p=4.2e-06), and cannabis initiation (rg=0.16, p=1.5e-04), pointing to a common neurobiological aetiology. This is consistent with research indicating that cognitive deficits such as impaired response inhibition and working memory are important features of both ADHD and substance abuse 20,21 , and that both ADHD and substance abuse can be considered forms of externalizing disorders 22 .
While a partial common neurobiological aetiology to ADHD and substance use is therefore likely, environmental and genetic correlation could also (partly) reflect causal effects of one on the other. If variable X causes variable Y, it follows that any environmental or genetic risk factor causing variable X will also be associated (indirectly) with variable Y. The current literature has mostly focused on causal pathways from ADHD to substance use, with longitudinal cohort studies showing that externalizing symptoms in early adolescence predict onset of smoking and faster progression to daily smoking, and that ADHD medication reduces early onset smoking and alleviates smoking withdrawal 5 .
For alcohol and cannabis the evidence is less clear, with some studies finding that ADHD symptoms only predict their use in girls 23 , and a recent twin study reporting no relation between ADHD symptoms and alcohol or cannabis use 14 . For caffeine, a relatively small longitudinal study (n=144) suggested reciprocal effects between caffeine consumption and ADHD symptoms during adolescence 10 .
There is tentative evidence that there may be causal effects in the other direction (i.e., substance use leading to an increase in ADHD symptoms) 24,25 . In monozygotic twin pairs discordant for smoking, the smoking twin scored higher on attention problems -a difference which only appeared after smoking was initiated 24 . For cannabis use the evidence is mixed. Low to moderate cannabis use in adolescents seems to lead to a small increase in attention and academic problems, which disappears following sustained abstinence 25 . However, there is no indication that cannabis use exacerbates ADHD-related brain alterations 26 . With regards to alcohol use, binge-pattern exposure during development has been shown to cause attention deficits in mice 27 , but there is no clear evidence for such effects in humans.
It is difficult to fully unravel the nature of the association between ADHD and substance use with observational data because of bias due to (unmeasured) confounding and reverse causality (i.e., the outcome affecting the exposure). Mendelian randomization (MR) is a method to infer causality which has recently gained much popularity. MR uses genetic variants robustly associated with an exposure variable as an instrument to test causal effects on an outcome variable 28,29 . Because genes are transmitted from parents to offspring randomly, genetic variants that are inherited for a trait (e.g., ADHD) should not be associated with confounders such as social-economic status. By using genetic variants as instrumental variables it is therefore possible to obtain less biased results.
So far, one MR study found evidence for a causal effect of alcohol use on attention problems and aggression in adolescents (but not on delinquency, anxiety, or depression) 30 . Two other studies provided evidence that genetic liability to ADHD, as well as higher extraversion, has a causal effect on smoking initiation 31,32 . A recent MR study found that liability to ADHD leads to a higher risk of cannabis initiation, but these analyses were based on summary-level data of a cannabis GWAS which has recently been updated (with a much larger sample size -n=184,765 instead of n=32,330). Moreover, potential causal effects in the reverse direction were not adequately tested given that the authors included all ADHD cases instead of just those diagnosed in adulthood 33 . Overall, existing MR studies are limited in that they have primarily tested unidirectional effects only, included a narrow focus on one specific substance use behaviour, and/or had limited statistical power.
The rationale behind MR is that the random assortment of genetic variants creates subgroups in the population which roughly mimic treatment groups from a randomized controlled trial. Outcomes are compared between individuals in the 'high genetic risk group' and those in the 'low genetic risk group' for a proposed exposure variable. The method rests on three important assumptions, namely that the genetic variants used as instruments: 1) strongly predict the exposure variable -typically, the selected variants have been genome-wide significantly associated (p<5e-08) with the exposure and replicated, 2) are independent of confounding variables, and 3) do not affect the outcome through an independent pathway, other than possible causal effects via the exposure (Figure 1a). A potential threat to MR is horizontal pleiotropy, where the genetic variant used as an instrument directly affects vulnerability to multiple phenotypes. This could lead to violation of MR assumptions 2 and 3. To assess whether MR assumptions may have been violated, we conducted various sensitivity analyses described below. We applied MR using summary level data (sometimes known as 'two-sample MR'), which uses effect estimates of genetic variants (SNPs) from large GWAS that have been performed previously. In this approach, the SNP-exposure association and the SNP-outcome association estimates are taken from two separate GWAS (Figure 1b). A major strength of this design is that it takes advantage of large, well-powered GWAS, without the need to have information on both the exposure and the outcome in one single sample. An additional assumption of this method is that the SNPs identified as instruments based on their effect estimates in the exposure GWAS also predict that exposure variable in the outcome GWAS -this cannot be directly tested. To estimate the causal effect of the exposure on the outcome, the SNP-outcome association is divided by the SNP-exposure association for each SNP. The main MR result is obtained by combining these ratios into an overall estimate of causal effect using inverse-variance weighted (IVW) fixed-effect meta-analysis (Figure 1b) 29 .  (1), the instrument is not associated with (un)measured confounders (2), and the instrument does not influence the outcome other than through the exposure (3). b) Illustration of the MR design when using summary level data and the SNP-exposure association and SNP-outcome association are taken from two separate GWASs (also known as 'two-sample MR').

Mendelian randomization versus other causally informative designs
MR is inherently different from other causally informative designs such as twin or family studies. While these methods use a priori knowledge of genetic relatedness between family members to explain variation in a particular phenotype, MR exploits directly measured genotype in (usually) unrelated individuals. Whereas twin and family studies aim to correct for genetic (and shared environmental) differences in order to infer causality, MR exploits the genetic component by using it as an instrument for causal inference. A comprehensive review comparing all methods that use genetic data to strengthen causal inference is available elsewhere 34 .
Lifetime smoking is a compound variable that captures smoking initiation, duration, heaviness and cessation, across mid-to late-adulthood. As ADHD onset is expected to occur (long) before midto late-adulthood, lifetime smoking was not appropriate to use as an exposure and was only used as an outcome. In addition, cigarettes per day and smoking cessation could not be used as exposures because the GWAS these are based on were performed in (former) smokers only. To perform an MR analysis with genetic variants for cigarettes per day and smoking cessation as instruments, the outcome GWAS (in this case ADHD) would have to be stratified on smoking status, which was not possible with the summary data we used.
When testing causal effects of liability to ADHD on substance use, summary statistics from the complete ADHD GWAS containing child, adolescent, and adult data were used. When testing causal effects of substance use on ADHD, only adult data (ADHD diagnosed >18 years) were used (n=15,548) to ensure a plausible temporal sequence of a potential causal effect (i.e., substance use cannot logically have a causal effect on ADHD diagnosed in childhood). This is crucial given that ADHD is generally a child-onset disorder and the onset of substance use is typically during adolescence or early adulthood. What we aim to test here is whether substance use causes later development of or exacerbating of ADHD symptoms (resulting in a diagnosis at adult age). There was no sample overlap of the ADHD GWAS with the smoking, alcohol and coffee GWAS. Between the ADHD and the cannabis initiation GWAS there was very minimal overlap (<3%).

Main analysis
To assess causal effects of liability to ADHD on substance use, we identified independent single nucleotide polymorphisms (SNPs) that reached genome-wide significance (p<5e-08) in the ADHD GWAS to use as genetic instruments. These same SNPs were then identified in the substance use GWAS. To assess causal effects in the other direction, we identified independent genome-wide significant SNPs in the different substance use GWAS as genetic instruments, and then identified those different sets of SNPs in the ADHD GWAS. The analyses were conducted in R, using the two-sample MR package of MR-Base, a database and analytical platform 37 .

Sensitivity analyses
Besides IVW (Figure 1b), five additional MR methods were applied. Each of these five methods can provide an unbiased estimate of the true causal effect, provided that certain assumptions are met.
While it is not possible to know which of these methods' assumptions actually hold, examining the combined results of all methods allows us to assess the robustness of a causal finding. This practice of using multiple MR methods to triangulate evidence has become increasingly important now that MR has moved beyond more biological phenotypes (e.g., LDL-cholesterol) and is increasingly used in the context of complex traits such as ADHD and substance use, were detailed knowledge of the exact biological function of the associated genes is lacking 38 .
First, we used weighted median regression, which provides an unbiased estimate of the causal effect, even if <50% of the weight of the genetic instrument comes from invalid instruments 39 . Second, we used weighted mode regression, which provides unbiased results as long as the causal effect estimate that is most common among the included SNPs comes from valid instruments and is thus consistent with the true causal effect 40 . Third, we used MR-Egger regression, which provides an unbiased estimate of the causal effect provided that the strength of the genetic instrument (association between the SNP and the exposure) does not correlate with the effect that same instrument has on the outcome. This 'InSIDE assumption' (Instrument Strength Independent of Direct Effect) is a weaker assumption than the assumption of no pleiotropy 41 . However, MR-Egger does rely on the NOME (NO Measurement Error) assumption, and if this is violated its results may be biased.
Violation of the NOME assumption can be assessed by the I 2 statistic. An I 2 value below 0.9 indicates considerable risk of bias, which may still be corrected for with MR-Egger simulation extrapolation (SIMEX). An I 2 value below 0.6 means that MR-Egger results (even with SIMEX) are unreliable. We report MR-Egger results when I 2 >0.9, report MR-Egger SIMEX results when I 2 =0.6-0.9, and do not report MR-Egger results when I 2 <0.6 42 . Fourth, we used generalised summary-data-based Mendelian randomization (GSMR) 43 . This method achieves higher statistical power than other MR methods by taking into account very low levels of linkage disequilibrium (LD) between the included SNPs. GSMR includes a filtering step which identifies and removes SNPs considered outliers based on their effect size (HEIDI-filtering). MR-Egger and GSMR were applied only when the genetic instruments contained 10 or more SNPs. Fifth, we used Steiger filtering, which computes the amount of variance each SNP explains in the exposure and in the outcome variable. In case of a true causal effect of the exposure on the outcome, a SNP used as an instrument should be more predictive of the exposure than the outcome. If not (i.e., the SNP is more predictive of the outcome than the exposure) it might imply reverse causation 44 . Steiger filtering was used to exclude all SNPs that were more predictive of the outcome than the exposure, after which MR analyses were repeated. LCV (Latent Causal Variable model) is a recent method with the potential to distinguish genetic correlation from causation 45 . While we conducted LCV analyses, we report these in the supplemental material only because: 1) we aim to explicitly test bidirectional causality, which LCV does not allow, and 2) for cigarettes smoked per day, smoking cessation, and lifetime smoking LCV is not appropriate because we intended to only use them as outcome variables, and with LCV it is not possible to indicate which trait is the exposure or outcome.
For an additional indication of the robustness of our findings we inspected the Cochran's Q statistic, which provides an estimate of heterogeneity between the effects of the individual genetic variants 46 , and performed leave-one-out analyses, repeating the IVW analysis after removing each of the SNPs one at a time 37 .

Defining strength of evidence
We did not explicitly correct for multiple testing to avoid judging the evidence based simply on an arbitrary threshold. Instead, we interpret the evidence by looking at both the effect size and statistical evidence for the main IVW result, combined with how consistent the results of the sensitivity analyses are across multiple MR methods. Due to their stricter assumptions, the sensitivity analyses have lower statistical power to identify a true causal effect. Thus, when the effect sizes of the sensitivity analyses are of similar magnitude and direction, this supports a causal interpretation, even if the statistical evidence for an individual analytical approach is weaker than in the IVW analysis.

Results
We found evidence for causal effects of liability to ADHD on smoking initiation (IVW beta=0.07, 95% CI=0.03 to 0.11, p=1.7e-05), cigarettes smoked per day (IVW beta=0.04, 95% CI=0.02 to 0.06, p=0.006), smoking cessation (IVW beta=-0.03, 95% CI=-0.05 to -0.01, p=0.005) and lifetime smoking (IVW beta=0.07, 95% CI=0.06 to 0.14, p=1.4e-07). The weighted median and weighted mode sensitivity analyses confirmed these findings, albeit with slightly weaker statistical evidence for the latter ( Table  1). For smoking initiation, MR-Egger did not show clear evidence for a causal effect, but this may have been due to a lack of statistical power 41  CI=0.00 to 0.01, p=0.089, respectively). GSMR could not be performed because there were too few SNPs (<10). Steiger filtering showed that -with the exception of one SNP in the ADHD risk to smoking initiation analysis -all SNPs were more predictive of the exposure than of the outcome. Cochran's Q statistic indicated heterogeneity of the effects of the included variants for the ADHD liability to smoking initiation and ADHD liability to lifetime smoking analyses (Supplementary Table 3; Q=34.44, p=7.5e-05 and Q=47.73, p=2.9e-07, respectively), while leave-one-out analyses gave no indication that the overall causal effect was driven by a particular SNP (Supplementary Figure 1).
There was also considerable evidence that liability to ADHD causally increases risk of cannabis use initiation (IVW OR=1.13, 95% CI=1.02 to 1.25, p=0.010). Weighted median, weighted mode and GSMR confirmed this finding, but with (slightly) weaker statistical evidence. MR-Egger was not reported due to a low I 2 value (Supplementary Table 4). Steiger filtering did not identify any SNPs more predictive of the outcome than of the exposure. There was weak evidence for heterogeneity in SNP effects for the ADHD liability to cannabis initiation analysis (Q=15.90, p=0.069). Leave-one-out analyses did not suggest any individual SNPs were driving the overall effect.
There was no clear evidence for a causal effect of liability to ADHD on alcohol drinks per week, alcohol problems or coffee consumption. While there was some weak evidence that liability to ADHD causally influences alcohol dependence (IVW OR=1.07, 95% CI=1.01 to 1.14, p=0.030), this effect was not consistent across the sensitivity analyses. However, when we repeated these analyses using alcohol intake frequency as the outcome measure in UK Biobank only -one of the cohorts included in the much larger GWAS sample the main analyses were based on -there was evidence for a causal effect reflecting increased risk (IVW beta=0.22, 95% CI=0.04 to 0.40, p=0.013, Supplemental Table 5).
This is in line with recent findings that different alcohol use behaviours can show distinct (directions of) genetic associations 47 .

< INSERT TABLE 1 (INCLUDED AT THE END OF THIS DOCUMENT) >
In the other direction, we found strong evidence for causal effects of liability to smoking initiation on ADHD risk (IVW OR=3.72, 95% CI=3.10 to 4.44, p=2.9e-51). Weighted median, weighted mode, MR-Egger, and GSMR sensitivity analyses indicated similarly strong evidence, albeit with smaller effect sizes ( Table 2). The Egger intercept did not indicate horizontal pleiotropy (intercept=0.01, 95% CI=-0.01 to 0.03, p=0.37). However, for this relationship the I 2 value was low - Table 4) -indicating that MR-Egger was not reliable. Furthermore, Steiger filtering revealed that only 265 of the 346 smoking initiation SNPs (77%) were more predictive of the exposure, smoking, than of the outcome, ADHD. When repeating the IVW and sensitivity analyses with these SNPs only, the evidence for a causal effect was still strong, but effect sizes were attenuated (Supplementary Table 6). Cochran's Q statistic provided no clear evidence for heterogeneity for the liability to smoking initiation to ADHD risk analysis (Q=373.84, p=0.14) and leave-one-out analyses did not indicate that the overall effects were driven by a single SNP. As an additional sensitivity test, we repeated the smoking initiation -ADHD analyses using ADHD symptoms in childhood only, with one of the replication samples of the original GWAS paper (<13 years; n=17,666 48 ). The degree to which smoking initiation SNPs predict ADHD childhood symptoms in such an MR analysis could reflect horizontal pleiotropy, since most individuals in this age group will not have begun to smoke yet. We found strong evidence for a causal effect (IVW beta=0.28, 95% CI=0.17 to 0.39; Supplementary Table   7) -although these effect estimates and the statistical evidence were weaker than in the original analyses that restricted to adults. Together with the results of Steiger filtering, this indicates that the increasing effect of smoking initiation on ADHD risk is, at least in part, due to horizontal pleiotropy.

(Supplementary
There was no clear evidence for a causal effect of liability to cannabis use initiation, alcohol use, or coffee consumption on ADHD risk. The results of LCV analyses indicated that smoking initiation and alcohol dependence are genetically causal for ADHD, while for all other relationships there was no clear evidence of causal effects (Supplementary Table 8).

Discussion
We find evidence, using Mendelian randomization analyses of summary-level data, for causal effects of liability to ADHD on substance use risk, such that it increases the odds of initiating smoking, smoking more cigarettes per day among smokers, and finding it more difficult to quit, as well increasing the odds of initiating cannabis use. There was some indication that liability to ADHD increases alcohol dependence risk, but evidence for that was weak. In the other direction there was weak evidence that liability to smoking initiation increases (adult) ADHD risk. There was no clear evidence of causal effects between liability to ADHD and coffee consumption.
Our findings complement and confirm a large body of observational literature suggesting that individuals diagnosed with ADHD are at a higher risk of initiating smoking, transitioning into regular smoking, and being less able to quit 5 . We also provide evidence for a causal effect of liability to ADHD on risk of cannabis use, for which the literature has so far been inconclusive 14,23 . While previous observational studies may have been biased by (unmeasured) confounding, our approach of using genetic variants as instrumental variables is more robust to confounding and reverse causality. We were not able to identify the exact mechanism of causation, but it seems plausible that higher levels of impulsivity may lead individuals with ADHD liability to try out cigarettes or cannabis without considering their possible negative consequences 5, 49 . Another potential mechanism is 'selfmedication', whereby a substance is used because of its (real or perceived) positive effects on ADHD symptomatology -even though such effects might not actually exist 50 .
Interestingly, there was also evidence for causal effects of liability to smoking on ADHD risk. This is in line with previous literature indicating that smoking can have detrimental, long-term effects on attention 24 . It has been hypothesized that nicotine inhaled through cigarette smoke can affect the developing prefrontal cortex -involved in attention and impulse control -during adolescence 51 . It is important to note, however, that the evidence we found for causal effects of smoking on ADHD risk was much less robust than it was in the other direction. First of all, we were not able to test causal effects of smoking heaviness or smoking cessation on ADHD, which would have provided more compelling evidence. Second, a considerable portion (23%) of the SNPs used as an instrument for smoking initiation were in fact more predictive of the outcome, ADHD, implying reverse causation.
There is extensive research showing that genetic influences on smoking initiation are mediated via impulsivity-related traits 5 . This was confirmed by our findings that the genetic instrument for smoking initiation also showed strong evidence for a causal effect on ADHD symptoms in children <13 years (who would not yet have started smoking). Another important point is that for the analyses of substance use to ADHD, we used adult diagnosed ADHD as the outcome. This strengthened our approach by ensuring the appropriate temporal sequence for a causal effect in this direction.
However, it might be that individuals with adult diagnosed ADHD differ from those who were diagnosed during childhood. A recent study assessed the neurodevelopmental profile of individuals diagnosed with ADHD in adulthood, and found that they did not have a typical profile of neurodevelopmental impairment 52 . Our results should therefore be replicated using other, continuous measures of ADHD symptoms in adulthood. Preferably these would be more 'proximal' measures of attention problems and impulsivity, obtained through cognitive performance tasks or (functional) brain imaging.
We found some weak evidence for a causal effect of liability to ADHD on alcohol dependence risk (based on a DSM-diagnosis), but no clear evidence for causal effects of liability to ADHD on drinks per week or alcohol problems (based on the AUDIT self-report survey). These findings are of particular interest because current evidence on the mechanisms underlying associations between ADHD and alcohol use is inconclusive 14,23 . Given the very large and powerful genetic datasets that our analyses are based on, one would expect that a strong causal effect of ADHD on alcohol use would be convincingly shown, which was not the case given the weak evidence. The fact that there was some indication of causality from ADHD liability to alcohol dependence risk, but not for the other two alcohol measures, weakens the evidence further. However, it might be that ADHD liability only affects serious manifestations of alcohol abuse -such that it is clinically diagnosed -but not self-reported consumption. Of all the included GWAS data sets included in our study, alcohol dependence was based on the smallest sample size (n=46,568), and so it would be good to attempt replication of this finding when bigger samples become available. There was no clear evidence for causal effects between liability to ADHD and coffee consumption, which would indicate that observational correlations are the result of shared risk factors rather than causality. Important strengths of this study include the very large and recent samples that the analyses are based on, the variety of different substance use phenotypes that were included, and the use of multiple sensitivity analyses that each rely on distinctly different assumptions. However, there are also limitations to consider. First, the genetic instruments used in MR may vary in their strength (i.e., the amount of variance in the exposure variable that they explain). Stronger instruments are more likely to identify a causal effect, which in theory could explain why there was reasonable evidence for causality for some relationships (e.g., smoking to ADHD), but not for others (e.g., alcohol to ADHD).
When looking at the predictive power of the instruments, the differences were modest -for ADHD all SNPs included in the instrumental variable combined explained 0.5-0.7% of the variance, for smoking initiation 2.4%, for cannabis initiation 0.2%, for alcohol drinks per week 1.1%, for alcohol problems 0.3%, for alcohol dependence 0.2%, and for coffee consumption 0.6% (the formula to compute these numbers is described elsewhere 16 ). However, the power of these instruments to pick up effects on the outcomes also depends on the sample size of the outcome samples. Second, we were not able to apply all sensitivity analyses to all the tested relationships, due to an insufficient number of robustly predictive SNPs for some of the exposures. When even larger GWAS will become available, identifying more SNPs, we will be able to examine these relationships better. Third, and a more general limitation of MR, is that we cannot correct for unmeasured familial confounding, such as 'dynastic effects' which occur when parental genotypes have a direct effect on offspring phenotypes. This could potentially be dealt with using within-family MR studies when large enough data sets become available 53 . Fourth, the nature of our study design did not allow us to assess the role of ADHD medication status, which has previously been shown to affect substance use 5 . Fifth and final, the multiple testing burden should be considered when interpreting our findings, although this would not change our conclusions substantially, given the strong statistical evidence for the main findings.
Overall, our findings add to the current literature by allowing more robust conclusions on the causal nature of associations between ADHD and substance use. We confirm previous evidence from epidemiological studies that liability to ADHD increases the odds of initiating smoking, smoking more heavily, and finding it more difficult to quit 5 . For cannabis and alcohol use, where epidemiological studies were inconsistent, we show that liability to ADHD may increase the odds of initiating cannabis use and, tentatively, of developing alcohol dependence. This suggests that addressing ADHD symptoms early on in life may not only decrease smoking initiation and progression, but also cannabis initiation and the development of alcohol dependence. To further inform preventive efforts, future work should focus on the exact mechanisms through which causal effects of liability to ADHD are mediated. One possibility would be to perform an MR analysis for the different dimensions of ADHD (attention problems vs. impulsivity-hyperactivity) separately, if and when large enough GWAS for those phenotypes become available. Another area of interest is cognitive training. Efforts have been made to test whether training cognitive functions such as inhibitory control, which is impaired in ADHD, can decrease substance use. While for several health behaviours there is evidence that stimulus-specific inhibitory control training can be effective 54 , the literature of its efficacy on smoking is still very scarce. Our finding that smoking might causally increase ADHD risk should first be replicated and followed-up with different research methods and a wider range of measures of ADHD symptoms.
Such triangulation 55 will be essential to provide conclusive evidence on this, potentially highly impactful, finding. For the relationships where there was no indication of any causal effects -liability to ADHD and alcohol consumption and coffee use -it seems that we can, tentatively, say that the best approach for prevention would be to identify shared risk factors that are modifiable, so as to decrease risk of ADHD as well as alcohol and coffee consumption.

Authors contribution
J.L.T. carried out the analyses and drafted the manuscript. K.J.H.V and T.G.R. assisted with carrying out the analyses. All of the authors assisted with interpretation of the findings, thoroughly reviewed the content of the manuscript and approved the final version. Table 1. Results of the Mendelian randomization analyses using summary level data from liability to ADHD to substance use risk including IVW estimates and four sensitivity analyses: weighted median, weighted mode, MR-Egger, and GSMR (generalized summary-data-based Mendelian randomization). Table 2. Results of the Mendelian randomization analyses using summary level data from liability to substance use to adult ADHD risk (diagnosis received after age 18) including IVW estimates and four sensitivity analyses: weighted median, weighted mode, MR-Egger, and GSMR (generalized summary-leveldata based Mendelian randomization).