Does integrated academic and health education prevent substance use? Systematic review and meta‐analyses

BACKGROUND
Prevention of substance (alcohol, tobacco, illegal/legal drug) use in adolescents is a public health priority. As the scope for school-based health education is constrained in school timetables, interventions integrating academic and health education have gained traction in the UK and elsewhere, though evidence for their effectiveness remains unclear. We sought to synthesize the effectiveness of interventions integrating academic and health education for the prevention of substance use.


METHODS
We searched 19 databases between November and December 2015, among other methods. We included randomized trials of interventions integrating academic and health education targeting school students aged 4-18 and reporting substance use outcomes. We excluded interventions for specific health-related subpopulations (e.g., children with behavioural difficulties). Data were extracted independently in duplicate. Outcomes were synthesized by school key stage (KS) using multilevel meta-analyses, for substance use, overall and by type.


RESULTS
We identified 7 trials reporting substance use. Interventions reduced substance use generally in years 7-9 (KS3) based on 5 evaluations (d = -0.09, 95% CI [-0.17, -0.01], I² = 35%), as well as in years 10-11 (KS4) based on 3 evaluations (d = -0.06, 95% CI [-0.09, -0.02], I² = 0%). Interventions were broadly effective for reducing specific alcohol, tobacco, and drug use in both KS groups.


CONCLUSIONS
Evidence quality was highly variable. Findings for years 3-6 and 12-13 could not be meta-analysed, and we could not assess publication bias. Interventions appear to have a small but significant effect reducing substance use. Specific methods of integrating academic and health education remain poorly understood.

Existing systematic reviews suggest that school curriculum-based health interventions can reduce alcohol consumption (Foxcroft & Tsertsvadze, 2012), tobacco smoking (Thomas & Perera, 2006), and drug use (Faggiano et al., 2005). Given an increasing focus in state-provided education on "accountability" and performance metrics for core academic subjects, health education, including that relating to the prevention of substance use, is increasingly difficult to deliver within busy school timetables (PSHE Association, 2013). In this context, many schools deliver health education in other subjects, integrating it with academic learning (Formby et al., 2011). This may be an appropriate modality of delivery even in schools with time dedicated to health education. That is, even without the marginalization of health education, this approach may be more effective because it could allow for larger doses (Formby et al., 2011; Pearson et al., 2015); it may be less prone to student resistance to health messages (Kupersmidt, Scull, & Benson, 2012); and it may enable synergy and reinforcement between lessons provided in different subjects (Bier, Zwarun, & Warren, 2011). On the other hand, integration may lead to delivery by school staff who are less qualified, confident, or willing to address specific health issues, who may not be qualified teachers, or who offer only a cursory treatment of health topics. Existing UK interventions aiming to integrate health and academic education (The British Heart Foundation, 2014; Wright & Ainsworth, 2008) have not been informed by existing theory or evidence, which may limit their effectiveness. For example, the British Heart Foundation's "Money to Burn" and the Ariel Trust's "Plastered" interventions incorporate education about the risks, respectively, of smoking and drinking into mathematics lessons.
Money to Burn includes activities such as calculating how much a smoker would typically spend on cigarettes, whereas Plastered includes activities calculating units of alcohol consumed and presenting statistics on attitudes to alcohol, as a basis for whole-class discussions about the harmful consequences of alcohol consumption. An additional question arising in school-based interventions is at which developmental period these interventions might be most effective. In England, key stages (KS) describe each age-related phase of schooling. KS2 includes school years 3-6 (age 7-11 years), KS3 includes years 7 to 9 (age 11-14 years), KS4 includes years 10 and 11 (age 14-16 years), and KS5 includes years 12 and 13 (age 16-18 years).
No systematic review has examined the effectiveness of interventions integrating health and academic education. Existing reviews relating to school-based interventions (Faggiano et al., 2005; Farrington & Ttofi, 2010; Foxcroft & Tsertsvadze, 2012; Hahn et al., 2007; Thomas & Perera, 2006; Vreeman & Carroll, 2007), some of which are now quite old, largely focus on interventions in specific health education lessons or their international equivalents. Because such lessons are increasingly being squeezed out of school timetables, there is an urgent need for a systematic review to synthesize evidence for interventions that integrate academic and health education by developmental stage, and our review meets this need with a focus on substance use.

| METHODS
A protocol for this systematic review and meta-analyses was registered on PROSPERO (https://www.crd.york.ac.uk/prospero/), as CRD42015026464. The work reported in this paper was part of a larger evidence synthesis project looking at various aspects of interventions integrating academic and health education.
We included randomized controlled trials of school-based health curriculum interventions integrating health and academic education, where a majority of participants were school students aged 4-18 years.
Academic education was defined as education in specific academic subjects, literacy, numeracy, or study skills. Operationalizing "integration" was challenging. We judged that to be included, study reports must have been explicit that the intervention aimed to integrate health and academic education, either in the form of health education being woven into one or more existing mainstream school subjects (such as the example presented in Section 1 where education on smoking and alcohol was woven into existing timetabled mathematics lessons) or of distinctive health education lessons also including academic content (for example, a social and emotional skills curriculum aiming to prevent violence which also included education aiming to improve literacy or study skills). That is, interventions could either incorporate health education into academic lessons or feature health education lessons that also included academic education. In the element of the review reported here, we included evaluations measuring outcomes relating to substance use: tobacco smoking, alcohol use, or legal or illegal drug use. We did not restrict our focus to interventions seeking to alter substance use as their only or main objective because our preliminary scoping work indicated that many interventions integrating academic and health education may take a "whole-child" approach, where substance use is one outcome of interest out of several. We did not restrict our studies by date of research or publication, or language, nor did we restrict our searches.

Key messages
• No systematic review has examined the effectiveness of interventions integrating health and academic education, despite growing policy interest.
• Interventions were broadly effective for reducing substance use in key stages 3 and 4, with more and stronger evidence in key stage 3.
• Effect sizes were small, but of public health significance.
• Although this intervention model is promising, it is unlikely to be enough to reduce inequalities in child and adolescent substance use.

Because our interest was in mainstream education, we excluded interventions that targeted specific subpopulations defined in terms of health outcomes such as children with autism, children with learning disabilities, or children with known behavioural problems. We also excluded interventions which were delivered in mainstream subject lessons without any attempt at integrating health and academic education; trained teachers in classroom management without student curriculum components; or were delivered exclusively outside of classrooms. We did not include outcomes relating to attitudes or knowledge, preferring instead estimates of actual use.
We undertook searches for this review between November 18 and December 22, 2015. We searched 19 electronic bibliographic databases and 32 websites and contacted subject experts. A full description of electronic searches undertaken can be found in Online File 1.
An exclusion criteria worksheet, informed by our inclusion criteria and with guidance notes, was prepared and piloted by four reviewers (GJMT, TT, AF, CB), who screened 50 references in pairs on title and abstracts until achieving a 90% agreement rate, after which single reviewers each screened discrete subsets of the full set of references.
After pairs of reviewers screened each set of abstracts in the pilot screening phase, they discussed decisions. This process was invaluable because, despite being guided by clear inclusion criteria, decisions about whether an intervention did or did not aim to integrate health and academic education were not always easy. The definition of integration, in terms of full integration of health content into existing academic lessons or partial integration whereby new health lessons also included some academic content, was also helpful in making clear decisions about which studies to include or exclude. Full reports were obtained for those references not excludable based on title and abstract. Again, reviewers piloted the procedure for screening full reports working in pairs screening 50 reports and discussed any differences until a 90% agreement rate was reached. CB reviewed all studies identified as potentially included in the review as a final check to determine inclusion.
We extracted data using a modified version of an existing tool (Peersman, Oliver, & Oakley, 1997) including items on study location, intervention description, description of integration, intervention development, timing of intervention and evaluation, target population description, provider and organization characteristics, research questions or hypotheses, sampling methods and sample size at baseline and follow-up, sociodemographic characteristics of participants at baseline and any follow-ups, and data collection and analysis. Trials were assessed for risk of bias using a modified version of questions suggested in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins & Green, 2011). Data from all studies were extracted and appraised by two reviewers in parallel and independently, before meeting to agree details.
Effect sizes from included study reports concerning substance use (tobacco smoking, alcohol, or drugs) as defined in the protocol were extracted into a Microsoft Excel spreadsheet and converted into standardized mean differences (Cohen's d) using all available information as presented for each study. Because all evaluations were cluster randomized trials, some baseline imbalance on individual participant characteristics was likely. Thus, we used effect estimates adjusted for covariates when these were presented alongside unadjusted estimates. Negative effect sizes indicate a reduction in substance use.
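The text does not state which conversion formulas were applied. As an illustration only, the sketch below uses one commonly used conversion for binary outcomes, the logit method, in which d = ln(OR) × √3 / π, and recovers a standard error from a reported 95% confidence interval. The function names and the input values are hypothetical and are not taken from any included trial.

```python
import math

# Constant for the logit method of re-expressing a log odds ratio as d.
LOGIT_TO_D = math.sqrt(3) / math.pi


def odds_ratio_to_d(odds_ratio: float) -> float:
    """Convert an odds ratio to a standardized mean difference (logit method)."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.log(odds_ratio) * LOGIT_TO_D


def se_log_or_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Recover the standard error of ln(OR) from a reported 95% CI."""
    if lower <= 0 or upper <= lower:
        raise ValueError("need 0 < lower < upper")
    return (math.log(upper) - math.log(lower)) / (2 * z)


def or_ci_to_d(odds_ratio: float, lower: float, upper: float) -> tuple[float, float]:
    """Return (d, SE of d) for an odds ratio reported with a 95% CI."""
    d = odds_ratio_to_d(odds_ratio)
    se_d = se_log_or_from_ci(lower, upper) * LOGIT_TO_D
    return d, se_d


# Hypothetical input values, chosen only to show the sign convention:
# an odds ratio below 1 yields a negative d, i.e. a reduction in use.
d, se_d = or_ci_to_d(0.80, 0.65, 0.98)
print(f"d = {d:.3f}, SE = {se_d:.3f}")
```

An odds ratio of exactly 1 maps to d = 0, so the sign convention described above (negative values indicate a reduction in substance use) carries over directly.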
Most studies reported several substance use outcomes at several measurement time-points. As indicated in the protocol, we used multilevel meta-analysis as set out by Cheung (2014) and van den Noortgate, López-López, Marín-Martínez, and Sánchez-Meca (2015) with random effects at both the outcome and study level. Multilevel meta-analysis accounts for dependencies between outcomes from the same study by partitioning the variance (tau-squared) between outcomes into a within-study level and a between-study level. The final effect size estimate includes all information that the multiple effect size estimates contribute while correcting for the nonindependence of multiple effect size estimates from each study, and presents a between-studies estimate of I² for heterogeneity. Because outcomes were measured at different points in students' developmental trajectory, we created a "matrix" of KS against type of outcome. We then meta-analysed findings within each cell of the matrix where appropriate, for example, substance use for students in KS3. We considered omnibus outcomes (e.g., count of substances used, any substance use), alcohol outcomes, tobacco smoking outcomes, and drug use outcomes separately and then as a combined model to examine the global impact of these interventions on substance use.
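As a sketch of the variance partition described above, a three-level random-effects model of the kind set out by Cheung (2014) can be written as follows. The notation is assumed here for illustration and is not taken from the paper: y_ij is the i-th effect size from study j, μ is the pooled effect, and s̃² stands for a "typical" within-outcome sampling variance.

```latex
% Three-level random-effects model (illustrative notation, not the authors' own)
% i indexes outcomes (effect sizes) nested within study j
\begin{align}
  y_{ij} &= \mu + u_{j} + v_{ij} + e_{ij}, \\
  u_{j} &\sim N\!\left(0, \tau^{2}_{\text{between}}\right), \quad
  v_{ij} \sim N\!\left(0, \tau^{2}_{\text{within}}\right), \quad
  e_{ij} \sim N\!\left(0, s^{2}_{ij}\right), \\
  I^{2}_{\text{between}} &=
    \frac{\tau^{2}_{\text{between}}}
         {\tau^{2}_{\text{between}} + \tau^{2}_{\text{within}} + \tilde{s}^{2}}
\end{align}
```

Under this formulation, the between-studies I² is the share of total variation attributable to differences between studies rather than to differences between outcomes within a study or to sampling error.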
Were we to have had 10 or more studies in any comparison, we would have used funnel plots to investigate the presence of small-study bias.
The funder of the study had no role in study design, data collection, data analysis, data interpretation, or reporting. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.

| RESULTS
A total of 78,451 references were identified from the searches. Of these, 1,472 (2%) were identified as duplicates. The remaining 76,979 references were screened on title and abstract and, of these, 76,277 (99%) were excluded using the criteria listed in the protocol.
Of the 702 remaining reports, 12 were unobtainable and 690 full-text reports were screened, of which 628 were excluded. Of the remaining 62 reports of 16 different evaluations included in the overall project, seven outcome evaluations reported across 12 papers specifically addressed substance use and are reviewed here; the remaining papers reported other outcomes relevant to the larger review. See Figure 1 for the review flowchart. In total, the included studies represent data from 89 schools and approximately 7,013 students. Trial-level average cluster sizes ranged from 36 students to 103 students, with a median of 83 students across trials.
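The screening counts reported above can be checked with simple arithmetic. The short sketch below uses only the figures given in the text; the variable names are illustrative.

```python
# Counts as reported in the text.
identified = 78_451
duplicates = 1_472
excluded_on_title_abstract = 76_277
unobtainable = 12
excluded_on_full_text = 628

# Each stage subtracts the records removed at that stage.
screened_on_title_abstract = identified - duplicates
remaining_reports = screened_on_title_abstract - excluded_on_title_abstract
full_text_screened = remaining_reports - unobtainable
included_reports = full_text_screened - excluded_on_full_text

# Percentages quoted alongside the counts.
pct_duplicates = 100 * duplicates / identified
pct_excluded = 100 * excluded_on_title_abstract / screened_on_title_abstract

print(screened_on_title_abstract, remaining_reports,
      full_text_screened, included_reports)
print(f"{pct_duplicates:.1f}% duplicates, {pct_excluded:.1f}% excluded")
```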

| Included studies
Included studies are described in Table 1. All included studies were randomized controlled trials. In each trial, the comparator was treatment as usual. All trials randomized at the school level. One evaluation was conducted in the UK (Kids, Adults Together, KAT [Segrott et al., 2015]) and one in Australia (Gatehouse), whereas all others were conducted in the United States.
Trials varied in length of follow-up time. One intervention, KAT (Segrott et al., 2015), only evaluated outcomes in the same period as the intervention was administered, whereas two evaluations followed up participants for 3 years (Gatehouse; Infused Life Skills Training, I-LST [Smith, Swisher, Vicary, & Bechtel, 2004]). The remaining four evaluations included follow-ups between 5 and 8 years from baseline. Study populations in most evaluations drew from different year cohorts at baseline. Students in the trials of KAT and Linking the Interests of Families and Teachers (LIFT) were largely drawn from years 5 and 6. The Hawaii evaluation of Positive Action (Beets et al., 2009) included students in years 2 or 3 at baseline, as did the evaluation of Raising Healthy Children (RHC; Catalano et al., 2003). Unusually, the evaluation of Gatehouse (Patton et al., 2006) [...]

| Quality of included studies
Quality was variable between studies. For many of the items, studies did not report sufficient information to enable judgment. Ratings by item are reported in Table 2. Three evaluations, KAT (Segrott et al., 2015), LIFT (DeGarmo, Eddy, Reid, & Fetrow, 2009; Eddy, Reid, & Fetrow, 2000; Reid, Eddy, Fetrow, & Stoolmiller, 1999), and Positive Action Chicago (Lewis, 2012; Lewis et al., 2012; Lewis et al., 2013; Li et al., 2011), all presented enough information to appraise these evaluations as having a low risk of bias in randomization sequence generation. Positive Action Chicago used random number generators, whereas LIFT drew allocations from a hat (Reid et al., 1999). KAT used optimal allocation to determine the randomization sequence (Segrott et al., 2015). The remaining four evaluations had unclear risk of bias. None of the included evaluations stated if or how allocation was concealed, thus preventing a determination as to risk of bias in this domain. Although blinding is often difficult in trials of school interventions, blinding of outcome assessors is often possible and may reduce ascertainment bias. Only one evaluation, LIFT (DeGarmo et al., 2009; Reid et al., 1999), provided enough information to judge whether any blinding occurred. In this evaluation, outcome assessors were blind to allocation. All evaluations were judged as having relatively low or balanced attrition where this was relevant to outcome assessment (complete outcome data domain). Trial protocols were rarely available, thus precluding determinations as to risk of bias on selective reporting. All evaluations except for one (RHC) reported in text methods to account for clustering, and all evaluations took steps to reduce other forms of bias. In two evaluations (LIFT and KAT), it was unclear if a suitable control group had been recruited.

| Substance use in KS2
Of the three evaluations presenting substance use outcomes for KS2, one evaluation, Positive Action Chicago (Li et al., 2011), presented an omnibus outcome comparing counts of substances used. Both KAT (Segrott et al., 2015) and Positive Action Hawaii (Beets et al., 2009) presented outcomes relating to alcohol use. In addition, Positive Action Hawaii (Beets et al., 2009) presented outcomes relating to tobacco smoking and drug use. We did not undertake a meta-analysis for substance use outcomes in KS2.
At the end of the third intervention year (corresponding to year 6), intervention students in the Positive Action Chicago trial (Li et al., 2011) had a lower count of types of substance use compared to control students (IRR = 0.69, 95% CI [0.50, 0.97]). In the Positive Action Hawaii trial (Beets et al., 2009) [...] intervention year and included students in years 5 and 6, were inconsistent and had wide confidence intervals, though this is not unexpected as this was a relatively small pilot trial (Segrott et al., 2015).
Intervention students were more likely, but not significantly so, to have been drunk in the last 30 days (OR = 1.5, 95% CI [0.4, 5.8]) and to have ever been drunk (1.7, [0.5, 6.8]) and significantly more likely to have ever had an alcoholic drink (5.3, [1.2, 23.9]). However, intervention students were less likely, but not significantly so, to have had a drink in the last 30 days (0.7, [0.2, 2.5]).

| Substance use in KS3
Five evaluations reported outcomes relating to substance use in KS3: Gatehouse (Patton et al., 2006), LIFT, I-LST (Smith et al., 2004), Positive Action Chicago, and RHC (Brown, Catalano, Fleming, Haggerty, & Abbott, 2005). Of these, Gatehouse (Patton et al., 2006) and Positive Action Chicago reported omnibus substance use outcomes. Alcohol, tobacco smoking, and drug use outcomes were reported by the same five evaluations: Gatehouse (Patton et al., 2006), LIFT, I-LST (Smith et al., 2004), Positive Action Chicago, and RHC (Brown et al., 2005). We undertook separate meta-analyses for alcohol use, tobacco smoking, and drug use (including marijuana separately), and for all substance use outcomes. Findings from meta-analyses are summarized in Table 3. Unlike the meta-analysis of alcohol outcomes, meta-analyses for tobacco smoking and drug use all had negligible between-studies heterogeneity (I² = 0%). We did not undertake meta-analysis for omnibus substance use outcomes alone given that only two evaluations would have been included. The two evaluations measuring omnibus substance use outcomes in KS3 related to different interventions and fol- [...] (Patton et al., 2006). However, in Positive Action Chicago, intervention participants in year 9 used fewer [...]

| Substance use in KS4
Three evaluations reported outcomes relating to substance use in KS4: Gatehouse, I-LST (Vicary et al., 2006), and RHC (Brown et al., 2005). All outcomes reported related to alcohol use, tobacco smoking, and drug use; that is, no included evaluations reported omnibus substance use outcomes in KS4. We undertook separate meta-analyses for alcohol use, smoking, and drug use (namely, marijuana use). We also undertook an overall meta-analysis of substance use outcomes in KS4. Findings from these meta-analyses are summarized in Table 3.

| DISCUSSION
In this systematic review and meta-analysis, quality of evidence was highly variable. There was some evidence from meta-analyses that interventions were broadly effective for reducing substance use in KS3 and KS4, with more and stronger evidence in KS3. Effect sizes were small but of potential public health significance. For both estimates of global effects of these interventions, findings can be interpreted to mean that about 53% of people receiving the intervention will have lower levels of substance use than people in the control group. Although this is a small individual-level effect, at the population level, the cumulative burden of ill health avoided may be large. Our findings regarding the specific developmental periods in which interventions are effective are especially interesting given trajectories of harmful substance use, where early prevention of substance use could decrease the risk of harmful patterns and sequelae of drug use in later adolescence and adulthood.
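The text does not say how the "about 53%" figure was derived from the pooled effect sizes. As a hedged illustration, the sketch below computes two common translations of a standardized mean difference into a percentage, assuming normally distributed outcomes with equal variances: the probability of superiority, Φ(|d|/√2), and Cohen's U3, Φ(|d|). The function names are hypothetical; the two d values are the pooled KS3 and KS4 estimates reported in the abstract.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def probability_of_superiority(d: float) -> float:
    """Chance that a randomly chosen intervention participant has a lower
    level of use than a randomly chosen control participant."""
    return normal_cdf(abs(d) / math.sqrt(2.0))


def cohens_u3(d: float) -> float:
    """Share of the intervention group falling below the control-group mean."""
    return normal_cdf(abs(d))


# Pooled estimates reported in the abstract (negative = reduction in use).
for label, d in [("KS3", -0.09), ("KS4", -0.06)]:
    ps = probability_of_superiority(d)
    u3 = cohens_u3(d)
    print(f"{label}: d = {d:+.2f}, "
          f"probability of superiority = {ps:.1%}, U3 = {u3:.1%}")
```

Under these assumptions, both translations give values within a point or two of the 53% quoted in the text for either key stage, which is consistent with the description of the effect as small at the individual level.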
Our findings reflect the positive findings from previous systematic reviews on substance use prevention in schools and extend this to interventions integrating health and academic education, suggesting that these have the potential to achieve significant population-level health benefits if delivered at scale. A landmark review on alcohol use prevention in schools did not pool study estimates but authors concluded that interventions had the potential to be effective (Foxcroft & Tsertsvadze, 2012). Our meta-analyses found specific evidence of effectiveness in KS3 but not KS4. Similarly, a systematic review of school-based tobacco prevention programmes found evidence for interventions preventing initiation of smoking (Thomas & Perera, 2006). Although our analysis combined all smoking outcomes, we were able to find evidence for effectiveness in reducing smoking in KS4. Again, the magnitude of effects was small but of potential public health significance at a population level. A recent systematic review on peer-led interventions to reduce alcohol, tobacco, and drug use in young people, while focused on ages 11 to 21, similarly found significant reductions in these outcomes, adding to the plausibility of our findings (MacArthur, Harrison, Caldwell, Hickman, & Campbell, 2016). Finally, a major systematic review on illicit drug use prevention in schools found "small but consistent" effects (Faggiano et al., 2005); like in our meta-analyses, review authors predominantly included marijuana use outcomes. Our findings reflect theirs in sum and substance, but we were able to locate effects more specifically as occurring in KS3 and KS4, with weaker evidence supporting effectiveness in KS2.
Our interest was in interventions that integrate academic and health education. Most interventions reviewed undertook integration as a practical way to introduce health materials into a school day, with very few acknowledging integration as central to intervention theory of change. This made searching for this evidence especially challenging. It is likely that we missed some potentially relevant studies, as integration may not have been apparent in titles or abstracts. Determining whether studies found in our searches should be included or not was also sometimes difficult. All study reports of included studies explicitly described some form of integration between health and academic content, but the extent and clarity of this description varied enormously. It is possible that we excluded studies of interventions where in practice health education was sometimes or always delivered in academic lessons but where this was not mentioned as part of the design of the intervention in study reports, or where interventions introduced new health lessons some of which included academic learning objectives, but where this level of detail was omitted from study reports. Despite this risk of excluding potentially relevant studies and thereby reducing the sensitivity of our screening, we judged that restricting inclusion to studies of interventions where integration was explicitly described as part of the design of the intervention was the only way to ensure the review had a specificity of focus and avoided an unworkable situation where screening decisions required detailed examination of intervention materials or lengthy discussion with authors. We were also unable to assess publication bias. Anecdotally, reviewers also noted that there were a considerable number of quasi-experimental studies that may have provided useful insights but were excluded due to our inclusion only of RCTs. 
Inclusion of nonrandomized trials may have also been beneficial here or would warrant further study in the future. Our "matrix"-based meta-analysis method had both strengths and limitations. One strength is that it could examine effects in specific developmental phases of relevance to educators and intervention implementers. However, one limitation is that it involved combining estimates across a diversity of follow-up times. The few studies included in each meta-analysis precluded examination of heterogeneity by follow-up time.
In terms of policy, this intervention type presents great promise both as a means of addressing health in school systems that are overwhelmingly focused on academic attainment and in which health education is increasingly being squeezed out of school timetables, and as a means of enhancing existing provision where dedicated time for health education still exists. It is possible that these interventions may work in part by improving health-related knowledge and skills and also by improving school engagement through the enhancement of relationships between students and teachers and by breaking down boundaries between different subjects and between classrooms and the wider school environment. Interventions such as Positive Action that aimed to improve such relationships yielded a consistent pattern in achieving reductions in substance use.
However, our findings also suggest that integrative interventions, although attractive as a means to deliver health, social, and emotional learning in the context of school systems overwhelmingly focused on educational attainment, are not necessarily a panacea. Study quality varied and was often unclear, and more evaluations are needed across the range of key stages to better understand the effectiveness of these interventions in preventing substance use. In addition, evaluators should make an effort to understand subgroup effects more specifically. Although some evaluations (namely, the evaluations of Positive Action) considered subgroup effects, more careful attention to moderation by school-level characteristics could yield useful knowledge about context-intervention fit.

FUNDING AND CONFLICT OF INTERESTS
Authors have no conflicts of interest or financial interests to disclose.