Criteria for considering studies for this review
Types of studies
The ecological nature of interventions and continuous PM monitoring at sites makes non-randomised studies, such as interrupted time series, very common in the field of ambient air pollution research and policy making. Due to the complexity and range of ambient air pollution interventions, and the importance of non-randomised evidence within the field, we will consider both randomised and non-randomised studies for this review. As there is considerable debate within the methodological literature about the relevance (or not) of including non-randomised evidence in systematic reviews of complex interventions, this review will also explore the added value of including RCTs only versus RCTs plus Cochrane Effective Practice and Organisation of Care (EPOC) Group-recognised study designs versus RCTs plus EPOC-recognised study designs plus other study designs. The following study designs will therefore be eligible for inclusion:
Individually randomised trials
Controlled before-and-after studies adhering to EPOC standards (CBA-EPOC) – with at least two intervention sites and two control sites (EPOC 2013)
Interrupted time series studies adhering to EPOC standards (ITS-EPOC) – with at least three data points before and after a clearly defined intervention (in terms of content and timing) (EPOC 2013)
Controlled before-and-after studies not adhering to EPOC standards (CBA) – with fewer than two intervention and/or control sites
Uncontrolled before-and-after studies (UBA)
Interrupted time series studies not adhering to EPOC standards (ITS) – with fewer than three data points before and after a clearly defined intervention (in terms of content and timing)
Repeated cross-sectional studies (CSS) – with a clearly defined intervention (in terms of content and timing) and data collected at least once before and once after the intervention
As effects on ambient air and health may be observed after very short (days) as well as long (months or years) time periods, studies of both short and long duration will be included.
As we expect inconsistencies in naming among studies, we will be very careful not to exclude studies solely based on study design labels. For example, a cohort study, which is linked to a clearly described intervention and where effect data are collected both pre- and post-intervention, is an uncontrolled before-and-after study according to our definition and would be included.
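The design categories above can be read as a small decision rule that depends on study features rather than on the label a study gives itself. The following is a minimal sketch of that rule, not part of the protocol: the function name, the short `kind` codes and the argument names are hypothetical, and it assumes that "at least three data points before and after" means at least three before and at least three after the intervention.

```python
def classify_design(kind, intervention_sites=0, control_sites=0,
                    points_before=0, points_after=0):
    """Map study features to the design labels used in this protocol.

    `kind` is a hypothetical short code assigned by the screener:
    "rct", "cba", "uba", "its" or "css".
    """
    if kind == "rct":
        return "RCT"
    if kind == "cba":
        # EPOC standard: at least two intervention and two control sites
        if intervention_sites >= 2 and control_sites >= 2:
            return "CBA-EPOC"
        return "CBA"
    if kind == "its":
        # Assumed reading: at least three points before AND three after
        if points_before >= 3 and points_after >= 3:
            return "ITS-EPOC"
        return "ITS"
    if kind == "uba":
        return "UBA"
    if kind == "css":
        # Needs data at least once before and once after the intervention
        if points_before >= 1 and points_after >= 1:
            return "CSS"
        return None  # does not meet the protocol's CSS definition
    raise ValueError(f"unknown design code: {kind!r}")
```

For example, a study labelled a "cohort study" by its authors but coded by the screener as `"uba"` would be returned as `"UBA"` and remain eligible.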
Types of participants
Interventions to reduce ambient PM air pollution are usually intended for the general population and are of global relevance. As discussed above, concentrations at which ambient PM air pollution has been shown to affect human health are experienced by both children and adults in urban and rural settings in both developed and developing countries (Dadvand 2013; Lim 2012; WHO 2006). For this reason, we will make no exclusions with regard to age group or other participant or setting characteristics.
Types of interventions
We will categorise interventions with regard to the target PM source:
Vehicular sources: those interventions aimed at reducing ambient PM originating from vehicular sources such as automobiles or public transportation, or interventions aimed at reducing traffic and congestion but resulting in changes in ambient PM concentrations
Industrial sources: those interventions aimed at reducing ambient PM stemming from industrial and power-generating sources
Residential sources: those interventions aimed at reducing ambient PM stemming from residential heating and cooking, or those aimed at reducing indoor PM but resulting in changes in ambient PM concentrations
Multiple sources: those interventions aimed at reducing ambient PM originating from multiple sources, which could include any of the above-listed sources
Each of these interventions may comprise multiple components, including technological or infrastructural, educational, policy and regulatory, and execution may be intervention- and component-dependent (see Figure 3).
The comparison will be no intervention.
Types of outcome measures
Effects of interventions can be assessed at different stages as illustrated in Figure 2. For this review, studies which measured at least one of the following ambient air quality or human health primary or secondary outcomes will be eligible for inclusion.
Ambient air quality
There are many components of ambient air pollution, such as PM, carbon monoxide (CO), sulphur dioxide (SO2), nitrogen oxides (NOx) and ozone (O3), for which decreases in ambient concentration should theoretically be associated with improved human health (NRC 2002; WHO 2006). PM is the indicator pollutant used most broadly for monitoring, guidelines and standards, and has been shown to be associated with numerous health outcomes, and will therefore be the primary population-level outcome used to assess ambient air quality for this review.
For the above PM outcomes, studies will be eligible for inclusion if PM is measured on a per mass basis with particles smaller than PM10. Since the exact cut-off point will be different across studies, we will consider studies that measured PM10, coarse PM and PM2.5. In addition, since it is thought that combustion-related PM is more harmful to health than PM generated from other sources, we will also consider studies that focused on combustion-related indicators of PM or soot, including black carbon, black smoke, elemental carbon and absorption of PM. Soot has been recommended as an additional indicator to evaluate the health risks of PM and traffic abatement measures (Janssen 2011).
Studies with a sampling duration of less than 24 ± 2 hours will be excluded, as shorter samples cannot be considered representative of daily concentrations. As the focus of this review is on the effectiveness of interventions to reduce ambient PM concentrations, those studies measuring only indoor air pollution will not be included. While biomarker studies as proxies of dose are becoming more common, uncertainties still remain with respect to their reliability, and we are not aware of any intervention studies that have used these.
Human health response
Health responses associated with ambient air pollution, and PM in particular, include cardiovascular, respiratory and all-cause mortality, as well as acute cardiovascular and respiratory events. As ambient PM is responsible for 3.2 million annual premature deaths globally (Lim 2012), the primary human health outcomes in this review will be the following:
Also, as mortality will likely be routinely collected at the population level, assessments of mortality are less prone to bias than assessments of morbidity-related outcomes.
In addition, this review will also assess the following secondary outcomes, where available:
Ambient air quality and personal exposure
For all ambient air quality (population level) and personal exposure (individual level) secondary outcomes, as for air quality primary outcomes, 24-hour concentrations and multiples of 24-hour measurements will be included.
Human health response
In addition to their relevance to health and quality of life, we chose these endpoints since they are often studied in relation to ambient PM pollution (Rückerl 2011). We will assess lung function using volume measures including forced expiratory volume in one second (FEV1) or forced vital capacity (FVC), and flow measures including peak expiratory flow (PEF) and maximal (mid-)expiratory flow (MMEF). Respiratory events will be defined as serious respiratory symptoms, asthma attacks, wheezing or lower respiratory tract infections (LRI). Cardiovascular events will primarily be concerned with heart attack and stroke. These morbidity measures will likely be available as routinely collected data, although some, e.g. lung function, cardiovascular and respiratory events, may be measured at the individual level.
As PM interventions may also generate unintended effects of relevance to policy makers, we will attempt to document these where reported in primary studies. Some examples include:
Search methods for identification of studies
We will perform searches within the following electronic databases.
Grey literature/unpublished research/in press
HMIC (1979 to date)
WHO ICTRP (inception to date)
ClinicalTrials.gov (inception to date)
IDEAS (inception to date)
JOLIS (inception to date)
3ie impact database (inception to date)
PubMed (all-topic search for e-publications ahead of print in title and abstract)
The MEDLINE search strategy is shown in Appendix 1 and will be adapted to the above listed databases. To ensure that the appropriate studies are identified, this search strategy is designed to capture studies relevant with regard to 1) the problem (ambient PM air pollution), 2) ambient air quality and health outcomes (ambient pollutant concentrations, mortality, cardiovascular and respiratory events), 3) intervention (those interventions expected to reduce ambient PM concentrations from vehicular, industrial or residential sources) and 4) study design (this search filter returns those study designs used in epidemiological research, i.e. no toxicological, pharmaceutical or animal studies).
In addition to the above listed database searches, we will handsearch all references of included studies, and the tables of contents of Environmental Health Perspectives and Atmospheric Environment for the 12 months preceding the last search date. To further ensure that relevant published data not captured through the search and unpublished data will be identified, we will contact the review advisory group (RAG), described in detail below, to suggest relevant published and unpublished literature.
Searches will be conducted in English but we will endeavour not to exclude any studies on the basis of language, with the team being able to assess papers published in English, Dutch, German, French, Italian and Afrikaans. For papers not published in English, we will explore options for translation and assessment for inclusion. All search results will be stored in EndNote.
Data collection and analysis
Selection of studies
Following removal of duplicate studies, we will perform a multi-stage screening process. In the first stage, Jacob Burns (JB) will screen all titles, removing those which are clearly not relevant with regard to population, intervention, outcomes or study design. In the second stage, JB, Eva Rehfuess (ER) and Lisa Pfadenhauer (LP) will independently screen 100 randomly selected abstracts and discuss any disagreements to ensure a standardised screening process. One review author, JB, ER or LP, will then assess all abstracts, excluding only those which are clearly not relevant with regard to population, intervention, outcomes or study design. During these initial stages, we will take an inclusive approach to screening, with all titles and abstracts where relevance is questionable kept for the next stage of independent screening by two review authors. In the third stage, two review authors, JB, ER or LP, will independently screen all remaining abstracts. Certain details regarding study design and features are often not as well reported in non-randomised studies when compared with RCTs (Higgins 2012). If certain key criteria for inclusion cannot be ascertained from the abstract, the study will be kept for full-text screening. Disagreements between review authors will be resolved through discussion, and a third review author will be consulted where necessary.
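The calibration step in the second stage relies on a random draw of 100 abstracts. A reproducible draw could be sketched as follows; the function name, the record identifiers and the seed value are illustrative assumptions, since the protocol does not say how the sample will be generated.

```python
import random


def calibration_sample(record_ids, k=100, seed=2014):
    """Draw k records without replacement for the calibration exercise.

    A fixed seed (hypothetical value) makes the draw repeatable, so all
    screeners can be shown to have assessed the same abstracts.
    """
    ids = list(record_ids)
    rng = random.Random(seed)
    return rng.sample(ids, min(k, len(ids)))
```

Using a dedicated `random.Random` instance keeps the draw independent of any other random number use in the same session.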
Subsequently, in the final stage, two review authors, JB, ER, LP or Anke Rohwer (AR), will examine the full text of all potentially relevant studies, assessing each against a checklist of inclusion criteria, evaluating whether the study matches the target study design, population, intervention, comparison and outcome of the review. Sections one to three of the review data extraction form, a standardised form adapted from the Cochrane Public Health Group's Data Extraction and Assessment Template (see Appendix 2), comprise the checklist for inclusion. Disagreements between the two review authors will be resolved through discussion, and a third review author will be consulted where necessary. Review authors will document the reasons for exclusion at each stage of screening. We will perform screening using EndNote.
Data extraction and management
JB and one of the other review authors will extract data for sections four to seven of the data extraction form independently. Inconsistencies or disagreements between the two review authors will be resolved through discussion, and ER will be consulted where necessary. For sections eight and nine of the data extraction form, which consist of the detailed documentation of intervention and context, a single review author (JB, LP, AR or ER) will extract data.
The final agreed data extraction will be entered into RevMan 5.2 (RevMan 2012) by JB, and checked by a second review author (one of ER, LP or AR).
As considerable differences in intervention type are expected, we will focus on extracting all relevant data to describe the intervention thoroughly. We will document data regarding intervention duration, intensity, goal and level of implementation (e.g. local, regional, national, international), as well as other intervention characteristics, including economic and process measures. We will document information and effect estimates for all primary and secondary outcomes reported by the study. We will attempt to capture the complexity of the intervention by assessing the following domains developed as part of the Methodological Investigation of Cochrane reviews of Complex Interventions (MICCI) project (Simon Lewin, personal communication):
Number of discrete, active components included in the intervention compared with the control
Number of behaviours or actions of intervention recipients or participants to which the intervention is directed
Number of organisational levels targeted by the intervention
The degree of flexibility or tailoring permitted across sites or individuals in intervention implementation/application
The level of skill required by those delivering the intervention
The level of skill required for the targeted behaviour when entering the study by those receiving the intervention (consumers, professionals, planners) in order to meet the intervention's objectives
We will extract relevant contextual data, based on a context and implementation framework developed as part of the EU-funded INTEGRATE-HTA project, where available (Lisa Pfadenhauer, personal communication). This framework, shown in section nine of Appendix 2, places eight contextual domains within several interrelated levels which include the setting, community, national and international levels:
Aspects highlighted by the PROGRESS framework possibly leading to important inequality issues (e.g. place, race, occupation, gender, religion, education, socioeconomic) are also addressed through this framework.
Assessment of risk of bias in included studies
After piloting and exercises to calibrate the assessment by JB, ER and Hanna Boogaard (HB), two authors (JB, HB, ER, LP or AR) will independently assess the risk of bias of all included studies. Disagreements between two review authors will be resolved through discussion, and a third review author (ER or AvE) will be consulted where necessary.
To do so, we will employ two methods in parallel, the Cochrane 'Risk of bias' tool as used by the EPOC group (EPOC 2013) and the modified version of the Graphic Appraisal Tool for Epidemiological studies (GATE), as employed by the Public Health Excellence Centre at the UK National Institute for Health and Care Excellence (NICE 2012). Particular attention will be paid to the appropriate consideration of confounders in analysis, e.g. background mortality trends, climatic conditions. We will apply the EPOC-modified Cochrane 'Risk of bias' tool to RCTs, cluster-RCTs, ITS-EPOC and CBA-EPOC; we will apply the modified GATE tool to all study designs, thus yielding a double assessment of the four lower risk of bias study designs. For these four designs, we will compare any differences in assessment of study quality depending on the choice of tool and conduct sensitivity analyses, as necessary.
Cochrane 'Risk of bias' tool (EPOC)
The Cochrane 'Risk of bias' tool, as modified by the Cochrane EPOC group, is widely used and validated and allows for comparison across Cochrane reviews. It assesses risk of bias separately for controlled studies (RCTs, controlled clinical trials (CCTs) and CBA-EPOC) and for interrupted time series (EPOC 2013).
For controlled studies, the assessment is based on the following areas:
For ITS, the assessment is based on the following areas:
Intervention independent of other changes
Shape of intervention pre-specified
Intervention affects outcome data
Incomplete outcome data
Selective outcome reporting
Other sources of bias
For each of these areas, one of the following summary assessments is given:
Low risk of bias: plausible bias unlikely to alter the results
Unclear risk of bias: plausible bias that raises some doubt about the results
High risk of bias: plausible bias that seriously weakens confidence in the results
Modified GATE tool
We will assess randomised studies and controlled before-and-after studies with the GATE tool for quantitative intervention studies. This version of the tool, modified for assessment of public health interventions, is suitable for all intervention study designs, assessing these on a level playing field, and is therefore practical in a review such as this (Jackson 2006; Voss 2013). The GATE appraisal checklist is divided into five sections, allowing for a systematic assessment of aspects related to a study's external validity (section one), as well as internal validity (sections two to four). Section five then allows the review author to give each study an overall quality rating for both external and internal validity. Specifically, sections one to five deal with validity concerns related to the following:
We will rate risk of bias for different aspects within sections one to four as one of the following (NICE 2012):
++ Indicates that for that particular aspect of study design, the study has been designed or conducted in such a way as to minimise the risk of bias
+ Indicates that either the answer to the checklist question is not clear from the way the study is reported, or that the study may not have addressed all potential sources of bias for that particular aspect of study design
– Should be reserved for those aspects of study design in which significant sources of bias may persist
Not reported (NR): Should be reserved for those aspects in which the study under review fails to report how they have (or might have) been considered
Not applicable (NA): Should be reserved for those study design aspects that are not applicable given the study design under review (for example, allocation concealment would not be applicable for case control studies)
In section five, we will rate overall external and internal validity of a study using one of the following; in considering the internal validity of a study, we will also consider whether exposure measurements were reliable and valid.
++ All or most of the checklist criteria have been fulfilled; where they have not been fulfilled the conclusions are very unlikely to alter
+ Some of the checklist criteria have been fulfilled; where they have not been fulfilled, or are not adequately described, the conclusions are unlikely to alter
– Few or no checklist criteria have been fulfilled and the conclusions are likely or very likely to alter
We will assess interrupted time series studies, uncontrolled before-and-after studies and repeated cross-sectional studies with the GATE tool for quantitative studies reporting correlations and associations. This version of the tool is analogous to the version for intervention studies, but emphasises the selection of the exposure group and statistical control for confounding rather than intervention allocation and blinding.
Measures of treatment effect
For studies assessing a continuous outcome, including ambient air concentrations of PM, CO, SO2, NOx or O3, as well as for hospital admissions and lung function measures, we will use the mean difference to assess intervention effect. For dichotomous outcomes, including cardiovascular, respiratory or all-cause mortality and cardiovascular or respiratory events, we will use the risk ratio (RR). We will also include 95% confidence intervals (CI) for all mean difference and RR intervention effects.
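As an illustration of the two effect measures, the sketch below computes a mean difference and a risk ratio, each with a 95% CI based on the usual large-sample normal approximation. The function names and the example numbers are hypothetical; in the review itself these quantities will be calculated in RevMan.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def mean_difference(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Mean difference (group 1 minus group 2) with a 95% CI."""
    md = mean_1 - mean_2
    se = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return md, md - Z_95 * se, md + Z_95 * se


def risk_ratio(events_1, n_1, events_2, n_2):
    """Risk ratio (group 1 versus group 2) with a 95% CI on the log scale.

    Assumes at least one event in each group.
    """
    rr = (events_1 / n_1) / (events_2 / n_2)
    se_log = math.sqrt(1 / events_1 - 1 / n_1 + 1 / events_2 - 1 / n_2)
    lower = math.exp(math.log(rr) - Z_95 * se_log)
    upper = math.exp(math.log(rr) + Z_95 * se_log)
    return rr, lower, upper


# Hypothetical example: mean PM10 of 30 versus 35 micrograms per cubic metre
md, md_low, md_high = mean_difference(30.0, 10.0, 100, 35.0, 10.0, 100)
```

The risk ratio interval is computed on the log scale and back-transformed, which is why it is not symmetric around the point estimate.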
Unit of analysis issues
Where cluster trials are considered without adjustments for clustering, we will perform re-analysis, where possible, taking into account the correlated nature of within-cluster data.
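The protocol does not specify how this re-analysis will be done. One commonly used approximate approach, assumed here purely for illustration, inflates the variance by a design effect of 1 + (m - 1) × ICC, where m is the mean cluster size and ICC is the intracluster correlation coefficient. The function names and the ICC value in the example are hypothetical.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n, mean_cluster_size, icc):
    """Sample size after discounting for within-cluster correlation."""
    return n / design_effect(mean_cluster_size, icc)


def inflate_standard_error(se, mean_cluster_size, icc):
    """Standard error adjusted for clustering."""
    return se * math.sqrt(design_effect(mean_cluster_size, icc))


# Hypothetical example: 420 participants in clusters of 21, ICC of 0.05
n_eff = effective_sample_size(420, 21, 0.05)
```

With these example values the design effect is 2, so the trial carries roughly the information of 210 independently sampled participants.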
Dealing with missing data
Where missing information on study features (e.g. number of time points, randomisation details), intervention characteristics (e.g. additional components, information on intensity) or outcome data (e.g. missing values, variance measure, suspected selective outcome reporting) prevents or limits use of a study, we will contact the investigators.
Assessment of heterogeneity
We will assess issues of clinical and methodological heterogeneity through tables, with the documentation of the following relevant study-specific characteristics:
Methods: study design, group assignment, exposure assessment, outcome assessment, adjustment for confounders
Population: setting, age
Intervention: components, duration, intensity/dose, goal
Context: geographical susceptibility, baseline mortality and morbidity, baseline PM, political issues (e.g. policies), legal issues (e.g. regulations, guidelines), ethical issues
Delivery: delivery agent, organisation and structure
We will assess statistical heterogeneity graphically with a forest plot and statistically with an I2 statistic calculation. We will consider an I2 value greater than 50% to indicate substantial heterogeneity, and will consider it statistically significant if the P value for the Chi2 test is < 0.1. We will create forest plots and I2 calculations using RevMan 5.2 (RevMan 2012).
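For orientation, the I2 statistic is derived from Cochran's Q, the weighted sum of squared deviations of the study effects from the fixed-effect pooled estimate. The sketch below shows the calculation under that standard definition; the function name is hypothetical, and the P value for the Chi2 test (Q referred to a chi-squared distribution with k - 1 degrees of freedom) is left to RevMan.

```python
def q_and_i2(effects, variances):
    """Return Cochran's Q, its degrees of freedom and I2 (in percent).

    `effects` are study effect estimates (e.g. mean differences or
    log risk ratios) and `variances` their squared standard errors.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is truncated at zero when Q is smaller than its degrees of freedom
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2
```

An I2 above 50 from this calculation corresponds to the threshold for substantial heterogeneity stated above.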
Assessment of reporting biases
Where feasible, we will use a funnel plot to investigate the risk of publication bias by intervention type and outcome measure. We will visually examine the funnel plot for asymmetry. As we are likely to find fewer than 10 studies in each category, we will not be able to conduct statistical tests of asymmetry, such as Begg's and Egger's tests.
Data synthesis
For each intervention category (vehicular sources, industrial sources, residential sources and multiple sources), where two or more studies report on the same primary outcome, and for which sufficient methodological and clinical homogeneity exists, we will perform a separate meta-analysis. For studies with multiple comparison groups, we will only analyse those comparisons assessing an intervention/intervention components compared with no intervention/intervention components. We will pool each study design category (i.e. RCTs and cluster-RCTs, ITS-EPOC, CBA-EPOC, ITS, CBA, UBA, CSS) in separate meta-analyses, where pooling is possible. We will examine the following primary outcomes for pooling:
We will also explore ways to convert PM10 and coarse particles into PM2.5 estimates, with the use of previously published conversion factors (Ballester 2008). In addition, we will explore ways to convert different combustion-related PM indicators into, for example, elemental carbon estimates (Cyrys 2003; Janssen 2011). The use of a common PM2.5 indicator would allow for a greater number of PM-related outcomes to be included in a single meta-analysis. We will synthesise secondary outcomes analogously to primary outcomes and perform meta-analysis where possible and appropriate.
Due to expected differences in intervention components and complexity, setting and study population, we will implement random-effects models for all meta-analyses. We will carry out inverse-variance random-effects meta-analyses using RevMan 5.2 (RevMan 2012). Effects will be considered statistically significant with a P value less than 0.05.
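To make the pooling step concrete, the following is a minimal sketch of an inverse-variance random-effects meta-analysis. It assumes the DerSimonian-Laird estimator for the between-study variance, which the protocol does not name explicitly; the function name is hypothetical, and the actual analyses will be run in RevMan.

```python
import math


def random_effects_dl(effects, variances):
    """Inverse-variance random-effects pooling (DerSimonian-Laird, assumed).

    Returns the pooled effect, its standard error, the between-study
    variance tau2 and a two-sided P value from a normal approximation.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Method-of-moments estimate, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    p_value = math.erfc(abs(pooled / se) / math.sqrt(2.0))
    return pooled, se, tau2, p_value
```

For ratio measures such as the RR, the inputs would be log risk ratios and their variances, with the pooled result back-transformed afterwards.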
As it is likely that much of the evidence will prove too heterogeneous for statistical pooling purposes, we will conduct another form of evidence synthesis alongside meta-analysis. The harvest plot has been shown to be an effective, clear and transparent way to synthesise evidence for complex interventions (Ogilvie 2008; Turley 2013) and we will implement it in this review. Harvest plots will allow us to synthesise evidence graphically based on all study designs for the effects of the vehicular sources, industrial sources, residential sources and multiple sources intervention categories across all primary and secondary outcomes. We will develop four separate harvest plots, one for each intervention category. We will arrange studies, represented by bars, in rows with regard to outcomes, and identify them by the first three letters of the author's last name. We will illustrate the direction of effect – increasing effect, no effect, decreasing effect – in three columns. Also illustrated, by height of bar, will be appropriateness of study design. The ratings derived from and symbols used within the GATE tool (++, +, -) will represent the risk of bias for each study. We will create harvest plots in Microsoft PowerPoint.
We will also plot intervention effects on PM10 and PM2.5 reduction against WHO air quality guidelines and/or interim targets (Table 1) to explore to what extent specific interventions are effective in helping reach these targets. We will create plots of intervention effects against WHO air quality guidelines in R (R 2011).
Subgroup analysis and investigation of heterogeneity
In order to assess possible sources of heterogeneity due to, for example, intervention design or inclusion of certain more susceptible populations, we may perform subgroup analyses regarding the following issues, although we expect that lack of data will prevent us from conducting most of these:
Population characteristics: developing versus developed country, urban versus rural setting, children versus adults
Intervention characteristics: number of components, duration of intervention, goal of intervention, intensity/dose of intervention, temporary versus permanent goals, level of implementation, complexity
Delivery: delivery agent, organisation and structure, regulatory versus non-regulatory
Inequality characteristics based on the PROGRESS framework: place, race, occupation, gender, religion, education, socio-economic status
Sensitivity analysis
As part of this review of a complex public health intervention, we will assess the value of going beyond randomised evidence by including both EPOC-recognised designs (CBA-EPOC, ITS-EPOC) and non-EPOC study designs (CBA, UBA, ITS and CSS) based on meta-analyses and harvest plots. We will also examine how the choice of quality appraisal tool – i.e. EPOC-modified Cochrane 'Risk of bias' tool versus modified GATE tool – affects any conclusions about the quality of individual studies.
In the main synthesis described above, we will conduct all possible meta-analyses separately for each study design. We will narratively discuss any differences in conclusions based on (i) only randomised studies versus (ii) randomised studies and EPOC-recognised non-randomised designs versus (iii) a very broad inclusion of randomised, EPOC and non-EPOC non-randomised designs. For the second group, i.e. randomised studies and EPOC-recognised designs, we will conduct two additional sensitivity analyses. The first will only consider those studies with a low risk of bias rating based on the EPOC-modified Cochrane 'Risk of bias' tool; the second will only consider those studies with a high internal validity rating based on the modified GATE tool.
We will also create two additional sets of harvest plots including only evidence from individually and cluster-randomised trials, and including randomised studies plus CBA-EPOC and ITS-EPOC; we will compare these to the harvest plots of the main synthesis described above.
Of methodological significance will be how the conclusions to be drawn from the review change based on the inclusion of higher risk of bias evidence. Moreover, it will be of interest how the additional information gained from inclusion of these broader study designs compares to the increased time, workload and expertise needed at the screening, data extraction, quality appraisal and evidence synthesis stages.
Quality of evidence
In order to assess the quality of the body of evidence used in the above-listed data syntheses, we will use the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system for grading evidence for each of the seven primary outcomes described above (Guyatt 2008). GRADE will allow us to systematically and transparently grade the quality of the body of evidence for each outcome based on the following factors:
Based on these criteria, we will grade each outcome grouping as one of the following:
High quality – further research is very unlikely to change our confidence in the estimate of effect
Moderate quality – further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate
Low quality – further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate
Very low quality – any estimate of effect is uncertain
We will create a 'Summary of findings' table to summarise this assessment.
Review Advisory Group
The protocol draft was sent to the individuals listed below in Table 3, who formed the membership of our Review Advisory Group (RAG). The RAG is made up of ambient particulate matter pollution and intervention experts as well as potential end-users of the review, who all provided feedback to ensure the review will meet its intended goal of assessing the effectiveness of ambient PM interventions in a systematic and comprehensive way and that the review will appropriately inform policy.
| Advisor | Organisation |
|---|---|
| Ambient PM policy advisers | |
| Martin Lutz | Senate Department for Urban Development and the Environment, Berlin, Germany |
| Leendert van Bree | PBL Netherlands Environmental Assessment Agency |
| Carlos Dora | WHO Department of Public Health and Environment, Geneva |
| Marie Neira | WHO Department of Public Health and Environment, Geneva |
| Marie-Eve Heroux | WHO European Center for Environment and Health, Bonn |
| Bryan Hubbell | US Environmental Protection Agency |
| Ambient PM experts | |
| Bert Brunekreef | Institute for Risk Assessment, Utrecht University, Netherlands |
| C Arden Pope III | Brigham Young University, US |
| Nino Kuenzli | Swiss Tropical and Public Health Institute |
| Wei Huang | Peking University Health Science Center, Beijing, China |
Table 3: Membership of Review Advisory Group