PROTOCOL: Police programs that seek to increase community connectedness for reducing violent extremism behaviour, attitudes and beliefs

Abstract

Community engagement and connectedness are identified as potential mitigating factors for those at risk of engaging in violent extremism. Police have a critical role in promoting social inclusion and social connectedness and thereby preventing violent extremism. Thus, it is essential to understand the effectiveness of policing programs aimed at promoting community connectedness and their impact on reducing violent extremism. To date, there has been no systematic synthesis of the evaluation evidence for these policing approaches and their impact on violent extremism. This is the protocol for a review that will include any policing intervention that aims to promote community connectedness. The present proposed review is necessary to ascertain whether policing interventions that seek to promote community connectedness are effective for reducing violent extremism behaviour, attitudes and beliefs.

Increasingly, police are working with a range of different agencies to actively engage the community, with the aim of reducing social isolation, improving economic opportunity and creating social and cultural norms that prevent violent extremism (Schanzer et al., 2018).
Yet it is unclear whether or not the range of police initiatives that foster community connectedness are able to reduce violent extremism. Thus, it is essential to understand the effectiveness of policing programs aimed at promoting community connectedness and their impact on reducing violent extremism.

| The intervention
This review will include any policing intervention that aims to promote community connectedness, which is defined by the presence of two components. First, the intervention must have a policing focus, defined as some kind of a strategy, technique, approach, activity, campaign, training, program, directive or funding/organisational change that involves police in some way (other agencies or organisations can be involved; Higginson, Eggins, Mazerolle, & Stanko, 2015). Police involvement is broadly defined as the following.
• Police initiation, development or leadership.
• Police are recipients of the intervention or the intervention is related, focused or targeted to police practices.
• Delivery or implementation of the intervention by police.
Second, the policing intervention must aim to promote community connectedness. For the purposes of this review, we define the promotion of community connectedness to mean an intention to increase prosocial linkages or prosocial ties between either community members themselves, community members and police, or community members and people in businesses, houses of worship, schools or any other community-based organisation. Other terminology that may be used to represent connectedness in the literature includes the following (Thomas, 2019).
• Promotion of common values, norms and/or reciprocity.
• Promotion of social networks, collective efficacy, social cohesion or social capital.
• Promotion of shared problem-solving or citizen engagement.
We anticipate that policing interventions aiming to promote community connectedness will likely overlap with initiatives labelled community policing (see Gill et al., 2014) or other policing approaches that often aim to enhance community connectedness (e.g., neighbourhood policing or legitimacy approaches). However, we specifically define a policing intervention that aims to promote community connectedness as being characterised by community consultation, partnership or collaboration with citizens and/or organisational entities. Specific strategies may include the following.
• Community meetings or forums.
• Developing partnerships with specific organisations (Fox, 2012).
• Police liaison programs involving community members.
• Police work with community leaders to enhance personal skills (e.g., self-identity, self-awareness and resilience), employment skills (e.g., teamwork and self-awareness) or leadership skills (Thomas, 2019).
• Routine police work (such as beat policing, foot patrols and community intelligence initiatives) that explicitly seeks to promote community connectedness.
• Specific initiatives, such as neighbourhood policing teams, that seek to promote community connectedness.
• Police legitimacy enhancing programs that seek to promote a sense of belonging and inclusion within local communities.

| How the intervention might work
Police programs that seek to reduce violent extremist behaviours and beliefs through improving community connectedness aim to generate an impact by promoting an increased sense of prosocial belonging and inclusion amongst at-risk groups. This causal pathway is underpinned by key perspectives in the literature, which argue that how people are treated by institutional authorities, such as police, has an impact on their sense of identity and belonging by making them feel accepted by broader society.
Social identity theory and the group value model (see Tyler & Lind, 1992) demonstrate that the ways that police engage with citizens will differentially affect the way that people perceive the police and thereby their willingness to comply with directives (see also Bradford, Murphy, & Jackson, 2014; Huo, Smith, Tyler, & Lind, 1996). Some research shows that procedural fairness is more important for those on the margins (De Cremer & Sedikides, 2005; Murphy, 2013). Other research finds that the procedural justice, social identity and legitimacy pathway is present amongst both those with high and those with low group identifications (Bradford et al., 2014). Outcomes of police initiatives that build a sense of belonging and inclusion are, therefore, assumed to act as protective factors against radicalisation by ensuring individuals are not influenced by the messaging and grievance narratives that violent extremists use to attract support. We acknowledge, however, Nagin and Telep's (2017) review of the evidence challenging the causal relationship between perceptions of procedurally just treatment of citizens by agents of the criminal justice system and perceptions of police legitimacy.
the Muslim community-has been shown to influence the degree to which people are willing to partner with police to tackle terrorism (Cherney & Murphy, 2017). Police are seen as representatives of the state, and when they work with community groups, this helps build police legitimacy and has a spillover effect on people's sense of belonging and inclusion.

| Why it is important to do the review
Community engagement approaches have become a key component of police counterterrorism efforts (Cherney & Hartley, 2017). These strategies have emphasised community engagement and outreach to identify potential violent extremism threats. This has involved police programs that aim to promote collaborative problem solving between police and community members to tackle radicalisation, such as through identifying youth at risk of radicalising to violent extremism (Cherney, 2018). For example, following 9/11, police units in Australia, the United Kingdom, the United States, and Canada were established to undertake outreach with particular community groups (e.g., Muslim communities), with the aim of tackling violent extremism by enhancing relations and connectedness between police and these communities, and also between community members (Cherney, 2018; Ramirez, Quinlan, Malloy, & Shutt, 2013). However, to date, there has been no systematic synthesis of the evaluation evidence for these policing approaches and their impact on violent extremism.¹ Therefore, the present proposed review is necessary to ascertain whether policing interventions that seek to promote community connectedness are effective for reducing violent extremism behaviour, attitudes and beliefs. In addition, the results from this review will inform future decision making relating to both the design and evaluation of police programs by identifying potential gaps in the evidence base and the level of investment needed in evaluation of primary studies.

| OBJECTIVES
The primary objective of this review is to answer the question: how effective are police programs that seek to increase community connectedness for reducing violent extremism attitudes, beliefs and behaviours? If there are sufficient data, the review will also examine whether the effectiveness of these interventions varies by the following factors: geographical location, target population and type of policing strategy used to promote connectedness.

3 | METHODOLOGY

3.1 | Criteria for including and excluding studies

| Types of study designs
This review will include quantitative impact evaluations that utilise a randomised experimental (e.g., RCTs) or a quasi-experimental design with a comparison group that does not receive the intervention. We will include studies where the comparison group receives "business-as-usual" policing, no intervention or an alternative intervention (treatment-treatment designs).
Although not as robust as RCTs, "strong" quasi-experiments can be used to provide causal inference when there are elements of the design that aim to minimise threats to internal validity (see Farrington, 2003; Shadish, Cook, & Campbell, 2002). Minimising threats to internal validity can include controlling case assignment to treatment and control groups (regression discontinuity), matching characteristics of the treatment and control groups (matched control), statistically accounting for differences between the treatment and control groups (designs using multiple regression analysis) or providing a difference-in-difference analysis (parallel cohorts with pretest and posttest measures). Therefore, we will include the following "strong" quasi-experimental designs in this review.
• Matched control group designs with or without preintervention baseline measures (propensity or statistically matched).
• Unmatched control group designs without preintervention measures where the control group has face validity.
• Unmatched control group designs with pre-postintervention measures which allow for difference-in-difference analysis.
• Long interrupted time-series designs with or without a control group (≥25 pre- and postintervention observations; Glass, 1997).
Weaker quasi-experimental designs can be used to demonstrate the magnitude of the relationship between an intervention and an outcome. However, we will exclude the following weaker quasi-experimental designs due to their limitations in establishing causality.
• Raw unadjusted correlational designs where the variation in the level of the intervention is compared with the variation in the level of the outcome.
• Single group designs with pre- and postintervention measures.

¹ We conducted a search of the literature using the following terms to identify existing reviews: terroris* OR extremis* OR radicali*. Searches of the Campbell Collaboration library, Cochrane Collaboration library, PROSPERO register and Google Scholar did not identify any existing systematic reviews (completed or ongoing) on the specific topic proposed in this protocol.
² We will include all short interrupted time-series designs with a control group, as long as the design includes a minimum of one preintervention observation for each of the treatment and comparison groups. This approach is consistent with the inclusion of unmatched control group designs with pre-post intervention measures. For studies with extremely short time series (fewer than four pre- and four postintervention observations), the data will be collapsed and treated as pre-post averages, rather than as true time series data.

3.1.2 | Types of participants
This review will consider the impact of community connectedness policing interventions on the following population subjects.
• Individuals of any age, gender or ethnicity.
We will place no limits on the geographical region reported in the study. Specifically, we will include studies conducted in high-, low- and middle-income countries in the review.

| Types of interventions
This review will include any policing intervention that aims to promote community connectedness. Specifically, each study must meet the following two intervention criteria.
1. Report on a policing intervention, defined as some kind of a strategy, technique, approach, activity, campaign, training, program, directive or funding/organisational change that involves police in some way (other agencies or organisations can be involved; Higginson et al., 2015). Police involvement is broadly defined as:
• police initiation, development or leadership;
• police are recipients of the intervention or the intervention is related, focused or targeted to police practices; or
• delivery or implementation of the intervention by police.
2. Report on a policing intervention that aims to promote community connectedness. For the purposes of this review, we define the promotion of community connectedness to mean an intention to increase linkages or ties between either the community members themselves or community members and police. Other terminology that may be used to represent connectedness in the literature includes (Thomas, 2019):
• promotion of common values, norms and/or reciprocity;
• promotion of social networks, collective efficacy, social cohesion and/or social capital; or
• promotion of shared problem-solving or citizen engagement.
We anticipate that policing interventions aiming to promote community connectedness will include, more generally, community consultation, partnership, or collaboration with citizens and/or organisational entities. Specific strategies may include (but are not limited to):
• community meetings or forums;
• developing partnerships with specific organisations (Fox, 2012);
• police liaison programs involving community members; or
• police work with community leaders to enhance personal skills (e.g., self-identity, self-awareness and resilience), employment skills (e.g., teamwork and self-awareness) or leadership skills (Thomas, 2019).

| Types of outcome measures
Terrorism is one outcome of violent extremism, which has both a cognitive and a behavioural component. The literature distinguishes between radicalisation, which refers to beliefs, and violent extremism, which is the behavioural outcome of those beliefs.
In this review, we will include programs focused on individuals and groups identified as at risk of radicalisation due to beliefs and/or associations, as well as those who have acted on those beliefs. This is to ensure we capture police programs tackling different levels of violent extremism.
Hence, this review will include studies where the measured outcome is violent extremist attitudes, beliefs and behaviour. For the purposes of this review, violent extremism is defined as "advocating, engaging in, preparing or otherwise supporting ideologically motivated or justified violence to further social, economic and political objectives" (Barker, 2015; Horgan, 2009; Khalil & Zeuthen, 2016; United States Agency for International Development, 2016).
It is important to note that violent extremism is defined and captured differently across countries (e.g., Barker, 2015
The review will also include studies where the outcome is disengagement and/or deradicalisation, which are often encompassed within conceptualisations of violent extremism (Klausen, Campion, Needle, Nguyen, & Libretti, 2016). Disengagement generally captures the behavioural aspect of extremism and refers to reducing or ceasing physical involvement in violent or radical activities (Horgan, 2009). In contrast, deradicalisation is defined as the psychological shift in attitudes or beliefs (Horgan & Braddock, 2010). This can encompass a variety of ideologies, including: Islamist (or jihadist), far-right (right-wing), far-left
We will include outcome data that are measured through self-report instruments, interviews, observations and/or official data (e.g., arrests or convictions). Some examples of how violent extremism attitudes, beliefs and behaviour can be measured include:
• official data taken from the Profiles of Radicalised Individuals in the United States (PIRUS),³ which includes: active participation in operational plots intending to cause casualties (e.g., gathering weapons, choosing targets), recruiting individuals to an official or unofficial extremist group and providing material/financial support to extremist organisations (START, 2018);
• level of disillusionment with, disappointment with or renouncement of extremist group members, extremist leaders and/or radical ideology (Barrelle, 2015; Berger, 2016; San, 2018);
• willingness to engage in violence (San, 2018);
• modification of group and personal social identity (Barrelle, 2015);
• level of acceptance and/or engagement with cultural and religious differences or pluralistic views (Barrelle, 2015);
• number and/or strength of ties with extremist social networks or extremist recruiters (Perliger & Pedahzur, 2011); and/or
• amount of extremist activity (San, 2018).

| Duration of follow-up
Studies will be included regardless of the length of follow-up after the intervention. If the length of follow-up varies across studies, we will group and synthesise studies with similar follow-up durations.

| Types of settings
We will include studies reporting on an impact evaluation of an eligible intervention using eligible participants, outcome(s) and an eligible research design in any setting. Where there are multiple conceptually distinct settings, we will synthesise the studies within the settings separately.
We will include studies written in any language that are identified in the search. We will use Google Translate for the title and abstract screening stage to identify whether a non-English language study is potentially eligible for review, and will call upon our international network of colleagues for assistance with full-text screening and coding where needed.
We will include studies published between 2002 and 2018 in the review.

| Search strategy
The search for this review will be led by the Global Policing Database (GPD) research team at the University of Queensland (Elizabeth Eggins and Lorraine Mazerolle) and Queensland University of Technology (Angela Higginson). The University of Queensland is home to the GPD (see http://www.gpd.uq.edu.au), which will serve as the main search location for this review. The GPD is a web-based and searchable database designed to capture all published and unpublished experimental and quasi-experimental evaluations of policing interventions conducted since 1950. There are no restrictions on the type of policing technique, type of outcome measure or language of the research (Higginson et al., 2015). The GPD is compiled using systematic search and screening techniques, which are reported in Higginson et al. (2015) and summarised in Appendices A and B. Broadly, the GPD search protocol includes an extensive range of search locations to ensure that both published and unpublished research is captured across criminology and allied disciplines.
Because the GPD includes experimental and quasi-experimental studies that evaluate interventions relating to police or policing, with no limits on outcome measures, we will use a broad search to capture studies for the review. Specifically, we will search the titles and abstracts of the corpus of GPD full-text documents that have been classified as reporting on a quantitative impact evaluation of a policing intervention between 2002 and 2018 using the following search terms: *terror* OR extrem* OR *radical* (Table 1).
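The title and abstract search described above can be illustrated with a minimal sketch. It assumes that "*" acts as a wildcard for zero or more word characters and that matching is case-insensitive; the actual GPD search syntax may differ, and the function names (term_to_regex, matches_search) are hypothetical.

```python
import re

# Search terms as listed in the protocol. Treating "*" as "zero or more
# word characters" is an assumption about the wildcard semantics.
TERMS = ["*terror*", "extrem*", "*radical*"]


def term_to_regex(term: str) -> str:
    """Translate one wildcard term into a regular-expression fragment."""
    parts = [re.escape(part) for part in term.split("*")]
    return r"\b" + r"\w*".join(parts) + r"\b"


# The terms are joined with OR, so one alternation pattern is enough.
PATTERN = re.compile("|".join(term_to_regex(t) for t in TERMS), re.IGNORECASE)


def matches_search(title: str, abstract: str) -> bool:
    """Return True if the title or the abstract contains any search term."""
    return bool(PATTERN.search(title) or PATTERN.search(abstract))
```

For example, a record titled "Community policing and counterterrorism" would be retained by this sketch, whereas a record about foot patrols and burglary with no matching term in its title or abstract would not.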
We will also employ additional strategies to extend the GPD search. This includes the following.
• Conducting reference harvesting on both the corpus of eligible documents and existing reviews.
• Forward citation searching for all eligible documents.
• Liaising with the Five Country Research and Development Network (5RD), and the Department of Homeland Security Advisory Board network for the Campbell Collaboration grants, to enquire about eligible studies that may not be publicly available.
• Personally contacting prominent scholars in the field and authors of eligible studies to enquire about eligible studies not yet disseminated or published.
• Hand-searching the four most recent issues of the following journals to identify eligible documents yet to be indexed in academic

| Description of methods used in primary research
Existing literature highlights the burgeoning range of policing approaches that focus on increasing community connectedness in the context of terrorism, radicalisation and/or single-issue violent extremism (e.g., see Cherney & Murphy, 2017; Downing, 2009; Schanzer et al., 2016; Silk, 2012). Based on this literature, we anticipate that the vast majority of studies captured by our review will utilise quasi-experimental research designs with some conceptual differences in the measurement of outcome variables.

| Criteria for determination of independent findings
Issues of dependence can arise where (a) multiple documents report on a single empirical study, (b) multiple conceptually-similar outcomes are reported in the one document and/or (c) studies have clustering in their research design. For meta-analyses, each eligible study will be included only once for each conceptually distinct outcome category.
The software that will be used for this review allows the nesting of multiple dependent documents relating to a single study, to allow for the identification of multiple reports of the same study. In cases where there are dependent documents reporting on the one study, all documents will be coded and the most complete report of the study will be used for data extraction. If necessary and where appropriate, data may be extracted from multiple documents to enable the calculation of effect sizes.
If documents report on multiple conceptually similar outcomes, these effect sizes will be averaged and only the averaged effect size will be included in the meta-analysis (Borenstein, Cooper, Hedges, & Valentine, 2009). Where studies utilise a research design with clustering (e.g., study sites assigned to conditions), the method proposed by Fu et al. (2013)

| Title and abstract screening
The first stage of assessing study eligibility will begin with title and abstract screening of all unique records identified in the systematic search. After removing duplicates and ineligible document types (e.g., book reviews, blog posts) from the results of the systematic search, all records will be imported into the review management software, SysReview (Higginson & Neville, 2014). Each title and abstract (record) will then be assessed according to the following exclusion criteria. Although all efforts will be made to remove ineligible document types and duplicates prior to screening, automated and manual cleaning can be less than perfect. As such, the first two exclusion criteria will be used to remove ineligible document types and duplicates prior to screening each record on substantive content relevance.
Most records indexed in the GPD have a pre-existing full-text document. However, records from the additional searches that are deemed as potentially eligible at the title and abstract screening stage will progress to literature retrieval, where the full-text document will be located. Where full-text documents cannot be retrieved via existing university resources, they will be ordered through the university libraries of the review authors or by contacting study authors. All potentially eligible records will then progress to full-text eligibility screening.

| Full-text eligibility screening
The full-text of each document will be screened for final eligibility according to the following exclusion criteria.
1. Document type is ineligible.
2. Document is not unique.
3. Document does not evaluate a policing intervention that aims to promote community connectedness.
4. The evaluation does not report violent extremism attitudes, beliefs or behaviour as an outcome.
All efforts will be made to remove ineligible document types and duplicate documents in earlier stages. However, sometimes these types of records can progress into later stages of screening (e.g., where duplicate records are not adjacent during screening or where screeners cannot unambiguously determine whether a record is ineligible based on the title and abstract). Therefore, the first two exclusion criteria will be used to remove ineligible document types and duplicates.

| Full-text coding and risk of bias assessment
Eligible documents progressing from the full-text screening stage will be coded within SysReview, using the coding companion provided in Appendix C. Broadly, studies will be coded according to the following domains.

Risk of bias.
Risk of bias will be evaluated using either the Cochrane randomised or nonrandomised risk of bias tools. Using these tools, studies will be rated across domains as having high, low or unclear risk of bias. Study authors will be approached to obtain missing data where a domain is rated as "unclear". Results of the risk of bias assessment will be presented in summary tables and in a risk of bias summary

| Statistical procedures and conventions
Meta-analyses will be performed for all outcomes where there are at least two independent effect sizes, and we will conduct separate meta-analyses for each set of conceptually similar outcomes. We will also conduct separate meta-analyses for studies where participants are individuals and studies where participants are places, even if both groups of studies report on the same outcome. Random effects inverse variance meta-analyses (Lipsey & Wilson, 2001) will be conducted in R using rmeta (Lumley, 2015; available at https://CRAN.R-project.org/package=rmeta). Mean effect sizes and their corresponding confidence intervals will be reported in-text and in graphical forest plots.
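To make the pooling step concrete, the following is a minimal sketch of a random effects inverse variance meta-analysis. It is written in Python for illustration only and is not the rmeta implementation that the review will use. The DerSimonian-Laird estimator of the between-study variance and the normal-approximation 95% confidence interval are assumptions, since the protocol does not name an estimator; the function name random_effects_meta is hypothetical.

```python
import math


def random_effects_meta(effects, variances):
    """Pool effect sizes with inverse-variance weights.

    The between-study variance (tau2) is estimated with the
    DerSimonian-Laird method, which is an assumption of this sketch.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect weights
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each study's sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "mean": mean,
        "se": se,
        "ci95": (mean - 1.96 * se, mean + 1.96 * se),
        "tau2": tau2,
    }
```

When the study effects are identical, tau2 is estimated as zero and the result coincides with a fixed-effect analysis. As the effects diverge, tau2 grows and the weights become more equal across studies.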
Where the participants are individuals, we anticipate that evaluations will typically report outcomes as continuous measures (e.g., willingness to engage in violence, self-reported on a Likert scale), and in these instances, Hedges' g (standardised mean differences [SMDs]) will be computed. Where evaluations report binary outcomes (e.g., disengagement from radicalised peers: yes/no), effect sizes will be computed as odds ratios and then transformed into Hedges' g for meta-analyses (see Borenstein, Hedges, Higgins, & Rothstein, 2009).
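The two effect size calculations described above can be sketched as follows. The sketch assumes the standard formulas given in Borenstein et al. (2009): the pooled-standard-deviation SMD with the small-sample correction J = 1 − 3/(4df − 1), and the logit conversion d = ln(OR) × √3/π for binary outcomes. The function names and argument layout are hypothetical.

```python
import math


def small_sample_correction(n1, n2):
    """Hedges' correction factor J for the standardised mean difference."""
    df = n1 + n2 - 2
    return 1.0 - 3.0 / (4.0 * df - 1.0)


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, variance of g) from group means, SDs and sample sizes."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    d = (m1 - m2) / pooled_sd
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))
    j = small_sample_correction(n1, n2)
    return j * d, j ** 2 * var_d


def odds_ratio_to_g(events_t, nonevents_t, events_c, nonevents_c):
    """Convert a 2x2 table to (g, variance of g) via the log odds ratio.

    Assumes no empty cells; a continuity correction would be needed
    otherwise and is not shown here.
    """
    log_or = math.log((events_t * nonevents_c) / (nonevents_t * events_c))
    var_log_or = (
        1.0 / events_t + 1.0 / nonevents_t + 1.0 / events_c + 1.0 / nonevents_c
    )
    d = log_or * math.sqrt(3.0) / math.pi
    var_d = var_log_or * 3.0 / math.pi ** 2
    j = small_sample_correction(events_t + nonevents_t, events_c + nonevents_c)
    return j * d, j ** 2 * var_d
```

For instance, two groups of 20 with means of 10 and 9 and a common standard deviation of 2 give d = 0.5, which the correction factor shrinks slightly to obtain g.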
Throughout the review, we will follow Campbell Collaboration guidelines and aim to transform the smallest number of effect sizes to a common effect size (Polanin & Snilstveit, 2016); therefore, the final effect size metric will be that which is calculated most commonly for each outcome.
Where the participants are micro- or macroplaces, we anticipate that evaluations may report outcomes as counts or rates in the intervention and comparison areas, before and after the intervention (e.g., number of radicalised individuals). In these instances, we will calculate the effect size as the relative incident rate ratio (RIRR), which can be interpreted as the relative proportion change in the outcome in the treatment area after the intervention, compared with the comparison area (Farrington, Gill, Waples, & Argomaniz, 2007; Higginson & Mazerolle, 2014). RIRR is calculated as

$$\mathrm{RIRR} = \frac{ad}{bc},$$
where a denotes the count or rate in the intervention area before the intervention, b denotes the count or rate in the intervention area after the intervention, c denotes the count or rate in the comparison area before the intervention, and d denotes the count or rate in the comparison area after the intervention. The RIRR will be converted to a log relative incident rate ratio (LRIRR) for synthesis, but converted back to RIRR for more intuitive reporting. The standard error of the LRIRR is initially calculated as

$$SE_{\mathrm{LRIRR}} = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}};$$

however, the odds ratio formula of the RIRR assumes a Poisson distribution, to which crime data typically do not conform. We will therefore assess overdispersion using

$$\varnothing = \frac{s^{2}}{\bar{y}},$$

where s is the standard deviation and ȳ is the mean of the observations. If ∅ > 1, we will adjust the standard error by multiplying the standard error by the quasi-Poisson overdispersion parameter ∅.
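The RIRR calculation can be sketched in a few lines. Two elements of the sketch are assumptions rather than quotations from the text: the standard error uses the usual log odds ratio form, √(1/a + 1/b + 1/c + 1/d), and the overdispersion parameter is taken to be the variance-to-mean ratio of the observations, s²/ȳ. The adjustment follows the wording of the text by multiplying the standard error by ∅ when ∅ > 1. All function names are hypothetical.

```python
import math


def rirr_effect(a, b, c, d):
    """Return (RIRR, LRIRR, unadjusted SE of LRIRR).

    a, b: intervention area before and after the intervention.
    c, d: comparison area before and after the intervention.
    """
    rirr = (a * d) / (b * c)
    lrirr = math.log(rirr)
    # Assumed form: the standard error of a log odds ratio.
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    return rirr, lrirr, se


def overdispersion(observations):
    """Variance-to-mean ratio of the observations (assumed definition)."""
    n = len(observations)
    mean = sum(observations) / n
    variance = sum((y - mean) ** 2 for y in observations) / (n - 1)
    return variance / mean


def adjusted_se(se, phi):
    """Inflate the standard error only when the data are overdispersed."""
    return se * phi if phi > 1 else se
```

With a = 100, b = 50, c = 100 and d = 100, the outcome halves in the intervention area while the comparison area is unchanged, so the sketch returns an RIRR of 2.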
We will conduct separate meta-analyses for each of the reported follow-up time points, where data permit. Where studies report multiple points of follow-up, effect sizes will be calculated for each time-point, but synthesised separately with studies that have similar outcome time-points. If studies report both baseline and postintervention outcome data, SMDs will be calculated using baseline adjusted mean differences (i.e., mean change scores) and the change score standard deviations will be standardised using the raw standard deviation within groups. If the standard deviation for mean change scores is not available, we will follow Lipsey and Wilson's (2001) formula to calculate the standard deviation. If studies report follow-up outcome data, post-only outcome data will be used to estimate SMDs, and follow-up outcomes will be analysed separately from postintervention outcomes.
For each meta-analysis, we will assess heterogeneity in effect sizes using the I² statistic, χ² test and τ² (Higgins & Thompson, 2002).
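As an illustration of these heterogeneity statistics, the sketch below computes Cochran's Q (the statistic used in the χ² test, with k − 1 degrees of freedom), I² as a percentage, and τ². The method-of-moments (DerSimonian-Laird) estimate of τ² is an assumption, and the p-value for Q is omitted to keep the example free of external libraries. The function name heterogeneity is hypothetical.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, I2 (%) and tau2."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    # I2: share of total variation attributed to between-study differences.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # tau2: method-of-moments estimate (an assumption of this sketch).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return {"Q": q, "df": df, "I2": i2, "tau2": tau2}
```

Q would be compared against a χ² distribution with df degrees of freedom to obtain the test result.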
We will conduct moderator analyses to explore potential sources of heterogeneity, using the variables listed under the "Objectives" section.
The analogue to analysis of variance (ANOVA) will be used for categorical moderators and regression-based approaches will be used for continuous moderators. We may also conduct additional exploratory subgroup analyses, but will clearly distinguish between a priori and post hoc analyses in our reporting.
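For a categorical moderator, the analogue to ANOVA partitions the total heterogeneity into a within-group and a between-group component. The sketch below computes only the between-group statistic, using fixed-effect inverse-variance weights within groups; that weighting choice is an assumption, because the protocol does not state whether a fixed or mixed effects version will be used. The function name q_between is hypothetical.

```python
from collections import defaultdict


def q_between(effects, variances, groups):
    """Return (Q_between, degrees of freedom) for a categorical moderator."""
    # For each group, accumulate [sum of weights, sum of weight * effect].
    sums = defaultdict(lambda: [0.0, 0.0])
    for y, v, g in zip(effects, variances, groups):
        w = 1.0 / v
        sums[g][0] += w
        sums[g][1] += w * y
    total_w = sum(sw for sw, _ in sums.values())
    grand_mean = sum(swy for _, swy in sums.values()) / total_w
    # Weighted squared deviations of the group means from the grand mean.
    q_b = sum(sw * (swy / sw - grand_mean) ** 2 for sw, swy in sums.values())
    return q_b, len(sums) - 1
```

The resulting statistic would be compared against a χ² distribution with (number of groups − 1) degrees of freedom, for example to test whether effects differ by type of policing strategy.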
Finally, we will visually and statistically assess the data for evidence of publication bias. We will inspect funnel plots for asymmetry, and if asymmetry is detected, we will conduct subgroup analyses to assess if the effect sizes from the published and unpublished documents are significantly different.

| Treatment of qualitative research
This review will not include qualitative research.
• Statistical analysis: Higginson, Eggins.

Cherney has published research that is closely linked with the review topic. To minimise potential bias, Cherney will not be involved in the screening or coding of any studies for this review.

PRELIMINARY TIMEFRAME
The final review is due for submission on December 20, 2019.

PLANS FOR UPDATING THE REVIEW
Lorraine Mazerolle and Adrian Cherney will be responsible for updates of this review, which are anticipated to occur every 3 to 5 years.

Search terms
To ensure optimum sensitivity and specificity, the GPD search strategy utilises a combination of free-text and controlled vocabulary search terms. Because controlled vocabularies and search capabilities vary across databases, the exact combination of search terms and field codes are adapted to each database. Final search syntax for each location will be reported in the final review.
The free-text search terms for the GPD are provided in Table A1 and are grouped by substantive (i.e., some form of policing) and evaluation terminology. Although the search strategy may vary slightly across search locations, it follows a number of general rules, which are as follows.
• Search terms are combined into search strings using Boolean operators "AND" and "OR". Specifically, terms within each category are combined with "OR", and categories are combined with "AND". For example: (police OR policing OR "law#enforcement") AND (analy* OR ANCOVA OR ANOVA OR …).
• Compound terms (e.g., law enforcement) are considered single terms in search strings by using quotation marks (i.e., "law*enforcement") to ensure that the database searches for the entire term rather than separate words.
• Wild cards and truncation codes are used for search terms with multiple iterations from a stem word (e.g., evaluation, evaluate) or spelling variations (e.g., evaluat* or randomi#e).
• If a database has a controlled vocabulary term that is equivalent to "POLICE", the term is combined in a search string that includes both the policing and evaluation free-text search terms. This approach ensures that the search retrieves documents that do not use policing terms in the title/abstract but have been indexed as being related to policing in the database. An example of this approach is the following search string: (((SU: "POLICE") OR
• For search locations with limited search functionality, a broad search that uses only the policing free-text terms is implemented.
• Multidisciplinary database searches are limited to relevant disciplines (e.g., include social sciences but exclude physical sciences).
• Search results are refined to exclude specific types of documents that are not suitable for systematic reviews (e.g., newspapers, front/back matter, book reviews).
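The Boolean combination rules above can be illustrated with a short sketch. It is a hypothetical helper, not the GPD's actual tooling: the function names are invented for illustration, the evaluation terms are only the first three shown in the example string, and the "*" used to join compound terms is an assumption that would need adapting to each database's syntax.

```python
def format_term(term, joiner="*"):
    """Quote a compound (multi-word) term so it is searched as a single phrase."""
    words = term.split()
    if len(words) > 1:
        return '"' + joiner.join(words) + '"'
    return term


def or_group(terms):
    """Combine the terms within one category using OR."""
    return "(" + " OR ".join(format_term(t) for t in terms) + ")"


def build_search_string(categories):
    """Combine the categories using AND."""
    return " AND ".join(or_group(terms) for terms in categories)


# Illustrative subsets only; the full term lists are those given in Table A1.
policing_terms = ["police", "policing", "law enforcement"]
evaluation_terms = ["analy*", "ANCOVA", "ANOVA"]

search_string = build_search_string([policing_terms, evaluation_terms])
print(search_string)
# (police OR policing OR "law*enforcement") AND (analy* OR ANCOVA OR ANOVA)
```

Keeping each category as a separate list makes it straightforward to drop the evaluation terms for search locations with limited functionality, where only the policing terms are used.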

Search locations
To reduce publication and discipline bias, the GPD search strategy adopts an international scope and involves searching for literature across a number of disciplines (e.g., criminology, law, political science, public health, sociology, social science and social work). The search captures a comprehensive range of published (i.e., journal articles, book chapters and books) and unpublished literature (e.g., working papers, governmental reports, technical reports, conference proceedings and dissertations) by implementing a search strategy across bibliographic/academic, grey literature and dissertation databases or repositories.
It is noted that there is substantial overlap of the content coverage between many of the databases. Therefore, the Optimal Searching of Indexing Databases (OSID) computer program (Higginson & Neville, 2014) Table A2.

Types of interventions
Each document must contain an impact evaluation of a policing intervention. Policing interventions are defined as some kind of a strategy, program, technique, approach, activity, campaign, training, directive or funding/organisational change that involves police in some way (other agencies or organisations can be involved). Police involvement is broadly defined as follows.
• Police initiation, development or leadership.
• Police are recipients of the intervention or the intervention is related, focused or targeted to police practices.
• Delivery or implementation of the intervention by police.

Types of study designs
The GPD includes quantitative impact evaluations of policing interventions that utilise randomised experimental (e.g., RCTs) or quasiexperimental evaluation designs with a valid comparison group that does not receive the intervention. The GPD includes designs where the comparison group receives "business-as-usual" policing, no intervention or an alternative intervention (treatment-treatment designs).
The specific research designs included in the GPD are as follows.
• Systematic reviews with or without meta-analyses.
• Matched control group designs with or without preintervention baseline measures (propensity or statistically matched).
• Unmatched control group designs with pre-post intervention measures which allow for difference-in-difference analysis.
• Unmatched control group designs without preintervention measures where the control group has face validity.
• Long interrupted time-series designs with or without a control group (≥25 pre- and postintervention observations).
• Raw unadjusted correlational designs where the variation in the level of the intervention is compared to the variation in the level of the outcome.
The GPD excludes single-group designs with pre- and postintervention measures, as these designs are highly subject to bias and threats to internal validity.

Systematic screening
To establish eligibility, records captured by the GPD search progress through a series of systematic stages, which are summarised in Figure B1, with additional detail provided in the following subsections.

Title and abstract screening
After duplicates are removed, the titles and abstracts of records captured by the GPD systematic search are screened by trained research staff to identify potentially eligible research that satisfies the following criteria.
• Document is dated between 1950 and present.
• Document is about police or policing.
• Document is an eligible document type (e.g., not a book review).
Records are excluded if the answer to any one of the criteria is unambiguously "No", and are classified as potentially eligible otherwise. Records classified as potentially eligible at the title and abstract screening stage progress to the full-text document retrieval and screening stages.
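The decision rule for this stage can be expressed as a small sketch. The criterion keys, the function name and the use of None for an unclear answer are illustrative assumptions, not part of the GPD screening software.

```python
from typing import Dict, Optional

# Hypothetical keys for the three title and abstract screening criteria.
CRITERIA = ("dated_1950_to_present", "about_policing", "eligible_document_type")


def screen_title_abstract(answers: Dict[str, Optional[bool]]) -> str:
    """Classify a record from its screening answers.

    True = yes, False = unambiguous no, None (or missing) = unclear.
    A record is excluded only when at least one answer is an unambiguous no.
    """
    for criterion in CRITERIA:
        if answers.get(criterion) is False:
            return "excluded"
    return "potentially eligible"


# An unclear answer keeps the record in the pool for full-text screening.
unclear_record = {
    "dated_1950_to_present": True,
    "about_policing": None,
    "eligible_document_type": True,
}
print(screen_title_abstract(unclear_record))
```

Treating unclear answers as non-exclusions errs toward inclusion at this stage, leaving the final decision to full-text screening.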

Full-text eligibility screening
Wherever possible, a full-text electronic version of an eligible record is imported into SysReview (review management software; Higginson & Neville, 2014). For records without an electronic version, a hardcopy of the record is located to enable full-text eligibility screening.
The full-text of each document is screened to identify studies that satisfy the following criteria.
• Document is dated between 1950 and present.
• Document is unique.
• Document reports a quantitative statistical comparison.
• Document reports on policing evaluation.
• Document reports on a quantitative impact evaluation of a policing intervention.
• Evaluation uses an eligible research design.
6. Enter the appropriate data in the relevant "Data for effect size calculations" tabs (see below). The data entered will depend on what is reported in the document. If none of the circumstances in the tabs reflect the data in the document, follow the link to David Wilson's online effect size calculator to calculate an effect size. You can enter the data in the "Data for effect size calculations 2" tab in the "Other information" textbox. [textboxes]