Guidance for producing a Campbell evidence and gap map

Abstract Evidence and gap maps (EGMs) are a systematic evidence synthesis product which displays the available evidence relevant to a specific research question. EGMs are produced following the same principles as a systematic review, that is: specifying a PICOS, a comprehensive search, screening against explicit inclusion and exclusion criteria, and systematic coding, analysis and reporting. This paper provides guidance on producing EGMs for publication in Campbell Systematic Reviews.


Ten tips
1. Get the framework right. Determining the map framework is the most important stage of map development and best done through a consultative process with key stakeholders, especially the commissioner of the map. It is important to pilot the proposed framework. Piloting is an iterative process which may take several rounds, which can be characterised as "revise, refine and define".
2. Fully comprehensive and mutually exclusive categories. The set of interventions and of outcomes should fully cover the desired scope of the map (fully comprehensive) but not overlap with one another (mutually exclusive). A single intervention or outcome should clearly fit into a single subcategory without ambiguity as to its proper home. The implication is that many single studies will be coded for just one intervention. Avoid over-coding, that is, putting a single intervention into multiple intervention categories. Multicomponent interventions and systematic reviews are likely to be coded for several interventions. For filters, it is useful to add "not reported/not clear" as a category for items such as race or sex.
3. Produce a dictionary of terms. A dictionary of all terms used in the map, that is, intervention and outcome labels as well as filters, is useful for both coders and users. Good practice is for the dictionary to give an example of each label from a study in the map. Locating such studies also contributes to the data cleaning process.
4. A few big holes rather than many small holes. The target for the framework is 4-6 row and column heads, each with 4-6 subcategories. If subcategories are not used, the target is 12-15 row and column heads. Having more headings than this makes the map difficult for users to navigate. Moreover, ambiguity or missing categories are more likely with a large number of very specific headings than with a smaller number of broader headings. During piloting, if items are found which do not fit into the existing framework, consider broadening definitions (widening the holes) rather than adding new subcategories.
5. The importance of stakeholder consultation. Stakeholder consultation is important in determining the scope of the map, developing the framework, and interpreting the findings. But there are drawbacks. For example, stakeholder consultation will often create pressure for more categories as they want to see "their interventions" named. But this pressure needs to be weighed against the disadvantages of a cumbersome framework.
6. Be realistic on the timeline. Although the coding, analysis and report writing are easier for a map than a review, the broader scope means there are many more studies to screen and code. So, the overall time required may not be much different to that for a full review.
7. Produce the map in stages. It can be a good idea to produce the map in stages. A first stage can map studies using a "low-hanging fruit" approach, for example, relevant Campbell reviews and the eligible included studies in those reviews, or eligible studies in a single database such as the 3ie evidence hub. A staged approach allows an intermediate output, and some user testing with an actual map rather than a proposed framework, which can seem abstract to intended users.
8. Consider different visual representations of the map. Most maps have primary dimensions, which are interventions and outcomes for effectiveness maps. However, it can be useful to present the map with other row and column headings, for example, interventions as rows and regions as column headings, as a way of seeing the geographical distribution of evidence.
9. Add a bit of colour to the report. The report of the map findings is a descriptive analysis of the distribution of studies. The report can be both more interesting and more useful if some depth and colour is added, which can be as simple as a box listing study titles for certain categories. If it is a field in which certain study designs are rare, then a box or the text may elaborate on a study using that design to illustrate its use. If a common problem is identified in the critical appraisal of included studies, some examples can be provided of this problem and of studies which overcome it.
10. Innovative reporting. The report can also be made more interesting by more innovative analysis. There are approaches used by bibliometricians which could be more widely used in reporting maps, for example, network analysis.
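As a small illustration of tip 10, the sketch below counts how often pairs of intervention categories are coded together in the same study, which is the raw material for a co-occurrence network of the kind used by bibliometricians. The study identifiers and labels are hypothetical, and a real analysis would feed these edge weights into a network analysis or visualisation tool:

```python
from collections import Counter
from itertools import combinations

# Hypothetical coding: each study lists the intervention categories
# it was coded under (multicomponent studies carry several labels).
study_codes = {
    "S1": ["Health services", "Case management"],
    "S2": ["Case management", "Housing subsidies"],
    "S3": ["Health services", "Case management", "Housing subsidies"],
}

# Edge weights for a co-occurrence network: each pair of categories
# coded in the same study adds one to the edge between them.
edges = Counter()
for labels in study_codes.values():
    for pair in combinations(sorted(set(labels)), 2):
        edges[pair] += 1
```

The resulting edge list can then be handed to a dedicated network tool or library for layout and community detection; the counting itself is this simple.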

| SUMMARY
EGMs are a systematic evidence synthesis product which displays the available evidence relevant to a specific research question.
The scope of a map is generally broader than that of a systematic review. We distinguish between EGMs which contain primary studies and reviews, mega-maps which include reviews and other maps, and maps of maps which contain only other maps.
EGMs are used to identify gaps requiring filling with new evidence, collections of studies for review, and increase the discoverability and use of studies by decision-makers, research commissioners and researchers. They also highlight reviews which can be used to generate higher-level evidence products such as guidelines.
Maps are typically shown as a matrix. The most common map is a map of effectiveness studies (an "effectiveness map", although it does not show effectiveness), but maps may be made of any body of literature. An effectiveness map is most commonly drawn with interventions as row headings and outcomes as column headings, both of which may be subdivided into subcategories. The cells contain the studies relevant to that intervention/outcome combination. The map is interactive, so users can click on a cell to access the studies.
There are also filters, such as study design, country or region, and subpopulations, so users can see a subset of studies meeting certain criteria.
The row and column headings and the filters are referred to as the map framework. Determining the map framework is the most important stage of map development and is best done through a consultative process with key stakeholders. Stakeholders may also be involved in other stages of the process, especially in promoting use of the map.
The search, screening and coding follow the usual systematic review approach. Critical appraisal can also be included and shown in the map. The map is accompanied by a descriptive report providing an overview of the studies, for example, frequency plots.
Producing a map usually requires a team of one or two principal investigators (PIs; providing content expertise and knowledge of evidence synthesis) and two junior researchers or research assistants. A map may be expected to take 6-12 months to complete.

| INTRODUCTION AND OVERVIEW
An EGM is a systematic presentation of all relevant evidence of a specified kind for a particular sector, subsector or geography.
Examples of maps include "Evidence and gap map of studies assessing the effectiveness of interventions for people with disabilities in low- and middle-income countries" (Saran et al., 2020) and "The impacts of agroforestry on agricultural productivity, ecosystem services, and human well-being in low- and middle-income countries: An evidence and gap map" (Miller et al., 2020).
Produced using the same systematic approach as a systematic review, EGMs usually show what evidence is there, not what the evidence says. Following systematic processes takes time, so while a map might be produced in 3 months, 6-12 months is generally a more realistic timeline.
The scope of an EGM is typically much broader than that of a systematic review. A map typically contains systematic reviews and primary studies, but may include only one of these, and may sometimes also include other maps.
Campbell EGMs include a visual presentation of the evidence as a matrix. Usually the matrix shows intervention categories as rows and outcomes as columns. Figure 1 shows an example of a section of a map on homelessness. The section shown illustrates an intervention category (health and social care) and its first three subcategories (health services, addiction support and end-of-life care), as well as three of the outcome domains (health, housing stability, and public attitudes and engagement) and their corresponding outcome subdomains. The bubbles in each cell show the studies included in the map. Separate bubbles show the type of study (primary study or systematic review) and confidence in study findings based on critical appraisal of each study. For example, in Figure 1, the red bubble corresponds to systematic reviews for which we have low confidence in study findings. The larger the bubble, the more studies there are.
While the intervention-outcome matrix is the most common representation of an EGM, different representations can also be useful-for example, showing availability of studies for interventions against global region.
The map may have additional dimensions capturing study or intervention characteristics, such as study design, location and population subgroup which can be applied as filters so as to only show evidence for that filter (e.g., randomized controlled trials [RCTs] or Asia). The map is interactive so that users may click on entries to see a list of studies, and on study names to access the study or a database record for the study.
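The matrix-with-filters structure described above can be sketched in a few lines. The study records, labels and filter values below are hypothetical, purely to show how cell counts and filters relate to the underlying coded data:

```python
from collections import Counter

# Hypothetical coded studies: each record carries its intervention and
# outcome labels plus filterable characteristics (design, region).
studies = [
    {"id": "S1", "intervention": "Health services", "outcome": "Health",
     "design": "RCT", "region": "Asia"},
    {"id": "S2", "intervention": "Addiction support", "outcome": "Health",
     "design": "RCT", "region": "Europe"},
    {"id": "S3", "intervention": "Health services",
     "outcome": "Housing stability",
     "design": "Quasi-experimental", "region": "Asia"},
]

def cell_counts(records, **filters):
    """Count studies per (intervention, outcome) cell, optionally
    restricted to records matching every filter (e.g., design='RCT')."""
    counts = Counter()
    for r in records:
        if all(r.get(k) == v for k, v in filters.items()):
            counts[(r["intervention"], r["outcome"])] += 1
    return counts

all_cells = cell_counts(studies)
rct_cells = cell_counts(studies, design="RCT")
```

Each non-zero cell count corresponds to a bubble in the map; applying a filter simply recomputes the counts over the matching subset of records.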
Just like a systematic review, an EGM can be produced to address any research question. The type of evidence to be included in the map is determined according to the research question. Maps may be produced, for example, of studies of effects, prevalence, implementation or barriers or facilitators. The title of the map should make clear the type of evidence being included. Some research questions may require more scoping to develop the map framework and inclusion criteria.
There are many approaches to evidence mapping (Saran & White, 2018). This guidance presents advice on preparing EGMs to Campbell standards (Campbell Collaboration, 2019). Research teams constructing a Campbell map should also consult the Campbell Collaboration Checklists for the Production and Report of Reviews (White et al., 2018a, 2018b).

| WHY PRODUCE AN EGM?
Mapping is a standard part of the systematic review process, and most reviews will conduct some sort of mapping even if they do not publish a map. Many organisations have produced maps of different types; a review of these may be found in Saran and White (2018).
Maps can be used for various purposes, and the main purpose may vary by type of map. The main uses of EGMs are: • Guide users to available relevant evidence to inform intervention design and implementation. Very often users (both researchers and decision-makers) are unaware of the extent of the evidence base, so maps are a way of increasing the discoverability, and hence use, of that evidence. Country evaluation maps, such as that for Uganda, make available over 500 evaluations, many of which were little used as they were not known and not readily discoverable.
• Identify existing high-quality reviews as a basis for evidence summaries for policy purposes or to populate evidence portals (e.g., the UK Centre for Homelessness Impact's Intervention tool).
• Tell implementing agencies where there is no relevant evidence for their interventions. Agencies committed to evidence-based interventions should be aware when they are operating in an "evidence free" area (or one with only weak evidence) so they can act accordingly (i.e., shift to a different, evidence-based intervention, or collect evidence for the intervention they are supporting). The effective altruism charity, Giving Evidence, uses maps for this purpose with its clients.
• Identify research gaps for new primary research and new synthesis. This enables a strategic, policy-oriented approach for research commissioners to decide what research to commission, and for researchers to decide what research to do. A coordinated research programme would begin with evidence mapping, identify gaps for new primary research, and conduct synthesis on the basis of the primary studies included in the map. The UK Youth Endowment Fund has commissioned a map of interventions to prevent youth crime for this purpose. UNICEF commissioned the map of violence against children based on the results of the child welfare mega-map, which they also supported.
The term outcome is used here to mean any effect from the intervention, which is common usage in maps and reviews. It is not the same usage as in the monitoring and evaluation literature, which distinguishes outputs, outcomes and impact, so that outcomes refer to high-level indicators.
EGMs most commonly have as their primary purpose allowing a more strategic approach to the commissioning of research, by identifying areas for which there are no or few primary studies, or many studies but no systematic reviews. They may also identify areas in which there are many reviews, so that a review of reviews may be appropriate. Research commissioners use this information, along with their own knowledge priorities, to decide what studies to commission.
While informing research is a primary purpose of EGMs, they can also be used for strategy and programme development by identifying relevant evidence. EGMs save strategy and programme developers the time and effort of screening studies from search engines such as Google Scholar, and more effectively and efficiently guide them to relevant evidence for the interventions and outcomes of interest than their own efforts to find these studies.
Where maps are being used as a basis for identifying reviews to be undertaken, maps can improve the efficiency of review production in two ways. First, the included studies for a review are the relevant studies already identified in the map. Second, where the coding from the map is available to the review team, the workload for coding is reduced.
It is particularly useful to identify areas in which there is no evidence. Agencies cannot claim to be doing evidence-based programmes in such areas, and should be using any new or existing programmes to collect evidence of effectiveness.

| TYPOLOGY AND TERMINOLOGY
The studies included in a map may vary according to its scope (i.e., the breadth of sectoral and geographical coverage). The following three names are used by Campbell: • Evidence and gap map: A map of a specific sector or subsector which typically includes both systematic reviews and primary studies. For example, "The impacts of agroforestry on agricultural productivity, ecosystem services, and human well-being in low-and middle-income countries: An evidence and gap map" (Miller et al. 2020) and "Interventions for adults exposed to war and armed conflict: An evidence and gap map" (Farina and Maynard, 2019).
• Mega-map: A map with broader scope covering a large sector or several sectors that includes only systematic reviews and other maps.
For example, "Mega-map of systematic reviews and evidence and gap maps on the effectiveness of interventions to improve child welfare in low-and middle-income countries" (Saran et al., 2019).
• Map of maps: A map with a very broad scope covering many sectors that only includes other maps. For example, "A map of evidence maps relating to low-and middle-income countries" (Phillips et al., 2017).
The name may also be varied to give a clearer indication of the evidence contained in the map. For example, "The country evaluation map of development interventions in Uganda" or "Uganda evaluation map" is a map of all sorts of evaluation (not just effectiveness studies).
F I G U R E 1 Screenshot of a section of the homelessness evidence and gap map
For this to be the case, the map needs to have been sufficiently systematic. Where it has not been based on all components of a systematic search, for example, reference snowballing, then the authors of a map-based review need to carry out these additional components of the search strategy.
Or "The prevalence of violence against children: an evidence map of prevalence studies".
An EGM is one type of synthesis product. There are other types.
There is a trade-off between the depth and breadth of these different types of evidence synthesis product. As shown in Figure 2, the broader the scope (horizontal axis) then the less deep is the analysis (content) of the included studies.
A key distinction is the step between reviews of reviews and EGMs, as the former summarise what the evidence says, whereas the latter show what evidence there is. While reviews may vary in the extent to which they conduct synthesis-some systematic reviews simply summarise each study, so are more akin to an annotated bibliography than a synthesis-they do attempt to summarise what the body of evidence says, which Campbell maps do not.
In addition to the naming conventions presented above, the following terminology is used for Campbell maps: • The framework: The framework for an EGM defines the dimensions of the map: row and column headings and filters. • Labels: The label is the name given to a category or subcategory in the map (e.g., "End of life care" in Figure 1).
• Filters: Filters are study characteristics (e.g., study design, country or region, or subpopulation) which may be applied to the map to show only evidence relevant to those filters.
• Aggregate map: A map showing the data by category level for rows and columns, rather than subcategory level.

| DEVELOPING THE FRAMEWORK
Developing the framework, in particular the primary dimensions, that is, the row and column headings, is the most important, and often the most difficult, part of developing an evidence map. The framework needs to be codable and to resonate with users. So, category headings should be ones which are readily understandable to those coding the map, and which enable users to find studies of specific interventions, or for particular outcomes, where they expect to find them. There is a trade-off between avoiding jargon and using language familiar to sector stakeholders, which may not be readily understandable by lay users; note also that this terminology may vary by country. For example, "contingency management" was the dominant approach to tackling homelessness in the USA for some years. It is not apparent to lay readers what this label means, but it will be to any user working in the sector.
It is also important to avoid overlong labels for the simple reason that they are unlikely to be readable in the visual representation of the map.
In general, the framework should be developed with these principles: • Where there is an existing, widely accepted international typology for either interventions, outcomes or another primary dimension then this typology should be adopted. For example, the disability EGM (Saran et al., 2020) adopted WHO's Community-Based Rehabilitation Matrix for both interventions and outcomes.
• However, the research team should assess the suitability of the typology, and revise it as necessary. For example, the EGM of interventions to reduce violence against children (Pundir et al., 2020) drew upon the widely adopted INSPIRE framework, but modified it to create suitable coding categories.
F I G U R E 2 Types of evidence synthesis. Source: Saran and White (2018)
• Where there is no such international typology, the research team should consult the strategy documents of the main international actors, usually multilateral agencies and global programmes, but also the strategy documents of the agency or agencies funding the map, if any.
• If strategy documents are not available, then it may be necessary to consult project documents of interventions within the scope of the map. A sample should be taken of major funders in the area, including those funding the map.
• Alternatively, a framework can be adopted and adapted from other research studies (maps or reviews). The initial framework used for the map of interventions to improve the welfare of those experiencing or at risk of homelessness was taken from the review by Munthe-Kaas et al. (2018), though subsequently substantively revised on the basis of expert review and stakeholder consultation.
• If no suitable framework is available then the research team can develop their own by drawing on the range of resources listed above.
In all cases, the framework will be subject to modification through stakeholder consultation, the piloting process and the requirements of the commissioner. Annex 1 provides guidance on how to structure stakeholder consultation.
Stakeholder consultation is an important stage in developing the map, in particular consultation with the funder(s). The advisory group for the map should include a broad range of stakeholders (policy, practitioners and researchers). Advising on the framework is one of their key contributions. Stakeholders will expect to recognise the categories used, and in particular to see "their intervention". If the number of categories is small enough for the map to fit on one screen without scrolling, that is an advantage, but it will generally only be possible for the aggregate map. The map may also be printed, but the print will appear small unless there are few rows and columns.
The general principle is "a few large holes" rather than "lots of small holes". Too detailed a disaggregation of either of the primary dimensions makes the map difficult to navigate and coding more challenging (Figure 3), with greater knowledge and judgement required on the part of the coders.

| STAKEHOLDER CONSULTATION
Stakeholder consultation is an important part of producing the map, and its role appears at several places in this guidance. This consultation includes defining the scope, developing the framework, engagement in piloting, identifying sources for the search such as organisational websites, and discussing and promoting use of the map.

| EGM TITLE AND SCOPE
The title of the EGM should define the scope of the map. It is reported in each of the title registration form (TRF), the protocol and the report.
EGMs have a broader scope than most systematic reviews.
Hence the title for an EGM will be broader than that for a review.
Mega-maps and maps of maps have a still broader scope.
The long title provides further information such as the type of studies being included. For example: • An EGM of studies of the effectiveness of interventions to reduce abuse and neglect of children at risk and vulnerable children (aged 0-11) in high-income countries.
• An EGM of studies of the prevalence of violence against women and girls in South Asia.
• An EGM of studies of interventions promoting safe water, sanitation, and hygiene for households, communities, schools and health facilities in low-and middle-income countries (L&MICs; Waddington et al., 2018).
• Understanding pathways between agriculture and nutrition: an EGM of tools, metrics and methods developed and applied in the last 10 years (Sparling et al., 2019).
The short title may be centred on population, outcomes or intervention. For example: • Population-oriented title: Disability EGM
A search should be made for existing and ongoing maps to avoid duplication. It is acceptable that there may be some overlap between maps.
F I G U R E 3 The framework categories should use clear labels and should not be too disaggregated

| EGM PICOS
For EGMs of effectiveness studies, the scope is fully defined by the PICOS, which is reported in the TRF, protocol and report. Variations on PICOS may be used for maps assessing the evidence base for other research questions. For example, in a map of process evaluations examining implementation issues, the columns may be barriers and facilitators rather than outcomes. A map of risk factors could have population groups as rows and risk factors as columns, so that neither interventions nor outcomes appear.

| P: Population
Specify the types of populations to be included and excluded. The description of the included population can be broad, for example, "low and middle-income countries" or "children under 12". As may also be the case for reviews, the population may be implied by the intervention, for example, "interventions to increase school attendance and enrolment".
Population subgroups (e.g., women, children, ethnic minorities, conflict-affected populations and people with disabilities) are usually an important filter. User consultation is helpful in identifying relevant subgroups.
Population may also sometimes identify exclusion criteria if appropriate. For example, an evidence map for homelessness may exclude those made homeless by natural disasters. Maps of education interventions may exclude children with learning disabilities.

| I: Interventions
Interventions are usually one of the two primary dimensions of the evidence map, being used as the row headings. An evidence map should preferably have four to five categories, each with four to five subcategories, giving around 20 rows. Annex 2 gives the example of a framework for a map on interventions to improve the welfare of those experiencing or at risk of homelessness (White et al., 2020). As discussed above, the intervention categories are identified by existing frameworks, examination of strategy and project documents, consultation with the commissioner and other stakeholders, and piloting of the initial framework. Interventions should be clearly defined. They can include policies, programmes and practice. The categories should be fully inclusive of the scope and mutually exclusive: any one intervention should ideally not fall in multiple categories, though it may do so if it has multiple components, which is common.
Developing a typology of interventions is possibly one of the important contributions of the map. To assist coders and users it is helpful to develop a dictionary defining the intervention labels. As the map is produced it is also useful to include examples from the map of a study coded under each label. This exercise can help sharpen the definitions, and also be a stage in data cleaning.

| C: Comparison
In the case of an effectiveness map, the comparison group (e.g., active versus passive) may be used as a filter. Studies which include both active and passive comparison groups (as systematic reviews often will) should be coded under all categories that apply. In the case of an active comparison (i.e., the comparison group also receives an intervention), the comparison intervention is also coded for that study in the map.

| O: Outcomes
Outcomes are one of the two primary dimensions of an evidence map of effectiveness, being used as the column headings. Other examples of column headings are barriers and facilitators in a map of process evaluations and implementation studies (White, Wood et al, 2018) and tools, methods and metrics in a "methods map".
As discussed above, the outcome categories are identified by (a) existing frameworks; (b) examination of strategy and project documents, especially those of the commissioner where appropriate; (c) consultation with the commissioner and other stakeholders; and (d) piloting of the initial framework.

| S: Study designs
EGMs may be presented for any type of evidence, for example, effects, prevalence, and implementation studies. A key question related to the scope of the EGM is "what sort of evidence will be presented?".
The long version of the map title should give an indication of the sort of evidence being included in the map.
The TRF should give a broad indication of the type of studies to be included, for example, "systematic reviews and primary studies of effectiveness". The protocol should spell out in detail eligible and ineligible study designs that are the basis for the inclusion and exclusion criteria to be used in screening. Screening tools may be included as an annex.

| SEARCHING AND SCREENING
The search should be conducted and reported to Campbell standards, as laid out in the Campbell Collaboration Checklist for EGMs (White et al., 2018a, 2018b). Study teams should consult the Campbell Methods Guide "Searching for studies: a guide to information retrieval for Campbell systematic reviews" (Kugley et al., 2017) for further information.
Note that both of the exclusion examples given in the population section above are illustrative, not recommendations. Note also that the framework shown in Annex 2 includes a definition of each label and an example of a study coded with that label from the map; these features are discussed below.
Reference snowballing and citation tracking are an important part of the systematic search strategy. However, since maps usually have hundreds of included studies, there are limits to the feasibility of these approaches. Nonetheless, ensuring that all eligible included studies in the included reviews are in the map (a process referred to as unzipping the reviews) is an important part of the search and screening for a map, and often one of the richest sources of primary studies in a well-reviewed area. It is also recommended to snowball references for the twenty or so most recent studies in the map.
Citation tracking is recommended for 10-20 highly cited studies in the map.
Since a primary purpose of maps is as a basis for consultation to identify research priorities, it is important to include ongoing studies. This is done by searching libraries of reviews (e.g., Campbell, Cochrane, the EPPI-Centre, the Centre for Environmental Evidence, and the 3ie systematic review repository for reviews of studies conducted in L&MICs) and registries (e.g., PROSPERO for systematic reviews; and the American Economic Association RCT Registry, OpenTrials, the Registry for International Development Impact Evaluations for primary studies in international development, and the WHO ICTRP).
Just as for a review, the search strategy should be piloted. An iterative process is usually needed to develop the search strategy.
Campbell maintains a list of useful resources organised by sector, including databases which are recommended for searching: see www.campbellcollaboration.org/links
Since the scope of EGMs is usually quite broad, it is likely that there will be more studies to screen than for a review. A review may typically screen some 2,000-5,000 studies, whereas screening for a map may cover 15,000-30,000 studies or sometimes more. When there are so many studies to be screened, consideration may be given to a machine-learning assisted search. Even then, screening can take 4-6 weeks of full-time work, and in practice likely longer in elapsed time.

| CODING
EGMs require less coding per study than systematic reviews. But a map usually includes more studies than a review, so the workload for coding can be similar.
The coding form should include: • Basic study characteristics, that is, bibliographic details.
• Primary EGM dimensions (e.g., for an effectiveness map, intervention categories and subcategories, and outcome domains and subdomains). A single included study may span more than one category or domain, and this is to be expected in the case of included reviews and maps.
• Secondary EGM dimensions to be included as filters.
• Data required for critical appraisal if included (see below).
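The fields listed above can be sketched as a structured coding record. The field names and example values below are illustrative assumptions for the purpose of this sketch, not a prescribed Campbell coding schema:

```python
from dataclasses import dataclass, field

@dataclass
class CodingRecord:
    """A minimal coding record mirroring the fields listed above.
    All names and allowed values are hypothetical."""
    study_id: str
    citation: str                                       # bibliographic details
    interventions: list = field(default_factory=list)   # may span several
    outcomes: list = field(default_factory=list)        # may span several
    design: str = ""                                    # filter dimension
    region: str = ""                                    # filter dimension
    confidence: str = "not assessed"                    # critical appraisal

# A hypothetical completed record for one included study.
rec = CodingRecord("S1", "Smith (2019)",
                   interventions=["Health services"],
                   outcomes=["Health", "Housing stability"],
                   design="RCT", region="Asia", confidence="medium")
```

Note that the intervention and outcome fields are lists, reflecting that a single study (especially an included review) may span several categories.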
It is generally not recommended to code direction of effect or effect size, since Campbell maps do not portray this information. However, this may be done in the case of a mega-map which includes only systematic reviews. In addition, the evidence summaries that may be prepared for each included study will give information on study contents.
The coding form should be piloted with a small number (20-30) of included studies. The piloting exercise should include the PIs and those who will actually do the coding, if different. Piloting of the framework allows a process of "revise, refine and define": each round of piloting may lead to revisions in the framework, refinements of categories, and clarified definitions of each label. Refinement is usually best done by broadening categories to accommodate interventions not previously identified. It is not good practice to simply add new labels for previously undiscussed interventions, as this can rapidly lead to category proliferation which makes the map unwieldy.
The final version of the coding form should be included in the protocol.

| Data cleaning
As with any data set, the coding data will need to be cleaned once coding is completed. This can be done using usual data-cleaning techniques such as: • Univariate tabulations of all codes to identify uncoded data and out-of-range codes, and to investigate unexpected patterns in the data.
• Examining a random sample of coded records, and systematically checking across all records any code which appears to be incorrectly used by the coders.
It is also useful to ask trusted users to explore the map prior to launch and report any issues they find with the coding.
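The two checks above can be sketched in a few lines. This is a minimal illustration using hypothetical records, field names and code lists, not part of the guidance itself; real coding data would come from the review software's export.

```python
from collections import Counter

# Hypothetical coded records: one dict per included study, with an
# intervention label and a filter variable (illustrative names only).
VALID_REGIONS = {"Africa", "Asia", "Europe", "not reported"}

records = [
    {"id": "S1", "intervention": "Outreach", "region": "Africa"},
    {"id": "S2", "intervention": "Day centres", "region": "Asi"},  # mistyped, out-of-range code
    {"id": "S3", "intervention": None, "region": "Europe"},        # uncoded field
]

def univariate_tabulation(records, field):
    """Tabulate all values of one code, including missing (None) values."""
    return Counter(r.get(field) for r in records)

def range_check(records, field, valid):
    """Return ids of records whose code is missing or outside the code list."""
    return [r["id"] for r in records if r.get(field) not in valid]

print(univariate_tabulation(records, "region"))       # exposes the odd value "Asi"
print(range_check(records, "region", VALID_REGIONS))  # ['S2']
```

The tabulation surfaces unexpected values and missing codes at a glance; the range check lists the specific records that need to be revisited by the coders.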

| Screening and coding included systematic reviews and EGMs
Systematic reviews and EGMs may be coded in one of two ways: • According to their eligibility criteria (PICOS) • According to their included studies The main advantage of the former approach is ease, as screening and coding can be based on the PICOS of the review or map being screened or coded. However, the disadvantage is that a study may be included in the map but not include any studies meeting the inclusion criteria of the map. A common example is an EGM of evidence from low- and middle-income countries (L&MICs). Such a map would include any relevant systematic review with a global scope, which in principle covers studies from developing countries, but which may in fact include no primary studies from L&MICs since none were found.
Experience shows that map users are surprised, disappointed or annoyed when finding a review in the map which does not contain any primary studies that are relevant (i.e., which meet the map's inclusion criteria).
Hence the second approach, that is, coding against included studies, may be better. This means that the included studies in the systematic review need to be retrieved and coded to determine the coding of the systematic review. Of course, these primary studies need to be coded anyway, as they will also be in the map, unless they are excluded on grounds of study design, date of publication, or other grounds. This approach means that reviews eligible on their PICOS but not on their included studies are not included in the map, which will always be the case for empty reviews (i.e., a systematic review which finds no eligible studies). Another possibility is to show on the map reviews which are eligible on their PICOS but include no studies eligible for the map, coded in a different colour to reviews which do have eligible included studies.
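A minimal sketch of this second approach, with hypothetical study identifiers and cell labels: the review's position in the map is the union of the cells of its map-eligible included studies, and a review whose included studies are all ineligible ends up with no cells.

```python
# Map codes of the eligible primary studies: study id -> set of
# (intervention, outcome) cells. All names here are hypothetical.
primary_codes = {
    "P1": {("Outreach", "Housing stability")},
    "P2": {("Day centres", "Health")},
}

def code_review(included_study_ids, primary_codes):
    """Code a systematic review as the union of its eligible studies' cells."""
    cells = set()
    for sid in included_study_ids:
        # Studies not in primary_codes are ineligible and contribute nothing.
        cells |= primary_codes.get(sid, set())
    return cells

review_a = code_review(["P1", "P2"], primary_codes)  # spans two cells
review_b = code_review(["P9"], primary_codes)        # empty review: no cells
print(len(review_a), len(review_b))
```

A review like `review_b` can then either be dropped from the map or displayed in a different colour, as discussed above.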

| CRITICAL APPRAISAL OF INCLUDED STUDIES
Critical appraisal of included studies is recommended for Campbell EGMs but not mandatory. It is also possible to conduct critical appraisal of some studies, for example, systematic reviews, while not doing so for others, for example, primary studies.
The options are: • Critical appraisal of all included studies. This is recommended for Campbell EGMs but not mandatory.
• Conducting critical appraisal of systematic reviews but not of primary studies. The main rationale for this approach is workload related. There are many more primary studies, so critically appraising them can add considerably to the work burden. If critical appraisal of primary studies is not done, then it is recommended that primary study design is coded and included as a filter.
• Not conducting any critical appraisal, which is not recommended since it is relevant to know not only how much evidence there is and what it covers but also the confidence we can have in findings from that evidence.
Where critical appraisal of any included studies in the map is done, the results can be shown either in the colour-coding scheme for the map or as a filter.
There are many critical appraisal checklists available for both primary studies and reviews. See for example the tools of the Critical Appraisal Skills Programme, Lewin et al. (2009) for reviews of effects, Lewin et al. (2015) for qualitative evidence syntheses, Shea et al. (2016), and the list provided by the Cardiff University Specialist Unit for Review Evidence. AMSTAR 2 (Shea et al., 2016) is a commonly used tool for critical appraisal of systematic reviews.
Since an EGM typically includes both reviews and primary studies, and may include multiple study designs (e.g., experimental and nonexperimental), more than one checklist will need to be employed.
The critical appraisal tools should be specified in the protocol.
Tools developed or adapted specifically for the map should be included in the annex.

| EGM PROTOCOL
The protocol for an EGM should include: • A clear indication of the scope of the map by reporting the PICOS, adapted as need be to the type of research question.
• The framework: in the case of a map of effectiveness studies, these are the intervention and outcome categories, as well as any filters. It is preferable to have an annex which includes definitions for all labels.
Annex 2 provides a sample framework with definitions, in which examples are also given of a study in the map coded under each label.
• The search strategy.
• The coding form, including any tool used for critical appraisal.
• An analysis and reporting plan.
It is strongly recommended to pilot the search strategy, the screening tool and the coding form before submission of the protocol. The piloting of the first two provides studies to include to illustrate the items (labels) in the coding tool. The exercise of identifying studies for each label is a useful part of the "revise, refine and define" iterative process in finalising the coding tool.
Authors should consult the Campbell EGM conduct standards checklist for a complete list of what to include in the protocol (White et al., 2018a).
The online interactive map is accompanied by a descriptive report to summarise the evidence for stakeholders such as researchers, research commissioners, policy makers and practitioners. Relevant guidelines on conduct and reporting standards for Campbell EGMs can be found in White et al. (2018a, 2018b). Analysis of studies included in the EGM is descriptive, and the report should capture the following information:
• Description of the EGM methodology.
• Main findings in terms of spread and concentration of evidence across intervention and outcome categories highlighting important evidence gaps and trends identified in the research literature.
• Additional findings from filters such as study design, geographical location (ideally both regions and countries), population, confidence in study findings (assessed through standardised checklists), funding and implementing agency for the included studies.
• Implications for policy and future research and key recommendations.
• A Plain Language Summary highlighting key findings in plain language, without use of any jargon.
The total number of studies in a map is often the main finding, possibly broken down into primary studies and systematic reviews.
This information may be included in the title of the online map (see Figure 1). It is useful to do so because a study may appear in multiple cells, so simply counting bubbles gives an inaccurate impression of the amount of evidence available.
The authors need to decide whether to distinguish studies and papers, that is, there may be several papers from a single study. The Chez Soi study of a Housing First intervention in five Canadian cities accounts for around 20 primary studies in the map of homelessness.
In systematic reviews, it is necessary to identify papers from the same study, especially if meta-analysis is being performed. This is not strictly necessary in the same way for a map, and is more difficult for a map than a review on account of the large number of studies typically included in a map and the fact that the coding is not as detailed. However, differentiating between papers and studies can be useful for users and is recommended especially if it is planned to use the map to generate reviews.
The EGM report should discuss the distribution of evidence and identify the main gaps in the evidence base. Tabulations and cross-tabulations of the primary and secondary dimensions should be presented. The time trend of publication is usually presented.
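These summary analyses can be sketched as follows, using hypothetical data and the Python standard library only: a cross-tabulation of the primary dimensions, a count of unique studies rather than bubbles, and the publication time trend.

```python
from collections import Counter

# Each row is one (study, intervention, outcome, year) coding;
# a single study may occupy several cells (here S1 does).
codings = [
    ("S1", "Outreach", "Housing stability", 2015),
    ("S1", "Outreach", "Health", 2015),
    ("S2", "Day centres", "Health", 2018),
]

# Cross-tabulation of intervention by outcome (bubble size per cell).
cells = Counter((i, o) for _, i, o, _ in codings)

# Bubbles overstate the evidence base: count unique studies instead.
n_bubbles = sum(cells.values())             # 3
n_studies = len({s for s, *_ in codings})   # 2

# Time trend: publication year per unique study.
year_of = {s: y for s, _, _, y in codings}
by_year = Counter(year_of.values())         # {2015: 1, 2018: 1}

print(n_bubbles, n_studies, dict(by_year))
```

The gap between `n_bubbles` and `n_studies` is exactly why the total study count, not the bubble count, should headline the report.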
Care should be taken in how the map is presented. Where the map only shows reviews, not primary studies, the map shows "gaps in evidence synthesis" not "gaps in evidence". If a map has both reviews and primary studies, then where a cell contains a review but no primary studies it is correct to say there is no evidence, since these reviews are empty reviews with respect to that intervention/outcome combination (which we can see as there are no primary studies).
Authors are encouraged to explore engaging and innovative ways of presenting the data. In addition to histograms of the distribution of studies, presentations may include, for example, a global map showing the distribution of studies, possibly using a heat-map approach, and a network diagram of authors or their institutions. Campbell reviews currently use a limited range of bibliometric methods. Evidence maps present an opportunity to borrow more from this closely related field.

| FORMATS
The formats for the title registration form (TRF) are available on the Campbell Systematic Reviews website. The protocol and report formats are embedded in Archie (see next item).

| SOFTWARE
Use of the Archie review management system for the protocol and final report is required for Campbell evidence synthesis products.
The TRF is submitted as a Word file.
A range of options are available to generate maps. EPPI-Reviewer can be used to generate the online map from a JSON file, which can be exported from EPPI-Reviewer. Teams may also apply to 3ie to use their mapping software, or write their own code using packages such as R or Stata.
EPPI reviewer can be used for search, screening and coding, and the data thus used to generate the online evidence map and to produce the tables and figures for the descriptive report.
There are also increased opportunities for using machine learning to produce and update maps. For example, EPPI Reviewer has machine learning functionality for both searching (using the dataset of Microsoft Academic) and screening. Since machine learning requires training data to start it off, the application to updates is clear, and this opens up the path for living maps.
There are also automated data extraction tools used for qualitative data analysis, such as Atlas.ti, but the ability of these to successfully carry out the thematic coding needed for, for example, intervention and outcome categories is still under development.
Currently it is recommended to use any machine learning application in conjunction with human search, screening and coding.

| THE ONLINE EVIDENCE MAP
Campbell partnered with EPPI Reviewer to produce a user interface for mapping, based on the EPPI-Reviewer database. In the example interactive map below, outcomes are used as columns and interventions as rows. The categorisation system for rows and columns (called branches in the EPPI Reviewer coding system) can only be one or two levels deep, and needs to be uniform, with all subcategories having the same or a similar layout. The columns and rows are defined in EPPI-Reviewer as "codes", and the circles represent the quality and quantity of included studies in each cell.
Coded data is then exported from EPPI-Reviewer via a "coding report" in JSON format. The EPPI Centre team responsible for EPPI Reviewer are available to discuss with authors of Campbell EGMs issues in designing the coding structure to achieve the desired visualisation.
The exported data is then imported into EPPI-Mapper (mapping utility) to create an interactive map (see sample map in Figure 4).
The map can be customised and published online. Clicking on any cell opens a pop-up window displaying the list of studies included in that cell. Maps can be filtered depending on the data behind them, and can be displayed in four different styles.
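The export-and-import pipeline can be illustrated as follows. Note that the JSON shape below is a simplified, hypothetical stand-in, not the actual EPPI-Reviewer coding-report schema, which should be taken from the tool's own documentation; the sketch only shows the general idea of turning an exported coding report into per-cell study lists for the map.

```python
import json

# Hypothetical, simplified coding report (the real EPPI-Reviewer
# JSON export schema differs from this illustrative shape).
coding_report = json.loads("""
{
  "studies": [
    {"title": "Study 1", "row": "Outreach",    "column": "Health"},
    {"title": "Study 2", "row": "Outreach",    "column": "Health"},
    {"title": "Study 3", "row": "Day centres", "column": "Housing stability"}
  ]
}
""")

# Group studies into map cells: (row, column) -> list of study titles.
cells = {}
for s in coding_report["studies"]:
    cells.setdefault((s["row"], s["column"]), []).append(s["title"])

# Each cell's list feeds the pop-up window; its length gives the bubble size.
print({cell: len(titles) for cell, titles in cells.items()})
```

In practice EPPI-Mapper performs this step; the sketch is only meant to make the data flow from coding report to cells concrete.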
The reader function in the online map allows users to search for specific types of studies within a data set, which then displays studies with bibliographic information. Users can also locate studies by clicking on any cell.
It is strongly recommended to generate a test map once the pilot studies are coded to ensure that the desired presentations can be generated. Providing this test map to stakeholders for consultation and feedback may identify issues related to coding interventions, outcomes or filters. Viewing the map can have implications for the content of the coding form and how it is structured in the coding software.

| DATABASE RECORDS
All included studies coded in EPPI Reviewer have a record in the underlying Global Evidence Database. This record is automatically generated by EPPI Reviewer from the coding once the map is published.
The record contains the information that has been coded.
Database entry writers may be different to the team performing screening and coding, as entry writing is a more specialised skill.
In addition to that basic information it is highly recommended that the record also contain: • Description of the intervention

• Context
• Study design

• Main findings
It may often prove practical to complete the basic coding needed to publish the map first, and then produce the additional content for the database records.
Planning time and budget for EGMs should allow for this phase.

| UPDATING AND LIVING MAPS
Since maps typically have a broad scope, they will date quickly as new studies appear. Annual updating is recommended, compared to every three years for a systematic review. Using machine learning to support searching and screening, as is possible in EPPI Reviewer, facilitates annual updates and the maintenance of living maps.

A1 | Developing the framework
After the introductory orientation, participants are handed blank cards and asked to write possible interventions and outcomes for the map. Stress one intervention, or one outcome, per card, as participants often list several on one card. Allow 5 min for this. As a variation, after the individual exercise, have participants work in small groups of 3-4 for five more minutes to collectively generate more cards.

All the intervention cards (red) are now piled together. The participants are asked to group these in piles of similar interventions.
They can discard duplicates as they go. This works best with 6-10 people working around a large table on which the cards can be laid out. After grouping, they can see if any of the cards in a group can be combined, with a new label covering all the combined cards. Each group needs a name, which is the category label, with the individual cards as subcategories. Once completed, put the cards up on the wall as the map row headings.
Repeat the exercise for outcomes as the column headings.
If time allows try to fit some studies into the map as in the validation exercise described below.

A2 | Validating the framework
The rationale for the approach described here is that actively engaging participants in the map makes them more likely to critically reflect on it.
After the introductory orientation, show some sample studies and screen them against the eligibility criteria. This is most easily done on title and abstract (T&A) only. First present one or two studies, showing the T&A and explicitly identifying the PICOS elements which determine eligibility. Participants are then asked to screen 3-5 studies. It is best to have provided the T&As and a screening tool in the handout. Include at least one ineligible study. The studies you include may be selected to prompt discussion of specific issues, for example, including old studies or, most usually, not including before-versus-after study designs.
The same is now done for coding. That is, present a coding of a study for intervention and outcomes (using the abstract, while emphasising that in practice one does not code from the abstract alone). Then have the participants code the studies they screened as eligible. Again, the coding form should be in the handout. The feedback from this coding will prompt comments on what goes where.
Finally ask each participant to think of an intervention they work with or are familiar with and to fit that in the coding framework.
Small group work is generally more productive than plenary discussions. So, prior to plenary discussion, ask groups of four to six people to discuss the framework and suggest possible modifications.
Finally, an open discussion on the framework should now take place so that suggestions from specific groups can be aired more generally.

(2006). "A total of 198 clients in two urban sites who had co-occurring disorders and were homeless or unstably housed were randomly assigned to one of two treatment conditions and were followed for three years"

Millburn (2012). "a family-based program can also result in significant reductions in risky behavior… We aimed to reengage youth with their families"

Landlord-tenant mediation: Mediation between landlords and tenants to encourage landlords to accept tenants with a history of homelessness, substance abuse, and so forth, and to address conflicts. This may include, but is not limited to, mediation around arrears, noise and substance abuse, damage to property, eviction, and so forth.

In-kind support: Provision of clothing, hygiene products, household items, and so forth. Example: O'Toole (2015). "Clinic Orientation Arm… additional resources available at the clinic (clothes, hygiene kits, food, and benefits representatives, available to all homeless Veterans regardless of primary care enrollment)…"

Day centres: Centres open only during the day to provide food and services for people experiencing homelessness. This code is used if the day centre itself is being evaluated in the study rather than being the setting for the intervention. Example: Bond (1990). "Outcomes were examined after 1 year for 82 clients, averaging over 17 lifetime psychiatric hospitalizations, randomly assigned either to ACT or to a drop-in (DI) center"

Outreach: Outreach refers to work with people sleeping rough or in temporary or unstable accommodation. Outreach workers go out, including late at night and in the early hours of the morning, to locate people who are rough sleeping, or work with day centres, shelters, and so forth. Example: (2019). "Phase 1 (months 1-3) will address objectives 1 and 2: 1. Develop an intervention using coproduction methods for use in community outreach/ hostel settings"

Temporary accommodation: Temporary accommodation includes a range of housing options, which may include hostels and shelters. The key characteristic is that TA is NOT considered an emergency solution and is embedded in the legal framework. Example: Rashid (2004). "The goals of this study were to (a) assess the outcomes of former foster care youth using transitional living programs and (b) compare outcomes achieved by former foster care youth who participated in an employment training program with similar youth who did not"

Consultation process for this document
The initial draft was produced by the Campbell EGM Working Group in January to March 2018. This working group (who are the named authors) were also sent methodological queries when the guidance was being further developed in summer 2019.
A revised version was subject to the following consultation process:
Stage 0: internal (Campbell secretariat staff) (February 2020)
Stage 1: the working group (March 2020)
Stage 2: knowledgeable friends, that is, producers and commissioners of maps (April to May 2020)
Stage 3: CGs and public (June to July 2020)
Submission of guidance for publication: August 2020