The impacts of agroforestry interventions on agricultural productivity, ecosystem services, and human well-being in low- and middle-income countries: A systematic review.

Background
Agroforestry, the intentional integration of trees or other woody perennials with crops or livestock in production systems, is being widely promoted as a conservation and development tool to help meet the 2030 UN Sustainable Development Goals. Donors, governments, and nongovernmental organizations have invested significant time and resources into developing and promoting agroforestry policies and programs in low- and middle-income countries (LMICs) worldwide. While a large body of literature on the impacts of agroforestry practices in LMICs is available, the social-ecological impacts of agroforestry interventions are less well-studied. This knowledge gap on the effectiveness of agroforestry interventions constrains possibilities for evidence-based policy and investment decisions to advance sustainable development objectives.


Objectives
The primary objective of this Campbell systematic review was to synthesize the available evidence on the impacts of agroforestry interventions in LMICs on agricultural productivity, ecosystem services, and human well-being. The secondary objectives were to identify key pathways through which agroforestry interventions lead to various outcomes and how the interventions affect different sub-groups of the population.


Search Methods
This review is based on a previously created evidence and gap map (EGM) of studies evaluating the impacts of agroforestry practices and interventions on agricultural productivity, ecosystem services, and human well-being. We included published and unpublished literature in the English language covering the period between 2000 and October 20, 2017. We searched six academic databases and 19 organization websites to identify potentially relevant studies. The search was conducted for our EGM in mid-2017, and we did not conduct an additional search for this systematic review.


Selection Criteria
We included randomized controlled trials (RCTs) and quasi-experimental studies assessing the effect of an agroforestry intervention on at least one outcome measure of agricultural productivity, ecosystem services, or human well-being for farmers and their farmland in LMICs. Agroforestry interventions include any program or policy designed to promote and support the adoption or maintenance of agroforestry practices, which include trees on farms, silvopasture, shade-grown crops, and homegardens with trees, among others. Moreover, the studies needed to include a non-agroforestry comparator, such as conventional agriculture or forestry systems or a before-after comparison.


Data Collection and Analysis
We used a standardized data extraction spreadsheet to extract details about each included study. We also used a standardized form to assess risk of bias for each of the included studies in this systematic review (SR). Meta-analysis techniques were used to combine and synthesize effect size estimates for the outcome measures that had sufficient data. We used random-effects models for the meta-analyses and Hedges' g (the difference in means divided by the pooled standard deviation) to report effect size estimates. Outcomes without enough evidence for meta-analysis were discussed narratively.
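As a rough illustration of the effect size and pooling steps described above, the following minimal sketch computes Hedges' g for one study and pools several effect sizes with a random-effects model. The review does not state here which between-study variance estimator was used, so the DerSimonian-Laird estimator below is an assumption, and the function names `hedges_g` and `random_effects_pool` are hypothetical.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with small-sample correction.

    Returns (g, var_g) for a treatment group (1) and comparison group (2).
    """
    df = n1 + n2 - 2
    # Pooled standard deviation across the two groups.
    s_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / s_pooled
    # Small-sample correction factor that turns Cohen's d into Hedges' g.
    j = 1 - 3 / (4 * df - 1)
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j**2 * var_d


def random_effects_pool(effects, variances):
    """Pool effect sizes with DerSimonian-Laird (assumed estimator).

    Returns (pooled_effect, standard_error, tau_squared, q_statistic).
    """
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance to each study.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2, q
```

Each study contributes one `(g, var_g)` pair from `hedges_g`, and the lists of effects and variances are then passed to `random_effects_pool`. Other estimators, such as restricted maximum likelihood, would give somewhat different values of the between-study variance.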


Main Results
We identified 11 studies across nine countries, all of which used quasi-experimental methods. Overall, the quality of the evidence base was assessed as being low. Studies were rated as having high or critical risk of bias if they failed to convincingly address more than one of the main potential sources of bias, namely selection bias, group equivalence, and spillover effects. Given the low number of studies and the high risk of bias of the evidence base, the results of this SR are limited and should be considered a baseline for future work. The results of the meta-analysis for impacts on yields indicated that agroforestry interventions overall may lead to a large, positive impact on yield (Hedges' g = 1.16 [-0.35, 2.67], p = .13), though there was high heterogeneity in the results (I² = 98.99%, τ² = 2.94, Q(df = 4) = 370.7). There were positive yield impacts for soil fertility replenishment practices, including incorporating trees in agricultural fields and improved fallow practices in fields where there are severe soil fertility issues. In other cases, incorporating trees into the production system reduced productivity and took land out of production for conservation benefits. These systems generally used an incentive provision scheme to economically offset the reductions in yields. The result of the meta-analysis on income suggests that agroforestry interventions overall may lead to a small, positive impact on income (Hedges' g = 0.12 [-0.06, 0.30], p = .20), with moderately high heterogeneity in the results (I² = 75.29%, τ² = 0.04, Q(df = 6) = 19.16). In cases where improved yields were reported, there were generally attendant improvements in income. In the cases where payments were provided to offset the potential loss in yields, incomes also generally improved, though there were mixed results for the certification programs and the tenure security permitting scheme.
One program, which study authors suggested may have been poorly targeted, had negative yield impacts. There was not enough comparable evidence to quantitatively synthesize the impacts of agroforestry interventions on nutrition and food security outcomes, though the results indicated that positive or neutral impacts on dietary diversity and food intake were likely. Surprisingly, there was little evidence on the impacts of agroforestry interventions on environmental outcomes, and there was no consistency in the environmental indicator variables used. However, what has been studied indicates that the environmental benefits are being achieved to at least some extent, consistent with the broader literature on agroforestry practices. The evidence base was insufficient to evaluate the interaction between environmental and social impacts. Several studies explicitly considered variable impacts across different population sub-groups, including differential impacts on small-holders versus large-holders, on woman-headed households versus male-headed households, and on richer groups versus poorer groups. Small-holder farmers typically experienced the most positive effect sizes due to the agroforestry interventions. Women and poorer groups had mixed outcomes relative to men and richer households, highlighting the importance of considering these groups in intervention design.
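To help read the pooled yield estimate reported in these results, the sketch below backs out an approximate standard error and two-sided p value from the point estimate and interval, and computes a Q-based I² from the reported Q statistic. It assumes the bracketed interval is a 95% confidence interval under a normal approximation, which this section does not state explicitly; the function names are hypothetical.

```python
import math


def summary_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate SE, z, and two-sided p from an estimate and its CI.

    Assumes a symmetric normal-theory interval (95% by default).
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided normal tail probability via the complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


def i_squared_from_q(q, df):
    """Higgins-Thompson I² (in percent) computed from Cochran's Q."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Reported pooled yield effect: g = 1.16 [-0.35, 2.67], Q(df = 4) = 370.7
se, z, p = summary_from_ci(1.16, -0.35, 2.67)
i2 = i_squared_from_q(370.7, 4)
```

Under these assumptions the recovered p value rounds to the reported .13. The Q-based I² is about 98.9%, close to but not identical with the reported 98.99%, which is expected if the review's software derived I² from the estimated τ² rather than directly from Q.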


Authors' Conclusions
There is limited evidence of the impacts of agroforestry interventions, restricting our ability to draw conclusions on the effect sizes of different intervention types. The existing evidence forms a baseline for future research and highlights the importance of considering equity and socio-economic factors in determining suitable intervention design. Key implications for practice and policy include investing in programs that include pilot programs and funding for project evaluation, and that address key equity issues, such as targeting smallholders, women, and poor and marginalized groups. Funding should also be given to implementing RCTs and more rigorous quasi-experimental impact evaluations of agroforestry interventions over longer time periods to collect robust evidence of the effectiveness of various schemes promoting agroforestry practices.

1 | PLAIN LANGUAGE SUMMARY

1.1 | Limited evidence on agroforestry interventions shows positive impacts on agricultural yield and income

Agroforestry interventions may lead to a large, positive impact on yield, though there is high variation in findings. Agroforestry interventions may also lead to a small, positive impact on income.
There is insufficient evidence on nutrition, food security and environmental outcomes. Equity concerns of agroforestry interventions appeared in many of the studies, with mixed results, indicating that additional consideration of equity in agroforestry interventions is needed.

| What is this review about?
Agroforestry is defined as the integration of trees and woody shrubs in crop and livestock production systems. It is widely promoted as a conservation and development tool to sequester carbon, improve soil fertility, and conserve biodiversity on agricultural lands while generating economic benefits for farmers. Agroforestry is promoted through a range of interventions, including farmer capacity development, provision of tree germplasm, and financial or tenure security provision.
This review examines the evidence on the impacts of any type of agroforestry intervention in low- and middle-income countries (LMICs) on three broad outcomes: agricultural productivity, ecosystem services, and human well-being.

| What are the main findings of this review?
There is a large, positive overall effect of agroforestry interventions on agricultural yields, although there is large variation in the results.
The largest positive impacts of agroforestry on yields are associated with less fertile lands, and negative yield impacts are associated with highly productive lands.
There is a very small, positive overall effect of agroforestry interventions on income. Increased or neutral income effects are associated with either increased yields providing additional income, or incentive payments offsetting the costs associated with decreased yields.
Few impact evaluations considered the impacts of agroforestry interventions on nutrition and food security. Qualitative assessment suggests that agroforestry interventions may lead to positive or neutral nutrition and dietary diversity outcomes and may lead to positive food security outcomes.
Few studies considered the impacts of agroforestry interventions on ecosystem services. However, the effects of agroforestry practices on ecosystem services are well-documented in the broader agroforestry literature.
In areas with limited soil fertility, agroforestry interventions provided technical support through extension and training programmes, and in some cases provided access to tree germplasm, to support farmers to adopt agroforestry practices intended to increase yields and incomes. In higher productivity areas, agroforestry interventions provided incentives (such as payments for ecosystem services (PES), certification schemes, and tenure security) to adopt agroforestry practices intended for conservation that may reduce overall yields.
1.5 | What do the findings of the review mean?
The existing evidence suggests that there may be positive impacts on agricultural yields and incomes as well as food security and ecosystem services, but appropriate intervention design is dependent on local biophysical and socio-economic characteristics.

commitments to ensure its agricultural investments are "climate smart" by 2020 (World Bank, 2016). High-level policy documents in many LMICs now explicitly call for the integration of trees into farming systems (e.g., national policies of Government of India (2014), Republic of Kenya (2014), and Government of Malawi (2011)) and there is growing interest in promoting agroforestry as part of sustainable intensification initiatives that reconcile agricultural production with the provision of other important ecosystem services (FAO, 2013; Pretty, 2018).
A large body of literature on the adoption (Mercer, 2004; Pattanayak et al., 2003) and impacts of agroforestry practices in LMICs is now available. However, systematic understanding of the social-ecological impacts of agroforestry interventions remains missing.
A critical gap exists in knowledge of the on-the-ground effectiveness of interventions promoting the adoption of agroforestry practices in advancing sustainable development priorities. The lack of such knowledge, in turn, hampers the ability of decision-makers to effectively allocate resources relating to agroforestry research, policy, and practice. This systematic review (SR) addresses this need for evidence synthesis, focusing on impact evaluations that assess the effects of agroforestry interventions on agricultural productivity, ecosystem services, and human well-being.

| The intervention
Agroforestry is promoted and supported in a variety of ways.
Previous work has identified six main agroforestry intervention categories:
• Farmer capacity development through training, extension, the provision of other advisory services and technical information, demonstration sites, participatory trials, and other modes of action learning.
• Incentive provision through direct payments to farmers for planting and caring for trees on their farms and the receipt of premiums for particular agricultural commodities, e.g., for shade grown coffee.
• Enhancing access to tree germplasm through the direct provision of tree seedlings/seeds and linking farmers to and/or strengthening the capacity of tree germplasm suppliers.
• Community-level campaigning and advocacy encouraging large numbers of community members to plant trees on their farms and/or pursue specific agroforestry practices.
• Market linkage facilitation for a greater and/or more favorable integration of smallholders into tree-product value chains.
• Policy and institutional change for a more enabling environment that promotes the uptake of agroforestry and/or enables its potential benefits to be better realized.
Agroforestry interventions typically encourage farmers to take up several complementary practices (e.g., planting of longer-term tree species together with short-term shrubs along field contours) to meet multiple social-ecological objectives (Waldron et al., 2017). The establishment of trees incorporated into crop fields or pasture, trees integrated with plantation crops, and improved or rotational fallow are other common examples of promoted practices, which may include the provision of training and material support in setting up of tree nurseries and grafting stock. Strengthening the integration of smallholders into tree-product value chains through, for example, addressing production constraints or promoting more favorable contractual arrangements with buyers, is also increasingly popular.

| How the intervention might work
A simplified and generic theory of change that may underlie an agroforestry intervention (either explicitly or implicitly) is presented in Figure 1. The first required step is successful mobilization and engagement of farmers or landholders, that is, those that would potentially adopt new or expanded agroforestry practices on their land. The second step represents a given intervention, such as farmer capacity development or facilitating access to appropriate tree germplasm.
At least the first and, in many cases, both are required for significant and appropriate adoption of the promoted agroforestry practices and/or tree germplasm. Following such adoption, several intermediary outcomes are then expected. For example, farmers may see improved soil health and other ecosystem services, such as water infiltration, that then increase crop productivity or reduce production costs and, therefore, increase returns. Some participants in the intervention may find that increased use and availability of tree/shrub fodder leads to increases in milk and other livestock production and returns. Selling other agroforestry products, such as timber, firewood, and fruit, is also expected to increase and diversify income and food sources (Mbow et al., 2014b; Sharma et al., 2016; Waldron et al., 2017). These changes may have differential effects depending on gender. Together, these intermediate outcomes are expected to interact to bolster household resilience to shocks, as well as overall household income and food and nutritional security. These positive benefits, and the broader context in which this stylized theory of change is embedded, will then affect further household investment in agroforestry.

| Potential tradeoffs
Our theory of change diagram presents positive pathways linking agroforestry interventions, adoption, and beneficial impacts.
However, agroforestry may also include potentially negative tradeoffs, such as a reduction in area of crop production and negative tree-crop interactions. Though the evidence is mixed (e.g., Blaser et al., 2018), a reduction in productivity may accompany agroforestry practices. Therefore, some interventions promoting new or expanded agroforestry practices may require a mode of compensation for yield losses. Such compensation may come in the form of PES or certification programs that yield higher prices for the crops the farmers produce. These types of interventions may help balance the tradeoff between environmental benefits, like biodiversity conservation, soil and water quality, and carbon sequestration, with agricultural yield and economic ones. Furthermore, while there may be short-term tradeoffs with reductions in yields, this may not stay true in the long term as productive tree crops reach maturity or as soil fertility increases (Garrity et al., 2010; Nair, 1993; Pandey, 2007). The role of climate change further affects potential tradeoffs in agroforestry systems. Such systems may provide, for example, climate change resilience, which may result in productivity advantages during difficult years with extreme weather events.

| Why it is important to do this review
Agroforestry systems and practices are found across LMICs and are viewed as increasingly important for boosting food security, addressing environmental degradation, and contributing to a range of other development policy objectives (Garrity et al., 2010; Waldron et al., 2017). However, financing and implementation of agroforestry and other nonmainstream agricultural approaches remain limited in many contexts (DeLonge et al., 2016; Horlings & Marsden, 2011; IPES-Food, 2016). Instead, high-input, mechanized approaches to agriculture predominate. Over the past half century, these approaches have become conventional, leading to major increases in yields and helping to feed much of the world's population (IAASTD, 2009; Pretty & Bharucha, 2014; The Government Office for Science, 2011). However, these benefits have brought with them sometimes steep social and environmental costs, including biodiversity loss, climate change, land degradation, water pollution, and negative effects on human health (Brawn, 2017; Horrigan et al., 2002; IAASTD, 2009; Intergovernmental Panel on Climate Change, 2015; Maxwell et al., 2016; Pretty & Bharucha, 2014).
Farmers, consumers, and policymakers increasingly recognize these environmental and health costs and seek viable alternatives that can simultaneously address food security concerns while delivering other social and environmental benefits. Agroforestry represents one such alternative, but there is an important need to systematically identify what kinds of interventions and practices have worked to deliver these benefits and understand potential trade-offs involved. Evidence on the effectiveness of agroforestry interventions is therefore needed to inform broader debates and investment decisions relating to sustainable agricultural intensification. We expect that this SR will present a vital resource to inform such discussions, including on expanded measures that account for the multiple values of agroforestry and other agricultural systems (Sukhdev, 2018; Waldron et al., 2017).
This SR uses evidence compiled in a recently published EGM on the impacts of agroforestry practices and interventions on agricultural productivity, ecosystem services, and human well-being. In this mapping exercise, we identified 395 studies on the impacts of agroforestry practices and interventions, including eight impact evaluations of agroforestry interventions and 11 SRs. An extended search identified an additional three impact evaluations included in this SR.
FIGURE 1 Illustrative theory of change for an AF intervention. Figure presented in Miller et al. (2020) and used with permission here. AF, agroforestry.

All the SRs we found on agroforestry study the impact of agroforestry practices, without considering interventions promoting and supporting the adoption of agroforestry leading to social-environmental outcomes. These SRs include Reed et al. (2017) on the impact of trees on food production and livelihoods in the tropics; meta-analyses of agricultural yields with and without trees in West Africa (Bayala et al., 2012; Sileshi et al., 2008); a global meta-analysis of agroforestry impacts on pasture yields (Rivest et al., 2013); global meta-analyses of the carbon sequestration potential of agroforestry (Kim et al., 2016) and on soil carbon storage (Corbeels et al., 2018); biodiversity functions of agroforestry in the tropics (Jezeer et al., 2017; Norgrove & Beck, 2016) and globally (De Beenhouwer et al., 2013); a meta-analysis on the use of trees in agriculture on infiltration capacity (Ilstedt et al., 2007); and a global meta-analysis of the impacts of agroforestry on pest, disease, and weed control (Pumariño et al., 2015).
As detailed below, the current SR includes all LMICs, not just tropical ones, and both direct and indirect effects of agroforestry interventions on a range of outcomes, including multi-dimensional human well-being. We are aware of no SR that summarizes empirical studies on the causal effects of agroforestry interventions in LMICs, particularly outside the context of tightly controlled, research station-based experimental trials.
There are two primary audiences for this SR. First, we expect that researchers on agroforestry and broader sustainability issues will use the results to inform further investigations on these topics, including new empirical research. Results should be of wide interest to researchers in a range of institutions, from CGIAR centers to universities. The second main anticipated audience is decision-makers for whom agroforestry is already or potentially of interest. This includes relevant ministries and programs in governments and donor agencies, as well as NGO and other advocacy and implementing organization staff.

| OBJECTIVES
The overall aim of this SR is to identify and synthesize existing evidence on the effects of interventions that promote the adoption and use of agroforestry practices on agricultural productivity, ecosystem services and human well-being in LMICs.
In this SR, we address the following three research questions:
1) What effects do agroforestry interventions have on agricultural productivity, ecosystem services, and human well-being outcomes?
2) What are the effects of agroforestry interventions on different study population sub-groups?
3) What are the pathways through which agroforestry interventions generate impacts?

4 | METHODS

| Criteria for considering studies for this review
The included studies in this SR were identified based on results from our evidence gap map (EGM) of the impacts of agroforestry on agricultural productivity, ecosystem services, and human well-being in LMICs. We based our EGM, and thus this review, on a previously published protocol (Miller, Ordonez, et al., 2017). Here, we summarize the methods used in that EGM and then present methods used specifically to carry out this SR. We note that the EGM included studies that evaluated the impacts of agroforestry practices, but our SR only considers those studies that evaluated the impacts of agroforestry interventions. The selection criteria for studies in this review are summarized in Table 1 and discussed in detail below.

| Types of studies
This SR includes quantitative impact evaluations using experimental or quasi-experimental designs. Experimental designs use random assignment to treatment and control groups, such as a randomized controlled trial (RCT). Quasi-experimental designs use rigorous statistical methods to adjust for nonrandom assignment between treatment and control groups to make causal inferences. We include the following types of quantitative impact evaluation studies (Snilstveit et al., 2019):
• Studies where participants are randomly assigned to treatment and comparison group (experimental study designs);
• Studies where assignment to treatment and comparison groups is based on other known allocation rules, including a threshold on a continuous variable (regression discontinuity designs) or exogenous geographical variation in the treatment allocation (natural experiments);
• Studies with nonrandom assignment to treatment and comparison group that include pre- and post-test measures of the outcome variables of interest to ensure equity between groups on the baseline measure, and that use appropriate methods to control for selection bias and confounding. Such methods include statistical matching (for example, propensity score matching (PSM), or covariate matching), regression adjustment (for example, difference-in-differences, fixed effects regression, single difference regression analysis, instrumental variables (IVs), endogenous switching regression, and "Heckman" selection models);
• Studies with nonrandom assignment to treatment and comparison group that include post-test measures of the outcome variables of interest only and use appropriate methods to control for selection bias and confounding, as above.
Ideally, studies would have included both baseline and postintervention data. However, given the small number of studies meeting this criterion, we include studies with only post-intervention outcome data as long as they use some method to control for selection bias and potential confounding factors.
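To make the value of baseline data concrete, the sketch below illustrates the simplest form of one listed design, difference-in-differences, which compares the before-after change among participants with the change among comparators. The function name and all numbers are hypothetical illustrations, not data from any included study, and real evaluations would typically estimate this in a regression with covariates and appropriate standard errors.

```python
from statistics import mean


def difference_in_differences(treat_pre, treat_post, comp_pre, comp_post):
    """Change in the treated group minus change in the comparison group."""
    treated_change = mean(treat_post) - mean(treat_pre)
    comparison_change = mean(comp_post) - mean(comp_pre)
    return treated_change - comparison_change


# Hypothetical crop yields (t/ha) for illustration only.
treat_pre = [1.0, 1.2, 0.8]    # participants, before the intervention
treat_post = [1.6, 1.8, 1.4]   # participants, after the intervention
comp_pre = [1.1, 0.9, 1.0]     # comparators, before
comp_post = [1.3, 1.1, 1.2]    # comparators, after

did = difference_in_differences(treat_pre, treat_post, comp_pre, comp_post)
```

With only post-intervention data, the two "pre" terms are unavailable, so any baseline difference between groups has to be handled by other means such as matching or selection models.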
We excluded theoretical or modeling studies (unless they include a relevant empirical example with design that meets inclusion criteria), editorials and commentaries, and field trials that were not part of a specific intervention.

| Types of participants
The population of interest was farms and those that live and farm on them in LMICs using a system that falls within the definition of agroforestry.

| Types of interventions
From a policy perspective, it is especially useful to know what kinds of interventions might most effectively promote agroforestry practices to yield desired social-ecological outcomes. This SR focuses on the types of interventions summarized in Table 2.
The promotion of agroforestry includes a wide range of specific practices that fall under what is generally considered as agroforestry.
Here, we consider "agroforestry" to be defined as "a collective name for land-use systems and technologies where woody perennials (trees, shrubs, palms, bamboos, etc.) are deliberately used on the same land-management units as agricultural crops and/or animals, in some form of spatial arrangement or temporal sequence" (Nair, 1993). To capture the wide diversity of practices that might fall under this definition and present them in a coherent way, we subdivided agroforestry into the practice types listed in Table 3. This set of practice types is based on the classification system proposed by Nair (1985, 1993) and updated by Sinclair (1999), Torquebiau (2000), and Atangana et al. (2014).
To identify the effect of an intervention or practice, a study needs to include both adopters or those exposed to an agroforestry intervention and comparators. A comparator is defined as a farm or household that does not adopt a given practice identified in Table 3, or is not exposed to a specific agroforestry intervention. Specifically, eligible comparisons included a land or household where agroforestry was not practiced but another land use was in place (e.g., agriculture, primary forest, or secondary forest/forest plantation).
For observational studies, a farm or household before adopting a given agroforestry practice was also an eligible comparator.

| Types of outcome measures
This SR focuses on three broad outcome categories: (1) agricultural productivity, (2) ecosystem services, and (3) human well-being. Studies that focused exclusively on the adoption of a particular agroforestry technique or species without reference to impact were excluded. We did not specify a minimum or maximum duration of follow-up for study inclusion. All types of agricultural production settings in LMICs were considered relevant.
Importantly, we excluded studies that evaluated the impact of agroforestry interventions on adoption only without estimating the impacts on any measure of our three broad outcome categories. We believe these outcomes are most of interest to policymakers considering agroforestry interventions. Additionally, while the adoption of agroforestry due to an intervention may be an important indicator for the longer-term effects of the intervention given the larger body of evidence for the impacts of agroforestry practices, the realized outcomes are highly uncertain. Agroforestry practices are highly diverse, there is high variability in the long-term outcomes, and the impacts are context-specific. One practice may lead to very different outcomes in different contexts (Friggens et al., 2020; Martin et al., 2020). Furthermore, adoption alone, especially when only considering early adoption, says very little about the effectiveness of an intervention. Agroforestry impacts depend on tree survival, farmer commitment to maintaining the practice, and the biophysical features of the land.
Therefore, this review focuses on the studies measuring outcomes beyond adoption. Future research may examine adoption-only studies.

| Search methods for identification of studies
The search and screening process was conducted for the agroforestry EGM by Miller et al. (2020), based on a previously published protocol for the EGM (Miller, Ordonez, et al., 2017).

| Data collection and analysis
The online literature review and reference management software, EPPI-Reviewer 4, was used to upload relevant titles and abstracts for candidate studies identified through the search strategy for our EGM. The EGM specifically marked whether a study considered an agroforestry intervention (versus only a practice) and if it used experimental or quasi-experimental methods. These impact evaluations of interventions comprise the evidence base for this SR. The information for each study included in this SR was extracted into a data extraction matrix in Excel. Our data extraction matrix was adapted from the data extraction matrix used in Snilstveit et al. (2019). The data we extracted included bibliographic information, study design, context, intervention information, process and implementation, cost, external validity, outcome information, and outcome data to be used in the meta-analyses. The data extraction matrix is presented in Supporting Information Appendix 2.

| Selection of studies
For our EGM, we imported the records from academic databases into our data management software (EPPI-Reviewer 4), and we used the built-in tool to aid in removing duplicates. The grey literature was imported into and managed in Microsoft Excel due to reference format incompatibility with EPPI-Reviewer 4. We screened the records at the title and abstract level, excluding studies which did not meet our criteria for study country, publication year, study type, and relevant agroforestry practice or intervention.

TABLE 2 Classification of interventions to promote agroforestry

Intervention type: Description and examples
• Farmer capacity development: Efforts focus on enhancing farmer knowledge and/or skills relevant to agroforestry practice, for example, setting up and managing tree nurseries; tree planting and management techniques; and seed collection and propagation. Such interventions can involve the provision of training, extension and other advisory services, and specific technical information, as well as the setting up of demonstration sites, running of participatory trials and other modes of participatory action learning.
• Enhancing access to tree germplasm: Efforts to facilitate farmer access to quality and desired tree/shrub seedlings/seeds required to pursue prioritized agroforestry practices. Such interventions often entail the direct provision of seedlings/seeds to farmers but can also involve linking farmers to relevant suppliers and/or enhancing the ability of existing or new suppliers to supply participating farmers with quality and desired tree germplasm.
• Community-level campaigning and advocacy: Interventions of this type can also involve the provision of information about the benefits of trees and agroforestry and/or the provision of tree seedlings/seeds but are distinct from the first two types. The main objective is to motivate, including through social pressure, community members to plant trees on their farms and/or pursue specific agroforestry practices. Campaigning and advocacy may be done through radio and/or community meetings, speeches, and drama and may involve a mass community effort to plant trees, for example, on a specific day of the year.
• Incentive provision: Interventions of this type seek to motivate farmers to plant trees and practice agroforestry through the provision of incentives. Examples include paying farmers for planting and caring for trees on their farms in exchange for desired ecosystem services (e.g., carbon sequestration) and buyers offering premiums to farmers for agricultural commodities produced under certain conditions (e.g., via certification schemes for products such as shade grown organic coffee).
• Market linkage facilitation: Interventions of this type focus on efforts to enhance potential returns from agroforestry to encourage adoption. This could be through linking producers to and/or brokering new and/or improving existing contractual arrangements with buyers. Other examples include the collective marketing of agroforestry products and/or interventions to stimulate demand for a given agroforestry product, for example, Baobab fruit.
• Institutional and policy change: Interventions of this type involve reforming and/or putting in place new policies, laws, regulations, and institutions more broadly to facilitate greater uptake of and benefits from agroforestry. Such efforts are designed to address existing policy and institutional constraints such as, for example, prevailing forestry regulations, designed for forest management areas, that may frustrate smallholder efforts to grow particular high-return tree species, or insecure land tenure that may similarly deter long-term investments in tree planting.

TABLE 3 Classification of agroforestry systems and specific practices

Agroforestry system, then specific practice: definition

Agrisilvicultural (crops and trees)
• Improved or rotational fallow: Land resting system using trees and shrubs to replenish soil fertility, sometimes in rotation with crops as in traditional shifting cultivation.
• Multipurpose trees on parklands or lots (mixed trees and crops): Scattered trees in parklands (landscapes derived from agricultural activities) or other land area or in systematic patterns on bunds, terraces or plot/field boundaries.
• Mixture of plantation crops: Combination of plantation crops in an intercropping system in alternate arrangement, including use of shade trees for cash crops.
• Tree gardens: Cultivation of a mixture of several fruit and other useful trees, sometimes with the inclusion of annual crops. This arrangement is sometimes referred to as homegardens.
• Alley cropping: Planting rows of trees with a companion crop grown in the alleyways between the rows.
• Shelterbelts: Extended windbreak of living trees and shrubs established and maintained to protect farmlands (beyond a single farm).

Silvopastoral (pasture/animals and trees)
• Multipurpose fodder trees or shrubs around farmlands (protein bank): Production of protein-rich tree fodder on farm/rangelands.
• Living fences and shelterbelts: Trees as fences around plots and/or an extended windbreak of living trees and shrubs established and maintained to protect farmlands and provide fodder.
• Integrated production of animal/dairy and wood products: Production of animal/dairy and wood products within the same land area.
• Trees/shrubs on pasture: Trees scattered irregularly or arranged according to some systematic pattern.

Agrosilvopastoral (crops, pasture/animals and trees)
• Integrated production of animals (meat and dairy), crops, and wood/fuelwood: Production of crops, animal/dairy and wood products within the same land area, including around homesteads.
• Woody hedgerows for browse, green manure, soil conservation: Multipurpose woody hedgerows for browse, mulch, green manure, soil conservation, and so forth.
• Wooded pasture products: Land covered with grasses and other herbaceous species, and with woody species.

Agroforestry including insects/fish
• Entomoforestry: The combination of trees and insects (e.g., bees for honey and trees).
• Aqua-silvo-fishery: Trees lining fish ponds, tree leaves being used as "forage" for fish.
The review process involved 14 reviewers. All the reviewers were trained by the project leads (Miller and Baylis) and research coordinators (Ordoñez and Castle). We first reviewed search results at the level of title and abstract to determine inclusion or exclusion.
The title and abstract stage of the review process included 11 reviewers. To ensure inter-rater reliability, each reviewer was given two samples of 30 studies for classification (60 studies in total).
Results from one of the lead researchers were used as the standard for classification, and a kappa statistic was used as a measure of agreement between reviewers (Cohen, 1960).
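To illustrate the agreement measure, the following is a minimal sketch of Cohen's kappa for one reviewer compared against the lead researcher's standard. The function name, the "include"/"exclude" labels, and the example data are hypothetical; the review does not report the software or the labels it used for this step.

```python
from collections import Counter


def cohens_kappa(standard, reviewer):
    """Cohen's kappa for two raters who classified the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed share of
    agreement and p_e is the agreement expected by chance given each
    rater's marginal label frequencies.
    """
    if len(standard) != len(reviewer) or not standard:
        raise ValueError("both raters must classify the same non-empty set of items")
    n = len(standard)
    p_o = sum(a == b for a, b in zip(standard, reviewer)) / n
    freq_a = Counter(standard)
    freq_b = Counter(reviewer)
    labels = set(freq_a) | set(freq_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in labels) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical screening decisions for one sample of 30 records.
standard = ["include"] * 10 + ["exclude"] * 20
reviewer = ["include"] * 8 + ["exclude"] * 2 + ["exclude"] * 18 + ["include"] * 2
kappa = cohens_kappa(standard, reviewer)
```

In this made-up sample the two raters agree on 26 of 30 records, and kappa discounts the share of that agreement that would be expected by chance alone.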

| Data extraction and management
We conducted data extraction that expanded on the related EGM for the included studies in this SR. We used a standardized data extraction form, including a codebook describing the scope of each question on which we sought data, to compile descriptive data from all studies meeting our inclusion criteria (Supporting Information Appendix 2). We extracted the following types of information: • Bibliographic information.
• Study design and basic information, including information on the intervention type, funder, implementing agency, intervention objectives, location, details on the program, target groups, number of participants, duration, follow up, practices promoted, sample size of treatment and control group, comparators, and equity focus groups.
• Process and implementation, including information on program uptake and adherence, implementation fidelity and service delivery quality, and other process factors.
• External validity measures, including length of study, conditions of trial, independence of evaluation, conflicts of interest, and use of theory.
• Outcome measures, including types of outcomes evaluated, indicator variables, equity groups examined, sample size, effect size data, and standard deviation.
• Mechanisms, including any stated mechanisms linking the intervention to the outcome.
One person (Castle) undertook this descriptive data extraction and a senior reviewer checked it for agreement. Two reviewers (Castle and Ordoñez) undertook the effect size data extraction to check for consistency.

| Assessment of risk of bias in included studies
We assessed the risk of bias by coding "Yes," "No," and "Unclear" for each of the following criteria:
1. Mechanism of assignment: was the allocation or identification mechanism able to control for selection bias?
Again following Snilstveit et al. (2019), we used the results of the risk of bias assessments to produce an overall rating for each study as low, medium, high or critical risk of bias. We used the following decision rules to come to this decision:
• If all questions are answered "yes," studies are assigned a low risk of bias rating.
• If studies score "yes" for selection, group equivalence, and spillovers, but "no" or "unclear" for other domains, they are assigned a medium risk of bias rating. If they score "yes" for two out of three of the categories selection, group equivalence, and spillovers, and "unclear" for another, we assign a medium risk of bias rating.
• If studies score "no" for any one of selection, group equivalence, or spillovers, they are assigned a high risk of bias rating. For studies unclear on two or more of the three key categories (selection, group equivalence, or spillovers) but that attempted matching or matching with regression, we give a high risk of bias rating.
• If studies score "no" for more than one of the selection, group equivalence, or spillover questions, the study is assigned a critical risk of bias rating.
• Otherwise, we take an unclear rating as "no."
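The decision rules above can be sketched as a small function. This is one possible reading, not the authors' procedure: the domain keys, the `used_matching` flag, and the order in which overlapping rules are applied (a single "no" on a key domain is rated high before any "unclear" is reinterpreted as "no") are all assumptions made for illustration.

```python
# Hypothetical keys for the three key domains named in the decision rules.
KEY_DOMAINS = ("selection", "group_equivalence", "spillovers")


def overall_risk_of_bias(answers, used_matching=False):
    """Map per-domain answers ("yes" / "no" / "unclear") to an overall rating.

    `answers` must contain the three key domains and may contain others.
    `used_matching` flags studies that attempted matching or matching
    with regression.
    """
    if all(value == "yes" for value in answers.values()):
        return "low"
    key = [answers[domain] for domain in KEY_DOMAINS]
    n_no = key.count("no")
    n_unclear = key.count("unclear")
    if n_no > 1:
        return "critical"
    if n_no == 1:
        return "high"
    if n_unclear <= 1:
        # All key domains "yes" (problems only elsewhere), or two "yes" and one "unclear".
        return "medium"
    if used_matching:
        return "high"
    # Otherwise "unclear" counts as "no", giving more than one "no" on key domains.
    return "critical"


example = overall_risk_of_bias(
    {"selection": "unclear", "group_equivalence": "unclear", "spillovers": "yes"},
    used_matching=True,
)
```

Writing the rules out this way makes their overlaps visible. For instance, a study with one "no" and one "unclear" on the key domains could be rated high or critical depending on which rule is applied first.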

| Measures of treatment effect
The statistical evidence in the studies was extracted with the intention of comparing the estimated effects of interventions on outcomes. Two reviewers independently extracted the data from a random sample of studies to ensure consistency and resolve any inconsistencies.
We extracted outcome data from each study and calculated standardized effect sizes, standard errors, and confidence intervals.
For continuous outcomes, we calculated standardized mean difference (SMD) effect sizes using Hedges' g, the sample-size-corrected SMD. To adjust for the small positive bias in the uncorrected standardized mean difference, d, we use the following equation to obtain the unbiased Hedges' g (Borenstein et al., 2009b; Ellis, 2010; Hedges & Olkin, 1985):

$$g = J \times d$$

where J is the correction factor

$$J = 1 - \frac{3}{4\,df - 1}$$

and df is the degrees of freedom used to estimate the pooled standard deviation (n_1 + n_2 − 2 for two independent groups of sizes n_1 and n_2).
To calculate the effect size, d, we use one of the following formulas, based on the outcome reporting provided by each study (Borenstein et al., 2009b).
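For studies reporting mean difference and standard deviations, we calculate the effect size according to the following formula:

$$d = \frac{\bar{X}_1 - \bar{X}_2}{S_{within}}$$

where the pooled standard deviation is

$$S_{within} = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}}$$

and \bar{X}_1, \bar{X}_2, S_1, S_2, n_1, and n_2 are the means, standard deviations, and sample sizes of the treatment and control groups.
For studies reporting a regression coefficient, β, between continuous variables, we convert the regression coefficient to a standardized mean difference, g, using the following equation:

$$g = \frac{\beta}{SD}$$

where the standard deviation, SD, is

$$SD = SE \times \sqrt{N}$$

where SE is the standard error, calculated as the regression coefficient divided by the t value for the coefficient, and N is the total sample size in the regression model. And the standard error is

$$SE_g = \sqrt{\frac{n_1 + n_2}{n_1 n_2} + \frac{g^2}{2(n_1 + n_2)}}$$

The variance is

$$V_g = SE_g^2$$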
For studies reporting a correlation coefficient between two continuous variables, the correlation coefficient, r, serves as the effect size index. We can convert r to a standardized mean difference, g, using the following equation:

$$g = \frac{2r}{\sqrt{1 - r^2}}$$

And the variance is

$$V_g = \frac{4V_r}{(1 - r^2)^3}$$

where V_r is the variance of r.
Hedges' g is typically interpreted using the following approximations, though the meaning of small, medium, and large depends on the context (Cohen, 1977):
• Small effect (cannot be discerned by the naked eye) = 0.2
• Medium effect = 0.5
• Large effect (can be seen by the naked eye) = 0.8
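As a worked illustration of the mean-difference case, the sketch below computes Hedges' g, its variance, and a 95% confidence interval from group means, standard deviations, and sample sizes, using the standard textbook formulas (Borenstein et al., 2009b). The function name and the input values are hypothetical and are not taken from any included study.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Hedges' g, its variance, and a normal-approximation 95% CI."""
    df = n_t + n_c - 2
    # Pooled within-group standard deviation.
    s_within = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / s_within
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    # Small-sample correction factor J.
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    var_g = j ** 2 * var_d
    se_g = math.sqrt(var_g)
    return g, var_g, (g - 1.96 * se_g, g + 1.96 * se_g)


# Hypothetical treatment and control summaries.
g, var_g, ci = hedges_g(mean_t=103.0, mean_c=100.0, sd_t=5.5, sd_c=4.5, n_t=50, n_c=50)
```

With 50 observations per group the correction factor is close to 1, so g differs only slightly from d. The correction matters more for the small samples that are common in farm-level surveys.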

| Unit of analysis issues
Several of our included studies reported dependent effect sizes, but only effect sizes that are statistically independent should be used in meta-analysis. Specifically, three studies reported effect sizes for multiple treatment groups within the same study (including Thorlakson & Neufeldt, 2012).
Additionally, one study used panel data and presented results for multiple outcomes at multiple time points (Sills & Caviglia-Harris, 2015). To deal with effect size dependence due to multiple comparisons, multiple outcomes, or multiple time points within a study, we followed the methods presented in Borenstein et al. (2009a, 2009c). We compute a summary effect for the different treatment groups that use the same control group reported within a study to use in our meta-analyses. Since each study is represented by one summary effect size in the meta-analysis, we avoid giving more weight to studies reporting more treatment groups. The variance of the summary effect size incorporates the correlation among the treatment groups.
The summary effect size is computed as the mean of the effect sizes for each treatment group:

$$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$$

where n is the number of treatment groups, and Y_i is the effect size for treatment group i = 1, ..., n.
The summary effect size variance is:

$$V_{\bar{Y}} = \left(\frac{1}{n}\right)^2 \left(\sum_{i=1}^{n} V_i + \sum_{i \neq k} r_{ik}\sqrt{V_i}\sqrt{V_k}\right)$$

where V_i is the variance for each treatment group effect, V_k is the variance for every other treatment group effect size, and r_ik is the correlation between each pair of treatment group effect sizes (r_ik = 0.5 when considering multiple treatments using the same control group).
When a study reported multiple outcome measures for a single outcome type in our analysis, we selected the single outcome measure that was most similar to the measures used in the other studies included in our meta-analysis, as we describe in Section 4.3.9.

| Dealing with missing data
One of the included studies did not provide sufficient data to calculate effect sizes. We contacted the study's corresponding author about the missing or incomplete data, and the author provided the needed information.

| Assessment of heterogeneity
We used a random-effects model and report three heterogeneity statistics: the I² statistic, which assesses the percentage of variability in the estimates due to heterogeneity; τ², the variation in the observed effects; and the Q statistic, which captures the difference between the observed effects and the fixed-effects model estimate (Higgins et al., 2020). The forest plots also provide a graphical view of the heterogeneity.
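The sketch below shows how these three statistics relate to one another. It uses the closed-form DerSimonian-Laird estimator of τ², chosen here only because it can be written in a few lines. This is a substitution for illustration: the review used the metafor package in R, whose default estimator is REML, so values from this sketch would not necessarily match the reported ones. The function name and input data are hypothetical.

```python
import math


def random_effects_dl(effects, variances):
    """Q, I^2 (percent), tau^2, and the random-effects pooled estimate (DerSimonian-Laird)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Q: weighted squared deviations of observed effects from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each study's sampling variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"Q": q, "I2": i2, "tau2": tau2, "pooled": pooled, "se": se}


# Two hypothetical studies with equal precision but different effects.
result = random_effects_dl([0.0, 1.0], [0.1, 0.1])
```

In this toy example most of the observed variation exceeds what sampling error alone would produce, so I² is high and the random-effects standard error is noticeably larger than the fixed-effect one.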

| Assessment of reporting biases
We did not explicitly assess reporting biases due to the small number of studies included in our review. If we had included at least 10 studies in a meta-analysis, we would have reported funnel plots with tests for funnel plot asymmetry (Higgins et al., 2020).

| Data synthesis
We synthesized the study results using conventional meta-analysis (Borenstein et al., 2009b) where data permitted, and narrative synthesis for all studies meeting our inclusion criteria. For synthesis purposes, we looked at impacts of agroforestry on agricultural productivity (yield), income, human nutrition and well-being, and environmental outcomes. We attempted to reduce publication bias by searching for and including unpublished studies and grey literature in the review.
We used the "metafor" package in R to conduct the meta-analyses and create forest plots with effect sizes from each included study. We used the random-effects model in the metafor package since we expect that the true effect will vary from study to study (Borenstein et al., 2009b; Viechtbauer, 2010). Where a study presented both propensity score matching (PSM) and instrumental variable (IV) results separately, we used the PSM results in the meta-analysis. Some studies also used more than one method as part of their analysis, such as IV with endogenous switching regression. We assessed heterogeneity in the effect size results graphically by presenting the effect size distribution on forest plots for the meta-analyses.
We decided to perform meta-analysis when there were three or more studies presenting comparable indicators for a given outcome.
For the outcomes that do not have enough data to perform meta-analysis or where outcome measures were not comparable, we provide a narrative discussion of the trends in size and direction of effect sizes from the studies as well as a discussion of the mechanisms suggested to link the intervention with the outcome. The narrative discussion also highlights difficulties in measuring the outcomes and methods used as well as how comparable the evidence is across the included studies.
The risk of bias assessment was used to determine the overall quality of the evidence base. A study with a "low" or "medium" risk of bias rating would be considered higher quality evidence in terms of controlling sources of bias than a study with a "high" or "critical" rating. We did not use the risk of bias in the meta-analyses and did not restrict our analyses based on risk of bias rating since all of the studies we identified were rated as a high or critical risk of bias. We discuss throughout this report the importance of considering the low overall quality of the evidence base when interpreting the results, and we highlight these points in our discussion. Our primary conclusions include the overall limited evidence base on agroforestry interventions, which is in part drawn from our risk of bias assessments. The results are presented in a table, and we discuss the assessment narratively.

| Subgroup analysis and investigation of heterogeneity
We planned to conduct qualitative investigation of subgroup heterogeneity across multiple dimensions of equity, specifically gender, race/ethnicity, socio-economic status, and literacy/education level. In our subgroup analysis, we only reported on gender and socioeconomic status due to a lack of studies reporting on race/ethnicity or literacy/education level.
In our data extraction, we noted any reference to equity in the included studies. Equity focus is defined as the extent to which an intervention or analysis focuses on specific disadvantaged populations. We aimed to identify how and to what extent the included studies considered equity in their approach. We used the PROGRESS framework (O'Neill et al., 2014) to consider potentially disadvantaged groups in the included studies. Key dimensions of equity that we considered were gender, race/ethnicity, socioeconomic level, and literacy/educational level. We assessed the extent to which each study addresses equity by describing any intervention focus on specific social groups, examining equity as an outcome, or reporting on differential impacts across sub-populations.

| Sensitivity analysis
Due to the limited studies in our review and overall high risk of bias present in the included studies, we were unable to conduct any sensitivity analyses to assess the effects of risk of bias on the treatment effect estimates.

| Treatment of qualitative research
We did not include qualitative research; however, we would have included qualitative studies in our review had they met our inclusion criteria (Miller, Ordonez, et al., 2017). We did not identify any qualitative experimental or quasi-experimental impact evaluations of any agroforestry interventions.
All included studies used quasi-experimental methods, with none using an experimental design. Each article reports on one study with multiple components (treatment groups or outcome variables). Table 4 presents descriptive information on the 11 studies, and the full data extraction matrix is provided in Supporting Information Appendix 2.

| Included studies
Included studies examined all six of the intervention types described above (Table 2). Nearly all the agroforestry practices promoted in the intervention studies were agrisilvicultural (n = 7, 64%) or agrosilvopastoral (n = 3, 27%), with one study evaluating a silvopastoral intervention (9%). Table 4 shows the specific practices promoted; the two most frequently promoted were trees integrated in crop fields (n = 5, 45%) and integrated production of animals, crops, and wood (n = 3, 27%).
Ecosystem services was the least frequently evaluated outcome category (n = 4, 36%), and human well-being was the most frequent (n = 10, 91%). Agricultural productivity was evaluated in over half of the studies (n = 6, 55%). Among specific outcomes, income and household expenditure was the most common (n = 8, 73%), followed by agricultural productivity (n = 6, 55%). Considering combinations of interventions and outcomes, the most studied linkages were incentive provision and farmer capacity development with human well-being and agricultural productivity outcomes.
These impact evaluation studies were published from 2005 to 2017, with 2016 and 2017 the only years in which more than one study appeared. In some years, no impact evaluation studies on this topic were published. Nine of the included intervention studies were published in peer-reviewed journals; the other two were organization reports.
The intervention studies included in this EGM are distributed across several tropical LMICs (Figure 3), with Sub-Saharan Africa having the most countries with a study (n = 6, 55%).

1 We note that the total percentage here, and at different points throughout this report, can be more than 100% as a given study could include more than one intervention and outcome. Percentage = # of studies meeting criterion/total number of studies (n = 11).

"ESI Generation" is the before/after change in "environmental services index" (ESI), which aggregates indices of the biodiversity conservation and carbon sequestration services associated with specific silvopastoral practices.
Description: This study evaluated the Hutan Kemasyarakatan social forestry (HKm) permitting program in Indonesia. The HKm program provides groups of farmers with secure-tenure permits to continue farming on state Protection Forest land in Sumatra in exchange for protecting remaining natural forestland, planting multi-strata agroforests, and using recommended soil and water conservation (SWC) measures on their coffee plantations. They measured impact on land value, investment in soil and water conservation practices and soil fertility management, and profitability.
9 Place et al.
Description: This study evaluates the impacts of an ICRAF program promoting agroforestry techniques to help subsistence farmers reduce their vulnerability to climate change in Kenya. They use propensity score matching to compare farmers engaged in the agroforestry project with a control group of neighboring farmers to estimate the effects on farm productivity, off-farm incomes, wealth, and the environmental conditions of their farms.
a When more than one intervention type is listed, this indicates that there were multiple components to the intervention recorded.

| Excluded studies
From the agroforestry EGM, 388 studies were excluded from this SR because the study design did not meet the requirements for inclusion, namely they did not consider an intervention (as opposed to a practice only) or did not use experimental or quasi-experimental evaluation methods. From the additional search conducted after the EGM, 157 studies were excluded at the full-text level for the same reasons, that is, no intervention or not using experimental or quasi-experimental methods.

| Risk of bias in included studies
The results of the risk of bias assessment are summarized in Figure 4.
The overall risk of bias assessment results are shown in Figure 5. While few studies explicitly discussed any form of bias, we found little risk of performance bias and outcome measurement bias.
Analysis in all included studies was based on farm or household survey data that were consistently applied across treatment and control groups and there was little concern about risk of performance bias between the groups. Only two studies were coded as unclear for risk of performance bias. We found two studies that were unclear if they were at risk for outcome measurement bias; otherwise, we found no evidence of outcome measurement bias since there were no incentives for the groups or the enumerators to exaggerate their responses and the surveys were consistently implemented and timed.
Most of the studies did not explicitly discuss potential spillover effects. All of the studies selected treatment and control groups from the same population, not geographically separated.
Studies with low risk of spillover bias clearly defined the cutoff between treatment and control groups, for example, by setting a minimum number of years that farmers had to have adopted improved fallow for effects to be observable. Other studies had a high risk of spillover bias since they examined the impacts of directly receiving extension advice, which could easily be transferred to friends and neighbors. For certification schemes and PES programs, there could be some [...]
The overall body of evidence has a high risk of bias due primarily to self-selection into studied interventions. All of the studies had a high risk of bias due to issues with self-selection into the program and analysis based only on cross-sectional data. We highlight that the evidence base did not include any RCTs, an important method for addressing this potential source of bias.

Effects of agroforestry interventions on yields
We identified six studies that measured the effects of agroforestry on agricultural productivity in terms of yields. We performed a meta-analysis across five of these studies, taking as the outcome the percentage increase or decrease in productivity for the treatment group relative to the control group. We report the summary effect size as described in Section 4.3.5 for the study that reports results for five different treatment groups compared to the same control group, as well as for Thorlakson and Neufeldt (2012), which reports results for two subregions. Results are presented as a forest plot in Figure 6.
The average effect of these interventions on crop yield outcomes is a standardized mean difference (Hedges' g) of 1.16 with a 95% confidence interval (CI) of [−0.35, 2.67] (p = .13), which we calculated using a random-effects model (Figure 6).
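As a consistency check on these figures, the standard error implied by a symmetric 95% CI can be backed out and used to recompute the two-sided p-value. The sketch below assumes the interval is a normal-approximation (z-based) interval; the review does not state which test metafor was configured to use, and the function name is hypothetical.

```python
import math


def p_from_ci(estimate, ci_low, ci_high, z_crit=1.959964):
    """Standard error, z statistic, and two-sided p-value implied by a z-based 95% CI."""
    se = (ci_high - ci_low) / (2 * z_crit)
    z = estimate / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Pooled yield effect and confidence limits as reported in the text.
se, z, p = p_from_ci(1.16, -0.35, 2.67)
```

Under this assumption the implied standard error is about 0.77 and the recomputed p-value rounds to .13, which agrees with the reported value. Because the interval spans zero, the pooled estimate is compatible with both no effect and a very large positive effect.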
The overall effect size is not statistically significant, and there was high variability in the contexts and types of practices. Several studies found large, positive effect sizes, all of which evaluated the impacts of soil fertility replenishment practices and climate change resilience practices in Sub-Saharan Africa (e.g., Thorlakson & Neufeldt, 2012). On the other hand, the studies that employed incentives to promote adoption of agroforestry practices for biodiversity conservation and carbon sequestration found reduced yields. Given the low number of included studies, we cannot identify specific contextual factors that would generalize as key drivers of the direction or size of yield effects from agroforestry adoption.
Included studies varied widely in the methods used, agroforestry intervention and system studied, and outcomes reported. Table 4 describes some of these important differences, which we further discuss below.
One of the studies evaluated five different coffee certification schemes in Nicaragua designed to promote environmentally-friendly, shade-grown coffee production practices.
Each of these schemes was evaluated against a common control group, so we used the summary effect size as described in [...] were promoted through coffee traders and targeted medium- to large-scale enterprises. They also found that tree diversity and carbon stock tradeoffs with productivity were mediated by level of investment in labor and inputs, where farmers with higher tree diversity invested less, had lower productivity, but received a premium price. The certification schemes that required lower investment in labor and inputs tended to have lower productivity and those with higher investment in labor and inputs tended to have higher productivity. However, the net revenue did not necessarily differ due to the lower investment costs and higher price premiums.
The requirements on environmental and social criteria vary substantially between the different certification schemes, which also results in variability in the distribution and diversity of shade trees on the coffee farms. Overall, the authors concluded that the certification schemes were delivering enough compensation to offset the lower returns on investment.
Hegde and Bull (2011) [...]
They used an IV approach along with endogenous switching regression to show the impact of the intervention on adoption and the impact of adoption on food security outcomes. The instrument they used was participation in agroforestry training, which they argued affected the decision to adopt but did not affect the dependent variable except through adoption. They tested this instrument and found that it was positively correlated with the decision to adopt agroforestry but was not significantly related to food productivity.
However, it is likely that there are systematic differences between farmers who decide to participate in agroforestry training programs and those who do not. They find a significant, large, and positive impact on crop value (35% increase). They also examined the heterogeneity in their results, and they found that households with land holdings of less than two acres experience the greatest benefits from adopting fertilizer trees. Farmers with land ownership of less than one acre averaged an 82% increase in food crop value from adopting fertilizer trees, and farmers with between one and two acres averaged a 66% increase in food crop value with fertilizer tree adoption. They also measured maize yields, but they did not use quasi-experimental methods to analyze the results.
Kuntashula and Mungatana (2013) [...] they found highly heterogeneous effects on outcome measures. The average treatment effects on the untreated (ATU) are significantly higher than the average treatment effects on the treated (ATT), which implies that the farmers who did not adopt are the ones that would have received the higher benefits from adopting the improved fallow technology. Importantly, they found that increases in maize yields were lower in their study with a quasi-experimental design compared to increases in maize yields observed in randomized field experiments conducted by ICRAF elsewhere in Zambia (Mafongoya et al., 2006). The authors attribute these differences to farmers' skills in managing improved fallows and maize crops, highlighting the need for continuous farmer training in new agricultural technologies such as improved fallows, and demonstrating potential differences between controlled field experiments and larger-scale intervention outcomes.
Thorlakson and Neufeldt (2012) evaluated the impacts of an agroforestry intervention that provided farmer training, seedling provision, and small amounts of food for participating as well as training, tools, and seedlings for tree nursery management. They measured farm productivity by converting current seasonal crop production to economic units using average 2010 crop prices in the study region. Through quantitative and qualitative methods, they found that farmers tended to have increased yields with agroforestry practices from fruit production, though they noted that the full benefits may not have been realized within the 4 years since the start of the program. Productivity increased by approximately 35% in the Lower Nyando region and by approximately 20% in the Middle Nyando region, but the effect sizes in both regions were not statistically significant.
Finally, we note that one study included results of improved fallow on yields in Kenya. Due to insufficient information reported and since this was not part of their econometric analysis, this study was not included in the meta-analysis. In their qualitative analysis, the farmers reported perceptible increases in yields that they directly attributed to the improved fallow technology. Based on farmer recall data, the study reports a median increase in maize yield of 167% and a mean increase of 128% compared to an unfertilized maize-only control, but they did not report standard errors or p-values and they reported that 12.5% of the plots had a decrease in maize yield.
In the two studies where we observe declines in yields, interventions were designed to counteract negative yield effects with higher prices or PES. Among the different coffee certification programs, some resulted in higher yields than their matched noncertified farms along with receiving higher prices for their products through the certification program.

Results
The intention of the coffee certification programs was to promote the maintenance of sustainable practices, such as shade-grown coffee agroforestry systems, to provide habitat for biodiversity conservation. The interventions were designed to incentivize farmers to maintain these practices by increasing crop market value due to the certification to counteract any losses in yields or increased difficulty in managing the system. Results were inconclusive on the yield side, as described above. The PES program in Mozambique promoted the single-purpose planting of trees for carbon sequestration, instead of multi-purpose trees that could be productive in their own right.
A distinct, drastic loss of yields was indeed observed, with the program intended to deliver direct payments to farmers to counteract these losses (see results on income below).
On the other hand, in the four cases where we observe increases in yields (including Thorlakson & Neufeldt, 2012), the interventions were meant to increase farmer capacity through the provision of extension and training services to help farmers adopt practices that would ultimately improve productivity. These practices included integrated production systems, improved fallows, and fertilizer trees.
We note that all four of these programs were implemented in Africa, in locations with severe soil fertility limitations. Such low baseline conditions combined with interventions tailored to boost agricultural productivity appear to have been propitious for increasing yields.

Effects of agroforestry interventions on income
We identified eight studies that measured the effects of agroforestry on income, and we conducted a meta-analysis based on seven of them. Again, we report the summary effect size. We included the gross income effects in the meta-analysis to be consistent with the other included studies. Land tenure security was determined to be a major factor in the negative effects of this program, since farmers in the region were reluctant to invest the time and resources in the management of the agroforestry systems when they did not have tenure security. There were also issues with crop yield losses due to tree-crop competition for irrigation water, light, and fertilizer.
The authors suggested that the program was poorly targeted and implemented, with difficulties and limitations in obtaining the subsidies, insufficient subsidy amounts, and low tree survival rates.
Another study, Sills and Caviglia-Harris (2015), also evaluated the impacts of a program promoting "green" agriculture in the Brazilian Amazon in the short and long terms. We did not include the results from this study in this meta-analysis since they did not observe the impacts of the agroforestry component directly, but rather the impacts of the program on agroforestry adoption and the overall impacts of the program, which included a broad range of green agricultural practices. The program increased agroforestry adoption among participants. In the short term, they found positive, significant impacts of the program on household income, but mixed, insignificant impacts in the medium and longer terms. We tested the impact of including these results, and there was no substantive change in our meta-analysis results when this result was included.
We found that in the first three cases in Africa, where improved fallow, intercropping, or fertilizer trees were the promoted practice, there was evidence that adopting agroforestry improved or diversified yields and led to increased incomes for farmers. While the effects were positive, the results were not consistently statistically significant.
These interventions fall under the category of providing support and technical assistance to farmers to adopt profitable agroforestry practices. The last study discussed, Dai et al. (2017), was expected to generate improved and diversified income, but demonstrates that contextual factors such as land tenure, soil and water limitations, and opportunity costs as well as the program implementation strategy can hinder the effectiveness of an agroforestry program.
In the other three cases, we saw examples of PES, certification schemes, and land tenure security programs, which all fall under the category of incentive provision interventions. In these cases, there was variability in the productivity and profitability of the practice, but the provided incentives were meant to lead to increased incomes. For the coffee certification schemes, there was high variability between the schemes. Incomes were particularly affected by whether the farmers received higher prices for the certified products or not. In the PES case, there was a drastic reduction in yields, as we saw in the previous section, but the farmers had a significant increase in income and expenditure per capita. However, this did not carry over to women-headed households or poor households, which saw lower increases in income and expenditure. The PES payouts were mostly distributed in the first 6 years, with only 10% of the total payment to be delivered at year 25, the term of the agreement. The study was conducted only 4 years after the initiation of the project, so the long-term impacts on income are questionable. In some areas, farmers planted multi-purpose trees that would reach productive maturity around the end of the bulk of the payments. In cases where trees are not productive and they significantly reduce yields, the permanence of these systems and project attrition rates after year 6 are not known.

Effects of agroforestry interventions on nutrition and food security
We found four studies that measured some dimension of nutrition and food security outcomes. The measures they used were not consistent, so we did not perform a meta-analysis for this outcome. Of the studies that considered nutrition, one evaluated changes in food intake and nutritional status due to improved fallow agroforestry but found no significant results on nutritional measurements. Instead, the only variable that significantly increased energy, protein, and iron intake was whether the household was female headed. In that analysis, female-headed households had a significantly positive (or less negative) change in each of these three indicators.
In addition to these two studies that consider nutrition as outcomes, two other studies evaluate the impact of agroforestry on food security. In one of these, food security is measured in terms of food crop value, and the authors find a strong, positive increase in their study of fertilizer trees in Malawi. They concluded that adopters of fertilizer trees were more food secure due to significantly higher maize yields and food crop values than nonadopters. For the full sample, they find that fertilizer trees increased food crop value by 35%, or 12,447 Malawian kwacha.
Across the four studies that looked at nutrition and food security as an outcome measure, there was a positive or neutral impact of agroforestry on nutrition and food security. Although the evidence base is thin, it currently favors the hypothesis that agroforestry can improve nutritional and food security outcomes. That there were so few rigorous studies on this topic is surprising given that agroforestry is often promoted for diversifying food production and thereby improving nutrition and food security outcomes.
Sills and Caviglia-Harris (2015) measured the difference in the percentage of the farmer's lot that was deforested between the "green" agriculture program participants and nonparticipants. They found that the program decreased forest cover loss, but the results were not statistically significant and may have been due to positive selection bias. Thorlakson and Neufeldt (2012)

Intersection of agroforestry for environmental and social outcomes
In much of the evidence on agroforestry interventions, agroforestry is assumed to have underlying environmental benefits, but researchers rarely evaluate both outcomes within the same study. The baseline assumptions in many of the studies are that agroforestry provides one or more of the following ecosystem services: protecting crops from wind and soil erosion, helping farmers with climate change mitigation and adaptation, improving soil fertility, sustaining biodiversity, and sequestering carbon. Most of the impact evaluations, however, did not evaluate the impacts of the programs on these outcomes, instead focusing on food security and economic outcomes. Only four studies included in our review examined both social and environmental outcomes (including Sills & Caviglia-Harris, 2015; Thorlakson & Neufeldt, 2012), with four looking at both agricultural yield and income outcomes (including Thorlakson & Neufeldt, 2012). While we had anticipated exploring the interaction between environmental and social outcomes, the evidence base was too minimal to evaluate the intersection of these outcomes.
Under this pathway, three key elements (participant engagement, program exposure, and indirect financial support) lead to agroforestry adoption, which is then intended to lead to positive productivity, profitability, and human well-being outcomes through diversified production and income streams, improved soil fertility, and other changes. The promoted practices are simultaneously supposed to lead to improved ecosystem services outcomes, such as soil and water management, soil fertility replenishment, habitat provision, and carbon sequestration. The second type of intervention pathway we identified is incentive provision to enhance value and offset economic tradeoffs.

The role of interventions in the adoption and success of agroforestry
Interventions following this pathway include a range of incentive provision approaches such as certification programs and PES. Certification programs are intended to increase the prices for sustainably produced products. PES comprise direct payments to farmers for maintaining a specified set of sustainable practices. These interventions rely on the provision of incentives to offset the decreases in productivity or profitability that occur when sustainable practices are adopted. These interventions are useful when the practice involves taking land out of productive use or incorporating elements that decrease the overall productivity of the system. By providing financial incentives, farmers are motivated to adopt the desired practices since the sustainable alternatives maintain the same or higher profitability as the conventional options.
Like the first pathway, these interventions also require participant engagement and program exposure through promotion of the program to achieve adoption. Adoption of the promoted agroforestry practice is specifically intended to achieve environmental benefits, such as soil and water quality, biodiversity conservation, and carbon sequestration. In this case, however, the planted trees take land out of production without any direct increase in productivity or profitability due to the practice. Direct financial incentives are therefore required to offset the loss in productivity. The intervention may be in the form of a payments for environmental services program that directly pays farmers for implementing and maintaining sustainable management practices or in the form of a certification program that brings in higher prices for products produced under sustainable management practices. When these incentives provide enough additional income to offset or exceed the loss of profits due to decreased productivity, farmers are incentivized to implement the promoted practices. However, even with these direct offsets, an increased labor requirement, increased difficulty in farm management, or lack of market access may still deter farmers from adopting the promoted agroforestry practice(s). Several studies, including Bull (2011) and Pagiola et al. (2016), analyzed interventions within this second pathway.
Studied programs incorporated incentives to promote the adoption of agroforestry practices to offset the costs of taking land out of production or reducing the productivity of the system. Two of the projects assessed were pilot projects, one was an early-stage assessment, and one was a later stage assessment of the program. None of the studies explicitly evaluated the rate of participation in the different programs.

Equity focus of agroforestry interventions
We found three studies that explicitly considered variable impacts across different sub-groups of the study population, but most included studies made some suggestions about group differences. The studies considered differential impacts on smallholders versus largeholders, on woman-headed households versus male-headed households, and on richer groups versus poorer groups. One study found that households with the smallest landholdings were most positively impacted by the intervention. This program targeted smallholders and the poorest populations, and the authors found that this was key to the success of the program: they found a high treatment effect on the treated, but a small expected treatment effect on the untreated. On the other hand, another study predicted that the average treatment effect on the untreated was much higher than the treatment effect on the treated group. Through discussion with farmers, the authors found that there were several barriers to entry that prevented some households who may have received the highest benefits through the program from entering the program. The main barrier cited by farmers was the long waiting time for the accrual of benefits of planting trees, which require several years to grow before providing benefits.
Female-headed households were often disproportionately affected or overlooked by agroforestry interventions, with some of the studies showing that woman-headed households experienced less positive or more negative impacts than male-headed households.
Even when a program did not benefit men more than women, there were baseline differences between the two groups such that women-headed households in both the treatment and control groups had lower food crop value and income. One study explicitly considered the differences between male-headed and female-headed households and found significantly higher benefits for male-headed households.
The authors used decomposition analysis to find that 54% of the differences were due to different endowments, and 46% was due to discrimination. Another study noted that it was difficult for women to participate in agroforestry training programs because of restrictions from their husbands and household chores, and that they could not receive the benefits of the intervention due to insufficient landholdings. Thorlakson and Neufeldt (2012) noted the disproportional benefit to women of having access to firewood, since women are often responsible for collection and have to walk miles to collect firewood if they are without trees on their farm.
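The split of a group difference into an "endowments" share and a "discrimination" share can be illustrated with a two-fold decomposition of the Blinder-Oaxaca type. This is a minimal sketch under the assumption that the cited analysis follows that general approach, which the text above does not state; the group means and coefficients are hypothetical, and the resulting 60%/40% split is unrelated to the figures reported in the studies.

```python
def twofold_decomposition(x_a, beta_a, x_b, beta_b):
    """Split the mean outcome gap between groups A and B into two parts.

    x_a, x_b: mean covariate vectors (first entry 1.0 for the intercept).
    beta_a, beta_b: regression coefficients estimated separately per group.
    Returns (gap, endowment_part, coefficient_part), where
    gap == endowment_part + coefficient_part.
    """
    mean_a = sum(x * b for x, b in zip(x_a, beta_a))
    mean_b = sum(x * b for x, b in zip(x_b, beta_b))
    gap = mean_a - mean_b
    # Part explained by different characteristics, valued at group A's returns.
    endowments = sum((xa - xb) * ba for xa, xb, ba in zip(x_a, x_b, beta_a))
    # Part due to different returns to the same characteristics.
    coefficients = sum(xb * (ba - bb) for xb, ba, bb in zip(x_b, beta_a, beta_b))
    return gap, endowments, coefficients


# Hypothetical group means: [intercept, landholding in acres].
x_male, beta_male = [1.0, 2.0], [1.0, 1.0]
x_female, beta_female = [1.0, 1.4], [0.6, 1.0]

gap, endowments, coefficients = twofold_decomposition(
    x_male, beta_male, x_female, beta_female
)
print(f"endowments share:   {endowments / gap:.0%}")
print(f"coefficients share: {coefficients / gap:.0%}")
```

The coefficient (unexplained) part is the component that such analyses often interpret as discrimination, although it also absorbs any differences in unobserved characteristics.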
There was also considerable variability in how agroforestry interventions impacted poor or marginalized households. Some projects specifically targeted the poorest households, while others found that poor households experienced less positive or more negative impacts than richer households. One study explicitly considered the differences between male-headed and female-headed households and found significantly higher benefits for male-headed households. The authors used decomposition analysis to find that 36% of the differences were due to different endowments, and 64% was due to discrimination. Several of the studies suggested that poor households are expected to receive the highest benefits from agroforestry interventions.

Summary of main results
We identified 11 impact evaluations of agroforestry interventions across nine different countries. We were able to quantitatively synthesize results for the impacts of agroforestry interventions on crop yield and income through meta-analysis techniques. We additionally used narrative synthesis for impacts of agroforestry interventions on nutrition, human well-being, and environmental outcomes.
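The meta-analytic results in this section are expressed as standardized mean differences (Hedges' g). As a minimal sketch of how such an effect size is obtained from group summaries, the function below uses the standard textbook formulas (pooled standard deviation, small-sample correction factor J, and large-sample variance approximation). The input values are hypothetical and are not taken from any included study, and the review's own extraction may have used other conversions where studies reported regression coefficients instead of means.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, variance of g) from treatment and control group summaries."""
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / pooled_sd          # Cohen's d
    j = 1 - 3 / (4 * df - 1)                   # small-sample correction factor
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return j * d, j**2 * var_d


# Hypothetical yields (t/ha) for 50 adopters and 50 non-adopters.
g, var_g = hedges_g(mean_t=2.5, sd_t=1.0, n_t=50, mean_c=2.0, sd_c=1.0, n_c=50)
print(f"g = {g:.3f}, SE = {math.sqrt(var_g):.3f}")
```

The correction factor J shrinks Cohen's d slightly toward zero, which matters most for the small samples typical of the studies in this evidence base.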
Our meta-analysis showed that the average effect of these interventions on crop yield outcomes was large and positive (1.16 standardized mean difference (Hedges' g) with a 95% CI of [−0.35, 2.67]) but not statistically significant (p = .13). There was substantial variability in the results for crop yield. The average effect of interventions in our meta-analysis of household income outcomes was very small but positive (0.12 standardized mean difference (Hedges' g) with a 95% CI of [−0.06, 0.30]), and it was not statistically significant (p = .20). Results were less variable than for yields, though there was still moderately high heterogeneity.
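The link between the reported confidence intervals and p-values can be checked with a short calculation. The sketch below assumes a symmetric interval based on the normal distribution, which is an assumption on our part and not a description of the software or reference distribution actually used; the function name is illustrative.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover SE, z, and a two-sided p-value from a symmetric 95% CI.

    Assumes the interval is estimate +/- z_crit * SE under a normal
    approximation.
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


se_yield, z_yield, p_yield = p_from_ci(1.16, -0.35, 2.67)
se_income, z_income, p_income = p_from_ci(0.12, -0.06, 0.30)

print(f"yield:  SE = {se_yield:.3f}, z = {z_yield:.2f}, p = {p_yield:.3f}")
print(f"income: SE = {se_income:.3f}, z = {z_income:.2f}, p = {p_income:.3f}")
```

Under this approximation the yield interval implies a p-value of about .13, matching the reported value. The income interval implies a p-value of roughly .19, close to the reported .20; the small gap could plausibly reflect rounding of the published estimate and interval or the use of a different reference distribution.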
There was a positive or neutral impact of agroforestry on nutrition and food security from the studies that considered one or both as an outcome measure. However, there was not enough evidence to perform meta-analysis. This lack of evidence is a major gap in agroforestry research, as agroforestry is often promoted for diversifying food production. Although the evidence base is thin, it currently favors the hypothesis that agroforestry interventions can improve nutritional and food security outcomes.
There was also a notable lack of evidence on environmental outcomes. Most of the studies assumed that the environmental benefits came with the implementation of agroforestry practices based on previous work studying the practices alone. While the environmental impacts of agroforestry practices are relatively well-studied, there is a lack of evidence that these benefits translate in the context of a specific intervention.

Overall completeness and applicability of evidence
There was a considerable overall gap in evidence available on the impacts of agroforestry interventions. We only identified 11 impact evaluations of agroforestry interventions, and the 11 lacked consistency in the measurement of indicator variables and analytical techniques. The available impact evaluations studied a wide range of practices and types of interventions, making it difficult to compare across different contexts and situations. However, the results do provide a baseline to inform future research, and the results revealed trends in agroforestry intervention pathways.
Agroforestry is often promoted for its promise to deliver on both environmental and social outcomes, but evidence supporting this claim in the context of specific interventions remains lacking. Few studies examined cobenefits and tradeoffs of agroforestry in the context of interventions. Agrawal and Chhatre (2011) argued for the importance of considering multiple social and ecological outcomes at the same time in coupled natural and human systems. Our results suggest that their call has been heeded more rarely than required to advance integrated understanding of the multiple outcomes of agroforestry interventions. There remains an urgent need to simultaneously examine multiple outcomes of agroforestry interventions to inform efforts to achieve the 2030 UN Sustainable Development Goals.

Quality of the evidence
The overall quality of the evidence base was low, with all of the studies rating as "critical" (55%) or "high" (45%) for overall risk of bias. The biggest issue was that all of the studies were based on interventions with self-selection into the program and only used cross-sectional data at the end line in the analysis. All of the studies used methods to try to address this selection bias through PSM, endogenous switching regression, Heckman two-stage regression, difference-in-difference, and IV techniques. The studies that used matching mostly matched on a limited number of variables and did not account for several important factors that could affect the selection process and outcomes. This approach also led to issues with group equivalence. We also identified several issues with potential spillover effects that were not controlled for in the analysis. While the overall risk of bias was high, these studies represent an important advance as the field of agroforestry seeks to better understand program effectiveness.
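To illustrate why matching quality matters for the selection-bias problem described above, the following sketch applies one-to-one nearest-neighbour matching (with replacement) on a propensity score. The scores and outcomes are hypothetical, the scores are assumed to have been estimated beforehand, and the function is an illustration of the general technique and not the procedure of any included study.

```python
from statistics import mean


def matched_att(treated, controls):
    """ATT from 1-nearest-neighbour matching with replacement.

    Each unit is a (propensity_score, outcome) pair.
    """
    diffs = []
    for score, outcome in treated:
        # Pick the control unit whose score is closest to this treated unit.
        nearest = min(controls, key=lambda c: abs(c[0] - score))
        diffs.append(outcome - nearest[1])
    return mean(diffs)


# Hypothetical participants and non-participants: (score, income).
treated = [(0.80, 3.0), (0.60, 2.5), (0.40, 2.2)]
controls = [(0.75, 2.6), (0.55, 2.3), (0.35, 2.1), (0.10, 1.0)]

att = matched_att(treated, controls)
naive = mean(y for _, y in treated) - mean(y for _, y in controls)

print(f"naive difference = {naive:.2f}, matched ATT = {att:.2f}")
```

In this example the naive comparison overstates the effect because it includes a control household that resembles none of the participants. Matching removes that distortion only to the extent that the score captures the variables driving selection, which is the limitation noted above for studies that matched on few variables.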

Potential biases in the review process
We tried to limit the potential bias within the review process by double screening of studies for inclusion and double extracting data where possible. The risk of bias assessment was also performed by two separate reviewers and discussed until reviewers agreed on a rating to assign. The effect size extraction and calculation were done by one lead reviewer and checked by another lead reviewer. When a paper did not provide enough information to calculate effect size, the study authors were contacted, and the additional information was retrieved.
The review only includes studies through October 2017 and those that were published in English, which is a limitation of our review. A number of relevant impact evaluations have been published since we conducted our search. We intend to update our SR within 5 years of our previous search, covering literature published from 2017 through the year of the updated publication.
Due to resource constraints, we only included studies in English, which can limit the scope of our results by not including relevant studies in other languages. For example, our study has likely missed important evidence described in French, Mandarin Chinese, Portuguese, and Spanish, among other languages. Similarly, given that the practice of agroforestry has different names in different places, it is possible that we missed a relevant term in our search strategy, even though the terms we used were developed in consultation with a search specialist and our advisory team, which included several experts in this field. There may also be geographical biases since we included only English language studies.
As previously discussed, we decided to exclude adoption-only impact evaluations, which could offer a useful counterpart to this review. While the literature on agroforestry practices is extensive, it also indicates mixed results of implementing agroforestry practices.
It would therefore be difficult to estimate even the direction of the social and environmental impacts of agroforestry adoption.
The targeted scope of this SR and the search limitations may contribute to the small size of the evidence base; however, we believe that we captured much, if not all, of the literature evaluating the impacts of agroforestry interventions within our timeframe.
The scope of our EGM was very broad, and presentations to relevant stakeholders and our advisory committee revealed no additional known studies relevant for inclusion.

Agreements and disagreements with other studies or reviews
While there are no similar reviews of agroforestry interventions, previous reviews have synthesized evidence on the effectiveness of certification schemes and PES programs, which were policy instruments used for several agroforestry interventions included in our review (Oya et al., 2017; Snilstveit et al., 2019). Oya et al. (2017) reviewed 43 quantitative and 136 qualitative studies and found that certification schemes tend to increase prices and income from produce but had no consistent effect on wages or household income. Like the evidence for agroforestry certification schemes, they found mixed effects on yields but generally positive impacts on income from production (though the effects on total household income were unclear). As with our review, they also concluded that context-specific intervention design is key for successful implementation of certification schemes. Snilstveit et al. (2019) examined the impacts of PES on environmental and socioeconomic outcomes and identified 44 quantitative impact evaluations and 60 qualitative studies.
The two PES studies we included were also included by Snilstveit et al. (2019). They found that PES programs may increase household income and improve environmental outcomes across all PES programs included, which held true when we considered only agroforestry PES programs.
We note that available evidence for the above reviews was much more extensive than was the case for our review: they each included at least five times the number of relevant quantitative studies as the present review.
Several SRs synthesized the evidence on the impacts of agricultural interventions on nutrition outcomes and found that nutrition education and homegardens had the greatest likelihood of inducing positive nutritional outcomes, especially consumption of nutrient-rich crops and dietary diversity, though the results were mixed (Berti et al., 2004; Bird et al., 2019; Masset et al., 2012; Pandey et al., 2016). These SRs also highlighted the importance of women's participation in agricultural interventions to achieve nutritional outcomes. With the agroforestry interventions we identified, the intervention that focused on education similarly led to positive nutritional outcomes, while agroforestry technologies that may provide access to increased food security did not necessarily translate to more diversified nutrition or did not explicitly measure nutritional status as part of the study.
While we only identified one tenure reform intervention associated with agroforestry practices, which is insufficient for drawing broader conclusions, Lawry et al. (2014) reviewed the impacts of tenure interventions on investment and agricultural productivity generally, which provides a more general understanding of tenure interventions. The evidence base for land property rights interventions they examined was limited (n = 20), but their findings suggest that land tenure interventions are plausible pathways to improve welfare. Loevinsohn et al. (2013) suggested in their theory of change similar pathways of change for agricultural technology adoption generally as we found specifically for agroforestry. They also found surprisingly few high-quality studies (n = 5) that evaluated the circumstances and conditions where agricultural technology adoption led to changes in agricultural productivity. Finally, Ingram et al. (2016) also highlighted the importance of gender equity in forests, trees, and agroforestry interventions, and they found differences in the nature of the products and activities that men and women participate in as well as evidence of gender-differentiated incomes. They also concluded that interventions need to be explicitly gender sensitive, support collective action, and consider parallel actions to reduce gender disparity in forest, trees, and agroforestry interventions.

AUTHORS' CONCLUSIONS
Agroforestry has been widely practiced, promoted, and studied across the LMICs of Africa, Asia, and Latin America. Given its prevalence and promise, agroforestry is promoted for its potential to provide a vital contribution to advancing several of the 2030 UN SDGs (Van Noordwijk et al., 2018; Waldron et al., 2017; Tierney et al., 2011).
In this study we have presented the findings of an SR that used systematic methods to identify, collect, and synthesize available evidence on the effects of agroforestry interventions in LMICs on three important outcomes: agricultural productivity, ecosystem services, and human well-being. Based on the available evidence, we reviewed the impacts of specific agroforestry interventions on crop yields, income, nutrition and human well-being, and environmental outcomes. The main finding of our review is that there is a critical lack of rigorous evidence on the impacts of agroforestry interventions. However, the existing evidence points to the positive or neutral impacts of agroforestry interventions on multiple social and ecological outcomes.

Implications for practice
Our review offers a baseline of the impacts of agroforestry interventions to guide future practice and policy. Given that the major finding is that there is a critical gap in evidence on the effects of agroforestry interventions, it may seem there are few implications for policy and practice. In a sense this is true: we lack systematic understanding of the relative effectiveness of different interventions to inform new policies and programs. However, the overall findings of this report do suggest some important paths forward.
First, the review highlights the need for additional funding for impact evaluations of agroforestry programs and policies. There is a need for donors to explicitly call for rigorous impact evaluations as part of the implementation of the interventions. While in many cases the evidence suggests there may be positive impacts of agroforestry interventions, the evidence base is extremely limited. We suggest that careful piloting and baseline assessment occur as a prerequisite of new program implementation in new contexts. Future studies should carefully consider using RCT designs to test the effectiveness of different intervention approaches. Research in agriculture (e.g., Carter et al., 2013; Jack & Cardona Santos, 2017; Weiser et al., 2015) and forestry (e.g., Jayachandran et al., 2017) as well as a raft of agroforestry field trials suggests the feasibility of RCTs in this domain.
Second, the setting and the type of agroforestry practice promoted are crucial considerations for intervention targeting and effectiveness.
For example, in areas with low productivity and soil fertility and high poverty and food insecurity, interventions may prioritize income and yield outcomes by promoting fertilizer trees or improved fallow rather than shade trees or trees for carbon sequestration. These interventions may simultaneously target poverty reduction and food security, with less emphasis on biodiversity conservation and carbon sequestration. Evidence reviewed here suggests that such interventions may be suitable for smallholder farmers in highly degraded areas, though additional studies are necessary to test this hypothesis given the small number of studies. In areas with high biodiversity or high potential to sequester carbon, the focus may turn to environmental outcomes. Practices associated with environmental outcomes, such as shade trees in coffee plantations or pastures or intercropped trees for carbon sequestration, potentially reduce yields, which may require income offsets such as higher prices through certification programs or PES. Ideally, under either scenario, incomes will be increased to promote the adoption of the practices, benefit the landholders, and help improve human well-being. In many cases, agroforestry systems can incorporate more diverse food sources, such as fruit and nut trees, to improve nutrition outcomes as well.
Finally, our results show that the impacts of agroforestry interventions were experienced differently by different population subgroups. New programs should therefore consider whom they are targeting and how, with special attention to gender and social class. We found that female-headed households were often disproportionately affected or overlooked in the studied agroforestry interventions, with some results showing that woman-headed households experienced less positive or more negative impacts than male-headed households. There was also considerable variability in how agroforestry interventions impacted poor or marginalized households.
Some studied projects specifically targeted the poorest households.
Results were mixed, with some finding that poor households received the most benefit, while others found that such households experienced less positive or more negative impacts than wealthier households. Several studies reported variable differences between the average treatment effects on the treated versus the average treatment effects on the untreated. These variations highlight the need to carefully consider the targets of an intervention and understand the incentives and barriers to entry.

Implications for research
Our study reveals that rigorous evidence on the effects of agroforestry interventions remains extremely limited. Impact evaluations of agroforestry interventions remain challenging due to the long timescale between implementation and impacts. Trees take a long time to grow, and the resulting effects on environmental health and human livelihoods may take decades. Many development projects last only a few years, so long-term monitoring and evaluation must be built into project proposals and designs. Many studies we found only examined whether farmers adopted agroforestry as a result of an intervention, without measuring the subsequent impacts on social-ecological outcomes, so they were not included in our review. One approach to addressing the need for long-term evaluation is establishing on-farm experimental trials with treatment and control farmers, for which there may be better justification for long-term monitoring proposals. Finally, based on our findings, RCTs are rarely conducted in agroforestry research, but they can offer valuable insights into how agroforestry interventions impact farmer livelihoods and the environment.
The complexity that comes with the integration of agricultural, forest, pastoral, and other systems, as done in agroforestry, poses significant challenges to evaluating the effectiveness of specific agroforestry interventions. However, given the potential of agroforestry to contribute to a number of major sustainable development goals simultaneously, there is an urgent need for such impact evaluation.
Nevertheless, there are examples demonstrating such evaluation is possible. Expanding the number of impact evaluations of agroforestry interventions, especially using RCTs, therefore represents a major opportunity for expanding and improving the existing evidence base.
There is also a need for long-term trials with baseline data collected to conduct impact evaluations using panel data to understand how the impacts of the agroforestry interventions change over time.
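As a minimal sketch of how baseline and follow-up data can support such an evaluation, the example below computes a two-period difference-in-differences estimate. Difference-in-differences is a technique chosen here for illustration, not a method prescribed by this review; the household identifiers, the yield values, and the function names are all hypothetical.

```python
from statistics import mean

# Hypothetical panel, values invented purely for illustration:
# (household_id, treated, period, yield_t_per_ha)
# period 0 = baseline survey, period 1 = follow-up survey
panel = [
    ("h1", True, 0, 1.0), ("h1", True, 1, 1.6),
    ("h2", True, 0, 1.2), ("h2", True, 1, 1.8),
    ("h3", False, 0, 1.1), ("h3", False, 1, 1.3),
    ("h4", False, 0, 0.9), ("h4", False, 1, 1.1),
]


def group_mean(rows, treated, period):
    """Mean outcome for one group (treated or comparison) in one period."""
    return mean(y for _, t, p, y in rows if t == treated and p == period)


def diff_in_diff(rows):
    """Change in the treated group minus change in the comparison group."""
    treated_change = group_mean(rows, True, 1) - group_mean(rows, True, 0)
    control_change = group_mean(rows, False, 1) - group_mean(rows, False, 0)
    return treated_change - control_change


estimate = diff_in_diff(panel)
print(f"Difference-in-differences estimate: {estimate:.2f} t/ha")
```

The estimate is only interpretable as an impact if the two groups would have followed parallel trends without the intervention, which is an assumption that additional survey rounds in a long-term panel can help assess.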
A better understanding of the win-win scenarios and tradeoffs associated with agroforestry is urgently needed, particularly given the potential of agroforestry to help achieve the SDGs. More robust evidence on the different environment and development objectives agroforestry can advance, including climate change mitigation and adaptation, poverty reduction, and health and nutrition, is needed in its own right, but also to enable analysis of synergies and tradeoffs.

DECLARATIONS OF INTEREST
The authors declare that there are no conflicts of interest. None of the authors are researchers on any of the studies examined in this review. All authors have published work within the field of agroforestry, which may be cited in the background section of this review.
Karl Hughes is an employee of ICRAF, an organization that funded and/or implemented many of these interventions.

PLANS FOR UPDATING THIS REVIEW
We plan to perform a follow-up review to update this review within 5 years. In the follow-up review, we plan to include studies that evaluate the impact of agroforestry interventions on agroforestry adoption as well as social-ecological outcomes.

DIFFERENCES BETWEEN PROTOCOL AND REVIEW
Our EGM deviated from the protocol in several ways as the review team identified challenges in screening. First, we did not include researcher-managed agroforestry field trials due to time and resource constraints and due to conceptual differences between researcher-managed fields under controlled experimental conditions and fields managed by farmers.
Secondly, we did not include studies that used other types of agroforestry practices as a comparator to an agroforestry practice, as opposed to using a nonagroforestry land use as a comparator. After reviewing began, we realized that these studies aimed to optimize agroforestry configurations, rather than demonstrate the impact of agroforestry practices.
Thirdly, we limited our time scope to start in the year 2000, rather than 1990, due to time and resource constraints. For our parallel systematic map for high-income countries (Brown et al., 2018), we considered studies back to the year 1990, and noticed drastically diminished returns prior to 2000 due to limited evidence meeting our inclusion criteria (e.g., many studies did not include a relevant comparator or relevant outcome, rather the focus was more on tree species selection and breeding).
These decisions did not impact our systematic review.

CHARACTERISTICS OF EXCLUDED STUDIES
For the full list of excluded studies, contact the lead author, Sarah E. Castle.

Forest plots
Effects of agroforestry interventions on yields
Effects of agroforestry interventions on incomes

Funnel plots (too few studies to conduct funnel plot analyses)
Effects of agroforestry interventions on yields
Effects of agroforestry interventions on incomes

INTERNAL SOURCES
No sources of support provided

EXTERNAL SOURCES
Funding from 3ie and the USDA National Institute of Food and Agriculture, Hatch project #1009327 is gratefully acknowledged.

SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section.

The databases that we searched for publications were:
• SCOPUS
• EBSCO: Econlit

The search terms used in each database are shown in Table A1 (constructed using the terms from CAB thesaurus and using the research group's specific knowledge). Each search string included each of the agroforestry practices from Table 3. These terms and search strings were modified through a scoping exercise in Web of Science, SCOPUS, and EBSCO, where the search terms were used, and the results were evaluated by analyzing the relevance of the first 50 studies. We note that the intervention types are more generic, including topics well beyond agroforestry, so our search focused on practices. We included an LMIC filter to identify only studies in relevant countries, shown in the last row of

Searching other resources
Additionally, in order to identify the existing grey literature, the websites of various organizations that are likely to produce published and unpublished research were searched. The list of relevant research organizations (Table A3) was constructed from cross-validation of websites listed in the systematic mapping protocols of agroforestry related studies (e.g., Bottrill et al., 2014; Leisher et al., 2016; Nguyen et al., 2015). This list was validated with the external EGM advisory group. To optimize the scope of the search while ensuring transparency in our methods, we followed the approach developed by Haddaway et al. (2017), which allowed us to search multiple websites simultaneously and to extract the relevant information from each website into a single database. Finally, we also contacted key informants within 3ie, ICRAF, and other relevant organizations for identification of additional relevant literature for screening and inclusion. For the organizational websites, we used a simplified search string with our primary keyword "agroforestry."

TABLE A1 Search terms by intervention and outcomes