Effects of Payment for Environmental Services (PES) on Deforestation and Poverty in Low and Middle Income Countries: A Systematic Review

Contents
Executive summary
  Background
  Objectives
  Selection criteria
  Search strategy
  Data collection and analysis
  Results
    Effects on Deforestation Outcomes
    Effects on Human Welfare Outcomes
    The Role of Institutional and Social Conditions
  Authors' conclusions
1 Background
  1.1 Description of the problem
  1.2 Description of the intervention
    1.2.1 Description of PES
  1.3 How the intervention might work
    1.3.1 Main hypotheses
    1.3.2 Unintended consequences and moderating factors for PES
    1.3.3 Moderator hypotheses
  1.4 Why it was important to do this review
2 Objectives
3 Selection Criteria
  3.1 Participants
  3.2 Interventions
  3.3 Outcomes
  3.4 Study Types
4 Search Strategy
  4.1 Electronic searches


Abstract
We conducted a systematic review of studies on the impact on deforestation and poverty in developing countries of payments for environmental services (PES) programs that set natural forest conservation as their goal. The review is motivated by debates over whether the pursuits of conservation and poverty reduction in developing countries tend to conflict or whether they might be complementary. A search for rigorous impact evaluation studies identified eleven quantitative and nine associated qualitative evaluation studies assessing the effects of PES. The methodological rigor of these studies varied widely, meaning that the evidence base for the impact of PES policies is limited in both quantity and quality. Given the evidence available, we find little reason for optimism about the potential for current PES approaches to achieve both conservation and poverty reduction benefits jointly. We call for the production of high-quality impact evaluations, using randomisation when possible, to assess whether the apparent incompatibility of conservation and poverty reduction might be overcome through programming innovations.

Executive summary

BACKGROUND
Natural forest preservation in the tropics, and thus in developing countries, must be an element of any effective effort to manage climate change. Forests serve as natural carbon sinks, which help to mitigate the effect of other carbon emissions. However, forest cover is being reduced, and it is estimated that deforestation is responsible for 10-17 per cent of global carbon emissions. Since 2007, governments have coordinated conservation efforts under the Reducing Emissions from Deforestation and Forest Degradation (REDD+) initiative, which has led to the implementation of various programs designed to reduce the amount of forested land converted to other purposes.
Payment for environmental services (PES) programs are one type of intervention commonly implemented under the REDD+ umbrella. PES programs allow for direct exchange between those demanding 'environmental services' such as protection or rehabilitation of natural forests and those in a position to provide them locally. While the primary goal of reducing deforestation is clear, the policy and academic literature debates the extent to which PES programs in developing countries should incorporate goals of poverty reduction. Some argue that the targeting of poverty goals will undermine conservation effectiveness (e.g., because behavioural change among poorer households does not have as much potential to promote conservation as that of wealthier households or commercial entities). Others argue that targeting benefits toward the poor would contribute to conservation effectiveness by either promoting sustainable livelihoods or helping to legitimize conservation programming.
In this review, we assess the effects of PES programs on deforestation and welfare outcomes in low and middle income countries (LMICs), and whether the twin goals of improving both environmental and human welfare outcomes are at odds with each other. We also examine how inequality, institutional capacity, corruption, and democratic accountability may moderate the effects of PES programs. Conducting this review is important for moving the debate around PES beyond theoretical discussions and into a better-informed, evidence-based discussion.

OBJECTIVES
The first objective of this review is to assess the evidence on the effects of PES interventions on conservation and poverty outcomes in LMICs. A second objective is to assess the extent to which effects on poverty in turn affect whether conservation benefits are realized. The third objective is to evaluate how institutional and social conditions (namely, inequality, institutional capacity, corruption, and democratic accountability) moderate the effects of PES programs.

SELECTION CRITERIA
The review includes studies of PES programs that assess effects on either (i) deforestation outcomes in forest areas in developing countries or (ii) poverty conditions of populations residing in communities that are proximate to natural growth forest areas in developing countries. We included studies using a range of measures for both deforestation (on-the-ground point samples, samples created from satellite imagery) and welfare (consumption, income, or income potential).
We required that PES programs have a clear start date when either payments or rewards are themselves offered to individual or corporate property holders to maintain or rehabilitate (for example, via planting endemic species) natural forests, or institutions are established to facilitate such offers.
For quantitative synthesis we included (a) randomized studies and (b) quasi-experimental studies that employ strategies for causal identification with clearly delineated treated and control areas and that use some method for removing biases due to non-random assignment of the intervention. Qualitative data are used in the synthesis to provide descriptions and context for interventions that are included in the quantitative synthesis. Such data were drawn from the quantitative studies themselves as well as from qualitative studies that cover the same programs or settings as the quantitative studies.

SEARCH STRATEGY
To find the articles included in this review, we searched a variety of databases using key words related to PES programs. The set of databases and list of keywords were assembled based on consultation with a Campbell Collaboration information retrieval specialist. We also carried out hand searches of key journals in relevant fields, using publisher search engines and references cited in papers accepted for review as well as in review papers or thematically relevant papers identified during the search.

DATA COLLECTION AND ANALYSIS
For studies eligible for inclusion in the review we systematically collected data on study characteristics, findings, and moderators. Risk of bias was assessed based on the guidance of the IDCG Risk of Bias Tool (version March 2012). We extracted qualitative information from both the included quantitative studies and qualitative studies that covered the same types of programs and contexts as our quantitative studies. We use such qualitative data to establish that conditions recorded in quantitative data are being interpreted correctly and to provide descriptions and context for interventions that are included in the quantitative synthesis.
For effects on forest cover, whenever possible we standardized them to annual forest cover change rates. For effects on material welfare and poverty outcomes, we used percentage change over estimated average counterfactual outcome (e.g. for income effects, per cent change in income relative to the average income of the control group). For each hypothesis, we synthesised estimates using meta-analysis when the following conditions were met: (i) more than two studies meeting the quantitative inclusion criteria; (ii) effect sizes for common outcome constructs; and (iii) effects measured against similar comparators.
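The standardization and synthesis rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used in the review: the compound annual rate formula, the fixed-effect inverse-variance pooling, and all function names are assumptions introduced here, since the text does not specify them.

```python
import math


def annual_change_rate(cover_start, cover_end, years):
    """Compound annual rate of forest cover change (assumed formula).

    A negative value means net forest loss per year.
    """
    if cover_start <= 0 or cover_end <= 0 or years <= 0:
        raise ValueError("cover values and years must be positive")
    return (cover_end / cover_start) ** (1.0 / years) - 1.0


def pct_change_vs_control(effect, control_mean):
    """Express an effect as a percentage of the average control-group outcome."""
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    return 100.0 * effect / control_mean


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted mean and its standard error (assumed model).

    Enforces condition (i) from the text: more than two eligible studies.
    Conditions (ii) and (iii), common outcome constructs and similar
    comparators, must be checked by the analyst before calling this.
    """
    if len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    if len(estimates) <= 2:
        raise ValueError("condition (i) requires more than two studies")
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)
```

For example, forest cover falling from 100 to 81 hectares over two years corresponds to an annual change rate of about -10 per cent under the compound formula, and an income effect of 50 against a control-group mean of 200 is a 25 per cent change.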

RESULTS
Our database search returned 1382 articles on PES programs. After eliminating articles that were not relevant to our hypotheses or not conducted with appropriate methodological rigor, we were left with 20 articles on PES programs. Of these, 11 conducted quantitative impact evaluations of these programs. The 11 articles cover six programs in four countries (Costa Rica, China, Mexico, and Mozambique).
The resulting evidence base is weak in terms of both the number of eligible studies and the methodological quality of the included studies. None of the studies are based on randomized experiments, and so the potential for hidden selection or confounding biases is the most concerning issue. Few of the studies create comparison groups that allow them to address spill-over and leakage of effects from program areas to non-program areas. None of the studies investigated forest conservation and welfare effects jointly, which made it difficult to assess how these two goals are related.

Effects on Deforestation Outcomes
The PES studies that assessed programs' effects on forest cover included nine studies of four programs in Costa Rica and Mexico. The studies focused on two types of measures: impact on deforestation rate (where the best-case scenario is a deforestation rate of 0) and impact on forest cover (which allows for a positive outcome in the expansion of forested land). Keeping in mind the weakness of the [...]

Effects on Human Welfare Outcomes
[...] For PES to contribute to poverty reduction, poorer households must be able to participate at high rates. But participation in PES programs is typically more difficult for poor households than for wealthier households (a fact documented in a number of the studies included in the review). The study from Mozambique includes estimates for poor households and finds that the welfare effects were substantially smaller in absolute terms, and not statistically significant, for these households.

The Role of Institutional and Social Conditions
We aimed to address a number of hypotheses regarding the influence of institutional and social conditions (inequality, institutional capacity, corruption, and democratic accountability) on the effects of PES programs. However, due to limitations of the evidence base we were unable to test these hypotheses. We did, however, extract qualitative data from included studies and associated qualitative studies that provide some insights into the role of institutional and social conditions in the context of PES programs.
A study on the Mexican PSAH PES program found that forest conservation effects were worse in poorer areas. Qualitative information from Costa Rica was consistent with this account. Several of the studies also addressed the issue of institutional capacity, describing situations where PES programs did not have the ability to carry out their mandates. Corruption and possible misappropriation of project resources were also factors raised in a qualitative study of PES in Mexico. The study found that program resources were applied to address inadequacies in other government programs.

AUTHORS' CONCLUSIONS
Limitations in the evidence base preclude definitive hypothesis tests; however, the evidence we find suggests that PES does reduce deforestation rates. The effect is modest, however, and seems to come with high levels of inefficiency. In terms of PES effects on poverty, we cannot say that the evidence indicates beneficial effects. Available evidence shows that PES programs are less effective in poor areas and are less likely to attract participation of poor households than wealthier ones. These are troubling findings, but they are based on only a handful of cases and therefore deserve much more empirical attention.
Our review aimed to assess the extent to which environmental and poverty reduction goals conflict with one another, how different conservation strategies fare in terms of such trade-offs, and the scope for 'win-win' strategies that generate both significant environmental and poverty reduction benefits. Based on the evidence available, we do not find that a case can be made for conservation and poverty-reduction goals being complementary in PES programming.
Our final conclusion re-emphasizes the poor state of the evidence base for PES programming. Much advanced scientific effort and extensive investment have gone into measuring forest conditions around the world. Relative to that, efforts to assess the effects of PES programs on deforestation and poverty are limited and methodologically weak. Researchers should consider the recent work in development economics for guidance on executing field experiments that might provide more credible evidence (Banerjee and Duflo, 2011; Casey et al., 2012; Karlan and Appel, 2012).
[...] This has led to the implementation of various programs designed to reduce the amount of forested land converted to other purposes. Payment for environmental services (PES) programs are one type of intervention commonly implemented under the REDD+ umbrella and are widely used around the world as part of government strategies to manage forest loss and climate change. PES programs allow for direct exchange between those demanding 'environmental services' such as protection or rehabilitation of natural forests and those in a position to provide them locally (Forest Trends, Katoomba Group and UNEP, 2008; Wunder, 2005). Governments have applied PES strategies domestically for decades to manage forests and prevent irredeemable loss of valuable endemic forest resources. PES exist alongside 'decentralized forest management', 'community-based forest management' and 'protected areas' (that is, parks and reserves) as core components of government and privately led forest management efforts around the world (Angelsen, 2009).
Fundamental issues in policy debates over conservation strategies in developing countries include the extent to which conservation and poverty reduction goals conflict, how different conservation strategies fare in terms of such trade-offs, and the scope for 'win-win' strategies that generate both significant conservation and poverty reduction benefits (Muradian et al., 2013; Sunderlin et al., 2005; Wunder, 2001, 2013). This review is organized around these issues. Two core questions arise. First, how might these potential benefits from natural forest conservation in the tropics be realized? Second, how do different approaches to natural forest conservation relate to the pursuit of poverty reduction goals? While tropical forests are appealing as targets for conservation because of their high carbon storage density and lower (in absolute terms) opportunity costs of conservation, they are located primarily in areas of low- and middle-income countries where poverty is a central concern (Deveny et al., 2009; Kremen et al., 2000; Sunderlin et al., 2005; Van Kooten and Sohngen, 2007). It is therefore crucial to understand whether conservation strategies require trading off on poverty reduction goals or whether there are strategies that allow for synergy in the pursuit of conservation and poverty reduction goals jointly. In this review, we address these questions with respect to PES programs.

Description of PES
At the most basic level, PES (also called 'payment for ecological services' or 'reward for environmental/ecological services') refers to voluntary accession to a contract to provide a well-defined environmental service (for example, maintenance of natural forest density in a designated area) in exchange for payment or other reward from a buyer entity. Our conception of PES encompasses what is sometimes referred to as 'rewards' for environmental services or 'compensation' for environmental services. Whether PES should be defined to include additional provisions is subject to some debate. Wunder (2005), in a sophisticated treatment of this definitional issue, defines an 'ideal' PES program as one that involves (i) such a voluntary exchange as well as (ii) the payment being issued conditional on delivery of the environmental service (as opposed, say, to being issued prior to and in expectation of delivery of the service) and (iii) the buyer entity being the immediate user of the environmental services.
Such an ideal form of PES is appealing in theory, as it would seem to define an 'incentive-compatible,' and therefore sustainable, approach to environmental protection. But we feel that this ideal form is ill suited to a review of PES programs as they have been implemented around the world thus far. As Wunder (2005) and Wunder, Engel, and Pagiola (2008) note, such ideal-type programming is extremely rare in developing countries.[1] Taking this into account and in a manner consistent with the review by Wunder, Engel, and Pagiola (2008), we define PES as 'voluntary accession to a contract to supply a well-defined environmental service in exchange for payment from a buyer entity,' where payments need not be monetary but may come in the form of other material benefits and the 'environmental service' must involve the maintenance or rehabilitation of natural forests. 'PES programs' are actions undertaken by corporate or government entities to facilitate PES by establishing necessary legal frameworks (for example, by demarcating property rights) or connecting potential 'buyers' to potential providers of environmental services. While PES refers to a type of exchange that may be realized at any time, PES programs have clear start dates that allow in principle for an evaluation of their impacts.
By this definition, our review of the literature has found PES programs underway in many countries with large forest areas, including Bolivia, Brazil, Cameroon, China, Costa Rica, Honduras, Indonesia, Madagascar, Malawi, Mexico, Mozambique, Nicaragua, and Vietnam. PES program 'inputs' include funds to be used for the payments (e.g., from taxes, donor grants, or purchases of carbon offsets), staff that recruit service providers and manage contracts, and the potential service providers themselves and their land. The 'outputs' include contracts with service providers, hectares of land covered by such contracts, and payments issued in return for services. These inputs and outputs are intended to generate beneficial environmental and, potentially, welfare outcomes.
Figure 1 is a schematic representation of the theory of change that we evaluate with this review. We embed the causal relationships between PES and poverty/deforestation in Ostrom's (2007) generic analytic framework for conservation dynamics. The framework defines the context in terms of the resource system, governance system, resource units, and resource users. In this review, the governance systems and resource users are the key areas of contextual variation that may moderate PES effects. Resource system (forest systems in developing countries) and resource units (forested land) are assumed to be of secondary concern once we condition on governance systems and resource users, with the latter understood as being potential agents of deforestation.[2] The causal arrows in the diagram do not characterize all conceivable causal relationships, just the ones that we seek to test. We have drawn a causal arrow that flows from poverty to deforestation, and not the other way around. This does not mean that we assume no effects of deforestation on poverty. It is meant to clarify the particular mediating relationship that interests us in this review.
[1] Conventional economic theory provides possible explanations for this. To the extent that any local demand for environmental services exists (and it is not clear that it always will), the benefits from such services have the quality of public goods (Samuelson, 1954). Therefore, any single beneficiary would have an incentive to free ride on others' purchase of environmental services, introducing the potential for market failure in the absence of government or other third party coordination (Salzman, 2010). Furthermore, if demand originates predominantly among non-local or foreign entities, transaction costs may make direct contracting with local service providers impractical, again undermining the potential for market formation. Thus, government or NGO intervention is likely to be required to overcome market failure risks and organize the purchase of environmental services in developing country contexts. Governments or NGOs may find conditional payment to be sub-optimal in satisfying their manifold goals. For example, an NGO may have a goal of building trust with forest communities, in which case the NGO may find more appealing a "gift exchange" approach (Akerlof, 1982), where at least partial payment or administration of benefits is issued prior to service delivery, with the expectation that recipients will reciprocate by providing environmental services.
A crucial question for conservation programs in developing countries is whether there might be synergies between the pursuit of conservation goals and poverty reduction goals. Pagiola et al. (2005) argue that coupling poverty goals with environmental protection goals in conservation programming may be inefficient for reaching either type of goal, and that in many instances the two objectives are orthogonal to each other, if not in conflict. While poorer members of forest edge communities stand to gain the most from poverty alleviation programming, they may not constitute the greatest deforestation threat. Such individuals may have relatively little means or incentive to engage in deforestation relative to large-scale farmers or logging interests. If so, making poverty alleviation in forest edge communities a priority may imply inefficient targeting of resources if the goal is the biggest conservation payoff (Wunder, 2005: 12-14).
There is a moral reason to couple poverty relief with conservation, although it does not presume the possibility of synergy: such a coupling would be imperative if conservation disrupts livelihoods of forest community members by limiting their ability to exploit resources for productive purposes, whether by themselves or as hired labor (Agrawal and Benson, 2011; Angelsen and Wunder, 2003; Chomitz, 2007: Ch. 3; Edwards et al., 2011; Porras, 2010; Wunder, 2005).
Arguments for synergy include those based on a sustainable livelihoods logic and those based on a political logic. With respect to sustainable livelihoods, the classic study by Vandermeer and Perfecto (1995) detailed tropical deforestation threats arising from forest edge communities' abandonment of sustainable forest use practices in the face of various pressures from commercial agriculture. Poverty relief for such communities is proposed as a way to arrest such dynamics. Politically, conservation strategies may be made more viable and effective if coupled with poverty alleviation. If PES programs target only the interests of large-scale commercial enterprises, the result may be to exacerbate local inequality. Moreover, attaching poverty alleviation goals to conservation programs may help to minimize risks of hostilities, local level subversion, and corruption (Mapedza, 2006).
Based on this theoretical discussion, the two most basic hypotheses that we wish to test are as follows:
H1: PES reduces deforestation rates.
H2: PES has non-negative impacts on local poverty levels.
The focus on non-negative, as opposed to "positive" impacts per se, reflects a primary concern to ensure that policy interventions do no incidental harm in association with ultimate goals, which in this case are taken to be reductions in deforestation.
Beyond these basic effects, we are interested in the possible mediating role of poverty conditions for deforestation outcomes. The sceptical take is that the two dimensions are orthogonal or even conflicting. The 'synergy' position is that poverty consequences of conservation policies mediate effects on deforestation. For PES, the nature of the dilemma that pits attending to distributional concerns against targeting major agents of deforestation likely depends on two factors. First are levels of local inequality in terms of holdings and vulnerability due to cessation of deforesting activities (part of the 'resource users' context). Second is the political position of those who stand to lose from cessation of deforesting activities relative to those who stand to gain from PES. Thus, we have the following hypotheses:
H3: The more a PES program functions to relieve poverty, the stronger its impact will be on reducing deforestation.
H4: PES deforestation reduction impact is negatively moderated by prevailing levels of local inequality in holdings of forested property and vulnerability due to cessation of deforesting activities.
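One way to make H1, H2 and H4 concrete is to write them as sign restrictions in a simple regression with an interaction term. The specification below is only an illustrative sketch: it is not the estimating equation of this review or of any included study, and every symbol in it is an assumption introduced here.

```latex
% Illustrative specification only; all symbols are assumptions.
% D_i: annual deforestation rate in area i
% W_i: welfare measure in area i (e.g. income)
% T_i: 1 if area i is covered by a PES program, 0 otherwise
% Q_i: local inequality in holdings of forested property
\begin{align}
D_i &= \alpha + \beta T_i + \gamma Q_i + \delta\,(T_i \times Q_i) + \varepsilon_i \\
W_i &= a + b\,T_i + u_i
\end{align}
% H1: \beta + \delta Q_i < 0   (PES lowers the deforestation rate)
% H2: b \ge 0                  (PES does not reduce welfare)
% H4: \delta > 0               (higher inequality weakens the reduction)
```

H3 concerns mediation rather than moderation, so it would require a further equation linking the welfare change induced by PES to the deforestation outcome; that step is not shown in this sketch.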

Unintended consequences and moderating factors for PES
The four hypotheses stated so far reflect the main policy interests motivating this review, but they do not reflect a consensus opinion on how PES may work. Various moderating factors and potentially unintended outcomes need to be considered. The justification for PES, ostensibly, is that without intervention, benefits of forest protection are external to those who would contribute to deforestation. PES programs thus harness and redirect the value of such externalities in the form of payments to those who would otherwise contribute to deforestation (Wunder, 2005; Angelsen, 2010).
In principle, a PES arrangement operates as a standard performance-based contract, whereby upon performance, in terms of forest protection or rehabilitation, payments are issued (Ferraro, 2011). A number of conditions are necessary for such an arrangement to work. Payments must be targeted toward those whose activities significantly affect deforestation rates, the payments must be sufficient to overcome opportunity costs of conservation, and the 'sellers' must be induced to carry out the conservation service rather than pocketing the payment and continuing with deforestation (Ferraro, 2011).
These conditions may fail if institutional conditions or even cultural conditions (e.g., 'a payment culture'; Muradian et al., 2013; Wunder, 2013) are not right. Those designing the program may have inadequate knowledge or capacity to target properly or to set appropriate payment levels.[3] Constraints on PES buyers' ability to monitor and sanction may allow would-be sellers of conservation services to get away with pocketing benefits without actually reducing deforestation. These institutional conditions are captured by the 'governance system.' A PES buyer's ability to monitor and sanction will depend on local public administration, law enforcement capacity, as well as levels of corruption.
Poverty alleviation effects of PES will depend on the targeting of the program and whether local institutions represent the interest of the poor and therefore provide accountability mechanisms to ensure that benefits accrue to the poor (Corbera et al., 2009). Average incomes may rise, for example, but these gains may be concentrated among the non-poor, in which case poverty levels may be unchanged.
It is also possible for PES to have perverse or unintended negative effects. For instance, cessation of deforestation may reduce demand for labor from poor households or otherwise infringe on livelihoods of the poor, leading to welfare losses. PES may increase the value of land and result in more powerful groups displacing poorer households so as to gain control of the land that the poor occupy, again resulting in welfare losses among the poor (Landell-Mills and Porras, 2002; Langholz et al., 2000). Finally, the 'commodification' of forest could erode people's sense of the intrinsic value of forests; this could make custodians of forested land more receptive to bids proposing commercial conversion of forests that are likely to be more lucrative than conservation contracts (Muradian et al., 2013; though see also Wunder, 2013).

Moderator hypotheses
We can state these points from the two sections above in terms of moderator hypotheses. The hypotheses are as follows:
H5: PES deforestation reduction impacts are positively moderated by the level of local administrative and enforcement capacity.
H6: PES deforestation reduction impacts are negatively moderated by levels of corruption in government.
In testing these hypotheses, we control for variations in the design features of PES programs. PES programs vary by the size and terms of the payments offered. Differences across programs will reflect policy-makers' adaptation to contextual factors. In our analysis, we study how contextual variables, and in particular the moderating factors discussed above, affect the PES design.
[3] Indeed, as Boerner and Wunder (2008) and Grieg-Gran (2008) demonstrate, ex ante valuation of opportunity costs of forest conservation is not a trivial undertaking. For example, in valuing the opportunity costs in two sites in Brazil, Boerner and Wunder combine production and price data with forest loss projections to derive valuations that could be used to scale payments. Along similar lines, Gregersen et al. (2010) further problematize opportunity cost analyses by pointing out that some opportunity costs may be illegal or based on informal markets (e.g., illegal timber trade or slash-and-burn farming), making it especially difficult to establish clear opportunity cost benchmarks.

WHY IT WAS IMPORTANT TO DO THIS REVIEW
While the environmental science is clear in characterizing the potential gains from forest conservation (Santilli et al., 2005; Gullison et al., 2007), it remains for social scientists to provide insights into how institutions and incentives may be arranged to realize such potential (Gibson et al., 2000). Realized impacts may depart substantially from hypothetical projections, in which case the latter on their own are not a reliable guide for policy. Evidence from case studies of PES programs is inconclusive about the effectiveness of such programs for forest protection; this may reflect how the implicit theories used to design PES programs have failed to account for local structural and institutional context (Angelsen, 2010; Gibson et al., 2000; Tacconi, 2007; Wunder, 2005).
Typical evaluation approaches in this field estimate the worthiness of conservation programs on the basis of elicited valuations of environmental services combined with hypothetical projections of the services that a program is supposed to deliver. As Ferraro et al. (2011) and Ferraro (2011) argue quite convincingly, there is a need to move toward credible estimation of the effects of conservation programs. Efforts to apply counterfactual analysis to assess the effects of environmental programs have been rather limited to date, but studies using quasi-experimental approaches do exist. These are currently scattered in the academic and grey literature, with no comprehensive synthesis available to date.
This review complements a number of other systematic reviews assessing the evidence on interventions considered under the REDD+ initiative and other efforts to reduce deforestation. Bowler et al. (2010) [...]
There is a range of recent, related review studies with goals similar to those of this review, and so it is important to clarify our added contribution. Pattanayak et al. (2010) review theoretical motivations for forest-oriented PES and findings from eight quasi-experimental studies and 18 case studies, but do not apply the replicable search and synthesis methods of a systematic review. Wunder et al. (2008) review evidence on distributive effects of forest-oriented PES programs from case studies, but do not provide quantitative synthesis. The volume edited by Angelsen (2009) contains chapters that describe varieties of forest conservation policies, including forest-oriented PES programs, but these reviews do not adopt the replicable search and synthesis methods of a systematic review.

Objectives
The overall objective of this review is to assess the evidence on the conservation and poverty impact of PES programs and to assess the extent to which the poverty impact of such programs in turn affects the extent to which conservation benefits are realized. Doing so is important for moving the debate outlined in section 1.1 beyond theoretical discussions and into better-informed, evidence-based discussion (assuming relevant evidence can be found). More specifically, we seek to test the hypotheses set forth above, with hypotheses H1 and H2 being of primary interest. Hypotheses H4 through H6 are of secondary interest and in testing them we seek to evaluate how institutional and social conditions (namely, inequality, institutional capacity, corruption, and democratic accountability) moderate the impact of PES programs. Our strategy for selecting studies will be targeted toward testing the four primary hypotheses as rigorously as possible. Table 1 relates each hypothesis to the types of evidence we will need. Such an assessment of impacts does not necessarily provide the basis for a full cost-benefit analysis of PES programs. We acknowledge this limitation and propose that follow-up work should focus on filling in the cost side of the equation as a complement to the analysis that we provide in this report.

Table 1: Hypotheses and types of evidence needed

Main Hypotheses
H2: PES have non-negative impact on local poverty levels.
Evidence needed:
Quantitative data on forest conservation and host community poverty outcomes for sites with PES and sites that constitute a plausible counterfactual.
Qualitative accounts of whether the interventions operated as planned.

Mediator Hypothesis
H3: The more a PES program functions to relieve poverty, the stronger will be its impact on reducing deforestation.
Evidence needed:
Quantitative estimates of both poverty and deforestation impacts from PES for at least a subset of cases to assess co-variation between the two types of impact.
Qualitative accounts of whether poverty benefits (disruption) contributed to compliance (noncompliance) and effective (ineffective) functioning of PES programs.

Moderator Hypotheses
H4: PES deforestation reduction impact is negatively moderated by the level of local inequality.
H5: PES deforestation reduction impacts are positively moderated by the level of local administrative and enforcement capacity.
H6: PES deforestation reduction impacts are negatively moderated by the level of corruption in government.
Evidence needed:
Quantitative measures of local inequality, local capacity, corruption, local democratic accountability, and opportunity costs of conservation borne by forest communities for each study to assess covariation between these measures on the one hand and deforestation and poverty on the other.
Qualitative accounts of how issues related to inequality, local capacity, corruption, or local democratic accountability affected the functioning and effectiveness of given PES programs.

Selection Criteria
Our selection criteria are summarized in Table 2. Details are given in the following subsections.

PARTICIPANTS
This review includes only studies that focus on either (i) deforestation outcomes in forest areas in developing countries or (ii) poverty conditions of forest-dwellers and populations residing in communities that are proximate to natural growth forest areas in developing countries. 'Forest' is defined as per the United Nations Food and Agriculture Organization Global Forest Resources Assessment: 'Land spanning more than 0.5 hectares with trees higher than 5 meters and a canopy cover of more than 10 percent, or trees able to reach these thresholds in situ. It does not include land that is predominantly under agricultural or urban land use' (Food and Agriculture Organization, 2010: 6). 'Developing countries' are those classified as low income, lower middle income, or upper middle income by the World Bank in the year of the initiation of the program under study.

INTERVENTIONS
The review includes studies of PES programs. For a program to be considered a PES program, there must be a clear start date at which either payments or rewards are offered to individual or corporate property holders to maintain or rehabilitate (for example, via planting endemic species) natural forests, or institutions are established to facilitate such offers. We allow for those offering the rewards (the 'buyer entity') to be either public or private actors, and we allow for payments to be made either conditionally on, or in advance of (and therefore not necessarily conditionally on), the fulfilment of the prescribed maintenance or rehabilitation. These differences are noted in our characterization of the design of each PES program below.

OUTCOMES
Outcomes of interest are (i) deforestation or (ii) poverty conditions of forest-dwelling communities. Similar to what Bowler et al. (2010) discovered, researchers in our selected studies varied in the precise metric that they used for deforestation impacts, including differences in operational definitions for deforestation or degradation and different types of data sources, for example, on-the-ground point samples or remote sensing samples from satellite or fly-over imagery (West, 2009; Achard and Hansen et al., 2013). We accepted whatever measure was used for the outcomes of interest as presented by the authors.
We sought to assess poverty impacts in terms of impacts on consumption, income, or income potential for members of forest communities residing below or just above the consumption-based, two-dollar per day purchasing power parity absolute poverty line (Ravallion et al., 1991). Such outcomes are typically assessed using household economic surveys or administrative data on consumption, food security, employment, or access to productive assets (Deaton, 1997). In the absence of such fine-grained data, we sought to look at studies that measure differential consumption or income impacts for 'poor' versus 'non-poor' households or communities. Again, we accepted whatever measure was used for the outcomes of interest as presented by the authors.
We also sought to pay attention to the potential impact of in- or out-migration on poverty outcomes. If a program causes out-migration among the poorest, then the resulting poverty level in the area may be lower than it was before the program. However, it would be inappropriate to take this to mean that the program helped to alleviate poverty.
Finally, we were particularly interested in identifying unintended effects of forest conservation programs on local poverty conditions. We also took note of whether studies accounted for spill-over effects such as deforestation 'leakage' or 'slippage' (Wu et al., 2001). Failure to account for such spill-over may result in a biased interpretation of the impact of a program. Table 1 above sketched out the types of quantitative data and qualitative evidence we included in this review. We prioritized identifying rigorous studies that address hypotheses H1 and H2. For quantitative synthesis, we sought well-designed experimental or quasi-experimental studies that use robust methods to construct approximations to the counterfactual for the areas or individuals subject to a PES program. We then made comparisons between outcomes in the 'treatment' group and outcomes in the approximation to the counterfactual for the treatment group.

STUDY TYPES
We accepted for quantitative synthesis only (a) randomized studies or (b) quasi-experimental studies that employ strategies for causal identification with clearly delineated treated and comparison areas and use some method for removing biases due to non-random assignment of treatment. Such methods include: regression adjustment, difference-in-differences estimation, instrumental variables regression, fixed effects regression, regression discontinuity, matching, or inverse-propensity-weighted estimation. While application of such a method is sufficient for inclusion in our study, we appreciate that not all studies apply methods for causal identification with equal rigor and therefore we assessed the quality of all included studies (below we discuss the tools we used to assess study quality).
Quantitative studies that were excluded were those that failed to establish a credible approximation to the treatment group counterfactual. This included studies that relied exclusively on uncontrolled before-after comparisons or failed to adopt any of the above-mentioned methods of analysis to correct for selection bias and confounding. Qualitative data are used in the synthesis to provide descriptions and context for interventions that are included in the quantitative synthesis. Such data were drawn from the quantitative studies themselves as well as qualitative studies that cover the same programs or settings as the quantitative studies.

Participants
Forest areas or forest communities in developing countries.

Outcomes
Deforestation or poverty among forest communities.

Study types
Quantitative studies providing a robust counterfactual via randomized experiment or quasi-experiment, or qualitative studies with clear research objectives, original analysis, explanation of methods, and seeking to contribute to the academic conservation or social science literature.

ELECTRONIC SEARCHES
Our search strategy was developed after initial scoping exercises with a Campbell Collaboration information retrieval specialist. We searched the set of databases, specialist websites, and search engines that Bowler et al. (2010: 55-56) searched as well as others identified to possibly contain relevant content 4 . Our list of sources searched is given in appendix 11.1 below.
Our search strings included the following key words: ("pay*" OR "reward*" OR "incentiv*" OR "compensat*") AND ("forest" OR "deforest*" OR "ecol*" OR "ecos*" OR "environment*" OR "conservation"). To these keywords we also applied a low- or middle-income filter based on the Cochrane EPOC filters (http://epocoslo.cochrane.org/lmic-filters). The search strategy was adapted for individual databases. An example of a full search strategy is included in appendix 11.1.4 below.
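The boolean logic of the keyword string can be illustrated with a small screening function. This is a sketch only: the function name `matches_search` and the helper `_to_regex` are hypothetical, the trailing `*` is assumed to mean truncation (any word ending), and the low- or middle-income country filter and database-specific adaptations are not reproduced.

```python
import re

# Stems copied from the search string; a trailing "*" is read as truncation.
PAYMENT_TERMS = ["pay*", "reward*", "incentiv*", "compensat*"]
FOREST_TERMS = ["forest", "deforest*", "ecol*", "ecos*", "environment*", "conservation"]


def _to_regex(terms):
    """Compile a list of stems into one case-insensitive whole-word pattern."""
    parts = []
    for term in terms:
        if term.endswith("*"):
            # Truncated stem: allow any run of word characters after it.
            parts.append(re.escape(term[:-1]) + r"\w*")
        else:
            # Untruncated term: match the exact word only.
            parts.append(re.escape(term))
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)


PAYMENT_RE = _to_regex(PAYMENT_TERMS)
FOREST_RE = _to_regex(FOREST_TERMS)


def matches_search(text):
    """True if the text has at least one payment term AND one forest/environment term."""
    return bool(PAYMENT_RE.search(text)) and bool(FOREST_RE.search(text))
```

Real bibliographic databases apply their own truncation and phrase rules, so a record that passes this check is not guaranteed to be returned by any particular database, and vice versa.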
Some of the databases considered (for example, IDEAS, RUPES, and JSTOR) included search results for non-English language studies even when using English search terms and keywords. The relevance of such search results was reviewed by native language speakers (the authors were able to cover French, Spanish, German, and Bahasa Indonesia). Ultimately, only English language studies met our inclusion criteria. We did not impose any date restrictions. The searches were conducted in the period February-August 2013.

OTHER SEARCHES
We carried out hand searches of (i) key journals in relevant fields as listed in the appendix, using publisher search engines and (ii) references cited in papers accepted for review as well as in review papers or thematically relevant papers identified during the search. We had members of our advisory group and the specialist agencies listed in the appendix below review our search results to ensure that important studies were not missing from our search results.

Data Collection and Analysis
5.1 DATA COLLECTION AND ANALYSIS

Selection of studies
The review team applied the PICOS inclusion criteria listed in Table 2 in three stages: first to titles to remove spurious citations, then to abstracts, and finally to full texts. For all stages, we maintained an account of the number of studies excluded, and the reasons for exclusion, by tracking references in an Endnote database. In the full text stage, excluded studies were tagged in terms of the PICOS criteria that were violated. All screening was done by two independent reviewers from the research team, with disagreements resolved by a third reviewer from the team. To ensure consistency in selection procedures, multiple reviewers reviewed a sample (of 50, for example) of citations and consistency was assessed. If agreement rates were below 90 per cent, we addressed any inconsistencies in interpretation of the criteria to assure at least 90 per cent rates of agreement.
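The consistency check described above amounts to computing simple percent agreement between two reviewers on a common sample of citations and comparing it with the 90 per cent threshold. The following is a minimal sketch; the function names and the 'include'/'exclude' labels are hypothetical, and the review does not state whether a chance-corrected statistic (such as kappa) was also used.

```python
def agreement_rate(decisions_a, decisions_b):
    """Share of citations on which two reviewers made the same screening decision."""
    if len(decisions_a) != len(decisions_b):
        raise ValueError("both reviewers must screen the same citations")
    if not decisions_a:
        raise ValueError("at least one citation is required")
    matches = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return matches / len(decisions_a)


def needs_recalibration(decisions_a, decisions_b, threshold=0.90):
    """True if agreement falls below the threshold and criteria should be re-discussed."""
    return agreement_rate(decisions_a, decisions_b) < threshold
```

With a sample of 50 citations, for instance, four disagreements give 92 per cent agreement and pass the check, while six disagreements give 88 per cent and would trigger a review of how the criteria are being interpreted.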

Data extraction and management
For studies eligible for inclusion, we collected data on the study characteristics, findings, and moderators using a coding form (see appendix section 9.6). The data were double entered into Microsoft Excel by the review team. While it would be ideal to have data on moderator variables measured at the level of the regions in which the programs under study are applied, such data were not typically available. Therefore, we obtained data on the moderator variables using the relevant country-level indicators from the World Bank Governance Indicators. In the end, because of the low number of countries represented, there was little that we could do with these moderator variables.

Assessment of risk of bias in included quantitative studies
Risk of bias was assessed based on the guidance of the IDCG Risk of Bias Tool (version March 2012) 5 . We appraised studies according to the following criteria:
• Avoiding selection bias due to non-random assignment, a non-exogenous source of quasi-experimental variation in assignment, or no adjustment for differences in baseline measurements: We assessed this on the basis of whether or not the study worked with a source of exogenous treatment assignment.
• Avoiding confounding bias due to lack of control for key confounders: Based on an initial reading of the studies, we concluded that key confounders included variables related to land quality, socio-economic conditions (namely, livelihoods, living standards, and access and size of markets for agricultural producers), and accessibility of treated land areas. We assessed whether studies accounted for all three types of confounders.
• Avoiding motivation bias from measurement strategies that may be tainted by subjects' interest in presenting themselves in a positive light or telling researchers 'what they want to hear': This was assessed as being satisfied if study conclusions were drawn from effects estimated on non-self-reported data or data obtained using other measurement strategies that reduce motivation biases.
• Accounting for potential bias due to spill-overs: We assessed whether studies either evaluated units that were insulated from spill-over or, in cases where spill-over was a likely concern, tried to estimate the extent to which spill-over may bias naïve comparisons.
• Free of selective outcome reporting and analysis fishing: We assessed whether studies clearly omitted results that might undermine the conclusions of the study or drew conclusions on the basis of methods that showed high potential for specification search.
• Appropriate statistical inference due to proper calculation of standard errors and confidence intervals.
We coded each study on the basis of whether it clearly satisfied each of these conditions (coded as 'yes'), clearly failed to do so ('no'), or whether it was impossible to judge ('unclear').

Measures of treatment effect
For effects on forest cover, whenever possible we standardized effect estimates to annual forest cover change rates following the proposals of Puyravaud (2003). For effects on material welfare and poverty, we used percentage change over the estimated average counterfactual outcome (e.g., for income effects, percentage change in income relative to the average income of the control group). Section 9.3 of the appendix provides the precise calculations for these standardized measures and associated standard error approximations.
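As an illustration of the two standardizations just described, the sketch below uses the annual rate of change proposed by Puyravaud (2003), r = ln(A2/A1) / (t2 - t1), where A1 and A2 are forest areas at the start and end of the period. The function names are hypothetical, and the review's exact calculations (including standard error approximations) are those in its appendix, which are not reproduced here.

```python
import math


def annual_change_rate(area_start, area_end, years):
    """Puyravaud (2003) annual rate of forest cover change: ln(A2 / A1) / (t2 - t1).

    Negative values indicate net forest loss; multiply by 100 for per cent per year.
    """
    if area_start <= 0 or area_end <= 0 or years <= 0:
        raise ValueError("areas and period length must be positive")
    return math.log(area_end / area_start) / years


def rate_difference(treated_start, treated_end, control_start, control_end, years):
    """Effect on the annual change rate: treated-area rate minus comparison-area rate."""
    treated_rate = annual_change_rate(treated_start, treated_end, years)
    control_rate = annual_change_rate(control_start, control_end, years)
    return treated_rate - control_rate


def percent_change_vs_control(effect, control_mean):
    """Welfare effect expressed as a percentage of the average control-group outcome."""
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    return 100.0 * effect / control_mean
```

A positive `rate_difference` means forest cover declined more slowly (or grew faster) in the treated area than in the comparison area over the same period.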
When multiple estimates were presented in a given study, we first tried to select the estimate that posed the lowest risk of bias. For studies relying on 'conditional independence assumptions' and using multiple regression or matching, this would be the estimate that either controlled for or achieved the best balance on the largest set of pre-treatment covariates 6 . When there was no clearly defensible way to identify the single estimate in a study with the least risk of bias, we extracted all estimates and then performed our synthesis with the mean of the different estimates as well as the mean of the standard error estimates. This approach does not account for the dependence of the different effect estimates, although it avoids pitfalls in the use of standard approaches that assume independence 7 .
Some of the studies that we included examine the same program; however, the estimates that they present cover different time periods, cover different regions, and use independent data sources. As such, we treat these as distinct (and statistically independent) estimates.
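The difference between averaging within-study estimates and pooling them as if they were independent can be seen in a short sketch. The function names and the numbers in the usage note are hypothetical illustrations, not values from the included studies.

```python
import math


def mean_synthesis(estimates, std_errors):
    """Within-study summary used when no single estimate is clearly least biased:
    the mean of the estimates paired with the mean of their standard errors."""
    n = len(estimates)
    if n == 0 or n != len(std_errors):
        raise ValueError("need matching, non-empty lists")
    return sum(estimates) / n, sum(std_errors) / n


def inverse_variance_synthesis(estimates, std_errors):
    """Standard inverse-variance pooling, which assumes the estimates are independent."""
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need matching, non-empty lists")
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)
```

For four estimates from the same data, each with a standard error of 0.1, `mean_synthesis` keeps the standard error at 0.1, whereas `inverse_variance_synthesis` halves it to 0.05 even though no new information has been added. The pooled standard error keeps shrinking as more dependent estimates are included, which is the pitfall the averaging approach avoids.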

Unit of analysis issues
When the unit of analysis was at a lower level of aggregation than assignment units, standard error calculations should account for the attendant 'clustering'. We checked to be sure that this was done. In cases where it was not, we noted it in our risk of bias assessment. We sought to correct the standard errors using standard formulas, but in the cases where the problem arose the information needed to do so was not available.
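The text does not say which correction formula was intended. One commonly used approximation, shown here purely as an assumption, inflates a naive standard error by the square root of the design effect, 1 + (m - 1) * ICC, where m is the average number of analysis units per assignment unit and ICC is the intra-cluster correlation. The function and parameter names are hypothetical.

```python
import math


def cluster_adjusted_se(naive_se, avg_cluster_size, icc):
    """Inflate a naive standard error by the square root of the design effect.

    Design effect = 1 + (m - 1) * ICC. This is an approximation that assumes
    roughly equal cluster sizes; it is not the review's documented method.
    """
    if naive_se < 0:
        raise ValueError("standard error must be non-negative")
    if avg_cluster_size < 1:
        raise ValueError("average cluster size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    design_effect = 1.0 + (avg_cluster_size - 1.0) * icc
    return naive_se * math.sqrt(design_effect)
```

Applying such a correction requires the cluster sizes and an estimate of the ICC, which is the kind of information that was not available for the studies concerned.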

Dealing with missing data and incomplete data
When studies did not report on endpoint or intermediate outcomes, the study authors were contacted to determine whether such outcome data did in fact exist and whether estimates could be produced. However, we did not receive data from any authors that would allow for the construction of effect estimates that went beyond what appeared in the original studies.
7 Initially we had used an inverse-variance-weighted averaging approach for synthesizing the different effect estimates. But, as a reviewer astutely pointed out, such an approach ignores the dependence between measures and results in synthesized standard errors that become artificially small as one increases the number of estimates. Our approach of using the mean of the effect estimates and standard errors was proposed as the least misleading way to synthesize effect estimates when there is no clear way to select one minimally biased estimate.

Quantitative Synthesis
Our plan for a quantitative synthesis was guided by the hypotheses listed in Table 1. The 'main hypotheses' (H1 and H2) require a synthesis of basic effect estimates on deforestation and welfare or poverty. For each hypothesis, the following conditions had to apply for a statistical meta-analysis to be justified (adapted from Wilson et al., 2011): i) more than two studies meeting the quantitative inclusion criteria with effect sizes for common outcome constructs AND ii) effects measured against similar comparators.
Only for the effects of PES on deforestation were these conditions met. We thus computed an average overall effect as a weighted average that accounts for the imprecision of each effect estimate. The estimate was constructed using a random effects model fit via empirical Bayes in the metafor package for R (Viechtbauer, 2010). In our forest plot for PES deforestation effects we display the synthetic random effects means along with their 95 percent confidence intervals (displayed as a black diamond).
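To make the idea of a precision-weighted random effects mean concrete, the sketch below implements a random effects fit in plain Python. Note the substitution: it uses the DerSimonian-Laird moment estimator of the between-study variance and a normal-approximation confidence interval, not the empirical Bayes estimator and small-sample adjustment that the review reports using in metafor, so results will generally differ somewhat. The function name and the returned dictionary keys are hypothetical.

```python
import math


def random_effects_dl(estimates, std_errors):
    """Random effects meta-analysis with the DerSimonian-Laird estimator of tau^2."""
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two estimates with matching standard errors")

    variances = [se ** 2 for se in std_errors]
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Fixed effect (inverse-variance) mean and Cochran's Q statistic.
    fixed_mean = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, estimates))

    # Between-study variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random effects weights add tau^2 to each study's sampling variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    mean = sum(w * y for w, y in zip(re_weights, estimates)) / sum_re
    se = math.sqrt(1.0 / sum_re)

    # I^2: percentage of total variability attributed to between-study heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0

    return {
        "mean": mean,
        "se": se,
        "ci95": (mean - 1.96 * se, mean + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }
```

When the study estimates are very similar relative to their standard errors, `tau2` and `i2` are truncated to zero and the random effects mean coincides with the ordinary inverse-variance mean.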
The limits of the evidence base prevented further meta-analyses. In our protocol, we proposed a meta-regression approach for testing the moderator and mediator hypotheses. We could not implement this approach for lack of studies. Rather, we were forced to rely on qualitative information relevant for the included studies to comment on, rather than test, the moderator and mediator hypotheses. For similar reasons, we could not implement quantitative analyses of publication biases.
For the most part, our quantitative synthesis is limited to tables of effect estimates and narrative discussions of trends in the size and direction of the effects reported by the studies. The narrative discussion highlights issues related to modes of measurement, nature of comparators, as well as moderator conditions that should be taken into account when comparing the different effect estimates. We also provide a critical assessment of methods that have been employed and provide concrete recommendations for how rigorous and comparable evidence might be generated in future research.

Use of qualitative data
We extracted qualitative information from both the included quantitative studies and qualitative studies that covered the same types of programs and contexts (defined by our moderator variables) as our quantitative studies. We use such qualitative data to establish that conditions recorded in quantitative data are correctly interpreted and that hypothesized, but difficult to measure, chains of events do in fact occur in linking explanatory factors to outcomes (Collier, 2011; Vajja and White, 2008). Our strategy was to search on hypothesis-specific keyword stems in the articles for the mediator and moderator hypotheses outlined above. We used these search results to localize content that may be relevant to our hypotheses. We extracted qualitative accounts or conclusions that were relevant to each of the hypotheses, and used these to provide insights in our narrative discussion. The keyword stems that we used included the following:

Results
Our search for qualifying studies followed the process presented in Figure 2. This search process identified 1382 articles on PES using the search terms described in Section 4.1. Screening of abstracts had us narrow this to 149 PES studies to be screened at full text. Screening of full text papers reduced this first to a set of 11 quantitative studies and one qualitative PES study that met our inclusion criteria. We then conducted a second targeted search for other relevant and methodologically adequate qualitative studies that our initial search did not recover. We did this second targeted search by identifying any qualitative studies referenced in the bibliographies of the accepted quantitative studies, checking the websites of the quantitative study authors to see if they had produced complementary qualitative research, and then searching in the same databases as in the initial search, using as search terms the names of the programs that were being evaluated in the quantitative studies. This yielded eight new qualitative studies of PES programs assessed in the quantitative studies, or in studies from the same contexts as those studies. Therefore our final set was 11 quantitative and nine qualitative PES studies. Appendix section 9.7 provides information on studies that were excluded at the full text review stage. Tables 3 through 6 provide characteristics of the included studies, grouped by program.

CHARACTERISTICS OF INCLUDED STUDIES
The evidence base for the effects of PES on deforestation and poverty is extremely thin and these studies have methodological shortcomings. We identified a handful of high quality studies, which cover a small number of programs and contexts. Few of these studies provide insights on the intersection of forest conservation and poverty, and the moderating effect of the social and institutional context. Table 7 below displays various design features for the programs evaluated in the quantitative synthesis. Section 9.2 in the appendix provides more detail on each of the programs. We did not identify any studies that provide evidence on both deforestation and poverty effects for a common program, which prevents us from carrying out some of the quantitative analyses that we hoped to do on how poverty effects might in turn mediate deforestation effects.

Risk of bias in included studies
In addition to the small number of studies, the evidence base suffers from methodological shortcomings. Figure 3 shows the results of our risk of bias assessment, summarised for all included studies (study by study risk of bias assessment is available in appendix 11.8). We did not identify any experimental studies, and only one study made use of a plausibly exogenous source of variation: Alix-Garcia et al. (2012) sampled 'matched control' parcels from properties that were idiosyncratically excluded from the first PSAH cohort but admitted to a subsequent cohort. This was the only study that ensured that the 'control' parcels included in the analysis could verifiably be assumed to have some chance of having been treated by the program based on the expressed interest of the parcel owners, thereby reducing concerns about self-selection bias. All other studies required that this assumption be taken on faith.
While all studies performed some kind of confounder control, many failed to include the full combination of forest land quality, socio-economic conditions, and conditions determining accessibility to markets that are often associated with both PES take-up and PES impact (the importance of all three factors for both take-up and forest cover trends was demonstrated across the studies themselves). It is reasonable to assume that PES programs tend to be applied systematically to parcels that landowners have no intention to deforest. If research designs fail to adequately control for such selection effects, estimates obtained on forest conservation effects will be biased upward.
A majority also failed to give explicit attention to the issue of spill-over ('leakage' or 'slippage' when speaking of deforestation). Again, this will tend to bias estimates of program effects upward: deforestation displaced onto non-program parcels will be mistakenly interpreted as an approximation of what would have happened with no program. Similarly, for welfare outcomes, if opportunities seized by participants reduce the opportunities available to non-participants, then non-participants' welfare may be worse than would be the case with no program, in which case the non-participants do not provide a valid approximation to the no-program counterfactual 10 .
Methods used by authors for statistical inference (standard errors, confidence intervals) were also problematic in some cases. For example, the study by  failed to provide any inferential statement on the effect estimates 11 . In general, then, the methodological shortcomings of the evidence base are quite severe. For observational studies, the Alix-Garcia et al. (2012) study provides a model that others ought to emulate. A move toward experimental studies would be helpful, and this is a point that we discuss in more detail in our conclusions below.
Figure 3: Risk of bias assessment

CHARACTERISTICS OF INCLUDED STUDIES
Keeping in mind the weakness of the evidence base, Table 5 and Figure 4 display the estimated effects on forest cover outcomes for PES programs from the quantitative studies that qualified for inclusion. We converted estimates of forest cover effects to effects on the annual forest cover change rate (following the methods described in Appendix 9.3.2). These are presented in Table 5 under the ra - rc heading, with the forest cover change rate in the treated area (ra) provided as a benchmark. In cases where multiple estimates of the same effect were reported (e.g., Arriagada et al. 2012 provide four different estimates of the effects of PSA on forest cover over 1997-2005), we use the mean of the estimates and mean of the associated standard errors (see fn. 8). Section 11.9 of the appendix lists all the effects that we used to compute these mean effects. Figure 4 provides forest plots of these standardized effects on forest cover change rates. The top panel in Figure 4 shows effects on annual forest cover change rates as measured on forest cover change attributable solely to deforestation (such effects do not account for non-forested areas becoming forested). The bottom panel shows effects on annual forest cover change rates as measured on forest cover change attributable to either deforestation or forest growth on previously non-forested parcels. A random effects mean estimate for the top panel suggests that PES programs have, on average, caused the annual forest cover change rate to be about 0.21 percentage points higher (s.e.=0.09, 95% CI: [0.03, 0.39]) 12 . In other words, PES has tended to reduce the annual rate of deforestation by 0.21 percentage points.
11 The apparent rationale may be that the study used complete population data and that there was therefore no uncertainty, sampling or otherwise. Such arguments misunderstand the nature of causal effect estimation: what is 'missing' in a causal analysis is data on counterfactual outcomes. Proxies from 'control' areas serve to fill in this missingness for the purposes of an analysis that estimates the 'treatment on the treated,' but the extent to which such a proxy departs from the true counterfactual is a source of uncertainty that needs to be accounted for (Abadie et al. 2014).
The statistics shown at the bottom left of the forest plot are measures of effect heterogeneity generated from the random effects fit. The low (essentially 0) estimates for the between-study variance (τ²) and the percentage of variability due to between-study heterogeneity (I²) suggest that the effects are highly similar across these studies.
Looking at the bottom forest plot, we see that effect sizes tend to be larger when we look at forest cover change attributable to either deforestation or forest expansion (bottom panel of Figure 4). Estimated effects on annual forest cover change rates ranged from 0.
study reports an effect of 10 percentage points, an outlier in its magnitude (evident in Figure 3). The estimates from Scullion et al. (2011) were not accompanied by standard error estimates, and so we can only report point estimates 13 . Based on our concerns noted above about potential selection biases and spill-over problems, we believe that these estimates likely overstate the true effects of PES on forest cover change.
12 The random effects estimates were computed in R using the 'metafor' package (Viechtbauer, 2010). The fit was produced using the empirical Bayes procedure, the confidence interval for the random effects mean applies the small sample adjustment of Knapp and Hartung (2003), and the predictive interval uses a t-distribution with degrees of freedom equal to the number of studies minus 1, following Higgins et al. (2009).
13 In section 9.3.1 of the appendix, we indicate that for other studies, we impute a standard error corresponding to a p-value of 0.5 in cases where no standard errors are reported. For this case, given the extreme magnitude of the estimated effects, the resulting confidence intervals would span almost the entire visible range in the graphs, making them completely uninformative. Given the outlier nature of the effect estimates from Scullion et al. and our concerns about its methodological quality, we choose not to proceed with imputing a standard error and otherwise producing synthetic estimates based on the results of this study.
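One way to read 'a standard error corresponding to a p-value' is to back out the standard error that would produce that two-sided p-value under a normal approximation: SE = |estimate| / z, where z is the standard normal quantile at 1 - p/2. The sketch below shows this under those assumptions; the function name is hypothetical, and the actual procedure is the one in the review's appendix 9.3.1, which is not reproduced here.

```python
from statistics import NormalDist


def impute_se_from_p(estimate, p_value):
    """Standard error implied by a two-sided p-value under a normal approximation."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p-value must lie strictly between 0 and 1")
    if estimate == 0:
        raise ValueError("a zero estimate does not determine a standard error")
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    return abs(estimate) / z
```

A p-value of 0.5 implies a standard error of roughly 1.48 times the absolute estimate, so the resulting 95 per cent interval extends almost three times the size of the estimate on either side. For a very large point estimate this produces the wide, uninformative intervals described in the footnote.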
On the one hand, these larger effects make perfect sense mathematically: the deforestation outcome metric is a truncated measure (at best you can only have zero deforestation) while the forest cover change metric can vary more freely (at best you could have large amounts of forest gain, in principle). At the same time, this difference in effect magnitudes is substantively important. It suggests that evaluations of PES effects ought to take into account not just protection of existing forest but also the possibility that PES could contribute to growth or regrowth of forest. This is consistent with the proposition by Daniels et al. (2010) that, for Costa Rica's PSA at least, PES was more likely to tip farmers into allowing regrowth on nonforested parcels than to induce farmers to desist from clearing parcels.
The last effect displayed in the bottom forest plot in Figure 4 shows the estimate for the effect of MBCF on forest disturbance in Mexico, from Honey-Roses et al. These authors measure forest disturbance in terms of whether a parcel is covered by forest with at least 70 per cent canopy cover, while deforestation is measured in terms of whether a parcel is covered by forest with any detectable canopy cover. By construction, forest cover disturbance occurs at a higher rate than forest cover change per se, and the point estimate for the effect on the annual change rate is much larger (1.6 percentage points rather than 0.3, see Table 5), although as Figure 4 makes clear, the estimate is quite imprecise. In section 9.4 of the appendix, we present graphs that show the implications of these effects on rates of change for forest cover trajectories.
Many of the study authors (e.g., Robalino and Pfaff [2013], Robalino et al. [2008], and Alix-Garcia et al. [2012]) raise the issue of the inefficiency of PES programs when examined in terms of their forest conservation impacts. There are two parts to the inefficiency equation. First are the high fixed costs of setting up and managing such programs, given the need to demarcate parcels and measure their ex ante forest cover, and then to monitor compliance with reliable surveillance methods. We return to this issue below in discussing the importance of local administrative capacity (see also Honey-Roses et al. [2009] for a detailed discussion). The second problem arises from the fact that, by the evidence in the studies we reviewed, the 'additionality' achieved by PES programs has been in the neighborhood of a 0.2 percentage point reduction in the annual deforestation rate. This means that after 10 years of programming, we would expect about 98 per cent of the forested land that was retained to have been retained anyway, even were there no PES program in place. As such, the vast majority of payments are made 'for nothing' from an ex post perspective.
In their qualitative research on reasons for participation in Costa Rica's PSA, Arriagada et al. (2009) obtained the impression that farmers tended mostly to enrol forest areas that they had no intention to deforest. Arriagada et al. (2012) attempted to characterize what the effects of a more targeted program would be. They focused on Costa Rica's PSA programming in the Sarapiqui region, which was noted for high rates of deforestation and where implementation was facilitated by an NGO that sought to improve targeting.
Even there, the estimated effect on the forest cover change rate was a boost of about 0.7 to 1.7 percentage points with a mean effect of 1.2 percentage points (s.e.=0.40, 95% CI: [0.42, 1.98]), implying circa 12 percentage points more land covered in forest after ten years than would have been the case otherwise. This means that after 10 years payments on about 88 per cent of land covered by the PES program would have been for naught when strictly considering forest conservation impact.
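The back-of-envelope figures above (98 and 88 per cent) can be reproduced with the short sketch below. It assumes, as the text implicitly does, that the annual effect accumulates linearly over the program period, and it checks the reported confidence interval with a normal approximation; the function names are hypothetical.

```python
def share_without_additionality(annual_effect_pp, years=10):
    """Per cent of enrolled land on which payments made no difference,
    assuming the annual effect (in percentage points) adds up linearly."""
    cumulative_pp = annual_effect_pp * years
    return 100.0 - cumulative_pp


def normal_ci95(mean, se):
    """Normal-approximation 95% confidence interval."""
    return mean - 1.96 * se, mean + 1.96 * se


# Average additionality across the reviewed studies: 0.2 points per year
average_case = share_without_additionality(0.2)

# Targeted Sarapiqui case: mean effect of 1.2 points per year
targeted_case = share_without_additionality(1.2)

# Interval implied by the reported mean and standard error
ci_low, ci_high = normal_ci95(1.2, 0.40)
```

With these inputs, the average case gives about 98 per cent and the targeted case about 88 per cent, and the interval rounds to [0.42, 1.98], matching the figures in the text.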
As Daniels et al. (2010) point out, it is difficult to know whether the results from the Arriagada et al. (2012) study really represent a 'best case' scenario, given that estimation of the impact of PSA in Costa Rica is confounded generally by the fact that the 1996 law establishing the program also declared blanket regulation on the clearing of natural forests. As Rojas and Aylward (2003) also point out:
'In the Costa Rican case, existing legislation complicates the issue [of studying the impact of the PSA program on forest cover change] as the new [1996] Forestry Law effectively expropriated land use rights on private land by forbidding any change in land use on lands with forest cover. As a result the PES are frequently regarded as a compensatory payment for this expropriation rather than an incentive or compensation per se.' (Rojas and Aylward 2003: 94)
Whether more efficient PES programs are feasible remains an open question that needs further research. Work on variable pricing and PES auction schemes is one approach, although it is not yet clear that such complex contracting arrangements are practical in developing country contexts (Cason and Gangadharan, 2004; Jack et al., 2008). Munoz-Piña et al. (2008) report that such mechanisms were explicitly rejected for their complexity in the design of Mexico's PSAH.

Figure 4: Estimates of the effect of PES on forest cover change rates due to deforestation (top) and due to either deforestation or forest expansion (bottom).
The small black squares show the point estimates and the horizontal lines running through the squares show 95% confidence intervals. Effects are measured in terms of changes to annualized forest cover change rates (see appendix section 9.3.2 for details). Effect estimates from Scullion et al. (2011, bottom) were not accompanied by standard error estimates in the original study. The text beside each estimate shows the program, timeframe of the program, and the study. The Scullion estimates are for pine oak forest (top) and cloud forest (bottom). The black diamond on the top forest plot displays the random effects synthetic mean estimate of the effect sizes (see fn. 12). Even though studies in the top plot overlap temporally and by country, they are estimates of program effects from different programs in different regions, and so no adjustment for dependent effect sizes was deemed necessary in producing this synthetic mean estimate. No such synthetic mean was produced for the bottom plot because the Scullion study does not provide uncertainty estimates and the Honey-Roses et al. study does not measure outcomes on a scale that is directly comparable to the other studies in the plot. Random effects mean: I² = 37.4%, τ² = 0 (0), τ = 0.

EFFECTS OF PES ON POVERTY
The evidence base on the effects of PES on welfare, and in particular poverty, is extremely limited. We identified only two quantitative studies reporting effects on human welfare outcomes, covering China's Sloping Land Conversion Program over three years from 2001 to 2004 (Liu et al. 2010) and Mozambique's NCCL program over three years from 2003 to 2006 (Hegde and Bull 2011). These studies are described in Table 6 below, which also lists the studies' estimates of effects on income. Liu et al. (2010) Hegde and Bull (2011) also indicate that payments amounted to about 10 per cent of average household income, in which case the fact that the effect on household incomes was only 4 per cent is indicative of opportunities forgone, although they do not provide further clarity on what those may have been.
However, these average effects do not tell us how these programs affect poor households. First, we need to consider the extent to which poor households are really able to participate. For PES to contribute to poverty reduction, poorer households must be able to participate at high rates. But participation in PES programs is typically more difficult for poor households than wealthier households. , for example, discusses how Costa Rica's PSA program was structurally biased away from benefitting the poor by nature of the program requirements. For example, the need for land title, proof of tax compliance, absence of debts or fines, production of a notarized land map, and other high transaction costs may have precluded many poor from participating in the first place.
In qualitative and survey interviews with Costa Rican farmers, Arriagada et al. (2009) found that among those who were qualified to participate in PSA, based on land holdings, the primary reason for non-participation was that they did not understand the program. Zbinden and Lee (2005: 269) provide the following interpretation: 'Establishing a PSA contract requires considerable knowledge and the ability to manage administrative tasks. Less education (and presumably often poorer) farmers appear, on average, to be less likely to possess the skills needed to take equal advantage of the forest incentive program made available by the government'. As such, Pagiola (2008: 721) concludes that 'the bulk of program benefits tend to go to larger and relatively better off farmers', a reflection of how Costa Rica's PSA was designed, ostensibly, to prioritize conservation impact.
Alix-Garcia et al. (2012: 619) propose that similar constraints apply in Mexico's PSAH. Focusing on cost-based decisions to participate, their theoretical analysis shows that if PES 'just barely compensates for the opportunity cost of forest production, it will lead to an increase in the amount of land in agriculture [rather than PES] for credit-constrained households'. Access to credit would be required to sacrifice immediate conversion of forested land in anticipation of future benefits. Given that the poor are typically credit-constrained, they are therefore less likely to take up PES contracts of the type that PSAH offered, which were designed to hew closely to opportunity cost projections (Munoz-Piña et al. 2008). About 31 per cent of PSAH payment beneficiaries were poor households (Munoz-Piña et al. 2008). But this figure needs to account for the fact that many such households were benefiting as part of collective contracting through common-property forest user groups known as ejidos (about 70% of the natural forest in Mexico that was eligible for PSAH was on common property [Munoz-Piña et al. 2008]).
Second, one needs to account for the possibility that the effects for poor households may differ substantially from those for other types of households. Poor households may have less land to commit to such programs, reducing the potential for benefit in absolute terms (see fn. 14). Or, it could be that poor households are less able to translate whatever income or freed-up labor the program offers to other income-generating activities that do not require the use of land (see fn. 15). When weighed against opportunities forgone, this could mean that net benefits for poor households are small.

Hegde and Bull (2011) provide analysis of this issue for the NCCL program in Mozambique. When they focused on only the poor households in their sample, they found no statistically significant benefits to poor households. Looking at consumption expenditure, for example, they produced multiple estimates of the effect using different matching methods.  [-198805, 405317]). Hegde and Bull's (2011) results suggest that poor households did not benefit as much as other households, in absolute terms, as a result of their participation (although we should note that they did still benefit to some extent).
Hegde and Bull's (2011) analysis does not try to disentangle the precise reasons for the difference between poor and non-poor households, and the reasons for such variable benefits should be investigated in future research. For example, it may be that consumption expenditure is a poor measure of the welfare of poorer households in this context given limited access to markets (see fn. 16). A better measure may be savings and the maintenance of 'rainy day funds' to cover adverse economic or health shocks. For PSAH in Mexico, Munoz-Piña et al. (2008: 6) find that 'few ejido members aside from those with directive or representation functions know the conditions of the [PSAH] contract, even in ejidos that distributed payments among members'. While this does not establish that those in prominent positions in the ejido 'captured' the system for their benefit, it does raise concern about this possibility. Additional studies should investigate this concern.
Third, a proper analysis of the effects of PES on poverty cannot simply look at the effects of participation in PES programs, especially since participation per se is likely to be difficult for many poor households. Rather, one needs to consider how PES programs implemented in an area may also affect the welfare of poor households in the program area even if they are non-participating households. If PES reduces the demand for labor associated with logging or agriculture, then poor labor-supplying households may suffer. We did not identify any studies looking into this question with respect to PES, unfortunately, and so we propose this as a priority for further research. Relevant references for such research are studies by Robalino and Villalobos-Fiatt (2010) and Sims (2010) on the effects of conservation parks and protected areas on welfare outcomes.

INTERSECTION OF POVERTY AND DEFORESTATION IMPACT
Our theoretical discussion above proposed that the conservation impact of a PES program might be tied to its poverty impact. In order to address this possibility, one would need studies that evaluate both poverty and conservation outcomes jointly.
Unfortunately, no such studies were identified. Surprisingly, there was no overlap in the programs covered by the quantitative studies on conservation and poverty, respectively, and so even general comparisons across studies are impossible. That being the case, we cannot address the mediation hypothesis.
Some studies reported on how prevailing conditions of poverty might moderate the impact of PES programs. For example, Alix-Garcia et al.'s (2012) theoretical analysis of PES proposes that, conditional on a fixed payment schedule, poorer households will be all the more sensitive to opportunity costs if they are more credit-constrained and therefore less able to smooth over forgone immediate production in favour of future PES payments. For this reason, we might expect parcels put under PES contract by poor households to carry the lowest opportunity costs to conservation, or in other words, to be the least likely to be subject to deforestation pressure among parcels included in a PES program.
Consistent with this reasoning, Kerr et al. (2004) find that poorer farmers in Costa Rica are the least responsive among farmers in general to changes in the productivity returns to land, suggesting that credit constraints and other factors limit the ability of poor farmers to switch across different types of production. In a more direct empirical assessment, Alix-Garcia et al. (2012) find that the conservation impact in poorer municipalities (measured using a municipality vulnerability index) is substantially lower. While PES impacts in the richest areas were twice those in the overall sample, in the poorest areas the impact on avoided deforestation was approximately zero. This has important implications for debates about the extent to which welfare considerations, and in particular poverty impact, should be incorporated into the design of PES programs. At least with respect to Mexico's PSAH, more targeting of poor areas would only have served to reduce the program's conservation impact.

ROLE OF INSTITUTIONAL AND SOCIAL CONTEXT
The qualitative studies contain some insights on how the institutional and social context bears on the design and performance of PES programs. In some cases, we obtained specific insights relevant to our hypotheses about how local administrative capacity, corruption, democratic accountability, and inequality moderate the effectiveness of each type of program. Qualitative analyses of Mexico's PSAH by McAfee and Shapiro (2010) and Munoz-Piña et al. (2008) describe how the institutional and social context forced the design of PSAH to deviate from what one might consider economically optimal. These analyses touch on all three of the obstacles that Wunder et al. (2008) highlight as crucial challenges to efficient PES program design: (1) fairness and political constraints, (2) corruption and rent seeking through the program, and (3) capacity and knowledge limitations. These three factors line up very well with our proposed moderating factors. Analyses of PSA in Costa Rica were much less emphatic about how PSA deviated from what was perceived at the time as an efficient PES program, perhaps indicative of the less contested nature of the PSA policy-formulation process in Costa Rica relative to the contestation around PSAH in Mexico. We summarize the findings from these accounts in the sections that follow.

Inequality
Our theoretical discussion proposes that high levels of inequality in forest areas may undermine the effectiveness of PES programs. The qualitative data suggest that the Mexican PSAH program was clearly sensitive to concerns about inequality. Wunder et al. (2008: 850) point out that 'efforts to spread payments "fairly" throughout the country meant that a substantial share of funding went to area at little risk of deforestation and/or with limited or no threats to water supplies'. Inequality was such a large issue that the program focused more on keeping payments equal than on conserving the most endangered areas. McAfee and Shapiro (2010: 3) corroborate this, finding that 'involvement of federal agencies and rural activists shifted the program's emphasis toward poverty alleviation', and officials involved in the program took this to imply that the conservation impact would be lessened.

Capacity
Our theoretical discussion proposed that the level of local state capacity positively moderates the conservation impact of PES programs. We cannot evaluate this hypothesis quantitatively given the low number of studies. Also, the impact estimates do not vary enough for us to rank clearly the success of the programs and implementation periods. Nonetheless, qualitative accounts provide some useful insights. For PES, Honey-Roses et al. (2009) discuss the technical challenges of monitoring PES participants for compliance. Their review of current standards for using remote sensing and field surveying to monitor forest cover change leads them to conclude that 'current PES and PES-like schemes are underestimating the landuse changes and overpaying non-compliant participants' (Honey-Roses et al. 2009: 126). As Munoz-Piña et al. (2008) put it:
'The PSAH program reported no deforestation in participating areas. The claim of 100 per cent compliance is difficult to believe in the Mexican context, especially when the seriousness of cancellation of payments has not yet been experienced by any forest owner. [National Ecological Institute] researchers believe that the current low-resolution monitoring method is responsible for the over-enthusiastic results' (Munoz-Piña et al. 2008: 8).
Better monitoring to boost PES impact would require even higher levels of administrative capacity than have been applied thus far.

Corruption
Our theoretical analysis suggests that corruption in government negatively moderates the conservation impact of PES programs. Some of the qualitative studies provide insights, although again no direct quantitative test is possible from the available data. At its core, PES is a payment distribution mechanism. As such, the potential for corruption is high. Corrupt institutions may be more likely to siphon off payments and prevent them from being delivered to service providers. Corrupt officials may be less capable of or interested in enforcing conservation regulations. Wunder et al. (2008) described how payments in Mexico's PSAH program were used to finance things that have little to do with conservation: 'Many side objectives in Mexico's PSAH program, for example, were added after the program was created, either to placate politically powerful groups or to address other government objectives for which funds were insufficient' (Wunder et al. 2008: 849). Muñoz-Piña et al. (2008) also describe corruption in the creation of the Mexican PSAH program. During negotiations, landowner organizations used their political clout to fight for, and receive, significantly higher PES payments than what national program staff considered to be appropriate based on an opportunity cost assessment.

Democratic Accountability
The assumption that forest-edge communities place especially high values on conservation needs to be scrutinized. McAfee and Shapiro (2010) found that forest-edge indigenous groups in Mexico were much less supportive of programs that enforced pure conservation than were central government officials charged with resource management policies. In the formulation of Mexico's PSAH program, such communities called for sustainable forest agriculture to be admitted as an activity that qualified for PES payments, rather than only admitting 'no-touch' conservation. Adapting the program to these demands may result in higher levels of forest disturbance. At the same time, it is necessary to appreciate that the indigenous committees were not calling for clear-cutting or full-scale conversion into plantation land. Conservation programs may need to allow for sustainable forest use to achieve forest-edge community support.
In addition to fostering a feeling of participatory accountability in forest dwelling communities, transparent democratic processes also increase the likelihood that forest councils will maintain records of their meetings, finances, and decisions. According to Agrawal (2001), this makes it easier for central government officials to monitor forest management and know when additional resources are needed and where.

VARIATION IN THE PROGRAM DESIGN FEATURES
We also documented variation in design features of the PES programs covered by the eligible studies. The number of studies is too few to assess rigorously how such design features may affect program impact. Nonetheless, we can point out some patterns. Table 7 shows program design features for the PES programs included in the synthesis. In all cases for which we could retrieve details, some form of conditionality was in fact applied. For Costa Rica and Mexico, we have complete information on scale of payments. When compared to prevailing income levels, these programs differ markedly: while the contracts in Costa Rica pay on the order of 6-9 per cent of average national income per hectare, the Mexico contracts pay at a rate of less than 1 per cent of average national income per hectare. And yet, the estimated forest conservation effects for the Mexico programs were no smaller, and even appear to be larger, than for the Costa Rica program. This is not what one would expect to see, although our use of national income averages could obscure how the scale of payments relates to those who were actually targeted by the program (e.g. it is conceivable that the Costa Rica program was targeting land owners whose incomes and opportunity costs were an order of magnitude higher than those targeted by the programs in Mexico).

SUMMARY OF FINDINGS WITH RESPECT TO OUR HYPOTHESES
Our analysis sought to test two main hypotheses and then a set of mediator and moderator hypotheses:
Main hypotheses:
1. H1: PES reduce deforestation rates.
2. H2: PES have a non-negative impact on local poverty levels.
Mediator hypotheses:
3. H3: The more a PES program functions to relieve poverty, the stronger will be its impact on reducing deforestation.
4. H4: PES deforestation reduction impact is negatively moderated by the level of local inequality.
Moderator hypotheses:
5. H5: PES deforestation reduction impacts are positively moderated by the level of local administrative and enforcement capacity.
6. H6: PES deforestation reduction impacts are negatively moderated by the level of corruption in government.
Limitations of the evidence base preclude definitive tests of any of these hypotheses. With respect to hypothesis 1, we do find that PES reduce deforestation rates on average. The effect estimates suggest the impact is modest and seems to come with extremely high levels of inefficiency. For hypothesis 2, we cannot say that the evidence indicates non-negative effects on poverty for PES. This is a troubling finding, but it is based on only a handful of cases and therefore deserves much more empirical attention. We were unable to assess hypothesis 3, although we find that areas with higher levels of poverty tend to be associated with poorer conservation performance. We found qualitative evidence in support of hypotheses 4, 5 and 6, suggesting that the contextual conditions of inequality, limited local administrative and enforcement capacity, and corruption may undermine the effectiveness of PES programs. However, in the absence of clear tests, these findings remain highly uncertain.

IMPLICATIONS FOR POLICY: ELUSIVE WIN-WIN
Our review sought to address the fundamental issues of the extent to which conservation and poverty reduction goals conflict, how different conservation strategies fare in terms of such trade-offs, and the scope for 'win-win' strategies that generate both significant conservation and poverty reduction benefits. We outlined two sides of the argument about the extent to which conservation and poverty reduction goals ought to be married to each other. After reviewing the evidence, we are largely in agreement with the type of 'guarded pessimism' reflected by Wunder (2001; and the notion that PES probably offers mostly 'win-settle' solutions (Wunder, 2013), at least when it comes to strategies that have been pursued to date. Pagiola (2004) asks, 'can payments for environmental services help reduce poverty?' Our review has tried to take the enquiry a step further and ask, based on the accumulated evidence, whether PES programs should have poverty reduction as part of a joint goal with conservation. A pragmatic logic for doing so is that poverty alleviation benefits may help to motivate better performance in providing forest conservation services. Hope for a win-win is rooted further in the idea that property rights to forest areas are 'often the only capital of the poor who have no money or political voice' (Arriagada et al., 2009: 344), and that PES programs allow for the conversion of such property to income.
The available data provide scant evidence for addressing this question, but based on what we have seen on patterns of participation in PES programs and their welfare impacts, there is no basis to claim that PES programs are 'pro-poor'; in fact, the opposite may be true. Furthermore, poverty goals and conservation goals do appear to conflict in a manner that advises against setting poverty reduction as a goal for PES programs, at least for PES programs that set natural forest conservation as the primary objective.
Of course, strategies that have been evaluated to date may be limited in terms of the lessons that they offer for potential synergy in conservation and poverty-reduction programming. If so, then to the extent that these twin goals need to be pursued jointly, new ideas and program concepts must be developed. The PES programs that we included in our synthesis shared some important features that limit the generality of our conclusions: (1) they diverged from the 'ideal type' PES program insofar as buyer entities (usually, governments or NGOs) were not direct consumers of the environmental services and (2) the goal was most often conservation of natural forest rather than sustainable forest use. In this way, our conclusions cannot be taken as a summary judgment on the entire class of potential PES programs, and particularly not for programs that try either to maximize the role of 'service consumer' feedback or, and this may be more important from a poverty perspective, to promote behaviors such as sustainable forest use rather than strict conservation per se. The importance of the latter scope condition is made especially clear by McAfee and Shapiro (2010) in their analysis of the politics behind Mexico's PSAH.
The evidence that we review indicates that government- and NGO-administered forest conservation PES programs, as implemented, have also been rather inefficient in producing forest conservation services. Evidence from Alix-Garcia et al. (2012) suggests that orienting Mexico's PSAH even more toward poorer communities in Mexico would have heightened this inefficiency. Hegde and Bull (2011) found that in Mozambique, poor households were the least able to capitalize on PES income to improve consumption and overall welfare (e.g., by using PES income to finance intensification of other productive activities). As a result, PES income barely substituted for opportunities forgone, possibly even causing a net reduction in welfare.
Without further evidence to the contrary, there is no evidence of the type of 'win-win' that would motivate combining poverty reduction with conservation goals in PES programs. That is not to say that PES programs should avoid seeking to mitigate any harm introduced. It is to highlight a profound tension in the idea that PES ought to have poverty reduction goals, rather than poverty mitigation goals, as a priority.
The lack of an apparent win-win means that the costs from inefficiencies of targeting poor areas for PES are unlikely to be offset by sizable benefits in terms of enhancing welfare for the poor. A first order issue for policy makers working with PES programs is to address the extreme inefficiency when it comes to the amount of payments issued that are unlikely to make any difference in terms of environmental impacts. That is, targeting PES programming with the first order objective of maximizing environmental return on investment is the main priority for the next generation of PES. If, for reasons having to do with ecological conditions, the areas targeted for PES happen to be areas where poor residents are concentrated, the evidence suggests that this should not be a reason to celebrate, but rather a reason to consider the need for additional, complementary resources to be provided, presumably based on the logic of credit-constraints developed by Alix-Garcia et al. (2012). Such complementary resources would seem to be necessary for a PES program operating in a poor area to have a good chance of succeeding in terms of conservation impact and welfare impact.

RIGOROUS RESEARCH ACROSS CONTEXTS
Our final conclusion re-emphasizes the poor state of the evidence base for conservation programming. Much advanced scientific effort and extensive investment has gone into measuring forest conditions around the world. Relative to that, the evidence base on the ex post performance of PES programs is limited in size and methodologically weak. Composed as it is of a few quasi-experimental studies of varying quality, the evidence base provides a very shaky foundation, likely tainted by selection biases, for informing environmental and development policy making.
As far as we know, there are no completed randomized controlled studies despite the fact that such studies would seem to be quite feasible. Feasibility of field experimental studies for PES is apparent from the fact that the few high quality quasi-experimental studies that we did review constructed approximations to the treatment group counterfactual using local non-PES properties. One study by Garbach et al. (2012) established a perfect opportunity for a field experimental study by randomly selecting farmers to participate in the RISEMP pilot. But they failed to follow through and track outcomes among the 'control' group that was constructed through this random selection process, choosing instead to use a convenience sample of local farmers as a control. The possibility for learning from the pilot was severely undermined as a result.
While field experiments should be the methodological priority, the quasiexperimental studies covered in this review might be replicable for other countries and programs given tools such as Google Earth Engine's high resolution forest cover mapping (Hansen et al. 2013). Thus, there would seem to be ample opportunity to expand the coverage of these sorts of quasi-experimental studies around the world as formative research that might inform more finely targeted field experimental studies.
Future experimental and quasi-experimental studies should assess both the environmental and human welfare outcomes of PES to allow and assessment of potential synergies or trade-offs between different program objectives. Quantitative studies should also collect data on context, implementation and costs.
Moreover, the existing evidence base is limited to a few countries, and excludes vast experience from other parts of the world. We were surprised to find no studies from countries with large PES programs, such as Indonesia or Brazil. Future research should focus on assessing the effects of PES across a diversity of contexts, including in particular contexts with high de-forestation rates.
Finally, the results above suggest that priority topics for further research include (i) mechanisms for more efficient contracting and (ii) strategies to boosting conservation performance in poor areas, such as allowing sustainable use (as opposed to only non-use) of forest lands to qualify for payments.

LIMITATIONS AND DEVIATIONS FROM PROTOCOL
Limitations of this study derive from the very few cases that the quantitative evidence base covers. The countries that we cover in this review exclude the major forested areas in the tropics, including the forests of the Amazon Basin, Indonesia, and the Congo Basin.
Details on the deviations from protocol are listed in section 11.5 of the appendix.
The key point that we make there is that the very limited extent of the database prevented us from conducting a thorough analysis of the factors that moderate the effectiveness of PES programs. Nor were we able to investigate directly how deforestation and poverty alleviation goals interact, since we found no studies that looked at effects on these outcomes jointly.

ACKNOWLEDGEMENTS
We thank the authors of the studies included in this review for their excellent work and the feedback they provided to us on earlier drafts of this review. We received enormously helpful feedback from three anonymous reviewers, the Campbell

Below is the set of terms used to filter searches and limit results to studies carried out in low or middle income countries (LMICs):

AND ("Africa" OR "Asia" OR "Caribbean" OR "West Indies" OR "South America" OR "Latin America" OR "Central America" OR "Afghanistan" OR "Albania" OR "Algeria" OR "American Samoa" OR "Angola" OR "Argentina" OR "Armenia" OR "Azerbaijan" OR "Bangladesh" OR "Benin" OR "Belize" OR "Bhutan" OR "Bolivia" OR "Botswana" OR "Brazil" OR "Bulgaria" OR "Burkina Faso" OR "Burundi" OR "Cambodia" OR "Cameroon" OR "Cape Verde" OR "Central African Republic" OR "Chad" OR "Chile" OR "China" OR "Colombia" OR "Comoros" OR "Congo" OR "Costa Rica" OR "Cote d'Ivoire" OR "Cuba" OR "Djibouti" OR "Dominica" OR "Dominican Republic" OR "East Timor" OR "Ecuador" OR "Egypt" OR "El Salvador" OR "Eritrea" OR "Ethiopia" OR "Fiji" OR "Gabon" OR "Gambia" OR "Ghana" OR "Grenada" OR "Guatemala" OR "Guinea" OR "Guinea-Bissau" OR "Guam" OR "Guyana" OR "Haiti" OR "Honduras" OR "India" OR "Indonesia" OR "Ivory Coast" OR "Jamaica" OR "Jordan" OR "Kazakhstan" OR "Kenya" OR "Kyrgyzstan" OR "Laos" OR "Lebanon" OR "Lesotho" OR "Liberia" OR "Madagascar" OR "Malaysia" OR "Malawi" OR "Mali" OR "Malta" OR "Mauritania" OR "Mauritius" OR "Mexico" OR "Micronesia" OR "Moldova" OR "Mongolia" OR "Morocco" OR "Mozambique" OR "Myanmar" OR "Namibia" OR "Nepal" OR "Nicaragua" OR "Niger" OR "Nigeria" OR "Pakistan" OR "Panama" OR "Papua New Guinea" OR "Paraguay" OR "Peru" OR "Philippines" OR "Puerto Rico" OR "Rwanda" OR "Senegal" OR "Sierra Leone" OR "Sri Lanka" OR "Somalia" OR "Sudan" OR "Swaziland" OR "Tajikistan" OR "Tanzania" OR "Thailand" OR "Togo" OR "Tonga" OR "Tunisia" OR "Turkey" OR "Turkmenistan" OR "Uganda" OR "Uzbekistan" OR "Venezuela" OR "Vietnam"
OR "Yemen" OR "Zambia" OR "Zimbabwe")

Box 3: AND enter sections of the LMIC filter shown above (the entire filter cannot be entered at once, so enter sections of the filter until all keywords have been used) with search type "anywhere in record".

The search yields 185 hits with title and abstract information. Extract information and enter in search database (using Endnote).

QUANTITATIVE SYNTHESIS
In some cases, standard errors are not reported but rather t-statistics, p-values, or sometimes only significance levels. When t-statistics were reported for an effect Δ, we computed the standard error as |Δ/t|. From significance levels, we imputed the standard error from a t-statistic equal to the quantile at the posted significance level; for example, if an effect Δ was shown to have p < .05 for a two-sided test, we imputed a t-statistic corresponding to the .975 quantile of the normal distribution (t = 1.96) and then a standard error corresponding to |Δ/t|. Generally speaking, the formula for imputed standard errors (se.imp) from a two-sided p-value under a normal approximation is as follows:

$$\mathrm{se.imp} = \frac{|\Delta|}{\Phi^{-1}(1 - p/2)},$$

where $\Phi^{-1}$ is the inverse CDF of the standard normal distribution.

When no standard error, t-statistic, or statistical significance level was given, we imputed a p-value of 0.5 and then assigned the associated standard error, which is equivalent to assigning a standard error equal to (1/0.67)|Δ| = 1.48|Δ|. Imputing a p-value of 0.5 is not completely arbitrary, as it corresponds to the mean of the posterior p-value distribution under the null hypothesis, given a uniform prior over 0 to 1. In addition, such constant scaling will mechanically impute smaller standard errors for estimates closer to zero, in which case inverse-variance weighted averages across numerous estimates will tend to drive the average toward zero; this again is consistent with assuming a prior of a null effect and updating it with vague information.
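The imputation rules described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `imputed_se` and the example numbers are hypothetical, and the sketch simply applies the three cases in order (reported t-statistic, reported two-sided p-value, nothing reported) under the normal approximation.

```python
from statistics import NormalDist


def imputed_se(effect, t_stat=None, p_value=None):
    """Impute a standard error for an effect estimate (hypothetical helper).

    Order of preference, following the text:
      1. a reported t-statistic: se = |effect / t|
      2. a reported two-sided p-value: se = |effect| / Phi^{-1}(1 - p/2)
      3. nothing reported: assume p = 0.5
    """
    if t_stat is not None:
        if t_stat == 0:
            raise ValueError("t-statistic must be non-zero")
        return abs(effect / t_stat)
    if p_value is None:
        p_value = 0.5  # default used when no uncertainty measure is reported
    if not 0 < p_value < 1:
        raise ValueError("p-value must lie strictly between 0 and 1")
    # Normal quantile playing the role of the t-statistic.
    z = NormalDist().inv_cdf(1 - p_value / 2)
    return abs(effect) / z


# Illustrative values only (not taken from any study in the review).
se_from_t = imputed_se(0.10, t_stat=2.5)
se_from_p = imputed_se(0.10, p_value=0.05)  # uses the .975 quantile, about 1.96
se_default = imputed_se(0.10)               # about 1.48 times |effect|
```

With the default p-value of 0.5 the quantile is about 0.67, which reproduces the 1.48|Δ| scaling stated in the text.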

Standardized forest cover effect sizes
Puyravaud (2003) proposes a standardized measure of forest cover change based on the compound interest law,

$$C = C_0 \, e^{r (t_2 - t_1)},$$

where $C$ is the amount of forest cover at the time of follow-up, $C_0$ is forest cover at baseline, $r$ is the continuous rate of change per unit of time, and $t_2 - t_1$ is the amount of time elapsed between periods $t_1$ and $t_2$. Taking the natural log of both sides and rearranging yields

$$r = \frac{\ln(C/C_0)}{t_2 - t_1}.$$

This measure of rate of change takes a sign that is positive for net forest cover growth and negative for net deforestation. The quantity 100r% is interpretable as the percent change in forest cover per period. For the studies considered above, we use year as the relevant period. Figure A.1 below shows how this annual rate of change translates into proportion change in forest cover for up to twenty years. Thus, a program that has the effect of sustaining a .01 increase in the annual rate of forest cover change (or, a .01 decrease in the deforestation rate) would induce on the order of a 10 percent increase in the extent of forest cover after ten years and a 20 percent increase in the extent of forest cover after twenty years, as compared to the counterfactual of no program (at these small values of r, the annual change rate, and for these time scales, the compound interest law is almost perfectly linear in time).
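As a rough illustration of the rate-of-change measure, the sketch below computes r from two forest cover observations and converts a sustained rate into a proportional change over time. The function names and the example cover values are assumptions made for illustration; only the formula r = ln(C/C0)/(t2-t1) comes from the text.

```python
import math


def annual_rate_of_change(cover_start, cover_end, years):
    """Continuous rate of forest cover change, r = ln(C / C0) / (t2 - t1)."""
    if cover_start <= 0 or cover_end <= 0 or years <= 0:
        raise ValueError("cover values and elapsed time must be positive")
    return math.log(cover_end / cover_start) / years


def proportional_change(rate, years):
    """Proportional change in cover implied by sustaining `rate` for `years`."""
    return math.exp(rate * years) - 1.0


# Hypothetical parcel: 1000 ha at baseline, 900 ha ten years later.
r = annual_rate_of_change(1000.0, 900.0, 10.0)  # negative, i.e. net deforestation

# A sustained .01 difference in the annual rate, as in the example in the text.
gain_10 = proportional_change(0.01, 10)  # roughly 10 percent
gain_20 = proportional_change(0.01, 20)  # roughly 20 percent
```

The two gains come out slightly above 10 and 20 percent, which is the near-linearity at small r that the text describes.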
Moving from this measure of forest cover change to a standardized effect measure may proceed as follows. We work with the difference between the actual forest cover change rate in the treated area and the counterfactual change rate for that area. Studies typically report forest coverage on an average-per-parcel basis. Given N parcels, this does not affect the calculations, as (C/N)/(C0/N) = C/C0. Using the a subscript to denote quantities for the actual treated area and the c subscript for counterfactual quantities, we note that

$$r_a - r_c = \frac{\ln(C_a/C_0) - \ln(C_c/C_0)}{t_2 - t_1} = \frac{\ln(C_a/N) - \ln\left[(C_a/N) - \Delta\right]}{t_2 - t_1},$$

where $\Delta$ is the estimated effect on mean forest cover change (in area units) in the N parcels, $C_a/N$ is mean forest cover in the N treated parcels, and $(C_a/N) - \Delta$ estimates mean counterfactual forest cover in the treated parcels. Given a standard error for $\Delta$ denoted as $\mathrm{se}(\Delta)$, an approximate standard error for the difference in rates that takes the treated area forest cover $C_a$ as fixed is obtained via the delta method as

$$\mathrm{se}(r_a - r_c) \approx \frac{\mathrm{se}(\Delta)}{\left[(C_a/N) - \Delta\right](t_2 - t_1)}.$$

For studies that report effects in terms of proportion of fully forested parcels deforested, denoted as $\Delta_p$, we have that the average pre-treatment forest cover in treated parcels, [...] In cases where $P_a$ is not reported, we impute a value using the treatment parcels deforestation rate in the most similar case where such information is provided.
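The standardized effect and its delta-method standard error can be sketched as follows. This is an illustrative reading of the passage above, not the authors' implementation: it takes the counterfactual mean cover to be (Ca/N) - Δ, treats Ca as fixed, and uses the derivative of -ln[(Ca/N) - Δ] with respect to Δ to scale se(Δ). The function names and input values are hypothetical.

```python
import math


def rate_difference(mean_cover_treated, effect, years):
    """Difference in annual change rates, r_a - r_c.

    mean_cover_treated: mean follow-up forest cover per treated parcel (Ca/N)
    effect: estimated effect on mean forest cover, in the same area units
    years: elapsed time t2 - t1
    """
    counterfactual = mean_cover_treated - effect  # estimate of Cc/N
    if mean_cover_treated <= 0 or counterfactual <= 0 or years <= 0:
        raise ValueError("cover values and elapsed time must be positive")
    return math.log(mean_cover_treated / counterfactual) / years


def rate_difference_se(mean_cover_treated, effect, effect_se, years):
    """Delta-method standard error of r_a - r_c, treating Ca/N as fixed."""
    counterfactual = mean_cover_treated - effect
    if counterfactual <= 0 or years <= 0:
        raise ValueError("counterfactual cover and elapsed time must be positive")
    # d/dDelta of [ln(m) - ln(m - Delta)] / T equals 1 / ((m - Delta) * T).
    return effect_se / (counterfactual * years)


# Hypothetical study: 50 ha mean cover in treated parcels after 5 years,
# estimated effect of +2 ha with a standard error of 1 ha.
diff = rate_difference(50.0, 2.0, 5.0)
diff_se = rate_difference_se(50.0, 2.0, 1.0, 5.0)
```

Because the average-per-parcel scaling cancels in the log ratio, the same functions apply whether cover is entered as a total or as a per-parcel mean, provided the effect is in matching units.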

Standardized consumption and income effects
We standardized consumption or income effects in terms of percentage change relative to the counterfactual. For studies that estimate effects using log of income or log of consumption expenditure as the outcome, then for an effect estimated as $\Delta_l$, the percentage change over the counterfactual is given by

$$100 \times \left[\exp(\Delta_l) - 1\right].$$
For studies that use the raw income or consumption expenditure levels as the outcome, then for an effect estimated as $\Delta_r$, the percentage change over the counterfactual is given by

$$100 \times \frac{\Delta_r}{T - \Delta_r},$$

where $T$ is the mean income level in the treatment group. A delta method approximate standard error that takes $T$ as fixed is given by

$$100 \times \frac{T \, \mathrm{se}(\Delta_r)}{(T - \Delta_r)^2}.$$

The graphs show the implications of the rate-of-change effects for forest cover trajectories. The x axis shows years. The y axis shows proportional change in forest cover relative to the amount of forest cover that prevailed before the program was implemented (this baseline level of forest cover is denoted as C0; refer to the discussion in appendix section 9.3.2 on standardized forest cover change measures). A horizontal dashed reference line is drawn at 0 on the y axis. This reference line would correspond to no forest cover change over time. The black curves trace out the actual forest cover change trajectories in the program areas (treatment group) as reported in each of the studies. We trace out the change trajectory for the number of years that the program ran before the assessment provided by the study. Trajectories that run below the zero reference line imply forest loss; trajectories that run above the zero reference line imply forest gain. Each graph also displays a gray shaded area that corresponds to the 95% confidence interval for the estimated counterfactual change trajectory. That is, the gray shaded area translates the effect estimate from the study (ra -rc, in the notation from appendix section 9.3.2) into an estimate of what would have happened in the program areas had there been no program (thus, the counterfactual). If the black trajectory line overlaps with the gray shaded area, this means that the study found no statistically significant effect (at 95% confidence). When the black line does not overlap with the gray shaded area, this means that the effect of the program was statistically significant and thus the implied counterfactual trajectory is clearly distinct from what actually transpired in the program area. The titles for each graph show the study authors, the program, the observation period from which the estimates were derived, and the outcome that was used in the original analysis.

Black lines show forest cover trends in the program ("treated") areas, and gray shaded areas show the 95% confidence interval for the counterfactual forest cover trends implied by the effect estimates shown in Table 5 and Figure 4. The first five graphs show effects on forest cover attributable to deforestation (in which case, trajectories can never go above zero), the next five on forest cover attributable to either deforestation or forest growth. Effect estimates from Scullion et al. (2011) were not accompanied by standard errors and so we simply trace out the point estimate for the counterfactual forest cover trend.
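The trajectory graphs described above could be reproduced numerically along the following lines. This is only a sketch under stated assumptions: the counterfactual rate is taken to be the actual rate minus the estimated rate difference, the 95% band is formed as that rate plus or minus 1.96 standard errors of the rate difference (with the actual rate treated as fixed), and proportional change follows the compound interest law. The function names and the input values are hypothetical.

```python
import math


def trajectory_with_band(actual_rate, rate_effect, rate_effect_se, years, z=1.96):
    """Return rows of (year, actual, band_low, band_high).

    Each value is the proportional change in forest cover relative to baseline.
    """
    counterfactual_rate = actual_rate - rate_effect
    low_rate = counterfactual_rate - z * rate_effect_se
    high_rate = counterfactual_rate + z * rate_effect_se
    rows = []
    for year in range(years + 1):
        rows.append((
            year,
            math.exp(actual_rate * year) - 1.0,
            math.exp(low_rate * year) - 1.0,
            math.exp(high_rate * year) - 1.0,
        ))
    return rows


def line_outside_band(rate_effect, rate_effect_se, z=1.96):
    """True when the actual trajectory lies outside the band after year 0."""
    return abs(rate_effect) > z * rate_effect_se


# Hypothetical program: actual annual rate of -0.005, estimated rate
# difference of 0.010 with standard error 0.004, observed over 5 years.
rows = trajectory_with_band(-0.005, 0.010, 0.004, 5)
significant = line_outside_band(0.010, 0.004)
```

All trajectories start at zero in year 0, so overlap between the line and the band is only informative for later years.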

DEVIATIONS FROM PROTOCOL
Our protocol proposed that our risk of bias assessment code studies as 'high', 'low', or 'unclear' risk of bias for each of the domains considered (exogenous assignment, control for confounding, avoidance of motivation bias, accounting for spill-over, avoidance of selective outcome reporting, avoidance of analysis fishing, and appropriate statistical inference). We decided instead to indicate whether the study satisfied each of these criteria with 'yes', 'no', or 'unclear', which can be interpreted as equivalent to the designations of 'low', 'high', and 'unclear' risk of bias with respect to each of these domains.
Our protocol specified meta-regression analyses to test our moderator and mediator hypotheses and a set of quantitative publication bias analyses. We were unable to implement these as the number of eligible quantitative studies was too small.
The protocol also included a proposal for a set of descriptive and moderation analyses to assess external validity of our estimates. We were unable to implement these as the number of eligible studies and associated contexts was too small.
The study was initially designed to also include Decentralised Forest Management (DFM) programs, as outlined in the ToR from the funder and Samii et al. (2013). However, the search, data extraction and analysis were conducted in parallel rather than integrated, and once the review was completed it was decided to split the two interventions into two separate reviews for ease of interpretation.

PROCEED ONLY IF YOU ANSWERED "YES" TO ALL OF THE ABOVE.

Eligibility for Quantitative Synthesis
Were any of the following experimental or quasi-experimental methodologies used to assess impact of PES or DFM on deforestation or welfare (poverty, income, or consumption)? (Mark yes for each that applies.)
By "reported" we mean that they appear clearly in some kind of table or graph.
...deforestation
...income, consumption or poverty
Is the study eligible for quantitative synthesis? (Mark yes only if an experimental or quasi-experimental methodology was used AND impact estimates were reported on at least one of the outcomes above.)
The study is ineligible if you answered "no" to all of the above.

Eligibility for Qualitative Synthesis
Is the aim of the study clearly about the impact of PES or DFM?
Does the study work from a theoretical framework? This would be to distinguish a qualitative study from, say, a journalistic account or a report that is mostly intended to advertise rather than scrutinise.
Are original qualitative or quantitative data (e.g., quotes/paraphrasing from interviews, close observation, process tracing, etc.) used to support conclusions about impact, background conditions, or mediating factors?
Is the study eligible for qualitative synthesis? (Mark yes only if you answered yes to all three of the questions above.)

EXCLUDED STUDIES
In the table that follows, we list studies that were subject to full text search but were then excluded on the basis of substantive or methodological grounds. Studies are listed by first three authors, date of publication, and then reasons for exclusion.

ITEMS/QUESTIONS LISTED IN THIS COLUMN | ENTER RESPONSES IN THIS COLUMN | INSTRUCTIONS
Coder Information
Name of person filling in this form | First Last |
Date that the form was begun | MM/DD/YY |
General Study Information
Author 1 | First Last |
Author 2 (if applicable) | First Last |
Author 3 (if applicable) | First Last | (If more than 3 authors, include only the first 3.)
Year of