Defining practical and robust study designs for interventions targeted at terrestrial mammalian predators

Conflicts between humans and mammalian predators are globally widespread and increasing, creating a long-lasting challenge for conservation and local livelihoods. Protection interventions, which are essential to conflict mitigation, should be based on solid evidence of effectiveness produced by robust study designs. Yet, it is unclear which study designs have been used in predator-targeted interventions and how they can be improved to provide best practices for replication. I examined how applications of five study designs (before-after, before-after-control-impact, control-impact, crossover [i.e., the same randomly assigned study units acting as treatments and controls during alternating trials], and randomized controlled trial) have changed over time and how these changes are related to authors, predator species, countries, and intervention types (aversion, husbandry, mixed interventions, invasive management, lethal control, and noninvasive management). I applied multinomial regression modeling to 434 cases (28 predator species and 45 countries) from 244 studies published from 1955 to 2020. Study design was related only to intervention type. The less reliable before-after and control-impact studies were the most common (47.7% and 38.2% of cases, respectively), and their use, like that of all interventions, increased over the years. The contribution of the most robust designs, before-after-control-impact (7.4%), randomized controlled trial (5.3%), and crossover (1.4%), remained minor over time. Crossover is suitable for aversion, most husbandry techniques, and a few other interventions, but it also has the most limitations in terms of applicability. Randomized controlled trial is generally applicable, but impractical or inappropriate for some interventions, and before-after-control-impact appears to be the most widely applicable study design for predator-targeted interventions.


INTRODUCTION
Unprecedented global human pressures, such as habitat loss and land encroachment, lead to escalating competition between humans and mammalian predators for shared landscapes and food resources (König et al., 2020). Large and midsized predators in particular create conflicts with rural and suburban residents by killing domestic animals, destroying crop fields, beehives, and orchards, manifesting nuisance behavior in neighborhoods, and even attacking people (Baker et al., 2008; Khorozyan & Waltert, 2019a). These conflicts make up a large part of widespread human-wildlife conflicts caused by a diverse array of species, such as predators, elephants, ungulates, primates, rodents, and birds (Torres et al., 2018). Preventive or retaliatory destruction of predators is common, but generally ineffective, unethical, and often illegal because many predators are officially protected at global or national levels (Bergstrom, 2017; Lennox et al., 2018). The most obvious solution to curb human-predator conflicts is to prevent predator access through the application of different protection interventions, but this requires a scientifically robust measurement and validation of their effectiveness. Ultimately, predator conservation should hinge on reliable evidence, which, in turn, necessitates the involvement of scientists and practitioners in linking research and implementation (van Eeden et al., 2018).
Estimation of effectiveness must be based on robust controlled experiments capable of producing reliable and replicable results, and this work depends on the study designs used by researchers. Metrics of effectiveness are also important for accurate measurement and reporting (Khorozyan, 2020), but they will not help if the study design is flawed. The principle of "garbage in, garbage out," which is commonplace in simulation modeling and illustrates the importance of data quality (Kilkenny & Robinson, 2018), is fully applicable here. A number of study designs have been used in biodiversity conservation to estimate the effectiveness of interventions, including the after, before-after, before-after-control-impact (BACI), control-impact, crossover (defined below), and randomized controlled trial designs (de Palma et al., 2018; Adams et al., 2019; Burivalova et al., 2019; Christie et al., 2019; Treves et al., 2019).
Mitigation of damage caused by predators can be measured in different ways, such as the reduction in the number of livestock killed, the area of crop fields raided, predator visitation rates, or the percentage of losses. The after design estimates damage in only the treatment sample (with intervention), and there is no comparison with another sample that could act as a control or counterfactual (without intervention). This is the simplest, but most unreliable, study design because the magnitude and direction of change cannot be determined (Christie et al., 2019, 2020a). The before-after design is used to compare the same sample before and after an intervention. It is prone to different biases and can produce misleading results because there is no proper control (Ausband et al., 2013; Iliopoulos et al., 2019). For example, the same households had a reduction in livestock losses after they acquired guard dogs relative to before they had dogs (e.g., van Bommel & Johnson, 2012), but this change could have been caused by factors other than dogs (e.g., by changes in livestock or predator numbers or weather). Without an experimental control, it is not possible to be sure that the dogs alone led to fewer losses from predation.
The control-impact design is used to compare nonrandomly selected treatment and control samples after an intervention (Palmer et al., 2010; Stone et al., 2017). Reliability of this method is low to moderate because treatment and control samples may have a preexisting and not obvious sample selection bias, which leads to poor-quality data. For example, consider a comparison of households with and without an intervention (e.g., night enclosures, herding, or deterrents). If selected nonrandomly, these households may differ in some factors unrelated to interventions. For example, control households may be located closer to forests, from which predators come and kill more livestock than in treatment households. The solution is to select treatment and control samples randomly; this transforms the design from control-impact to randomized controlled trial and minimizes the influence of confounding factors (Bromley & Gese, 2001; Kissui et al., 2019). The BACI design combines before-after and control-impact by incorporating before-after comparisons for both treatment and control samples (Rossler et al., 2012; Johnson et al., 2018; Weise et al., 2018). Simulations of empirical ecological data show that BACI provides results similar to randomized controlled trials and outperforms before-after designs by 3-4 times, control-impact designs by 3-5 times, and after designs by 7-10 times (Christie et al., 2019). It can be applied to both random and nonrandom samples and is especially popular in interventions where random sampling is impossible (e.g., spatially fixed land plots, waterbodies, or protected areas) (Geldmann et al., 2013; Larson et al., 2016; Long et al., 2017). Before-after comparisons of nonrandom treatment and control samples may reduce the limitations of selection bias, but only if these samples are similar in location, landscape, and management history (Adams et al., 2019).
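For illustration, the BACI comparison reduces to a difference-in-differences: the change in the treatment sample minus the change in the control sample. A minimal sketch with invented livestock-loss figures (not data from any cited study):

```python
# Hedged sketch: a BACI (before-after-control-impact) effect estimated as a
# difference-in-differences. All numbers below are hypothetical.

def baci_effect(treat_before, treat_after, control_before, control_after):
    """Change in the treatment sample minus change in the control sample.

    A negative value means losses fell more (or rose less) where the
    intervention was applied than where it was not.
    """
    return (treat_after - treat_before) - (control_after - control_before)

# Mean livestock losses per household per year (hypothetical figures).
effect = baci_effect(treat_before=12.0, treat_after=5.0,
                     control_before=11.0, control_after=10.0)
print(effect)  # -6.0: losses dropped by 7 under the intervention vs. 1 in controls
```

The before-after component guards against preexisting differences between the two samples, while the control guards against background trends affecting both.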
When randomization is possible, it makes BACI a very robust design because its control-impact component becomes a randomized controlled trial (Christie et al., 2020a). Crossover is the design in which the same randomly assigned study units, such as individuals or land plots, act as treatments and controls during alternating trials (Edgar et al., 2007; Ohrens et al., 2019; Louis et al., 2020). This design reduces biases associated with individual variation and is recommended as best practice for evaluating the effectiveness of conservation interventions. Overall, randomization and the use of a control make BACI, randomized controlled trial, and crossover the best study designs for interventions. These designs need wider application and reporting (Christie et al., 2019; Treves et al., 2019; Christie et al., 2020a).
Despite quite comprehensive published information on study designs related to conservation interventions, understanding of how particular designs are affected and biased by ambient factors is limited. The questions of what interventions are used, where, and why, and what interventions are needed, generally remain unanswered, at least in regard to predators (van Eeden et al., 2018). Analyses of study designs from a diverse array of interventions maintained in the Conservation Evidence database (www.conservationevidence.org) show that the most reliable designs, BACI and randomized controlled trial, were scarce and geographically clustered in Europe, Australasia, and North America, whereas the less reliable after, before-after, and control-impact were the most common designs (Christie et al., 2019, 2020a, 2020b, 2020c). However, these studies did not consider predators because this information was added to Conservation Evidence later (Littlewood et al., 2020). The patterns of study design usage in predator-targeted interventions have not yet been studied, notwithstanding a massive and further increasing number of specialized meta-analyses and systematic reviews (see "Literature Search"). Applications of interventions aimed at predators can be challenging and demanding of time, money, and skilled personnel. Therefore, the prevalence of the study designs applied to them can differ significantly from those applied to other species. The lack of summarized comparative information on study designs in predator research may lead to the use of better known, politically backed, or inexpensive designs, whereas more reliable designs may remain unknown due to lack of awareness or training, or neglected because they are perceived as unrealistic. Sometimes, ethical reasons, such as the dissatisfaction of uninvolved people with random sampling, may prevent the application of robust designs (personal observation).
Thus, it is important to inform managers about alternative, more reliable study designs to allow them to weigh the pros and cons of each and make the best possible choice of study design.
The sources of bias, other than geographic bias, in the selection and use of intervention study designs include taxonomic bias (numerous species are overlooked [Christie et al., 2020b, c]), the author's choice of study methodology, time period, and intervention type, but I am not aware of publications exploring them. Authors sometimes select particular methods because they specialize in them and have used them in different studies and publications. Study design applications change over time and, logically, less reliable designs should be used less and more reliable designs more as scientific knowledge and skills advance. Certain study designs may be more applicable than others based on the methodological specifications of the intervention. I used a standardized approach for the analysis of large data sets and a model-based synthesis (Christie et al., 2020a) to examine the role of these biases in applications of study designs to predator-targeted interventions.
I tested the following hypotheses: applications of the less reliable before-after and control-impact designs decrease and those of the more reliable BACI, crossover, and randomized controlled trial increase over time; applications of study designs are related to author; applications of study designs differ between predator species and countries; and applications of study designs depend on intervention type. I also sought to estimate how frequently BACI, crossover, and randomized controlled trial were used and to propose practical applications of these designs for the interventions in which they are most underrepresented.

Literature search
I searched the scientific literature to find applications of interventions to protect human assets (domestic animals, crops, orchards, tree plantations, beehives, and neighborhood safety) from terrestrial mammalian predators in the order Carnivora. I also considered applications in captivity and those related to managed game species because they were conducted in simulated conditions that allowed replications in the wild. First, I included all publications from specialized meta-analyses and systematic reviews (Linnell et al., 1997; Graham et al., 2005; Baker et al., 2008; Miller et al., 2016; Treves et al., 2016; Eklund et al., 2017; Littlewood et al., 2020; Rashid et al., 2020; Khorozyan & Waltert, 2021). Second, in February 2021, I examined all the issues of Conservation Evidence (www.conservationevidence.com, 2004-2020), Ursus (www.bearbiology.org and www.bioone.org, 1968-2020), Cat News (www.catsg.org, 1984-2020), and Carnivore Damage Prevention News (www.lcie.org and www.medwolf.eu, 2000-2005 and 2014-2020), as well as the digital libraries of the International Union for Conservation of Nature (IUCN)/Species Survival Commission (SSC) Human-Wildlife Conflict Task Force (www.hwctf.org) and the IUCN/SSC Cat Specialist Group (www.catsg.org, 1950-2020). Finally, I searched Web of Science (www.webofknowledge.com, 1970-2020) with the search terms "livestock" and "effectiveness" OR "efficacy" and *predat*; "wolf," "Canis lupus," "livestock," "protection," "eff*" AND "*predat*"; scientific names of seven recent bear species in combination with eff*; and scientific names of 38 recent felid species in combination with *predat* AND eff*.
I did a systematic search of publications without geographical or other restrictions by scanning titles and abstracts for information about intervention effectiveness and scanning results for quantitative data that either estimated the effectiveness of interventions directly or allowed it to be estimated independently. Original meta-analyses and systematic reviews, as well as some other sources, such as the digital library of the IUCN/SSC Cat Specialist Group, contained gray literature that I also included. The main limitation of this literature search was an underrepresentation of non-English literature: although I included non-English publications retrieved during the search, I did not explore specialized non-English databases and resources. The primary purpose of this search was to create a data set of study cases for which effectiveness could be quantified. For quantification, I used the metrics of relative risk and Cohen's d. Therefore, I excluded cases that did not contain sufficient data to calculate these metrics; described perceived rather than actual effectiveness; did not describe details of interventions; described interventions not related to the mentioned assets, captivity, or game species (e.g., attacks on humans or interventions related to the recovery of predator populations); used the same databases as cases already included in the study; were derived from correlation studies and not from comparisons of treatment and control samples; or used model simulations.
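The two quantification metrics named above can be sketched with their standard definitions; the herd counts and loss figures below are hypothetical, chosen only to illustrate the arithmetic:

```python
# Hedged sketch of the two effectiveness metrics: relative risk and Cohen's d
# (pooled-SD form). All sample numbers are invented for illustration.
import statistics as st

def relative_risk(treat_events, treat_total, control_events, control_total):
    """Risk with the intervention divided by risk without; RR < 1 suggests protection."""
    return (treat_events / treat_total) / (control_events / control_total)

def cohens_d(sample1, sample2):
    """Standardized mean difference using a pooled standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    s1, s2 = st.variance(sample1), st.variance(sample2)
    pooled_sd = (((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)) ** 0.5
    return (st.mean(sample1) - st.mean(sample2)) / pooled_sd

# Hypothetical: 4 of 40 treatment herds vs. 12 of 40 control herds attacked.
print(relative_risk(4, 40, 12, 40))          # 0.333...: risk cut to a third
# Hypothetical annual losses per herd, with vs. without the intervention.
print(cohens_d([2, 3, 1, 2], [6, 5, 7, 6]))  # large negative d: fewer losses with treatment
```

Both metrics require a treatment and a control (or before and after) sample, which is why cases without such comparisons had to be excluded.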

Data collection
Each case in the data set described an effect of a particular intervention on the protection of a particular object (asset, captive setting, or game species) from a particular predator species in a site. For each case, I used a continuous datum for publication year and assigned nominal (i.e., unordered, categorical [integer]) codes to the first author of the publication, study design, intervention, country, and predator species. When the authors lumped several known species or indicated predators or carnivores in general, such cases were coded as a grouped species. I coded all study designs that were used in publications (BACI, before-after, crossover, randomized controlled trial, and control-impact) and did not search for particular designs. Randomized controlled trial was assigned only when the authors explicitly stated that treatment and control samples were selected randomly. I am confident that the reporting of study designs by the authors was sufficiently comprehensive for me to correctly assign the study designs. I paid particular attention to how control or pseudo-control samples were selected. The after design was by default excluded from data set compilation because its effectiveness could not be estimated with relative risk and Cohen's d.

Data analyses
To test the first hypothesis, I checked how the number of cases for each study design and the total number of cases for all study designs changed over time and used Spearman's rho to estimate the correlation between them. Because study design was a nominal categorical response variable, to test the other hypotheses I used multinomial logistic regression to determine the effects of publication year, author, intervention, country, and predator species on study designs. Multinomial regression is an extension of binomial logistic regression to more than two outcomes and has the following form (Chatterjee & Simonoff, 2013):

FIGURE 1
The dynamics of study design applications related to predator-targeted interventions from 1955 to 2020. The years when no applications were published (1956-1972, 1975, 1989, 1991, and 1993) are excluded.

ln[P(y = k) / P(y = K)] = β0k + β1k x1 + β2k x2 + … + βnk xn,

where k is the category (study design), K is the total number of categories (K = 5), β0 is the intercept, βn is the slope, xn is the predictor, and n is the number of predictors. In models, k changes from 1 to K − 1, and the category K is used as a reference, or baseline, category for which β0K = β1K = … = βnK = 0. So, the model is based on K − 1 separate equations, each with a distinct set of βk (Chatterjee & Simonoff, 2013). Predictors were checked for multicollinearity with a conservative threshold value of the variance inflation factor (VIF) of <3 (Zuur et al., 2010). Models with single predictors and with additive or interactive combinations of no more than two predictors were built to foster meaningful interpretation of the most parsimonious models. The control-impact design was set as the reference category in multinomial regression. Directions of predictor effects were determined from the positive or negative signs of model slopes (β) and from the significance of their difference from 0 at a conservative level of p = 0.005. I selected this threshold p value, in contrast to the commonly used p = 0.05, to maximize the reliability of results and minimize false positives and negatives (Benjamin et al., 2018; Colquhoun, 2019). Effect size was measured as the odds ratio exp(β) and its 99% confidence interval (CI) in relation to no effect (when β = 0, odds ratio = 1; e.g., odds ratio = 0.4 means a decrease of effect by 60%) (Khorozyan, 2020). I ranked multinomial models according to their Akaike information criterion (AIC) and selected the best models as those having ΔAIC < 2 and the highest model weights toward 1 (Symonds & Moussali, 2011).
The AIC values are incomparable between models with different sample sizes (Symonds & Moussali, 2011), but my data set did not have missing values, and the samples of all variables were equal. Predictors and study designs from the best models were cross-tabulated for a post hoc analysis with χ2 tests.
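The model-ranking and effect-size arithmetic described above can be sketched as follows; the log-likelihoods, parameter counts, and slope estimate are invented for illustration (z = 2.576 approximates the 99% CI matching the p = 0.005 threshold):

```python
# Hedged sketch of AIC ranking and odds-ratio interpretation.
# All fitted values below are hypothetical, not results from the study.
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def odds_ratio(beta, se, z=2.576):
    """exp(beta) with an approximate 99% confidence interval."""
    return math.exp(beta), (math.exp(beta - z * se), math.exp(beta + z * se))

# Hypothetical comparison: intervention-only model vs. intervention + year.
models = {"intervention": aic(-310.2, 8), "intervention + year": aic(-309.8, 12)}
best = min(models, key=models.get)
delta = {m: round(models[m] - models[best], 1) for m in models}
print(best, delta)  # models within ΔAIC < 2 of the best are considered equivalent

# A slope of -0.92 gives an odds ratio of ~0.40, i.e., roughly a 60% decrease.
or_, ci = odds_ratio(-0.92, 0.30)
print(round(or_, 2), ci)
```

Here the extra `year` predictor barely improves the fit, so the larger model is penalized (ΔAIC = 7.2) and the simpler intervention-only model wins, mirroring the result that study design was related only to intervention type.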

DISCUSSION
My results clearly demonstrated that less reliable study designs, such as before-after and control-impact, are still widespread in predator-targeted intervention applications globally and that their use has increased over time, along with all interventions. In contrast, robust designs, including BACI, crossover, and randomized controlled trial, have been applied rarely and have made only a minor contribution to evidence building in regard to predators (Appendix S4). Invasive management, especially translocations, and lethal control relied almost entirely on the before-after design, whereas husbandry tended to be dominated by control-impact. Therefore, I rejected the hypotheses of the effects of time, author, predator species, and country on the use of study designs and showed only that applications of study designs depended on interventions.
The key message is that all interventions need wider application of the BACI, crossover, and randomized controlled trial designs for collecting data on predator interventions. The list of relevant publications is in Appendix S4 and can be used by researchers and practitioners when they are considering possible options among study designs. It is imperative to promote funding, awareness raising, professional training, and cost-benefit analyses of BACI, crossover, and randomized controlled trial applications vis-à-vis before-after and control-impact applications. Naturally, the selection of particular designs depends on local practicalities and available resources. The affordability of before-after and control-impact has often been mentioned as a factor in the popularity of these two study designs (Ausband et al., 2013; Christie et al., 2019). When randomization is not possible or is impractical, say, in large commercial farmlands of Latin America (Quigley et al., 2015) and southern Africa (Weise et al., 2018, 2019), BACI can serve as a viable alternative (Christie et al., 2019, 2020a). A simple replacement of control-impact or before-after comparisons of nonrandomly selected treatment and control livestock groups with a randomized controlled trial of subsamples within the same groups can be a good solution for individual-based deterrents, such as protective collars, bells, or artificial eyespots (Knarrum et al., 2006; McManus et al., 2015; Loveridge et al., 2017; Radford et al., 2020). Importantly, this approach may lessen local resentment because few livestock owners would be happy to hold control herds while their neighbors have treatments and may enjoy reduced losses due to interventions (personal observation).
Apart from knowledge sharing, it is also important to fill existing gaps by considering interventions for which BACI, crossover, and randomized controlled trial are promising, but apparently not yet used. Figure 2 summarizes the existing uses (Appendix S4) and my recommendations on these designs for different interventions. I did not find any studies with BACI applications to physical deterrents, electric fences, translocations, shooting, trapping, calving control, or replacement of livestock breeds or species. Studies of chemical deterrents (Bourne & Dorrance, 1982; Burns, 1983; Martin & O'Brien, 2000; Massei et al., 2003; Baker et al., 2007), a visual deterrent (fladry), and a light-and-sound device (Shivik et al., 2003) show that BACI is also feasible for physical deterrents, such as protective collars, rubber bullets, or wire netting. The effectiveness of electric fences is relatively easy to study with BACI if one compares random or nonrandom fenced and unfenced plots before and after fencing. It is surprising that such studies remain obscure, at least for predators.
Studies of translocations of damage-inflicting predators were designed mostly as uncontrolled before-after comparisons (e.g., Landriault et al., 2009; Weise et al., 2015), apparently because it is not clear what to assign as a control. Only one study used control-impact, with a control of preemptive translocation of individuals that did not kill livestock (Bradley et al., 2005). I suggest that data collected from translocation studies be analyzed by comparing the capture site and the release site before and after translocation and by considering translocated individuals as treatments. These individuals may stay in or around the release site or return to the capture site, thus affecting damage in either site and allowing reliable measurement of the effectiveness of translocation. A BACI design appears to be optimal for lethal control studies, in which similar sites with and without hunting, trapping, or poisoning are compared before and after an intervention, as has been done with bait poisoning (Greentree et al., 2000). Finally, BACI is also suitable for manipulations of livestock, such as breeding control to shorten the period of availability of young livestock from all year to several months or the replacement of docile predation-prone livestock breeds or species with more vigilant or defensive ones. For example, a control-impact study of calving control, in which one farm with calving control and another without it were compared (Breck et al., 2011), could be upgraded to a BACI level if additional before data were provided for each farm. This improvement would be key to determining whether the main cause of zero loss in the treatment farm was calving control or other unrelated factors, such as farm size, livestock numbers, or predator densities.

FIGURE 2
The framework of used (circle) (Appendix S4), recommended but not yet used (triangle), and not recommended because inappropriate, not feasible, or impractical (square) robust study designs for different predator-targeted interventions. Abbreviations: BACI, before-after-control-impact; RCT, randomized controlled trial.

Randomized controlled trial has been used in interventions where random sampling of study units is relatively easy to implement (e.g., all types of deterrents [Shivik et al., 2003; Beckmann et al., 2004; Radford et al., 2020] and most types of small-scale husbandry [Mahoney & Charry, 2005; Lance et al., 2010; Kissui et al., 2019]). However, this approach is neither practical nor feasible when large land plots are treated (electric or conventional fences, geofence, shooting) or when randomization is counterintuitive (translocations). For herding, randomized controlled trial is feasible only in areas where livestock breeding is the main source of income and, therefore, herders are common (Alexander et al., 2015). It is not feasible where herders are rare, often limited to a few elderly men per village due to economic changes (Mosalagae & Mogotsi, 2013). Although this study design is inappropriate for shooting, it can be applied to trapping in randomly assigned plots, as is done in bait poisoning (Allen, 2014). I could not find any randomized controlled trial studies of noninvasive management, yet this study design is appropriate here (e.g., by allocating random villages for capacity building or random households for implementation of antipredator activities). Still, random sampling can be problematic when selection bias is introduced by the motivations of village heads or local authorities, variable interest of local people in participation, and different levels of damage (Martin & O'Brien, 2000); in these cases, BACI is recommended.
As expected, crossover was the least frequently used study design. In regard to predators, it was applied to different types of deterrents (Edgar et al., 2007; Appleby et al., 2017; Ohrens et al., 2019; Louis et al., 2020) and shock collars (Andelt et al., 1999), but not to husbandry (Figure 2). This is unusual because husbandry techniques, such as small-scale electric fences, herding, and the use of guard animals, can be conveniently examined with a crossover approach by alternating treatment and control periods in the same study units at predefined intervals. Ohrens et al. (2019) split 11 livestock plots randomly into six treatment plots exposed to flashlight deterrents and five control plots without deterrents in the first period; they then switched the first six plots to controls and the other five to treatments in the second period. However, crossover is difficult and impractical to use with night enclosures because leaving livestock outdoors at night for control periods may violate local practices and can be opposed by local participants. Crossover is hard to implement at large scales and in noninvasive management due to the substantial effort required for treatment-control alternations, and it is not feasible with translocations, sterilization, and shooting due to the specifications of these interventions. Crossover can be applied to trapping and poisoning by creating a grid of sampled units and alternating treatments and controls in each of them.
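The crossover allocation described above can be sketched as a simple schedule generator in which randomly chosen units are treated first and all roles are swapped in the second period; the plot names and counts below are hypothetical, loosely following the flashlight example:

```python
# Hedged sketch of a two-period crossover allocation. Unit names and the
# 6/5 split are hypothetical, echoing the 11-plot flashlight example.
import random

def crossover_schedule(units, n_treat_first, seed=1):
    """Randomly pick units for treatment in period 1, then swap all roles in
    period 2, so every unit serves as both treatment and control."""
    rng = random.Random(seed)  # fixed seed for a reproducible illustration
    shuffled = rng.sample(units, len(units))
    period1_treat = set(shuffled[:n_treat_first])
    return [
        {"unit": u,
         "period1": "treatment" if u in period1_treat else "control",
         "period2": "control" if u in period1_treat else "treatment"}
        for u in units
    ]

plan = crossover_schedule([f"plot{i}" for i in range(1, 12)], n_treat_first=6)
assert all(row["period1"] != row["period2"] for row in plan)  # every unit swaps roles
print(sum(row["period1"] == "treatment" for row in plan))  # 6
```

Because each unit acts as its own control, between-unit differences (terrain, herd size, predator density) cancel out of the comparison, which is the bias-reduction property noted above.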
My results suggest that BACI is the most widely applicable robust study design for predator-targeted interventions.
Randomized controlled trial is also generally applicable, but impractical or inappropriate for some interventions. Crossover is suitable for aversion, most husbandry techniques, and a few other interventions, but it also has the most limitations in terms of applicability. I encourage researchers and practitioners to use more robust study designs to quantify the effects of predator interventions and ultimately to improve how predators and human livelihoods are protected.