A toolkit for open and pluralistic conservation science

Conservation science practitioners seek to preempt irreversible impacts on species, ecosystems, and social–ecological systems, requiring efficient and timely action even when data and understanding are unavailable, incomplete, dated, or biased. These challenges are exacerbated by the scientific community's limited capacity to consistently distinguish between reliable and unreliable evidence, including its capacity to recognize questionable research practices (QRPs, or “questionable practices”), which may threaten the credibility of research and harm trust in well-designed and reliable scientific research. In this paper, we propose a “toolkit” for open and pluralistic conservation science, highlighting common questionable practices and sources of bias and indicating where remedies for these problems may be found. The toolkit provides an accessible resource for anyone conducting, reviewing, or using conservation research, to identify sources of false claims or misleading evidence that arise unintentionally, or through misunderstandings or carelessness in the application of scientific methods and analyses. We aim to influence editorial and review practices and, we hope, to remedy problems before they are published or deployed in policy or conservation practice.

the rigor and democratic accountability of crisis decisions (Bennett et al., 2017; Ely et al., 2014), but they also may meet resistance from entrenched patterns of power, privilege, and patronage that narrow inputs to analysis and unnecessarily restrict the consideration of viable alternatives (Drury et al., 2011; O'Brien, 2000; Stirling & Burgman, 2021).
A second and underappreciated challenge to making good decisions under uncertainty is the scientific community's ability to detect unreliable evidence, including the occurrence of questionable research practices (QRPs, or "questionable practices"), which may threaten the credibility of research. Questionable practices refer to procedures and actions that, even without malicious intent, misuse or misrepresent data and analyses, generating spurious results and misleading advice. The full extent of the impact of questionable practices is not yet established, but the evidence suggests they are commonplace, including in ecology, and make a substantial contribution to the high levels of false-positive findings in the scientific literature (Fidler et al., 2017; Christie et al., 2021). The latter has contributed to what is known as the "reproducibility crisis," which refers to the failure to successfully replicate many scientific findings across disciplines from preclinical medicine to economics (e.g., Begley & Ellis, 2012; Open Science Collaboration, 2015; Camerer et al., 2016). Questionable practices appear surprisingly resilient and research training may even have perverse consequences (e.g., Antes et al., 2010). Unlike fraud, which is reassuringly rare (Fanelli, 2009), questionable practices are prevalent and persistent, and therefore have the potential to lead to poor decisions that precipitate unacceptable and avoidable impacts on biodiversity and social-ecological systems (for a discussion of unanticipated and harmful ecological consequences of misinterpreted evidence, see Parris et al., 2010). Some instances of questionable practices result from the pressure to publish. Researchers are incentivized to present the best possible "story" to convince reviewers, particularly when initial results are ambiguous or unconvincing (Banks et al., 2016).
Whether they reflect "innocent" embellishment or purposeful misrepresentation, these practices may inflate expectations of the effectiveness of interventions and lead to the misdirection of research. They may also harm trust in conservation science that is well-designed and executed.
The issues we address below are broader than questionable practices, which continue to be debated (e.g., Leek et al., 2017, including in conservation science; Fidler et al., 2017; Mayo, 2022), and include systematic, contextual and motivational biases that threaten the credibility of the evidence base in conservation science and practice. Current quality assurance mechanisms, such as peer review, often fail to detect these confounding factors.
To tackle the dissemination of unreliable and blatantly false evidence in epidemiology, Soskolne et al. (2021) outlined a "toolkit" for identifying the misuse of scientific methods by those with conflicts of interest, reflecting a specter of intentional, self-serving, and ultimately harmful use of procedures and evidence. Notably, the specific problems that their toolkit identifies are not exclusively linked to malicious interference. Such a toolkit may therefore be helpful in addressing a broader spectrum of errors and misrepresented evidence. Like epidemiology, conservation science is susceptible to hidden interests and deliberate manipulation, but arguably most questionable practices and other sources of bias and uncertainty arise when there is little reason to presuppose conflicts of interest or malign intent. Indeed, researchers voluntarily "admit" to such practices as continuing data collection after checking whether the results have reached a threshold for statistical significance, reporting a set of statistical models as the complete tested set when other candidate models were also tested, or changing to a different statistical analysis after the initial analysis failed to reach statistical significance. Worryingly, such practices are often passed down inadvertently from teacher to student (Casadevall & Fang, 2018).
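
The first of these practices, continuing data collection after checking for significance, can be illustrated with a small simulation. The sketch below is illustrative only and is not taken from the studies cited: it assumes a two-sided z-test with a known standard deviation of 1, ten interim looks at the data, and no true effect. All function names and settings are hypothetical.

```python
import random
from statistics import NormalDist, mean

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided 5% threshold, about 1.96


def is_significant(sample):
    """Two-sided z-test of a zero mean, assuming a known standard deviation of 1."""
    z = mean(sample) * len(sample) ** 0.5
    return abs(z) > Z_CRIT


def false_positive_rate(peek, n_sims=2000, n_max=100, step=10, seed=1):
    """Share of simulated null-effect studies that report a 'significant' result."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        data = [rng.gauss(0.0, 1.0) for _ in range(n_max)]  # the true effect is zero
        if peek:
            # Test after every `step` observations; stop at the first "success".
            looks = range(step, n_max + 1, step)
            hits += any(is_significant(data[:n]) for n in looks)
        else:
            # Test once, at the sample size fixed in advance.
            hits += is_significant(data)
    return hits / n_sims


fixed_rate = false_positive_rate(peek=False)
peeking_rate = false_positive_rate(peek=True)
print(f"fixed sample size:     {fixed_rate:.3f}")
print(f"stop when significant: {peeking_rate:.3f}")
```

Under these assumptions the stop-when-significant rule flags a nonexistent effect well above the nominal 5% rate, which is why fixing (and preregistering) the stopping rule in advance matters.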
In this paper, we take inspiration from the toolkit proposed by Soskolne et al. (2021) but focus on issues that are relevant in conservation and environmental science. O'Dea et al. (2021) already outlined several measures to improve the quality of scientific practice in ecology, but as far as we know, no relatively comprehensive toolkit exists. Our toolkit is not intended as a fraud-detection device, nor does it presuppose any malign intent. Rather, we focus on common questionable practices and sources of bias that arise unintentionally or through misunderstandings or heedlessness in the application of scientific methods and analyses. The aim of the toolkit is to provide an accessible resource for anyone conducting, reviewing, or using conservation research, to identify sources of false or misleading evidence, and, we hope, to remedy them before they are published or deployed in policy or conservation practice. Soskolne et al.'s (2021) toolkit summarizes inappropriate applications of common scientific methods and practices in epidemiology. They outlined issues under three broad headings: (1) practices that artificially amplify uncertainty, mask, or confuse cause-and-effect; (2) practices that delay action and maintain the status quo; and (3) practices that misdirect initiatives and policy priorities through influence. For each item in the table, Soskolne et al. (2021) specified the "argument" (the reasons why the practice is harmful) and its effects.

A TOOLKIT FOR OPEN AND PLURALISTIC CONSERVATION SCIENCE
We used Soskolne et al.'s (2021) toolkit as a starting point and adapted the items to reflect issues of special relevance to conservation biology. Table 1 was created from discussions between the authors, based on their research and experience, which encompasses quantitative and applied ecology, conservation science, psychology, philosophy of science, epistemology, social science, mathematics, economics, and biostatistics. We also suggest remedies (i.e., actions that serve to encourage best practice in research conduct) synthesized from over 70 publications on research practices relevant to conservation science. While grounded in the extant literature, this toolkit was developed as a perspective rather than a definitive or comprehensive compendium. In the interest of transparency, we include references to all source materials in the Supplementary Information. We grouped the practices within two broad categories based on their remedies: (1) widespread adoption of open science principles in conservation (e.g., preregister studies, publish all data, computer code and study materials, encourage replication and scrutiny, explicitly present and quantify uncertainty); and (2) encouragement of pluralism (disseminate diverse methodologies, take a broad view of expertise, involve interested parties and stakeholders [von der Porten & de Loë, 2014] in decision processes, ensure geographical representation).
In some instances, Table 1 may provide several related but different strategies to address a single problem. For instance, the rigid interpretation of probability thresholds in null hypothesis significance testing may be mitigated in some circumstances by calculating the power of a test and specifying the size of an important effect a priori, or by plotting confidence intervals, depending on the context of the study and relevant decision.
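
As a rough illustration of the power-based remedies mentioned above, the following sketch computes an approximate a priori sample size, the power achieved by a given design, and a confidence interval for a difference in means. It is a minimal sketch under stated assumptions, not a method prescribed by the toolkit: it uses the normal approximation for a two-sided, two-sample comparison of means with equal group sizes and a common standard deviation, and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample test.

    effect_size is the smallest important difference in means divided by
    the (assumed common) standard deviation.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def achieved_power(effect_size, n, alpha=0.05):
    """Approximate power of the same test with n observations per group."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * (n / 2) ** 0.5 - z_alpha)


def mean_difference_ci(diff, sd, n, level=0.95):
    """Normal-approximation confidence interval for a difference in group means."""
    z = _STD_NORMAL.inv_cdf(0.5 + level / 2)
    half_width = z * sd * (2 / n) ** 0.5
    return diff - half_width, diff + half_width


# Effect size judged important a priori: half a standard deviation.
print(n_per_group(0.5))
print(round(achieved_power(0.5, 20), 2))
print(mean_difference_ci(diff=0.3, sd=1.0, n=20))
```

With only 20 observations per group, the design in this example has low power to detect the effect declared important, and the interval for an observed difference of 0.3 spans zero as well as values larger than the important effect. A nonsignificant result from such a design would therefore say little about whether an important impact is present.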

DISCUSSION
A more open and pluralistic approach to conservation science will lead to a more balanced evidence base. Although synthesizing research can be challenging, it will support more effective action by policymakers, because it obliges decision-makers to engage with the realities of the uncertainties inherent in scientific evidence (Stirling & Burgman, 2021). To describe the suggested remedies as applications of open science and pluralism is just one way of representing the issues. It was not always obvious whether to classify remedies as either open science or encouraging pluralism. The classification is approximate, and one can make arguments for other renderings, and indeed for positioning some of the elements in Table 1 in both categories. We use this classification to assist in structuring the discussion below, and to guide thinking about who might be involved in applying the remedies. This toolkit provides a checklist for designing or evaluating empirical research in conservation. The recommendations in Table 1 could be arranged straightforwardly into a checklist of issues, depending on the context and details of the study at hand. A checklist is a management aid used to reduce failure by compensating for the limits of human cognition and ensuring that critical steps are not omitted during execution of a task. The complexity of the task at hand ("empirical research in conservation") precludes specification of detailed protocols for each issue and remedy. We have also restricted ourselves to the most common and therefore most impactful issues. We recognize that it is unlikely that simply presenting this list will lead to a revolution in conservation research culture, in part because the checklist is incomplete and in part because applying the remedies may not be straightforward.
Indeed, many of the prescriptions above have been proposed previously, some decades ago (e.g., A5 and A6, Taylor & Gerrodette, 1993), yet the problems persist and, as noted in the introduction, seem to resist efforts to change them. However, authors may find Table 1 helpful, journal editors and post-graduate training programs could adopt it, and policymakers may use it to direct questions and to request, from the scientific community, clear indications of the reliability of the conservation research evidence for the purpose of decision-making. Figure 1 illustrates how the application of open science protocols may remedy at least some questionable practices regarding inference from conventional tests. The objective (enhanced nest success) leads to the intervention. The clarification of the research question provides a platform for the design of a practical experiment, including a biologically important effect size. The initial study design did not account sufficiently for confounding variables, local knowledge or prior information from other species and sites. The revised design uses prior information and is more robust to confounding. It takes account of the sampling effort required to detect a biologically important effect reliably. Preregistration of the research question, study design and proposed approach to statistical analysis ensures that a range of questionable practices (HARKing and related phenomena) are avoided. Visual interpretation of confidence intervals will help to avoid erroneous interpretations of the resulting p values.
It is difficult to avoid all the questionable practices outlined in Table 1 in all instances. Some of the practices are not immediately remediable as they involve systemic efforts and structural change.

TABLE 1 Toolkit for addressing problematic practices and implications in conservation science, drawn, adapted, and extended from the tool developed by Soskolne et al. (2021)

Part A: Practices that may be remedied by widespread adoption of open science principles
Item; Problematic practices; Implications; Remedies

A1
Using inappropriate models
Failure to utilize appropriate models (mathematical, spatial, statistical) or analytical techniques may lead to biased or inaccurate inferences.
Publish all study materials, including code, analyses, and underlying data on an openly available online repository and conduct replication studies. Encourage the use of modeling notebooks/reporting of code. Consult specialists experienced in quantitative and qualitative study design and analysis.

A2
Suppressing data
Failure to include findings in subgroups or failure to report or publish the findings. Deliberate omission of findings or inappropriate groupings of outcomes that hide or dilute impacts. Omitting rare events from statistical analysis or removing outliers.
Preregister studies in scientific journals or institutes for open science. Establish accountability systems. Encourage or require transparency in reporting of data and analysis in journal publications. Report and justify all data manipulations or exclusions comprehensively.

A3
Selecting inappropriate controls; failing to ensure that controls are representative of the population from which the study group is taken.
Invalidates comparisons between study groups and leads to inappropriate inferences.
Use guidelines for careful selection of controls. Consider theory-driven qualitative evaluations to avoid overreliance on exclusively data-driven and resource-intensive investigations.

A4
Failing to recognize the validity of qualitative evidence, including traditional and local ecological knowledge.
Relying solely on quantitative methods can give a narrow view of the issue at hand, ignoring relevant evidence and overlooking critical insights and context.
Encourage best practices in qualitative and mixed methods research and follow guidelines on qualitative study design in conservation and/or ecology. Use a standardized framework of reporting and documenting qualitative evidence in scientific publications. Encourage methodological pluralism in conferences and doctoral training programmes.

A5
Using "statistical significance" at the 0.05 level as a strict decision criterion, without considering Type II errors.
Unacceptable impacts may be ignored because of low statistical power, and unimportant effects may be "detected" because of high statistical power.
Specify "important" effect sizes and calculate power a priori. Plot confidence intervals visually. Encourage consultation with a statistician when designing a study. Do not interpret statistical significance as ecological significance.

A6
Designing studies or operating environments in which the burden of proof is on the environment; that is, those in which human actions continue until there is compelling evidence that unacceptable impacts are occurring.
Ignores the probability of a Type II error, leading to acceptance of the null hypothesis of "no effect" or the omission of vital information because results are considered statistically nonsignificant. Failure to adjust for confounding and/or effect-modifying variables.
If possible, conduct randomized controlled trials (such as BACI designs). If that is not possible, engage community members with on-the-ground knowledge to identify confounding variables and/or use instrumental variables. Consider stratified analyses and instrumental variable analyses.

A8
Assuming that "no data" equates to "no effect," or that no observation equates to absence, without considering detection probability.
The absence of data (because of the failure to conduct studies) may be invoked or misinterpreted as evidence of no risk. The absence of an observation may be taken to indicate absence.
Recognize that a lack of research about a conservation issue, and a paucity of data, does not equate to "no risk." Broaden sources of information to include qualitative analyses, local and traditional ecological knowledge, and evidence from similar systems and contexts.

A9
Ignoring existing evidence, including producing incomplete meta-analyses, conducting meta-analyses only in one language, and reporting them as representing a weight-of-evidence summary.
Excluding relevant studies or prior information from parameter estimates, study designs, or meta-analyses (deliberately or unintentionally), leading to erroneous and biased findings.
Condition or update initial estimates using all available evidence. Apply recommendations from knowledge transfer/coproduction literature. Use systematic reviews and reporting tools (e.g., ROSES), incorporating grey literature to counter-balance possible publication bias. Include reviewers from multiple backgrounds to revise inclusion/exclusion criteria. Include research in multiple languages, which may require investment in international collaboration and cocreation of research.

A10
Using a variety of statistical tests and interpretations of data until the desired, statistically significant outcome is achieved (so-called p-hacking).
Inferences are drawn that are not supported by the data.
Design and preregister data collection, analysis, and interpretation. Encourage replication. Provide guidance for peer reviewers to check for common signs of p-hacking.

A11
Failing to publish statistically nonsignificant results (the so-called file drawer problem).
Evidence of no effect is suppressed; studies that result in statistically significant results by chance are privileged, resulting in misleading inferences.
Publish work outside traditional journals, such as in preprint repositories. Preregister research. Create an environment that encourages publication of nonsignificant results.

Part B: Practices that may be remedied by the encouragement of pluralism
Item; Problematic practices; Implications; Remedies

B1
Failing to disclose a conflict of interest (financial, agenda-driven, political, or vested interest).
The absence of objectivity/impartiality resulting in the application of a biased design or analysis, or selective interpretation of the findings.
Disclose conflicts of interest. Upload all methods, analyses, and materials to a publicly accessible online repository to encourage scrutiny. Broaden and democratize decision-making, increase diversity within science to enhance accountability and challenge power structures.

B2
Focusing on measures rather than fundamental objectives (so-called Goodhart's law).
Measures fail in their purpose. Vulnerable elements of biodiversity may be exposed to avoidable risk, e.g., a focus on biodiversity measures that ignore the identities of species in an assemblage.
Distinguish between means and ends. Avoid managing measures of impact or effectiveness directly. Consider cumulative and cascading impacts.

B3
Failing to make transparent the value judgments that underlie decisions, including selecting appropriate standards of evidence, measures of impact and the range of alternative actions.
Failing to discern acceptable potential actions, and failing to consider all relevant values, could lead to sub-optimal courses of action.
Engage with stakeholders and interested parties through deliberative decision-making to develop values hierarchies and to scope the implications of decision alternatives for all those affected. Include researcher position statements.

B4
Neglecting to apply, or dismissing, the Precautionary Principle when there is evidence to justify action; demanding a high degree of proof ("beyond a reasonable doubt") before actions are taken, placing the burden of proof on the environment.
Unacceptable impacts on social-ecological systems (e.g., significantly increased extinction risks or damage to systems) may arise before action is taken.
Consider the weight of evidence and the trade-offs in wrongly concluding there is an impact, as well as wrongly concluding there is no impact, especially when working with small data sets. Act to protect the environment and delay potentially irreversible impacts before conventional scientific certainty is established.

B5
Manipulating policy priorities through influence over research priorities.
Influencing the research agenda by funding research that supports policies, positions, or values of interested parties or stakeholders.
Declare implicit conflicts of interest. Monitor and track research funding sources and activity. Establish regular reporting and review of funding decisions in funding bodies to detect issues and identify strategies for improvement.

B6
Using scientists to make decisions.
When scientists decide on a "best" course of action on behalf of stakeholders and interested parties, they entrain their own prejudices, values, and priorities, often inadvertently.
Engage with stakeholders and interested parties to make decisions through deliberative decision-making, e.g., structured decision-making. Employ an "honest broker" to communicate evidence and inferences for decision-making. Empower decision-makers/users with tools to screen the validity of scientific judgments.

B7
Using the most highly regarded scientist to fill knowledge gaps with their expert judgments (so-called arguments from authority).
Individual scientists, even well-regarded ones, routinely provide relatively inaccurate and overconfident judgments compared to diverse groups.
Use explicit, structured techniques to elicit judgments about ecological processes, ideas of cause and effect, and parameters from diverse groups. Test and validate expert judgments.

B8
Overstating the generality or importance of results.
Scientists may claim more than is warranted to make their work more visible and impactful.
Define the target population, justify representativeness, and provide a "constraints on generality" statement.

B9
Demanding consensus or requiring definitive answers.
Many conservation problems are deeply uncertain, and consensus may be impossible. Requiring it will lead to misleading advice and poor decisions.
Admit uncertainty, leave options open, and examine where sensitivities to uncertainty are most critical.

B10
Incomplete or biased engagement in decision making.
When decision makers/researchers/modelers decide who should be involved, decisions may be biased, and important factors may be missed.
Develop a comprehensive map of interested parties and stakeholders. Use social network analysis and snowballing to identify stakeholders and interested parties. Follow established guidelines to avoid unethical or extractive engagement practices. Obtain ethics approval from a qualified (social science) panel to ensure principles of respect, beneficence, nonmaleficence, justice, consent, integrity, and confidentiality are upheld. Ensure prior disclosure of conflicts of interest.

B11
Assuming the geographical representation of evidence and effort reflects real conservation priorities.
Scientific consensus or meta-analyses may direct resources to areas where there has been the most activity.
Anticipate and adjust for geographic bias created by wealth and opportunity.

B12
Assigning unwarranted credibility to the results of studies conducted by influential scientists or work published in high-ranking journals, excluding historically under-represented voices.
Errors from studies by influential scientists may be amplified and erroneous inferences widely accepted before being replicated and verified. At the same time, research from early career and historically underrepresented researchers may be overlooked.
Use double- and triple-blind procedures when reviewing scientific literature. Replicate research outcomes. Strive for diversity in collaborative work. In reviews and meta-analyses or evidence summaries, use comprehensive search criteria and conduct independent evidence quality assessments of individual outputs.
Note: Remedies and references (provided in Supplementary Information) are relevant to conservation science.

Such systemic efforts and structural change are not something the individual researcher, reviewer, or grant committee has direct control over at the time decisions are to be made. Issues of diversity in expert judgment and decision-making usually are addressed by ensuring geographical, linguistic, and disciplinary representation (Editorial, 2021). While such representation is needed and good, efforts to account for diversity should go further by broadening understandings of who counts as an expert, considering wider interpretations of expertise, recruiting from nontraditional backgrounds and testing performance and accuracy (Burgman et al., 2011; Burgman, 2015). Even in this paper, we are guilty of a lack of diversity among the authors (B10 and B11) (Nuñez et al., 2021; our affiliations are limited to Australia, Brazil, China, Sierra Leone, and the United Kingdom), and opinion pieces and perspectives may be construed as arguments from authority (B7). Part B of Table 1 does not explicitly address the ways in which pluralistic science could redress the balance, remedying conventional scientific practices (see Godwin, 2020) to achieve more effective and just conservation outcomes. A comprehensive treatment of this question was beyond the scope of this work, but others have specifically set out an agenda for a pluralistic perspective about biodiversity in science, policy, and practice to achieve socially just conservation outcomes (Pascual et al., 2021). Nevertheless, we encourage readers to be mindful of these limitations when assessing the value of this contribution. The strategies noted in Table 1 are necessarily described in brief. A full exposition would require a document of textbook length. Ideally, a more thorough and complete process would reach out to conservation researchers across the full breadth of the discipline.
FIGURE 1 Example of the application of the recommendations in Table 1 to a conventional experimental plan to estimate the effectiveness of an intervention to enhance nesting success for a wetland nesting bird

Such a contribution could include extensive supplementary information that would give sufficient information for the reader to truly understand the causes and symptoms of each problem and fully appreciate how to implement remedies. The actions and sources of information outlined above are intended to serve as prompts to identify potential methodological flaws in scientific papers, to direct questions at creators, and as a starting point for seeking solutions to problems. We accept that the proposed remedies are not a panacea. In some instances, the solutions may be theoretically obvious but logistically challenging and complex (e.g., engaging community members with on-the-ground knowledge to ensure adequate adjustment for confounding and/or effect-modifying variables). This toolkit could be made more useful and "open" by sharing conservation examples that illustrate and solve problems, linked to publicly available open-source code that accompanied publicly available papers, together with resources for how to learn to use qualitative and quantitative techniques effectively. An example of such an actionable "open toolkit" is provided by the National Academies of Sciences, Engineering, and Medicine (NASEM, 2022). Table 1 does not provide explicit links between the problems and potential remedies; the details appear in the citations in the Supplementary Material and are beyond the scope of this paper. However, we can point to some important links between problems and their remedies for several common issues. Implementing the remedies to problems arising from statistical inference may require action from individual researchers or colleagues involved in collaborative research, during study design (A4, A6, B3) or analysis (A1, A5, A10).
Editors and reviewers may encourage the comprehensive use of available evidence in design and appropriate analysis protocols (A9, A10). In addition, the research culture of the whole scientific community may need to evolve, due to the systemic nature of many problems (A8, B6, B11, B12).
Several recommendations may be especially relevant for editorial decisions and policies, such as the suggestion to require authors to publish all data and study materials to enhance replication studies and to conduct statistical checks on provisionally accepted papers (A1, A2, A9, A10, A11). While some studies in ecology and social science are difficult or practically impossible to replicate, more conservation journals could follow the lead of the American Journal of Political Science, which requires authors of accepted papers to "provide replication materials that are sufficient to enable interested researchers to reproduce all of the analytic results that are reported in the text and supporting materials." The journal checks that the materials submitted "do, in fact, reproduce the analytic results reported in the article" (AJPS, 2022). Tools such as Statcheck (Epskamp & Nuijten, 2016) could do some of this work automatically.
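
The kind of automated consistency check mentioned above can be sketched in a few lines. The example below is a simplified illustration, not the statcheck implementation or its interface: it assumes a reported two-sided z statistic, recomputes the p value, and flags reports whose stated p value differs by more than the rounding error implied by the number of decimals reported. The function names are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_from_z(z):
    """Two-sided p value for a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def p_value_is_consistent(z, reported_p, decimals=2):
    """Check a reported p value against the p value implied by the reported z.

    The tolerance is half a unit in the last reported decimal place, so
    ordinary rounding of the p value is not flagged as an inconsistency.
    """
    tolerance = 0.5 * 10 ** (-decimals)
    return abs(two_sided_p_from_z(z) - reported_p) <= tolerance


# Hypothetical reported results: (z statistic, p value as printed in a paper)
reports = [(1.96, 0.05), (1.96, 0.01), (2.58, 0.01)]
for z, p in reports:
    status = "consistent" if p_value_is_consistent(z, p) else "check"
    print(f"z = {z}, reported p = {p}: {status}")
```

A check of this kind only detects internal inconsistencies in what is reported. It cannot show that the analysis itself was appropriate, which is why it complements rather than replaces the submission of replication materials.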
Most journals currently require authors to declare conflicts of interest such as funding sources and direct benefits to the authors from the research outcomes. However, journal guidelines vary and typically provide only modest guidance on what constitutes a conflict. The Committee on Publication Ethics (COPE), among others, has launched initiatives to establish international standards for Conflict of Interest (COI) disclosure (Ruff, 2015), offering discussion and voluntary compliance. Ruff (2015) characterizes current procedures as unresourced, delivering neither transparency nor accountability.
Preregistration, registered reports, open science data and code repositories and full disclosures of conflicts of interest will serve a multitude of goals and may substantially enhance the quality of conservation science (Nosek et al., 2018), although they may increase preparation time and require additional review effort. Some of the drivers of bias in Table 1, such as the motivations of funders, may also be pathways to the adoption of more open scientific practices (Gruby et al., 2021). Finally, a structured and collaborative peer-review process could support reviewers to better evaluate manuscripts on these dimensions and allow editors to assess more effectively how reviewers' judgments withstand the scrutiny of their peers (Marcoci et al., 2022).
The remedies in Table 1 are also not intended as a definitive test of scientific quality. Studies not reporting confidence intervals can still make substantive contributions to our understanding of urgent issues in conservation, and studies that are preregistered could be useless for policymaking. It is important to treat the above list not as a set of sine qua non conditions for scientific quality but as a tool to structure thinking about what constitutes reliable evidence and progress in conservation science. The toolkit could be expanded based on agreed practice within the conservation community, establishing a living document that represents current consensus on the parameters of scientific credibility.
As it stands, the toolkit may serve multiple purposes. Depending on the stage of research in which this toolkit is employed, it may serve to address issues before they arise. It could be useful in helping doctoral schools to design their curriculum and researchers to decide the details of experimental and analytical protocols (e.g., during research design). Post hoc, it may assist editors and reviewers to judge the scientific quality of a journal/conference submission (e.g., during peer review or research synthesis), and practitioners to assess the reliability of the evidence that informs their decisions. When applied at each stage of the research cycle, the toolkit could promote transparency and accountability, improve communication between interested parties, and ultimately enhance research credibility.

ACKNOWLEDGMENTS
We thank R. Fuller and two anonymous referees for helpful comments, which have improved the manuscript.