The ARRIVE guidelines 2.0: updated guidelines for reporting animal research

Reproducible science requires transparent reporting. The ARRIVE guidelines (Animal Research: Reporting of In Vivo Experiments) were originally developed in 2010 to improve the reporting of animal research. They consist of a checklist of information to include in publications describing in vivo experiments to enable others to scrutinise the work adequately, evaluate its methodological rigour, and reproduce the methods and results. Despite considerable levels of endorsement by funders and journals over the years, adherence to the guidelines has been inconsistent, and the anticipated improvements in the quality of reporting in animal research publications have not been achieved. Here, we introduce ARRIVE 2.0. The guidelines have been updated and information reorganised to facilitate their use in practice. We used a Delphi exercise to prioritise and divide the items of the guidelines into 2 sets, the ‘ARRIVE Essential 10,’ which constitutes the minimum requirement, and the ‘Recommended Set,’ which describes the research context. This division facilitates improved reporting of animal research by supporting a stepwise approach to implementation. This helps journal editors and reviewers verify that the most important items are being reported in manuscripts. We have also developed the accompanying Explanation and Elaboration document, which serves (1) to explain the rationale behind each item in the guidelines, (2) to clarify key concepts, and (3) to provide illustrative examples. We aim, through these changes, to help ensure that researchers, reviewers, and journal editors are better equipped to improve the rigour and transparency of the scientific process and thus reproducibility.


Why good reporting is important
In recent years, concerns about the reproducibility of research findings have been raised by scientists, funders, research users, and policy makers (Begley & Ioannidis, 2015; Goodman et al. 2016). Factors that contribute to poor reproducibility include flawed study design and analysis, variability and inadequate validation of reagents and other biological materials, insufficient reporting of methodology and results, and barriers to accessing data (Freedman et al. 2017). The bioscience community has introduced a range of initiatives to address the problem, from open access and open practices to enable the scrutiny of all aspects of the research (Kidwell et al. 2016; Else, 2018) through to study preregistration to shift the focus towards robust methods rather than the novelty of the results (Chambers et al. 2017; Nosek et al. 2018), as well as resources to improve experimental design and statistical analysis (Bate & Clark, 2014; Lazic, 2016).
Transparent reporting of research methods and findings is an essential component of reproducibility. Without this, the methodological rigour of the studies cannot be adequately scrutinised, the reliability of the findings cannot be assessed, and the work cannot be repeated or built upon by others. Despite the development of specific reporting guidelines for preclinical and clinical research, evidence suggests that scientific publications often lack key information and that there continues to be considerable scope for improvement (McCance, 1995; Hackam & Redelmeier, 2006; Kilkenny et al. 2009; Macleod et al. 2009; Rice et al. 2009; van der Worp et al. 2010; Glasziou et al. 2014; Macleod et al. 2015). Animal research is a good case in point, where poor reporting impacts on the development of therapeutics and irreproducible findings can spawn an entire field of research, or trigger clinical studies, subjecting patients to interventions unlikely to be effective (Scott et al. 2008; Begley & Ellis, 2012; Begley & Ioannidis, 2015).
In an attempt to improve the reporting of animal research, the ARRIVE guidelines (Animal Research: Reporting of In Vivo Experiments) were published in 2010. The guidelines consist of a checklist of the items that should be included in any manuscript that reports in vivo experiments, to ensure a comprehensive and transparent description (Kilkenny & Altman, 2010; Kilkenny et al. 2010a, 2010b, 2010c, 2010d, 2010e; McGrath et al. 2010; Kilkenny et al. 2011; Kilkenny et al. 2012a, 2012b). They apply to any area of research using live animal species and are especially pertinent to describe comparative research in the laboratory or other formal test setting. The guidelines are also relevant in a wider context, for example, for observational research, studies conducted in the field, and where animal tissues are used. In the 10 years since publication, the ARRIVE guidelines have been endorsed by more than a thousand journals from across the life sciences. Endorsement typically includes advocating their use in guidance to authors and reviewers. However, despite this level of support, recent studies have shown that important information as set out in the ARRIVE guidelines is still missing from most publications sampled. This includes details on randomisation (reported in only 30%-40% of publications), blinding (reported in only approximately 20% of publications), sample size justification (reported in less than 10% of publications), and animal characteristics (all basic characteristics reported in less than 10% of publications) (Macleod et al. 2015; Avey et al. 2016; Leung et al. 2018).
Evidence suggests that 2 main factors limit the impact of the ARRIVE guidelines. The first is the extent to which editorial and journal staff are actively involved in enforcing reporting standards. This is illustrated by a randomised controlled trial at PLOS ONE, designed to test the effect of requesting a completed ARRIVE checklist in the manuscript submission process. This single editorial intervention, which did not include further verification from journal staff, failed to improve the disclosure of information in published papers (Hair et al. 2019). In contrast, other studies using shorter checklists (primarily focused on experimental design) with more editorial follow-up have shown a marked improvement in the nature and detail of the information included in publications (Han et al. 2017; Ramirez et al. 2017; The NPQIP Collaborative group, 2019). It is likely that the level of resource required from journals and editors currently prohibits the implementation of all the items of the ARRIVE guidelines.
The second issue is that researchers and other individuals and organisations responsible for the integrity of the research process are not sufficiently aware of the consequences of incomplete reporting. There is some evidence that awareness of ARRIVE is linked to the use of more rigorous experimental design standards (Reichlin et al. 2016). However, researchers are often unfamiliar with the much larger systemic bias in the publication of research and in the reliability of certain findings and even of entire fields (The Academy of Medical Sciences, 2015; Hurst & Percie du Sert, 2017; Fraser et al. 2018; Hair et al. 2019). This lack of understanding affects how experiments are designed and grant proposals prepared, how animals are used and data recorded in the laboratory, and how manuscripts are written by authors or assessed by journal staff, editors, and reviewers.
Approval for experiments involving animals is generally based on a harm-benefit analysis, weighing the harms to the animals involved against the benefits of the research to society. If the research is not reported in enough detail, even when conducted rigorously, the benefits may not be realised, and the harm-benefit analysis and public trust in the research are undermined. As a community, we must do better to ensure that, where animals are used, the research is well designed, well analysed, and transparently reported. Here, we introduce the revised ARRIVE guidelines, referred to as ARRIVE 2.0. The information included has been updated, extended, and reorganised to facilitate the use of the guidelines, helping to ensure that researchers, editors, and reviewers, as well as other relevant journal staff, are better equipped to improve the rigour and reproducibility of animal research.

Introducing ARRIVE 2.0
In ARRIVE 2.0, we have improved the clarity of the guidelines, prioritised the items, added new information, and generated the accompanying Explanation and Elaboration (E&E) document to provide context and rationale for each item (Percie du Sert et al. 2020) (also available at https://www.arriveguidelines.org). New additions comprise inclusion and exclusion criteria, which are a key aspect of data handling and prevent the ad hoc exclusion of data (Landis et al. 2012); protocol registration, a recently emerged approach that promotes scientific rigour and encourages researchers to carefully consider the experimental design and analysis plan before any data are collected (Kimmelman & Anderson, 2012); and data access, in line with the FAIR Data Principles (Findable, Accessible, Interoperable, Reusable) (Wilkinson et al. 2016). Table S1 summarises the changes.
The most significant departure from the original guidelines is the classification of items into 2 prioritised groups, as shown in Tables 1 and 2. There is no ranking of the items within each group. The first group is the 'ARRIVE Essential 10,' which describes information that is the basic minimum to include in a manuscript, as without this information, reviewers and readers cannot confidently assess the reliability of the findings presented. It includes details on the study design, the sample size, measures to reduce subjective bias, outcome measures, statistical methods, the animals, experimental procedures, and results. The second group, referred to as the 'Recommended Set,' adds context to the study described. This includes the ethical statement, declaration of interest, protocol registration, and data access, as well as more detailed information on the methodology such as animal housing, husbandry, care, and monitoring. Items on the abstract, background, objectives, interpretation, and generalisability also describe what to include in the more narrative parts of a manuscript.
Revising the guidelines has been an extensive and collaborative effort, with input from the scientific community carefully built into the process. The revision of the ARRIVE guidelines has been undertaken by an international working group (the authors of this publication) with expertise from across the life sciences community, including funders, journal editors, statisticians, methodologists, and researchers from academia and industry. We used a Delphi exercise (Moher et al. 2010) with external stakeholders to maximise diversity in fields of expertise and geographical location, with experts from 19 countries providing feedback on each item, suggesting new items, and ranking items according to their relative importance for assessing the reliability of research findings. This ranking resulted in the prioritisation of the items of the guidelines into the 2 sets. Demographics of the Delphi panel and full methods and results are presented in Supporting Information S1 Delphi and S1 Data. Following their publication on BioRxiv, the revised guidelines and the E&E were also road tested with researchers preparing manuscripts describing in vivo studies, to ensure that these documents were well understood and useful to the intended users. This study is presented in Supporting Information S1 Road Testing and S2 Data.

Table 1. ARRIVE Essential 10
Study design 1
(a) The groups being compared, including control groups. If no control group has been used, the rationale should be stated.
(b) The experimental unit (e.g. a single animal, litter, or cage of animals).
Sample size 2
(a) Specify the exact number of experimental units allocated to each group, and the total number in each experiment. Also indicate the total number of animals used.
(b) Explain how the sample size was decided. Provide details of any a priori sample size calculation, if done.
Inclusion and exclusion criteria 3
(a) Describe any criteria used for including and excluding animals (or experimental units) during the experiment, and data points during the analysis. Specify if these criteria were established a priori. If no criteria were set, state this explicitly.
(b) For each experimental group, report any animals, experimental units, or data points not included in the analysis and explain why. If there were no exclusions, state so.
(c) For each analysis, report the exact value of n in each experimental group.
Randomisation 4
(a) State whether randomisation was used to allocate experimental units to control and treatment groups. If done, provide the method used to generate the randomisation sequence.
(b) Describe the strategy used to minimise potential confounders such as the order of treatments and measurements, or animal/cage location. If confounders were not controlled, state this explicitly.
Blinding 5
Describe who was aware of the group allocation at the different stages of the experiment (during the allocation, the conduct of the experiment, the outcome assessment, and the data analysis).
Outcome measures 6
(a) Clearly define all outcome measures assessed (e.g. cell death, molecular markers, or behavioural changes).
(b) For hypothesis-testing studies, specify the primary outcome measure, i.e. the outcome measure that was used to determine the sample size.
Statistical methods 7
(a) Provide details of the statistical methods used for each analysis, including software used.
(b) Describe any methods used to assess whether the data met the assumptions of the statistical approach, and what was done if the assumptions were not met.
Experimental animals 8
(a) Provide species-appropriate details of the animals used, including species, strain and substrain, sex, age or developmental stage, and, if relevant, weight.
(b) Provide further relevant information on the provenance of animals, health/immune status, genetic modification status, genotype, and any previous procedures.
Experimental procedures 9
For each experimental group, including controls, describe the procedures in enough detail to allow others to replicate them, including:
(a) What was done, how it was done, and what was used.
(b) When and how often.
(c) Where (including detail of any acclimatisation periods).
Results 10
For each experiment conducted, including independent replications, report:
(a) Summary/descriptive statistics for each experimental group, with a measure of variability where applicable (e.g. mean and SD, or median and range).

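Item 4 (Randomisation) of the Essential 10 asks authors to state the method used to generate the randomisation sequence. As an illustration only, and not something the guidelines prescribe, the Python sketch below shows one way such a sequence could be generated and later reported: block randomisation with a recorded seed. The function name `block_randomise`, the cage identifiers, and the seed value are hypothetical.

```python
import random
from collections import Counter


def block_randomise(unit_ids, groups, seed):
    """Allocate experimental units to groups in shuffled blocks.

    Each block contains every group exactly once, so group sizes stay
    balanced. The seed is recorded so the sequence can be regenerated
    and described in the manuscript.
    """
    rng = random.Random(seed)  # dedicated generator, independent of global state
    allocation = {}
    block_size = len(groups)
    for start in range(0, len(unit_ids), block_size):
        block = list(groups)
        rng.shuffle(block)  # random order of groups within this block
        for unit, group in zip(unit_ids[start:start + block_size], block):
            allocation[unit] = group
    return allocation


# Hypothetical example: 12 cages as experimental units, 2 groups.
units = [f"cage-{i:02d}" for i in range(1, 13)]
allocation = block_randomise(units, ["control", "treatment"], seed=2020)
group_sizes = Counter(allocation.values())
print(group_sizes)
```

Reporting the seed, the block size, and the software version alongside the allocation would let a reader regenerate the same sequence. Other valid methods exist, such as simple or stratified randomisation.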
While reporting animal research in adherence to all 21 items of ARRIVE 2.0 represents best practice, the classification of the items into 2 groups is intended to facilitate the improved reporting of animal research by allowing an initial focus on the most critical issues. This better allows journal staff, editors, and reviewers to verify that the items have been adequately reported in manuscripts. The first step should be to ensure compliance with the ARRIVE Essential 10 as a minimum requirement. Items from the Recommended Set can then be added over time and in line with specific editorial policies until all the items are routinely reported in all manuscripts.
J Physiol 598.18
The ARRIVE 2.0 guidelines are fully compatible with and complementary to other guidelines that have been published in recent years. By providing a comprehensive set of recommendations that are specifically tailored to the description of in vivo research, they help authors reporting animal experiments adhere to the National Institutes of Health (NIH) standards (Landis et al. 2012) and the minimum standards framework and checklist (Materials, Design, Analysis and Reporting [MDAR]; Chambers et al. 2019). The revised guidelines are also in line with many journals' policies and will assist authors in complying with information requirements on the ethical review of the research (Osborne et al. 2009; Rands, 2011), data presentation and access (Giofre et al. 2017; Vasilevsky et al. 2017; Michel et al. 2020), statistical methods (Giofre et al. 2017; Michel et al. 2020), and conflicts of interest (Ancker & Flanagin, 2007; Rowan-Legg et al. 2009).
Although the guidelines are written with researchers and journal editorial policies in mind, it is important to stress that researchers alone should not have to carry the responsibility for transparent reporting. Endorsement of ARRIVE by funders, institutions, and publishers has been instrumental in raising awareness to date; they now have a key role to play in building capacity and championing the behavioural changes required to improve reporting practices. This includes embedding ARRIVE 2.0 in appropriate training, workflows, and processes to support researchers in their different roles. While the primary focus of the guidelines has been on the reporting of animal studies, ARRIVE also has other applications earlier in the research process, including in the planning and design of in vivo experiments. For example, requesting a description of the study design in line with the guidelines in funding or ethical review applications ensures that steps to minimise experimental bias are considered at the beginning of the research cycle (Updated RCUK guidance for funding applications involving animal research, 2015).

Conclusion
Transparent reporting is clearly essential if animal studies are to add to the knowledge base and inform future research, policy, and clinical practice. ARRIVE 2.0 prioritises the reporting of information related to study reliability. This enables research users to assess how much weight to ascribe to the findings and, in parallel, promotes the use of rigorous methodology in the planning and conduct of in vivo experiments (Reichlin et al. 2016), thus increasing the likelihood that the findings are reliable and, ultimately, reproducible.
The intention of ARRIVE 2.0 is not to supersede individual journal requirements but to promote a harmonised approach across journals to ensure that all manuscripts contain the essential information needed to appraise the research. Journals usually share a common objective of improving the methodological rigour and reproducibility of the research they publish, but different journals emphasise different pieces of information (Enhancing reproducibility, 2013; Curtis et al. 2018; Prager et al. 2018). Here, we propose an expert consensus on information to prioritise. This will provide clarity for authors, facilitate transfer of manuscripts between journals, and accelerate an improvement of reporting standards.
Concentrating the efforts of the research and publishing communities on the ARRIVE Essential 10 items provides a manageable approach to evaluate reporting quality efficiently and assess the effect of interventions and policies designed to improve the reporting of animal experiments. It provides a starting point for the development of operationalised checklists to assess reporting, ultimately leading to the development of automated or semi-automated artificial intelligence tools that can detect missing information rapidly (Heaven, 2018).
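As a rough illustration of what an operationalised checklist might look like in practice, the Python sketch below encodes the Essential 10 item names and lists the items a reviewer has marked as not reported. It is an assumption-laden sketch and not a tool described in this paper: the `missing_items` function and the example `assessment` are hypothetical, and the judgement of whether an item is adequately reported is still made by a person.

```python
# Item numbers and names of the ARRIVE Essential 10, as listed in the guidelines.
ESSENTIAL_10 = {
    1: "Study design",
    2: "Sample size",
    3: "Inclusion and exclusion criteria",
    4: "Randomisation",
    5: "Blinding",
    6: "Outcome measures",
    7: "Statistical methods",
    8: "Experimental animals",
    9: "Experimental procedures",
    10: "Results",
}


def missing_items(reported):
    """Return the names of Essential 10 items not marked as reported.

    `reported` maps item number -> bool; items absent from the mapping
    are treated as not reported.
    """
    return [
        name
        for number, name in ESSENTIAL_10.items()
        if not reported.get(number, False)
    ]


# Hypothetical reviewer assessment of one manuscript.
assessment = {1: True, 2: False, 3: True, 4: True, 5: False,
              6: True, 7: True, 8: True, 9: True, 10: True}
missing = missing_items(assessment)
print(missing)  # ['Sample size', 'Blinding']
```

A structure of this kind could also be filled in by an automated text-screening step rather than by hand, which is the direction the paragraph above anticipates.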
Improving reporting is a collaborative endeavour, and concerted effort from the biomedical research community is required to ensure maximum impact. We welcome collaboration with other groups operating in this area, as well as feedback on ARRIVE 2.0 and our implementation strategy.