Improving transparency and scientific rigor in academic publishing

Abstract Progress in basic and clinical research is slowed when researchers fail to provide a complete and accurate report of how a study was designed, executed, and the results analyzed. Publishing rigorous scientific research involves a full description of the methods, materials, procedures, and outcomes. Investigators may fail to provide a complete description of how their study was designed and executed because they may not know how to accurately report the information or the mechanisms are not in place to facilitate transparent reporting. Here, we provide an overview of how authors can write manuscripts in a transparent and thorough manner. We introduce a set of reporting criteria that can be used for publishing, including recommendations on reporting the experimental design and statistical approaches. We also discuss how to accurately visualize the results and provide recommendations for peer reviewers to enhance rigor and transparency. Incorporating transparency practices into research manuscripts will significantly improve the reproducibility of the results by independent laboratories.

Significance Failure to replicate research findings often arises from errors in the experimental design and statistical approaches. By providing a full account of the experimental design, procedures, and statistical approaches, researchers can address the reproducibility crisis and improve the sustainability of research outcomes. In this piece, we discuss the key issues leading to irreproducibility and provide general approaches to improving transparency and rigor in reporting, which could assist in making research more reproducible.

to present a transparent account of their work, including providing full details of the experimental and statistical procedures and results.
Transparent and rigorous accounts of how an experiment was performed, why the authors used specific statistical approaches, and what limitations arise from such work will allow the reviewers, editors, and subsequently readers to better judge the quality of the science.
In this commentary, we offer an update to basic approaches in reporting a thorough account of the experimental design and statistical approaches and provide an overview of data visualization techniques. 7 It is our hope, as publishers and editors, that these guidelines will help the authors adhere to specific reporting guidelines that promote rigor and transparency in scientific research, which will ensure an accurate and complete account throughout their experiments and discourage publication bias. This, in turn, will promote better, more reproducible science.

| BARRIERS TO REPRODUCIBILITY
Many factors can lead to irreproducibility of scientific results. Oftentimes, these trace back to flaws in the experimental design, in the statistical analyses (and a lack of understanding of fundamental statistical principles), including low statistical power or inadequate sample sizes, and in the basic reporting of the information essential for labs to independently reproduce results (e.g., biological reagents and reference material), as well as to selective reporting of data/results (e.g., p-hacking). 4,8,9 These factors and others might contribute to between 50% and 90% of the published papers being irreproducible. [10][11][12][13][14][15][16][17] Attempts to reproduce published results cost the United States approximately $28B annually, 9,18 yet poor descriptions of the published studies lead to a majority of studies becoming non-replicable. 11 The next subsections will break down some of the more common barriers to reproducibility.

| Neglecting the methods and materials section in manuscripts
The Methods and Materials section of the manuscript is an often neglected area. Journals and authors often limit the methods section to brief descriptions of the procedures or place more complete methods into supplemental materials, or for journals moving away from supplemental material, to online methods that are separate from the article; these are not often critically reviewed by referees and can go unread by the experimenters. Furthermore, reviewers might not be able to adequately review methods and tools and subsequently might fail to notice that key details are missing. This can lead to a lack of complete and transparent reporting of the information required for another researcher to repeat protocols and methods. 2 Similarly, journals requiring a subsection on statistical analyses rarely ask the authors to provide a full account of the statistical approaches, and the authors may also fail to include a full account of the statistical outputs in the results section. Without a rigorous description of the methods, materials, and statistical approaches, experimenters lack the necessary information to independently replicate or nearly replicate results with the same protocol under similar conditions. 2,13

| Aiming for novelty and impact
Current publication trends place emphasis on the pursuit of novelty and innovation, 19 which leads to a collection of reporting problems in how data were obtained. 8 At the most extreme, pressure to publish may lead individuals to rush their experiments, cut corners, make unintentional errors in statistical outputs, or overinterpret the findings, 20 which can lead to irreproducibility of the scientific findings.
To publish in "high impact" journals, scientists may resort to submitting only their most novel and impactful findings and avoid presenting nonsignificant or incremental findings, 19 though the latter also have important implications in driving scientific progress. The pressure to publish sensational findings has even led some "high impact" journals to state in their submission forms: "negative results are not accepted". 21 This emphasis might encourage scientists to pursue nonlinear lines of investigation in search of statistical significance (e.g., p-hacking), and may be one driver of scientific misconduct, including falsifying and fabricating data to increase their impact or statistical significance. 5 At the very least, it leads researchers to omit nonsignificant or incremental findings, which biases the literature and reinforces the perception that negative findings carry a low priority for publication. 22,23 This publication bias has led science reporters and the public to declare that it has become more difficult to trust scientific findings. 24

Even with the most rigorous reporting guidelines and stringent publication standards, including the precise application of the scientific method to ensure robust and unbiased experimental design, methodology, analysis, interpretation, and reporting of the results, 26 there is no guarantee that the authors will fully comply. Reporting guidelines cannot overcome poor training in experimental design and statistics, both of which may be responsible for many of the challenges leading to irreproducibility. 27,28 Indeed, investigators all too often make errors in designing and performing their research, in selecting statistical tests, and in reporting the results. 29 33 (see also Motulsky, 2014 34 ). These tools help the authors improve statistical reporting in manuscripts and ensure that the correct approach was used, though statistical reviews may be limited by how much raw data are available.
In addition to the above tools, editorials and commentaries published in various journals attempt to help the authors improve the descriptions of their experimental procedures and results to ensure that the published research is transparently and accurately reported. [35][36][37][38][39][40] Unfortunately, the authors often fail to incorporate these guidelines into their articles and most journals do not enforce or penalize the authors for not including specific criteria. 6 Refining the steps necessary to ensure quality control during the peer review and publication processes is essential in order to improve transparency and scientific rigor. Adopting the approaches discussed below will better ensure that the experimental designs are accurate and deviations from that design are explained, with the ultimate goal of increasing the reproducibility of the published data. Journals and publishers should continue to provide detailed guidelines to help the authors during the submission process, but if researchers do not adopt a rigorous and transparent approach to scientific design and reporting from the onset of training, these requirements will continue to fall short.
In the following sections, we outline the key steps to improve transparency and scientific rigor that should be considered during the designing stages of experiments, not just before submission for publication.
These requirements can be broadly broken down into (a) reporting criteria to ensure rigor and transparency; (b) transparent account of experimental design; (c) improving statistical rigor and transparency; and (d) peer review to enhance rigor and transparency. Encouraging specific descriptions and a full account of the study will ensure transparency and could improve reproducibility efforts. The next four sections will break down these components to elaborate on how each can improve transparency and rigor in scientific reporting.

| REPORTING CRITERIA TO ENSURE RIGOR AND TRANSPARENCY
The following points describe the key characteristics that must be included in any research design to assess the internal validity, reliability, and potential for reproducibility of scientific findings. Many of these recommendations have been discussed in various venues (e.g., ARRIVE guidelines 7,18,38,41,42 ), and some might only be appropriate to specific sciences. However, we feel that inclusion of these criteria, when applicable, into research manuscripts will improve rigor and transparency of the experimental design and statistical approaches.

| Appropriately describing the experimental subjects
The methods section of each published study begins with a description of the experimental unit; however, in many cases, the information provided falls short. The experimental unit is the entity that is randomly and independently assigned to the treatment conditions (e.g., human subject, animal, litter, cage, fish tank, culture dish, etc.). 43 The sample size is equal to the number of experimental units. In considering the sample size, one must ensure that the experimental units are independently allocated to the experimental condition, that the condition is applied independently to each unit, and that the experimental units do not influence one another. 43 A significant concern in cell biology is determining whether cells or sections, for example, can be considered an experimental unit. In cases where an animal is treated and subsequent testing occurs postmortem (e.g., immunohistochemistry or electrophysiology), the histological sections, neurons per section, spines per neuron, tumor cells per section, etc. are all subsamples of the experimental unit, which is the animal, and should be considered an n of 1. 43,44 If data are not independent, one strategy is to analyze clustered data (e.g., convert the replicates from a single subject into a single summary statistic). 44 Alternatively, there are also procedures to accurately model the true variability in data sets using modern statistical techniques (e.g., handling nested data such as cells/animals, littermates). 45 Reporting the number of experimental units (i.e., subjects, animals, cells) excluded as well as the reason for exclusion is necessary to prevent the researcher from introducing selection bias that favors positive outcomes and distorts true effects. 48
Crucially, studies involving human subjects must not reveal individual identifying information but must contain a full description of the participants' demographics, as variations in the demographics can introduce confounding variables if not appropriately controlled. When designing an experiment, one must also account for sex as a biological variable (see below). One should carefully review the extant literature to determine whether sex differences might be observed in the study and, if so, design and power the study to test for sex differences. Omitting this step could compromise the rigor of the study. 49,50
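The summary-statistic strategy for clustered data described above can be illustrated with a minimal sketch. It assumes each measurement is recorded as a (unit, group, value) triple; the function name `summarize_by_unit` and the example values are hypothetical and are not taken from any cited study. Subsamples (e.g., spines per neuron) are collapsed to one mean per animal, so the n entering any later test equals the number of animals.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_unit(records):
    """Collapse subsamples to one mean per experimental unit.

    records: iterable of (unit_id, group, value) triples.
    Returns {group: [one mean per experimental unit]}.
    """
    values_per_unit = defaultdict(list)
    group_of_unit = {}
    for unit_id, group, value in records:
        values_per_unit[unit_id].append(value)
        group_of_unit[unit_id] = group

    means_by_group = defaultdict(list)
    for unit_id, values in values_per_unit.items():
        means_by_group[group_of_unit[unit_id]].append(mean(values))
    return dict(means_by_group)


# Hypothetical measurements: several sections per animal.
records = [
    ("animal_1", "control", 1.0), ("animal_1", "control", 3.0),
    ("animal_2", "control", 5.0),
    ("animal_3", "treated", 2.0), ("animal_3", "treated", 4.0),
]
unit_means = summarize_by_unit(records)
# n per group is the number of animals, not the number of sections.
sample_sizes = {group: len(vals) for group, vals in unit_means.items()}
```

Averaging discards within-animal variability; when that variability is itself of interest, the nested (mixed-effects) models mentioned above are the more informative choice.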

| Randomization and blinding procedures
Choices made by investigators during the design and execution of experiments can introduce bias, which may result in the authors reporting false-positives. 13,39,51 For example, when investigators are aware of which animals belong to one condition or know that a given treatment should have a specific effect, or human subjects become aware of the conditions they are in, the researchers and participants may inadvertently be biased toward specific findings or alterations in a specific behavior. 52,53 To reduce bias in subject and outcome selection, the authors should report randomization and blinding procedures. 54 Implementing and reporting randomization and blinding procedures is simple and can be followed using a basic guide, 52,55 but to reduce bias, it is essential to report the method of participant randomization to the various experimental groups as well as on random sample processing and collection of data. 38,39 Moreover, investigators should report whether experimenters are blind to the allocation sequence and also, in animal studies, report whether controls are true littermates of the test group. 44 Similarly, once the investigator is blind to the conditions, they should remain unaware of the group in which the subject is allocated and the assessment outcome. 39 Blinding is not always possible. In these cases, procedures to standardize the interventions and outcomes should be implemented and reported so groups are treated as equally as possible. In addition, researchers should consider duplicate assessment outcomes to ensure objectivity. 52 Attention to reporting these details will reduce bias, avoid mistaking batch effects for treatment effects, and will improve the transparency of how the research was conducted.
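As one illustration of a reportable randomization and blinding procedure, the sketch below allocates units to groups in equal numbers and issues opaque codes. It is a minimal example under stated assumptions, not a validated allocation tool: the function name `randomize_and_blind`, the code format, and the seed value are all hypothetical. Recording the seed makes the allocation sequence reproducible, and the unblinding key would be held by someone not involved in outcome assessment.

```python
import random


def randomize_and_blind(unit_ids, groups, seed):
    """Balanced random allocation with opaque codes for blinded assessment."""
    if len(unit_ids) % len(groups) != 0:
        raise ValueError("number of units must be a multiple of the number of groups")
    rng = random.Random(seed)  # a recorded seed makes the sequence reproducible

    # Equal number of slots per group, shuffled into a random order.
    labels = list(groups) * (len(unit_ids) // len(groups))
    rng.shuffle(labels)

    # Opaque codes, shuffled so code order carries no group information.
    codes = [f"S{i:03d}" for i in range(1, len(unit_ids) + 1)]
    rng.shuffle(codes)

    blinded_sheet = dict(zip(unit_ids, codes))  # what the experimenter sees
    unblinding_key = dict(zip(codes, labels))   # withheld until analysis
    return blinded_sheet, unblinding_key


animals = [f"animal_{i}" for i in range(1, 9)]
sheet, key = randomize_and_blind(animals, ["control", "treatment"], seed=2024)
```

The methods section can then state the allocation method, the seed or software used, and who held the key.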

| Animal housing and husbandry
Many life science disciplines use animal models to test their hypotheses. Few studies provide detailed information regarding housing and husbandry and those reports that contain the information typically do not provide any level of detail that could allow for others to follow similar housing procedures. When using animals, care should be taken to adequately describe the housing and husbandry conditions as these conditions could have profound implications on the experimental results. 56 At a minimum, the authors should introduce in the abstract the race, sex, species, cell lines, etc. so that the reader will be aware of the population/sample being studied. However, in the methods section, the authors should carefully describe all animal housing and husbandry procedures. For example, it is normally unclear whether animals were single or group housed, and in most journals, the age and/or weight of the animals are commonly omitted. 57

| Sex as a biological variable
Sex/gender plays an influential role in experimental outcomes. A common practice within research is that findings in one sex (usually males) are generalized to the other sex (usually females). Yet, research consistently demonstrates that sex differences are present across disciplines.
For example, as evidence in a recent issue of JNR reveals (see Sex Influences on Nervous System Function), sex not only matters at the macroscopic level, where male and female brains have been found to differ in connectivity, 58 but at the microscopic level too. 59 The National Institutes of Health as well as a number of funding agencies mandate the inclusion of sex as a biological variable, yet this mandate is not enforced by most journals. Starting at the study design, the authors must review whether the extant literature suggests that sex differences might be observed in the study, and if so, then design and power the study to test for sex differences. Otherwise, the rigor of the study could be compromised. When publishing the results, the authors must account for sex as a biological variable, whenever possible. At a minimum, the authors should state the sex of the subjects studied in the title and/or abstract of the manuscript. If a single-sex study is conducted, the rationale for choosing only one sex should also be provided and discussed as a limitation to the generalizability of the findings. Investigators must also justify excluding either males or females. The assumptions that females are more variable than males or that females must be tested across the estrous cycle are not appropriate as these are not major sources of variability. 60 This policy is not a mandate to specifically investigate sex differences, but requires investigators to consider sex from the design of the research question through reporting the results. 49,50 In some instances, sex might not influence the outcomes (e.g., 61,62 ), but balancing sex in animal and cellular models will distinctly inform the various levels of research. 49 More specific guidelines for applying the policy of considering sex as a biological variable are also available, 50,63 but shifting the experimental group composition should be done in the context of appropriate a priori power analyses.
One concern is that sample sizes need to be doubled to identify effects using both female and male subjects, but factorial designs can evaluate the main effects of the treatment and subject sex without increasing the sample size. 64 While the risk of false-positive errors associated with testing sex differences in this way is present, reporting that these differences may or may not be present is imperative to understanding how sex influences the function of the nervous system. This practice should be extended to all scientific journals using animal/human subjects.

how the data were nested) and the type of design considered (e.g., completely randomized design, randomized complete block design, and factorial design; see 65,66 for definitions and procedures to implement these designs). If the analysis was planned prior to data collection, the authors should describe the specific a priori consideration of the statistical methods and planned comparisons; 7 otherwise, they should report that no a priori statistical planning was carried out. If the statistical approach deviated from how it was originally designed (see, for example, Registered Reports below), the authors should also report the justification for this change. This open description could help to improve independent research reproducibility efforts and assist reviewers and readers in understanding the rationale for specific approaches.

| Transparent account of the experimental design and statistical approaches
A precise description of how methodological tools and procedures are prepared and used should also be provided in the experimental design section. Oftentimes, methodological procedures are truncated, forcing the authors to omit critical steps. Alternatively, the authors may report that the methods were previously described but might have modified those procedures without reporting those changes.
Due to current publishing constraints, various caveats that go into the methodological descriptions remain unknown. However, this can be remedied easily by journals requiring a full description or step-by-step procedure of the experimental protocol used to test the dependent variables. Two options are available for publishing full protocols.
First, the protocol could be published in the manuscript, with the reviewers verifying that the procedures are appropriately followed; second, a truncated version of the methods could be published in the manuscript, but the extended methods must be required as supplemental material (the extended methods will be peer reviewed during the submission process). An alternative approach is to deposit step-by-step protocols into a database or a data repository such as Dryad, FigShare, or with the Center for Open Science, where they will receive a DOI and can be linked back to the original research article, which will contain the truncated procedures.

| Materials
Rigorous descriptions of the experimental protocols not only require a level of detail in the description of the experimental design, but also a full account of the resources and how they were prepared and used. A contributing factor to irreproducibility is the poor or inaccurate description of materials. In order for researchers to replicate and build upon published research findings, they must have confidence in knowing that materials specified in a publication can be correctly identified so that they might obtain the same materials and/or find out more about those materials. Most studies do not include sufficient detail to uniquely identify key research resources, including model organisms, cell lines, and antibodies, to name a few. 67 While most author guidelines request that the authors provide the company name, city in which the company is located, and the catalog number of the material, (a) many authors do not include this information; (b) the particular product may no longer be available; or (c) the catalog number or lot number is reported incorrectly, thus rendering the materials unattainable.
A new system is laying the foundation to report research resources with a unique identification number that can be deposited in a database for quick access. The Resource Identification Initiative standardizes the materials necessary to conduct research by assigning research resource identifiers (RRIDs). 68 To make it as simple as possible to obtain RRIDs, a platform was developed (www.scicrunch.org/resources) to aggregate data about antibodies, cell lines, model organisms, and software into a community database that is automatically updated on a weekly basis and provides the most recent articles that contain RRIDs. While SciCrunch is among the founding platforms, these identifiers can also be found on other sites, including antibodyregistry.org, benchsci.com, and others. Similarly, though more

| Statistical rigor and transparency
With most statistical software having a user-friendly interface, students quickly learn how to perform basic statistical tests. However, users all too often choose inadequate or incorrect statistical methods or approaches, or cannot reproduce their analyses, since they have only a rudimentary understanding of each test and when to use it. 6 69 Moreover, it might be difficult to reproduce statistical output when the authors do not report the statistical software and specific version thereof, fail to include in the manuscript the exclusion criteria or code used to generate analyses, or do not explain how modifications to the experimental design might lead to changes in how statistical analyses are approached (e.g., independent versus non-independent groups). Additional details about these common mistakes can be found elsewhere, 7,28,32 but it is important to emphasize that failure to report these variables can lead to errors in data interpretation. Choosing the correct statistical analyses first depends on an appropriate experimental design and mode of investigation (exploratory versus confirmatory 71 ). One must decide whether experimental conditions are independent, meaning that no subjects or specimens are related to each other, 7,32 whether the conditions are nonindependent or paired, and whether there are any associations between variables. 72 The second step is that statistical analyses must include specific details about the test statistics, the rationale for choosing each test, a description of whether the assumption of normality is met, and a statement about which p-value level is deemed statistically significant. In addition, a transparent and rigorous statistical analysis section must include the following:  54 Yet, as more parameters come into play (for example, within mixed effects modeling), power analysis software becomes more complex (see Power Analysis for Mixed Effect Models in R).
Conducting these analyses allows researchers to confidently select a sample size large enough to lead to a rejection of the null hypothesis for a given effect size. 75 However, one limitation to a priori power analyses is that effect sizes and SDs may not be known prior to the research being conducted, so the observed effects may be smaller or larger than the hypothesized effects (see 78,79 ). Alternatively, if it is conventional to use a specific number of subjects for a particular test, then one can report the calculated effect size for that particular sample size and decide whether more samples would be warranted. Either way, power and sample size calculations provide a single estimate, ignoring variability and uncertainty; as such, simulations are highly encouraged (see 80 ).
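To make the a priori calculation and the simulation check concrete, the following is a minimal sketch for a two-group comparison. It rests on assumptions that are not part of the original text: a standardized effect size d, equal group sizes, normally distributed data with unit SD, and a normal approximation to the test statistic (a t-based calculation, as dedicated power software would perform, gives slightly larger n). The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def n_per_group(d, alpha=0.05, power=0.80):
    """Closed-form sample size per group (normal approximation, two-sided test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)


def simulated_power(d, n, alpha=0.05, reps=2000, seed=1):
    """Fraction of simulated experiments in which the effect is detected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(reps):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(d, 1.0) for _ in range(n)]
        se = math.sqrt(stdev(control) ** 2 / n + stdev(treated) ** 2 / n)
        if abs(mean(treated) - mean(control)) / se > z_crit:
            hits += 1
    return hits / reps


n = n_per_group(0.5)  # 63 per group under the normal approximation
estimated_power = simulated_power(0.5, n, reps=1000)
```

Rerunning the simulation with a range of plausible effect sizes and SDs, rather than a single guess, shows how sensitive the planned sample size is to those assumptions.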
An alternative to the a priori power analysis is a post hoc power analysis (SPSS calls this "observed power") or confidence intervals.
The post hoc power analysis takes the observed effect size as the assumed population effect, though this value might differ from the true population effect size, which might culminate in a misleading evaluation of power. 75 Post hoc power analyses always show there is low power with respect to nonsignificant findings. 77  to-do-if-your.html). If a reviewer or journal requests a power analysis, we recommend reporting confidence intervals, rather than post hoc power analyses, to estimate the magnitude of effects that are consistent with the statistical data reported. 76,77,82 Alternatively, if increasing power is a necessity and/or sample sizes are already at their limits for financial or logistic reasons, one should consider alternative approaches, which are well described by Lazic; these include: (a) using fewer factor values for continuous predictors; (b) having a more focused and specific hypothesis test; (c) not dichotomizing or binning continuous variables; and (d) using a crossed or factorial design rather than a nested arrangement. 46

We also advise authors to determine whether a parametric or nonparametric test is the most appropriate for the obtained data. Analogues to ordinary parametric tests (e.g., t-test or ANOVA, etc.) can be performed even if data are skewed or have nonnormal distributions; multiple robust analytics are available for these circumstances (see 83 ) as long as the sample size is sufficient. Importantly, parametric tests also generally have somewhat more statistical power than nonparametric tests and are more likely to detect a significant effect if one exists. Alternatively, when one's data are better represented by the median, nonparametric tests may be more appropriate, especially when data are skewed enough that a mean might be strongly affected by the distribution tail, whereas the median estimates the center of the distribution.
Nonparametric tests may also be more appropriate when the obtained sample size is small, as occurs in many fields where sample sizes average less than eight per group, 48 or when the data obtained are ordinal or ranked, or contain outliers that cannot be removed. 84 Beware, however, that nonparametric testing with sample sizes that are too low (e.g., n < 5) has very little power to reveal an effect, if indeed one is present; difficulties due to violations of the underlying statistical assumptions of the particular test being used might also be present. Bayesian analyses with small sample sizes are also possible, though estimates are highly sensitive to the specification of the prior distribution.

bar graphs are designed for categorical data; when used to display continuous data, bar graphs with error bars omit key information about the data distribution (see also 85 ). To change standard practices for presenting data, continuous data should be visualized by emphasizing the individual points; dot plots (e.g., univariate scatterplots) are strongly recommended for small samples, along with plots such as violin plots (or overlaid points on the plots) to provide far more informative views of the data distributions when samples are sufficiently large. Bar graphs should be reserved for categorical data only. Moreover, data points from multiple groups are often plotted directly on top of one another, but should be "jittered" across the X-axis so that each discrete data point can be visualized. Jittering ensures that when there are fewer unique combinations of data points than total observations, the totality of the data distribution is not obscured. By adopting these practices, readers will be better able to detect gross violations of the statistical assumptions and determine whether results would be different using alternate strategies. 42 When plotting data, it is important to also report the variability of the data.
Typically, this is expressed as the SD or standard error of the mean (SEM), but it is important to note that SEM does not indicate variability. 34 The SD is calculated as part of an estimate of the variability of the population from which the sample was drawn. 86,87 The SEM, on the other hand, describes the SD of the sample mean as an estimate of the accuracy of the population mean. In other words, the SD shows how much the points within the sample differ from the sample mean, whereas the SEM shows how close the sample mean is likely to be to the population mean. 87 The main function of SEM is to help construct confidence intervals, which are a range of values likely to contain the true population value (usually an unknown), so that one can quantify the proximity of the experimental mean to the population mean. 88 Yet deriving confidence intervals around one's data (using SD) or the mean (using SEM) is premised on those data being normally distributed. Robust estimators are increasingly important as heteroscedasticity (having subpopulations with differing variabilities) is a frequent consequence of real-world measurement.
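The relationship between the SD, the SEM, and a confidence interval for the mean can be shown with a short sketch. The function name `describe` and the example values are hypothetical. The interval uses a normal quantile, which is a large-sample approximation and an assumption of this example; for small samples a t quantile with n - 1 degrees of freedom gives a wider and more appropriate interval.

```python
import math
from statistics import NormalDist, mean, stdev


def describe(sample, confidence=0.95):
    """Return the mean, SD, SEM, and an approximate confidence interval."""
    n = len(sample)
    m = mean(sample)
    sd = stdev(sample)       # sample SD (n - 1 denominator): spread of the data
    sem = sd / math.sqrt(n)  # precision of the sample mean; shrinks as n grows
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # normal approximation
    return {"n": n, "mean": m, "sd": sd, "sem": sem,
            "ci": (m - z * sem, m + z * sem)}


summary = describe([2, 4, 4, 4, 5, 5, 7, 9])
```

Collecting more observations from the same population leaves the SD roughly unchanged but reduces the SEM, which is why the two quantities answer different questions.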

| Graphical representation of data
Traditional data transformations are an attempt to cope with this phenomenon but for many, such transformations may not actually serve to resolve anything and may add a layer of unnecessary complexity.
In determining which estimate of variability to depict graphically, it is important to remember that the SD is used when one wants to know how widely scattered measurements are or the variability within the sample, but if one is interested in the uncertainty around the estimate of the mean measurement or the proximity of the mean to the population mean, SEM is more appropriate. 87 When plotting data variability, it is important to consider that when SEM bars do not overlap, the viewer cannot be sure that the difference between the two means is statistically significant (see 34 ). We also note that it is misleading to report SDs in the narrative and tables but plot SEMs. Furthermore, unless an author specifically wants to inform the reader about the precision of the study, SD should be reported as it quantifies variability within the sample. [86][87][88] Therefore, the optimal method to visualize data variability is to display the raw data, but if that makes the graph too difficult to read, instead show a box-whisker plot, frequency distribution, or the mean ± SD. 34

| Inclusion of statistically significant and nonsignificant data
The probability that a scientific research article is published traditionally depends on the novelty or inferred impact of the conclusion, the size of the effect measured, and the statistical confidence in that result. 21,89 Obtaining negative results can lead to a file-drawer effect; scientists ignore negative evidence that does not reach significance and intentionally or unintentionally select the subsets of data that show statistical significance as the outcomes of interest. 41 This publication bias skews scientific knowledge toward statistically significant or "positive" results, meaning that the results of thousands of experiments that fail to confirm a result are filed away. 89 These data-contingent analysis decisions, also known as p-hacking, 90

| Real and perceived conflicts of interest
Though the objectivity of a researcher or group is assumed, conflicts of interest may exist and could be a potential source of bias. Discussion of conflicts of interest largely focuses on financial conflicts, 91,92 but conflicts can also occur when an individual's personal interests are in conflict with professional obligations, including industrial relationships. 93 Conflicts, whether real or perceived, arise when an interest is recognized as influencing an author's objectivity. This can occur, for example, when an author owns a patent, holds stock, or is a member of a company. All participants in a paper must disclose all relationships that could be viewed as presenting a real or perceived conflict of interest.
When considering whether a conflict is present, one should ask whether a reasonable reader could feel misled or deceived. A full treatment is beyond the scope of this article, but the Committee on Publication Ethics offers a number of resources on conflicts of interest.

| Registered reports and open practices badges
One possible way to incorporate all the information listed above and to combat the stigma against papers that report nonsignificant findings is to implement Registered Reports or to reward transparent research practices. Registered Reports are a form of empirical article, designed to eliminate publication bias and incentivize best scientific practice, in which the methods and the proposed analyses are preregistered and reviewed before the research is conducted. This format is designed to minimize bias, while also allowing complete flexibility to conduct exploratory (unregistered) analyses and report serendipitous findings.
The cornerstone of the Registered Reports format is that the authors submit as a Stage 1 manuscript an introduction, complete and transparent methods, and the results of any pilot experiments (where applicable) that motivate the research proposal, written in the future tense.
These proposals will include a description of the key research question and background literature, hypotheses, experimental design and procedures, analysis pipeline, a statistical power analysis, and a full description of the planned comparisons. Submissions are reviewed by editors, peer reviewers, and, in some journals, statistical editors. Those that meet the requirements for rigor and transparency in conducting the proposed research are offered an in-principle acceptance, meaning that the journal guarantees publication if the authors conduct the experiment in accordance with their approved protocol. Many journals publish the Stage 1 report, which could be beneficial not only for citations but also for the authors' progress reports and tenure packages. Following data collection, the authors prepare and resubmit a Stage 2 manuscript that includes the introduction and methods from the original submission plus their obtained results and discussion. The manuscript will undergo full review; referees will consider whether the data test the authors' proposed hypotheses by satisfying the approved outcome-neutral conditions, will ensure the authors adhered precisely to the registered experimental procedures, and will review any unregistered post hoc analyses added by the authors to confirm they are justified, methodologically sound, and informative. At this stage, the authors must also share their data
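As an illustration of the statistical power analysis that a Stage 1 proposal is expected to include, the sketch below estimates the sample size per group for a two-group comparison of means. It is a minimal example under stated assumptions, not a method prescribed by any journal: it uses the normal approximation n = 2((z_{1-alpha/2} + z_{power}) / d)^2 for a two-sided test, where d is the standardized effect size, and the function name and default values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Uses the normal approximation; effect_size is the standardized mean
    difference (Cohen's d). Defaults are conventional but illustrative.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: n per group = {sample_size_per_group(d)}")
```

Because the normal approximation ignores the extra uncertainty from estimating the SD, it tends to give slightly smaller numbers than a t-based calculation. A preregistered protocol would normally state which method and software were used, along with the justification for the assumed effect size.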

| PEER REVIEW TO ENHANCE RIGOR AND TRANSPARENCY
The process of peer review is designed to evaluate the validity, quality, and originality of articles submitted for publication. Yet peer reviewers are not immune to making mistakes. For example, in several studies major errors were deliberately inserted into papers; no reviewer found all the errors, and some reviewers did not spot any. 95,96 While it is beyond the scope of this article to discuss the many defects of peer review (see 97), it is important to note that changes to the peer review process are ongoing 98 and publishers are working to develop more formal training processes. However, to quickly improve rigor and transparency in scientific research, peer review should emphasize the design and execution of the experiment. We are not saying that reviewers should focus solely on the experimental design; it is important for reviewers to weigh in on the novel insights of a study and how study results may or may not contribute to the field. However, to help ensure the accuracy and the validity of a study, emphasis should first be on the experimental design. To assist the reviewers, the authors should submit as part of their manuscript a Transparent Science Questionnaire (TSQ), or something equivalent, which identifies where in the manuscript the specific elements that could aid reproducibility efforts are found. The reviewers use this form to verify that the authors have included the relevant information and that the study was designed and executed objectively, which supports the study's validity and reliability. Using this or similar forms will also help reviewers find the information needed to judge the appropriateness of the design, which can then allow them to focus on the experimental outcomes.
Adopting forms such as the TSQ or using services such as those offered by Research Square could also speed up the peer review process and reduce the cost in time committed by unpaid reviewers, which in 2008 was estimated at $2.3 billion (https://scholarlykitchen.sspnet.org/2010/08/31/the-burden-of-peer-review/).
A multistage review, in which different parties are concerned with different aspects of the review, may be optimal. Because many errors in manuscripts are found in the statistical output, one stage should be a statistical review, in which a statistical editor not only checks the statistical analyses for accuracy but also verifies that the most appropriate statistical tests for the design were used.
Upon completion, the editor will then decide whether the approach and execution are sufficient and in line with the reported statistical output. By having experts focus on specific aspects of a research report, journal editors will become more confident that the research published is valid and of high quality and integrity.

| CONCLUSIONS
A challenge in science is for scientists to be open and transparent about the procedures used to obtain results. A major source of irreproducibility is substantial human error, which can occur while scientists are conducting the experiments or during data/statistical analysis. Groups are continuing to develop systems that help researchers cover every aspect of the experimental design (e.g., EQIPD or XDA), but education and awareness of the key elements in research design and analysis are essential to transparent and reproducible research. By incorporating the specific elements discussed in this document into research manuscripts, researchers can reduce subjective bias while actively improving methods' reproducibility, which will increase the likelihood of research reproducibility, as the two are closely linked. 2 While variability in results is inevitable, ensuring that every salient aspect of a study is reported will help others understand the procedures involved and potential sources of errors during the experimentation process, which will ultimately lead to greater transparency in science.

* When describing the data, it is important to differentiate between an exploratory and a confirmatory study, as this could have profound implications for how data are presented. Exploratory analyses are meant to identify patterns in the data without much emphasis on hypothesis testing, but most studies publish confirmatory experiments to test one or a few stated hypotheses.