Implementing an Open & FAIR data sharing policy—A case study in the earth and environmental sciences

This paper outlines the impact of the introduction of an Open & FAIR (findable, accessible, interoperable, and reusable) data sharing policy on six earth and environmental science journals published by Taylor & Francis, beginning in November 2019. Notably, 18 months after implementing this new policy, we observed minimal impacts on submission, acceptance rates, or peer‐review times for the participating journals. This paper describes the changes that were required to internal systems and processes in order to implement the new policy, and compares our findings with recent literature reports on the impact of journals introducing data‐sharing policies.


INTRODUCTION
We rolled out our suite of data sharing policies in 2018 and, within 12 months, had successfully implemented our 'basic' policy on more than 1600 journals. As set out by Jones et al. in their case study (Jones, 2019), our policies were informed by both FAIR (findable, accessible, interoperable, and reusable) data principles (Wilkinson, 2016) and by the TOP Guidelines (Nosek et al., 2015); however, as a publisher of Humanities research under the Routledge imprint, as well as Science and Technology and Medical content, we required a range of policies that covered the breadth of attitudes to data sharing within the different research areas. As such, we adopted a tiered approach, with increasing levels of prescription in each policy (Table 1). Herein, we discuss the impact of implementing the most open policy on the framework, which is called 'Open & FAIR'.
We also cover the changes that were required to our systems and processes, and share feedback from the editors of the participating journals regarding their experiences during the pilot.

METHODS AND IMPLEMENTATION
Scope and ambitions for the pilot
As outlined above, Taylor & Francis first published their suite of data policies in 2018, but, at that time, our focus had been on encouraging our editors and society partners to adopt our 'basic' policy. While this policy does encourage data sharing and raises the question of depositing and citing data sets in the Instructions for Authors, no specific actions are required. In line with other research in this area, we found that very few authors proactively shared their research data as a result of implementing this policy (Giovanni Colavizza, 2020). By signing up to the Enabling FAIR Data initiative and moving a pilot group of journals onto our Open & FAIR data policy, we would be testing our workflows, systems and author/editor communications on the most-open policies in the Taylor & Francis framework. We felt that using these processes on a small group of journals would give us some feedback on the effectiveness of these resources and leave us better placed to roll out our more-open policies more widely.
Finally, we collected and analyzed information about the participating journals in the pilot in order to contribute to the wider understanding of the effect of introducing data-sharing policies on journal metrics. In recent months, publications on this topic have reported a diverse range of effects of introducing data-sharing policies, including both an increase in citations (Giovanni Colavizza, 2020) and a reduction in submissions (Vines, 2018). We were interested to see if our findings aligned with either of these studies.

Scope of the pilot
Taylor & Francis publishes a large portfolio of more than 100 titles in the Earth and Environmental Sciences. Therefore, we developed a set of criteria to identify suitable titles for the pilot. These included:
• Alignment with subject areas covered by the Enabling FAIR Data project
• The use of an online submission system
• Already-established administrative support for the academic editors, so as not to add to their already busy workload
Using these criteria, we selected suitable journals and began discussions with the editors. At this stage, we were very pleased to be joined by the AGU's Director of Research Data, Shelley Stall, for a joint webinar to explain more about the Enabling FAIR Data project to our editors and to answer any questions or field any comments they may have.
We also received some interest from editorial teams of journals that did not have the desired administrative support in place. We were clear about the extra work that would be required in checking the papers and corresponding with authors and were delighted when a couple more journals joined the pilot.
Following the webinar and further discussions, the list of participating titles was finalized as follows:
• Australian Journal of Earth Sciences
We expected some authors to be philosophically opposed to sharing their data (many of the common reasons why authors are opposed to data sharing are covered in the State of Open Data; Baynes, 2020). However, we anticipated that others would be open to sharing, but might lack information about the key practical elements, such as how to write a data availability statement (DAS) or how to select a repository, which would limit compliance at the point of submission. Indeed, we still expect to encounter this issue for the next few years, while funders and research institutions roll out data-management plans more widely and open research practices become more commonplace. The first time many authors consider sharing their data is when prompted by a journal. As funders and other stakeholders adopt data policies, this will happen earlier in the process and authors should become better informed and better prepared at the point of submitting to a journal.
The impact of implementing the new policy on our editors was a key concern as well, mainly that extra work generated by ensuring compliance with the new policy could reduce the time available for other valuable journal-development tasks, such as strategic planning. We were also aware of the administrative support needed to check compliance by authors at the point of submission, and the fact that these teams work across a range of journals and are not subject specialists. We were concerned that this would also generate a high volume of queries for both our teams and our editors. We were also prepared for regional variations in knowledge of data-sharing practices, given the global spread of our authors.
Alongside implementing the policies, this initiative was designed to track the influence of the new policy on our pilot journals. We monitored the number of new article submissions to assess if such a prescriptive data-sharing policy would cause authors unsure about sharing their data (for either practical or ideological reasons) to choose to submit their work to another journal. In this regard, we encouraged our journal editors to be as 'open as possible, but as closed as necessary' in their discussions with existing and potential authors, to mitigate the negative impact on submissions.
Secondly, we were prepared to field queries from potential authors and to address circumstances in which authors had not shared their data or included a DAS within their submission. This additional back-and-forth could significantly increase the volume of email correspondence for our academic editors, who are also juggling teaching and research obligations, in addition to their journal responsibilities. We did not want this initiative to add further demands on their time.
To pre-emptively address this challenge, we asked our editors to reach out to their contacts at Taylor & Francis should they encounter additional queries, and we provided templates to assist with replying to common queries. We created and shared a new email address to assist with data-related queries from potential authors, which would be monitored by our Open Research team.

Aims of the participating journal editors
To include our editors' experiences in this paper, we asked them a series of questions via email to allow them to share why they

Implementation
Once the final list of participants had been agreed, we implemented the necessary changes to the journal workflows so that we could inform submitting authors about the new policy and ensure that we captured the necessary information on submission. This process involved updating the Instructions for Authors text on the journal website to include specific information about the policy (Fig. 1).
Participating journals added a new question to their submission systems to provide policy-specific information to authors.
When the new policy was launched, this question asked the authors to add the DOI of their data to the submission system and to confirm that they had included a DAS in their manuscript.
In addition, we overhauled our Author Services pages on Data Sharing (https://www.authorservices.taylorandfrancis.com/data-sharing-policies/) to provide updated guidance on our suite of data-sharing policies, along with new guidance on how to write a DAS and how to select a suitable data repository.
We rolled out our new data-sharing policy on the pilot group of journals on 12 November 2019, which was coordinated with an update to the journal websites, the online submission systems and some promotional messaging over our social media channels.

Iterative refinement
Since we began our pilot, we have updated a number of our processes and policies as a direct result of feedback that we received from our editors and internal colleagues, as described below.

Online submission systems
In the first few weeks following the implementation of the new policy, we closely monitored the number of original submissions to the pilot journals, as well as encouraging feedback from the editors and submitting authors to our data-sharing inbox.
We were especially alert for any negative feedback, in terms of author experience or submission 'pain points', and we were pleased to receive almost no such responses. As the pilot progressed, and we continued to monitor the performance of the participating journals, we became more comfortable that we were not going to see a significant negative impact from implementing the policy, but were concerned by the minimal feedback that we had received. We then looked more closely at the submissions that were coming through.
On further analysis, we found that very few of the submitting authors had been appropriately engaging with the policy and complying with its requirements at the point of submission. This had led to an increase in the workload for the journal administrators and editors in encouraging authors to comply with the new policy.
We then undertook a review of the new workflows, and amended the question in the submission systems to be more explicit about what we wanted authors to do. We clarified our messaging to ensure that submitting authors included a DAS in their manuscript with a link to their data set, deposited in a suitable (FAIR-aligned) repository. In this regard, we changed the wording from:
• This journal has an open data policy and authors are required to post their data in a suitable repository and link to it from the article (subject to restrictions on sharing due to data ownership or the inclusion of personally identifiable information). See further information here [https://www.authorservices.taylorandfrancis.com/data-sharing-policies/open-data/] Please provide the DOI, pre-reserved DOI or other persistent identifier for your data.
To:
• This journal has an Open Data policy and authors are required to post their data in a suitable repository, link to it from the article and include information in a Data Availability Statement (highlighting any restrictions on sharing due to data ownership or the inclusion of personally identifiable information

Data availability statement
This change produced the shift in author behaviour that we were looking for, with more authors including a DAS in their submissions. As shown above, in our guidance to authors, we included a link to our Author Services website, which provided general instructions on how to prepare a suitable DAS, along with requirements for the DAS texts for each of our policies. However, while this increased the number of DAS being submitted, not all were compliant with the policy. We found that some authors were selecting the DAS text that they found easiest to comply with (often 'data are available on request'), rather than a template that was compatible with the policy. Therefore, we updated our guidance in two ways. Firstly, we added examples of compliant DAS to each policy page (Fig. 2). Secondly, we added a column to the table on our DAS guidance page, which clearly listed the policy or policies with which each template text was compatible.

Administrative checks
Some of the journals that participated in the pilot received additional administrative support, which included performing checks of the submitted article files to ensure that the authors had provided everything that was required and had complied with the submission requirements as laid out by the journal.
For the start of the pilot, we included an additional step in the instructions for participating journals to ask the administrator to check whether a DAS had been included. Early indications were that this step was working well. However, we wanted to see if we could enhance these checks so that they were more useful to our editors.
Therefore, we prepared a workflow for the administrators to follow so that they could conduct more detailed checks, covering not only whether an author had included a DAS in their manuscript, but also whether the DAS was compliant with the journal's data-sharing policy (Fig. 3). To accompany this new workflow, we also prepared three new email templates that the administrators could use in their correspondence with the authors to ask them to add or amend their DAS. We realize the limitations of this new workflow and understand that these steps will not ensure that every submitted DAS is fully compliant with the new policy. However, we felt that this should address the most common errors that we had encountered and allow the academic editors and peer reviewers to focus on the subject-specific details of the DAS and whether it complied with the policy.
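As an illustration only, the kind of two-stage check described above (is a DAS present, and does it look compatible with an open data policy?) could be sketched as follows. The outcome labels, the regular expressions and the routing rules are assumptions made for this sketch; they do not reproduce the workflow in Fig. 3 or the wording of the email templates, and a real check would still need human judgement.

```python
import re

# Assumed heuristics for this sketch only (not the publisher's actual rules).
# A DOI-like string is treated as evidence of a persistent identifier.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/\S+")
# Phrases such as "on request" or "upon reasonable request" are treated as
# incompatible with an open data policy.
ON_REQUEST_PATTERN = re.compile(
    r"\b(up)?on\s+(reasonable\s+)?request\b", re.IGNORECASE
)


def check_das(das_text):
    """Classify a data availability statement (DAS) for an administrator.

    Returns one of four hypothetical outcome labels:
    'missing', 'amend_on_request', 'refer_to_editor' or 'ok'.
    """
    if das_text is None or not das_text.strip():
        return "missing"  # ask the author to add a DAS
    if ON_REQUEST_PATTERN.search(das_text):
        return "amend_on_request"  # ask the author to amend the DAS
    if DOI_PATTERN.search(das_text):
        return "ok"  # a persistent identifier appears to be present
    # No identifier found: there may be a legitimate restriction on sharing,
    # so this sketch leaves the decision to the academic editor.
    return "refer_to_editor"


if __name__ == "__main__":
    # 10.5555/12345678 is a placeholder DOI, not a real data set.
    examples = [
        "",
        "Data are available on request from the authors.",
        "The data are openly available at https://doi.org/10.5555/12345678.",
        "Data cannot be shared because they contain personal information.",
    ]
    for text in examples:
        print(repr(text), "->", check_das(text))
```

Routing statements without an identifier to the editor, rather than rejecting them outright, reflects the policy wording quoted earlier, which allows restrictions on sharing due to data ownership or personally identifiable information.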
Journals that did not receive administrative support approached this differently, since the administrative checks were undertaken by the journals' editors. For one of these journals, editors checked for a DAS only once a manuscript was accepted for publication, as returning manuscripts to authors for a DAS to be included was deemed burdensome when some of those manuscripts would then be rejected.
Editors who used this process developed template emails to send to authors of accepted manuscripts that did not already include a DAS, asking them to add one. This step was estimated by editors to add approximately 15 min to the handling time of each accepted manuscript, as there were often multiple rounds of correspondence required. Because this step occurred only after acceptance, it also raised a dilemma for editors around whether to insist on the inclusion of a DAS where this may cause especially resistant authors to withdraw their manuscripts. Having already undertaken the handling work of these papers, the editors were more inclined to grant exceptions to the data-sharing policy if authors were unwilling to comply, even after requests from the editor.

Repository guidance
While most of the journals that had opted to participate in the pilot program operated single-anonymized (single-blind) peer review, two of the journals operated a double-anonymized (double-blind) system. To respect these peer-review workflows, we had to include guidance for authors who would be required to share their data, but who also needed to keep their identity anonymous. Initially, we suggested that authors consider depositing their data in Figshare (https://www.figshare.com/) or Dryad (https://www.datadryad.org/), as both repositories offer the functionality to create an anonymized link to the deposited data to facilitate double-anonymized peer review. However, one of the participating journals continued to experience quite a few issues with compliance with the double-anonymized peer-review process, and so the editor chose to directly ask authors to use one of these two repositories when sharing their data.

Metadata
Alongside the implementation of the pilot, we also updated our metadata feed to Crossref (https://www.crossref.org/) to include specific tags for data citations. This update was the culmination of a long process, which required changes to our internal processes, as well as working with external parties, including our typesetters and our platform provider. By providing this information to Crossref, we are putting the metadata links in place between articles and data sets. This will enable authors of data to see how data sets are being cited and to get credit for the data that they create as well as for their research articles.
The other key factor was editor engagement. While this was an opt-in pilot that our editors chose to participate in, we saw variable levels of engagement by the editors, in particular around how strictly they enforced the policy. Finally, some of the participating journals had administrative support provided by the publisher, which allowed us to introduce checks on policy compliance (such as the inclusion of a DAS) on submission, but this was not the case for all of the participating journals.

Original submissions and acceptance rates
Across all of the participating journals in the pilot, the number of original submissions remained consistent in 2020 compared with the previous year and current trend (Fig. 4). Each journal had its own trajectory, owing to a range of factors, but we confidently believe that introducing the Open & FAIR data-sharing policy did not have a negative effect on the number of original submissions to the participating journals. Considered alongside the feedback received from our journal editors, we can conclude that authors in this subject area have generally been willing and able to comply with the new policy and to share their data prior to the submission of their research article.
We also considered the number of articles accepted by the participating journals, along with their acceptance rates. In this regard, the acceptance rate was determined to be the percentage of articles that were ultimately accepted for publication in the journal (following any revisions and/or resubmissions). As before, each journal had its own trajectory of acceptance rates, but we once again found no clear influence of the new data-sharing policy on the acceptance rates of the participating journals, which indicated that the implementation of an Open & FAIR data-sharing policy was neither a driver of, nor a barrier to, the submission of quality research.
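For readers who wish to reproduce this metric on their own journals, a minimal sketch of the calculation is given below. The text above does not state the denominator, so the sketch assumes that it is the number of articles that received a final decision; the function name and the figures in the usage example are illustrative and are not data from the pilot.

```python
def acceptance_rate(accepted, decided):
    """Percentage of articles with a final decision that were accepted.

    Assumption for this sketch: 'decided' counts every article that reached
    a final decision (accepted or rejected), after any revisions.
    """
    if decided <= 0:
        raise ValueError("decided must be a positive number of articles")
    if not 0 <= accepted <= decided:
        raise ValueError("accepted must lie between 0 and decided")
    return 100.0 * accepted / decided


if __name__ == "__main__":
    # Invented figures for illustration: 45 acceptances out of 120 decisions.
    print(f"{acceptance_rate(45, 120):.1f}%")
```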

Peer-review times
We monitored peer-review times for the journals that were involved in the pilot (Fig. 5). Specifically, we compared the median number of days to first decision and the median number of days to final decision. We also monitored the number of papers that were withdrawn post-submission for the pilot journals. The picture was more varied for the number of days to final decision: one journal showed an increase of over 30 days, while two journals showed smaller changes (±7 days), and two showed sizable reductions (30 and 40 days). We have prepared a journal-anonymized sheet of the raw data, which is available on Zenodo; see Cannon et al. (2021a, 2021b).
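The comparison of median decision times described above can be sketched as follows. This is a minimal illustration, assuming that each manuscript is represented by a pair of dates (submission and decision); the function names are hypothetical and the dates in the usage example are invented, not taken from the Zenodo data sheet.

```python
from datetime import date, timedelta
from statistics import median


def median_days_to_decision(records):
    """Median number of days between submission and decision.

    records: iterable of (submitted, decided) pairs of datetime.date objects.
    The same function serves for first or final decisions, depending on
    which decision date is supplied.
    """
    durations = [(decided - submitted).days for submitted, decided in records]
    if not durations:
        raise ValueError("at least one record is required")
    return median(durations)


def change_in_median(before, after):
    """Change in median days to decision between two periods (after - before)."""
    return median_days_to_decision(after) - median_days_to_decision(before)


if __name__ == "__main__":
    # Invented manuscripts: decision times of 40, 55 and 90 days before the
    # policy, and 50, 62 and 100 days after it.
    start = date(2019, 1, 1)
    before = [(start, start + timedelta(days=n)) for n in (40, 55, 90)]
    start = date(2020, 1, 1)
    after = [(start, start + timedelta(days=n)) for n in (50, 62, 100)]
    print(change_in_median(before, after))
```

The median is used here, as in the text, because a handful of very slow decisions would otherwise dominate a mean.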

Article citations
Finally, we were interested in investigating the extent to which the deposition and citation of underlying data supports, and perhaps enhances, the usability of an article. We reasoned that any increased visibility and reuse/iteration of a piece of work resulting from the sharing of underlying data might be inferred from citations to the main research article in which the work was reported, as well as from citations to the data deposit or accompanying published data note, if any. Therefore, we also wanted to investigate any changes in citation performance of the participating journals, following the implementation of the pilot.
However, citations of a journal article take time to accrue, as the work must first be published following peer-review, and then disseminated, read, and finally iterated upon or critiqued, with an accompanying citation. Traditional citation metrics allow for this delay time by considering citations to an article in the subsequent years following publication. For our purpose, this means that we can only expect to gain a fuller understanding of the influence of adopting an Open & FAIR data-sharing policy over the coming few years. However, we were still interested to see whether there were any notable short-term changes to the citation patterns for the participating journals, and so we used the 2020 Web of Science Immediacy Index, released in June 2021, to look for any signs of an impact on citations (Fig. 6). Big Earth Data was not indexed in Web of Science at the time of the pilot and so had no associated Immediacy Index data. While there are a number of influencing factors that contribute to the overall citation activity of journal articles, we felt able to infer a couple of tentative conclusions: First, the journal which adhered most strictly to the new data-sharing policy exhibited the most-pronounced positive change in Immediacy Index, which may indicate a causal relationship between data sharing and article citation. Second, all of the journals that participated in the pilot exhibited an increase in their Immediacy Index in 2020, which may indicate that data sharing at least does not negatively impact the publication of citable articles. We look forward to drawing clearer implications on the relationship between an Open & FAIR data-sharing policy and citation performance as more data become available.

LIMITATIONS OF THE RESEARCH
In reviewing the results of our study, we are confident that, at the very least, introducing a more open data policy on a journal does not have significant negative impacts. However, there are some limitations of our study that should be noted. Firstly, the pilot involved a very small sample of only six journals. More research would need to be done with larger cohorts of journals to validate this outcome. Secondly, each of the journals in the pilot is on its own trajectory in terms of submissions and acceptances; this has multiple contributing factors, such as the age of the journal, its size (articles published each year), editorial strategy, status of special or thematic issues and trend in Impact to article submission. In addition, the kind of data discussed in some of these research areas is conducive to sharing: often these include measurements from the air, earth or water or outputs from instruments. There is little interaction with human subjects, which mitigates concerns around personal data or GDPR, which affect other areas and limit sharing activity.
a. 'We will improve our submission guidelines, in particular for data articles and software articles, and work on peer-review guidelines for these two article types.' - Linlin Guan, Big Earth Data.
b. 'Code Ocean allows reviewers to run codes directly on the site without download the codes and data. They also allow the codes and data to run online to reproduce graphs and maps with the paper when published. This seems ideal, but the procedures to set up Code Ocean capsules are much more cumbersome than FigShare, Dryad, and GitHub. I hope that Code Ocean can improve over time to simplify the submission process, or other similar sites will be available to support sharing data and codes with capabilities of online execution.' - May Yuan, International Journal of Geographical Information Science
Vines in The Scholarly Kitchen.
We also did not notice a consistent trend in the peer-review times for the participating journals following the implementation of the pilot. Interestingly, we noticed a potential positive influence on article citation, although it will take some time for this picture to become clearer.

Feedback from our editors
Following the completion of the pilot, we have continued to develop our workflows and practices to facilitate best practice on data sharing, the handling of metadata and data citation. Work in this regard, as well as the implementation of more-open data-sharing policies on a wider range of journals, is ongoing.