The Research Data Alliance: globally co‐ordinated action against barriers to data publishing and sharing

This article discusses the drivers behind the formation of the Research Data Alliance (RDA), its current state, the lessons learned from its first full year of operation, and its anticipated impact on data publishing and sharing. One of the pressing challenges in data infrastructure (taken here to include issues relating to hardware, software and content format, as well as human actors) is how best to enable data interoperability across boundaries. This is particularly critical as the world deals with bigger and more complex problems that require data and insights from a range of disciplines. The RDA has been set up to enable more data to be shared across barriers to address these challenges. It does this through focused Working Groups and Interest Groups, formed of experts from around the world, and drawing from the academic, industry, and government sectors.


Context
The amount of activity dealing with the importance of data to research has increased perceptibly over the last five years. This includes conferences specifically focused on research data issues, 1-3 data-focused tracks at discipline conferences (too many to cite), national reports, 4,5 funder requirements, 6-10 special issues of journals, 11,12 and new journals altogether. Those wishing to read further about some of these issues can consult a selective bibliography dealing with publications about this space. 13 The foci for this activity are quite diverse: researcher behaviour, incentives and rewards, changes in the ecology of scholarly communication, technical issues, and the challenges of building and operating data infrastructure.

Role of data infrastructure
This paper will focus specifically on data infrastructure, interpreted broadly: hardware (storage and associated computer hardware), software, content and format standards, and human actors. As the bulk of the data needed by and generated by researchers is increasingly managed electronically, the role of this data infrastructure is becoming critical. One of the pressing challenges in data infrastructure is how best to enable data interoperability across boundaries. These boundaries include those between countries, between disciplines, and between producers of research data and the consumers of those data. A new organization, the Research Data Alliance (RDA), has been brought into existence specifically to address those boundaries from an infrastructure perspective.
which led to an EU-US workshop in Lyon in October 2011. By 2012 two separate activities were underway to address the challenge of how best to co-ordinate infrastructure support for data interoperability. In Europe, the effort was called the Data Access Interoperability Task Force (DAITF). In the US, the National Science Foundation (NSF), together with the National Institute of Standards and Technology (NIST), was developing a proposal for a DataWeb Forum. These two proposals were then discussed at the International Conference on Research Infrastructure in Copenhagen in March 2012, with a presentation on the DataWeb Forum proposal being made at the DAITF workshop held at that event. Part of the inspiration behind both proposals was the success of the grass-roots Internet Engineering Task Force (IETF) organization 14 and its processes.

Coming together
Individuals associated with both activities attended an EU-Australian workshop on research data infrastructure in Brussels in June 2012 and found themselves talking about shared concerns over morning tea. At additional conferences and meetings in Copenhagen and Barcelona in October 2012, supported by the NSF and the EU Commission, both the funders and those active at these meetings discussed how to combine the two strands of activity and also how to involve Australia (in large part because of its commitment to research data infrastructure expressed through the funding of the Australian National Data Service 21 from 2009 onwards). A series of regular and intense videoconferences in the last quarter of 2012 and the first quarter of 2013 resulted in the creation of the RDA. This involved developing governance arrangements, building an initial website, establishing a regular schedule of meetings during the startup phase, and arranging the First Plenary meeting of the new organization.

Funder commitment
The NSF provided primary funding for RDA/US (RDA members from the US) to support the development and operations of the RDA organization, build the RDA community within the US, and provide participant support for key US members to attend RDA events. The NSF also provided funding to a number of additional data projects and infrastructure efforts to support the sharing of data across disciplines and geographic boundaries. NSF and NIST supported the hosting of the Second Plenary.
The EC funded a project called iCORDI (later renamed RDA/EU), commencing in September 2012, to provide similar functions in a European context, and to support the First Plenary.
The Australian Commonwealth government provided additional funding to the Australian National Data Service (ANDS 15) to take part in RDA activities, and to support the Third Plenary.
At the time of writing (mid-2014) these commitments are all current, but are also finite in length. The RDA is seeking other research funding agencies that might also wish to join the RDA Colloquium (the group of those agencies that are currently contributing to the RDA).

Vision, mission and principles
The ultimate RDA vision is a world where researchers and innovators openly share data across technologies, disciplines, and countries to address the grand challenges of society. In support of this vision, RDA sees as its mission to work to build the social and technical bridges that enable this open sharing of data.
Membership of the RDA is open to anyone who agrees to support its guiding principles.
LEARNED PUBLISHING VOL. 27 SPECIAL ISSUE SEPTEMBER 2014
[...] balanced representation of its membership and stakeholder communities.
• Harmonization - The RDA works to achieve harmonization across data standards, policies, technologies, infrastructure, and communities.
• Community-driven - The RDA is a public, community-driven body constituted of volunteer members and organizations, supported by the RDA Secretariat.
• Non-profit - The RDA does not promote, endorse, or sell commercial products, technologies, or services.
The RDA accomplishes its mission through two important mechanisms: Working Groups and Interest Groups.

Working Groups
Working Groups are 'comprised of experts from the community that are engaged in creating deliverables that will directly enable data sharing, exchange, or interoperability' [emphasis added]. RDA endorsement is dependent upon the Working Group committing to produce deliverables within an 18-month time frame that will be implemented and adopted by one or more specific communities. Working Group deliverables include, but are not limited to, technical specifications and implementation practices, conceptual models or frameworks, implemented policies, and other documents and practices that improve data exchange. Working Groups and their deliverables undergo a 'community review process'. 17 Note that the expectation is that a Working Group will form quickly, tackle an achievable, defined piece of work, deliver a solution, get the solution adopted, and then disband. A wide range of Working Groups are currently active. 18

Interest Groups
Interest Groups remain in operation as long as they remain active, subject to periodic evaluation of their activity and its relevance to RDA aims. An Interest Group that has been inactive for six months will be asked to disband. With respect to function and outcomes, Interest Groups may do one or more of the following:

Governance
RDA activity began in earnest in August 2012 with the establishment of an international Steering Group by funding agencies from the US, EU and Australia. During the startup phase, the Steering Group was charged with developing the RDA by defining its charter and organizational structures, and with promoting its aims and mustering support for its activities. Now that the organization is up and running, the governance arrangements have largely transitioned to a number of bodies with specific responsibilities.
The RDA Council is responsible for the oversight, sustainability, and overall success of the RDA. The Council's responsibilities include approval of candidate Interest and Working Groups to ensure alignment with RDA goals. Administration for the RDA is carried out by the Secretariat, led by a Secretary General. The Technical Advisory Board provides technical expertise and advice to the Council. It also assists in developing and reviewing RDA Interest and Working Groups to promote their impact and effectiveness. The Organizational Advisory Board (OAB) provides organizational advice to the RDA Council on the directions, processes, and mechanisms of the RDA. 20

Lessons learned
So, a little over one year into the evolution of the RDA, what have we learned?

The right thing at the right time
While the progress of the RDA since its official birth in March 2013 is no doubt due in part to those involved in its creation, its success is also due to its timing and to changes underway in the research system. The founding group, responding to this, shared a sense of the importance of research data, the need to better co-ordinate solutions to the barriers to its wider use, and an urgent sense that the time to act was now.

Facilitating conversations
A critical step in reducing the barriers to data interoperability is meeting with others who have the same challenges in data reuse or who are developing solutions that might be applicable. This requires the ability to have effective conversations with others facing the same challenges. A real part of the value of RDA is in its provision of physical (plenaries) and online (the RDA website and its organic group spaces) locations to have these conversations.

Momentum begets momentum
The initial momentum enjoyed by the RDA at its first two plenaries (and all of the associated Interest Group and Working Group activity) has become (in the short term at least) self-perpetuating. People come to RDA plenaries because other people come to RDA plenaries. Presumably, this will eventually be limited by the number of people who see the RDA as useful to them (or by the availability of venues with enough space for 10-20 parallel working sessions).

Providing a nexus
Another benefit of the success of RDA is that it is rapidly becoming the place to meet with colleagues who also work with research data. At the last plenary a number of attendees commented that if all the RDA did was to bring data practitioners together twice a year this would still be a useful function. The value of creating this concentration of people is apparent from the list of collocated events planned around Plenary Four in September 2014. 21 At the time of writing, this numbered ten events, and is likely to rise.

Challenges of globality
Of course, any new organization comes with teething pains, and the RDA is no exception. One of the more significant challenges is turning its global ambitions into reality. While the attendees at Plenary Three came from 35 countries, the bulk came from the US and Western Europe. Those regions are also overrepresented in the membership of the various governance groups. And the ability of those who cannot travel to plenary events to contribute effectively to the face-to-face sessions is highly constrained. The RDA leadership recognizes these challenges, and is actively working to address them. But there is still quite a way to go.

Conclusion
Of course, none of this activity (while exciting) will ultimately matter if it does not make a difference. The real test for the RDA will be whether it actually reduces barriers to data interoperability. This will depend on the outputs of RDA Working Groups being taken up. And this, in turn, depends on RDA Working Groups successfully completing their work. A number are on track to do so at Plenary Four in September 2014, and this event will feature a showcase of their outputs. The next task is then to promote wider uptake and reduce actual barriers to data sharing. As indicated earlier, the IETF provided inspiration to the founders of the RDA. The unofficial motto of the IETF is often quoted as 'rough consensus and running code'. 14 The equivalent for the RDA might be 'rough consensus and exchanged data'. If the RDA is going to make a difference, this will need to be true. All those involved in the organization at the moment are working hard to bring about this result.