To the Editor:
Ecotoxicology is a relatively new scientific discipline that aims at examining the fate and effects of chemicals in the environment. It is an applied science, where most studies are conducted under standardized laboratory conditions. In laboratory studies, individuals, or to a lesser extent, populations, of a given taxon (mainly a few standard test organisms) are exposed to a single chemical or a mixture of chemicals for a predefined time period 1, 2. Ecotoxicological risk assessment is primarily based on experimental data. Given the large number of compounds in commerce, it seems impossible to generate ecotoxicity data for all combinations of compounds and relevant biotic endpoints (spanning from the molecular to the ecosystem level) 3. Therefore, it is crucial to develop modeling frameworks that allow for the prediction of effects if no or only limited experimental data are available 4. Such modeling frameworks include quantitative structure-activity relationships (QSARs), toxicokinetic and toxicodynamic models, and matrix population and individual-based eco(toxico)logical models 5.
For the advancement and validation of these and other models, independent raw data sets, rather than descriptive statistics (e.g., central values), are required. The conditions for resolving several important scientific gaps, such as the extrapolation from the laboratory to the field or from biomarker responses to population- and community-level effects, could be improved considerably if raw data were freely accessible and could be pooled for the necessary (meta)analyses.
Sharing of raw data by governmental authorities, industry, and academia, which together represent the tripartite structure of SETAC, is limited. Governmental authorities rely mainly on public funds for the generation of data, but these data are often inaccessible to the public. While the Freedom of Information Act guarantees quick access to data in some countries, such as the United States and Australia, the success and duration of data requests to authorities in other countries are at least partly driven by the availability and willingness of staff to allocate time. Moreover, authorities often decide on the type and resolution level of the raw data they make accessible on a case-by-case basis. For example, requests for governmental biomonitoring data in Germany by one of the authors (R.B. Schäfer) took from a few weeks to almost a year, depending on the federal state. Additionally, the quantity of delivered data varied dramatically, with one federal state refusing to provide any data. Chemical exposure data from the European Union (http://www.eea.europa.eu/data-and-maps/data/waterbase-water-quantity-6) and the United States (http://www.epa.gov/storet/dw_home.html) represent shining examples of public access to governmental data. By contrast, only a few biomonitoring data sets are available in publicly accessible databases, though these data sets would allow for a spatial or temporal assessment of ecotoxicological effects caused by chemical stressors in the environment.
Although academia would undoubtedly benefit from access to raw data, it has not taken the initiative, despite recommendations 6. Several ecological and multidisciplinary journals, including many prestigious high-impact journals such as Nature or Science, require authors to make all their raw data accessible. For this purpose, data repository systems such as DRYAD (http://datadryad.org) and PANGAEA (http://pangaea.de) have been established.
Unfortunately, industry data produced for environmental risk assessments of chemicals are rarely published in scientific journals or available from government databases. Results of mesocosm studies, for example, would benefit research on community structure, function responses, and other issues. Despite the advantages that may arise from the increased availability of experimental data, requests for data access are often rejected 7. Given that most concerns about data sharing can be allayed (7; see also below), we believe the benefits of increased access to data outweigh the concerns. Below are approaches to improve access to data.
Data sets collected during projects financed by public resources should be available on request after project partners have been given sufficient time to publish. Limits on the response time to data requests should be established. Meta-databases should be established so that data can be identified easily. Finally, appropriate data quality documentation, data preservation, and data sharing should be criteria for future funding.
Scientists should act as role models by making research data publicly available. Unfortunately, only 30% of requested data sets were made available in the field of global change biology 7. Possible concerns of scientists ranged from insufficient attribution and credit for the data and the loss of the opportunity to publish one's own data to the time and costs of data sharing. Unequivocal rules and database tools could mitigate these and additional concerns. For example, raw data can be withheld until the original investigation is published (see GenBank; http://www.ncbi.nlm.nih.gov/genbank/). In the long term, a culture of sharing data and computer code could develop in which references and acknowledgements to the original owners of data sets and code would guarantee the visibility of the latter 8, 9. Appropriate tools, such as DRYAD, would ensure that only registered users were able to download data, and that the personal time and financial costs associated with data and code sharing were minimized. Given that change is often a question of generations, it is important that current graduate students are exposed to such ideas, fostering the necessary cultural change in our scientific community. A variety of tools integrating data analysis, data management, and text processing software, for example, ROpenSci (http://ropensci.org), are available or under development and could be included in teaching to support the implementation of open science.
Possibly the largest number of data sets is produced for the environmental risk assessment of chemicals. Given their importance for theory development and considering their scarcity in the public domain, these data sets should be made available for research. As outlined for the academic sector, it is certainly possible to develop appropriate procedures for data sharing that ensure trade secrets remain protected. It is our view that taking social responsibility and the precautionary principle seriously means making these data available for public research. Given the tripartite structure of our society, which brings together researchers from business, government, and academia, SETAC is best placed to discuss ways to create a culture of increased data and computer code accessibility and, hence, to promote an improved understanding of the conditions for the environmental safety of chemicals.