Analysis and synthesis of metadata goals for scientific data
Version of Record online: 26 JUN 2012
© 2012 ASIS&T
Journal of the American Society for Information Science and Technology
Volume 63, Issue 8, pages 1505–1520, August 2012
How to Cite
Willis, C., Greenberg, J. and White, H. (2012), Analysis and synthesis of metadata goals for scientific data. J. Am. Soc. Inf. Sci., 63: 1505–1520. doi: 10.1002/asi.22683
- Issue online: 25 JUL 2012
- Manuscript Accepted: 19 FEB 2012
- Manuscript Revised: 5 FEB 2012
- Manuscript Received: 30 NOV 2011
The proliferation of discipline-specific metadata schemes contributes to artificial barriers that can impede interdisciplinary and transdisciplinary research. The authors considered this problem by examining the domains, objectives, and architectures of nine metadata schemes used to document scientific data in the physical, life, and social sciences. Using a mixed-methods content analysis and Greenberg's (2005) metadata objectives, principles, domains, and architectural layout (MODAL) framework, they derived 22 metadata-related goals from textual content describing each metadata scheme. They then identified relationships between the domains (e.g., scientific discipline and type of data) and the categories of scheme objectives. For each strong correlation (>0.6), a Fisher's exact test for nonparametric data was used to determine significance (p < .05).
Significant relationships were found between the domains and objectives of the schemes. Schemes describing observational data are more likely to have “scheme harmonization” (compatibility and interoperability with related schemes) as an objective; schemes with the objective “abstraction” (a conceptual model exists separate from the technical implementation) also have the objective “sufficiency” (the scheme defines a minimal amount of information to meet the needs of the community); and schemes with the objective “data publication” do not have the objective “element refinement.” The analysis indicates that many metadata-driven goals expressed by communities are independent of scientific discipline or the type of data, although they are constrained by historical community practices and workflows as well as the technological environment at the time of scheme creation. The analysis reveals 11 fundamental goals for metadata documenting scientific data, in support of sharing research data across disciplines and domains. The authors report these results and highlight the need for more metadata-related research, particularly in the context of recent funding agency policy changes.
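The two-step screening described in the abstract (flag a strong correlation, then apply Fisher's exact test) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the article: the abstract does not name the correlation coefficient, so the phi coefficient for two binary variables is used here; the 0.6 threshold is applied to the absolute value; and the nine presence/absence codes below are invented purely for illustration and are not the study's data. The function names are hypothetical as well.

```python
from math import comb, sqrt

def contingency(x, y):
    """Return the 2x2 counts (a, b, c, d) for two equal-length binary sequences."""
    a = sum(1 for i, j in zip(x, y) if i and j)          # both present
    b = sum(1 for i, j in zip(x, y) if i and not j)      # only x present
    c = sum(1 for i, j in zip(x, y) if not i and j)      # only y present
    d = sum(1 for i, j in zip(x, y) if not i and not j)  # both absent
    return a, b, c, d

def phi(a, b, c, d):
    """Phi coefficient of a 2x2 table (assumed here; the abstract names no coefficient)."""
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test: sum the probabilities of all tables
    (with the same margins) that are no more likely than the observed one."""
    n, row1, col1 = a + b + c + d, a + b, a + c

    def prob(k):
        # Hypergeometric probability of k in the top-left cell
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

# Hypothetical coding of nine schemes (1 = present, 0 = absent); NOT the study's data.
observational = [1, 1, 1, 1, 0, 0, 0, 0, 0]
harmonization = [1, 1, 1, 1, 0, 0, 0, 0, 1]

table = contingency(observational, harmonization)
r = phi(*table)
p = None
if abs(r) > 0.6:  # test only the strong correlations
    p = fisher_exact_two_sided(*table)
    print(f"phi={r:.2f}, p={p:.4f}, significant={p < 0.05}")
```

With only nine schemes, every expected cell count is small, which is the usual reason to prefer an exact test over a chi-square approximation.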