The use of scientific data: A content analysis



To improve our understanding of how scientific data users use data, a behavior that has rarely been studied in the fourth paradigm of science (data-intensive science), this study conducted a content analysis of publications associated with a frequently cited data-intensive science project, the Sloan Digital Sky Survey (SDSS). We analyzed 200 SDSS-related publications and identified the data used in each publication. Within the scope of the SDSS project, we found that (1) nearly half of the studies used only one data source, and only a few were able to use three or more; (2) studies that analyzed a small number of objects are the norm; (3) users are not only consumers of scientific data but also data producers; and (4) studies that can utilize multiple large-scale data sources are rare.