On information quality

Summary

We define the concept of information quality ‘InfoQ’ as the potential of a data set to achieve a specific (scientific or practical) goal by using a given empirical analysis method. InfoQ is different from data quality and analysis quality, but is dependent on these components and on the relationship between them. We survey statistical methods for increasing InfoQ at the study design and post-data-collection stages, and we consider them relative to what we define as InfoQ. We propose eight dimensions that help to assess InfoQ: data resolution, data structure, data integration, temporal relevance, generalizability, chronology of data and goal, construct operationalization and communication. We demonstrate the concept of InfoQ, its components (what it is) and assessment (how it is achieved) through three case-studies in on-line auctions research. We suggest that formalizing the concept of InfoQ can help to increase the value of statistical analysis and data mining, both methodologically and practically, thus contributing to a general theory of applied statistics.
