Big data and its epistemology



The article considers whether Big Data, in the form of data-driven science, will enable the discovery, or appraisal, of universal scientific theories, instrumentalist tools, or inductive inferences. It points out, initially, that such aspirations are similar to the now-discredited inductivist approach to science. On the positive side, Big Data may permit larger sample sizes, cheaper and more extensive testing of theories, and the continuous assessment of theories. On the negative side, data-driven science encourages passive data collection, as opposed to experimentation and testing, and hornswoggling (“unsound statistical fiddling”). The roles of theory and data in inductive algorithms, statistical modeling, and scientific discoveries are analyzed, and it is argued that theory is needed at every turn. Data-driven science is a chimera.