Cognition and Neurosciences
Hail the impossible: p-values, evidence, and likelihood
Article first published online: 16 NOV 2010
© 2010 The Author. Scandinavian Journal of Psychology © 2010 The Scandinavian Psychological Associations
Scandinavian Journal of Psychology
Volume 52, Issue 2, pages 113–125, April 2011
How to Cite
JOHANSSON, T. (2011), Hail the impossible: p-values, evidence, and likelihood. Scandinavian Journal of Psychology, 52: 113–125. doi: 10.1111/j.1467-9450.2010.00852.x
- Issue published online: 17 MAR 2011
- Received 24 March 2010, accepted 1 August 2010
Keywords
- significance testing
- error control
Johansson, T. (2011). Hail the impossible: p-values, evidence, and likelihood. Scandinavian Journal of Psychology 52, 113–125.
Significance testing based on p-values is standard in psychological research and teaching. Research articles and textbooks typically present and use p as a measure of statistical evidence against the null hypothesis (the Fisherian interpretation), while relying on concepts and tools that stem from a completely different use of p, namely as a tool for controlling long-term decision errors (the Neyman–Pearson interpretation). There are four major problems with using p as a measure of evidence, and these problems are often overlooked in psychology. First, p is uniformly distributed under the null hypothesis and can therefore never indicate evidence for the null. Second, p is conditioned solely on the null hypothesis and is therefore unsuited to quantifying evidence, because evidence is always relative: it is evidence for or against one hypothesis relative to another. Third, p designates the probability of obtaining evidence (given the null) rather than the strength of that evidence. Fourth, p depends on unobserved data and on subjective intentions, which implies, under the evidential interpretation, that the evidential strength of observed data depends on things that did not happen and on subjective intentions. In sum, using p in the Fisherian sense as a measure of statistical evidence is deeply problematic, both statistically and conceptually, while the Neyman–Pearson interpretation is not about evidence at all. In contrast, the likelihood ratio escapes these problems and is recommended as a tool for psychologists to represent the statistical evidence that obtained data convey relative to two hypotheses.
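The first two points of the abstract can be made concrete with a small simulation. The sketch below is an illustration only and is not taken from the article: it assumes a two-sided z-test of a normal mean with known standard deviation 1, a null value of 0, and an arbitrarily chosen alternative of 0.5. The function names and all parameter values are hypothetical.

```python
import math
import random


def two_sided_p(xbar, n, sigma=1.0):
    """Two-sided z-test p-value for H0: mu = 0, assuming sigma is known."""
    z = xbar * math.sqrt(n) / sigma
    return math.erfc(abs(z) / math.sqrt(2))


def likelihood_ratio(xbar, n, mu1, mu0=0.0, sigma=1.0):
    """L(mu1) / L(mu0) for a normal mean with known sigma.

    Values above 1 favour mu1 over mu0; values below 1 favour mu0.
    """
    log_lr = n * ((xbar - mu0) ** 2 - (xbar - mu1) ** 2) / (2 * sigma ** 2)
    return math.exp(log_lr)


def simulate_p_values(n=20, reps=20000, mu=0.0, seed=1):
    """Draw `reps` samples of size n from N(mu, 1) and return their p-values."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, 1.0) for _ in range(n)) / n
        p_values.append(two_sided_p(xbar, n))
    return p_values


def decile_shares(p_values):
    """Share of p-values falling in each tenth of the interval [0, 1]."""
    counts = [0] * 10
    for p in p_values:
        counts[min(int(p * 10), 9)] += 1
    return [c / len(p_values) for c in counts]


if __name__ == "__main__":
    # Under a true null, every decile should hold roughly 10% of the p-values,
    # so a large p is no more typical of the null than a small one.
    shares = decile_shares(simulate_p_values())
    print("decile shares under H0:", [round(s, 3) for s in shares])

    # A sample mean with z = 1.96 (p close to 0.05) at n = 100, compared
    # against the assumed alternative mu1 = 0.5.
    xbar, n = 0.196, 100
    print("p-value:", round(two_sided_p(xbar, n), 3))
    print("LR (mu1 = 0.5 vs mu0 = 0):", likelihood_ratio(xbar, n, mu1=0.5))
```

In this setup the roughly flat decile shares reflect the uniform distribution of p under the null. The final comparison shows a result that would conventionally count as significant against the null while the likelihood ratio, for this particular alternative, is below 1 and so favours the null. A different choice of alternative would give a different ratio, which is the sense in which evidence is relative to a pair of hypotheses.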