Statistics-based research – a pig in a poke?


  • James Penston MB BS MD MRCP

    Corresponding author
    1. Consultant Physician/Gastroenterologist, Scunthorpe General Hospital, Cliff Gardens, Scunthorpe, North Lincolnshire, UK

Dr James Penston, Scunthorpe General Hospital, Cliff Gardens, Scunthorpe, North Lincolnshire DN15 7BH, UK, E-mail:


Much of medical research involves large-scale randomized controlled trials designed to detect small differences in outcome between the study groups. This approach is believed to produce reliable evidence on which the management of patients is based. But can we be sure that the demonstration of a small, albeit statistically significant, difference is sufficient to infer the presence of a causal relationship between the drug and the outcome?

A study is claimed to have internal validity when other explanations for the observed difference – namely, inequalities between the groups, bias in the assessment of the outcome and chance – have been excluded. Despite the various processes that are put into place – including, for example, randomization, allocation concealment, double-blinding and intention-to-treat analysis – it remains doubtful whether the groups are equal in terms of all factors relevant to the outcome and whether bias has been excluded. As for the exclusion of chance, inappropriate statistical tests may be used and, more fundamentally, frequentist statistics has itself been subjected to serious criticism in recent years, which brings internal validity further into question.

But the problems do not end with the flaws in internal validity. The philosophical basis of large-scale randomized controlled trials and epidemiological studies is unsound. Close examination reveals many obstacles that threaten the inference from a small, statistically significant difference to the presence of a causal relationship between the drug and the outcome.

Given the influence of statistics-based research on the practice of medicine, it is of the utmost importance that the flaws in this methodology are brought to the fore.