How study design affects outcomes in comparisons of therapy. I: Medical


  • Dr. Graham A. Colditz,

    Corresponding author
    1. Channing Laboratory, Department of Medicine, Harvard Medical School and Brigham and Women's Hospital, 180 Longwood Avenue, Boston, MA 02115, U.S.A.
    • Channing Laboratory, 180 Longwood Avenue, Boston, MA 02115, U.S.A.
  • James N. Miller,

    1. Center for Science and International Affairs, John F. Kennedy School of Government, Harvard University, Cambridge, MA 02138, U.S.A.
  • Frederick Mosteller

    1. Technology Assessment Group, Department of Health Policy and Management, Harvard University School of Public Health, 677 Huntington Avenue, Boston, MA 02115, U.S.A.


We analysed 113 reports published in 1980 in a sample of medical journals to relate features of study design to the magnitude of gains attributed to new therapies over old. Overall we rated 87 per cent of new therapies as improvements over standard therapies. The mean gain (measured by the Mann-Whitney statistic) was relatively constant across study designs, except for non-randomized controlled trials with sequential assignment to therapy, which showed a significantly higher likelihood that a patient would do better on the innovation than on standard therapy (p = 0.004). Randomized controlled trials that did not use a double-blind design had a higher likelihood of showing a gain for the innovation than did double-blind trials (p = 0.02). Any evaluation of an innovation may reflect both bias and the true efficacy of the new therapy; we may therefore consider making adjustments for the average bias associated with a study design. When interpreting an evaluation of a new therapy, readers should consider the impact of the following average adjustments to the Mann-Whitney statistic: a decrease of 0.15 for trials with non-random sequential assignment, and a decrease of 0.11 for non-double-blind randomized controlled trials.
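As a rough illustration of the adjustment described in the abstract, the sketch below estimates the Mann-Whitney statistic as the proportion of innovation/standard patient pairs in which the innovation patient does better (ties counted as one half), then subtracts the average adjustment for the study design. The function names, the design labels, and the outcome scores are hypothetical and chosen only for illustration; the two adjustment values are the ones quoted in the abstract, and treating all other designs as needing no adjustment is an assumption of this sketch, not a statement of the authors' procedure.

```python
from itertools import product

# Average adjustments quoted in the abstract.
# The dictionary keys are hypothetical labels, not terms from the paper.
DESIGN_ADJUSTMENT = {
    "nonrandom_sequential": 0.15,   # non-random sequential assignment
    "rct_not_double_blind": 0.11,   # randomized, but not double-blind
}


def mann_whitney_statistic(innovation, standard):
    """Estimate P(innovation outcome > standard outcome), ties counted as 1/2.

    Assumes higher scores mean better outcomes.
    """
    if not innovation or not standard:
        raise ValueError("both groups need at least one outcome")
    wins = 0.0
    for x, y in product(innovation, standard):
        if x > y:
            wins += 1.0
        elif x == y:
            wins += 0.5
    return wins / (len(innovation) * len(standard))


def adjusted_statistic(raw, design):
    """Subtract the average design adjustment; unknown designs are left as-is
    (an assumption of this sketch)."""
    return raw - DESIGN_ADJUSTMENT.get(design, 0.0)


# Invented outcome scores, for illustration only.
innovation = [7, 8, 6, 9]
standard = [5, 7, 6, 4]

raw = mann_whitney_statistic(innovation, standard)
adjusted = adjusted_statistic(raw, "nonrandom_sequential")
print(f"raw = {raw:.3f}, adjusted = {adjusted:.3f}")
```

A value of 0.5 corresponds to no difference between therapies, so an adjusted value that falls close to 0.5 suggests the apparent gain may be largely attributable to the design rather than to the innovation.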