• comparative effectiveness;
• dichotomization;
• measurement error;
• responder analysis;
• threshold-defined outcome

In medical studies, the outcome of interest is often whether the long-term level of a risk factor exceeds a threshold. In practice, this long-term level may not be directly measurable. Instead, outcome variables are based on one or more biomedical measurements that have substantial variability. This variability is due to measurement error in a strict sense, true day-to-day variability, or a combination of the two. Estimates of prevalence based on such outcomes are biased: some individuals with long-term levels below the threshold will be diagnosed, and some with long-term levels above the threshold will not be diagnosed. From a public health point of view, this is a relatively minor concern; it is much less important than the fact that many individuals are not tested at all. However, in comparative effectiveness studies, such as clinical trials evaluating a new treatment against a placebo or a gold standard treatment, the combination of a noisy measurement and a threshold can distort a study's conclusions in important ways. Using simulations and theoretical formulas, we systematically describe the bias in the prevalence difference and the prevalence ratio between arms and its effect on trial conclusions. Copyright © 2012 John Wiley & Sons, Ltd.
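The mechanism described above can be illustrated with a minimal sketch. It is not the authors' simulation design or their formulas. It assumes, purely for illustration, that long-term levels in each arm are normally distributed with a common between-person standard deviation, and that a single measurement adds independent normal noise. The function names, the threshold of 150, the arm means, and the standard deviations are all hypothetical values chosen for the example.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prevalence(mean, sd, threshold):
    """Proportion of a Normal(mean, sd^2) population above the threshold."""
    return 1.0 - normal_cdf((threshold - mean) / sd)


def compare_arms(mean_control, mean_treated, sd_between, sd_within, threshold):
    """Return true and observed prevalence, difference, and ratio by arm.

    Assumption (not from the source): observed value = long-term level
    + independent normal noise with standard deviation sd_within, so the
    observed values have standard deviation sqrt(sd_between^2 + sd_within^2).
    """
    sd_observed = math.sqrt(sd_between ** 2 + sd_within ** 2)
    result = {}
    for label, sd in (("true", sd_between), ("observed", sd_observed)):
        p_control = prevalence(mean_control, sd, threshold)
        p_treated = prevalence(mean_treated, sd, threshold)
        result[label] = {
            "control": p_control,
            "treated": p_treated,
            "difference": p_treated - p_control,
            "ratio": p_treated / p_control,
        }
    return result


# Hypothetical inputs: treatment lowers the mean long-term level by 10 units.
summary = compare_arms(
    mean_control=140.0,
    mean_treated=130.0,
    sd_between=10.0,
    sd_within=10.0,
    threshold=150.0,
)

for label, values in summary.items():
    print(label, {name: round(value, 4) for name, value in values.items()})
```

Under these assumed inputs, the prevalence based on a single noisy measurement differs from the prevalence based on long-term levels in both arms, and the prevalence ratio and prevalence difference between arms shift accordingly. The direction and size of the shift depend on where the threshold sits relative to each arm's distribution.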