The performance of a diagnostic test is ideally evaluated by comparing the test results with a gold standard for every patient in a study. In practice, however, the gold standard often goes unverified (is missing) for a subset of study patients because of ethical or cost considerations. Sensitivity and specificity are commonly used as the measures of test performance, and a joint confidence region (CR) for the two can summarize the precision of their estimates. In this paper, we present an approach to sample size computation for designing a study in which the gold standard is considered to be missing at random (MAR). We calculate the increase in sample size needed to ensure that the joint CR under MAR falls inside the boundaries of the joint CR derived for data with no missingness. Copyright © 2010 John Wiley & Sons, Ltd.