The area under the receiver operating characteristic (ROC) curve is frequently used as a measure of the effectiveness of diagnostic markers. In this paper we discuss and compare procedures for estimating this area, based on (i) the Mann–Whitney statistic; (ii) kernel smoothing; (iii) normal assumptions; and (iv) empirical transformations to normality. The procedures are compared in terms of bias and root mean square error across a wide variety of situations by means of an extensive simulation study. Overall, we find that transforming to normality is usually preferable, except in bimodal cases, where kernel methods can be effective. Copyright 2002 John Wiley & Sons, Ltd.
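
As an illustration of approaches (i) and (iii), the following is a minimal sketch and not the implementation used in the study. It assumes that larger marker values indicate disease, that ties count one half, and that sample variances are used under the normal model. The function names `auc_mann_whitney` and `auc_binormal` are hypothetical.

```python
import math
from statistics import mean, variance


def auc_mann_whitney(healthy, diseased):
    """Empirical AUC: share of (healthy, diseased) pairs in which the
    diseased value is larger, with ties counted as one half.
    Assumption: larger marker values indicate disease."""
    total = 0.0
    for x in healthy:
        for y in diseased:
            if y > x:
                total += 1.0
            elif y == x:
                total += 0.5
    return total / (len(healthy) * len(diseased))


def auc_binormal(healthy, diseased):
    """AUC under normal assumptions for both groups:
    Phi((mean_D - mean_H) / sqrt(var_D + var_H)).
    Assumption: sample variances (n - 1 denominator) are plugged in."""
    delta = mean(diseased) - mean(healthy)
    scale = math.sqrt(variance(diseased) + variance(healthy))
    z = delta / scale
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

For example, with healthy values `[1, 2, 3, 4]` and diseased values `[3, 4, 5, 6]`, `auc_mann_whitney` returns 0.875, while `auc_binormal` returns a value of roughly 0.86. The two estimates differ because the second relies on the normal model rather than on pairwise ranks.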