We thank Drs. Yankaskas, Schell, and Miglioretti for their comments. First, as to the closing statement in their communication, neither we nor Dr. Brem recommended anywhere in our article (or in the accompanying editorial) a 17% recall rate as the ‘optimal’ operating point for screening mammography, and thus we do not understand how Yankaskas and colleagues concluded that we did.1
Second, our study was a relative comparison, and consequently, none of the possible sources of bias mentioned in their letter is relevant. One potential source of bias that they did not mention is the possibility that only some of the radiologists selectively reviewed subsets of our screened population (e.g., younger women, high-risk women). However, this was not the case in our study. Thus, although the population itself may have been a potential source of bias in their own study,2 their argument is not relevant to ours.
Third, we agree that mammography is ultimately limited by the technology itself, but a large number of studies clearly show that 30–70% of all cancers have some mammographic signs (potential abnormalities) depicted on previous examinations (≥ 1 year before the actual detection). Thus, cancer detection rates can improve substantially, primarily at the cost of increased recall rates, before the fundamental limit of the technology itself is reached. This is one of the primary underlying justifications for, and driving forces behind, computer-aided detection (CAD).
Fourth, optimizing positive predictive value 1 alone is not necessarily the best way to optimize all screening practices.
Finally, when there are 10 operating points (1 for each radiologist), a curve can be fit in many ways, although linear fitting is performed most commonly. Fitting these data using nonlinear models does not necessarily place us any closer to the truth (primarily because of the large amount of variability among individual readers and the limited number of points being analyzed). This is the case in our study and also in the study conducted by Yankaskas et al.2 In that study, breaking the curve resulted in only minor improvements in the quality of the fit for the ensemble of sites and, in particular, for the higher-volume sites (i.e., those for which there were > 3000 cases). Because we are not limited by the technology itself, there is no compelling justification for Yankaskas and colleagues to analyze our data (or, for that matter, theirs) in a similar fashion or to select any other nonlinear mathematical model.