An overview is presented of the rationale, design, and analysis plan for the WMH-CIDI clinical calibration studies. Because no clinical gold standard assessment is available for the DSM-IV disorders assessed in the WMH-CIDI, we adopted the goal of calibration rather than validation; that is, we asked whether WMH-CIDI diagnoses are ‘consistent’ with diagnoses based on a state-of-the-art clinical research diagnostic interview, the Structured Clinical Interview for DSM-IV (SCID), rather than whether they are ‘correct’. Consistency is evaluated both at the aggregate level (consistency of WMH-CIDI and SCID prevalence estimates) and at the individual level (consistency of WMH-CIDI and SCID diagnostic classifications). Although conventional statistics (sensitivity, specificity, Cohen's κ) are used to describe diagnostic consistency, an argument is made for considering the area under the receiver operating characteristic curve (AUC) to be a more useful general-purpose measure of consistency. In addition, more detailed analyses are used to evaluate consistency on a substantive level. These analyses begin by estimating prediction equations in a clinical calibration subsample, with WMH-CIDI symptom-level data used to predict SCID diagnoses; the coefficients from these equations are then used to assign predicted probabilities of SCID diagnoses to each respondent in the remainder of the sample. Substantive analyses then investigate whether estimates of prevalence and associations based on WMH-CIDI diagnoses are consistent with those based on predicted SCID diagnoses. Multiple imputation is used to adjust estimated standard errors for the imprecision introduced by SCID diagnoses being imputed under a model rather than measured directly. The approach is illustrated briefly by comparing the precision of SCID and predicted SCID estimates of prevalence and correlates under varying sample designs. Copyright © 2004 Whurr Publishers Ltd.
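
The individual-level consistency statistics named in the abstract (sensitivity, specificity, Cohen's κ, and the AUC) can be made concrete with a minimal Python sketch. The toy data and the function names `agreement_stats` and `auc` are illustrative assumptions, not material from the calibration studies; the SCID classification is treated as the reference, and the AUC is computed with the standard rank (Mann-Whitney) formulation.

```python
from typing import Dict, Sequence


def agreement_stats(cidi: Sequence[int], scid: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and Cohen's kappa for two binary
    classifications, with SCID treated as the reference."""
    pairs = list(zip(cidi, scid))
    tp = sum(1 for c, s in pairs if c == 1 and s == 1)
    tn = sum(1 for c, s in pairs if c == 0 and s == 0)
    fp = sum(1 for c, s in pairs if c == 1 and s == 0)
    fn = sum(1 for c, s in pairs if c == 0 and s == 1)
    n = tp + tn + fp + fn

    observed = (tp + tn) / n  # observed agreement
    # Agreement expected by chance from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "kappa": (observed - expected) / (1 - expected),
    }


def auc(scores: Sequence[float], scid: Sequence[int]) -> float:
    """Area under the ROC curve: the probability that a randomly chosen
    SCID case scores higher than a randomly chosen non-case (ties count half)."""
    cases = [x for x, s in zip(scores, scid) if s == 1]
    noncases = [x for x, s in zip(scores, scid) if s == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in cases for q in noncases)
    return wins / (len(cases) * len(noncases))


# Hypothetical toy data: 1 = diagnosis present, 0 = absent.
scid = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
cidi = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

stats = agreement_stats(cidi, scid)
auc_value = auc(cidi, scid)
print(stats, auc_value)
```

For a dichotomous classification such as the toy `cidi` vector, the AUC equals the mean of sensitivity and specificity. Passing a continuous score (for example, a predicted probability) as `scores` summarizes consistency across all possible thresholds.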