Over the past 50 years, the DSM and ICD revision processes have focused almost exclusively on refining the definitions of diagnostic categories. The process by which patients are assigned a categorical diagnosis in routine clinical practice has received much less attention. Starting with the publication of DSM-I in 1952 and continuing with DSM-II and the mental disorders sections of ICD-8 and ICD-9, standardized glossary definitions were provided as the basis for assigning a diagnosis, essentially establishing a prototype-like approach as the standard method of psychiatric diagnosis. The problematic diagnostic reliability of glossary definitions (reviewed in 1) prompted researchers in the 1970s to develop explicit inclusion and exclusion criteria such as the Research Diagnostic Criteria (RDC) 2. The demonstration that the inter-rater reliability of these operationalized criteria was superior to that of the DSM-II glossary definitions 3,4 led to the provision of diagnostic criteria for every disorder in DSM-III, with the stated hope that they would “improve the reliability and validity of routine psychiatric diagnosis” 4.
However, despite the widespread utilization of the DSM diagnostic criteria by the research community, anecdotal and indirect evidence suggests that clinicians routinely fail to use them in everyday clinical practice. Although no studies have examined how the DSM is actually used in clinical practice settings, several studies have demonstrated significant discrepancies between DSM diagnoses made by clinicians in practice and diagnoses made using structured diagnostic interviews that systematically evaluate the DSM criteria 5,6. The authors of these studies concluded that the problem was due to misapplication of the diagnostic criteria and recommended that clinicians receive additional diagnostic training to improve their accuracy. The more likely scenario, however, is that experienced clinicians do not methodically evaluate every relevant DSM criterion but instead match the patient's symptomatology to a mental prototype. Studies demonstrating clinician preference for prototype matching over criterion counting (e.g., 7) further suggest that clinicians find prototype matching to be more concordant with the way they make psychiatric diagnoses. In recognition of the fact that diagnostic assessment using operationalized criteria is best suited to research settings, ICD-10 provides two versions of its classification of mental disorders: the Clinical Descriptions and Diagnostic Guidelines (CDDG) for clinical use, and a parallel system of diagnostic criteria for research use.
Although Westen's proposal to shift from a criterion-based to a prototype matching system is a step in the right direction with regard to improving user acceptability of the diagnostic system, from a practical perspective there are significant drawbacks to its proposed implementation. The most problematic aspect concerns the requirement that clinicians rate the degree of prototype matching on a 5-point scale ranging from 1 (little or no match) to 5 (very good match). Recognizing the necessity of extracting a categorical diagnosis for both clinical and coding purposes, Westen proposes that the top two levels (i.e., 4 and 5) indicate the presence of the diagnosis. The entire disorder/non-disorder differentiation thus rests on the clinician's ability to distinguish between a rating of 3 (the highest non-disorder rating, which he defines as “patient has significant features of this disorder”) and a rating of 4 (defined as “patient has this disorder,” italics in original). No guidance is provided as to how many of the patient's clinical features would have to match the prototype in order to justify the clinician's judgment that the patient has the disorder.
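The decision rule described above can be written out in a few lines, which makes plain how much weight falls on the single step from 3 to 4. This is only an illustrative sketch: the function and constant names are hypothetical and are not part of Westen's proposal.

```python
# Illustrative sketch only; names are hypothetical, not from Westen's proposal.
# Ratings run from 1 (little or no match) to 5 (very good match).
DIAGNOSIS_THRESHOLD = 4  # ratings of 4 and 5 indicate presence of the diagnosis


def categorical_diagnosis(prototype_rating: int) -> bool:
    """Collapse a 5-point prototype-match rating into disorder / no disorder."""
    if not 1 <= prototype_rating <= 5:
        raise ValueError("prototype rating must be between 1 and 5")
    return prototype_rating >= DIAGNOSIS_THRESHOLD


# Show the categorical outcome for every possible rating.
for rating in range(1, 6):
    print(rating, categorical_diagnosis(rating))
```

Every rating other than 3 or 4 is far from the cut point, so the categorical outcome depends entirely on an unanchored judgment between those two adjacent values.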
Given that prototype matching is considered to be the method used in DSM-II 8, a possible objection to the adoption of a prototype approach concerns its reliability. The conventional wisdom is that DSM-II diagnoses are significantly less reliable than those made using DSM-III criteria, a contention based on comparisons of pre-DSM-III reliability data from the 1960s with reliability data obtained using DSM-III criteria. In fact, there is virtually no evidence demonstrating that DSM-III, when used in clinical settings, represents a significant improvement over DSM-II in terms of diagnostic reliability. Because differences in experimental design such as method of reliability determination, training of clinician participants, and base rates of diagnoses can significantly impact measured reliability, comparing DSM-II and DSM-III reliability obtained using different methodologies is like comparing apples and oranges. The only study that compared DSM-II and DSM-III diagnoses head-to-head 9 failed to demonstrate any differences in reliability. Moreover, most of the deficiencies in the DSM-II definitions cited by Spitzer et al. 4 as likely sources of unreliability to be addressed in the construction of criteria sets, such as not distinguishing those features that are invariably present from those that are commonly but not invariably present, can easily be incorporated into clinical descriptions of disorders without the need to use operationalized criteria. Indeed, the reliability field trials of ICD-10 CDDG 10, which, like DSM-II, used clinical descriptions rather than diagnostic criteria, demonstrated that satisfactory reliability can be achieved without using diagnostic criteria.
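The point about base rates can be made concrete with a small worked example. The sketch below assumes that reliability is summarized with Cohen's kappa, a commonly used chance-corrected agreement statistic; the text above does not name a statistic, and the patient counts are invented purely for illustration. In both scenarios two raters agree on 90 of 100 patients, and only the prevalence of the diagnosis differs.

```python
# Illustrative sketch: Cohen's kappa is an assumed reliability measure,
# and all counts below are invented for demonstration.

def cohen_kappa(both_yes, rater1_only, rater2_only, both_no):
    """Cohen's kappa for two raters making a present/absent diagnosis."""
    n = both_yes + rater1_only + rater2_only + both_no
    observed = (both_yes + both_no) / n          # raw agreement
    r1_yes = (both_yes + rater1_only) / n        # rater 1 base rate
    r2_yes = (both_yes + rater2_only) / n        # rater 2 base rate
    # Agreement expected by chance given each rater's base rate.
    expected = r1_yes * r2_yes + (1 - r1_yes) * (1 - r2_yes)
    return (observed - expected) / (1 - expected)


# Same 90% raw agreement, different base rates of the diagnosis.
common = cohen_kappa(both_yes=45, rater1_only=5, rater2_only=5, both_no=45)
rare = cohen_kappa(both_yes=5, rater1_only=5, rater2_only=5, both_no=85)

print(round(common, 2))  # 0.8  (diagnosis present in about half the sample)
print(round(rare, 2))    # 0.44 (diagnosis present in about a tenth)
```

Under these assumptions, identical rater behavior yields a markedly lower kappa when the diagnosis is rare, which is one reason reliability figures from studies with different samples and designs cannot be compared directly.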
The ICD-11 CDDG approach is similar to Westen's prototype matching in that it eschews defining disorders in terms of pseudoprecise criteria with arbitrary thresholds and instead has the clinician decide whether the diagnostic features outlined in the clinical description match those of the patient. However, unlike Westen's prototype matching system, which offers only a textual paragraph describing the typical patient, the ICD-11 approach conveys a considerable amount of clinically relevant information about the disorder, including both essential and typical features, differential diagnosis, the boundary with normality, typical course, and developmental and culture-specific features.
Uniform guidelines have been developed for the ICD-11 working groups with the goal of improving consistency across categories and increasing clinical utility. As noted above, diagnostic definitions expressed as clinical descriptions can attain diagnostic reliability comparable with what clinicians achieve using diagnostic criteria, a finding that is to be tested again in the forthcoming ICD-11 field trials.