Portions of this work were presented at the 2011 Annual Meeting of the Eating Disorders Research Society in Edinburgh, Scotland.
Comparing operational definitions of DSM-5 anorexia nervosa for research contexts
Article first published online: 6 SEP 2013
Copyright © 2013 Wiley Periodicals, Inc.
International Journal of Eating Disorders
Volume 47, Issue 1, pages 76–84, January 2014
How to Cite
Brown, T. A., Holland, L. A. and Keel, P. K. (2014), Comparing operational definitions of DSM-5 anorexia nervosa for research contexts. Int. J. Eat. Disord., 47: 76–84. doi: 10.1002/eat.22184
- Issue published online: 10 DEC 2013
- Manuscript Accepted: 6 AUG 2013
- Manuscript Revised: 1 AUG 2013
- Manuscript Received: 10 APR 2013
- National Institute of Mental Health. Grant Number: R01 MH63758
- anorexia nervosa;
- operational definitions;
- eating disorder
DSM-5 anorexia nervosa (AN) criteria include several changes that increase reliance on clinical judgment. However, research contexts require operational definitions that can be applied reliably and that demonstrate validity. The present study evaluated different operational definitions for DSM-5 AN.
DSM-5 AN criteria were applied to diagnostic interview data from 364 women varying two features: threshold for determining low weight for Criterion A (body mass index [BMI] <17.0 kg/m2 vs. <18.5 kg/m2) and explicit endorsement of weight phobia (Criterion B explicit vs. inferred). Resulting groups of individuals with DSM-5 AN were compared on estimated frequency. In addition, AN groups were compared to non-eating disorder controls and individuals with an other specified feeding or eating disorder (OSFED) on external validators.
All operational DSM-5 definitions produced higher lifetime frequency estimates than reported for DSM-IV AN, with a particularly large increase associated with the broadest definition. All definitions produced significant differences in comparison to controls on external validators that were associated with medium to large effect sizes. Only definitions that required a lower weight threshold or explicit endorsement of weight phobia demonstrated significant differences compared to OSFED on external validators, and these were of small effect size. The specific combination of BMI <18.5 kg/m2 with inferred weight phobia exhibited few meaningful distinctions from the OSFED group.
To balance inclusivity, syndromal reliability, and validity, an operational definition for DSM-5 AN in research contexts should define low weight as BMI <18.5 kg/m2 and require measurable rather than inferred weight phobia. © 2013 Wiley Periodicals, Inc. (Int J Eat Disord 2014; 47:76–84)
With the recent release of the fifth iteration of the Diagnostic and Statistical Manual (DSM-5), diagnostic criteria for anorexia nervosa (AN) have undergone several changes to help reduce the preponderance of DSM-IV eating disorder not otherwise specified (EDNOS). These changes include clarifications for DSM-IV Criteria A (low weight; see Ref. ) and B (weight phobia; see Ref. ), as well as the removal of Criterion D (amenorrhea; see Ref. ). No changes have been made to Criterion C (body image disturbance, undue influence of weight or shape on self-evaluation, or the denial of seriousness of low weight). In considering these changes, it is important to ascertain that increasing the number of individuals who are diagnosed with AN does not decrease the ability to distinguish these individuals from those without an eating disorder or those diagnosed with a DSM-5 other specified feeding or eating disorder (OSFED). Several studies have demonstrated that removing the requirement for amenorrhea will reduce the prevalence of OSFED without altering validity of AN.[5-7] However, to our knowledge, no studies have empirically examined the impact of different interpretations of Criteria A and B on the validity of the AN diagnosis.
While Criterion A in the DSM-IV required a refusal to maintain body weight at or above minimal expectations (e.g., <85% expected body weight (EBW)), DSM-5 criteria temper the language used to describe low weight, including removal of a specific low weight guideline. To increase clinician flexibility in defining significantly low weight, Criterion A does not provide a specific numerical standard to define low weight in the DSM-5; however, the text offers guidelines based on definitions of low weight suggested by the World Health Organization (WHO) and Centers for Disease Control and Prevention (CDC). Specifically, for adults one suggestion for defining low weight includes a body mass index (BMI) less than 18.5 kg/m2, which represents the lower limit of normal body weight as defined by the WHO and CDC. The DSM-5 text also mentions the slightly more rigorous guideline of defining low weight as less than 17.0 kg/m2, which represents the WHO cutoff for moderate to severe thinness. This change to describing low weight in terms of BMI reflects the difficulty associated with reliably and accurately assessing EBW. Importantly, variability in potential weight calculations can have a substantial impact on the number of definitions of low weight and the number of individuals classified as low weight; thus, comparing potential low weight definitions to achieve consensus on an acceptable cut-off appears warranted.
While DSM-IV required an explicit fear of gaining weight or becoming fat, amendments to Criterion B in DSM-5 include a clause that permits individuals to either endorse this fear or engage in persistent behavior that interferes with weight gain, despite the individual being at low weight. This change addresses two observations: (1) a sizable minority of individuals, both within Western and non-Western populations, deny overt fear of gaining weight and (2) some patients may experience this fear but be reluctant to endorse it (e.g., young patients, individuals who minimize symptoms). Thus, this revision allows for individuals with AN to subscribe to a broader range of reasons for maintaining a minimally normal weight, other than fear of weight gain (e.g., somatic complaints, extreme need for control, etc.), and provides clinicians with greater flexibility to infer that patients fear weight gain based on behaviors intended to avoid weight gain, such as skipping meals or substantial caloric restriction. Notably, Criterion A requires restriction of energy intake relative to energy needs to attain a significantly low body weight which would, in most cases, involve persistent behavior that interferes with weight gain (e.g., caloric restriction, intense exercise). Thus, although not the intention of the revisions, the changes allow for an interpretation that would make Criterion B potentially redundant with Criterion A. In clinical contexts, this interpretation may not be employed; however, in research contexts, where assessors are striving to reach recruitment goals over limited time, operational definitions for AN may allow interviewers to infer Criterion B from Criterion A in order to achieve a sufficient number of participants for meaningful statistical analyses. 
Given that researchers may not have the time or resources to ascertain all possible operationalizations of Criterion B in research contexts, it is crucial to understand the potential impact of inferring Criterion B from Criterion A on the syndromal validity of AN.
Diagnostic changes to Criteria A and B will likely provide substantial benefits for defining AN in a clinical context, including providing greater flexibility for treatment providers diagnosing AN and greater insurance coverage for individuals who are in need of treatment. However, studying AN in a research context necessitates reliable definitions that can be applied without altering validity. In addition, there is precedent for creating research diagnostic criteria when criteria intended for clinical use do not ensure sufficient reliability or validity. The current study compared different potential operational definitions for AN on evidence of validity in distinguishing AN from non-eating disorder controls and OSFED, with the goal of identifying approaches that may be adopted within research settings to ensure consistency across sites. Data come from an epidemiological study of eating disorders that previously demonstrated that eliminating Criterion D from the diagnosis of AN increased prevalence without compromising diagnostic validity. However, the prior study utilized the DSM-IV definition of low weight and required Criterion B and thus was unable to evaluate how different operational definitions of Criteria A and B might influence validity. There are advantages to using community-based samples over treatment-seeking samples when examining the impact of differing criteria on syndromal validity. Specifically, community-based samples reduce potential biases associated with treatment-seeking, such as illness severity and comorbidity that might obscure the full impact of different operational definitions.
We hypothesized that the number of individuals diagnosed with AN would increase by expanding the threshold for low weight and by not requiring explicit endorsement of Criterion B. Based on results demonstrating that non-fat-phobic AN appears to be less severe than traditional AN, we hypothesized that not requiring Criterion B would negatively impact syndromal validity. Further, we expected that if narrower definitions of AN increased homogeneity, then narrower definitions would demonstrate larger effect sizes in comparisons with non-eating disorder controls and OSFED relative to broader definitions. In contrast, if broadening the definition of AN did not result in greater heterogeneity, then we would expect the effect size comparisons to be similar to those of the narrower definitions and statistical significance of differences to increase due to increased N and resulting increased statistical power.
Data were drawn from a two-stage epidemiological study that examined health and eating patterns. Women (n = 1,732) attending a northeastern university were recruited in the springs of 1982, 1992, and 2002 to complete self-report surveys from a randomly selected sample of 2,400 female students. In 2002, women in the 1982 and 1992 cohorts were contacted for 10- and 20-year follow-up, respectively. The second stage of the study involved inviting participants to complete semi-structured interviews if their survey responses indicated criteria were met for an eating disorder diagnosis at any assessment point. Among women who were identified as cases and invited to complete interviews (n = 272), 68% participated. Eating disorder cases were demographically matched with non-eating disorder controls based on age, gender, and race, and non-eating disorder controls were recruited to complete interviews. Thus, data for the current study came from interview assessments conducted between the years of 2002 and 2005 from a female sample of cases and matched controls (n = 364). Analyses represent a subgroup of these females who met criteria for AN according to various definitions, OSFED, or did not meet criteria for any eating disorder (n = 299). These participants included three age groups: late adolescents (n = 62; mean age = 19.7 ± 1.6 years), adults (n = 74; mean age = 29.8 ± 1.6 years), and mid-life adults (n = 163; mean age = 40.8 ± 2.0 years). Participants identified primarily as Caucasian (77.6%); 8.7% were Asian, 7.7% were African American, 5.4% were Hispanic, and 0.6% identified as biracial/other.
The Institutional Review Board approved this study, and participants completed informed consent documents prior to participation. Semi-structured interviews were completed over the telephone by interviewers trained using the Structured Clinical Interview for DSM-IV Axis-I Disorders (SCID-I) training tapes. All interviews were audiotaped with participant consent to establish inter-rater reliability.
Definitions of AN and OSFED
DSM-5 criteria were applied to interview data to create four definitions of AN with variations on two defining features: (1) definition of low weight (BMI threshold of <18.5 kg/m2 vs. <17.0 kg/m2) and (2) endorsement of Criterion B (explicit vs. inferred from behaviors to prevent weight gain from Criterion A). All definitions of AN required Criterion C (body image disturbance, undue influence of weight and shape on self-evaluation, or denial of seriousness of current low body weight). This resulted in four definitions, listed from most to least restrictive: (1) a BMI of <17.0 kg/m2 and explicit endorsement of Criterion B (<17.0 ABC); (2) a BMI of <18.5 kg/m2 and explicit endorsement of Criterion B (<18.5 ABC); (3) a BMI of <17.0 kg/m2, with Criterion B inferred (<17.0 AC); and (4) a BMI of <18.5 kg/m2, with Criterion B inferred (<18.5 AC).
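The four definitions can be expressed as a small decision rule. The following is a minimal sketch, not the study's actual coding procedure: the `Interview` record, its field names, and the `definitions_met` helper are hypothetical, and it assumes that "inferred" Criterion B adds no requirement beyond Criterion A, as described above.

```python
from dataclasses import dataclass


@dataclass
class Interview:
    """Hypothetical record of the interview items needed for classification."""
    lowest_bmi: float              # lowest lifetime BMI in kg/m^2
    explicit_weight_phobia: bool   # Criterion B explicitly endorsed
    criterion_c: bool              # body image disturbance, undue influence, or denial


# (BMI cutoff, explicit Criterion B required), from most to least restrictive.
DEFINITIONS = {
    "<17.0 ABC": (17.0, True),
    "<18.5 ABC": (18.5, True),
    "<17.0 AC": (17.0, False),
    "<18.5 AC": (18.5, False),
}


def definitions_met(p: Interview) -> list[str]:
    """Return the labels of every operational definition this participant meets."""
    met = []
    for label, (bmi_cutoff, require_explicit_b) in DEFINITIONS.items():
        criterion_a = p.lowest_bmi < bmi_cutoff
        # Assumption: when B is inferred from A, meeting A is treated as meeting B.
        criterion_b = p.explicit_weight_phobia if require_explicit_b else criterion_a
        if criterion_a and criterion_b and p.criterion_c:
            met.append(label)
    return met


# A participant with lowest BMI 17.8 and no explicit weight phobia
# meets only the broadest definition.
print(definitions_met(Interview(17.8, False, True)))  # ['<18.5 AC']
```

Because the definitions are nested, a participant meeting the narrowest definition meets all four, which is why group sizes grow as the definition broadens.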
OSFED cases included participants who met DSM-5 criteria for OSFED or an unspecified feeding or eating disorder (UFED), which were defined as clinically significant disorders of eating not meeting full DSM-5 criteria for AN, bulimia nervosa (BN), or binge eating disorder (BED). In addition, all OSFED cases had to endorse a BMI above 18.5 kg/m2 to prevent classifying individuals in both the OSFED and AN group depending upon the operational definition of AN. Thus, the OSFED group captured individuals meeting criteria for purging disorder, subthreshold forms of BN or BED, or any other clinically significant disorder of eating not meeting criteria for AN, BN, or BED. Consistent with the DSM-5 conceptualization of a clinically significant mental disorder, participants were required to endorse disordered eating behaviors that were associated with distress, functional impairment, or increased risk of suffering from death, pain, or disability in order to be diagnosed with an OSFED. While DSM-5 differentiates between a diagnosis of OSFED and UFED, for simplicity we will refer to OSFED to describe any individuals who were diagnosed with a clinically significant eating disorder not meeting DSM-5 criteria for AN, BN, or BED. Participants who did not meet criteria for any eating disorder over their lifetime, as assessed by the SCID-I, and provided adequate information to determine the absence of a lifetime eating disorder were classified as non-eating disorder controls.
External Validators from Stage One: Surveys
Eating Disorders Inventory (EDI)
The Eating Disorders Inventory (EDI) is a self-report, 6-point forced choice measure of behavioral and psychological traits in AN and BN. The EDI is a well-validated inventory with excellent support for its internal consistency and discriminant validity as well as test-retest reliability in both individuals with and without eating disorders. In the current study, items from the Perfectionism, Drive for Thinness, and Bulimia subscales of the EDI from the 2002 survey were included as external validators. Internal consistencies of the subscales from the 2002 survey were good in this study, α = 0.77 for Perfectionism, α = 0.93 for Drive for Thinness, and α = 0.89 for Bulimia.
External Validators from Stage Two: Interviews
Lifetime Axis-I Diagnoses and Suicidality
The Structured Clinical Interview for DSM-IV Axis-I Disorders (SCID-I) is a semi-structured clinical interview used to evaluate both current and lifetime DSM-IV Axis-I diagnoses. In the current study, lifetime history of eating disorder diagnoses, mood disorders, anxiety disorders, and suicidality were analyzed from SCID-I assessments conducted between 2002 and 2005. Lifetime eating disorder diagnoses were coded in a hierarchical manner, such that a lifetime diagnosis of AN would rule out another eating disorder diagnosis, and a lifetime diagnosis of BN would rule out a diagnosis of OSFED. Lifetime suicidality was assessed for all participants during SCID-I interviews, and standard skip rules were not observed in the assessment of eating disorder symptoms. As such, all individuals were asked eating disorder diagnostic criteria for AN, allowing us to assess remaining AN diagnostic criteria for those whose lowest weight was not below 85% of that expected or who did not endorse Criterion B. This included asking each participant for her lowest weight throughout her lifetime, and her height and age during this period, making it possible to calculate lowest BMI for each participant. All subsequent questions from the AN section were then assessed during this time period. Inter-rater reliability for lifetime diagnoses in the current sample was good (κ = 0.71 for eating disorders, κ = 1.00 for mood disorders, and κ = 0.70 for anxiety disorders).
The Weissman Social Adjustment Scale-Self-report (WSAS) was used to assess overall psychosocial functioning, with higher scores indicating worse psychosocial functioning. Although the WSAS is a self-report measure, it was administered within the present study in an interview format, with high inter-rater reliability for total scores (r = .99). Internal consistency was good (α = 0.71), and was similar to estimates reported in previous studies. In addition to the WSAS, participants' global assessment of functioning (GAF) was assessed using the SCID-I. Both the WSAS and the GAF were assessed during the interview assessments (Stage 2), between 2002 and 2005.
Lifetime frequency of the four AN definitions was assessed by calculating the number of individuals meeting DSM-5 criteria for AN for each operational definition. Because interview data come from the second stage of a two-stage epidemiological study, these frequencies were recalculated as a percentage of the full sample. Univariate analyses of variance (ANOVA) were used to compare each definition of AN to controls and OSFED on EDI subscales (Perfectionism, Drive for Thinness, Bulimia) and measures of psychosocial functioning (WSAS, GAF). Dunnett's test was used to evaluate statistical significance of two sets of post hoc comparisons, those between AN and controls and those between AN and OSFED. Due to expected differences in N across definitions, and resulting differences in statistical power for group comparisons, Cohen's d was calculated to provide a measure of effect size for each comparison. According to guidelines set by Cohen, values of 0.2 represent a “small” effect, 0.5 a “medium” effect, and 0.8 a “large” effect. Logistic regression was used to compare each definition of AN to controls and OSFED on endorsement of a lifetime mood disorder, anxiety disorder, or lifetime suicidality.
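As an illustration of the effect size measure described above, the sketch below computes Cohen's d from group summary statistics and maps it onto Cohen's benchmarks. It assumes the pooled-standard-deviation form of d, which the text does not specify; the function names and the example values are hypothetical and are not data from this study.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d from summary statistics, using the pooled standard deviation.

    Assumption: the pooled-SD form; the article does not state which variant was used.
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


def effect_label(d):
    """Classify |d| using Cohen's benchmarks (0.2 small, 0.5 medium, 0.8 large)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


# Illustrative values only: two groups of 20 with equal SDs of 2 and means 10 and 8.
d = cohens_d(10.0, 2.0, 20, 8.0, 2.0, 20)
print(d, effect_label(d))  # 1.0 large
```

Because d is standardized by the spread of scores rather than by sample size, it allows comparisons across AN definitions that yield groups of different N.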
Frequencies of the four definitions of AN were calculated in order to determine their lifetime occurrence both within the full sample of females who completed interviews (n = 364) and extrapolating to the full sample of females who completed surveys (n = 1,732). As expected, lifetime frequency generally increased as definitions became broader. Specifically, the most restrictive definition (<17.0 ABC) captured the smallest number of participants (8.79% of the interview sample and 1.85% of the total survey sample; n = 32). The <18.5 ABC definition captured 14.0% of the interview sample and 2.94% of the survey sample (n = 51), while the <17.0 AC definition captured 10.99% of the interview sample and 2.31% of the survey sample (n = 40). The least restrictive definition (<18.5 AC) captured 23.1% of the interview sample and 4.85% of the survey sample (n = 84). Thus, approximately two to three times as many women met criteria for the broadest definition of AN compared to the most narrow definition. Further, the greatest increase in prevalence was observed when relaxing Criterion A and inferring Criterion B, with estimates jumping from approximately 2–3% to nearly 5%. Examination of the proportion of current versus lifetime eating disorder diagnoses from the SCID-I indicated that approximately one-third of the women across all AN definitions were currently ill at the time of the interviews and surveys (range: 28–35%).
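The reported percentages are consistent with dividing each case count by the two sample sizes given above. The short sketch below reproduces that arithmetic; the variable and function names are illustrative, and the counts and denominators are taken directly from the paragraph above.

```python
INTERVIEW_N = 364   # women who completed interviews
SURVEY_N = 1732     # women who completed surveys

# Number of women meeting each definition, as reported in the text.
counts = {"<17.0 ABC": 32, "<18.5 ABC": 51, "<17.0 AC": 40, "<18.5 AC": 84}


def lifetime_frequency(n_cases, denominator):
    """Lifetime frequency as a percentage of the chosen sample."""
    return 100.0 * n_cases / denominator


for label, n in counts.items():
    interview_pct = lifetime_frequency(n, INTERVIEW_N)
    survey_pct = lifetime_frequency(n, SURVEY_N)
    print(f"{label}: {interview_pct:.2f}% of interviews, {survey_pct:.2f}% of surveys")

# Ratio of the broadest to the narrowest definition.
print(counts["<18.5 AC"] / counts["<17.0 ABC"])  # 2.625
```

The ratio of 84 to 32 cases is what underlies the statement that roughly two to three times as many women met the broadest definition as met the narrowest.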
Table 1 presents comparisons of EDI external validators between AN, controls, and OSFED. Overall, all definitions of AN were associated with greater pathology on perfectionism, drive for thinness, and bulimia compared to controls (all p-values <.001), but not compared to OSFED (all p-values >.15). The exception was the <18.5 ABC definition, which was associated with higher drive for thinness scores than OSFED (p = .04). The <17.0 ABC definition was also associated with higher drive for thinness scores than OSFED at a trend level (p = .08).
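|Predictor||EDI Perfectionism Scores||EDI Drive for Thinness Scores||EDI Bulimia Scores|
|AN definition||n||M||SD||F (df)||d (vs. controls, vs. OSFED)||n||M||SD||F (df)||d (vs. controls, vs. OSFED)||n||M||SD||F (df)||d (vs. controls, vs. OSFED)|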
|<17.0 ABC||30||25.26b||5.76||7.43 (2, 236)b||0.58, 0.14||30||19.03b||7.69||24.84 (2, 236)b||1.12, 0.37||30||13.90b||6.46||17.85 (2, 236)b||0.81, 0.14|
|<18.5 ABC||49||26.08b||5.29||12.20 (2, 254)b||0.76, 0.31||49||19.10c||7.77||29.75 (2, 254)b||1.12, 0.38||49||14.55b||6.93||21.94 (2, 254)b||0.90, 0.25|
|<17.0 AC||36||25.47b||5.69||8.43 (2, 242)b||0.62, 0.18||36||18.33b||7.62||23.51 (2, 242)b||1.02, 0.27||36||13.58b||6.16||17.45 (2, 242)b||0.77, 0.09|
|<18.5 AC||79||25.45b||5.17||11.93 (2, 284)b||0.65, 0.19||79||16.73b||7.61||20.68 (2, 284)b||0.77, 0.05||79||13.54b||6.16||19.05 (2, 284)b||0.76, 0.08|
According to guidelines set by Cohen, comparisons of perfectionism between AN definitions and controls were of a medium effect size (all ds = 0.58–0.76). Comparisons between AN definitions and OSFED fell below the threshold for small effects, with the exception of the <18.5 ABC definition, which demonstrated a small effect.
For drive for thinness scores, comparisons between AN definitions and controls demonstrated large effect sizes (all ds = 1.02–1.12), with the exception of the <18.5 AC definition, which was of a medium effect size (d = 0.77). Most comparisons between AN definitions and OSFED demonstrated small effect sizes, with only the <18.5 AC definition falling below the threshold for a small effect.
For bulimia scores, comparisons between the AN definitions not requiring explicit weight phobia (<17.0 AC and <18.5 AC) and controls were of a medium effect size, while those definitions requiring explicit weight phobia (<17.0 ABC and <18.5 ABC) fell above the cutoff for a large effect size. In comparison to OSFED, only the <18.5 ABC definition demonstrated a small effect size; comparisons to remaining AN definitions were below the threshold for a small effect size.
Thus, all AN definitions were associated with greater pathology on perfectionism, drive for thinness, and bulimia scores compared to controls; however, effect sizes for drive for thinness were diminished for the broadest AN definition (<18.5 AC). In addition, for comparisons on bulimia scores, effect sizes were larger for comparisons with the <18.5 ABC criterion, suggesting that altering the weight threshold may have included participants who would otherwise be diagnosed with a bulimic syndrome.
Axis-I Diagnoses and Suicidality
Table 2 presents results of logistic regression analyses that examined lifetime endorsement of a mood disorder, anxiety disorder, or suicidality in AN, controls, and OSFED. Individuals in all AN definitions had a significantly higher likelihood of endorsing a lifetime mood disorder than controls. Odds ratios decreased as the definitions of AN broadened, with individuals in the least narrow definition (<18.5 AC) being three times more likely to have a mood disorder compared to controls, and individuals included in the most narrow definition (<17.0 ABC) being over five times more likely to have a lifetime mood disorder compared to controls. In comparison to individuals with OSFED, AN groups including explicit weight phobia (<17.0 ABC, <18.5 ABC) had a significantly higher likelihood of a lifetime mood disorder (approximately 2.5–3 times more likely). No significant differences were found between AN definitions inferring weight phobia (<17.0 AC, <18.5 AC) and the OSFED group on a lifetime mood disorder.
|Comparisons||Lifetime Mood Disorder||Lifetime Anxiety Disorder||Lifetime Suicidality|
|n||B||χ²||OR (CI)||n||B||χ²||OR (CI)||n||B||χ²||OR (CI)|
|AN to Controls|
|<17.0 ABC||31||−1.74||15.02||5.68 (2.36–13.70)d||31||−1.40||9.85||4.07 (1.69–9.80)c||27||−1.64||13.21||5.18 (2.13–12.50)d|
|<18.5 ABC||49||−1.60||19.25||4.95 (2.42–10.10)d||50||−1.20||9.39||3.32 (1.54–7.15)c||43||−1.32||12.01||3.73 (1.77–7.85)c|
|<17.0 AC||39||−1.38||12.77||3.95 (1.86–8.40)d||39||−1.17||7.63||3.23 (1.40–7.41)c||34||−1.39||11.37||4.00 (1.79–9.01)d|
|<18.5 AC||81||−1.16||15.84||3.19 (1.80–5.65)d||83||−0.90||6.58||2.47 (1.24–4.93)b||73||−0.91||7.68||2.48 (1.31–4.73)c|
|AN to OSFED|
|<17.0 ABC||31||−1.08||5.23||2.96 (1.17–7.46)d||31||−0.68||2.15||1.97 (0.90–4.88)||27||−1.11||5.61||3.04 (1.21–7.63)b|
|<18.5 ABC||49||−0.94||5.71||2.57 (1.18–5.59)b||50||−0.47||1.35||1.61 (0.72–3.57)||43||−0.78||3.84||2.19 (1.00–4.81)a|
|<17.0 AC||39||−0.72||3.03||2.06 (0.91–4.063)||39||−0.44||1.02||1.56 (0.65–3.69)||34||−0.86||3.95||2.35 (1.01–5.46)b|
|<18.5 AC||81||−0.51||2.36||1.66 (0.87–3.16)||83||−0.18||0.23||1.19 (0.58–2.48)||73||−0.38||1.15||1.46 (0.73–2.91)|
For lifetime anxiety disorders, all AN definitions had a significantly higher likelihood of a lifetime anxiety disorder than controls. The broadest AN definition (<18.5 AC) was associated with the lowest (OR = 2.47) likelihood of a lifetime anxiety disorder compared to controls, whereas the most narrow AN definition (<17.0 ABC) was associated with the highest (OR = 4.07) likelihood of having a lifetime anxiety disorder compared to controls. No differences were observed between any of the AN definitions and the OSFED group on likelihood of endorsing a lifetime anxiety disorder.
For lifetime suicidality, all AN definitions had a significantly higher likelihood of endorsing lifetime suicidality than controls. Odds ratios decreased as the definitions of AN broadened, with the least narrow group (<18.5 AC) being approximately 2.5 times more likely to endorse suicidality compared to controls, and the most narrow group (<17.0 ABC) being about five times more likely to endorse suicidality compared to controls. All AN definitions also demonstrated significantly higher likelihood of endorsing suicidality compared to the OSFED group, with the exception of the <18.5 AC definition. The narrowest definition (<17.0 ABC) was associated with over a three-fold increased likelihood of endorsing suicidality compared to the OSFED group. The <18.5 ABC and <17.0 AC groups were associated with a twofold increased likelihood of endorsing suicidality, with the <18.5 ABC definition just reaching the level of significance (p = .05). No significant differences were found between the least narrow definition (<18.5 AC) and the OSFED group on endorsement of lifetime suicidality. Thus, overall, both raising the weight threshold and inferring weight phobia were associated with lower odds ratios for lifetime mood disorders and suicidality relative to the OSFED group; however, all definitions significantly differed from controls.
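The odds ratios and confidence intervals in Table 2 can be approximately recovered from the tabled coefficient and test statistic. The sketch below rests on two assumptions that the text does not state: that the tabled test statistic is a Wald chi-square, equal to (B/SE)², and that the odds ratio is reported as exp(−B), which is inferred from the table pairing negative B values with odds ratios above 1. The function name is hypothetical.

```python
import math


def odds_ratio_with_ci(b, wald_chi2, z=1.96):
    """Recover an odds ratio and 95% CI from a logistic coefficient and Wald chi-square.

    Assumptions (inferred, not stated in the article):
      * wald_chi2 == (b / SE) ** 2, so SE = |b| / sqrt(wald_chi2)
      * the tabled OR equals exp(-b), reflecting the reference-group coding
    """
    se = abs(b) / math.sqrt(wald_chi2)
    log_or = -b
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# <17.0 ABC vs. controls on lifetime mood disorder: B = -1.74, chi-square = 15.02.
odds_ratio, lower, upper = odds_ratio_with_ci(-1.74, 15.02)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

For this row the result lands close to the tabled 5.68 (2.36–13.70); the small discrepancies are what one would expect from B and the chi-square having been rounded to two decimals.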
Table 3 presents comparisons of psychosocial functioning. Compared to controls, only the <18.5 ABC definition was associated with significantly higher scores on the WSAS. Across the various definitions, women with AN did not score significantly higher on the WSAS than OSFED. Comparisons between all definitions and the controls were of a small effect size, while comparisons between all AN definitions and the OSFED group were below the threshold for a small effect size.
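|Predictor||WSAS Scores||GAF Scores|
|AN definition||n||M||SD||F (df)||d (vs. controls, vs. OSFED)||n||M||SD||F (df)||d (vs. controls, vs. OSFED)|
|<17.0 ABC||32||1.61a,b||0.38||2.17 (2, 244)||0.27, 0.03||30||66.57c||12.06||25.62 (2, 230)c||1.16, 0.43|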
|<18.5 ABC||51||1.64b||0.39||3.30 (2, 269)b||0.36, 0.12||48||67.21c||11.54||28.89 (2, 247)c||1.12, 0.38|
|<17.0 AC||40||1.61a,b||0.37||2.20 (2, 252)||0.27, 0.03||37||67.86c||11.99||24.04 (2, 237)c||1.04, 0.31|
|<18.5 AC||83||1.61a,b||0.35||2.77 (2, 294)||0.28, 0.03||80||69.75b||10.71||25.42 (2, 279)c||0.91, 0.15|
Overall, all definitions of AN were associated with significantly greater impairment on GAF scores compared to controls. Compared to OSFED, the <17.0 ABC, <18.5 ABC, and <17.0 AC definitions were associated with significantly lower GAF scores. Only the definition allowing a higher weight threshold and inferring weight phobia (<18.5 AC) failed to distinguish women with AN from OSFED on global functioning. Effect sizes for the GAF were large (all ds > 0.90) for comparisons with controls. In contrast, comparisons between all definitions of AN and OSFED were of a small effect size (all ds = 0.31–0.43), with the exception of the <18.5 AC definition, which fell below the threshold for a small effect.
The ideal research definition of AN should decrease reliance on OSFED without diminishing diagnostic validity. Lifetime frequencies suggest that increasing the weight criterion threshold from <17.0 kg/m2 to <18.5 kg/m2 and inferring weight phobia will increase the number of individuals diagnosed with AN, and thus potentially reduce reliance on OSFED for AN-like presentations. Of note, three of our four estimates of lifetime DSM-5 AN were similar to those reported by Keel et al., who used the same data to examine lifetime frequency estimates of DSM-IV AN without amenorrhea. The notable exception to this was the <18.5 AC definition, for which the lifetime frequency estimate was considerably larger. In regard to diagnostic validity, the general pattern of results demonstrated that scores became less pathological across domains as the definition of AN broadened. Results from validity analyses suggest that increasing the definitional breadth of AN will not reduce distinctions from normality; however, the combination of increasing the weight criterion threshold and inferring weight phobia from Criterion A will reduce distinctions between AN and OSFED.
The combination of increasing the minimum weight threshold and inferring Criterion B produced a remarkably higher frequency than all other definitions, suggesting there is a potentially large pool of individuals with AN-like syndromes whose BMIs fall between 17.0 and 18.5 kg/m2 and who do not endorse weight phobia. This combination appears to result in a more heterogeneous group with no evidence of distinction from OSFED on eating pathology, lifetime comorbidity, or psychosocial functioning. Results suggest that differences between non-weight phobic AN and conventional AN may be more pronounced at a higher weight threshold. Thus, relaxing both the criterion for low weight and inferring weight phobia from Criterion A would accomplish the goal of reducing reliance on OSFED but would fail to maintain adequate diagnostic validity for comparisons to other eating disorders.
Inferring weight phobia from Criterion A, while holding the low weight threshold at <17.0 kg/m2, resulted in modest changes in syndrome frequency and evidence of syndrome validity. Importantly, the <17.0 AC definition failed to distinguish between AN and OSFED on lifetime history of a mood or anxiety disorder. The potentially important role of weight phobia in distinguishing diagnostic groups is consistent with results from latent class and latent profile analyses that have identified either a low-weight AN-like group or mixed-feature EDNOS-like group without weight concerns as a distinct latent class.[20-23] External validation analyses have demonstrated that individuals without weight phobia exhibit lower rates of comorbid psychopathology, less severe eating disorder cognitions, less psychological distress, and better psychosocial functioning compared to eating disorder groups with weight concerns. As weight phobia is a cognitive symptom, perhaps this fear represents an underlying cognitive vulnerability that may overlap with vulnerabilities for anxiety and depression. Further, studies that have specifically examined individuals with non-fat phobic AN or AN with low drive for thinness have demonstrated less severe eating pathology among these individuals than those diagnosed with conventional AN.[2, 3] Thus, results suggest some caution in inferring weight phobia from Criterion A as this may reduce eating disorder syndrome homogeneity and clinical significance.
Given that inferring weight phobia from Criterion A was associated with reduced diagnostic validity, it may be advantageous to provide additional examples of potential observable signs or indicators of fear (perhaps including body checking or avoidance behaviors) or measurable behaviors (e.g., food avoidance, purging behaviors) to help operationalize Criterion B in research contexts. Fortunately, the format of DSM-5, including the change from Roman to Arabic numerals, will facilitate a “living” document with the capacity for more frequent updates (e.g., DSM-5.1, DSM-5.2), similar to software systems that are frequently updated based on field feedback. Including additional observable and measurable indicators of weight phobia within the DSM-5 text would allow this symptom to be measured directly (and reliably) when it is not explicitly endorsed, in both research and clinical contexts.
Based on these results, we suggest that using an operational definition of <18.5 ABC would provide an adequate representation of DSM-5 AN for research contexts. This definition increases inclusivity by diagnosing DSM-5 AN in those whose low weight falls between 17.0 and 18.5 kg/m2. However, raising the threshold for low weight in the presence of weight phobia makes little difference to eating pathology, lifetime history of a mood disorder or suicidality, or psychosocial functioning, in comparison to controls and OSFED. Thus, it appears that increasing the threshold for low weight alone, from the more restrictive WHO definition of moderate to severe thinness (17.0 kg/m2) to the CDC/WHO definition of the lower limit of normal body weight (18.5 kg/m2), does not reduce syndrome homogeneity or clinical significance. The requirement of measurable weight phobia matches the intention behind the revisions for the DSM-5 and yields a group that is sufficiently homogeneous to ensure meaningful distinctions from both controls and other eating disorders. The purpose of the present study was to evaluate potential operational definitions of DSM-5 AN for research contexts; however, the DSM-5 criteria are designed not only for research settings but also for clinical practice. In addition, research is conducted to inform clinical practice. Thus, achieving a consensus on a research definition of DSM-5 AN has important clinical implications. Indeed, how researchers operationalize AN will have important implications for the population on whom treatment studies are conducted. In this regard, setting the low weight threshold for research contexts at <18.5 kg/m2 may allow for further research on early intervention, before weight reaches a critically low threshold that impairs ability to benefit from psychosocial interventions.
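For readers applying these definitions to their own interview data, the four operational definitions compared here can be written as a simple decision rule. The Python sketch below is illustrative only and was not part of the study's analyses. The record fields (`lowest_bmi`, `endorses_weight_phobia`, `meets_criterion_c`) and function names are hypothetical, and Criterion A is simplified to a BMI cutoff, with inferred Criterion B treated as met whenever Criterion A is met.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interview:
    """Hypothetical per-participant record; field names are assumptions."""
    lowest_bmi: float             # lowest BMI in kg/m^2
    endorses_weight_phobia: bool  # Criterion B explicitly endorsed
    meets_criterion_c: bool       # Criterion C met at interview


# Label -> (BMI threshold for Criterion A, whether Criterion B must be explicit)
DEFINITIONS = {
    "<17.0 ABC": (17.0, True),
    "<17.0 AC": (17.0, False),
    "<18.5 ABC": (18.5, True),
    "<18.5 AC": (18.5, False),
}


def meets_an(record: Interview, bmi_threshold: float, explicit_b: bool) -> bool:
    """Apply one operational definition of DSM-5 AN to one record."""
    criterion_a = record.lowest_bmi < bmi_threshold
    # Simplifying assumption: when B is inferred, it follows from Criterion A.
    criterion_b = record.endorses_weight_phobia if explicit_b else criterion_a
    return criterion_a and criterion_b and record.meets_criterion_c


def classify(record: Interview) -> dict:
    """Return, for each definition label, whether the record meets it."""
    return {
        label: meets_an(record, threshold, explicit_b)
        for label, (threshold, explicit_b) in DEFINITIONS.items()
    }
```

Under these assumptions, a participant with a lowest BMI of 17.8 kg/m2 who does not endorse weight phobia but meets Criterion C would be classified as AN only under the <18.5 AC definition, which is the group that showed few meaningful distinctions from OSFED.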
The present study had several methodological strengths worth noting. First, the data were drawn from a large community-based sample. This is important because samples ascertained from clinical settings have already accessed treatment and are less informative for issues regarding case identification and treatment access. Second, the study-specific interview structure (e.g., the disregard of Module H skip rules) allowed us to vary criteria for AN definitions that could not be assessed from other epidemiological studies that skip out of questions for subsequent criteria if initial criteria (e.g., Criterion A or B) are not met. This extends to the ascertainment of each individual's lowest weight during the interview, which permitted us to explore weight criterion definitions outside of those defined by EBW. Finally, the study included a combination of interview and self-report measures with high inter-rater reliability and strong psychometric properties, which increase confidence in the pattern of results across assessment methods.
With these strengths in mind, there are also limitations to consider. First, data for the present study were drawn from cohorts originally recruited from a selective northeastern university and thus, results may not generalize to individuals from other regions or demographic backgrounds. Second, the questionnaire measures were completed based on current self-report at the time of the survey, while the psychosocial functioning measures were completed based on current functioning at the time of the interview, and eating disorder and other Axis-I diagnoses were assessed for lifetime occurrence. Thus, completion of the disordered eating external validators (EDI scores) was not concurrent with the diagnostic interview, nor was it concurrent with lifetime diagnoses. However, a sizable minority of individuals with AN, according to each definition (approximately one-third), were currently ill at the time of the interview and survey. Further, there did not appear to be any consistent differences between results from the EDI data and interview data in differentiating AN diagnoses, which increases confidence in the consistency of our results despite temporal differences in self-report versus interview assessments due to our two-stage design. While the purpose of the present study was to examine the potential impact of inferring weight phobia from Criterion A, we acknowledge that “persistent behavior that interferes with weight gain” could be operationalized in additional ways that we were unable to measure in the present study. These include, but are not limited to, objective measures of caloric restriction, food avoidance, purging (e.g., vomiting, use of laxatives or diuretics), or non-purging behaviors (e.g., excessive exercise, fasting). Thus, we acknowledge that our operationalized version of the <18.5 AC group may not sufficiently reflect options permitted by the DSM-5 and may not reflect the precise manner by which the criteria will be used.
However, given that inferring Criterion B from Criterion A represents one potential interpretation of the changes, it is important that the impact of this potential, albeit unintended, interpretation be examined empirically. Moving toward future iterations of DSM, it will be important for studies to examine alternative methods of operationalizing Criterion B.
The present study represents the first empirical evaluation of the impact of different operational definitions of DSM-5 AN for research contexts. Results suggest that while broadening the weight criterion and inferring weight phobia from low weight would increase the number of individuals diagnosed with AN, the combination would introduce heterogeneity and reduce distinctions from individuals diagnosed with OSFED. Given that the ideal definition of DSM-5 AN for research contexts should balance inclusivity with validity, we suggest operationalizing low weight as BMI <18.5 kg/m2 and identifying a reliable approach to operationalizing Criterion B through observable measures or alternative methods, rather than inferring weight phobia from behaviors used to maintain low weight.
- 1. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders, 5th ed. (DSM-5). Washington DC: American Psychiatric Publishing, Incorporated, 2013.
- 16. Structured clinical interview for DSM-IV Axis I disorders—Patient ed. (SCID-I/P). New York: New York State Psychiatric Institute; 1995.
- 25. What are we missing? The costs of skip rule designs in eating disorder research. Int J Meth Psych Res, in press.