Validity and accuracy of the Adult Attention‐Deficit/Hyperactivity Disorder (ADHD) Self‐Report Scale (ASRS) and the Wender Utah Rating Scale (WURS) symptom checklists in discriminating between adults with and without ADHD

Abstract
Objective: To validate the Adult ADHD Self‐Report Scale (ASRS) and the Wender Utah Rating Scale (WURS) in a well‐characterized sample of adult attention‐deficit/hyperactivity disorder (ADHD) patients and population controls.
Methods: Both the ASRS and the WURS were administered to clinically diagnosed adult ADHD patients (n = 646) and to population controls (n = 908). We performed principal component analyses (PCA) and calculated receiver operating characteristic (ROC) curves, including the area under the curve (AUC), for the full WURS and ASRS, as well as for the PCA‐generated factors and the ASRS short screener.
Results: We found an AUC of 0.956 (95% CI: 0.946–0.965) for the WURS, and 0.904 (95% CI: 0.888–0.921) for the ASRS. The ASRS short screener had an AUC of 0.903 (95% CI: 0.886–0.920). Combining the two full scales gave an AUC of 0.964 (95% CI: 0.955–0.973). We replicated the two‐factor structure of the ASRS and found a three‐factor model for the WURS.
Conclusion: The WURS and the ASRS both have high diagnostic accuracy. The short ASRS screener performed as well as the full ASRS, whereas the WURS had the best discriminatory properties. The higher diagnostic accuracy of the WURS may be due to its wider symptom range and/or its retrospective childhood frame of symptoms.


Significant Outcomes
• The Norwegian Wender Utah Rating Scale (WURS) and Adult ADHD Self-Report Scale (ASRS) were validated, both demonstrating excellent screening properties.
• Retrospective childhood symptoms of aggressiveness and social problems are highly predictive of an adult diagnosis of attention-deficit hyperactivity disorder.
• Our results support that emotional regulation problems constitute a large part of ADHD symptomatology in childhood.

Limitations
• The use of retrospective self-report measures might be affected by memory biases and lack of recall.
• The use of self-report measures for present ADHD symptoms may be biased by the current health and life situation of the informant.
• This study was based on a sample diagnosed with ADHD as adults, thus it is uncertain whether the patients included would have obtained a childhood diagnosis of ADHD.

| INTRODUCTION
Adult attention-deficit hyperactivity disorder (ADHD) is a persistent neurodevelopmental disorder with childhood onset, characterized by inattention, hyperactivity, and impulsivity (American Psychiatric Association, 2013). ADHD has a prevalence of about 5% in childhood (Polanczyk, de Lima, Horta, Biederman, & Rohde, 2007), with about half persisting into adulthood (Faraone et al., 2015). As contextual demands continue to increase in number, scope and complexity with age, coupled with decreased support systems, ADHD may first be recognized and diagnosed in adults (Turgay et al., 2012). Fayyad et al. (2017) found an overall prevalence of 2.8% of DSM-IV adult ADHD across a range of nations, spanning from 1.4% in lower income countries to 3.6% in higher income countries. Adult ADHD is associated with for example lower educational achievement and increased rates of incarcerations, unemployment and illicit drug use (Faraone et al., 2015). Clinical assessment based on the Diagnostic and Statistical Manual of Mental Disorders (DSM) criteria is the gold standard for the diagnosis (Haavik, Halmoy, Lundervold, & Fasmer, 2010), but short screeners or symptom rating scales provide a quick and easy way of obtaining standardized information to select patients for further examination.
It is important to establish a history of childhood ADHD symptoms, as the pharmacological treatment of ADHD involves regulated substances and as several other disorders that appear in adulthood may display ADHD symptoms (e.g., affective disorders, substance use disorders, and sleep disorders; Haavik et al., 2010). To add to the complexity, these disorders may often also be comorbid with ADHD. The Wender Utah Rating Scale (WURS) was developed to retrospectively evaluate the presence and severity of childhood symptoms of ADHD in adult patients (Ward, Wender, & Reimherr, 1993). The WURS is based on the Utah criteria (Wender, 1995), requiring a childhood history of ADHD including both inattentive and hyperactive symptoms, with one of the following additional symptoms: behavior problems in school, impulsivity, over-excitability and temper outbursts. The Utah criteria also require an adult history of persistent attention problems and motor hyperactivity with at least two of the following symptom domains: emotional lability, hot temper, stress intolerance, disorganization and impulsivity (Ward et al., 1993). The original 61-item questionnaire was subsequently reduced to the 25 items that best distinguished an ADHD sample from control samples (i.e., healthy controls and depressed patients). Most of the final 25 items are thus not directly tapping into the core ADHD symptoms, but were chosen for their discriminative ability. A recent study has found that emotional lability measured by the WURS may be one of the best childhood predictors of adult ADHD (Gisbert et al., 2018). A WURS-25 score of at least 36 identified 96% of adults with ADHD and 96% of healthy controls (Ward et al., 1993). A cutoff of 46 or higher correctly identified 86% of adults with ADHD, 99% of "normal" controls, and 81% of a comparison sample with de-
Current symptoms of inattention and/or hyperactivity and impulsivity are also essential for the diagnosis of ADHD to be made in adulthood.
The Adult ADHD Self-Report Scale (ASRS) is the official screening instrument of the World Health Organization (WHO; Kessler et al., 2005), and includes the 18 ADHD symptom items of the DSM. It is one of the most commonly used screening instruments for current ADHD symptoms in adults. The creators of the ASRS tested several variants of administering the 18 DSM symptoms of ADHD, and concluded that a 6-item version was best suited as a general population screen (Kessler et al., 2005, 2007). The authors based their conclusion on blind clinical ratings of DSM-IV adult ADHD in a sample of merely 154 respondents from the US National Comorbidity Survey Replication (NCS-R), oversampling those who reported childhood ADHD and adult persistence (Kessler et al., 2005). Recently, the same group (Ustun et al., 2017) created an updated 6-item ASRS screen, replacing two of the 6 items with items on executive functioning (i.e., not part of the ADHD-defining symptoms). They found this to have good psychometric properties as a general population screener. However, another small nonclinical study comparing the short screener to the full 18-item version found the full version to have better psychometric properties (Zohar & Konfortes, 2010). The authors pointed out the need for a direct assessment of the utility of the ASRS in clinical samples, as there is a lack of studies examining the screening properties of the whole ASRS in an adequately large sample of adults with a clinically confirmed ADHD diagnosis and population controls. The ASRS and the 25-item WURS have been translated into several languages, including Norwegian. Validation studies of multiple versions have shown similar psychometric properties to those reported for the original English versions (Caci et al., 2010; Kessler et al., 2005, 2007; McCann et al., 2000; Stanton & Watson, 2016; Ustun et al., 2017; Zohar & Konfortes, 2010).
The Norwegian versions were produced by translation and back-translation according to commonly accepted protocols. Although these versions have been widely used, we are not aware of any official validation studies.
The aims of the present study were threefold: first, to establish the construct and content validity of the Norwegian translations of the WURS and the ASRS using principal component analysis; second, to examine the psychometric properties of the WURS and the ASRS in a large clinically diagnosed adult ADHD patient sample and population controls; third, to compare the utility of these instruments to aid the clinical ADHD diagnosis.

| Participants
The participants were recruited as part of the "ADHD in Norwegian

| The Wender Utah Rating Scale
The 25-item version of the Wender Utah Rating Scale (WURS; Ward et al., 1993) assesses childhood symptoms by asking the participants to retrospectively recall the frequency and severity of ADHD symptoms and related problems experienced in childhood. Participants responded to these items on a Likert-type 5-point scale according to the following response categories: "not at all/very slightly" (0), "mildly" (1), "moderately" (2), "quite a bit" (3), or "very much" (4), giving a possible range of 0-100 points.

| The Adult ADHD Self-Report Scale
The Adult ADHD Self-Report Scale (ASRS) is a brief screening instrument to identify current ADHD symptoms (Kessler et al., 2005). The scale was developed by the World Health Organization (WHO, 1992) and the Work Group on Adult ADHD (Kessler et al., 2005). The scale contains the 18 symptoms of inattention, hyperactivity, and impulsivity defining ADHD according to the DSM-IV-TR and DSM-5 (American Psychiatric Association, 2000, 2013). The severity of the symptoms is reported on a 5-point Likert-type scale (0-4 = never, rarely, sometimes, often, to very often), with a total range of 0-72. The total ASRS score has shown good reliability and validity in both clinical and population samples (Adler et al., 2006; Glind et al., 2013).
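As an illustration only, the sum scoring described for the two scales can be sketched as follows. The function name `sum_score` and the constants are hypothetical and not part of either instrument; the sketch assumes every item is answered and coded 0-4, and does not address missing responses.

```python
from typing import Sequence

# Item counts taken from the scale descriptions: 25 items (WURS) and
# 18 items (full ASRS), each coded 0-4. Names are illustrative assumptions.
WURS_ITEMS = 25  # total range 0-100
ASRS_ITEMS = 18  # total range 0-72


def sum_score(responses: Sequence[int], n_items: int) -> int:
    """Return the total score after checking item count and the 0-4 coding."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    return sum(responses)


# A respondent answering "very much" (4) on every WURS item reaches the maximum.
wurs_max = sum_score([4] * WURS_ITEMS, WURS_ITEMS)
asrs_max = sum_score([4] * ASRS_ITEMS, ASRS_ITEMS)
```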

| Statistics and analytic plan
A Principal Component Analysis (PCA) with Varimax rotation was run to establish how the items of the WURS and the ASRS contributed to given components, retaining components with eigenvalues above one (we henceforth refer to components as factors; Field, 2013). We calculated receiver operating characteristic (ROC) curves, including the area under the curve (AUC), for the full WURS and ASRS, as well as for the PCA-generated factors.
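The AUC can be read as the probability that a randomly chosen patient scores higher than a randomly chosen control. The sketch below computes it in that pairwise form (equivalent to the Mann-Whitney U statistic divided by the number of pairs). It is a minimal illustration, not the SPSS procedure used in this study; the function name `auc` and the toy scores are assumptions, and confidence intervals are not computed.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the share of case-control pairs in which the case scores higher.

    Tied pairs count as one half, as in the Mann-Whitney formulation.
    """
    if len(case_scores) == 0 or len(control_scores) == 0:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy data (invented for illustration): complete separation gives AUC = 1.0,
# identical score distributions give AUC = 0.5.
perfect = auc([60, 70, 80], [10, 20, 30])
chance = auc([1, 2], [1, 2])
```

The double loop is quadratic in sample size, which is acceptable for samples of the size used here; a rank-based implementation would be preferable for much larger data.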
The likelihood ratios for positive tests (LH+) and negative tests (LH−) and the Diagnostic Odds Ratio (DOR) were calculated using formulas from Fischer, Bachmann, and Jaeschke (2003). The DOR is a measure of a diagnostic test's overall accuracy (Glas, Lijmer, Prins, Bonsel, & Bossuyt, 2003), and unlike positive and negative predictive values, the DOR does not depend on the prevalence of the disease, facilitating comparisons of tests for meta-analyses. A DOR value of 20 or more indicates that an instrument has useful screening properties (Fischer et al., 2003).
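For orientation, the standard textbook definitions of these quantities are sketched below; we assume, without having verified it, that they correspond to the formulas in the cited source. The function name `diagnostic_ratios` is hypothetical.

```python
from typing import Tuple


def diagnostic_ratios(sensitivity: float, specificity: float) -> Tuple[float, float, float]:
    """Return (LH+, LH-, DOR) from sensitivity and specificity.

    LH+ = sensitivity / (1 - specificity)
    LH- = (1 - sensitivity) / specificity
    DOR = LH+ / LH-
    Both inputs must lie strictly between 0 and 1 so no division by zero occurs.
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must be strictly between 0 and 1")
    lh_pos = sensitivity / (1.0 - specificity)
    lh_neg = (1.0 - sensitivity) / specificity
    return lh_pos, lh_neg, lh_pos / lh_neg


# Illustrative values only: a test with 90% sensitivity and 90% specificity.
lh_pos, lh_neg, dor = diagnostic_ratios(0.90, 0.90)
```

Because the DOR is built from sensitivity and specificity alone, it does not change with the proportion of cases in the sample, which is the prevalence independence noted above.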
Cronbach's alpha was calculated to measure internal consistency in the resulting factors of the WURS and ASRS. SPSS version 24.0 was used for the statistical analyses (IBM 26).
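As a rough sketch of the internal consistency measure, Cronbach's alpha can be computed from the item variances and the variance of the sum score. The code below uses the usual formula, alpha = k/(k − 1) × (1 − Σ item variances / variance of total), with sample variances; the function name `cronbach_alpha` and the toy data are assumptions, and the analyses in this study were run in SPSS.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha, where rows[i][j] is respondent i's score on item j."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must answer the same number of items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the sum score across respondents.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("sum scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (invented): two items that always agree give alpha = 1.
alpha_identical = cronbach_alpha([[0, 0], [1, 1], [2, 2]])
```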

| RESULTS
The present study included n = 646 clinically assessed adult ADHD patients and n = 908 controls, resulting in a total sample of 1,554 participants. The mean ages were 34.0 (SD 10.3) years in the ADHD group and 29.4 (SD 7.8) years in the control group (p < .01). There were 48.5% females in the ADHD group and 59.9% females in the control group (p < .01). The total WURS and ASRS scores were strongly correlated (full sample r = .78, p < .001; ADHD group r = .36, p < .001; controls r = .70, p < .001). Figure 1 shows the distributions of WURS and ASRS scores in the ADHD and control samples, including the correlation between the two. For a subset of patients, we also obtained clinician ratings on whether the patients were currently on (n = 420) or off (n = 125) pharmacological treatment for ADHD, as well as if they had been treated for ADHD as a child (n = 89) or not (n = 530). Adults with ADHD on current pharmacological treatment reported a significantly lower ASRS score than the off treatment group, but there was no difference between these groups on the WURS. Adult patients who had been treated for ADHD as a child scored significantly higher on both the ASRS and the WURS compared to those patients who reported no childhood treatment.
Mean scores on the ASRS and WURS for the ADHD group and the control group, as well as for the different subgroups within the ADHD group, are shown in Table 1.

| Factor analyses
The Principal Component Analysis generated a three-factor solution for the WURS items in the full sample (Table 2).
A two-factor solution was generated for the ASRS in the full sample (Table 3), explaining 62.2% of the variance. The first factor included items reflecting symptoms of inattention, the second factor symptoms of hyperactivity and impulsivity. The items reflecting impulsive behavior obtained the highest loadings on the second factor.
Internal consistency measured by Cronbach's alpha was 0.952 for the full ASRS score, 0.924 for the Inattentive factor and 0.918 for the Hyperactivity/Impulsivity factor. were no significant differences between males and females (data not shown). The optimal cutoff balancing the trade-off between sensitivity and specificity for the respective scales may vary depending on the aims in the specific clinical or research setting. Table 4 provides cutoff values for 98%, 95%, 90% and 80% sensitivity and

| DISCUSSION
Both the WURS and the ASRS had excellent screening and psychometric properties, with somewhat stronger properties for the WURS. The recommended short ASRS screener performed as well as the full ASRS. A Principal Component Analysis confirmed the three-factor structure of the WURS described in previous studies (Caci et al., 2010; Kouros, Horberg, Ekselius, & Ramklint, 2018; McCann et al., 2000; Stanton & Watson, 2016), albeit with some differences at item level. The well-described two-factor structure was confirmed for the ASRS. Using area under the curve (AUC), our findings fit well with previous cutoff suggestions by Ward et al. (1993) for the WURS and by Kessler et al. (2005) for the full 18-item ASRS (Table 4). The total sum scores on the WURS and ASRS were strongly correlated.
The delineation of disorder versus normality is a universal problem when a diagnosis is based on symptoms that are dimensional and normally distributed, and it is of particular concern in a disorder for which controlled stimulant substances with potential for abuse are
Finding that the WURS outperformed the ASRS adds to the ongoing controversy over the defining features of adult ADHD. The factor analysis of the WURS showed that its main factor was Aggressiveness and social problems, indicating that these symptoms play an important role in ADHD. Adler et al. (2017) suggested that executive dysfunction is as central as the DSM-5 symptoms to adult ADHD, while emotional dysregulation has been suggested to be more distinct but nevertheless part of the combined presentation of adult ADHD (Shaw, Stringaris, Nigg, & Leibenluft, 2014). In a recent study, both executive function deficits and emotional dyscontrol items were included as part of expanded versions of screening instruments for adult ADHD, showing the increased focus on these symptoms in recent years (Silverstein et al., 2019). The better discriminatory properties of the WURS are noteworthy as our patients were diagnosed as adults based on a comprehensive clinical evaluation following the ICD/DSM criteria.
Thus, even strictly defined adult ADHD patients are more easily distinguished from controls with a broader childhood symptom array than the current DSM core symptoms. This fits well with the well-established finding that ADHD is characterized by childhood onset and symptoms within domains of executive problems and emotional dysregulation. Although traditionally viewed as comorbid problems, these symptoms rather seem to be characteristic of having ADHD itself. Thus, the broader aspect covered by the WURS may reflect the broader picture that is essentially characteristic of persistent ADHD.
Although the AUC was only slightly better for the WURS than for the ASRS, the differences in the diagnostic odds ratios were considerable, as the WURS had an overall better specificity with intact sensitivity. Our findings suggest the ASRS is not adequate in situations requiring very high sensitivity, as the specificity was merely 0.45 at sensitivity 0.95. ADHD and controls. Another possible explanation for the better screening properties of the WURS could be that some of the patients have ADHD in partial remission (and thus a low ASRS score). We found that adult ADHD patients on current pharmacological (mainly stimulant) treatment reported fewer current symptoms of ADHD on the ASRS than those who were not on medication, but there were no statistically significant differences on the WURS. Furthermore, patients treated for ADHD in childhood reported more symptoms than those who had not been treated in childhood on both the WURS and the ASRS, indicating a more severe and persistent phenotype (Halmoy et al., 2009).

| Strengths and limitations
The present findings should be viewed in light of some limitations.
There are problems related to the use of self-report measures because measures that employ a retrospective approach might be affected by memory biases and lack of recall. McGough and Barkley (2004) argued that "a major obstacle to retrospective diagnosis is that it is significantly biased by current functioning." However, our findings show that the retrospective WURS did better than reports of current ADHD symptoms in differentiating adult ADHD patients from controls. This is in line with previous studies on the WURS by, for example, Fossati et al. (2001) showing excellent short-term retest reliability. Both Fossati et al. (2001) and Grogan and Bramham (2016) found that current mood symptoms do not affect the accuracy of retrospective self-ratings of childhood ADHD symptoms.
A recent study has found that the WURS even has acceptable retest reliability over the time span of several years (Lundervold, Vartiainen, Jensen, & Haavik, 2019). The ASRS on the other hand may be more affected by short-term confounders such as affective fluctuations (Lundervold et al., 2011), time of day (Franke et al., 2012) and sleep problems (Benjamins et al., 2016;Brevik et al., 2017). Comorbid psychiatric disorders could have influenced findings, but to maintain external validity we chose not to control for these, as ADHD is more often comorbid than not (Singh, 2008;Sobanski, 2006).
This study was based on an adult ADHD sample ascertained in adulthood, meaning that it is uncertain whether the patients included would have obtained a childhood diagnosis of ADHD, with the expected symptomatic trajectory. This is potentially an important caveat, as some recent studies have put into question ADHD as a neurodevelopmental disorder, highlighting both discontinuation of childhood symptoms as well as a possible adult onset ADHD phenotype (Agnew-Blais et al., 2016;Caye et al., 2016;Moffitt et al., 2015).
We used a clinically validated patient sample and a representative population control sample, which strengthens the clinical utility of our findings. Our control sample was randomly recruited from the Norwegian Medical Birth Registry, without any formal exclusion criteria, so there is a potential for some undiagnosed cases of ADHD in the control group. However, screening instruments are generally more useful in at-risk populations than in the general population, where the performance of the screening tools could be overstated.

| CONCLUSION
The Norwegian translations of the ASRS and the WURS both had excellent psychometric properties and can be used independently for screening and diagnostic assessment for ADHD. We found that the WURS had even better screening properties than the ASRS, in spite of our sample being clinically assessed and diagnosed in adulthood. The wider WURS dimensions of aggression, learning problems and emotional lability were highly relevant for identifying adult ADHD in our sample, supporting a broader conceptualization of ADHD.
With their different temporal focus and clinically relevant symptom domains, we recommend using the ASRS and the WURS jointly to assess for adult ADHD.

ACKNOWLEDGMENTS
We wish to thank all patients and controls who volunteered to participate in this study, and Lisa Vårdal, Anne Halmøy, MD, PhD, and Helene Halleland, PhD, for their work with patient recruitment and data collection.

CONFLICT OF INTEREST
JH has received lecture honoraria as part of continuing medical education programs sponsored by Novartis, Eli Lilly and Company, and Janssen-Cilag. The other authors report no potential conflicts of interest.

AUTHOR CONTRIBUTION
All authors were involved in the conception and design of the study.
JH supervised the data collection for the study. All authors were involved in the data analysis and interpretation, drafting the article and critical revision of the article. All authors approved of the final version to be published.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.