Validity of bipolar disorder hospital discharge diagnoses: file review and multiple register linkage in Sweden

Authors


Mikael Landén, Section of Psychiatry and Neurochemistry, Institute of Neuroscience and Physiology, SE431 80 Molndal, Sweden.
E-mail: Mikael.landen@neuro.gu.se

Abstract

Sellgren C, Landén M, Lichtenstein P, Hultman CM, Långström N. Validity of bipolar disorder hospital discharge diagnoses: file review and multiple register linkage in Sweden.

Objective:  Hospital discharge registers (HDRs) are frequently used in epidemiological research. However, the validity of several important psychiatric diagnostic entities, including bipolar disorder, remains uncertain. Hence, we aimed to develop an optimal algorithm for register-based identification of DSM-IV-TR bipolar disorder.

Method:  We identified potential cases in the Swedish national HDR using two separate discharge diagnoses of bipolar disorder according to ICD versions 8–10 during January 1, 1973 to December 31, 2004. In a randomly selected subsample of 135 cases from the county of Sörmland, two senior psychiatrists reassessed the diagnostic status based on patients’ medical records. We scrutinized false-positive cases and modified the initial algorithm to improve positive predictive value while minimizing false negatives. Finally, we externally validated resulting caseness algorithms by linking HDR diagnostic data with best-estimate clinical diagnoses from the National Quality Assurance Register for Bipolar Disorder (BipoläR), dispensed lithium prescriptions from the National Prescribed Drug Register, and the ICD-10 diagnoses from the National Outpatient Register respectively.

Results:  The algorithm with two discharge diagnoses of bipolar disorder yielded a positive predictive value of 0.81. Modification by excluding individuals diagnosed with ICD-8 296.20 (manic-depressive psychosis, depressed type), and/or ICD-9 296.B (unipolar affective psychosis, melancholic form), gave a positive predictive value of 0.92. The modified algorithm also had statistically superior external validity compared with the original algorithm.

Conclusion:  Our findings suggest that DSM-IV-TR bipolar disorder caseness based on two inpatient episodes with a bipolar disorder diagnosis is sufficiently sensitive and specific to be used in further epidemiological study of bipolar disorder.

Significant outcomes

  •  By using a search algorithm with two discharge diagnoses of bipolar disorder, we found a positive predictive value of 0.81 relative to reassessed diagnostic status based on patients’ medical records.
  •  A revised algorithm yielded a positive predictive value of 0.92 and improved external validity using other national registries.
  •  A valid and optimized algorithm for register-based identification of bipolar disorder will facilitate further research on this major mental disorder in the Swedish Hospital Discharge Register.

Limitations

  •  The hospital discharge register covers only affective episodes that involve hospitalization.
  •  The requirement of informed consent to access the medical records of included individuals might have introduced selection bias.
  •  Non-participants had a small but significantly increased rate of inpatient episodes for bipolar disorder compared with participants. This may warrant some caution as to the generalizability to the most severe forms of bipolar disorder.
  •  Validity testing would have been strengthened further if chart reviews had been supplemented with diagnostic interviews.

Introduction

Several countries maintain nationwide longitudinal hospital discharge registers (HDRs) with mandatory individual-level reporting of diagnoses and dates for assessment and treatment. In addition to facilitating health surveillance, such registers supply ample opportunities for psychiatric epidemiological research. Studies using population-based registries have, for example, provided information about the overall high mortality in schizophrenia (1). A variety of important studies of psychiatric disorders have been based on HDRs in Denmark (2–6), Finland (7, 8), and Sweden (9–13). Further, HDRs might be used to correctly estimate lifetime prevalences of psychotic disorders or other conditions that are associated with high non-response rates in population surveys (7). Finally, HDRs could be used to identify individuals with mental disorders that might be approached for DNA sampling in molecular genetic studies (14).

Although some research has attempted to validate psychiatric HDR diagnoses (15–17), the validity of various important psychiatric diagnostic entities in HDRs, including bipolar disorder, remains uncertain.

Aims of the study

To facilitate further research on bipolar disorder, we set out to develop and validate an optimal algorithm for register-based identification of Diagnostic and Statistical Manual of Mental Disorders, 4th edition, Text Revision (DSM-IV-TR) bipolar disorder in the Swedish Hospital Discharge Register.

Material and methods

Swedish nationwide registries

We linked several longitudinal total population-based registries in Sweden. Each citizen, including immigrants upon arrival to the country, receives a unique personal identification number (18), which makes linking of register data possible. The Swedish HDR (held by the National Board of Health and Welfare) contains discharge diagnoses according to WHO’s International Classification of Diseases and Related Health Problems (ICD). The HDR has nationwide coverage for psychiatric in-patient care since 1973, and discharge diagnoses have been provided according to ICD-8, ICD-9, and ICD-10 (19–21). The Swedish Prescribed Drug Register (PDR) contains individual-based data for all prescriptions dispensed in Sweden since 2005 (22), based on mandatory reporting from the state-owned National Corporation of Swedish Pharmacies.

During the last two decades, national quality assurance registers with detailed individual data related to disorder onset, subtype, clinical course, comorbidity and treatment have also been established in the Swedish health care system. These registers are administered by representatives from the respective clinical specialty and supported by the Swedish Association of Local Authorities and Regions. At the time of data extraction (August 2009), the Swedish National Quality Register for Bipolar Disorder (BipoläR) contained data for 4307 unique individuals diagnosed with bipolar disorder type 1, type 2, NOS, or schizoaffective disorder – bipolar type. BipoläR is based on treating psychiatrists systematically collecting data and assigning best-estimate bipolar disorder diagnoses according to the ICD-10 and DSM-IV-TR (23). Psychiatrists who volunteer in data collection for the BipoläR registry are usually specifically trained in the diagnosis and treatment of bipolar disorder. Importantly, they have access to all available clinical data, including a longitudinal perspective of each patient’s course of illness. Hence, data quality is likely to be high and a bipolar diagnosis in BipoläR could be regarded as the gold standard for a bipolar disorder diagnosis with very high specificity (although with lower sensitivity). BipoläR also contains information about age at onset, family history, number of episodes, suicide attempts, sociodemographic variables, and pharmacological treatment. Finally, we obtained data from the National Outpatient Register (NOR; also held at the National Board of Health and Welfare) for individuals with registered outpatient visits to specialist physicians (other than a general practitioner) that resulted in one or more psychiatric diagnoses according to the ICD-10. This register contains data since 2001 (24).

Study population

Using the HDR, we identified all individuals with at least one discharge diagnosis of bipolar disorder according to ICD-8 (1973–1986; diagnostic codes 296.00/.1/.2/.3/.88/.99), ICD-9 (1987–1996; codes 296.0/.1/.2/.3/.4/.8/.9), or ICD-10 (from 1997; codes F30–31). Cases were diagnosed between January 1, 1973, and December 31, 2004 (the HDR has nationwide coverage for psychiatric inpatient care since 1973, and December 31, 2004, was the last date of inclusion into the current register linkage). To keep the number of false-positive cases low (i.e., improving specificity), without losing too many true-positive cases (i.e., decreasing sensitivity), we required at least two separate discharges from hospital with a diagnosis of bipolar disorder. In accordance with prior studies (25), included individuals had to be alive and not to have emigrated at the time of data merging (2005). Further, individuals with more than one inpatient diagnosis of schizophrenia were excluded. We refer to these search criteria as algorithm (A).
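As an illustration only, the criteria of algorithm (A) can be sketched in Python. The record layout (`Discharge`), the pre-mapped `diagnosis` label, and the `alive_and_resident` set are hypothetical and do not reflect the actual HDR schema. The sketch also assumes that each record is a separate discharge and that schizophrenia diagnoses are counted within the same 1973–2004 window, which the text does not state.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Discharge:
    # Hypothetical record layout; one row per hospital discharge.
    person_id: str
    discharge_date: date
    diagnosis: str  # "bipolar", "schizophrenia" or "other" after ICD code mapping


START, END = date(1973, 1, 1), date(2004, 12, 31)


def algorithm_a(discharges, alive_and_resident):
    """Return the person_ids meeting the algorithm (A) criteria.

    alive_and_resident: set of person_ids alive and not emigrated at data merging.
    """
    bipolar, schizophrenia = {}, {}
    for d in discharges:
        if not (START <= d.discharge_date <= END):
            continue  # outside the inclusion period
        if d.diagnosis == "bipolar":
            bipolar[d.person_id] = bipolar.get(d.person_id, 0) + 1
        elif d.diagnosis == "schizophrenia":
            schizophrenia[d.person_id] = schizophrenia.get(d.person_id, 0) + 1
    return {
        pid
        for pid, n in bipolar.items()
        if n >= 2                            # at least two bipolar discharges
        and schizophrenia.get(pid, 0) <= 1   # exclude >1 schizophrenia diagnosis
        and pid in alive_and_resident
    }
```

Counting discharges per person before applying the exclusions keeps each criterion visible as a separate condition, so an individual rule can be changed without touching the others.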

The study population included all individuals diagnosed with bipolar disorder in the county of Sörmland between 1973 and 2004 according to algorithm (A). In conjunction with enrollment in a genetic study of bipolar disorder, cases were asked for written or oral informed consent to access and review their medical records. Of 558 identified cases, 17 were excluded because they were too acutely ill, 72 were not possible to reach (including deceased individuals), and 214 declined to participate. Hence, 255 subjects (45.7%) agreed to participate and have their medical records scrutinized. From this group, we chose a random sample of 135 subjects to perform an extensive chart review.

At the end of 2008, the total population of Sweden was 9 256 347 individuals whereof 267 524 lived in the county of Sörmland (26). The county of Sörmland has three public psychiatric clinics comprising both inpatient and outpatient care. During the spring of 2008, two board-certified psychiatrists (ML and NL) conducted a meticulous chart review at the respective clinic and reassessed diagnoses according to the DSM-IV-TR. Diagnoses were validated against DSM-IV-TR because this is the most widely used diagnostic classification system in research, whereas ICD-10 is generally used clinically and for patient statistics. The ratings were primarily based on current computerized patient records accessed on-site at each clinic. The records of the 135 patients were scrutinized by the two raters, uncertainties discussed, and best-estimate diagnoses were established. In Sörmland, traditional paper files were replaced by computerized patient records in late 2004 to February 2005. In case no computerized record was available, or if computerized information was insufficient to reliably determine caseness (n = 60), paper-based patient records were ordered from the archives at all three psychiatric clinics. For those without recent contact with public psychiatric care, as represented by no computerized patient records, we also checked for patient records at the only private psychiatric out-patient unit in Sörmland (no hits).

To assess the probability that a bipolar disorder diagnosis in the HDR (according to the algorithm) reflected the best-estimate diagnosis established from chart review, we calculated the positive predictive value (number of true-positive bipolar disorder cases according to file-based best-estimate diagnoses divided by all – true and false positive – bipolar disorder cases suggested by the HDR). Agreement on case/non-case classification between the revised algorithm and best-estimate diagnosis according to the chart review was also tested with Cohen’s kappa using the interpretation guidelines of Landis and Koch (27).

Using national registers to further validate the algorithm

A registration in BipoläR was considered the gold standard for a bipolar disorder diagnosis with very high specificity. Hence, we compared the rate of BipoläR registrations among subjects with HDR-based bipolar disorder. The likelihood for a case being registered in BipoläR was estimated by matching (by birth year and sex) ten randomly selected general population controls without HDR bipolar disorder to each patient with HDR-based bipolar disorder. Cases and controls had to be alive and residing in Sweden at the time of data extraction in 2009. With the assumption that medication with lithium is a specific and independent proxy for bipolar disorder (28), we also compared the relative rate of prescribed and dispensed lithium in the PDR among subjects with HDR-based bipolar disorder vs. matched controls. To minimize false-positive cases of bipolar disorder, we required two dispensed prescriptions of lithium (ATC code: N05AN01). Finally, we compared subjects identified in the HDR as having bipolar disorder and matched general population controls with caseness in the NOR (same algorithms but only for ICD-10 codes F30-F31, because the Outpatient Register has coverage only from 2001). We analyzed data using statistical software programs SAS® version 9.1.3 (SAS Institute Inc., Cary, NC, USA, 2004) and PASW® Statistics 18, release version 18.0.0 (©IBM SPSS Inc., 2009, Chicago, IL, USA) for Mac.
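The matching step described above can be sketched as follows. This is a minimal illustration, not the procedure actually run in SAS or PASW: the function name, the dictionary layout, and the choice to sample controls without replacement across cases are assumptions, and `pool` is expected to contain only individuals without HDR bipolar disorder.

```python
import random
from collections import defaultdict


def match_controls(cases, pool, k=10, seed=0):
    """Draw k random controls per case from the same (birth_year, sex) stratum.

    cases, pool: dicts mapping person_id -> (birth_year, sex).
    Returns a dict mapping case_id -> list of k control ids.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person_id, key in pool.items():
        strata[key].append(person_id)
    for ids in strata.values():
        rng.shuffle(ids)  # random order, so popping yields a random sample
    matched = {}
    for case_id, key in cases.items():
        candidates = strata.get(key, [])
        if len(candidates) < k:
            raise ValueError(f"too few controls for case {case_id!r} in stratum {key}")
        # Assumption: each control is used for at most one case.
        matched[case_id] = [candidates.pop() for _ in range(k)]
    return matched
```

Fixing the seed makes the draw reproducible, which matters when the same control set has to be linked to several registers.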

Improving the search algorithm

To improve the hit rate, we examined which diagnostic codes caused the incorrect inclusion of false-positive cases (see Results). This yielded a revised algorithm, labeled algorithm (B).
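One way to express the revision in code is shown below, using the two codes that the Results report as the main source of false positives. The sketch rests on an interpretation: discharges carrying an excluded code simply do not count toward the two required discharges. The text does not specify how individuals with a mix of excluded and other bipolar codes were handled, and the tuple representation of a discharge is hypothetical.

```python
# Codes named in the Results as causing most incorrect inclusions.
EXCLUDED = {("ICD-8", "296.20"), ("ICD-9", "296.B")}


def meets_algorithm_b(bipolar_discharges, min_discharges=2):
    """Check one person against the revised criterion.

    bipolar_discharges: list of (icd_version, code) tuples, one per discharge
    that counted as bipolar disorder under algorithm (A).
    """
    # Assumption: excluded codes do not contribute to the discharge count.
    remaining = [d for d in bipolar_discharges if d not in EXCLUDED]
    return len(remaining) >= min_discharges
```

Under this reading, a person whose only qualifying discharges carry ICD-8 296.20 or ICD-9 296.B no longer counts as a case, whereas two discharges with other bipolar codes still suffice.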

Ethics

The study was conducted in accordance with the Declaration of Helsinki and approved by the Regional Ethics Committee in Stockholm. Individual medical records were assessed after written or oral informed consent. Data were merged and anonymized by the National Board of Health and Welfare and the linking of personal identification numbers to cases destroyed after merging.

Results

Age, sex, and history of bipolar disorder among all individuals eligible for the study are presented in Table 1. Table 1 also provides a comparison of those who agreed to have their medical records scrutinized (N = 255) and non-participants (N = 303). Non-participants did not differ from participants except for a small but statistically significantly increased rate of inpatient episodes for bipolar disorder among non-participants.

Table 1.   Comparison of sex, age, and bipolar disorder history among participants and non-participants in a validation study of 558 eligible patients with bipolar disorder diagnoses in the Swedish Hospital Discharge Register 1973–2004

Variable | Selected participants (n = 135) | All participants (n = 255) | Non-participants (n = 303) | P* | P†
Male sex (%) | 41.3 | 42.1 | 37.3 | 0.30 | 0.30
Age in 2008 (years): mean (SD) | 59.8 (11.6) | 59.8 (11.3) | 59.1 (12.4) | 0.50 | 0.59
Age at first hospitalization for bipolar disorder (years): mean (SD) | 40.0 (11.7) | 40.6 (11.2) | 40.2 (11.8) | 0.90 | 0.64
Number of hospitalizations for bipolar disorder: median (interquartile range) | 5 (6) | 4 (4) | 5 (7) | 0.74 | 0.04

For both sets of comparisons, we used Fisher’s exact test for sex, t-test for age variables, and Wilcoxon rank-sum test for number of hospitalizations.
*Comparing selected participants with non-participants.
†Comparing all participants with non-participants.

Retrospective chart review and positive predictive value

Among 135 patients identified as having HDR-based ICD-8/-9/-10 bipolar disorder according to algorithm (A), and randomly chosen for a retrospective chart review, 110 were true-positive cases and assigned a lifetime DSM-IV-TR bipolar disorder diagnosis [including bipolar I, bipolar II, and NOS, and schizoaffective disorder – bipolar type (20)]. The latter diagnosis was regarded as a hit because of its close association with bipolar illness (29, 30). The remaining 25 were false-positive cases, that is, were not confirmed upon chart review. This corresponded to a positive predictive value for bipolar disorder of 0.81 (Table 2).

Table 2.   Ability of algorithms (A) and (B) to correctly identify patients with lifetime DSM-IV-TR bipolar disorder* in a random sample of 135 living individuals in Sörmland, Sweden, hospitalized at least twice for bipolar disorder between 1973 and 2004

Bipolar disorder according to | Researcher-assigned DSM-IV-TR bipolar disorder: Yes | Researcher-assigned DSM-IV-TR bipolar disorder: No | Positive predictive value
Algorithm (A) | 110† | 25 | 0.81
Algorithm (B) | 97 | 9 | 0.92

*Researcher-assigned, record-based research diagnoses of bipolar disorder type 1, type 2, and NOS, or schizoaffective disorder – bipolar type.
†Defined as bipolar disorder type 1, type 2, or NOS (total n = 106), or schizoaffective disorder – bipolar type (n = 4).

To improve the search algorithm, we scrutinized the 25 false-positive cases obtained with search algorithm (A). For 16 false positives, diagnostic codes ICD-8 296.20 (manic-depressive psychosis, depressed type) and/or ICD-9 296.B (unipolar affective psychosis, melancholic form) caused the incorrect inclusion (Table 3). The exclusion of cases whose identification depended solely on these two diagnoses yielded a revised algorithm, labeled algorithm (B), with a positive predictive value of 0.92 (corresponding to a sensitivity of 0.88 and a specificity of 0.64). This corresponded to moderate agreement on case/non-case classification between algorithm (B) and the confirmed bipolar disorder diagnoses upon chart review (Cohen’s κ = 0.49, 95% CI: 0.31–0.67).
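The figures above can be retraced from the counts in Table 2. The sketch below takes the chart review as the reference standard within the 135 sampled cases and derives the two cells not printed in the table: 110 − 97 = 13 confirmed cases dropped by algorithm (B) and 25 − 9 = 16 unconfirmed cases correctly dropped. The function name is illustrative, and the confidence interval for kappa is not reproduced.

```python
def two_by_two_metrics(tp, fp, fn, tn):
    """Positive predictive value, sensitivity, specificity and Cohen's kappa."""
    n = tp + fp + fn + tn
    ppv = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n  # observed agreement
    # Agreement expected by chance from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return {"ppv": ppv, "sensitivity": sensitivity,
            "specificity": specificity, "kappa": kappa}


# Algorithm (A): all 135 sampled individuals are algorithm-positive.
ppv_a = 110 / (110 + 25)

# Algorithm (B) within the same 135 individuals.
metrics_b = two_by_two_metrics(tp=97, fp=9, fn=13, tn=16)

print(round(ppv_a, 2))
print({name: round(value, 2) for name, value in metrics_b.items()})
```

Because the sample contains only individuals already selected by algorithm (A), the sensitivity and specificity are relative to that sample and not to the general population.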

Table 3.   Researcher-assigned diagnoses to false-positive cases of lifetime bipolar disorder identified in the Swedish Hospital Discharge Register using algorithms (A) or (B)

25 false positives using algorithm (A) | 9 false positives using algorithm (B)
Schizophrenia + MDD |
Organic psychosis | Organic psychosis
MDD + psychosis + alcohol abuse | MDD + psychosis + alcohol abuse
MDD + psychosis |
MDD + psychosis |
MDD + psychosis |
MDD + learning disorder | MDD + learning disorder
MDD + alcohol abuse |
MDD + alcohol dependence |
MDD + alcohol dependence | MDD + alcohol dependence
MDD + social phobia | MDD + social phobia
MDD + eating disorder | MDD + eating disorder
MDD + personality disorder |
MDD | MDD
MDD | MDD
MDD |
MDD |
MDD |
MDD |
MDD |
MDD |
MDD |
Uncertain |
Uncertain | Uncertain
Uncertain |

MDD denotes DSM-IV-TR major depressive disorder. Algorithm (B) classified 106 individuals as having bipolar disorder among the 135 initially classified as such by algorithm (A).

External validation of identified cases

Table 4 shows how often individuals in the HDR with bipolar disorder according to algorithms (A) and (B) (all such individuals, not only the Sörmland patients), and general population controls matched on sex and birth year, were represented in BipoläR (inclusion vs. no inclusion), the PDR (defined as two or more dispensed prescriptions of lithium vs. 0–1), and the NOR (two or more diagnoses of bipolar disorder according to the corresponding algorithm vs. 0–1). Across all three registers, representation was markedly and statistically significantly higher using algorithm (B) instead of algorithm (A).

Table 4.   Absolute and relative rates of individuals with bipolar disorder (according to algorithms A and B) in the Swedish Hospital Discharge Register and general population controls (matched on sex and birth year) also found in the Prescribed Drug Register, the BipoläR quality register, and the National Outpatient Register respectively

Variable | Percentage of cases in register (vs. population controls) | Odds ratio (95% CI)
Prescribed Drug Register: lithium
 Algorithm (A) | 14.2 (1.4) | 12.3 (11.8–12.9)
 Algorithm (B) | 19.7 (1.2) | 20.0 (19.0–21.0)
BipoläR: bipolar disorder
 Algorithm (A) | 2.7 (0.2) | 12.7 (11.6–13.9)
 Algorithm (B) | 4.0 (0.1) | 61.3 (51.7–72.7)
National Outpatient Register: bipolar disorder
 Algorithm (A) | 15.8 (1.1) | 19.3 (18.4–20.2)
 Algorithm (B) | 23.3 (1.0) | 33.6 (31.7–35.5)

CI, confidence interval.
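To show how the percentages and odds ratios in Table 4 relate, the sketch below computes a crude odds ratio from two proportions, using the lithium rows as input. It is only an approximation: the published percentages are rounded to one decimal, and the text does not state how the matched design was analysed, so the crude values are close to, but not identical with, the reported odds ratios. The function name is illustrative.

```python
def crude_odds_ratio(pct_cases, pct_controls):
    """Unadjusted odds ratio from the percentage registered among cases and controls."""
    p1, p0 = pct_cases / 100, pct_controls / 100
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


# Prescribed Drug Register, two or more dispensed lithium prescriptions.
lithium_a = crude_odds_ratio(14.2, 1.4)  # reported odds ratio: 12.3
lithium_b = crude_odds_ratio(19.7, 1.2)  # reported odds ratio: 20.0

print(round(lithium_a, 1), round(lithium_b, 1))
```

The approximation degrades quickly when the control percentage is very small, as in the BipoläR rows, because rounding 0.1% or 0.2% to one decimal changes the denominator odds substantially.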

Discussion

HDRs are often used in epidemiological research but the validity of bipolar disorder in such registers has not been documented. We validated register-based bipolar disorder diagnoses from 1973 and onwards across the 8th, 9th, and 10th editions of the ICD in the Swedish HDR against DSM-IV-TR diagnoses based on meticulous reviews of medical records. Using two separate discharge diagnoses of bipolar disorder, we found a positive predictive value of 0.81. A revised search algorithm improved the positive predictive value to 0.92. The improved algorithm obtained superior external validity upon linkage of all individuals diagnosed with HDR bipolar disorder with data from the National Quality Assurance Register for Bipolar Disorder, dispensed lithium prescriptions in the National Prescribed Drug Register, and diagnoses of ICD-10 bipolar disorder according to the NOR.

Based on these findings, we suggest that the search algorithm for bipolar disorder caseness based on two separate inpatient episodes according to the Swedish HDR is sufficiently sensitive and specific to be used in further epidemiological and genetic research of bipolar disorder, provided that ICD-8 296.20 (manic-depressive psychosis, depressed type) and/or ICD-9 296.B (unipolar affective psychosis, melancholic form) are excluded. While the BipoläR registry should be regarded as the gold standard for a bipolar disorder diagnosis in Swedish registers with very high specificity and detailed phenotypic information, usage is still limited by a low ascertainment rate. By contrast, the proposed algorithm for identifying bipolar cases in the HDR has complete nationwide coverage and therefore provides greater sample size and statistical power. The drawbacks are limited phenotypic information and that patients who receive only outpatient treatment cannot be tracked.

Considering common differential diagnostic challenges faced by clinicians working with patients with suspected bipolar disorder (32, 33), we expected to find major depressive disorder (MDD), alcohol abuse/dependence, schizophrenia, and schizoaffective disorder of depressed and mixed types among false positives. For 21 of the 25 false-positive cases, an MDD diagnosis was established upon chart review. Six of these also had a psychotic disorder, including one individual re-diagnosed with schizophrenia (Table 3). Comorbid alcohol abuse or dependence was diagnosed for four of the 25 false-positive cases. For another three of the false positives obtained with algorithm (A), too sparse documentation precluded the assignment of any diagnosis.

One limitation of the current study is that only episodes leading to hospitalization were counted. Given the concept of bipolar spectrum disorder and that modern treatment may have reduced the need for inpatient treatment, less severe forms of bipolar disorder may not be optimally captured by the suggested algorithm. Second, the requirement of informed consent to access the medical records of included individuals might have introduced bias. However, participants did not differ from non-participants regarding age, sex, or age at first registered in-patient episode of bipolar disorder. It should be noted that since the HDR started in 1973, the ‘first episode’ is the first one registered from 1973 and onward although the onset might have occurred long before that date. In contrast, participants had slightly fewer in-patient episodes because of bipolar disorder than non-participants. This might reflect a more severe overall illness, including higher rates of comorbidity and relapse, among non-participants. Some caution is therefore warranted regarding the applicability of the suggested algorithm to individuals with the most severe and chronic forms of bipolar disorder. Third, to acquire computerized medical records at individual clinics for the reassessment of diagnoses, we defined the study population as living individuals registered in one single county, Sörmland. However, the population of Sörmland was judged to be sufficiently similar to the general population of Sweden with respect to the sociodemographic variables age, sex, income, and maternal and paternal age at first birth (data not shown, available from first author upon request) (26). Nor did we have reasons to suspect that the diagnostic tradition regarding bipolar disorder differed markedly between Sörmland and Sweden as a whole during this period. Fourth, for the purpose of increasing the specificity, the algorithm excludes cases that have been admitted only once. This should be taken into account when using the algorithm to estimate prevalence. Finally, the validity testing would have been further strengthened if diagnostic interviews also had been conducted. We reasoned, however, that extensive chart reviews complemented with further testing of the algorithms by multiple register linkage would better balance methodological considerations against ethical and resource issues.

Acknowledgements

We are grateful to Rozita Broumandi, Karolinska Institutet, for assistance with data extraction. Funding was obtained from the US Broad Foundation and the Swedish Research Council (NL: 2007-8595; ML: K2008-62x-14647-06-3, K2010-61X-21569-01-1, and K2010-61P-21568-01-4).

Declaration of interest

We report no potential conflict of interests.
