Assessing the accuracy of using diagnostic codes from administrative data to infer antidepressant treatment indications: a validation study

Abstract Purpose To assess the accuracy of using diagnostic codes from administrative data to infer treatment indications for antidepressants prescribed in primary care. Methods Validation study of administrative diagnostic codes for 13 plausible indications for antidepressants compared with physician‐documented treatment indications from an indication‐based electronic prescribing system in Quebec, Canada. The analysis included all antidepressant prescriptions written by primary care physicians between January 1, 2003 and December 31, 2012 using the electronic prescribing system. Patients prescribed antidepressants were linked to physician claims and hospitalization data to obtain all diagnoses recorded in the past year. Results Diagnostic codes had poor sensitivity for all treatment indications, ranging from a high of only 31.2% (95% CI, 26.8%‐35.9%) for anxiety/stress disorders to as low as 1.3% (95% CI, 0.0%‐5.2%) for sexual dysfunction. Sensitivity was notably worse among older patients and patients with more chronic comorbidities. Physician claims data were a better source of diagnostic codes for antidepressant treatment indications than hospitalization data. Conclusions Administrative diagnostic codes are poor proxies for antidepressant treatment indications. Future work should determine whether the use of other variables in administrative data besides diagnostic codes can improve the ability to predict antidepressant treatment indications.


| INTRODUCTION
Nearly half of all antidepressants in primary care are prescribed for indications other than depression, including anxiety disorders, insomnia, and pain. 1 When antidepressants are not prescribed for depression, 2 out of 3 prescriptions are for unapproved (off-label) indications where, in most cases, the drug's use is not supported by strong evidence. 2 These findings highlight the need for more pharmacovigilance and post-market evaluation of antidepressant use for indications other than depression.
Using information from large administrative databases to evaluate antidepressant use is advantageous because such databases can identify large, population-based cohorts of antidepressant users, capture many different off-label uses, and detect rare outcomes or long-term effects that might not otherwise be observed in clinical trials. 3 However, administrative databases do not contain information on treatment indications for drugs, which presents a major obstacle to using these data to evaluate antidepressant use for different indications.
In the absence of documented treatment indications, several stud-

| Context
This study took place in the Canadian province of Quebec, where all residents are publicly insured for the cost of essential medical care. Over 90% of physicians are reimbursed on a fee-for-service basis, with physicians submitting claims to the provincial health insurance agency (the Régie de l'assurance maladie du Québec [RAMQ]) for services provided in hospitals or private clinics. 9 For each claim, physicians can optionally provide a single diagnostic code using the International Classification of Diseases, Ninth Revision (ICD-9), coding system that represents the main reason for the visit. 10 Quebec also maintains a hospitalization discharge summary database (MED-ECHO) containing details of all hospitalizations at acute care institutions in Quebec. Each discharge summary contains a principal diagnosis and up to 15 secondary diagnoses 9 (up to 25 secondary diagnoses starting in April 2006) recorded by using the ICD-9 system until April 2006 and the ICD-10 system thereafter.

| Study design
We considered 13 plausible conditions where antidepressants would be used, including various on-label 11 and reported off-label indications [12][13][14][15] for antidepressants. We conducted a separate validation study for each indication, where the unit of analysis was the prescription.

| Data sources and inclusion criteria
The Medical Office of the XXIst Century (MOXXI) is an indication-based electronic prescribing and drug management system used by consenting primary care physicians at community-based clinics around 2 major urban centers in Quebec. 16 The MOXXI electronic prescribing tool requires physicians to document at least 1 treatment indication per prescription, either by selecting from a drop-down menu that lists on-label and off-label indications without distinction or by typing the indication(s) into a free-text field. In a previous study, 17

KEY POINTS
• Diagnostic codes from administrative health data are often used to infer treatment indications for antidepressant use, but this approach has never been validated against a gold standard.
• We found that diagnostic codes in administrative health data had poor accuracy for inferring antidepressant treatment indications when compared with treatment indications documented by primary care physicians at the time of prescribing.
• The findings from this study suggest that use of administrative diagnostic codes to infer antidepressant treatment indications could introduce significant misclassification bias in studies where this approach is used.

multiple indications documented, the prescription was classified as reference positive for all the indications.

Quebec health administrative databases
Antidepressant prescriptions were classified as positive for a given indication according to administrative data ("test positive") if the patient had an ICD-9 code for the indication recorded in either claims (RAMQ) or hospital discharge (MED-ECHO) data within ±3 days of the prescription date. International Classification of Diseases, Ninth Revision, codes for each indication were identified from code sets used in previous studies 4, [19][20][21] (see Supporting Information Appendix B). For pain, codes for osteoarthritis 22 and rheumatoid arthritis 23 were also included because pain is the primary complaint among patients with these conditions. 24,25 International Classification of Diseases, Tenth Revision, codes recorded in MED-ECHO from April 2006 onward were translated to their ICD-9 equivalent using conversion tables. 26 For 0.6% of antidepressant prescriptions where the patient had diagnostic codes for multiple treatment indications recorded within the time window, the prescription was classified as test positive for all the indications.

a Based on physician-documented treatment indications recorded for antidepressant prescriptions in the MOXXI system. About 1.2% of antidepressant prescriptions were classified as reference positive for multiple treatment indications because more than 1 indication was recorded for the prescription in the MOXXI system.
b Based on diagnostic codes in physician billing and hospitalization discharge summary data that were recorded for patients within ±3 days of the prescription date. About 0.6% of antidepressant prescriptions were classified as test positive for multiple treatment indications because diagnostic codes for more than 1 treatment indication were recorded.

| Patient characteristics
We determined patients' age and sex by using beneficiary information from RAMQ. We measured patients' level of chronic comorbidity by counting the number of distinct Charlson conditions for which the patient had a corresponding diagnostic code 19 recorded in administrative data over the past 365 days.
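The comorbidity count described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the `(date, code)` record layout, and the three-entry prefix map are hypothetical, and the map is a deliberately simplified subset of Charlson conditions, not the code set used in the study.

```python
from datetime import date, timedelta

# Assumption: a tiny, simplified subset of ICD-9 prefixes for Charlson
# conditions, for illustration only (not the study's code set).
CHARLSON_PREFIXES = {
    "410": "myocardial infarction",
    "428": "congestive heart failure",
    "250": "diabetes",
}


def charlson_condition_count(rx_date, diagnoses, lookback_days=365):
    """Count distinct Charlson conditions coded in the lookback period.

    diagnoses: iterable of (diagnosis_date, icd9_code) pairs, where
    icd9_code is a string without the decimal point (hypothetical layout).
    """
    start = rx_date - timedelta(days=lookback_days)
    conditions = set()
    for dx_date, code in diagnoses:
        if start <= dx_date <= rx_date:
            condition = CHARLSON_PREFIXES.get(code[:3])
            if condition is not None:
                # A set keeps each condition once, however often it is coded.
                conditions.add(condition)
    return len(conditions)
```

Collecting conditions in a set means that repeated codes for the same condition are counted once, which matches the idea of counting distinct conditions.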

| Statistical analysis
For each indication, we conducted a separate validation study to calculate 6 measures of accuracy: sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+), and negative likelihood ratio (LR−) (
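For readers less familiar with these 6 measures, the sketch below shows how each follows from the 2×2 table that cross-classifies prescriptions by reference status (MOXXI) and test status (administrative codes). The function and field names are assumptions made for illustration, and the sketch does not reproduce the authors' SAS analysis.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # reference positive, test positive
    fp: int  # reference negative, test positive
    fn: int  # reference positive, test negative
    tn: int  # reference negative, test negative


def _safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def accuracy_measures(c):
    """Compute the 6 accuracy measures from a 2x2 table of counts."""
    sens = _safe_div(c.tp, c.tp + c.fn)
    spec = _safe_div(c.tn, c.tn + c.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": _safe_div(c.tp, c.tp + c.fp),
        "npv": _safe_div(c.tn, c.tn + c.fn),
        "lr_pos": _safe_div(sens, 1 - spec),  # sensitivity / (1 - specificity)
        "lr_neg": _safe_div(1 - sens, spec),  # (1 - sensitivity) / specificity
    }
```

Sensitivity, specificity, and both likelihood ratios depend only on how codes behave within reference-positive and reference-negative prescriptions, whereas PPV and NPV also depend on how common the indication is.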

| Sensitivity analyses
We conducted sensitivity analyses to investigate the effect of (a) increasing the lookback window for diagnostic codes (−30, −60, −90, −180, and −365 days) and (b) restricting the source of diagnostic codes to hospital data only, claims data only, or claims from the prescriber only (within a lookback window of 365 days).
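The windowing rule behind sensitivity analysis (a) can be sketched as below. The function name and arguments are hypothetical, and the sketch assumes that a longer lookback extends only the lower bound of the window while the upper bound stays at +3 days; the text does not state this explicitly.

```python
from datetime import date, timedelta


def is_test_positive(rx_date, dx_dates, days_before=3, days_after=3):
    """Return True if any diagnosis date falls inside the window.

    rx_date: prescription date.
    dx_dates: dates on which a code for the indication was recorded.
    days_before: lookback length (3 for the main analysis; 30, 60, 90,
        180, or 365 for the sensitivity analyses).
    days_after: assumed to stay at 3 days for every lookback length.
    """
    start = rx_date - timedelta(days=days_before)
    end = rx_date + timedelta(days=days_after)
    return any(start <= d <= end for d in dx_dates)
```

For example, a code recorded 26 days before the prescription is missed by the ±3-day window but captured with a 30-day lookback, which is why longer windows can only add test-positive prescriptions.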
To investigate how much of the total variance around each accuracy estimate was due to between-physician differences in coding practices, the 95% CIs corrected for both within-patient and within-physician clustering were compared with 95% CIs corrected for within-patient clustering only. All analyses were conducted using SAS software, version 9.4.
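One common way to correct a CI for clustering is to resample whole clusters with replacement. The sketch below is a simplified, one-way version that resamples patients only and uses a percentile interval for sensitivity. It is an assumption-laden illustration: the function name and record layout are hypothetical, it is written in Python, not SAS, and it does not handle the two-way (patient and physician) clustering used in the study.

```python
import random


def cluster_bootstrap_sensitivity_ci(records, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for sensitivity, resampling patients.

    records: list of (patient_id, reference_positive, test_positive),
    one tuple per prescription (hypothetical layout).
    """
    by_patient = {}
    for pid, ref, test in records:
        by_patient.setdefault(pid, []).append((ref, test))
    ids = list(by_patient)
    rng = random.Random(seed)

    estimates = []
    for _ in range(n_boot):
        tp = fn = 0
        # Draw patients with replacement; keep all of their prescriptions.
        for pid in rng.choices(ids, k=len(ids)):
            for ref, test in by_patient[pid]:
                if ref:
                    if test:
                        tp += 1
                    else:
                        fn += 1
        if tp + fn:  # skip replicates with no reference positives
            estimates.append(tp / (tp + fn))

    if not estimates:
        raise ValueError("no reference-positive prescriptions to resample")
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[min(len(estimates) - 1,
                          int((1 - alpha / 2) * len(estimates)))]
    return lower, upper
```

Because all prescriptions from a sampled patient move together, correlated coding within a patient widens the interval relative to resampling prescriptions independently. Comparing intervals from two clustering schemes, as described above, relies on the same idea.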

| RESULTS
The analysis included a total of 77 700 antidepressant prescriptions     Table 2). In comparison, the proportion of antidepressant prescriptions where the patient had diagnostic codes for these indications ("test positive") was considerably lower, especially for depression and sleeping disorders (Table 2). Consequently, the sensitivity of administrative diagnostic codes was very poor for all treatment indications, ranging from a high of only 31.2% (95% CI, 26.8%-35.9%) for anxiety/stress disorders to as low as 1.3% (95% CI, 0.0%-5.2%) for sexual dysfunction (Table 3). However, the specificity of diagnostic codes was excellent (90%+) for all treatment indications (Table 3).
The predictive value of having an administrative diagnostic code for a given indication recorded varied between indications. When a diagnostic code for a given indication was recorded, the probability that the antidepressant was truly prescribed for the corresponding indication (ie, according to MOXXI) was high for depression (PPV of 80.3%; 95% CI, 73.7%-85.3%), moderate for obsessive-compulsive disorder (OCD) (69.1%; 95% CI, 51.7%-83.3%), and low (~50% or less) for the remaining indications (Table 3). The high PPV of depression codes was mostly

Similarly, conclusions about the predictive value of not having a diagnostic code recorded for a given indication differed depending on whether the NPV or LR− was used as the performance statistic. When a diagnostic code for a given indication was not recorded, the probability that the antidepressant was not prescribed for the corresponding indication in MOXXI was low for depression (NPV of 49.2%; 95% CI, 45.3%-53.2%) but fairly high for anxiety/stress disorders (81.6%; 95% CI, 78.8%-84.0%) and high for sleeping disorders (90.4%; 95% CI, 88.2%-92.4%). For the remaining indications, the NPV was very high (>95%) because of the low prevalence of these indications (Table 3). In contrast, the LR− estimates were close to 1.0 for all indications, suggesting that the absence of a diagnostic code for any plausible indication did not improve the ability to rule out the corresponding indication.
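The contrast between NPV and LR− follows from the standard relation between pre-test and post-test odds: post-test odds = pre-test odds × likelihood ratio. The sketch below works through that arithmetic. The function names are hypothetical, and the prevalence and LR− values in the usage note are illustrative numbers, not estimates from this study.

```python
def post_test_probability(prevalence, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability."""
    pre_odds = prevalence / (1 - prevalence)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


def npv_from_lr_neg(prevalence, lr_neg):
    """NPV implied by a prevalence and a negative likelihood ratio.

    NPV is 1 minus the probability that the indication is present
    despite a negative test.
    """
    return 1 - post_test_probability(prevalence, lr_neg)


# Illustrative values only (assumptions, not study estimates):
rare = npv_from_lr_neg(prevalence=0.02, lr_neg=0.95)
common = npv_from_lr_neg(prevalence=0.50, lr_neg=0.95)
```

With an LR− of 0.95, the implied NPV is about 98% for an indication with 2% prevalence and about 51% for one with 50% prevalence. In both cases the NPV is barely above 1 minus the prevalence, so a missing code changes the probability of the indication very little even when the NPV looks high.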

| Subgroup analyses
For all indications, there was considerable heterogeneity in the PPV and NPV estimates across different classes of antidepressants (Table 4) (Table 7).

| Sensitivity analyses
As expected, using a longer lookback window for diagnostic codes increased sensitivity and decreased specificity for all indications, especially pain (Figure 1A,B). However, even with a lookback window of −365 days, sensitivity remained low at ≤60% for all indications.
Increasing the length of the lookback window also caused the PPV and LR+ to deteriorate for all indications (Figure 1C,E).
Compared with the performance of diagnostic codes from claims data in the past 365 days, diagnostic codes from hospital data in the  Figure 2A). However, when diagnostic codes from claims data in the past 365 days were restricted from all physicians to those from the prescriber only, the sensitivity of diagnostic codes was notably lower for pain only (Figure 2A). Diagnostic codes recorded by the prescriber also had slightly higher (better) PPV and LR+ than diagnostic codes recorded by all physicians (Figure 2C,E).
Finally, for all indications except sleeping disorders, the 95% cluster bootstrap-based CIs 28 around the sensitivity and PPV estimates were noticeably wider when they accounted for both within-physician and within-patient clustering than when they accounted for within-patient clustering only, suggesting that between-physician differences exist in the quality of diagnostic coding for these indications, especially depression (Figure 3).

| DISCUSSION
In this study, we estimated the accuracy with which diagnostic codes in Quebec health administrative records reflected indications for antidepressant therapy in primary care. We found that diagnostic codes for a given indication identified only a small proportion of antidepressant prescriptions for the corresponding indication. Moreover, we found that the absence of a diagnostic code for a given indication did not provide much additional value for ruling out the indication.