
Keywords:

  • Accuracy;
  • ICD-9 code;
  • Spondylarthritides;
  • Database;
  • Sensitivity;
  • Specificity;
  • Predictive value;
  • Area under the curve

Abstract


Objective

To study the accuracy of diagnoses of spondylarthritides in computerized databases at the Minneapolis Veterans Affairs Medical Center.

Methods

Medical records were available and reviewed for a random sample of 184 patients from a cohort of 737 patients seen at the rheumatology clinic between January 1, 2001 and July 31, 2002. We compared 4 database definitions with the medical record gold standard of rheumatologists' diagnosis of ankylosing spondylitis (AS), psoriatic arthritis (PsA), or reactive arthritis (ReA): presence of 1) ≥1 or 2) ≥2 International Classification of Diseases, Ninth Revision (ICD-9) diagnostic codes for diagnoses of AS (720.0), PsA (696.0), and ReA (099.3, 711.11–711.19), and presence of 3) ≥1 or 4) ≥2 ICD-9 codes and prescription of a disease-modifying antirheumatic drug (DMARD). Accuracy was assessed by sensitivity, specificity, positive predictive values (PPVs) and negative predictive values (NPVs), kappa statistic, and receiver operating characteristic (ROC) curve area.

Results

Of 184 patients, 11 (6%) had AS, 17 (9%) had PsA, and 7 (4%) had ReA as per the gold standard. ICD-9 codes for AS, PsA, and ReA were very specific (98–100%) with excellent NPV (99–100%) and PPV (83–100%), good to excellent sensitivity (57–100%), substantial to almost perfect kappa agreement (0.72–1), and high ROC curve area (0.79–1). Addition of presence of DMARD prescription to ICD-9 codes of AS and PsA decreased sensitivity to 27–65% without improving the specificity (which was already high at 99–100%).

Conclusion

The ICD-9 codes for AS, PsA, and ReA in databases are accurate. ICD-9 codes may be used to identify cohorts of patients with spondylarthritides.


INTRODUCTION


The seronegative spondylarthritides are an interrelated and overlapping group of chronic inflammatory rheumatic diseases. This group consists primarily of ankylosing spondylitis (AS), psoriatic arthritis (PsA), reactive arthritis (ReA), arthritis associated with inflammatory bowel disease, and undifferentiated spondylarthritis. Studies of disease epidemiology and outcomes frequently use databases such as Medicaid (1), Medicare (2, 3), group health plans (4, 5), and Veterans Affairs (VA) databases (6, 7) to identify cohorts of patients with a specific disease condition because these databases are cost effective and easy to access. Critics often question the reliability and validity of these data because most are collected primarily for administrative purposes and not for research. It has been suggested that alternate strategies of patient cohort identification such as review of medical records, interview and examination of the patients, patients' self report, or a physician's report of the diagnosis may have more validity. These methods of sample selection often require more resources and are associated with a higher cost. Due to cost and time constraints, health services researchers and epidemiologists often rely on database resources for hypothesis-generating studies. Therefore, administrative and clinical databases are an important resource for researchers.

Computerized administrative databases at VA hospitals, which constitute the largest health care delivery system in the US, can be useful sources of information for clinical and epidemiologic studies (8). With many recent changes, including reorganization of the VA health care delivery system in the mid 1990s (9) and the creation of uniform computerized medical record systems with measures to improve data quality (8, 10) and quality of care initiatives (11, 12), these databases may provide means to identify patient cohorts.

The accuracy of diagnoses of spondylarthritides such as AS, PsA, and ReA in the VA administrative databases has not been established. Previous studies of spondylarthritides have examined agreement between patient self report of diagnosis and computerized registry (13), between self report of diagnostic and classification criteria and medical records (14), and between lay interviewer telephone surveys and diagnosis and classification criteria at a rheumatologist visit (15). In our recent study, we found that the diagnostic code for rheumatoid arthritis in the Minneapolis VA database was inaccurate, with specificity of 55% and positive predictive value (PPV) of 66%, although the sensitivity was 100% (16). The results of that study raised the question of whether the diagnostic codes for other rheumatologic diagnoses including spondylarthritides were also inaccurate. The primary objective of the present study was to test the accuracy of International Classification of Diseases, Ninth Revision (ICD-9) codes for spondylarthritides obtained from the VA administrative databases against the gold standard of a chart diagnosis of the respective spondylarthritides by a rheumatologist on 2 separate visits. The secondary objective was to test whether a selection strategy that merged ICD-9 codes with prescription of disease-modifying antirheumatic drugs (DMARDs) increased the specificity for the respective diagnoses without loss of sensitivity.

PATIENTS AND METHODS


Patient population.

The Human Studies Committee approved the study. An alphabetical list of all patients seen at the Minneapolis VA Medical Center Rheumatology Clinic between January 1, 2001 and July 31, 2002 was obtained (n = 737). From this cohort, we obtained a sample of 252 patients using a circular systematic sampling design (17): a random starting point in the list was chosen, and every third individual was then selected. Complete medical records (paper and computerized) of patients with at least 2 visits to the rheumatology clinic were available for 184 patients. Those with incomplete records (n = 58) or only 1 visit to the rheumatology clinic (n = 10) were excluded (total excluded = 68). Two physicians (JAS and ARH), independent of the rheumatologists who diagnosed and provided outpatient medical care to these patients, reviewed the medical records. We assessed interrater variability for 26 randomly selected charts by calculating the kappa statistic (18). Kappa values were 1 for the diagnoses of spondylarthritides and 0.69 for any rheumatic diagnosis.

We reviewed medical records of all 184 remaining patients. The gold standard for diagnoses was a chart diagnosis of spondylarthritides by a rheumatologist on 2 separate clinic visits. The gold standard of clinical judgment was chosen instead of the classification criteria for several reasons: 1) patients who have been diagnosed with a spondylarthritis by a rheumatologist are most likely to be treated for this condition in a clinical setting; 2) the classification criteria are meant for disease classification and not disease diagnosis, and some patients, especially those with early disease, may not meet these criteria; and 3) criteria for rheumatic diseases are often underdocumented in medical records, as evident in previous studies (16, 19, 20). However, realizing that many researchers may still consider classification criteria the true gold standard, we examined the medical records of patients with rheumatologists' diagnoses of spondylarthritides for the presence of each European Spondylarthropathy Study Group (ESSG) classification criterion (21). For each clinical criterion, we assessed whether the criterion was present, absent, or not documented. The radiographic criterion, namely, sacroiliitis (as defined by the ESSG criteria), was considered to be present if it was documented in the radiologist's reading or reported by an experienced blinded rheumatologist (HK) who read all available radiographs. The medical records were reviewed starting from the first rheumatology encounter in the records, followed by review of multiple subsequent records. We used a standardized data collection form to extract the following information from the chart: demographics including name, age, sex, and social security number; presence of diagnoses of spondylarthritides during rheumatology clinic visits; other medical diagnoses; and presence, absence, or lack of documentation of classification criteria.
We also obtained from the local administrative database a list capturing all inpatient and outpatient ICD-9 codes assigned by health care providers at clinical encounters for AS (720.0), PsA (696.0), ReA (099.3/711.10–711.19), inflammatory bowel disease–associated arthritis (713.1 and either 555 or 556), and unspecified inflammatory spondylarthritis (720.9) between January 1, 2001 and July 31, 2002. Because none of the patients had ICD-9 codes for either inflammatory bowel disease–associated arthritis or unspecified inflammatory spondylarthritis during the study period, results are presented for AS, PsA, and ReA.

To assess the accuracy of definitions combining the pharmacy and ICD-9 code data, a pharmacist searched the pharmacy database for DMARD prescriptions of ≥3 months' duration from the local VA pharmacy database and retrieved information including drug name, dose, quantity, days supply, number of refills, refill dates, and patient identifiers from January 2001 to July 2002. We chose prescription durations ≥3 months because this was the maximum and the most common days supply that could be dispensed at the VA for DMARDs at that time. We searched for the presence of DMARD prescriptions for all DMARDs available at the VA pharmacy during the study period including methotrexate, gold, hydroxychloroquine, cyclosporine, sulfasalazine, azathioprine, minocycline, penicillamine, leflunomide, anakinra, etanercept, and infliximab. Thus, 4 computerized data definitions of diagnoses were compared with the gold standard: presence of ≥1 specific ICD-9 code, presence of ≥2 specific ICD-9 codes, presence of both ≥1 specific ICD-9 code and DMARD prescription, and presence of both ≥2 specific ICD-9 codes and DMARD prescription.
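The 4 definitions above reduce to a simple membership test on per-patient code and pharmacy data. A minimal sketch of that logic (not the authors' actual code; the record format, code sets, and helper function are illustrative assumptions):

```python
# Illustrative sketch: applying the 4 computerized data definitions to a
# hypothetical patient record. Code sets and the record format are
# assumptions for illustration only.
AS_CODES = {"720.0"}   # ICD-9 code for ankylosing spondylitis
PSA_CODES = {"696.0"}  # ICD-9 code for psoriatic arthritis

def meets_definition(icd9_codes, has_dmard_rx, target_codes,
                     min_codes=1, require_dmard=False):
    """True if the patient satisfies the database definition.

    icd9_codes   -- all ICD-9 codes assigned to the patient in the period
    has_dmard_rx -- True if a DMARD prescription of >=3 months was found
    """
    hits = sum(1 for code in icd9_codes if code in target_codes)
    if hits < min_codes:
        return False
    return has_dmard_rx if require_dmard else True

# A patient with two AS codes and no qualifying DMARD prescription meets
# definitions 1 and 2 but not the DMARD-augmented definitions 3 and 4.
codes = ["720.0", "720.0", "401.9"]
print(meets_definition(codes, False, AS_CODES, min_codes=2))                      # True
print(meets_definition(codes, False, AS_CODES, min_codes=2, require_dmard=True))  # False
```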

Statistical analyses.

Measures of accuracy by classic statistical analytic methods.

The diagnosis of AS, PsA, or ReA according to the 4 data definitions described above was determined for each patient and compared with the gold standard. We calculated sensitivity, specificity, PPVs and negative predictive values (NPVs), and likelihood ratios (LRs) for the 4 data definitions for the diagnosis of AS, PsA, or ReA as compared with the gold standard. Sensitivity was defined as the fraction of patients with the diagnosis according to the gold standard that were correctly identified as positive by the data definition. Specificity was defined as the fraction of patients without the diagnosis according to the gold standard that were correctly identified as negative by the data definition. PPV (or NPV) was the fraction of patients with (or without) the diagnosis by data definition that met (or did not meet) the diagnosis according to the gold standard. LRs were calculated, which incorporate both sensitivity and specificity of the test and provide a direct estimate of how much a test will change the odds of having a disease. The LRs for a positive test (or negative test) tell us how much the odds of the disease increase (or decrease) when a test is positive (or negative), i.e., positive LR = sensitivity/(1 − specificity), and negative LR = (1 − sensitivity)/specificity. Positive LRs should be >1, and negative LRs should be a positive fraction between 0 and 1; the further away from 1 that positive LRs and negative LRs are, the better the test's discriminating power. The kappa coefficient was used to describe agreement (beyond chance) between the rheumatologist's diagnosis in the chart (gold standard) and the 4 database definitions. We performed a receiver operating characteristic (ROC) curve analysis (22) for each data definition as measured against the gold standard. An ROC curve area of 0.5 denotes that the method was no better than chance and a curve area of 1 denotes the most accurate method.
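As a worked example of these measures, the sketch below (illustrative only) computes them from a 2×2 table, using the counts implied for the AS ≥1 ICD-9 code definition (10 true positives, 1 false negative, 2 false positives, 171 true negatives). Values computed from raw counts may differ slightly from the published table, which appears to use rounded inputs.

```python
# Accuracy measures from a 2x2 table: sensitivity, specificity, PPV, NPV,
# likelihood ratios, and Cohen's kappa.
def accuracy_measures(tp, fn, fp, tn):
    n = tp + fn + fp + tn
    sens = tp / (tp + fn)          # fraction of gold-standard cases detected
    spec = tn / (tn + fp)          # fraction of non-cases correctly excluded
    ppv = tp / (tp + fp)           # P(disease | positive definition)
    npv = tn / (tn + fn)           # P(no disease | negative definition)
    lr_pos = sens / (1 - spec) if spec < 1 else float("inf")
    lr_neg = (1 - sens) / spec
    # Cohen's kappa: agreement beyond chance, from the marginal totals
    po = (tp + tn) / n
    pe = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    kappa = (po - pe) / (1 - pe)
    return dict(sens=sens, spec=spec, ppv=ppv, npv=npv,
                lr_pos=lr_pos, lr_neg=lr_neg, kappa=kappa)

m = accuracy_measures(tp=10, fn=1, fp=2, tn=171)
print({k: round(v, 2) for k, v in m.items()})
```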

Bayesian approach.

Conclusions regarding the merits of each of the administrative data definitions depend on the relative importance given to sensitivity, specificity, and PPV and NPV. Specificity and sensitivity can be regarded as utility measures of a test procedure under 2 unknown states of nature, i.e., having or not having the disease. A weighted average of these 2 quantities is the Bayes utility of a test. Bayes values for each diagnosis definition were calculated by giving a range of importance (P; value ranging from 0 to 1) to sensitivity and (1 − P) to specificity, where 0 indicates the least importance and 1 indicates the maximum importance. For example, if sensitivity is most critical, we choose the method with the highest sensitivity, i.e., P of 1. However, in various situations sensitivity and specificity have different weights of importance. Linear combinations of sensitivity and specificity for different values of P were graphed. The analyses were performed using SPSS software, version 11.5 (SPSS, Chicago, IL) and S-plus 2000 (Mathsoft, Seattle, WA).
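The weighted average described above can be sketched as follows, plugging in the AS sensitivity/specificity values reported in the Results (used here purely for illustration):

```python
# Bayes value of a definition: P * sensitivity + (1 - P) * specificity,
# for a chosen importance weight P in [0, 1].
def bayes_value(sens, spec, p):
    return p * sens + (1 - p) * spec

icd_only = (0.91, 0.99)   # AS, >=1 ICD-9 code
icd_dmard = (0.27, 0.99)  # AS, >=1 ICD-9 code + DMARD

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"P={p:.2f}  ICD alone={bayes_value(*icd_only, p):.3f}  "
          f"ICD+DMARD={bayes_value(*icd_dmard, p):.3f}")
# With equal specificity and higher sensitivity, the ICD-alone definition
# has a Bayes value at least as high for every P, and strictly higher for
# P > 0.
```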

RESULTS


Study population characteristics.

The medical records of 184 patients with ≥2 rheumatology visits were reviewed and a specific rheumatologic diagnosis was documented for each patient. The gold standard for diagnosis of spondylarthritides, i.e., rheumatologist's diagnosis in patient charts on 2 separate visits, was found in 35 patients: 11 (6%) of 184 had AS, 17 (9%) of 184 had PsA, and 7 (4%) of 184 had ReA. The mean ± SD age of patients with spondylarthritis (n = 35) was 60.4 ± 13.4 years, and all were men. The clinical characteristics of the patients who met the gold standards for AS, PsA, and ReA are described in Table 1. All patients (35 of 35) met the ESSG classification criteria for spondylarthritis (21). The documentation rate of individual ESSG criteria in patient charts ranged from 0% to 97%, the lowest for the inflammatory bowel disease criteria and the highest for the synovitis criteria. Rheumatoid factor was negative in 25 of 35 patients with spondylarthritides, positive in 2 of 35, and not documented in 8 of 35.

Table 1. Clinical characteristics of the patient population defined as those meeting the gold standard (i.e., a rheumatologist's diagnosis on 2 visits >6 weeks apart)*

                                      Ankylosing           Psoriatic           Reactive
                                      spondylitis (n = 11) arthritis (n = 17)  arthritis (n = 7)
Age, mean ± SD years                  59.6 ± 13.3          65.7 ± 13.3         49.4 ± 5.7
Male sex                              100                  100                 100
Met the gold standard                 100                  100                 100
Fulfilled ESSG criteria               100                  100                 100
ICD-9 code                            91                   100                 71
DMARD prescription for ≥3 months      27                   41                  57
Current use of other medications
 Prednisone                           0                    12                  0
 NSAID                                73                   53                  71
 Opioid                               27                   0                   14

* Values are the percentage unless otherwise indicated. ESSG = European Spondylarthropathy Study Group; ICD-9 = International Classification of Diseases, Ninth Revision; DMARD = disease-modifying antirheumatic drug; NSAID = nonsteroidal antiinflammatory drug.

Five patients with ICD-9 codes did not meet the gold standard (false positives): 2 patients with the ICD-9 code for AS had a chart diagnosis of polyarthritis (n = 1) or inflammatory bowel disease–associated arthritis versus AS (n = 1), and 3 patients with the ICD-9 code for PsA had a chart diagnosis of gout versus PsA (n = 1), arthralgia (n = 1), or osteoarthritis versus PsA (n = 1). Ten of 13 patients with the ICD-9 code for AS, 17 of 20 with the code for PsA, and 5 of 5 with the code for ReA met the gold standard criterion of chart documentation. Conversely, 10 of 11 patients with AS, 17 of 17 patients with PsA, and 5 of 7 patients with ReA with gold standard criteria had ICD-9 codes for their respective diseases. There were 132 instances of ICD-9 codes for AS, PsA, or ReA in the study period: 39 (30%) of 132 were chosen by primary care physicians, 8 (6%) of 132 were chosen by nurses/other health care providers, and 85 (65%) of 132 were chosen in the rheumatology clinics.

The sensitivity, specificity, PPV and NPV, kappa statistic, and area under the curve for the 4 computerized data definitions as compared with the gold standard are given in Table 2. The definition requiring the presence of ≥1 ICD-9 code for AS, PsA, and ReA in the administrative databases was very specific (98–100%) with excellent NPVs (99–100%). Sensitivity of ≥1 ICD-9 code was 91% for a diagnosis of AS, 100% for PsA, and 71% for ReA. PPVs of ≥1 ICD-9 code were 83% for AS, 100% for PsA, and 100% for ReA. LRs were very high for a positive test (presence of ICD-9 code) and very low for a negative test (absence of ICD-9 code) (Table 2).

Table 2. Sensitivity, specificity, predictive values, kappa statistic, and receiver operating characteristic (ROC) curve areas for administrative and pharmacy data definitions*

TS | Met GS and TS | Met GS, not TS | Met TS, not GS | Met neither | Sensitivity, % (95% CI) | Specificity, % (95% CI) | PPV, % (95% CI) | NPV, % (95% CI) | LR+ (95% CI) | LR− (95% CI) | Kappa (95% CI) | ROC curve area (95% CI)

AS
 ≥1 ICD code | 10/184 | 1/184 | 2/184 | 171/184 | 91 (87–95) | 99 (97–100) | 83 (78–89) | 99 (98–100) | 82.6 (19.7–347) | 0.09 (0.0–0.7) | 0.82 (0.7–1.0) | 0.95 (0.8–1.0)
 ≥1 ICD code + DMARD | 3/184 | 8/184 | 1/184 | 172/184 | 27 (21–34) | 99 (98–100) | 75 (69–81) | 96 (93–98) | 45.5 (3.2–652) | 0.73 (0.4–1.5) | 0.34 (0.1–0.7) | 0.63 (0.4–0.8)
 ≥2 ICD codes | 9/184 | 2/184 | 0/184 | 173/184 | 82 (76–87) | 100‡ (NA) | 100‡ (NA) | 99 (97–100) | ∞ (NA) | 0.18 (0.0–0.5) | 0.89 (0.7–1.0) | 0.91 (0.8–1.0)
 ≥2 ICD codes + DMARD | 3/184 | 8/184 | 0/184 | 173/184 | 27 (21–34) | 100 (NA) | 100 (NA) | 96 (93–99) | ∞ (NA) | 0.72 (0.4–1.5) | 0.41 (0.1–0.7) | 0.64 (0.4–0.8)
PsA
 ≥1 ICD code | 17/184 | 0/184 | 0/184 | 167/184 | 100‡ (NA) | 100 (NA) | 100 (NA) | 100 (NA) | ∞ (NA) | 0 (NA) | 1 (NA) | 1 (1–1.0)
 ≥1 ICD code + DMARD | 11/184 | 6/184 | 0/184 | 167/184 | 65 (58–72) | 100 (NA) | 100 (NA) | 97 (94–99) | ∞ (NA) | 0.35 (0.2–0.8) | 0.77 (0.7–0.9) | 0.82 (0.7–1.0)
 ≥2 ICD codes | 16/184 | 1/184 | 0/184 | 167/184 | 94‡ (91–97) | 100 (NA) | 100 (NA) | 99 (98–100) | ∞ (NA) | 0.06 (0.0–0.4) | 0.97 (0.9–1.0) | 0.97 (0.9–1.0)
 ≥2 ICD codes + DMARD | 10/184 | 7/184 | 0/184 | 167/184 | 59 (52–66) | 100 (NA) | 100 (NA) | 96 (93–98) | ∞ (NA) | 0.42 (0.2–0.9) | 0.72 (0.5–0.9) | 0.79 (0.6–0.9)
ReA
 ≥1 ICD code | 5/184 | 2/184 | 0/184 | 177/184 | 71 (65–78) | 100‡ (NA) | 100‡ (NA) | 99 (97–100) | ∞ (NA) | 0.29 (0.1–0.9) | 0.83 (0.6–1.0) | 0.86 (0.6–1.0)
 ≥1 ICD code + DMARD | 4/184 | 3/184 | 0/184 | 177/184 | 57 (50–64) | 100 (NA) | 100 (NA) | 98 (96–100) | ∞ (NA) | 0.43 (0.2–1) | 0.72 (0.4–1.0) | 0.79 (0.5–1.0)
 ≥2 ICD codes | 4/184 | 3/184 | 0/184 | 177/184 | 57 (50–64) | 100‡ (NA) | 100‡ (NA) | 98 (96–100) | ∞ (NA) | 0.43 (0.2–1) | 0.72 (0.4–1.0) | 0.79 (0.5–1.0)
 ≥2 ICD codes + DMARD | 4/184 | 3/184 | 0/184 | 177/184 | 57 (50–64) | 100‡ (NA) | 100‡ (NA) | 98 (96–100) | ∞ (NA) | 0.43 (0.2–1) | 0.72 (0.4–1.0) | 0.79 (0.5–1.0)

* Values are the number/total number unless otherwise indicated. Numbers are rounded to the nearest digit. TS = test standard; GS = gold standard; 95% CI = 95% confidence interval; PPV = positive predictive value; NPV = negative predictive value; LR+ = likelihood ratio of a positive test; LR− = likelihood ratio of a negative test; AS = ankylosing spondylitis; ICD = International Classification of Diseases; DMARD = disease-modifying antirheumatic drug; ∞ = infinity; NA = not applicable; PsA = psoriatic arthritis; ReA = reactive arthritis. A likelihood ratio >1 indicates that a test result makes a condition more likely to be present, and a value between 0 and 1 means that the test result makes a condition less likely; the further away from 1 that positive LRs and negative LRs are, the better the test's discriminating power.

Definitions requiring the presence of ≥2 ICD-9 codes improved the PPV for AS from 83% to 100% compared with the definitions requiring ≥1 ICD-9 code; sensitivity for AS, PsA, and ReA decreased from 91% to 82%, from 100% to 94%, and from 71% to 57%, respectively, without any significant change in NPV. LRs for a positive test were higher for both AS and PsA with the definition requiring ≥2 ICD-9 codes than with the definition requiring ≥1 ICD-9 code (Table 2). For ReA and PsA, the positive LR was infinite even with the definition requiring ≥1 ICD-9 code.

The addition of the presence of DMARD prescription to the ICD-9 code for AS, PsA, and ReA reduced the sensitivity to 27%, 65%, and 57%, respectively, without significantly changing the specificity (which was already high at 99–100% with ICD-9 code definitions) (Table 2). DMARD plus ICD-9 code definitions (when compared with ICD-9 code definitions) led to an undesirable change in negative LR (i.e., negative LR became closer to 1). Positive LR was infinite for ReA and PsA for both ICD-9 code definitions, and for the AS definition requiring ≥2 ICD-9 codes.

Bayesian analysis.

Because one may arrive at different conclusions for the best administrative data definition of AS, PsA, and ReA depending on the relative importance given to sensitivity or specificity using the classic methods, we also performed Bayesian analysis. Weighted averages of these 2 criteria against various weights are presented in Figure 1. For example, in Figure 1A, the line labeled ICD represents the values of 0.91 × P + 0.99 × (1 − P) [sensitivity × P + specificity × (1 − P)] for different values of P. If we gave the most importance to sensitivity alone, P would be 1. If we gave the greatest importance to specificity alone, P would be 0. All other values of P reflect the relative degree of importance given to sensitivity as opposed to specificity. For both AS and PsA, the administrative data definition based on ≥2 ICD-9 codes outperformed the ICD-9 plus DMARD definition for all values of P and the definition based on ≥1 ICD-9 code for most of the values of P (Figure 1). For ReA, the definition based on ≥1 ICD-9 code outperformed the other 3 definitions. Other inferences for various values of P may be derived from Figure 1.

Figure 1. A Bayesian approach to comparison of the 4 data definitions. The x-axis represents values of P, the level of importance given to sensitivity (ranging from 0 to 1). The y-axis represents the Bayes values (for the respective diagnoses) based on a linear combination of sensitivity and specificity for different values of P, with a higher value being better. A, Ankylosing spondylitis. B, Psoriatic arthritis. C, Reactive arthritis. ICD = International Classification of Diseases; DMARD = disease-modifying antirheumatic drug.


DISCUSSION


This study found that the ICD-9 codes for AS and PsA in the administrative VA databases had excellent sensitivity, specificity, and NPVs and reasonable PPVs. The ICD-9 code for ReA had very high specificity, PPVs, and NPVs but lower sensitivity. Addition of pharmacy data to the ICD-9 code decreased sensitivity moderately without any significant improvement in the specificity that was already very high with the ICD definitions. Although we calculated many accuracy statistics, the excellent PPVs and NPVs of the ICD-9 codes for AS, PsA, and ReA signify that these ICD-9 codes are likely to be valid in this setting (and perhaps similar settings) for researchers using databases for health services and outcomes studies.

The VA has a state-of-the-art comprehensive computerized medical record system in all of its facilities. VA databases are similar to other commonly used data sets such as Medicare and Medicaid in that all 3 contain information on selected populations: socioeconomically disadvantaged veterans for the VA (23, 24), those ages 65 years and older for Medicare, and socioeconomically disadvantaged members of the general population for Medicaid. Unlike Medicare and Medicaid databases, however, VA databases are not claims based. All VA administrative and clinical databases contain social security numbers as the unique identifiers that help to link them to each other and to non-VA databases (8, 25). The data originate from 2 sources: 1) medical administration clerks who compile patient demographics at various patient encounters, i.e., time of admission, transfer, or discharge, and 2) clinicians who enter orders, clinical notes, medical diagnoses, and results (25). Although originally designed for administrative purposes to fulfill the operational necessity of record keeping (25), as a result of a concerted effort to improve the quality of data, the VA data sets now lend themselves to high-quality health services and epidemiologic research (8).

The diagnoses of spondylarthritides in computerized records or databases have been compared with information obtained from the patients (13, 15). Rasooly et al compared patient self report with a rheumatology clinic computer database in 472 rheumatology outpatients and found that the sensitivity of an exact match was 100% for AS and 50% for PsA (13). Saraux et al compared agreement between the diagnosis of spondylarthritis at rheumatology visits and patients' self report during a telephone interview and found a kappa of 0.78 (15). Our study found high sensitivity (91–100%) and specificity (98–100%) for ICD-9 code definitions for AS and PsA, and moderate sensitivity (57–71%) and high specificity (100%) for ReA, when compared with the gold standard. Kappa values for the ≥1 ICD-9 code definition were slightly higher (0.83–0.91) than those noted in the study by Saraux et al (15). Our study differs from these previous studies with regard to the patient population (veterans versus general population), methodology (random sampling versus convenience/consecutive sampling), data definitions (administrative and pharmacy database definitions versus clinic records/database), and methods of analysis (both classic and Bayesian approaches versus classic approach), which may explain some of the differences in results. The high degree of specificity and predictive values of ICD-9 codes for the diagnoses of spondylarthritides is in contrast to our previous observation of low specificity and predictive values for ICD-9 codes (ICD-9 code alone test definitions) for rheumatoid arthritis in the same database (16). The exact reasons for these differences are not clear to us, but demonstrate that the accuracy of different rheumatic diagnoses in VA databases may vary. 
Differences in disease or patient characteristics, differences in characteristics of physicians choosing the diagnostic codes (rheumatologists more likely to follow almost all patients with spondylarthritides versus both rheumatologists and nonrheumatologists following patients with rheumatoid arthritis), or the fact that our VA medical center participated in the VA cooperative studies for AS, PsA, and ReA (26–28) may have contributed to the higher accuracy of the VA database for these diagnoses compared with other studies and compared with the accuracy observed for rheumatoid arthritis in the same database.

The use of ICD-9 code plus DMARD data definitions only marginally, and not significantly, improved the specificity for each diagnosis, and at the cost of a dramatic reduction in sensitivity. If one wanted a 100% specific method, this approach may be slightly preferable to the ICD-9 code definitions; in most other situations, the ICD-9 code definitions would be preferable for identifying patient cohorts. The inference is similar using the Bayesian approach, which demonstrated that the ICD-9 code alone definitions outperformed ICD-9 plus DMARD definitions for most of or the entire range of P. LRs provided a similar conclusion that ICD-9 code definitions seemed more accurate than ICD-9 plus DMARD definitions. Both ICD-9 code definitions (≥1 and ≥2) performed similarly well with regard to PPV and NPV.

We examined patient records to determine if those with a rheumatologist's diagnosis also met the ESSG classification criteria for spondylarthritis. We realize fully that these are classification (and not diagnostic) criteria and not all patients meet these criteria at the time of diagnosis of spondylarthritis, but we performed this analysis to validate the rheumatologists' diagnosis as the gold standard. Our finding that all patients with a rheumatologist's diagnosis of AS, PsA, or ReA met the ESSG classification criteria provides evidence that a rheumatologist's diagnosis was a valid case definition (or gold standard) in this study. Our concern regarding criteria underdocumentation was confirmed by our observation of a low documentation rate for the presence of positive family history, alternating back pain, and inflammatory bowel disease. The documentation rate was high for inflammatory back pain, asymmetric synovitis, and radiography criteria, thereby providing evidence for variability in documentation depending on the criterion.

The strengths of our study include selection of a random sample, computation of various measures of accuracy, good interobserver agreement, and robustness of interpretation with both classic and Bayesian statistical methods. The high PPVs and NPVs of the database ICD-9 codes for AS, PsA, and ReA suggest that database diagnoses may be used to study health outcomes in these populations in our setting, and that these results can be safely projected onto the underlying true diagnoses.

Our study has certain limitations. It was limited to patients seen in a VA rheumatology clinic with at least 2 visits and therefore may not be generalizable to non-VA health care settings. However, the main purpose of our study was to examine the validity of these definitions specifically in the VA health care system. Previous participation of our VA medical center in the cooperative studies of spondylarthritides may have influenced accuracy; these results need to be replicated at other VA hospitals and clinics that did not participate in those studies. Some patients may receive diagnoses and drug prescriptions from an outside physician and pharmacy, respectively, in addition to or instead of the VA medical center, and our search methods are unable to account for this variation. Our results suggest that even without accounting for non-VA diagnoses, the database definition using ICD-9 codes is reasonably valid, providing evidence to support use of this simple strategy for case identification rather than attempting to obtain additional medical records from non-VA systems. Our results are limited to the diagnoses of spondylarthritides, and results may be different for other rheumatic diseases. Our previous study and the current investigation have examined the accuracy of database definitions for 2 major groups of inflammatory arthritides commonly seen in rheumatology practices and, therefore, add to the current body of literature.

In conclusion, the ICD-9 codes for AS, PsA, and ReA in the VA administrative databases are accurate. Addition of pharmacy data to ICD-9 code data increased the specificity and PPVs for the diagnoses of AS and PsA marginally, but reduced sensitivity and NPVs significantly. It remains to be seen if our findings can be replicated in other VA medical centers and other health care settings. The VA administrative databases may be useful in planning and conducting clinical trials for spondylarthritides.

AUTHOR CONTRIBUTIONS


Dr. Singh had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study design. Singh, Noorbaloochi.

Acquisition of data. Singh, Holmgren, Krug.

Analysis and interpretation of data. Singh, Noorbaloochi.

Manuscript preparation. Singh, Holmgren, Krug, Noorbaloochi.

Statistical analysis. Singh, Noorbaloochi.

Acknowledgements


We thank Drs. Daniel Solomon and Maren Mahowald for their critical review of the manuscript, and Mr. Darshan Singh for his thoughtful suggestions.

REFERENCES
