To conduct a systematic review of the literature on the validation of algorithms identifying infections in administrative data for future use in populations with rheumatic diseases.
Medline and EMBase were searched using the themes “administrative data” and “infection” between 1950 and October 2012. Inclusion criteria consisted of validation studies of administrative data identifying infections in adult populations. Article quality was assessed using a validated tool.
A total of 5,941 articles were identified, 90 articles underwent detailed review, and 24 studies were included. The majority (17 of 24) examined bacterial infections and 9 examined opportunistic infections. Eighteen studies were from the US and all but 4 studies used International Classification of Diseases, Ninth Revision codes. Rheumatoid arthritis patients were studied in 6 of 24 articles. The studies on bacterial infections in general reported highly variable sensitivity and positive predictive value (PPV) for the diagnosis of infections using administrative data (sensitivity range 4.4–100%, PPV range 21.7–100%). Algorithms to identify opportunistic infections similarly had a highly variable sensitivity (range 20–100%) and PPV (range 1.3–100%). Thirteen studies compared the diagnostic accuracy of different algorithms, which revealed that strategies including a comprehensive algorithm using a greater number of diagnostic codes or codes in any position had the highest sensitivity for the diagnosis of infection. Algorithms that incorporated microbiologic or pharmacy data in combination with diagnostic codes had improved PPV for identification of tuberculosis.
Algorithms for identifying infections using administrative data should be selected based on the purpose of the study, with careful consideration as to whether a high sensitivity or PPV is required.
Infections are a common comorbidity for patients with autoimmune rheumatic diseases such as rheumatoid arthritis (RA) (1, 2) and systemic lupus erythematosus (). They may arise as a consequence of the disease itself or of its treatment, causing significant morbidity and mortality. A population approach using administrative data in the form of billing data from physician visits and hospitalizations may be used to obtain comprehensive estimates of the burden of infections in patients with rheumatic diseases. Administrative data are also useful in pharmacoepidemiologic studies for evaluating infections as potential adverse events of medications used to treat rheumatic diseases; however, validated methods for utilizing administrative data to identify infections are required to ensure accurate estimates of infections in populations.
There is increasing interest among epidemiologists and policymakers in the use of administrative data to identify the burden of comorbidities in patients with rheumatic diseases for surveillance and pharmacoepidemiologic purposes. A conference was held in Montreal in February 2011 to develop consensus statements for the use of administrative data for research and disease surveillance in rheumatic diseases (). In preparation for the meeting, a working group (led by DL) was tasked with conducting a series of systematic reviews of the literature to evaluate the validity of algorithms using administrative data for the identification of select comorbid conditions of interest in patients with rheumatic diseases. The primary question addressed by the series of systematic reviews was: “Can administrative health care data accurately identify the chronic conditions of interest for the purpose of using these comorbidities as covariates or as outcomes in research studies?” This systematic review examined whether administrative data can be used to accurately identify infections as covariates or as outcomes.
A systematic literature search was conducted to identify studies reporting on the validation of infections identified using administrative health data. Two medical databases were searched: Medline (from 1950 to October 2012) and EMBase (from 1980 to October 2012). Key search themes included “administrative data” and “serious or opportunistic infections” and were described by medical subject heading terms and keywords. The search strategy is shown in Supplementary Appendix A (available in the online version of this article at http://onlinelibrary.wiley.com/doi/10.1002/acr.21959/abstract). Additionally, the references of all identified studies were hand searched to identify additional relevant literature.
Peer-reviewed studies that reported on the validation of algorithms using administrative data to identify infections were eligible for inclusion. The focus of our review was infections requiring hospitalization and opportunistic infections. We did not attempt to capture specific individual types of infection, but rather to evaluate the ability of administrative data to capture the overall risk of infections. The following eligibility criteria were used: original full-length articles, use of administrative health data, performance of a validation study of the infection diagnosis against a reference standard (such as chart review), and study of adult populations evaluating serious infections (opportunistic infections or infections requiring hospitalization). Studies validating International Classification of Diseases (ICD) versions prior to the Ninth Revision (ICD-9) were excluded, as were studies of acquired immunodeficiency syndrome (AIDS), nosocomial infections, or other specific infections, such as malaria, since they were not relevant to the focus of our study (Figure 1).
A standardized data collection form was used to describe the methods used for validation of the infection diagnosis and to extract the results. Quality was assessed using recently published guidelines ().
After exclusion of duplicates, 5,941 studies were found and 5,851 were excluded after a review of the abstract. A total of 90 studies were included for detailed review and 23 were retained as meeting the criteria for entry (Figure 1). Upon hand searching references, 1 additional study was identified. In total, 24 studies met our criteria for entry into the systematic review (6-29) (Figure 1 and Table 1). Seventeen studies examined bacterial infections, including 8 specifically examining the validation of pneumonia (Table 2). Nine studies examined opportunistic infections (Table 3).
|Author, year (ref.)||Country||Administrative data source||Type of infection studied||Studied as comorbidity or primary disease||Population|
|Bacterial infections (general)|
|Curtis et al, 2007 ()||US||Bacterial||Comorbidity||RA|
|Gedeborg et al, 2007 ()||Sweden||Sepsis, pneumonia, and central nervous system||Primary disease||ICU population|
|Grijalva et al, 2008 ()||US||Pneumonia, sepsis, invasive pneumococcal disease, and opportunistic mycoses||Comorbidity||RA|
|Landers et al, 2010 ()||US||Urinary tract||Primary disease||General population|
|Patkar et al, 2009 ()||US||Bacterial||Comorbidity||RA|
|Schneeweiss et al, 2007 ()||US||Bacterial||Primary disease||VA population|
|Sepsis and bacteremia (exclusively)|
|Madsen et al, 1998 ()||Denmark||Bacteremia||Primary disease||General population|
|Ollendorf et al, 2002 ()||US||Sepsis||Primary disease||ICU population|
|Pneumonia (exclusively)|
|Aronsky et al, 2005 ()||US||Pneumonia||Primary disease||General population|
|Guevara et al, 1999 ()||US||Streptococcal pneumonia||Primary disease||General population age ≥65 years|
|Jackson et al, 2003 ()||US||Pneumonia||Primary disease||General population age ≥65 years|
|Marrie et al, 1987 ()||Canada||Pneumonia||Primary disease||General population|
|Meropol and Metlay, 2012 ()||UK||THIN database codes (Read codes)||Pneumonia||Primary disease||General population|
|Skull et al, 2008 ()||Australia||Pneumonia||Primary disease||General population age ≥65 years|
|Van de Garde et al, 2007 ()||The Netherlands||Pneumonia||Primary disease||General population|
|Whittle et al, 1997 ()||US||Community-acquired pneumonia||Primary disease||General population|
|Yu et al, 2011 ()||US||Pneumonia||Primary disease||General population|
|Opportunistic infections (exclusively)|
|Curtis et al, 2007 ()||US||Opportunistic (and other serious adverse events)||Comorbidity||RA and Crohn's disease patients|
|Calderwood et al, 2010 ()||US||TB||Primary disease||General population|
|Fiske et al, 2012 ()||US||TB||Comorbidity||RA|
|Trepka et al, 1999 ()||US||TB||Primary disease||General population|
|Winthrop et al, 2011 ()||US||TB and nontuberculous mycobacteria||Comorbidity||RA (treated with anti-TNF agents)|
|Yokoe et al, 1999 ()||US||TB||Primary disease||General population|
|Yokoe et al, 2004 ()||US||TB||Primary disease||General population|
|Author, year (ref.)||Reference standard for validation||Case definition for case identification||N||Sensitivity (95% CI), %||PPV (95% CI), %|
|General bacterial infections|
|Curtis et al, 2007 ()||217||–|
|Landers et al, 2010 ()||No clear reference standard, each algorithm compared against another||7 different algorithms including combinations of the following elements: hospital discharge codes for UTI (ICD-9 code 599.0), positive urine culture, presence of fever||2,614 with ≥1 criterion for UTI||ICD-9 code vs. fever + positive culture algorithm: 55.6 (52.7–58.5)||–|
|Patkar et al, 2009 ()||Reviewer's impression on chart review||Evaluated 2 sets of ICD-9 codes, any position in claim: “comprehensive” set, “restricted” set||162||Definite infections: comprehensive 100 (96–100), restricted 59 (48–69)||–|
|Schneeweiss et al, 2007 ()||MD impression or diagnostic criteria on chart review||≥1 ICD-9 code for: ||–||Using MD impression: |
|Gedeborg et al, 2007 ()||ICU database (maintained by 2 ICU physicians)||Primary vs. secondary diagnosis, ICD-9 vs. ICD-10 (see appendix for codes; wide vs. narrow combinations numerous) ||–|
|Madsen et al, 1998 ()||1. Reviewer's impression on chart review; criteria used||ICD-10 (31 unique codes included)||83 (75 patients)||ICD-10: Septicemia: 4.4 (2.4–6.4)||Septicemia: 21.7 (12.8–30.5)|
|2. Bacteremia database||207 (186 patients)||Septicemia and sepsis: 5.9 (3.6–8.2) compared to RS 2|
|Ollendorf et al, 2002 ()||Prospective trial of sepsis||ICD-9 sepsis codes in any position: 038.3, 022.3, 790.7, 038.4, 038.49, 038.40, 038.41, 054.5, 036.2, 038.2, 038.43, 003.1, 038.8, 038.9, 020.2, 038.44, 038.1, 038.0||122||75.4||–|
|Aronsky et al, 2005 ()||3 steps: |
|Grijalva et al, 2008 ()||Medical chart review for Streptococcus pneumoniae organism identification required||Principal vs. secondary position ICD-9-CM codes: ||–|
|Guevara et al, 1999 ()||4,385 (all) 240 (definite) 53 (probable) 268 (possible)|
|Jackson et al, 2003 ()||Chart review; treating physician's impression was pneumonia||2,455||–||65|
|Marrie et al, 1987 ()||Prospective cohort of pneumonia for separate study (laboratory, clinical data)||69.5||57|
|Meropol and Metlay, 2012 ()||Reviewer's impression on chart review||Broad list of hospitalization codes for CAP, including organism-specific codes (total number 59 codes)||59 charts available||–||86 (75–94)|
|Skull et al, 2008 ()||ICD-10-AM codes J10–J18 in any position|
|Whittle et al, 1997 ()||Chart review: “explicit criteria,” “implicit review,” “physician panel”||Discharge ICD-9-CM (principal position) vs. study algorithm vs. DRG||Total: 144|
|Van de Garde et al, 2007 ()||Prospective study of pneumonia||Discharge ICD-9 codes in primary or secondary positions: ||293 total ||–|
|Yu et al, 2011 ()||Chart review: “definite CAP,” “probable CAP” (based on report of physician opinion)|
|Author, year (ref.)||Reference standard for case validation||Case definition for case identification||N||Sensitivity (95% CI), %||PPV (95% CI), %|
|General opportunistic infections|
|Curtis et al, 2007 ()||Chart review using diagnostic criteria (evidence-based abstraction form)||≥1 diagnosis code on any type of claim (could be laboratory or diagnostic) ||–|
|Schneeweiss et al, 2007 ()||MD impression or diagnostic criteria on chart review||–|
|Grijalva et al, 2008 ()||Medical chart review (organism identification required)||Principal vs. secondary positions: ICD-9 codes 117.3, 518.6, 484.6, 112.4, 112.5, 112.81, 112.83, 112.84, 114, 117.5, 321.0, 115 (21 candidiasis, 2 Cryptococcus, 2 aspergillosis, 1 histoplasmosis)||26||–|
|Calderwood et al, 2010 ()||Chart review, TB case must fulfill CDC criteria (for PPV, used denominator of all cases captured by all algorithms for identification and not shown here)||Confirmed TB |
|Fiske et al, 2012 ()||TIMS uses 4 criteria to confirm cases: isolation of organism, positive AFB, clinical diagnosis, and provider diagnosis|
|Trepka et al, 1999 ()||ICD-9 codes 010–018 in any position||133|
|Winthrop et al, 2011 ()|
|Yokoe et al, 1999 ()||Chart review: TB defined according to CDC criteria||45|
|Yokoe et al, 2004 ()||TB registry supplemented by chart review All cases met the following criteria: positive TB skin test, signs and symptoms compatible with TB, and treatment with ≥2 anti-TB medications||Pharmacy data: ≥2 anti-TB medications||244||36||33|
The characteristics of the studies are shown in Table 1. The majority of studies were from the US (18 of 24) and used ICD-9 codes (20 of 24). One study used The Health Improvement Network database, 3 used International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10) codes, and 1 used pharmacy data only (Table 1). Six of the studies were in populations of RA patients.
The studies on bacterial infections in general reported highly variable sensitivities and positive predictive values (PPVs) for the diagnosis of infections using administrative data, depending on the infection, the algorithm used, the population studied, and the reference standard (Table 2). Although specificity was occasionally reported, it was often unclear whether it had been correctly computed from the methods provided, and we have limited our discussion here to sensitivity and PPV.
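For orientation, the sensitivity and PPV reported by these validation studies reduce to simple proportions from a two-by-two table of algorithm result versus reference standard. The sketch below uses hypothetical counts and normal-approximation 95% CIs of the kind most included studies report; it is illustrative only, not a reanalysis of any study:

```python
import math

def proportion_ci(numerator, denominator, z=1.96):
    """Point estimate and normal-approximation 95% CI for a proportion."""
    p = numerator / denominator
    se = math.sqrt(p * (1 - p) / denominator)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

def sensitivity(tp, fn):
    """Sensitivity: algorithm-positive among reference-standard cases."""
    return proportion_ci(tp, tp + fn)

def ppv(tp, fp):
    """PPV: reference-standard cases among algorithm positives."""
    return proportion_ci(tp, tp + fp)

# Hypothetical validation counts against a chart-review reference standard:
# 90 true positives, 10 false negatives, 30 false positives.
sens, s_lo, s_hi = sensitivity(tp=90, fn=10)
print(f"Sensitivity {sens:.1%} (95% CI {s_lo:.1%}-{s_hi:.1%})")
p, p_lo, p_hi = ppv(tp=90, fp=30)
print(f"PPV {p:.1%} (95% CI {p_lo:.1%}-{p_hi:.1%})")
```

Note that sensitivity and PPV depend on different denominators (reference-standard cases versus algorithm positives), which is why a single algorithm can score well on one measure and poorly on the other.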
Nine studies compared the diagnostic accuracy of different algorithms to identify bacterial infections (8-13, 17, 22, 23). Table 4 shows the types of algorithms employed. Algorithms tested varied based on the number of diagnostic codes used to identify a specific infection, whether the code for infection was in the first position (the most responsible diagnosis) or a secondary position, a combination of number of codes and position of codes, or combining data on diagnostic codes with other types of administrative data. Since studies that compared and contrasted differing algorithms offer significant insight into the use of administrative data for identifying infections, some selected examples will be described here, and the remainder is shown in Table 2.
|Diagnostic codes: position 1 vs. any position in the discharge abstract|
|Diagnostic codes: including discharge and admission codes|
|Diagnostic codes: using >1 code (e.g., physician code and diagnostic code)|
|Pharmacy data: either alone or in combination with discharge codes or culture data|
|Diagnostic codes in combination with additional administrative data: age, length of stay, sex, death, season of admission, comorbidity|
|Diagnostic codes in combination with culture data|
Patkar et al () conducted a cross-sectional study of hospitalized RA patients and examined the accuracy of 2 algorithms of ICD-9 codes for identifying infections that were established based on expert consensus. One algorithm was “restrictive” and one was more inclusive, with the goal of maximizing sensitivity. ICD-9 codes could be in any position in the (hospital) discharge data. The restricted set of codes had previously been validated (). The reference standard was chart review by 2 independent trained reviewers, and cases were classified based on “clinical judgment” as “no infection,” “infection empirically treated,” or “definite infection.” The study also simultaneously tested the diagnostic characteristics of a set of infection criteria for 16 types of bacterial infections based on “clinical, microbiological, laboratory and radiographic” parameters. The study concluded that the sensitivity of infections identified using the comprehensive set of ICD-9 codes was 100% (95% confidence interval [95% CI] 96–100%) compared to 59% (95% CI 48–69%) for infections defined with the restricted set, using “definite” infections as the reference standard. The specificity of infections using the comprehensive set of codes was lower compared to the restricted set using the same reference standard (40%; 95% CI 31–49% versus 81%; 95% CI 73–87%). Lastly, the study also examined the diagnostic utility of 16 infection criteria used in combination with ICD-9 codes for identifying infections and found that the combination of the two led to the greatest accuracy (PPV 96%).
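The comprehensive-versus-restricted tradeoff that Patkar et al describe can be sketched as a membership test over a hospitalization's discharge codes. The code sets below are illustrative placeholders, not the published lists:

```python
# Illustrative ICD-9 code sets; the actual published lists differ.
RESTRICTED = {"038.9", "481", "590.10"}                 # narrow, previously validated set
COMPREHENSIVE = RESTRICTED | {"486", "599.0", "682.9"}  # broader set, aimed at maximizing sensitivity

def flags_infection(discharge_codes, code_set):
    """True if any code, in any position of the discharge abstract, is in the set."""
    return any(code in code_set for code in discharge_codes)

# A hypothetical admission: RA (714.0) plus pneumonia, organism unspecified (486).
admission = ["714.0", "486"]
print(flags_infection(admission, RESTRICTED))     # False: 486 is not in the narrow set
print(flags_infection(admission, COMPREHENSIVE))  # True: captured by the broader set
```

The broader set catches more true infections (higher sensitivity) but also flags more non-cases, which is the specificity loss the study reports.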
Grijalva et al () examined infections in Medicaid patients with RA, including community-acquired pneumonia, invasive pneumococcal disease, sepsis, and opportunistic mycoses. They compared algorithms in which the infection was coded in the first position (principal) versus any other position in the discharge summary. A medical chart review was the gold standard and only the PPV was reported. The PPV for all diagnoses was higher when the infection was identified by a code in the principal diagnostic position (except for invasive pneumococcal disease, which had a PPV of 100% regardless of the field).
Gedeborg et al () examined central nervous system infections, sepsis, and pneumonia in an intensive care unit (ICU) population defined by a number of algorithms: using either ICD-9 or ICD-10 codes in either the primary or secondary position in the discharge abstract. For sepsis, they also examined using a “wide” algorithm of codes (higher number) versus more “narrow” criteria (smaller number of codes). The gold standard was the ICU database, which was maintained by the ICU physicians and was separate from the discharge register. The authors demonstrated that restricting the case definitions (using a narrower algorithm) increased the accuracy of the algorithm, but at the expense of sensitivity. Similar findings occurred when only the principal position in the discharge abstract was used in the algorithm. They also noted that the ICD-9 and ICD-10 codes performed differently (ICD-9 more accurately for sepsis and ICD-10 more accurately for pneumonia).
Whittle et al () selected a random sample of hospitalized patients with an ICD-9, Clinical Modification (ICD-9-CM) diagnosis of pneumonia. They excluded patients with hospital-acquired pneumonia and patients with human immunodeficiency virus/AIDS or organ transplants. They compared 3 administrative data–based algorithms for identifying subjects with pneumonia against the reference standard of clinical chart review by blinded abstractors with medical training. The algorithms evaluated were: 1) an algorithm developed by the authors based on hospital discharge data, including ICD-9-CM codes and a patient management categories (PMC) system, which grouped patients based on related clinical diagnoses; 2) an algorithm using diagnosis-related group (DRG) classification; and 3) the presence of a pneumonia ICD-9-CM code in the principal diagnosis position of hospitalization data. The first algorithm (using PMC and ICD-9-CM codes) had a sensitivity of 89%, a specificity of 80%, and a PPV of 89%. Interestingly, algorithm 3 (using only an ICD-9-CM code for pneumonia in the principal diagnosis position) had a similar sensitivity (84%), specificity (86%), and PPV (92%) and was less complex to use. Lastly, algorithm 2 (based on DRGs) had a lower sensitivity for identifying pneumonia (74%), but had the highest PPV (93%). Some comorbidities, including lung cancer, made it more difficult to classify cases, but overall accuracy did not vary by age, number of secondary diagnoses, or vital status at discharge.
Aronsky et al () examined 5 different algorithms with specific codes to identify pneumonia using hospitalization discharge data: 3 of the algorithms utilized ICD-9 codes in varying number or position, with the third algorithm identifying severe pneumonia cases (pneumonia cases with sepsis or respiratory failure), and the last 2 algorithms used DRGs in different combinations. The authors had a complex 3-step reference standard for pneumonia, which is described in detail in the study (). The authors examined emergency department patients, 73.2% of whom required hospitalization (data are presented for all patients and hospitalized patients separately). Lastly, they combined chart review with the claims-based algorithms described above to evaluate whether the patients identified by the 5 different algorithms had different features with respect to age, sex, hospitalization rate, pneumonia severity and inpatient mortality, cost, and length of stay for the subset that was hospitalized.
Algorithms 2 and 3 included a greater number of diagnostic codes and had the highest sensitivity and PPV (Table 2), but results varied between patients who required hospitalization and those who did not. In the entire sample, the algorithms had a sensitivity of 65–66% and a PPV of 80%, whereas when hospitalized patients were examined separately, algorithms 2 and 3 performed slightly better, with a sensitivity of 68–69% and a PPV of 84%. When claims-based data were combined with chart review, length of stay and costs were lower using DRG-based algorithms than with the reference standard, and mortality was slightly lower using one of the DRG algorithms (algorithm 4), but the other features described above did not differ measurably between the reference standard and the patients identified by the 5 algorithms.
Yu et al () examined diagnostic codes for pneumonia combined with other types of administrative data, including demographic features (e.g., age, sex, length of stay, season), relevant comorbidities (e.g., asthma, heart failure), and procedure codes, and examined the performance of various algorithms against the gold standard of chart review using classification and regression tree (CART) analysis. They determined that the performance of the algorithms varied by age group. Overall, compared with models using only a primary discharge diagnosis code for pneumonia, the CART algorithms improved sensitivity by 18–32%, with only a small (2–7%) decrease in PPV.
Opportunistic infections were examined in 9 articles (6, 11, 14, 24-29). Three studies examined a variety of opportunistic infections (6, 11, 14). In the study by Schneeweiss et al (), opportunistic infections, including pulmonary tuberculosis (TB), atypical mycobacteria, candidiasis, Cryptococcus, and aspergillosis, were examined. Candidiasis had the lowest PPV (20%) and the remainder had PPVs that varied between 67% and 100%. The overall PPV for the identification of an opportunistic infection using administrative data was 58% (95% CI 46–70%), increasing to 73% (95% CI 61–85%) if Candida infections were excluded. Grijalva et al (described above) showed a high PPV for opportunistic mycoses (100%) when the first diagnostic position was used ().

The second article by Curtis and colleagues () reviewed adverse events, including opportunistic infections, in patients with RA or Crohn's disease treated with anti–tumor necrosis factor α agents. Patients were identified using medical and pharmacy claims data from a large US health care organization. The opportunistic events of interest, including active TB, Pneumocystis jiroveci, histoplasmosis, coccidioidomycosis, and Cryptococcus, were identified using ≥1 diagnosis code on any type of claim after the index date (including physician visits, diagnostic tests, or radiologic studies). Other adverse events captured by the study included aplastic anemia, non-Hodgkin's lymphoma, and “lupus-like syndrome.” The reference standard was medical chart review using an “evidence-based, pilot-tested data abstraction form.” The PPV of claims data for confirmed adverse events (including opportunistic infections and other adverse events) was poor overall at 18% (95% CI 9–33%). Individual PPVs for opportunistic infections were not reported. For some infections, including TB, none of the cases could be confirmed on chart review.
Of note, overall, the PPVs of claims from inpatient settings were higher than those in outpatient settings, and the PPV was higher if >1 diagnostic claim was used in the algorithm for case definition. Because there were very few infectious complications other than TB (n = 14 with TB and n = 7 other), it is not possible to comment on the PPV of specific opportunistic infections identified using administrative data.
Six studies specifically examined algorithms for identifying TB (24-29). Calderwood et al developed algorithms for TB detection incorporating ICD-9 codes for TB, pharmacy data, and an order for acid-fast bacilli. They tested the algorithm in 3 separate cohorts (a development cohort, a historical cohort, and a prospective cohort; the first 2 are shown in Table 3). Although the PPV of their screening criteria for confirmed active TB was modest (64%), the PPV for physician-suspected active TB was 91%, and the algorithm achieved its goal of high sensitivity (100%). They then implemented their algorithm during 18 months of prospective followup for physician-suspected TB and demonstrated a high PPV for physician-suspected active TB (100%; only 1 case was not confirmed); however, this represents only 7 cases and further validation is required.
In a recently published study by Fiske et al () of RA Medicaid patients using a TB registry as the gold standard, ICD-9 data alone grossly overestimated the number of TB cases (449 versus 10 confirmed cases in the registry); even when ICD-9 codes were combined with pharmacy data, the false-positive rate was still 75%. Trepka et al () also demonstrated that the sensitivity and PPV of a discharge diagnosis of TB were low (47.7% and 38.3%, respectively).
Yokoe et al () examined pharmacy data alone for identification of TB using a definition of prescription for ≥2 anti-TB medications and similarly found very low sensitivity and PPV (36% and 33%, respectively).
Different algorithms may perform differently in different administrative databases. Winthrop et al () examined 2 different administrative data sources, a Veterans Affairs data source and data from Kaiser Permanente, and found differing accuracy of their algorithms for identification of TB and nontuberculous mycobacteria (examples of some of the algorithms are shown in Table 3). This study also demonstrated that inclusion of microbiologic evidence is a highly sensitive and accurate method for case ascertainment and that TB diagnostic codes in combination with pharmacy data were superior to TB codes alone (Table 2).
The quality of the studies was assessed using a standardized assessment (), and key features of our quality review are shown in Table 5. Overall, the studies were rated as good quality. Twenty-two studies (91.7%) explicitly stated in the introduction that disease identification and validation were among the goals of the study, and 19 studies (79.2%) reported a PPV and/or negative predictive value, with 15 studies (62.5%) reporting 95% CIs. Areas for quality improvement include the following: only 10 studies (41.7%) described the training and expertise of those reading the reference standard, only 6 studies (25%) clearly stated that the readers of the reference standard were blinded, and only 8 studies (33.3%) reported ≥4 estimates of diagnostic accuracy (Table 5).
|Author, year (ref.)||Intro: states disease identification and validation as goals of study||Methods: describes disease classification||Number and training of those reading the reference standard described||Readers of the reference standard were blinded||Results: study flow diagram||Results: ≥4 estimates of diagnostic accuracy are reported||Results: for relevant subgroups additional data are presented||Results: PPV reported||Results: 95% CI reported||Discussion: applicability findings discussed|
|Aronsky et al, 2005 ()||Yes||?||?||?||No||Yes||No||Yes||Yes||Yes|
|Calderwood et al, 2010 ()||Yes||Yes||Yes||?||No||No||No||Yes||Yes||Yes|
|Curtis et al, 2007 () (bacterial)||No||Yes||Yes||Yes||No||No||No||No||No||?|
|Curtis et al, 2007 () (opportunistic)||Yes||Yes||?||?||No||No||No||Yes||Yes||Yes|
|Fiske et al, 2012 ()||Yes||Yes||?||?||No||Yes||No||Yes||Yes||Yes|
|Gedeborg et al, 2007 ()||Yes||Yes||Yes||?||No||Yes||No||No||Yes||Yes|
|Grijalva et al, 2008 ()||Yes||Yes||Yes||Yes||No||No||No||Yes||Yes||Yes|
|Guevara et al, 1999 ()||Yes||Yes||?||Yes||No||Yes||No||Yes||No||Yes|
|Jackson et al, 2003 ()||No||Yes||No||?||No||No||No||Yes||No||No|
|Landers et al, 2010 ()||Yes||Yes||NA||NA||Yes||No||No||No||Yes||Yes|
|Madsen et al, 1998 ()||Yes||Yes||No||?||No||No||No||Yes||Yes||Yes|
|Marrie et al, 1987 ()||Yes||?||No||?||Yes||No||No||Yes||No||?|
|Meropol and Metlay, 2012 ()||Yes||Yes||No||No||No||No||No||Yes||Yes||Yes|
|Ollendorf et al, 2002 ()||Yes||Yes||?||?||No||No||No||No||No||Yes|
|Patkar et al, 2009 ()||Yes||?||Yes||?||Yes||Yes||No||Yes||Yes||Yes|
|Schneeweiss et al, 2007 ()||Yes||Yes||Yes||?||No||No||No||Yes||Yes||Yes|
|Skull et al, 2008 ()||Yes||Yes||Yes||Yes||Yes||Yes||Yes||Yes||Yes||Yes|
|Trepka et al, 1999 ()||Yes||Yes||Yes||?||No||No||Yes||Yes||No||Yes|
|Van de Garde et al, 2007 ()||Yes||Yes||?||?||No||No||Yes||No||No||Yes|
|Whittle et al, 1997 ()||Yes||Yes||Yes||Yes||No||Yes||Yes||Yes||No||Yes|
|Winthrop et al, 2011 ()||Yes||Yes||No||?||No||No||Yes||Yes||Yes||Yes|
|Yokoe et al, 1999 ()||Yes||Yes||No||?||No||No||No||Yes||Yes||Yes|
|Yokoe et al, 2004 ()||Yes||Yes||?||?||No||No||Yes||Yes||No||Yes|
|Yu et al, 2011 ()||Yes||Yes||Yes||Yes||No||Yes||Yes||Yes||Yes||Yes|
Infectious complications are a significant cause of morbidity and mortality in rheumatic diseases. The results of our review have important implications for researchers and policy planners using administrative data for disease surveillance. Infections are also an important outcome of interest in pharmacoepidemiologic studies evaluating adverse events of medications used to treat rheumatic diseases. The principal finding of our study is that hospitalization administrative data have variable accuracy for identification of serious infections as outcomes or comorbidities, depending on the type of infection, source of administrative data, population studied, and algorithm used. Although we initially set out to define the most appropriate algorithms for identifying infections using administrative data, it is apparent from our review that we cannot endorse a specific algorithm, since the choice of algorithm would depend on the purpose of the study (i.e., whether it is more important to maximize sensitivity or PPV). Additionally, no specific threshold values exist for accuracy measures that are defined as acceptable (). We did uncover certain principles when choosing algorithms for identifying infections in administrative data that are important to consider when designing a study, and we have summarized these below.
Our study has a number of limitations. First, our search strategy was designed to broadly evaluate infections (specifically, opportunistic infections and those requiring hospitalization) and their identification using administrative data. As such, we did not search medical databases with an exhaustive list of individual types of infections, and there may be specific infections for which validation in administrative data sets exists that were not identified by our search. Additionally, the choice of index terms for this systematic review was difficult because administrative databases are not well indexed in the literature databases, and therefore relevant studies may have been missed. A major limitation of applying the algorithms identified by our search is that the majority utilized ICD-9 codes, and these will not be applicable in administrative data sets in jurisdictions using ICD-10 codes. It is also possible that the performance of diagnostic coding algorithms may vary by jurisdiction depending on coding practices (this was shown in one of the included studies, which used cross-validation in a separate administrative database ()). Finally, comparison of algorithm performance across studies was limited by the heterogeneity of reference standards, which ranged from clinical impression based on chart review to specific “evidence-based” criteria and even identification of events in prospective cohorts.
Despite these limitations, this work demonstrates some general key principles. A number of studies compared differing algorithms for identifying infections and demonstrated that increasing the number of diagnostic codes for infections improves sensitivity; however, this often comes at the expense of decreased specificity. The use of multiple data sources for identifying infections also improved accuracy. For example, using infection diagnostic codes from hospital discharge data in combination with microbiologic or pharmacy data improved sensitivity and specificity for the diagnosis of TB.
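The multi-source principle above can be sketched in code. The following is an illustrative sketch only, not a validated algorithm from any of the reviewed studies: it flags probable TB only when an ICD-9 TB diagnosis code is accompanied by a dispensed anti-TB drug. The record layout and the (standard ICD-9 range 010–018) code prefixes are assumptions for illustration.

```python
# Hypothetical sketch: require both a TB discharge diagnosis code and an
# anti-TB pharmacy claim, trading some sensitivity for improved PPV.
TB_ICD9_PREFIXES = ("010", "011", "012", "013", "014",
                    "015", "016", "017", "018")
TB_DRUGS = {"isoniazid", "rifampin", "pyrazinamide", "ethambutol"}

def probable_tb(discharge_codes, pharmacy_claims):
    """Flag probable TB only when a diagnosis code is corroborated
    by a dispensed first-line anti-TB drug."""
    has_dx = any(code.startswith(TB_ICD9_PREFIXES) for code in discharge_codes)
    has_rx = any(drug in TB_DRUGS for drug in pharmacy_claims)
    return has_dx and has_rx

print(probable_tb(["011.93"], ["isoniazid", "rifampin"]))  # True
print(probable_tb(["011.93"], []))                         # False
```

Requiring corroboration from a second data source is what drives the PPV gain: a diagnosis code entered during workup, before TB is proven, no longer suffices on its own.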
The position of the diagnostic code is also important. Using infection diagnostic codes in any position of hospitalization data improved sensitivity compared to an algorithm using only codes in the first position; however, the latter improved PPV.
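The position tradeoff can be made concrete with a small sketch. The hospitalization records, code list, and reference-standard labels below are toy data chosen for illustration, not results from the reviewed studies:

```python
# Toy comparison of any-position vs. primary-position code matching,
# with sensitivity and PPV computed against a reference standard.
PNEUMONIA_CODES = {"480", "481", "482", "486"}

def match_any_position(dx_codes):
    return any(c in PNEUMONIA_CODES for c in dx_codes)

def match_primary_only(dx_codes):
    return bool(dx_codes) and dx_codes[0] in PNEUMONIA_CODES

def sensitivity_ppv(records, matcher):
    tp = sum(1 for dx, truth in records if matcher(dx) and truth)
    fn = sum(1 for dx, truth in records if not matcher(dx) and truth)
    fp = sum(1 for dx, truth in records if matcher(dx) and not truth)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sens, ppv

# (diagnosis codes in coded order, reference-standard pneumonia yes/no)
records = [
    (["486", "401"], True),   # primary pneumonia code, true case
    (["428", "486"], True),   # secondary pneumonia code, true case
    (["428", "486"], False),  # secondary code refuted on chart review
    (["401"], True),          # true case missed entirely by coding
]
print(sensitivity_ppv(records, match_any_position))  # sensitivity 0.67, PPV 0.67
print(sensitivity_ppv(records, match_primary_only))  # sensitivity 0.33, PPV 1.00
```

On these toy records the any-position matcher recovers the secondary-position true case (higher sensitivity) but also admits the refuted one, while the primary-only matcher accepts fewer cases with greater certainty, mirroring the tradeoff reported above.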
Finally, the strategy of using ICD-9 diagnostic codes to screen for infections, followed by chart review to confirm them, improves PPV [8, 13, 17]; however, access to medical records, or even to limited patient data from discharge summaries, is unavailable in many centers and may be impractical for large population studies.
Although the algorithms presented for identifying bacterial infections from administrative data generally had reasonable sensitivity, we identified some significant exceptions worth noting. For example, some infectious complications, such as systemic candidiasis, had a very low PPV, and estimates of accuracy for sepsis were highly variable [10, 11, 15, 16]. Both sepsis and candidiasis have complex definitions, and further validation is likely required before applying the presented algorithms in different databases.
In contrast to the information available for bacterial infections, fewer data were available evaluating the accuracy of administrative data for identifying opportunistic infections, and the number of patients in each study was relatively small (especially for non-TB opportunistic infections), likely because opportunistic infections are rare. Additionally, information about the occurrence of opportunistic infections such as TB is often maintained by additional agencies such as public health departments, which may have more accurate information on infections but whose data may not be linkable to other sources of administrative data.
Hospital records alone are an inaccurate data source for identifying TB; the primary reason may be that cases are often coded as TB during investigation, before the diagnosis has been proven. Use of pharmacy data to identify TB can also be problematic if patients obtain medications from public health departments, since these dispensings are not captured in pharmacy billing databases. Our results suggest that the methods most likely to succeed in identifying opportunistic infections would require linkage to additional public health data sources for reportable diseases such as TB and/or confirmation of cases using culture data.
Our quality assessment of validation studies using administrative data identified some deficiencies: many studies did not report on the training and expertise of the individuals interpreting the reference standard, or whether they were blinded, either of which could introduce bias. Additionally, many studies did not report 4 or more tests of diagnostic accuracy, which has been listed as a quality criterion for validation studies of administrative data. Among studies that did report multiple measures, the methods for calculating specificity were often inadequately described, and we have therefore chosen not to report this measure. Since the quality criteria for validation studies of administrative data were only recently published, we hope that future studies of this nature will adhere more closely to these recommendations.
In conclusion, when using administrative data to identify serious infections as outcomes or covariates, hospitalization data can be used to identify serious bacterial infections. If greater sensitivity is desired, we recommend a more comprehensive definition that includes a greater number of individual infection codes and/or accepts a diagnostic code for infection in any position of the claims data. Current data are not sufficient to recommend the use of administrative data to identify opportunistic infections without linkage to multiple data sources to ensure adequate specificity.
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Barber had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Barber, Lacaille, Fortin.
Acquisition of data. Barber, Fortin.
Analysis and interpretation of data. Barber, Lacaille, Fortin.