Validity of hospital ICD‐10‐GM codes to identify anaphylaxis

Anaphylaxis (ANA) is an important adverse drug reaction. We examined positive predictive values (PPV) and other test characteristics of ICD‐10‐GM code algorithms for detecting ANA as used in a multinational safety study (PASS).

• Restriction of the algorithm to primary discharge codes will substantially improve the PPV, but many cases will be missed.
• The more restricted algorithm is considered helpful for bias analysis in comparative research on relative risks of ANA.

| INTRODUCTION
Anaphylaxis (ANA) is defined as a severe and immediate hypersensitivity reaction with rapid onset following exposure to an antigenic trigger. 1 The definition applies irrespective of the underlying pathophysiological mechanism. 2 Reported incidence rates range from 1.5 to 7.9 per 100,000 person-years with a lifetime prevalence of up to 5%. [3][4][5] Cases are most likely encountered in emergency departments, where they may account for about 0.03% of visits and show a 0.7% case fatality rate. 5,6 For ANA, there is no single universally applicable code in the International Classification of Diseases systems (ICD-9, ICD-10). Various ANA-specific codes may apply, depending on context, severity and trigger, in addition to codes describing allergic reactions or adverse drug reactions. 7 Various combination algorithms have been used to identify ANA as a safety outcome in pharmacoepidemiological research. Where reported, diagnostic performance varies widely depending on the algorithm used and the context in which it was applied. 7-10 Information on the validity of algorithms used to identify this adverse event in the respective population/setting is therefore crucial when examining the risk of ANA.
Following safety concerns of the European Medicines Agency (EMA) on the risk of ANA from the use of intravenous iron, a multinational European post-authorization safety study was recently performed (IV iron PASS). 11 Data sources included the German Pharmacoepidemiological Research Database (GePaRD), which contains claims data from statutory health insurance providers in Germany. 12 Access to medical charts is not possible in GePaRD due to data protection regulations. Therefore, we applied the algorithms which were developed for the PASS to administrative data from a single hospital within the catchment area of GePaRD to examine the diagnostic accuracy of these algorithms (external indirect validation).
Of primary interest was the positive predictive value (PPV) of the main PASS outcome definition as used in GePaRD.

| Study design and setting
The study was performed as a cross-sectional validation study. Potential cases were retrospectively identified from hospital administrative data. Medical chart review was performed to verify the diagnosis. Protocol and procedures were developed in consensus with the IV Iron PASS study group. 11 The setting was an acute care hospital in Germany (approximately 830 beds). The five departments contributing the largest numbers of cases with eligible discharge codes and those likely to be involved in the treatment of ANA were approached for participation (Gastroenterology, Cardiology, Internal Medicine/Nephrology, General & Visceral Surgery, Dermatology and Emergency Medicine). All but Dermatology agreed to contribute. Data on in-hospital treatment as well as outpatient care delivered on the hospital premises by hospital-employed specialists were available (ICD-10-GM codes). The study period ranged from January 01, 2004, to April 30, 2019. We included patients aged 18 and older.

| Identification of potential cases from Hospital Information System (HIS)

| Sampling procedure
Sampling from HIS included all cases for which any ANA related diagnostic code as listed in Table 1 was documented during the study period (primary and secondary, discharge and admission). For nonspecific codes, only in-hospital care was considered. All sampled cases ("eligible cases") served as a basis (sampling frame) to which the various case-finding algorithms were applied.

| Case finding algorithms
Case-finding algorithms used in the PASS followed published recommendations on the development of ANA algorithms. [9][10][11] In the PASS, ANA was assumed if at least one of the following criteria was fulfilled:
• A: Inpatient or emergency room encounters: any specific diagnostic code for ANA
• B: Outpatient encounters: any specific diagnostic code for ANA in combination with at least one symptom, procedure and/or treatment code indicating ANA or shock
• C: Inpatient or emergency room encounters: any non-specific diagnostic code combined with at least one symptom code compatible with ANA AND at least one code indicating shock or death.
Symptom and procedure codes as used for criteria B and C are shown in Table 2. For the primary outcome (main algorithm), discharge and admission and primary and secondary codes were considered. Minor modifications were necessitated by characteristics of the contributing databases. For GePaRD, this related primarily to using ICD-10-GM (German Modification) codes and missing information on in-hospital medication.
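The logic of criteria A, B and C can be illustrated as simple set checks on the codes of one encounter. The following is a minimal sketch, not the PASS implementation: the `Encounter` structure, the function names and the placeholder code lists are assumptions made for illustration. Only the three specific ANA codes are taken from Table 1; the non-specific, symptom, shock/death and supporting code lists must be supplied by the caller.

```python
from dataclasses import dataclass, field

# Specific ANA codes as listed in Table 1 of the text.
SPECIFIC = frozenset({"T78.2", "T88.6", "T80.5"})


@dataclass
class Encounter:
    """Hypothetical record of one encounter (structure assumed for illustration)."""
    setting: str                      # "inpatient", "emergency" or "outpatient"
    codes: set = field(default_factory=set)


def matched_criteria(enc, nonspecific, symptoms, shock_or_death, supporting):
    """Return the set of criteria ("A", "B", "C") met by one encounter.

    The four code sets are caller-supplied; none of them is defined here.
    """
    hospital = enc.setting in ("inpatient", "emergency")
    has_specific = bool(enc.codes & SPECIFIC)
    met = set()
    # A: inpatient/emergency encounter with any specific ANA code
    if hospital and has_specific:
        met.add("A")
    # B: outpatient encounter with a specific code plus a supporting
    #    symptom, procedure or treatment code
    if enc.setting == "outpatient" and has_specific and enc.codes & supporting:
        met.add("B")
    # C: inpatient/emergency encounter with a non-specific code, a symptom
    #    code AND a code indicating shock or death
    if (hospital and enc.codes & nonspecific
            and enc.codes & symptoms and enc.codes & shock_or_death):
        met.add("C")
    return met


def is_potential_case(enc, **code_sets):
    """ANA is assumed if at least one criterion is fulfilled."""
    return bool(matched_criteria(enc, **code_sets))


# Placeholder code lists (NOT real ICD-10-GM codes), for demonstration only.
PLACEHOLDERS = dict(
    nonspecific={"NONSPEC-1"},
    symptoms={"SYMPTOM-1"},
    shock_or_death={"SHOCK-1"},
    supporting={"SYMPTOM-1", "SHOCK-1"},
)

if __name__ == "__main__":
    enc = Encounter("outpatient", {"T88.6", "SYMPTOM-1"})
    print(matched_criteria(enc, **PLACEHOLDERS))
```

Returning the set of matched criteria rather than a single flag reflects that the criteria are not mutually exclusive, so per-criterion counts can be derived from the same function.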
For the validation study, we used the main algorithm as used by GePaRD. However, information on medication and procedures other than as coded via ICD-10-GM was unavailable.
In addition, we examined the following modifications:
• Modification 1 (most specific): primary discharge codes only

| Data extraction
All eligible cases with specific diagnoses, all cases fitting criterion C, as well as a random sample (target size 300) of cases with non-specific diagnoses, were selected for medical record extraction.
Discharge letters and emergency room notes were collected from the electronic hospital system, complemented by a hard copy search for outpatient notes. Anonymized documents were reviewed in random order by two independent trained researchers (AT, SK). Information was extracted using a standard data form. This included type of stay, physician-reported ANA or allergy-related diagnoses, as well as any information on relevant symptoms, treatments, procedures, suggested trigger and timing (speed of onset, time from exposure to onset of symptoms) (Form available as supplemental material, S1).
Free-text physician-reported ANA or allergy-related diagnoses were grouped as ANA, adverse drug reaction (AE), allergic reaction other than ANA, past ANA/known allergy, and none (no ANA-related diagnosis).

TABLE 1 ICD-10-GM codes used for initial identification of eligible cases
ICD-10-GM code    Descriptor
"Anaphylaxis-specific codes"
T78.2    Anaphylactic shock, unspecified
T88.6    Anaphylactic shock owing to adverse effect of correct drug or medicament properly administered
T80.5    Anaphylactic reaction due to serum
Other "non-specific codes for anaphylaxis" or "allergy codes"

| Criteria defining true cases

| Adjudication procedure
Following independent extraction and preliminary categorization by the clinical reviewers, cases were categorized by reviewer consensus according to the verification criteria as ANA ("true" cases, confirmed ANA), non-ANA and non-evaluable (insufficient information). 5 In addition, verification was performed by applying a computerized algorithm to the extracted data, as shown in Table 3. Consensus-based and computer-derived diagnoses were compared, and inconsistencies were resolved by re-evaluation.

| Descriptive and main analyses
All analyses are presented as case-based (per encounter) unless stated otherwise. Additional patient-based analyses were performed by including only the first encounter of a patient within the study period.
PPVs were calculated as the proportion of confirmed ANA cases, relative to all potential cases with EMR available, presented as percentage (%) with 95% Clopper-Pearson confidence intervals (CI).
Non-evaluable cases were treated as non-ANA ("worst PPV scenario") for the primary analysis.
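As an illustration of this calculation, the sketch below computes a PPV with an exact (Clopper-Pearson) 95% CI using only the Python standard library. It is not the authors' SAS code. The function names and the example counts are hypothetical, and the interval bounds are found by bisection on the binomial distribution rather than via beta quantiles, which yields the same interval. By default, non-evaluable cases count as non-ANA, as in the primary analysis.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_decreasing(f, iters=100):
    """Root on [0, 1] of a function that decreases from positive to negative."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided CI for a proportion k/n, returned as (lower, upper)."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    # Lower bound: the p for which P(X >= k | p) equals alpha/2
    lower = 0.0 if k == 0 else _root_of_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: the p for which P(X <= k | p) equals alpha/2
    upper = 1.0 if k == n else _root_of_decreasing(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


def ppv_with_ci(confirmed, non_ana, non_evaluable, non_evaluable_as_ana=False):
    """PPV in percent with 95% CI; non-evaluable cases count as non-ANA
    unless non_evaluable_as_ana is set."""
    n = confirmed + non_ana + non_evaluable
    k = confirmed + (non_evaluable if non_evaluable_as_ana else 0)
    lower, upper = clopper_pearson(k, n)
    return 100 * k / n, 100 * lower, 100 * upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not study data)
    print(ppv_with_ci(confirmed=30, non_ana=15, non_evaluable=5))
```

Keeping the non-evaluable cases in the denominator in both settings means that only the numerator changes when the handling of non-evaluable cases is varied.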

| Subgroup and sensitivity analysis, additional analysis using missed cases
Results are also presented separately per criterion if at least ten potential cases were identified for the respective criterion. In addition, patient-based analyses are presented along with case-based analyses. Subgroup analyses were planned for type of stay, sex, and department.
Sensitivity analysis used the following modifications of the case verification algorithm: • S1 ("best PPV scenario"): Non-evaluable cases are considered true ANA.
• S2 ("clinically sensible diagnosis"): True ANA is assumed despite insufficient information or failure to meet all formal criteria where adrenaline, cardiopulmonary resuscitation (CPR), transfer to intensive care (ICU), intubation, or artificial ventilation had been applied in the context of an allergic event.
Additional sensitivity analysis examined the effect of excluding cases discharged prior to 2008 in order to detect effects from an organizational change in coding practice.

| Additional diagnostic information
Using all cases sampled as eligible for which extraction was performed, sensitivity (SE), specificity (SP) and negative predictive values (NPV) were calculated with 95% Clopper-Pearson CI.
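For orientation, these measures follow from a 2x2 cross-classification of algorithm result against chart-confirmed diagnosis. The sketch below shows point estimates only; the function name and the example counts are hypothetical and do not reproduce study data.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Point estimates from a 2x2 table of algorithm result vs confirmed ANA.

    tp: algorithm positive, ANA confirmed
    fp: algorithm positive, ANA not confirmed
    fn: algorithm negative, ANA confirmed (missed case)
    tn: algorithm negative, ANA not confirmed
    Returns None for a measure whose denominator is zero.
    """
    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),   # SE: share of true cases found
        "specificity": ratio(tn, tn + fp),   # SP: share of non-cases excluded
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only
    for name, value in diagnostic_measures(tp=30, fp=10, fn=20, tn=140).items():
        print(f"{name}: {value:.3f}")
```

Because SE, SP and NPV depend on the algorithm-negative cases, they are sensitive to how those cases were sampled, whereas the PPV depends only on the algorithm-positive cases.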

| Data management and quality control
We used double data entry of chart extraction data, predefined plausibility checks for the full dataset, and double programming for all main analyses. Data collection and management were done using the OpenClinica open source software, version 3. 13 Main statistical analyses were generated using SAS software, version 9.4. 14

| Ethics and data privacy

| Description of HIS data and sampled cases
The evolution of case numbers for the main algorithm is shown in Figure 1. Nephrology/Internal Medicine emerged as the most important contributing medical specialty once the main algorithm was applied.
Almost all potential cases were identified by criterion A, in particular from codes T78.2 ("ANA not specified", 50 cases) and T88.6 ("drug-related ANA," 39 cases). The expanded algorithm (modification 2) increased the overall number of potential cases to 122 (113), mostly due to deceased patients from cardiology or oncology with AE coded as a secondary diagnosis. More detailed numbers per subgroup and algorithm are available as supplementary material (S2).

| Chart extraction and case adjudication results
Of 510 cases (433 patients) sampled for data extraction, chart information could be retrieved for 416 cases (344 patients) (81.6%, Figure 1).

Note: All cases are considered, irrespective of whether medical records (EMR) were available. A: in-hospital or emergency room encounters with specific ANA codes. B: outpatient cases with specific ANA-related codes (+ symptom or procedure codes). C: in-hospital non-specific cases (+ symptom + procedure codes). Any: any of the criteria A, B or C applies. Criteria A, B, C are not mutually exclusive. Diagnoses are not mutually exclusive. Sum of cases per diagnosis may exceed column total. AE: Adverse effect. Nos: not otherwise specified, not specified. ANA: anaphylactic shock, anaphylactic reaction. Department: Department of discharge. In the case of "other," the case was admitted to one of the participating departments but transferred prior to discharge.

| Extracted clinical information relating to ANA
Of the 49 confirmed cases from the main algorithm, 36 had a diagnosis of ANA reported in the medical chart (73.5%).

Note: Diagnoses were treated hierarchically (as ordered), thus were mutually exclusive. Triggers: not mutually exclusive, sum of cases may exceed total.

[...] and 91.5%, respectively) (Table 7). As expected, for the most restrictive algorithm, SE was worst (32.4%), while SP was excellent (98.0%) (modification 1). SE was best for the simulating algorithm (75.0%, modification 3).

| DISCUSSION
In this indirect external validation study, we determined the validity of ICD-10-GM code-based algorithms for identifying ANA. For the primary outcome, a PPV of 63% was calculated. This is in accordance with results from direct validation within IV Iron PASS, as well as with results from previous research on comparable algorithms. [9][10][11] About a third of all true cases in our preselected sample were missed. The more specific algorithm resulted in a PPV of almost 80% but identified substantially fewer cases.
Almost all potential cases had been identified by specific ANA related codes in in-patient cases (criterion A). For outpatient and nonspecific ANA-related codes, the algorithm required additional ICD-10 symptom codes (criteria B and C), which were likely to be underreported. We used several approaches to examine the effect this may have had on the validity of the algorithms.
First, the main algorithm was expanded to include all deaths occurring in combination with ANA related codes. This increased numbers of potential ANA, but the resulting PPV was substantially worse. In contrast to a previous report, based on our data, misclassification of ANA related deaths does not seem to play a relevant role when examining ANA in safety studies, at least when case fatality is low. 15
Second, we examined the frequency with which any of the symptoms, procedures and treatments which were part of the PASS outcome algorithms were reported in the charts, thus would have been available for coding. Using these data to inform case finding ("simulation" algorithm) improved case ascertainment as well as the PPV for criterion C (non-specific codes). The effect on the overall PPV, however, was small.
Lastly, we analyzed the proportion of cases missed by the combination algorithms as compared to using ANA related codes without any further conditions. These results have to be interpreted with caution as cases with non-specific codes were under-sampled for pragmatic reasons. Also, we only evaluated a preselected sample with an increased probability of ANA. Taking these factors into account, SE would be substantially lower, and NPV higher than estimated. This supports the impression that the inclusion of non-specific codes to identify ANA is probably not efficient, even if more symptom codes were available.
Overall, the study profited from close collaboration with an international study group with respect to protocol adherence, quality assurance and immediate applicability. Two of the algorithms we examined were directly comparable to the main and the expanded outcome definitions used in the PASS.
Limitations include the restriction to a single center, insufficient case numbers for subgroups, and underrepresentation of outpatient cases. The proportion of hospitalization for ANA varies substantially in the literature and may be as low as around 12%. 4,20 However, ANA related codes in ICD-10 as used in the PASS algorithms are reported to be biased towards severe cases. 7 Also, algorithms were formulated with a focus on drug-induced ANA, which seems to be associated with a higher risk of severe ANA as defined by hospitalization, ICU treatment, CPR, or fatal course. 4,20,21 Therefore, we assume that a focus on hospitalized ANA cases is probably appropriate for most comparative drug safety studies using ICD-10 based algorithms. Better documentation by treating physicians will be essential to improve coding, but coding itself is hampered by the unsatisfactory classification of ANA by ICD-10 diagnostic codes. The revised ICD system (ICD-11, expected to be introduced from 2022) and the addition of novel methods to identify potential cases, in particular, natural language processing techniques, may improve the recognition of ANA in the future. 22-25

| CONCLUSIONS
The assessed algorithm seems useful for identifying ANA cases in comparative safety studies, particularly in hospital settings. A more restricted modification may be used for sensitivity analysis to examine the effect of including false-positive events on relative estimates. Both algorithms underestimate the absolute risk of ANA.
Identifying cases via non-specific or outpatient codes may improve sensitivity, but efficiency is questionable as long as recognition, reporting and coding of diagnoses, symptoms and procedures are insufficient.