Predicting Outcome in Dogs with Primary Immune‐Mediated Hemolytic Anemia: Results of a Multicenter Case Registry

Background: Outcome prediction in dogs with immune‐mediated hemolytic anemia (IMHA) is challenging and few prognostic indicators have been consistently identified.
Objectives: An online case registry was initiated to prospectively survey canine IMHA presentation and management in the British Isles; to evaluate 2 previously reported illness severity scores, the Canine Hemolytic Anemia Objective Score (CHAOS) and Tokyo; and to identify independent prognostic markers.
Animals: Data from 276 dogs with primary IMHA across 10 referral centers were collected between 2008 and 2012.
Methods: Outcome prediction by previously reported illness‐severity scores was tested using univariate logistic regression. Independent predictors of death in hospital or by 30 days after admission were identified using multivariable logistic regression.
Results: Purebreds represented 89.1% of dogs (n = 246). Immunosuppressive medications were administered to 88.4% of dogs (n = 244), 76.1% (n = 210) received antithrombotics and 74.3% (n = 205) received packed red blood cells. Seventy‐four per cent of dogs (n = 205) were discharged from hospital and 67.7% (n = 187) were alive 30 days after admission. Two dogs were lost to follow‐up at 30 days. In univariate analyses, CHAOS was associated with death in hospital and death within 30 days. Tokyo score was not associated with either outcome measure. A model containing SIRS classification, ASA classification, ALT, bilirubin, urea and creatinine predicting outcome at discharge was accurate in 82% of cases. ASA classification, bilirubin, urea and creatinine were independently associated with death in hospital or by 30 days.
Conclusions and clinical importance: Markers of kidney function, bilirubin concentration and ASA classification are independently associated with outcome in dogs with IMHA. Validation of this score in an unrelated population is now warranted.

Immune-mediated hemolytic anemia (IMHA) is among the most common autoimmune conditions affecting dogs, 1 and some aspects of its pathogenesis have been well characterized. 2,3 Despite such insights, the prognosis for dogs with IMHA remains guarded, with published case fatality rates for primary IMHA in dogs ranging from 26% to 60%. [4][5][6] Previous studies have linked various clinicopathologic abnormalities with outcome in dogs with IMHA. Few prognostic indicators are consistent across multiple studies, however, perhaps because of differences between study populations or because of a lack of standardization. It has been suggested that validation and standardization of diagnostic criteria are urgently required for dogs with IMHA and that future interventional clinical trials would benefit from stratification by mortality risk. 7 Mortality risk assessment for clinical trials is typically performed using illness severity scores, 8 each of which is a multifaceted scoring system. Two such schemes, the canine hemolytic anemia objective score (CHAOS) and a score developed in Japan (Tokyo), have been proposed, a 9 but neither has been independently evaluated to determine whether it remains prognostic outside of the population from which it was generated. Alternatives to these disease-specific illness severity scores that might be easier to estimate are the American Society of Anesthesiologists (ASA) health classification and the presence or absence of markers of a systemic inflammatory response syndrome (SIRS). The ASA classification is typically used to evaluate patient risk for anesthesia, 10 but the classification is easy to apply and has been used as a marker of disease severity in other canine populations. 11 The inflammatory response associated with IMHA in dogs is well recognized and can be evaluated through measurement of acute phase proteins 12 or cytokine concentrations. 13 These measures are not widely available, however, whereas a SIRS score based on readily obtained clinical data is a more universal means to identify dogs with systemic inflammation. 14
Several studies from individual centers in the United Kingdom have been published recently, but each described relatively few cases and studied distinct aspects of the disease. Even with the benefit of these data, it is difficult to summarize the demographics, therapies and outcomes of the overall UK canine IMHA population presenting to referral centers. [15][16][17] In this study we aimed to address these knowledge gaps by surveying case presentations, management strategies and outcomes of dogs with IMHA presenting to multiple referral centers in the British Isles. In addition, we aimed to test the association of the illness-severity markers ASA and SIRS status with outcome and to test the predictive ability of 2 previously published IMHA-specific illness severity scores. We also aimed to identify independent prognostic markers from our own dataset using a multivariate analysis approach and hypothesized that a multivariable scoring system would predict survival better than individual variables alone.

Sample Size
Based on previous publications, we estimated case fatality at discharge at 17% 16 and that 20% of dogs would have previously identified risk factors. 6,18,19 We aimed to detect a 2-fold increase in case fatality risk where such factors were present and therefore planned to enroll 335 dogs. b
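The text does not state which sample size formula was used. As an illustration only, the following minimal sketch applies a standard two-proportion calculation with the Fleiss continuity correction. The two-sided alpha of 0.05, the power of 80%, the use of 17% as the fatality risk in dogs without risk factors, and the function and parameter names are all assumptions made here, not details taken from the study.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p_unexposed, risk_ratio, unexposed_per_exposed,
                                alpha=0.05, power=0.80):
    """Return (n_exposed, n_unexposed) for comparing two proportions.

    Uses the normal-approximation formula for unequal group sizes with the
    Fleiss continuity correction. alpha and power defaults are assumptions.
    """
    p0 = p_unexposed                # assumed risk without the risk factor
    p1 = p_unexposed * risk_ratio   # risk with the risk factor
    k = unexposed_per_exposed       # e.g. 4 when 20% of dogs are exposed
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + k * p0) / (1 + k)  # pooled proportion under the null
    diff = abs(p1 - p0)
    # Uncorrected size of the exposed (smaller) group
    n = (z_a * sqrt((1 + 1 / k) * p_bar * (1 - p_bar))
         + z_b * sqrt(p1 * (1 - p1) + p0 * (1 - p0) / k)) ** 2 / diff ** 2
    # Fleiss continuity correction
    n_cc = n / 4 * (1 + sqrt(1 + 2 * (k + 1) / (n * k * diff))) ** 2
    n_exposed = ceil(n_cc)
    return n_exposed, k * n_exposed


n_exposed, n_unexposed = sample_size_two_proportions(0.17, 2.0, 4)
print(n_exposed, n_unexposed, n_exposed + n_unexposed)
```

Under these assumed settings the sketch returns 67 dogs with risk factors and 268 without, 335 in total, which agrees with the planned enrollment, although the calculation actually performed for the study may have differed.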

Case Recruitment
Collaborators were recruited by publication of a letter inviting participation 20 and through direct contact with referral centers. Data sharing was agreed in writing. Cases admitted from January 1, 2008 to December 31, 2009 were included retrospectively. Dogs admitted between January 1, 2010 and December 31, 2012 were enrolled prospectively. Dogs with primary, idiopathic IMHA admitted to participating institutions within the study period were eligible for inclusion. To maximize recruitment, the following previously published diagnostic criteria were used: 13,16 anemia (PCV <37%) and at least one of the following: a positive in-saline agglutination test, a positive Coombs' test, or moderate to marked spherocytosis identified by a board-certified clinical pathologist. Dogs were excluded if evidence of a predisposing disease process was present. 21 All dogs underwent diagnostic evaluation according to their individual case histories as judged appropriate by their attending primary clinicians. These diagnostic evaluations (summarized in Table S1) were not standardized, but aimed to identify potential underlying causes and typically included CBC, serum biochemistry, thoracic and abdominal imaging, PCR testing for tick-borne infections by Babesia, Ehrlichia, and Mycoplasma species, and urine culture. Attending clinicians determined case management.
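The inclusion rule described above can be written as a simple predicate. This is a minimal sketch: the record layout, the field names, and the numeric coding of spherocytosis (0 = none, 1 = mild, 2 = moderate, 3 = marked) are hypothetical and are not taken from the registry survey.

```python
from dataclasses import dataclass


@dataclass
class DogRecord:
    # All field names are hypothetical, chosen for illustration only.
    pcv_percent: float
    saline_agglutination_positive: bool
    coombs_positive: bool
    spherocytosis_grade: int  # assumed coding: 0 none, 1 mild, 2 moderate, 3 marked
    predisposing_disease_found: bool


def meets_inclusion_criteria(dog: DogRecord) -> bool:
    """Anemia AND at least one marker of immune-mediated hemolysis,
    with no evidence of a predisposing disease process."""
    anemic = dog.pcv_percent < 37
    immune_evidence = (
        dog.saline_agglutination_positive
        or dog.coombs_positive
        or dog.spherocytosis_grade >= 2  # moderate to marked
    )
    return anemic and immune_evidence and not dog.predisposing_disease_found


example = DogRecord(18.0, False, True, 1, False)
print(meets_inclusion_criteria(example))  # anemic and Coombs' positive
```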

Data Acquisition and Handling
Study data (Data S1) were collected using secure, web-based software that enabled automated data export. c Historical, demographic, at-admission clinicopathologic, treatment, and outcome data were recorded via a custom survey, accessible from January 1, 2010 to February 1, 2013, agreed in advance by all participating centers (Data S2). The survey used dropdown menus, limited-response questions and constrained textboxes to minimize errors. Free-text boxes enabled addition of contextual comments to aid interpretation. Raw data were regularly inspected and, where necessary, centers were contacted to correct erroneous or incomplete entries. ASA status was assigned as follows: Grade 1, Normal; Grade 2, Mild systemic disease; Grade 3, Severe systemic disease; Grade 4, Life-threatening systemic disease; Grade 5, Moribund patient, not expected to survive. 22 Illness severity scores CHAOS a and Tokyo 9 were calculated as previously reported (Table 1). Systemic inflammatory response syndrome (SIRS) was diagnosed using published criteria: temperature ≤100.0°F or ≥103.5°F; heart rate >160 bpm; respiratory rate >40 breaths/min; leukocyte count ≤4,000/μL or ≥12,000/μL or ≥10% band neutrophils. 23 Anisocytosis and polychromasia were graded as mild, moderate, or severe as previously described. Similarly, spherocytes were quantified in the monolayer using a 1+ to 3+ scale, where 1+ equals 5-10 spherocytes per 100× oil field (2-4% of the RBCs); 2+ equals 11-50 (4-20%); and 3+ equals 51-150 spherocytes per field (20-60%). 21 Where discrepancies between automated and manual platelet counts occurred, manual counts were used for calculations. Saline agglutination tests were performed using a drop of EDTA-anticoagulated blood mixed with a drop of saline on a microscope slide and examined against a white background over a period of 1-2 minutes for gross agglutination, followed by microscopic evaluation for differentiation from rouleaux.

Table 1. CHAOS and Tokyo illness severity scores.
CHAOS
Age (years): if ≥7 score 2, otherwise score 0
Temperature (°F): if ≥102.0 score 1, otherwise score 0
Agglutination: if present score 1, otherwise score 0
Albumin (g/dL): if <3.0 score 1, otherwise score 0
Bilirubin (mg/dL): if ≥5.0 score 2, otherwise score 0
Total: maximum score 7
Tokyo score
Sex: male score 1, female score 0
Season: Apr-Sept score 1, Oct-Mar score 0
Packed cell volume (%): if <20 score 1, otherwise score 0
Platelet count (×10³/μL): if <200 score 1, otherwise score 0
Total protein (g/dL): if <6.0 score 1, otherwise score 0
Total: maximum score 5
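The scoring rules in Table 1 and the SIRS criteria listed above translate directly into small functions. The sketch below is illustrative: the function and argument names are hypothetical, the first CHAOS item is taken to be age in years (an assumption, since that row label is not legible in the table as extracted), and because the number of SIRS criteria required for diagnosis is not stated here, the SIRS helper only counts how many criteria are met.

```python
def chaos_score(age_years, temperature_f, agglutination, albumin_g_dl, bilirubin_mg_dl):
    """CHAOS per Table 1 (maximum 7). Age as the first item is an assumption."""
    score = 0
    score += 2 if age_years >= 7 else 0
    score += 1 if temperature_f >= 102.0 else 0
    score += 1 if agglutination else 0
    score += 1 if albumin_g_dl < 3.0 else 0
    score += 2 if bilirubin_mg_dl >= 5.0 else 0
    return score


def tokyo_score(is_male, admission_month, pcv_percent, platelets_k_per_ul, total_protein_g_dl):
    """Tokyo score per Table 1 (maximum 5). admission_month is 1-12."""
    score = 0
    score += 1 if is_male else 0
    score += 1 if 4 <= admission_month <= 9 else 0  # April to September
    score += 1 if pcv_percent < 20 else 0
    score += 1 if platelets_k_per_ul < 200 else 0
    score += 1 if total_protein_g_dl < 6.0 else 0
    return score


def sirs_criteria_met(temperature_f, heart_rate, resp_rate, wbc_per_ul, band_percent):
    """Count how many of the four listed SIRS criteria are satisfied."""
    criteria = [
        temperature_f <= 100.0 or temperature_f >= 103.5,
        heart_rate > 160,
        resp_rate > 40,
        wbc_per_ul <= 4000 or wbc_per_ul >= 12000 or band_percent >= 10,
    ]
    return sum(criteria)


print(chaos_score(8, 102.5, True, 2.5, 6.0))   # every CHAOS item positive
print(tokyo_score(True, 6, 15, 150, 5.5))      # every Tokyo item positive
print(sirs_criteria_met(104.0, 170, 30, 20000, 2))
```

The study dichotomized both scores as <3 or ≥3 for the univariate analyses, so a caller would typically compare the returned integer against 3.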

Data Analysis
Exported data were collated and analyzed using proprietary software. d,e,f Data were assessed for normality prior to test selection. Although some variables were normally distributed, most were not; variables are therefore reported as median (interquartile range). To validate use of the whole dataset for outcome analyses, retrospective and prospective data were compared using nonparametric tests. To correct for multiple (m) comparisons while minimizing the risk of dismissing significant differences, the P-value was adjusted. 24
CHAOS, Tokyo, case, treatment, and center variables were tested for association with death during hospitalization and death by 30 days by univariate logistic regression. The effect of center was assessed using multiple dichotomous variables with a reference group.
Multivariable logistic regression using case variables was then undertaken to generate prognostic models. Center and treatment were excluded from the prognostic models because they were not generalizable to other populations, and were potentially subject to bias from financial constraint and clinician preference, respectively. Previously reported illness-severity scores (CHAOS and Tokyo) were not included in the multivariable analyses. Candidate predictor variables were chosen as follows: associated with outcome in the univariate regression at P < .1; no evidence of collinearity (correlation coefficient <0.9); an event:variable ratio >5. 25 For a complete case analysis to be performed in the prognostic models, only variables with <5% missing data were included. 26 Alanine aminotransferase (ALT), bilirubin, urea, and creatinine values were indexed against (divided by) each center's upper reference interval to account for variations in reference ranges. All variables were simultaneously entered into the model to maximize the predictive ability of prognostic models. Model accuracy was determined using 2 × 2 classification tables. Model discrimination was determined by calculating area under the receiver-operating characteristic curve (AUROC). Model calibration was assessed by Hosmer-Lemeshow goodness-of-fit (model rejected if P < .05) and visual inspection of the contingency table. Model utility was assessed using Nagelkerke's R².
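To make the evaluation measures named above concrete, the following minimal sketch computes indexing against an upper reference limit, accuracy from a 2 × 2 classification at a probability cutoff, AUROC, and Nagelkerke's R² from observed outcomes and predicted probabilities. The study used proprietary statistical software; the function names, the 0.5 cutoff, and the toy data below are assumptions for illustration and are not study data.

```python
from math import exp, log


def index_to_upper_reference(value, upper_reference_limit):
    """Express an analyte as a multiple of the center's upper reference limit."""
    return value / upper_reference_limit


def classification_accuracy(y_true, p_pred, cutoff=0.5):
    """Proportion correctly classified in the 2 x 2 table (cutoff is assumed)."""
    correct = sum((p >= cutoff) == bool(y) for y, p in zip(y_true, p_pred))
    return correct / len(y_true)


def auroc(y_true, p_pred):
    """AUROC as the probability that a randomly chosen event has a higher
    predicted risk than a randomly chosen non-event (ties count one half)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def nagelkerke_r2(y_true, p_pred):
    """Nagelkerke's pseudo-R2: Cox-Snell R2 rescaled by its maximum value.
    Predicted probabilities must lie strictly between 0 and 1."""
    n = len(y_true)
    base_rate = sum(y_true) / n
    ll_null = sum(log(base_rate) if y else log(1 - base_rate) for y in y_true)
    ll_model = sum(log(p) if y else log(1 - p) for y, p in zip(y_true, p_pred))
    cox_snell = 1 - exp(2 * (ll_null - ll_model) / n)
    return cox_snell / (1 - exp(2 * ll_null / n))


# Toy example (1 = died, 0 = survived); not study data.
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(index_to_upper_reference(150, 100))
print(classification_accuracy(outcomes, predicted))
print(auroc(outcomes, predicted))
print(nagelkerke_r2(outcomes, predicted))
```

An AUROC of 0.5 corresponds to chance-level discrimination, which is why a confidence interval that includes 0.5 indicates a score that cannot be distinguished from chance.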

Retrospective and Prospective Case Comparisons
Although our aim was to enroll 335 cases, the rate of case recruitment was slower than anticipated. To minimize time-dependent changes in case management, the registry was closed early, limiting the study period to 5 years. Data from 276 cases (215 prospective, 61 retrospective) were collected. Only 3 variables differed significantly between retrospective and prospective populations (Table 2), which were therefore considered sufficiently comparable for subsequent combined analyses.

Clinicopathologic Data
Two hundred twenty dogs were in-saline agglutina-

Survival
Two hundred and five (74.3%) dogs were discharged from the hospital, equivalent to 25.7% mortality at discharge. Of the 71 nonsurvivors, 56 (20.3%) were euthanized and 15 (5.4%) died. Sixteen dogs (5.8%) were discharged but subsequently were euthanized or died and 2 dogs were lost to follow-up, such that 186 (67.4%) dogs were alive at 30 days after admission, equivalent to 30-day mortality of 32.6%. Twelve dogs underwent necropsy and no underlying diseases were identified.

Predictive Value of CHAOS and Tokyo
In univariate analyses CHAOS, when dichotomized as <3 or ≥3, was associated with death in hospital and death within 30 days of admission. Tokyo score, when dichotomized as <3 or ≥3, was not associated with any of the 3 outcome measures (Table 4). ROC curve data for CHAOS, Tokyo, and the prognostic score from the multivariable models are reported in Table 5. The AUROC point estimate and 95% confidence intervals for CHAOS scores were higher than those for Tokyo scores. The 95% confidence intervals for Tokyo scores all included 0.5, suggesting it was little better than chance at predicting outcome in this population.

Outcome Modeling
In univariate analyses, 8 candidate variables were associated with both survival at discharge and survival at 30 days (Table 4). There was no association between center and death at discharge or at 30 days. Two variables were excluded for having >5% missing cases. Center and the CHAOS and Tokyo scores were deliberately excluded from multivariable modeling. For survival prediction multivariable analyses, 6 variables were entered into the final model: SIRS, ASA classification, ALT, bilirubin, urea, and creatinine. For prediction of outcome at discharge, this model was accurate in 82% of cases. In multivariate analysis, 3 variables were independently predictive of death in hospital: ASA classification, bilirubin, and urea. Three variables were independently predictive of death by day 30: ASA classification, bilirubin, and creatinine (Table 6).
Table abbreviations: ASA, American Society of Anesthesiologists physical status classification; ALT, alanine transaminase activity; ALP, alkaline phosphatase activity; Retics, absolute reticulocyte count. P values < .03 were considered significant at the P < .05 level after adjustment for multiple comparisons.

Discussion
This multicenter study provides an overview of case characteristics, management, and outcome for 276 dogs with primary IMHA treated at referral centers in the British Isles between 2008 and 2012. Despite intensive management with immunosuppressives, blood products, and antithrombotics, the 30-day mortality rate was 32.6%. This figure is comparable to previous studies, 4,7,27,28 perhaps suggesting our ability to treat IMHA has not improved in recent years.
Illness-severity scores might help identify dogs that would benefit from treatment intensification. This study evaluated the association of 2 IMHA-specific illness-severity scores with outcome in our population. Of these, CHAOS ≥3 was associated with increased odds of death and, in particular, a high CHAOS was associated with a risk of death during hospitalization. Tokyo score was not useful for outcome prediction in our study. We also found that ASA classification ≥3 was associated with death, suggesting the subjective assessment of experienced clinicians can be a reasonable gauge of illness severity in IMHA. As can be seen from the AUROC values (Table 5), the final multivariate model allows outcome to be predicted more accurately than previously reported scoring systems. This is not unexpected, since a model generated from our data should describe our population better than those derived from other populations, and independent evaluation of our model in an unrelated population should be undertaken to ensure its validity. It is noteworthy, however, that of the 2 previously developed scores, the AUROC point estimate and 95% confidence intervals for CHAOS were good, while the 95% confidence intervals for Tokyo included 0.5, suggesting it was little better than chance at predicting outcome in our population.
We used logistic regression analysis to assess the association of individual clinical and clinicopathologic variables with outcome. Individual variables with significant associations with outcome were combined into multivariable models and the accuracy of these models evaluated using 2 × 2 classification tables that identify how many dogs were correctly classified as dead or alive. The AUROC values were higher for the final multivariable model than for any of the individual variables alone (data not shown). The final 6-variable model for prediction of outcome at discharge was accurate in 82% of cases.
Although this suggests the model was highly accurate, it should be noted that assuming every dog was discharged alive would have been correct in 74% of cases. The R² values suggest that the 6 variables (SIRS, ASA classification, ALT, bilirubin, urea, and creatinine) included in our death in hospital model represent only a minority of the factors determining outcome. In linear regression, a model containing all the variables needed to explain outcome has R² = 1. While R² values in logistic regression are pseudo-R², their interpretation is similar. For example, our 6-variable model predicting death during hospitalization had R² = 0.304. This indicates that most of the factors that influence the likelihood of death during hospitalization were not included in our model. These unquantified variables might be unidentified or unmeasured case factors, the effects of treatment, and complications including thrombosis.
Seventy-two dogs died before discharge, while a further 16 did not survive to 30 days. Three variables were predictive of outcome at both times, suggesting some consistency between the causes of death during hospitalization and at 30 days. There was 1 difference in the prediction models between outcomes at discharge versus 30 days: urea was independently predictive of outcome at discharge but not at 30 days, while creatinine was not independently predictive of outcome at discharge but was at 30 days. The cause of these differences is unclear. The association between creatinine concentration and outcome at 30 days might suggest that end-organ dysfunction associated with hypoxemia, nephrotoxicity from drug administration, or hemoglobinemia affects medium-term outcome. Acute kidney injury is associated with nonsurvival in critically ill dogs, 29 and even small deteriorations in kidney function can affect outcome. 30
Our intention was to test our hypotheses by enrolling 335 dogs; however, we were only successful in recruiting 276 dogs within a 5-year period.
Although this might have reduced our ability to identify reliable prognostic markers, post hoc power calculations suggested we were powered to detect a 2.02-fold change in case fatality, which was almost exactly our goal. This study combined retrospectively and prospectively collected data into 1 dataset to maximize the data available for analysis while minimizing the time taken for data collection. We compared demographics, clinicopathologic data, and treatment data for these populations prior to combining them. Three statistically significant differences were identified between these 2 populations (ALT activity, icterus frequency, and anisocytosis severity), but the clinical relevance of these differences is debatable. The absence of a difference in outcomes between the retrospective and the prospective groups supports this assertion. We cannot exclude the possibility that other differences might have existed that would affect the validity of our approach, but all data from the 2 parts of the study were collected from the same group of centers, which should improve the homogeneity of the data. Overall, we feel that the 2 populations were not sufficiently different clinically to affect the results of our evaluation of associations using data from the combined population.
Our analyses attempted to account for differences in the reference intervals between centers. After initial screening, it was determined that some variables should be indexed to the institution's reference intervals. Since this was not performed for all variables, it is possible we might have overlooked some significant associations between nonindexed variables and outcome. However, all of the variables in the final models were indexed, which maximizes their generalizability. We also considered that center might have had an effect on outcome either through distinct case demographics or institutional differences in case management or treatment availability. To address this, we evaluated the association of center with outcomes in univariate analyses. We found no significant effect of center on outcome in these analyses. Center was deliberately excluded from the multivariable analyses to maximize generalizability, but regardless, center would not have been included in our multivariable models on the basis of a lack of association in the univariate analyses. If distinct treatment strategies employed by different institutions had significantly influenced outcome, an effect of center on outcome might have been expected, but none was found. The effects of treatment on outcome were not evaluated by this study directly, but warrant investigation in prospective interventional trials.
Although tailored diagnostic evaluation was performed in all of the cases to identify an underlying mechanism for their IMHA, this was not exhaustive and thus we cannot exclude the possibility that some of these dogs had an unidentified primary cause. For instance, many dogs underwent PCR screening for vector-borne pathogens, but few underwent more complete screening as has been recently recommended. 31 The potential effect of this is difficult to quantitate, since unidentified primary causes of IMHA might be expected to worsen the prognosis by perpetuating the generation of autoantibodies. Other limitations inherent in this study are biases induced by financial limitations and euthanasia. Assessing dogs managed only at referral centers might have reduced these effects by reducing the number of dogs euthanized for financial limitations. Most deaths in this study were due to euthanasia, however, with the inherent potential for confounding by euthanasia for reasons other than illness-severity or lack of response to treatment that is difficult to codify or exclude.

Conclusion
This large multicenter cohort study provides insight into the current management and outcome of dogs with IMHA treated in the British Isles. Two previously published illness-severity scores (CHAOS and Tokyo) were prospectively evaluated for their ability to predict outcome in a population separate from that used to generate the score. Of these two, only CHAOS was predictive of outcome in our population. Using our large dataset, we identified that markers of kidney function, bilirubin concentration, and ASA classification are independently associated with outcome in dogs with IMHA; a multivariate model combining illness severity scores and clinicopathologic data correctly predicted outcome at discharge in 82% of cases. The ability of the factors identified here to predict outcome can now be evaluated in other populations, ideally before use as a means to stratify dogs for prospective interventional trials.