Simplified criteria for the diagnosis of autoimmune hepatitis


  • Potential conflict of interest: Dr. Krawitt is a consultant for Life Cycle Pharma.


Diagnosis of autoimmune hepatitis (AIH) may be challenging. However, early diagnosis is important because immunosuppression is life-saving. Diagnostic criteria of the International Autoimmune Hepatitis Group (IAIHG) were complex and purely meant for scientific purposes. This study of the IAIHG aims to define simplified diagnostic criteria for routine clinical practice. Candidate criteria included sex, age, autoantibodies, immunoglobulins, absence of viral hepatitis, and histology. The training set included 250 AIH patients and 193 controls from 11 centers worldwide. Scores were built from variables showing predictive ability in univariate analysis. Diagnostic value of each score was assessed by the area under the receiver operating characteristic (ROC) curve. The best score was validated using data of an additional 109 AIH patients and 284 controls. This score included autoantibodies, immunoglobulin G, histology, and exclusion of viral hepatitis. The area under the curve for prediction of AIH was 0.946 in the training set and 0.91 in the validation set. Based on the ROC curves, two cutoff points were chosen. The score was found to have 88% sensitivity and 97% specificity (cutoff ≥6) and 81% sensitivity and 99% specificity (cutoff ≥7) in the validation set. Conclusion: A reliable diagnosis of AIH can be made using a very simple diagnostic score. We propose the diagnosis of probable AIH at a cutoff point of 6 points or higher and of definite AIH at 7 points or higher. (HEPATOLOGY 2008.)

Autoimmune hepatitis (AIH) is an inflammatory condition of the liver that can affect patients of all ages, sexes, and races.1 The diagnosis needs to be considered in any patient with elevated aminotransferases. Timely diagnosis and immunosuppressive therapy contain disease activity in almost all affected patients, and various case series have reported near normal or normal life expectancy in patients diagnosed and treated adequately.2–4 Untreated AIH, however, has a 5-year mortality above 50%.3, 5 Early diagnosis may be difficult because the clinical picture is heterogeneous and there is no specific test applicable for all patients.6 In particular, in patients with cholestatic autoimmune liver diseases, atypical features or mixed manifestations in the differential diagnosis may be very challenging.7 Especially in older patients the diagnosis may often be unduly delayed and probably overlooked.7–9

A simple and accurate diagnostic scoring system for AIH has not been established. In 1993, the International Autoimmune Hepatitis Group (IAIHG) proposed diagnostic criteria, which were revised in 1999.10, 11 These criteria were devised primarily by expert consensus and introduced to allow comparison of studies from different centers. Because these criteria are complex, insufficiently validated, and include a variety of parameters of questionable value, the IAIHG decided to devise a simplified scoring system for wider applicability in routine clinical practice based on the data of patients with well-established diagnoses.


Abbreviations: AIH, autoimmune hepatitis; AMA, antimitochondrial antibodies; ANA, antinuclear antibodies; AUC, area under the curve; ELISA, enzyme-linked immunosorbent assay; HCV, hepatitis C virus; IAIHG, International Autoimmune Hepatitis Group; IgG, immunoglobulin G; IgM, immunoglobulin M; LKM, liver-kidney microsomal antibodies; PBC, primary biliary cirrhosis; PSC, primary sclerosing cholangitis; ROC, receiver operating characteristic curve; SLA/LP, soluble liver/liver-pancreas antibodies; SMA, smooth muscle cell antibodies.

Patients and Methods


This retrospective cohort study included 359 patients with AIH and 477 controls (training set and validation set combined). Patient data were collected from 11 international centers in 10 countries from North and South America, Europe, and Asia. All centers participating in the study specialized in liver diseases. Data were assembled between January 2005 and September 2006. The validation set was collected after successful analysis of the training set, and therefore the frequency of the different diagnoses differed between the two sets. The training set consisted of 250 patients [240 AIH, 8 primary biliary cirrhosis (PBC)/AIH overlap, 2 primary sclerosing cholangitis (PSC)/AIH overlap] and 193 controls [125 PBC, 23 PSC, 6 nonalcoholic steatohepatitis, 4 hepatitis B, 22 hepatitis C (HCV), 8 toxic liver disease, 1 acute hepatitis, 2 acute hepatitis C, 1 idiopathic bile ductopenia, 1 cryptogenic cirrhosis]. The validation set included 109 patients with AIH and 284 controls (62 PBC, 22 PSC, 48 nonalcoholic steatohepatitis, 49 hepatitis B virus, 95 HCV, 2 Wilson's disease, 4 hemochromatosis, 2 toxic liver disease) (Table 1). Diagnoses of controls were assigned according to established diagnostic criteria. Diagnoses had to be unequivocal for patients to be used as controls. For all AIH patients the IAIHG score showed definite AIH before start of immunosuppressive therapy. As an additional confirmation of the diagnosis, subsequent response to immunosuppressive therapy was mandatory for patients to be included in this study. For this analysis, only laboratory findings at the time of diagnosis were used. Patients or controls with the following conditions were excluded from the study: presence of other causes of liver disease, ongoing immunosuppressive therapy, or prior liver transplantation.

Table 1. Distribution of Diagnoses in the Training and Validation Sets
Diagnosis                    Training Set   Validation Set
Hepatitis B                  4              49
Hepatitis C                  22             95
Toxic liver disease          8              2
Acute hepatitis              1              0
Acute hepatitis C            2              0
Idiopathic bile ductopenia   1              0
Cryptogenic cirrhosis        1              0
Wilson's disease             0              2

Methods and Statistical Analysis.

The endpoint of the study was the presence of AIH. Variables included in the analysis were age, sex, autoantibodies [smooth muscle actin (SMA), anti-nuclear antibody (ANA), antimitochondrial antibodies (AMA), liver-kidney microsomal antibodies (LKM), soluble liver/liver-pancreas antibodies (SLA/LP)], γ-globulins, immunoglobulin A, immunoglobulin G (IgG), immunoglobulin M (IgM), absence of viral hepatitis, and liver histology. Autoantibody titers were reported on the basis of local laboratory standards. The choice of variables was based on expert opinion of the IAIHG.

Although portal tract plasma cell infiltration is a characteristic feature of AIH,12 it is not specific for AIH, and its absence does not preclude diagnosis. Professor H. P. Dienes (Cologne, Germany) and Professor A. W. Lohse (Hamburg, Germany) therefore defined three categories for grading histology: atypical histology, histology compatible with AIH, and typical histology. Interface hepatitis, lymphocytic/lymphoplasmocytic infiltrates in portal tracts and extending into the lobule, emperipolesis (active penetration by one cell into and through a larger cell), and hepatic rosette formation were regarded as typical for the diagnosis of AIH.13 To be considered typical, each of the three features of typical AIH histology had to be present. Compatible features are a picture of chronic hepatitis with lymphocytic infiltration without all the features considered typical. Histology was considered atypical when showing signs of another diagnosis, such as steatohepatitis. Liver histologies were evaluated by the local pathologists who were not blinded to the patient's history.

Continuous variables were compared by Mann-Whitney U test and categorical variables using Fisher's exact test. Continuous data were described using median and quartiles. According to these levels, patients were assigned one or two points. For the formulation of predictive models, univariate analysis was performed in the training set. Variables showing differences between patients and controls in univariate analysis were then used in building scores. The diagnostic value of each variable and score was assessed by the area under the receiver operating characteristic (ROC) curve. To facilitate bedside management, scores were built categorizing the chosen continuous variables, assigning point values to the different categories, and adding them up (Table 2).

Table 2. Simplified Diagnostic Criteria for Autoimmune Hepatitis
Variable                               Cutoff                           Points
ANA or SMA                             ≥1:40                            1
ANA or SMA                             ≥1:80
  or LKM                               ≥1:40                            2*
  or SLA                               Positive
IgG                                    >Upper normal limit              1
                                       >1.10 times upper normal limit   2
Liver histology (evidence of           Compatible with AIH              1
  hepatitis is a necessary condition)  Typical AIH                      2
Absence of viral hepatitis             Yes                              2
                                                                        ≥6: probable AIH
                                                                        ≥7: definite AIH

* Addition of points achieved for all autoantibodies (maximum, 2 points).

Logistic regression with backward selection was used to assess the combined influence of the potential discriminators. The scores derived from the training set were then verified using the validation set. Sensitivity, specificity, and corresponding 95% Clopper-Pearson confidence intervals were calculated. The different scores were compared by DeLong test.14 Statistical analysis was performed by STATA Version 9.2 software by M. Geißler (1bisN, Castrop-Rauxel, Germany). Supplementary data can be found under


Results

Training Set.

We collected data from 250 patients with proven AIH and 193 controls. In AIH patients, the median age at diagnosis was 49 (34-59) years [controls: median age, 50 (41-58)].

Median ANA titer in AIH patients was 1:160 (controls: 1:40, P < 0.0005). Median SMA titer in patients with AIH was 1:40 (controls: 0, P < 0.001). AMA was positive (≥1:40) in a significantly lower proportion of AIH patients [AIH patients: 17% (8 PBC/AIH overlap patients), controls: 43%, P < 0.001]. We included overlap patients in the group of AIH patients because we believe that it is essential to diagnose the AIH component of the disease in these patients. Characterization of these patients as controls does not significantly change the results of the analyses performed. Of all AIH patients with SLA/LP tested (n = 47), 28% tested positive. SLA was not found to be positive in any of the controls (n = 193, P = 0.05). LKM titers did not show a significant difference between tested AIH patients (7% positive) and controls (8% positive) in the training set (P = 0.77). All LKM-positive controls were patients with hepatitis C infection. However, AMA, LKM, and SLA/LP were not tested in 13%, 48%, and 81% of AIH patients, respectively.

Median of IgG levels in AIH patients was 1.44 times upper normal limit (UNL) [controls: 1.02 × UNL (P < 0.001)], median of γ-globulin-levels in AIH patients was 1.53 times UNL [controls, 1.07 times UNL (P < 0.001)].

In the training set, 5% of AIH patients were found to have uncharacteristic histology as compared with 92% of the controls. Three percent of the AIH histologies were classified as only compatible with the diagnosis of AIH. Three percent of the control patients were found to have liver histology compatible with AIH. Ninety-two percent of AIH patients and 6% of the controls (including patients with PBC, PSC, HCV, toxic liver disease) showed histological features considered typical for AIH (P < 0.001).

Variables associated with the diagnosis of AIH were first assessed by univariate analysis. Analysis showed that IgG [area under the ROC curve (AUC), 0.80], γ-globulin levels (0.78), SMA (0.73), and liver histology (0.95) were univariate discriminators of the diagnosis of AIH. Because γ-globulin levels were missing in a large number of patients (n = 153) and IgG and γ-globulin levels showed a close correlation (Pearson coefficient: r = 0.87), only IgG levels were considered in the following analyses. IgM (AUC, 0.63), AMA (0.63), and ANA titers (0.62) were also useful but less potent discriminators. Sex and the exclusion of viral hepatitis reached an AUC of 0.597 and 0.594, respectively. They were taken into account as possible discriminators as well (Fig. 1).

Figure 1.

(A) ROC curve training set. Comparison of the simplified score before and after exclusion of viral hepatitis. (B) ROC curve validation set. Comparison of the simplified score before and after exclusion of viral hepatitis.

Using stepwise logistic regression, we then tested combined predictors for the presence of AIH. The optimal combination included IgG levels, ANA and SMA titers, and liver histology. Together they reached an AUC under the ROC curve of 0.99. Stepwise logistic regressions with backward and forward selection led to the same result.
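The AUC values reported for single variables and for combined predictors can be read as the probability that a randomly chosen AIH patient has a higher value than a randomly chosen control, with ties counted as one half. The following is a minimal sketch of that interpretation only; the function name `auc` and the example values are hypothetical and are not taken from the study data, and the study itself used Stata rather than this code.

```python
from typing import Sequence


def auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Counts the fraction of (case, control) pairs in which the case has the
    higher value; tied pairs contribute one half.
    """
    if not cases or not controls:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical IgG values expressed as multiples of the upper normal limit.
example_cases = [1.4, 1.6, 1.1, 2.0]
example_controls = [0.9, 1.0, 1.2, 0.8]
print(auc(example_cases, example_controls))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.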

Development of Scores.

The aim of this study was to design a simple score for routine clinical practice. On the basis of these results, we assigned points to each of the parameters described so bedside calculation of the score could be easily accomplished. This score was then validated with the help of a second set of patients and controls (validation set). Comparing AUCs of different scores by the DeLong test, in the training set there were no significant differences between the scores. However, in the validation set, the score including autoantibodies, histology, exclusion of viral hepatitis, and IgG levels was significantly better than all other combinations of parameters:

  • ANA or SMA titers ≥ 1:40 → 1 point

  • ANA or SMA titers ≥ 1:80 or LKM ≥ 1:40 or SLA positive → 2 points

  • (The sum of both results was limited to 2 points.)

  • IgG levels > UNL → 1 point

  • IgG levels > 1.10 times UNL (>10% above UNL) → 2 points

  • Liver histology atypical, compatible, or typical for AIH → 0, 1, or 2 points

  • Exclusion of viral hepatitis → 2 points
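The point assignments listed above can be sketched as a small function. This is an illustration under stated assumptions, not a validated clinical tool: the function and parameter names are hypothetical, titers are assumed to be strings such as "1:80", IgG is assumed to be given as a multiple of the UNL, and the autoantibody points are summed per antibody and capped at 2, which is one reading of the footnote to Table 2.

```python
from typing import Optional


def titer_dilution(titer: Optional[str]) -> int:
    """Return the dilution of a titer such as '1:80'; 0 if negative or not tested."""
    if not titer:
        return 0
    return int(titer.split(":")[-1])


def simplified_aih_score(
    ana: Optional[str] = None,
    sma: Optional[str] = None,
    lkm: Optional[str] = None,
    sla_positive: bool = False,
    igg_ratio_to_unl: float = 0.0,
    histology: str = "atypical",  # "atypical", "compatible", or "typical"
    viral_hepatitis_absent: bool = False,
) -> int:
    # Autoantibodies: ANA or SMA >= 1:40 -> 1 point, >= 1:80 -> 2 points;
    # LKM >= 1:40 or SLA positive -> 2 points; sum capped at 2 (assumption).
    antibody_points = 0
    for titer in (ana, sma):
        dilution = titer_dilution(titer)
        if dilution >= 80:
            antibody_points += 2
        elif dilution >= 40:
            antibody_points += 1
    if titer_dilution(lkm) >= 40:
        antibody_points += 2
    if sla_positive:
        antibody_points += 2
    antibody_points = min(antibody_points, 2)

    # IgG: > UNL -> 1 point, > 1.10 times UNL -> 2 points.
    if igg_ratio_to_unl > 1.10:
        igg_points = 2
    elif igg_ratio_to_unl > 1.0:
        igg_points = 1
    else:
        igg_points = 0

    histology_points = {"atypical": 0, "compatible": 1, "typical": 2}[histology]
    viral_points = 2 if viral_hepatitis_absent else 0
    return antibody_points + igg_points + histology_points + viral_points


def classify(score: int) -> str:
    """Apply the cutoffs of Table 2: >= 6 probable AIH, >= 7 definite AIH."""
    if score >= 7:
        return "definite AIH"
    if score >= 6:
        return "probable AIH"
    return "below cutoff"


score = simplified_aih_score(
    ana="1:160", igg_ratio_to_unl=1.44, histology="typical",
    viral_hepatitis_absent=True,
)
print(score, classify(score))  # prints "8 definite AIH"
```

The example call uses the median ANA titer and median IgG ratio reported for AIH patients in the training set purely as illustrative inputs.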

Taking all patients into account and assigning 2 points to patients with exclusion of viral hepatitis, we found a sensitivity of 96% and specificity of 66% at a cutoff of 5, 90% sensitivity and 77% specificity at a cutoff of 6 points, and 68% sensitivity and 99% specificity at a cutoff of 7 points in the training set. Exclusion of viral hepatitis in patients is simple, and including these patients may produce misleadingly good results. We therefore tested sensitivity and specificity of the score when excluding patients with viral hepatitis before application. There was no significant difference when excluding patients with viral hepatitis first (Table 3).

Table 3. Sensitivity and Specificity of Different Score Variants
Training Set
Cut point                                      Sensitivity    Exact 95% CI (Clopper-Pearson)  Specificity    Exact 95% CI (Clopper-Pearson)
Score ≥ 3 after exclusion of viral hepatitis   186/191 (97%)  94%-99%                         100/154 (65%)  57%-72%
Score ≥ 4 after exclusion of viral hepatitis   178/191 (93%)  89%-96%                         118/154 (77%)  69%-83%
Score ≥ 5 after exclusion of viral hepatitis   153/191 (80%)  71%-86%                         152/154 (99%)  95%-100%
Score ≥ 5 before exclusion of viral hepatitis  215/224 (96%)  93%-98%                         105/159 (66%)  58%-73%
Score ≥ 6 before exclusion of viral hepatitis  202/224 (90%)  86%-94%                         123/159 (77%)  70%-84%
Score ≥ 7 before exclusion of viral hepatitis  153/224 (68%)  62%-74%                         157/159 (99%)  96%-100%

Validation Set
Cut point                                      Sensitivity    Exact 95% CI (Clopper-Pearson)  Specificity    Exact 95% CI (Clopper-Pearson)
Score ≥ 3 after exclusion of viral hepatitis   90/93 (97%)    91%-99%                         114/127 (90%)  83%-94%
Score ≥ 4 after exclusion of viral hepatitis   82/93 (88%)    80%-94%                         122/127 (96%)  91%-99%
Score ≥ 5 after exclusion of viral hepatitis   75/93 (81%)    71%-88%                         125/127 (98%)  94%-100%
Score ≥ 5 before exclusion of viral hepatitis  90/93 (97%)    91%-99%                         224/240 (93%)  89%-96%
Score ≥ 6 before exclusion of viral hepatitis  82/93 (88%)    80%-94%                         232/240 (97%)  94%-99%
Score ≥ 7 before exclusion of viral hepatitis  75/93 (81%)    71%-88%                         238/240 (99%)  97%-100%
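
The exact (Clopper-Pearson) intervals in Table 3 are defined by inverting the binomial distribution. The following is a minimal sketch of that definition using plain bisection; the function names are hypothetical, and this is not the Stata routine used in the study, so small rounding differences from the tabulated values are possible.

```python
from math import comb
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(f: Callable[[float], float]) -> float:
    """Bisection on [0, 1]; f must change sign between the endpoints."""
    lo, hi = 0.0, 1.0
    sign_lo = f(lo) > 0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided confidence interval for k successes in n trials."""
    # Lower bound: the p for which P(X >= k) equals alpha/2 (0 when k == 0).
    lower = 0.0 if k == 0 else _root(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p for which P(X <= k) equals alpha/2 (1 when k == n).
    upper = 1.0 if k == n else _root(lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


# Sensitivity at a cutoff of 6 points in the validation set (82 of 93, Table 3).
low, high = clopper_pearson(82, 93)
print(f"{82 / 93:.0%} ({low:.0%}-{high:.0%})")
```

Bisection is used here only to keep the sketch free of external dependencies; a statistics library with a beta quantile function would give the same bounds more directly.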

Validation Set.

The validation set included 109 patients with AIH and 284 controls. The proportion of AIH patients in the training set (250/443 = 56%) was higher than in the validation set (109/393 = 28%). The score derived from the training set was evaluated in the validation set.

For the score after exclusion of patients with viral hepatitis, the AUC under the ROC curve was 0.987 in the validation set compared with 0.935 in the training set. In the validation set, sensitivity and specificity were 97% and 90% at a cutoff of 3 points, 88% and 96% at a cutoff of 4 points, and 81% and 98% at a cutoff of 5 points (Fig. 1, Table 3).

Taking all patients into account and assigning 2 points to patients with exclusion of viral hepatitis, we found a sensitivity of 97% and specificity of 93% at a cutoff of 5 points; 88% sensitivity and 97% specificity at a cutoff of 6 points; and 81% sensitivity and 99% specificity at a cutoff of 7 points (Table 3).
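The sensitivity and specificity figures quoted for each cutoff follow from simple counting: a score at or above the cutoff counts as a positive test. A minimal sketch, assuming hypothetical score lists (not study data) and a hypothetical function name:

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    aih_scores: Sequence[int], control_scores: Sequence[int], cutoff: int
) -> Tuple[float, float]:
    """Treat score >= cutoff as a positive test for AIH."""
    true_positives = sum(1 for s in aih_scores if s >= cutoff)
    true_negatives = sum(1 for s in control_scores if s < cutoff)
    return (
        true_positives / len(aih_scores),
        true_negatives / len(control_scores),
    )


# Hypothetical scores for four AIH patients and four controls.
aih = [8, 7, 6, 5]
controls = [2, 3, 6, 4]
for cutoff in (5, 6, 7):
    sens, spec = sensitivity_specificity(aih, controls, cutoff)
    print(cutoff, sens, spec)

# The counts reported in Table 3 for the validation set at a cutoff of 6 points:
print(round(100 * 82 / 93), round(100 * 232 / 240))  # prints "88 97"
```

Raising the cutoff trades sensitivity for specificity, which is the pattern seen across the 5-, 6-, and 7-point cutoffs.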

Direct comparison between the established IAIHG and the simplified diagnostic criteria is difficult because in this study the diagnosis of AIH was based on the descriptive criteria published by the IAIHG.10, 11 The application of the IAIHG score in all AIH patients showed definite AIH before and after initiation of immunosuppressive treatment (above 15 before treatment or above 17 after treatment initiation).


Discussion

We aimed to develop a simplified diagnostic score differentiating AIH patients from those patients suffering from other liver disease. A limited number of routinely available measurements were selected to design the score. We found that liver histology, autoantibody titers, gamma-globulin/IgG levels, and the absence of viral hepatitis were independent predictors for the presence of AIH. This study shows that a simple score based on four measurements can differentiate between patients with and patients without AIH with a high degree of accuracy, although the frequency of the different diagnoses varied between the training and validation sets. Despite these findings, some limitations require comment.

Autoantibody testing is not sufficiently standardized and therefore may lead to inadequate scoring values. Indeed, in our study we had to rely on the methods used in the various laboratories of the participating centers. Unified standards for autoantibody testing are required and have been proposed by the IAIHG.15 Centers using enzyme-linked immunosorbent assay (ELISA) antibody testing may need to reevaluate the cutoff values to be used in this novel score in the future.

ANA screening may be performed on human epithelial (HEp-2) cells, but these give higher values than tissue sections. For this score, values from tissue sections should be employed. If results from HEp-2 cells are used, they should be halved. ELISA tests for ANA and SMA have so far not been sufficiently standardized to be used in the application of this scoring system. However, the typical SMA of autoimmune hepatitis are mainly specific for F-actin, and both immunofluorescence testing and ELISA testing for IgG anti-F-actin antibodies may help to improve diagnostic accuracy.16, 17

Antibodies to SLA/LP do not show up on standard immunofluorescence testing but require testing by ELISA, immunoblotting, or immunoprecipitation assays.18–21

Standardized test systems for SLA/LP only became available a few years ago and are not used universally. Thus, in this retrospective worldwide study, many patients had not been tested for SLA/LP antibodies at presentation. However, previous studies have shown extremely high specificity, yet limited sensitivity, of SLA/LP antibodies for the diagnosis of AIH.18 This was confirmed in the current study, even though the low number of patients tested resulted in a low level of statistical significance. Because in some patients SLA/LP is the only autoantibody present, and because the diagnostic specificity of this parameter is exquisite, we decided to include it in the score.

Different considerations argue for inclusion of LKM antibodies in the score. LKM-positive AIH is uncommon, and therefore LKM antibodies were not found in many patients of the current study. However, patients with type 2 AIH usually have solely LKM antibodies and thus would be overlooked by a scoring system that did not include them. In contrast to SLA/LP antibodies, however, LKM antibodies can also occur in other conditions, in particular in hepatitis C. Therefore, in patients with HCV infection, LKM antibodies should not be included in the score.

The validity of these expert opinion statements will need to be tested in further studies specifically addressing this question.

Selective elevation of IgG levels with normal IgM and immunoglobulin A levels is a hallmark of AIH11, 22 but is only partly reflected in the proposed score. This is primarily because many centers did not measure immunoglobulin A, IgG, and IgM levels routinely. Nonetheless, we believe that these relatively cheap tests could probably improve the score even further, and in particular exclude false-positive scores for patients with polyclonal B cell activation attributable to advanced cirrhosis or active inflammatory disease of other causes. Future studies will have to show to what extent a measure of selectivity of IgG elevation may increase the accuracy of the proposed score.

Demonstration of hepatitis on histology is considered a prerequisite for application of this score. Women with cirrhosis attributable to nonalcoholic steatohepatitis, for example, may have elevated gamma-globulins as well as autoantibodies.12 Histology is essential for the diagnosis of AIH, although it does not always show lesions typical for AIH. It is required to exclude other disease entities. In addition, histology may help in assessing disease severity and thus guiding intensity of immunosuppressive therapy. In patients with acute AIH and compromised coagulation status, histology can be obtained by minilaparoscopy or transjugular biopsy.23–25

Negativity for viral markers was documented completely only for hepatitis B and C. Patients with acute presentation require exclusion of other hepatotropic viruses such as hepatitis A and E, cytomegalovirus, Epstein-Barr virus, herpes simplex virus, parvovirus B19, or adenoviral infections. Most of these cases, however, do not have elevation of IgG or gamma globulins, and histology tends to be more suggestive of viral infection. In hepatitis B and C, liver histology and clinical presentation might be more difficult to differentiate from AIH.26

We considered excluding all patients with viral hepatitis and requesting negativity for viral hepatitis markers as a prerequisite of applying the diagnostic score. However, in some countries, prevalence of viral hepatitis is above 10%, and here patients may well suffer from both diseases simultaneously.20 The simplified criteria showed comparably good results when tested with and without excluding patients with viral hepatitis first. We therefore included the parameter in the score.

Because a major problem in diagnosing AIH is to distinguish between AIH patients and patients with autoimmune cholestatic liver diseases, the emphasis in selecting patients for the control groups was on these patients. In comparison with routine clinical practice, patients with nonalcoholic fatty liver disease and HCV may be underrepresented in this study. This might lead to misleadingly low sensitivity and specificity of the score.

Some important disease groups that need to be considered in the differential diagnosis have not been evaluated sufficiently and will require further exploration. One of the most difficult differential diagnoses in patients with acute hepatitis is drug-induced hepatitis. In immunologically mediated drug reactions, the clinical and histological picture may look similar, although hypergammaglobulinemia and highly elevated autoantibodies are rare in drug-induced hepatitis. The concept that certain drugs, such as minocycline and statins, can trigger AIH may confound this situation. Because immunosuppression is probably indicated in these drug-induced reactions, differentiation is less urgent. It should also be noted that this study did not include many patients with Wilson's disease, genetic hemochromatosis, or alpha1-antitrypsin deficiency, whose biopsies may sometimes mimic AIH.

The issue of overlap is controversial within the scientific community. The investigators believe that an active AIH component requires treatment independent of coexistent PSC or PBC; however, we acknowledge that others may consider this debatable. Including these patients as AIH patients is based on the assumption that in predominant AIH this is the prognosis-determining disease process, and therefore a new score should detect these patients. Excluding this group of patients does not significantly change the overall results. Distinction between PBC and AIH and between PSC and AIH, as well as identification of the so-called variant or overlap syndromes, was not always possible on the basis of this score. This difficulty reflects the ongoing discussion regarding to what extent there may be overlap among these autoimmune liver diseases (Fig. 2). The IAIHG is undertaking a large study trying to define these patients further. The diagnostic score devised in 1993 was applied to a larger number of patients by six published studies.27–31 Studies agreed that the scoring system had a high degree of sensitivity for the diagnosis of AIH (97%-100%). In patients with PSC and other biliary disorders, the specificity for exclusion of definite AIH was 96% to 100%. However, between 8% and 52% of patients achieved scores placing them as probable AIH, reducing the overall specificity to 45% to 92%. The scoring system was revised in 1999 to improve exclusion of patients with biliary liver disease.32–35 In view of the large number of patients with PBC and PSC included in the current study, the new simplified criteria seem to compare well with the previous score.

Figure 2.

(A) Training set. Discrimination of AIH patients and controls using points scored with simplified diagnostic criteria. (B) Validation set. Discrimination of AIH patients and controls using points scored with simplified diagnostic criteria.

In summary, we have proposed and evaluated a simplified scoring system for the diagnosis of AIH that can easily be applied in daily clinical practice. The score has been shown to have a high degree of sensitivity and specificity. Further evaluation in patients with characteristics of two autoimmune liver diseases and standardization of immunoglobulin and autoantibody testing might further improve this clinical tool.