Nonalcoholic fatty liver disease (NAFLD) activity score and the histopathologic diagnosis in NAFLD: distinct clinicopathologic meanings

  • Potential conflict of interest: Nothing to report.

  • The Nonalcoholic Steatohepatitis Clinical Research Network (NASH CRN) is supported by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) (grants U01DK061718, U01DK061728, U01DK061731, U01DK061732, U01DK061734, U01DK061737, U01DK061738, U01DK061730, U01DK061713), and the National Institute of Child Health and Human Development (NICHD). Several clinical centers use support from General Clinical Research Centers or Clinical and Translational Science Awards in conduct of NASH CRN Studies (grants UL1RR024989, M01RR000750, M01RR00188, UL1RR02413101, M01RR000827, UL1RR02501401, M01RR000065, M01RR020359, UL1RR025741).

Abstract

The diagnosis of nonalcoholic steatohepatitis (NASH) is defined by the presence and pattern of specific histological abnormalities on liver biopsy. A separate system of scoring the features of nonalcoholic fatty liver disease (NAFLD) called the NAFLD Activity Score (NAS) was developed as a tool to measure changes in NAFLD during therapeutic trials. However, some studies have used threshold values of the NAS, specifically NAS ≥5, as a surrogate for the histologic diagnosis of NASH. To evaluate whether this unintended use of the NAS is valid, biopsy and clinical data from the 976 adults in NASH Clinical Research Network (CRN) studies were reviewed. Biopsies were evaluated centrally by the NASH CRN Pathology Committee. Definite steatohepatitis (SH) was diagnosed in 58.1%, borderline SH in 19.5% and “not SH” in 22%. The NAS was ≥5 in 50% and ≤4 in 49%; in this cohort only 75% of biopsies with definite SH had an NAS ≥5, whereas 28% of borderline SH and 7% of “not SH” biopsies had NAS ≥5. Of biopsies with an NAS ≥5, 86% had SH and 3% “not SH”. NAS ≤4 did not indicate benign histology; 29% had SH and only 42% had “not SH.” Higher values of the NAS were associated with higher levels of alanine aminotransferase and aspartate aminotransferase, whereas the diagnosis of SH was associated with features of the metabolic syndrome. Conclusion: The diagnosis of definite SH or the absence of SH based on evaluation of patterns as well as individual lesions on liver biopsies does not always correlate with threshold values of the semiquantitative NAS. Clinical trials and observational studies should take these different performance characteristics into account. (HEPATOLOGY 2011)

The diagnosis of nonalcoholic steatohepatitis (NASH) is established by the presence of a characteristic pattern of steatosis, inflammation, and hepatocellular ballooning on liver biopsies in the absence of significant alcohol consumption. The value of establishing a diagnosis of NASH is that it identifies individuals who are at risk for progressive liver disease to the point of cirrhosis and death from chronic liver disease. However, the dichotomous assessment of liver biopsies as either having steatohepatitis (SH) or not is less helpful in treatment trials of therapeutic agents to improve NASH because it cannot identify patients in whom NASH significantly lessened in severity with treatment but continued to fulfill diagnostic criteria for NASH. For this reason, a scoring system was needed that included the full spectrum of nonalcoholic fatty liver disease and would be sensitive to changes in the underlying disease process independent of the diagnosis of NASH.

To meet this need, a scoring system for nonalcoholic fatty liver disease (NAFLD) was developed and validated by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) sponsored Nonalcoholic Steatohepatitis Clinical Research Network (NASH CRN) Pathology Committee.1 The methodology proposed for feature-based scoring of histologic lesions of NAFLD has been widely utilized, as evidenced by its application in numerous clinical and experimental settings in NAFLD-related studies. The recognized strengths of the method include the relative ease of understanding and, therefore, application of the system; division of lesions of active and potentially reversible injury (“grade”) in the NAFLD Activity Score (NAS) and those potentially less reversible and characterized by collagen deposition and architectural alterations that may evolve toward more permanent parenchymal remodeling (“stage”). The proposed NAS also clearly separates the three lesions that comprise grade: steatosis, lobular inflammation, and ballooning. This allows detailed analysis of histologic changes for comparative and correlative studies in therapeutic intervention trials.

The histologically based NAS was derived from 10 pathologists' blinded and individual readings of biopsies from 32 adults and 18 children with clinically presumed NAFLD. The adult biopsies were read twice, and the pediatric biopsies once by each pathologist. It was noted in the publication of the validation study that the numeric scores correlated closely but not perfectly with separately derived diagnoses of “definite SH,” “not SH,” and “borderline SH.”

It is, however, increasingly apparent from ongoing and published studies that the numeric value of the composite NAS is considered by some investigators to be either “synonymous” with, or actually a replacement for, a microscopic diagnosis that is based on overall pattern of injury as well as the presence of additional lesions such as zonality of lesions, portal inflammation, and fibrosis.2 The validity of this unintended use of the NAS has not been formally evaluated.

To objectively assess the relationships of the NAS, the diagnosis of SH, and important clinical characteristics of NAFLD, we availed ourselves of the large, well-characterized dataset from the NASH CRN. We demonstrate that the NAS and the diagnostic category of definite SH are closely correlated, but also have distinct clinicopathologic relationships. The study further highlights that not all biopsies with NAS ≥5 have findings that meet diagnostic criteria of definite SH, and that some cases of NAS ≤4 do, indicating that a threshold value of NAS ≥5 cannot be used reliably to establish the presence or absence of NASH.

Abbreviations

ALT, alanine aminotransferase; AST, aspartate aminotransferase; NAFLD, nonalcoholic fatty liver disease; NAS, NAFLD Activity Score; NASH, nonalcoholic steatohepatitis; SH, steatohepatitis.

Materials and Methods

Biopsies from adult patients enrolled in either the database study or pretreatment biopsies from the adult treatment trial (Pioglitazone versus Vitamin E versus Placebo for the Treatment of Nondiabetic Patients with NASH; [PIVENS]) were reviewed in a standardized blinded fashion by the Pathology Committee of the NASH CRN, composed of a pathologist from each of the eight clinical centers, and one from the National Cancer Institute. Assignment of a diagnostic category was based on consensus recognition of the distinctive features of SH independent of the degree of NAFLD severity as indicated by the NAS. Biopsies that had been classified by the Pathology Committee during Central Review as “cirrhosis with or without features of NAFLD or NASH” were excluded from this analysis, as it is well recognized that the active lesions of SH may not be retained in cirrhosis. Biopsies with the zone 1 borderline pattern were also excluded as this is a pattern that most commonly occurs in pediatric NAFLD and was rare among our adult cases. When more than one biopsy for a subject was available, only the first biopsy was used in the analysis. Histologic and clinical data were analyzed as described below.

Histologic Data.

The following histologic data were analyzed: diagnosis rendered by the Pathology Committee (i.e., “not steatohepatitis,” “borderline, zone 3 pattern,” “definite steatohepatitis”); the aggregate NAS; the score of each component of the NAS (steatosis (0-3), lobular inflammation (0-3), ballooning (0-2)), and fibrosis scores (0,1a,1b,1c,2,3). In addition, portal chronic inflammation and steatosis location were included. The “borderline zone 3 pattern” was reserved for biopsies that have zone 3 accentuation of lesions, but not all the lesions for definite SH were present. This definition is purposely left broad so as to neither preclude further evaluation nor include the biopsies in the “not SH” category.
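
As a minimal sketch of how the aggregate NAS follows from the component scores listed above, the Python function below sums the three grades and applies the ≥5 threshold examined in this study. The function and constant names are illustrative assumptions and are not part of the NASH CRN scoring materials; only the component ranges are taken from the text.

```python
# Illustrative sketch: names are hypothetical; ranges are those stated in the text.
COMPONENT_RANGES = {
    "steatosis": (0, 3),
    "lobular_inflammation": (0, 3),
    "ballooning": (0, 2),
}


def nas_score(steatosis: int, lobular_inflammation: int, ballooning: int) -> int:
    """Return the NAFLD Activity Score (0-8), the unweighted sum of its components."""
    scores = {
        "steatosis": steatosis,
        "lobular_inflammation": lobular_inflammation,
        "ballooning": ballooning,
    }
    for name, value in scores.items():
        low, high = COMPONENT_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return sum(scores.values())


def meets_threshold(nas: int, cutoff: int = 5) -> bool:
    """True when the NAS reaches the cutoff (>=5 is the threshold evaluated here)."""
    return nas >= cutoff
```

Applied to the two biopsies illustrated in Figure 2, `nas_score(3, 2, 0)` gives 5 for the case diagnosed as not SH, and `nas_score(1, 1, 1)` gives 3 for the case diagnosed as definite SH. The score carries no information about pattern or zonality, which is why it can diverge from the diagnosis.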

Clinical Data.

Clinical data obtained at baseline were used in the analyses, and laboratory measures were limited to those values within 6 months of the biopsy for each subject. From the dataset the following were analyzed: demographic features (age at biopsy, gender, race, ethnicity), body mass index (BMI), and laboratory values including alanine aminotransferase (ALT), aspartate aminotransferase (AST), fasting serum glucose and insulin, antinuclear antibody (ANA), and triglycerides. Calculations were performed to derive the presence or absence of criteria for metabolic syndrome,3 HOMA-IR, and QUICKI. QUICKI is an inverse log transformation of HOMA, and has been found to be linearly related to formal clamp measures of IR.4
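
The two insulin resistance indices can be sketched in Python from the formulas given in the table footnotes. This is an illustration under stated assumptions: the function names are hypothetical, the logarithm is taken as base 10 (the standard QUICKI definition), and the glucose conversion of roughly 18 mg/dL per mmol/L is an approximation introduced here rather than a value stated in the study.

```python
import math

# Assumption: approximate conversion factor for glucose, mg/dL per mmol/L.
MG_DL_PER_MMOL_L = 18.0


def homa_ir(insulin_uU_ml: float, glucose_mg_dl: float) -> float:
    """HOMA-IR = fasting insulin (uU/mL) * fasting glucose (mmol/L) / 22.5."""
    glucose_mmol_l = glucose_mg_dl / MG_DL_PER_MMOL_L
    return insulin_uU_ml * glucose_mmol_l / 22.5


def quicki(insulin_uU_ml: float, glucose_mg_dl: float) -> float:
    """QUICKI = 1 / (log10(insulin uU/mL) + log10(glucose mg/dL))."""
    return 1.0 / (math.log10(insulin_uU_ml) + math.log10(glucose_mg_dl))
```

Because log(a) + log(b) = log(a*b), this form of QUICKI is the same quantity as 1/log(insulin*glucose). For a fasting insulin of 18.3 μU/mL and glucose of 95 mg/dL, the sketch gives a HOMA-IR near 4.3 and a QUICKI near 0.31.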

Statistical Analyses.

Chi-square tests were used to compare univariate associations of categorical variables with NASH diagnosis (not SH, borderline SH, definite SH) and with NAS (≤4 versus ≥5). Fisher's Exact test was used for categorical variables with small expected numbers. Continuous variables were compared to NASH diagnosis (three categories) using analysis of variance (ANOVA) for normally distributed variables (age at biopsy and BMI). The nonparametric Kruskal-Wallis test was used to compare laboratory measures with the three-category NASH diagnosis, and the Wilcoxon rank sum test was used for the binary NAS (≤4 versus ≥5). Also examined were the associations between patient characteristics (demographics and laboratory measures) and diagnosis, within categories of NAS (≤4 versus ≥5). Components of the NAS (steatosis, lobular inflammation, and ballooning) were excluded from statistical analysis when making comparisons to the binary NAS variable (≤4 versus ≥5). Methods for evaluating a diagnostic test (sensitivity, specificity, percent agreement, and Cohen's kappa statistic) were used to evaluate the information loss in an NAS cutoff (≥5 versus ≤4) as a surrogate for the histological diagnosis of SH (definite SH versus borderline/not SH).

Univariate regression analyses were performed to assess the individual associations between the NAS (≥5 versus ≤4) and select patient characteristics. These results were compared to the univariate regression analyses of SH diagnosis and these same patient characteristics. Outcome measures included: ALT (U/L), AST (U/L), diabetes, metabolic syndrome, HOMA-IR, and QUICKI. Linear regression was used for continuous outcome measures (ALT, AST, HOMA-IR, and QUICKI) and logistic regression was used for binary outcome measures (diabetes and metabolic syndrome). Multivariate regression analyses were used to assess the independent association between diagnosis of definite SH and these same patient characteristics, controlling for the NAS. That is, both the binary NAS variable and SH diagnosis were included as covariates in the models. The β coefficients (for continuous outcomes), odds ratios (for binary outcomes), 95% confidence intervals, and P-values were compared for each model.

Nominal, two-sided P-values were used and were considered statistically significant if P < 0.05; no adjustments for multiple comparisons were made. Analyses were performed using SAS statistical software (v. 9; SAS Institute, Cary, NC) and Stata (Rel. 10; Stata Corp., College Station, TX).

Results

Comparison of Demographic and Clinical Characteristics Across NASH Diagnostic Categories.

Data from a total of 934 adult liver biopsies without cirrhosis or the “zone 1 borderline” diagnosis were available for this analysis. The diagnoses were “not steatohepatitis” in 208 (22%), “borderline steatohepatitis” in 183 (20%), and “definite steatohepatitis” in 543 (58%). Table 1 highlights clinical correlates in these categories. Definite SH was observed in a higher proportion of women (P = 0.004) and tended to be seen in older individuals (P = 0.05). There were no significant differences in the distribution of diagnostic categories among races, but Hispanics had proportionally fewer borderline cases (P = 0.03). BMI did not differ among the diagnostic categories. Definite SH was associated with higher serum ALT, triglycerides, insulin, and calculated HOMA-IR, lower QUICKI, as well as a higher frequency of diabetes. No association was found with the presence or absence of serum ANA or serum fasting glucose. Thus, the most important clinical correlations conventionally associated with SH, i.e., older age, female gender, evidence of insulin resistance, and elevated ALT5 were confirmed by blinded analysis of the biopsies for diagnostic category, regardless of the NAS.

Table 1. Baseline Characteristics by Diagnosis*
| Characteristic | Not Steatohepatitis | Borderline Steatohepatitis | Definite Steatohepatitis | Total | P |
|---|---|---|---|---|---|
| N | 208 | 183 | 543 | 934 | |
| Age at biopsy, yrs (range) | 47.0 (18.2-78.5) | 46.7 (19.1-68.6) | 48.7 (18.3-75.0) | 47.9 (18.2-78.5) | 0.05 |
| Female gender | 115 (55.3) | 102 (55.7) | 360 (66.3) | 577 (61.8) | 0.004 |
| Race | | | | | 0.14 |
| White | 168 (83.6) | 147 (82.6) | 439 (84.4) | 754 (83.9) | |
| Black | 7 (3.5) | 7 (3.9) | 14 (2.7) | 28 (3.1) | |
| Asian or Pacific Islander | 11 (5.5) | 17 (9.6) | 25 (4.8) | 53 (5.9) | |
| American Indian or AK Native | 9 (4.5) | 1 (0.6) | 18 (3.5) | 28 (3.1) | |
| More than 1 race | 6 (3.0) | 6 (3.4) | 24 (4.6) | 36 (4.0) | |
| Hispanic/Latino | 28 (13.5) | 12 (6.6) | 75 (13.8) | 115 (12.3) | 0.03 |
| BMI (kg/m2) | 33.8 ± 6.2 | 33.8 ± 6.8 | 34.4 ± 6.3 | 34.2 ± 6.4 | 0.37 |
| Diabetes (Type 2) | 38 (18.3) | 37 (20.2) | 170 (31.3) | 245 (26.2) | <0.001 |
| ALT (U/L) | 51 (33-73) | 67 (38-91) | 77 (51-116) | 67 (46-98) | <0.0001 |
| Triglycerides (mg/dL) | 144 (104-194) | 151 (104-208) | 159 (115-238) | 157 (110-219) | 0.02 |
| Glucose (mg/dL) | 96 (87-107) | 93 (85-104) | 96 (85-110) | 95 (85-108) | 0.34 |
| Insulin (μU/mL) | 15.4 (10.0-20.8) | 17.5 (11.8-24.0) | 20.5 (13.1-30.0) | 18.3 (12.1-28.2) | <0.0001 |
| HOMA-IR | 3.5 (2.3-5.3) | 4.3 (2.6-6.2) | 4.9 (3.1-7.6) | 4.3 (2.8-7.0) | <0.0001 |
| QUICKI | 0.319 ± 0.029 | 0.313 ± 0.029 | 0.307 ± 0.034 | 0.311 ± 0.032 | 0.001 |
| ANA positive | 33 (25.4) | 20 (17.1) | 87 (23.8) | 140 (22.9) | 0.24 |

  • * Values are N (%), means ± SD, or medians (IQR), unless otherwise specified.

  • P-values derived from chi-square tests for categorical variables (Fisher's Exact test when expected numbers were small), from ANOVA for age at biopsy and BMI, and from the Kruskal-Wallis Test for laboratory measures.

  • Only laboratory values collected within 6 months of the liver biopsy were included. N = 573 for ALT and glucose measurements; N = 577 for triglycerides; N = 563 for insulin and HOMA-IR measurements; N = 612 for ANA. HOMA-IR is the homeostasis model assessment method for insulin resistance, calculated as (fasting insulin (μU/mL)*fasting glucose (mmol/L))/22.5. QUICKI is calculated as: 1/log(insulin μU/mL*glucose mg/dL).

Comparison of Demographic and Clinical Characteristics by NAS Category.

Table 2 shows the results of similar comparisons utilizing the subsets of NAS ≤4 (n = 461) and NAS ≥5 (n = 473). The biopsies in the higher NAS category were associated with female gender, as well as elevated ALT, triglycerides, insulin, and HOMA-IR, and lower QUICKI values. There was a trend of association of the higher NAS with diabetes (P = 0.05), but this categorization of low versus high NAS was not associated with age, race, ethnicity, BMI, serum fasting glucose, or presence of positive ANA.

Table 2. Baseline Characteristics by NAS*
| Characteristic | NAS ≤4 | NAS ≥5 | Total | P |
|---|---|---|---|---|
| N | 461 | 473 | 934 | |
| Age at biopsy, yrs (range) | 47.7 (18.3-78.5) | 48.1 (18.2-72.5) | 47.9 (18.2-78.5) | 0.56 |
| Female gender | 257 (55.8) | 320 (67.7) | 577 (61.8) | 0.0002 |
| Race | | | | 0.74 |
| White | 368 (82.7) | 386 (85.0) | 754 (83.9) | |
| Black | 16 (3.6) | 12 (2.6) | 28 (3.1) | |
| Asian or Pacific Islander | 30 (6.7) | 23 (5.1) | 53 (5.9) | |
| American Indian or AK Native | 13 (2.9) | 15 (3.3) | 28 (3.1) | |
| More than 1 race | 18 (4.0) | 18 (4.0) | 36 (4.0) | |
| Hispanic/Latino | 52 (11.3) | 63 (13.3) | 115 (12.3) | 0.34 |
| BMI (kg/m2) | 34.0 ± 6.5 | 34.3 ± 6.3 | 34.2 ± 6.4 | 0.51 |
| Diabetes (Type 2) | 108 (23.4) | 137 (29.0) | 245 (26.2) | 0.05 |
| ALT (U/L) | 58 (36-78) | 82 (53-121) | 67 (46-98) | <0.0001 |
| Triglycerides (mg/dL) | 149 (105-202) | 161 (115-239) | 157 (110-219) | 0.007 |
| Glucose (mg/dL) | 95 (86-107) | 96 (84-110) | 95 (85-108) | 0.85 |
| Insulin (μU/mL) | 17.0 (11.5-24.0) | 20.1 (13.0-30.0) | 18.3 (12.1-28.2) | 0.001 |
| HOMA-IR | 3.9 (2.6-6.3) | 4.8 (3.1-7.5) | 4.3 (2.8-7.0) | 0.004 |
| QUICKI | 0.314 ± 0.032 | 0.307 ± 0.033 | 0.311 ± 0.032 | 0.008 |
| ANA positive | 74 (24.8) | 66 (21.0) | 140 (22.9) | 0.26 |

  • * Values are N (%), means ± SD, or medians (IQR), unless otherwise specified.

  • P-values derived from chi-square tests for categorical variables (Fisher's Exact test when expected numbers were small), from ANOVA for age at biopsy and BMI, and from the Kruskal-Wallis Test for laboratory measures.

  • Only laboratory values collected within 6 months of the liver biopsy were included. N = 573 for ALT and glucose measurements; N = 577 for triglycerides; N = 563 for insulin and HOMA-IR measurements; N = 612 for ANA. HOMA-IR is the homeostasis model assessment method for insulin resistance, calculated as (fasting insulin (μU/mL)*fasting glucose (mmol/L))/22.5. QUICKI is calculated as: 1/log(insulin μU/mL*glucose mg/dL).

Comparison of Histology and Diagnostic Category (Table 3).

All histologic components of the NAS, as well as fibrosis scores and amounts of portal chronic inflammation, were highly correlated with the diagnostic categories (P < 0.0001 for all). Steatosis scores of <5% or 5%-33% were more often found in the “not” SH category, whereas those of grades 2 (33%-66%) and 3 (>66%) were evenly distributed between borderline and definite SH. Only three biopsies had no lobular inflammation; they were all in the not SH category; there was a clear association of increased lobular inflammation with definite SH. The majority of all biopsies had mild portal chronic inflammation; however, a greater percentage of the not SH biopsies had none, and more of the definite SH were classified as greater than mild portal chronic inflammation (P < 0.0001). Ballooning was clearly absent in the majority of “not SH” (95.7%) and borderline (62.8%), and clearly present in the majority of definite SH (99.6%) (P < 0.0001). Of note, two cases of definite SH did not have ballooning, and seven cases with “many” ballooned hepatocytes had been categorized as either not SH (n = 3) or as borderline (n = 4).

Table 3. Histology by Steatohepatitis
| Histology variable* | No steatohepatitis (n = 208) | Borderline (n = 183) | Definite steatohepatitis (n = 543) | Total (n = 934) | P |
|---|---|---|---|---|---|
| NAS | | | | | <0.0001 |
| ≤4 | 194 (93.3) | 131 (71.6) | 136 (25.1) | 461 (49.4) | |
| ≥5 | 14 (6.6) | 52 (28.4) | 407 (75.0) | 473 (50.6) | |
| Steatosis grade | | | | | <0.0001 |
| 0 - <5% | 24 (11.5) | 5 (2.7) | 8 (1.5) | 37 (4.0) | |
| 1 - 5-33% | 106 (51.0) | 68 (37.2) | 175 (32.2) | 349 (37.4) | |
| 2 - 34-66% | 49 (23.6) | 56 (30.6) | 201 (37.0) | 306 (32.8) | |
| 3 - >66% | 29 (13.9) | 54 (29.5) | 159 (29.3) | 242 (25.9) | |
| Steatosis location | | | | | <0.0001 |
| 0 - Zone 3 | 105 (51.0) | 86 (47.0) | 200 (36.8) | 391 (42.0) | |
| 1 - Zone 1 | 5 (2.4) | 1 (0.6) | 2 (0.4) | 8 (0.9) | |
| 2 - Azonal | 51 (24.8) | 36 (19.7) | 140 (25.8) | 227 (24.4) | |
| 3 - Panacinar | 45 (21.8) | 60 (32.8) | 201 (37.0) | 306 (32.8) | |
| Lobular inflammation | | | | | <0.0001 |
| 0 - none | 3 (1.4) | 0 (0.0) | 0 (0.0) | 3 (0.3) | |
| 1 - <2 | 167 (80.3) | 104 (56.8) | 190 (35.0) | 461 (49.4) | |
| 2 - 2-4 | 36 (17.3) | 70 (38.3) | 261 (48.1) | 367 (39.3) | |
| 3 - >4 | 2 (1.0) | 9 (4.9) | 92 (16.9) | 103 (11.0) | |
| Chronic portal inflammation | | | | | <0.0001 |
| 0 - none | 60 (28.9) | 33 (18.0) | 58 (10.7) | 151 (16.2) | |
| 1 - mild | 125 (60.1) | 128 (70.0) | 354 (65.2) | 607 (65.0) | |
| 2 - > mild | 23 (11.1) | 22 (12.0) | 131 (24.1) | 176 (18.8) | |
| Ballooning | | | | | <0.0001 |
| 0 - none | 199 (95.7) | 115 (62.8) | 2 (0.4) | 316 (33.8) | |
| 1 - few | 6 (2.9) | 64 (35.0) | 174 (32.0) | 244 (26.1) | |
| 2 - many | 3 (1.4) | 4 (2.2) | 367 (67.6) | 374 (40.0) | |
| Fibrosis | | | | | <0.0001 |
| 0 - none | 149 (73.0) | 59 (32.4) | 39 (7.3) | 247 (26.7) | |
| 1a | 17 (8.3) | 47 (25.8) | 75 (13.9) | 139 (15.0) | |
| 1b | 2 (1.0) | 12 (6.6) | 94 (17.5) | 108 (11.7) | |
| 1c | 18 (8.8) | 7 (3.9) | 2 (0.4) | 27 (2.9) | |
| 2 | 7 (3.4) | 36 (19.8) | 145 (27.0) | 188 (20.4) | |
| 3 - bridging | 11 (5.4) | 21 (11.5) | 183 (34.0) | 215 (23.3) | |

  • * Values are N (%). Biopsies with cirrhosis were excluded. P-values derived from chi-square tests or Fisher's Exact test (for categorical variables with small expected numbers).
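
To make the chi-square comparison concrete, the following Python sketch computes the Pearson statistic for the NAS by diagnosis counts in Table 3. It is an illustration only: the study used SAS and Stata, the function name is hypothetical, and the closed-form P-value shown is valid only because this particular table has 2 degrees of freedom.

```python
import math


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Counts from the NAS rows of Table 3; columns are not SH, borderline, definite SH.
nas_by_diagnosis = [
    [194, 131, 136],  # NAS <=4
    [14, 52, 407],    # NAS >=5
]
statistic, dof = pearson_chi_square(nas_by_diagnosis)

# With exactly 2 degrees of freedom, the chi-square upper tail equals exp(-x / 2).
p_value = math.exp(-statistic / 2)
```

The resulting statistic is far beyond the critical value for 2 degrees of freedom, consistent with the reported P < 0.0001. For tables with other degrees of freedom a statistics library would be needed for the tail probability.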

Histologic Features and the NAS.

Table 4 shows the sensitivity (0.75; 95% confidence interval [CI]: 0.72-0.78), specificity (0.83; 95% CI: 0.80-0.85), percent agreement (78.4%, 95% CI: 75.6-81.0), and Cohen's kappa statistic (0.57; 95% CI: 0.51-0.62) when using an NAS cutoff (≥5 versus ≤4) as a substitute for the histological diagnosis of SH (definite SH versus borderline/not SH). Taken together, these measures indicate a substantial loss in information if the NAS were used as a surrogate for the diagnosis of SH.

Table 4. Sensitivity, Specificity, Percent Agreement, and Cohen's Kappa Statistic Using NAS Cutpoint of 5 for Classification of NASH Diagnosis
| NAS | NASH diagnosis: definite | NASH diagnosis: borderline/not | Total |
|---|---|---|---|
| ≥5 | 407 | 66 | 473 |
| ≤4 | 136 | 325 | 461 |
| Total | 543 | 391 | 934 |

  • Sensitivity (95% CI): 0.75 (0.72-0.78); specificity (95% CI): 0.83 (0.80-0.85); percent agreement (95% CI): 78.4% (75.6-81.0); kappa (95% CI): 0.57 (0.51-0.62).
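
The point estimates reported for Table 4 can be reproduced from its four cell counts. The Python sketch below is a minimal illustration with a hypothetical function name; it treats NAS ≥5 as the "test" and definite SH as the reference diagnosis, and it omits confidence intervals because the interval method is not stated in the text.

```python
def agreement_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Diagnostic-test summaries for a 2 x 2 table of test result by reference diagnosis."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "percent_agreement": 100 * observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Table 4: NAS >=5 with definite SH, NAS >=5 without, NAS <=4 with definite SH, NAS <=4 without.
table4 = agreement_measures(tp=407, fp=66, fn=136, tn=325)
```

Rounded, the sketch returns a sensitivity of 0.75, specificity of 0.83, percent agreement of 78.4, and kappa of 0.57. The positive predictive value of 0.86 corresponds to the 86% of NAS ≥5 biopsies that were diagnosed as definite SH.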

Figure 1 shows the relationship between the NAS and the diagnostic category, which is nearly identical to the relationship our group previously reported.1 Table 5 shows a detailed breakdown of histologic features between the NAS ≥5 and the NAS ≤4 biopsies. Although NAS ≥5 biopsies were most commonly categorized as definite SH (86%), 66 biopsies were diagnosed as either not SH (3%) or borderline (11%). Less than half of NAS ≤4 biopsies were diagnosed as not SH (42%), whereas 28% were considered borderline, and nearly 30% had definite SH. Portal inflammation and fibrosis were more severe in NAS ≥5 biopsies compared to NAS ≤4. Of note, however, was the finding that ballooning, a central feature of the diagnosis of SH, was classified as none in 41/473 (9%) of NAS ≥5; on the other hand, ballooning was not only present, but was marked in 60/461 (13%) NAS ≤4. Figure 2A and B illustrate high NAS, “not SH,” and low NAS, definite SH, respectively.

Figure 1.

The percentages of biopsies with diagnoses of definite SH (closed triangle), borderline (probable) SH (open circle) and definitely not SH (closed square). As can be noted, the majority of definite SH are >5 and the majority of not SH are <3; however, the scores and diagnostic categories are not as easily separated in the NAS 3-5 ranges.

Figure 2.

(A) An example of high NAS (steatosis, grade 3; lobular inflammation grade 2; ballooning grade 0, NAS = 5), but not SH by diagnosis (20×, hematoxylin and eosin). (B) An example of low NAS (steatosis, lobular inflammation, and ballooning all grade 1, NAS = 3), but diagnosed as definite SH (20×, hematoxylin and eosin).

Table 5. Histology by NAS
| Histology Variable* | NAS ≤4 (n = 461) | NAS ≥5 (n = 473) | Total (n = 934) | P |
|---|---|---|---|---|
| Diagnosis | | | | <0.0001 |
| Not steatohepatitis | 194 (42.1) | 14 (3.0) | 208 (22.3) | |
| Borderline | 131 (28.4) | 52 (11.0) | 183 (19.6) | |
| Definite steatohepatitis | 136 (29.5) | 407 (86.0) | 543 (58.1) | |
| Steatosis grade | | | | n/a |
| 0 - <5% | 33 (7.2) | 4 (0.9) | 37 (4.0) | |
| 1 - 5-33% | 267 (57.9) | 82 (17.3) | 349 (37.4) | |
| 2 - 34-66% | 129 (28.0) | 177 (37.4) | 306 (32.8) | |
| 3 - >66% | 32 (6.9) | 210 (44.4) | 242 (25.9) | |
| Steatosis location | | | | <0.001 |
| 0 - Zone 3 | 231 (50.3) | 160 (33.8) | 391 (42.0) | |
| 1 - Zone 1 | 5 (1.1) | 3 (0.6) | 8 (0.9) | |
| 2 - Azonal | 132 (28.8) | 95 (20.1) | 227 (24.4) | |
| 3 - Panacinar | 91 (19.8) | 215 (45.5) | 306 (32.8) | |
| Lobular inflammation | | | | n/a |
| 0 - none | 3 (0.7) | 0 (0.0) | 3 (0.3) | |
| 1 - <2 | 379 (82.2) | 82 (17.3) | 461 (49.4) | |
| 2 - 2-4 | 79 (17.1) | 288 (60.9) | 367 (39.3) | |
| 3 - >4 | 0 (0.0) | 103 (21.8) | 103 (11.0) | |
| Chronic portal inflammation | | | | <0.0001 |
| 0 - none | 91 (19.7) | 60 (12.7) | 151 (16.2) | |
| 1 - mild | 306 (66.4) | 301 (63.6) | 607 (65.0) | |
| 2 - > mild | 64 (13.9) | 112 (23.7) | 176 (18.8) | |
| Ballooning | | | | n/a |
| 0 - none | 275 (59.7) | 41 (8.7) | 316 (33.8) | |
| 1 - few | 126 (27.3) | 118 (25.0) | 244 (26.1) | |
| 2 - many | 60 (13.0) | 314 (66.4) | 374 (40.0) | |
| Fibrosis | | | | <0.0001 |
| 0 - none | 192 (42.3) | 55 (11.7) | 247 (26.7) | |
| 1a | 71 (15.6) | 68 (14.5) | 139 (15.0) | |
| 1b | 35 (7.7) | 73 (15.5) | 108 (11.7) | |
| 1c | 25 (5.5) | 2 (0.4) | 27 (2.9) | |
| 2 | 59 (13.0) | 129 (27.5) | 188 (20.4) | |
| 3 - bridging | 72 (15.9) | 143 (30.4) | 215 (23.3) | |

  • * Values are N (%). Biopsies with cirrhosis were excluded. P-values derived from chi-square tests or Fisher's exact test (for categorical variables with small expected numbers). P-values for components of the NAS (steatosis amount, lobular inflammation, and ballooning) were not included.

NAS, Histological Diagnoses, and Clinical Characteristics.

To better understand the clinical characteristics of patients with low NAS but a definite SH diagnosis, or conversely, a high NAS but not a definite SH diagnosis, the biopsies with low and high NAS were analyzed separately. Table 6 shows the demographic and clinical characteristics by diagnostic category among NAS ≤4 biopsies and NAS ≥5 biopsies across all diagnostic categories, respectively. In the NAS ≤4 biopsies, elevated ALT correlated with the diagnosis of definite SH (P = 0.003). No other clinical finding was discriminatory in that group. For biopsies with high NAS (≥5), several clinical features showed strong associations with the category of definite SH. The strongest was diabetes (P < 0.0001); other factors were Hispanic ethnicity, higher fasting insulin levels and HOMA-IR, and lower QUICKI. A trend toward positive ANA was noted (P = 0.06). Table 7 compares diagnostic categories of not SH with definite SH according to the NAS. Elevated serum ALT and triglycerides correlated with NAS ≥5 in those with definite SH (P = 0.002, P = 0.05, respectively), but not in those without SH (P = 0.14, P = 0.95, respectively). However, other clinical features were not significantly different among either the not SH category or definite SH category based on NAS ≤4 or ≥5.

Table 6. Characteristics of Patients by NAS and Diagnosis*
| Characteristic | NAS | Not Steatohepatitis | Borderline Steatohepatitis | Definite Steatohepatitis | P* |
|---|---|---|---|---|---|
| N | ≤4 | 194 | 131 | 136 | |
| N | ≥5 | 14 | 52 | 407 | |
| Age at biopsy (yrs) | ≤4 | 47.3 ± 11.6 | 46.5 ± 11.6 | 49.3 ± 11.9 | 0.13 |
| Age at biopsy (yrs) | ≥5 | 41.9 ± 13.1 | 47.1 ± 10.6 | 48.5 ± 11.3 | 0.08 |
| Female gender | ≤4 | 107 (55.2) | 72 (55.0) | 78 (57.4) | 0.90 |
| Female gender | ≥5 | 8 (57.1) | 30 (57.7) | 282 (69.3) | 0.16 |
| Caucasian race | ≤4 | 156 (83.0) | 103 (81.1) | 109 (83.9) | 0.84 |
| Caucasian race | ≥5 | 12 (92.3) | 44 (86.3) | 330 (84.6) | 0.90 |
| Hispanic ethnicity | ≤4 | 26 (13.4) | 11 (8.4) | 15 (11.0) | 0.39 |
| Hispanic ethnicity | ≥5 | 2 (14.3) | 1 (1.9) | 60 (14.7) | 0.02 |
| BMI (kg/m2) | ≤4 | 33.8 ± 5.9 | 34.5 ± 7.4 | 33.9 ± 6.3 | 0.63 |
| BMI (kg/m2) | ≥5 | 34.8 ± 9.5 | 32.1 ± 4.7 | 34.6 ± 6.3 | 0.02 |
| Diabetes (Type 2) | ≤4 | 38 (19.6) | 31 (23.7) | 39 (28.7) | 0.16 |
| Diabetes (Type 2) | ≥5 | 0 (0) | 6 (11.5) | 131 (32.2) | <0.0001 |
| ALT (U/L) | ≤4 | 51 (33-72) | 60 (33-78) | 65 (48-86) | 0.003 |
| ALT (U/L) | ≥5 | 57 (50-120) | 89 (57-111) | 82 (52-122) | 0.76 |
| Triglycerides (mg/dL) | ≤4 | 146 (104-193) | 151 (105-207) | 151 (106-209) | 0.87 |
| Triglycerides (mg/dL) | ≥5 | 130 (104-201) | 147 (99-209) | 164 (118-249) | 0.09 |
| Glucose (mg/dL) | ≤4 | 96 (87-107) | 92 (85-104) | 97 (86-108) | 0.24 |
| Glucose (mg/dL) | ≥5 | 97 (93-100) | 93 (87-108) | 96 (84-111) | 0.92 |
| Insulin (μU/mL) | ≤4 | 16 (10-21) | 18 (12-26) | 18 (12-28) | 0.06 |
| Insulin (μU/mL) | ≥5 | 14 (10-15) | 17 (9-24) | 22 (13-31) | 0.01 |
| HOMA-IR | ≤4 | 3.6 (2.3-5.4) | 4.3 (2.6-6.3) | 4.2 (3.0-7.3) | 0.07 |
| HOMA-IR | ≥5 | 3.3 (2.3-3.3) | 4.2 (2.1-6.2) | 5.0 (3.1-7.6) | 0.02 |
| QUICKI | ≤4 | 0.318 ± 0.030 | 0.312 ± 0.026 | 0.311 ± 0.038 | 0.21 |
| QUICKI | ≥5 | 0.328 ± 0.029 | 0.315 ± 0.035 | 0.305 ± 0.032 | 0.04 |
| ANA positive | ≤4 | 33 (27.3) | 17 (20.2) | 24 (25.8) | 0.51 |
| ANA positive | ≥5 | 0 (0) | 3 (9.1) | 63 (23.2) | 0.06 |

  • * Values are N (%), means ± SD, or medians (IQR), unless otherwise specified.

  • P-values derived from chi-square tests for categorical variables (Fisher's Exact test when expected numbers were small), from ANOVA for age at biopsy and BMI, and from the Kruskal-Wallis Test for laboratory measures.

  • Only laboratory values collected within 6 months of the liver biopsy were included. N = 278 for ALT and glucose measurements; N = 280 for triglycerides; N = 272 for insulin and HOMA-IR measurements; N = 298 for ANA. HOMA-IR is the homeostasis model assessment method for insulin resistance, calculated as (fasting insulin (μU/mL)*fasting glucose (mmol/L))/22.5. QUICKI is calculated as: 1/(log(insulin μU/mL) + log(glucose mg/dL)).

Table 7. Characteristics by Diagnosis* and NAS

| Characteristic | No Steatohepatitis: NAS ≤4 (n = 194) | No Steatohepatitis: NAS ≥5 (n = 14) | P | Definite Steatohepatitis: NAS ≤4 (n = 136) | Definite Steatohepatitis: NAS ≥5 (n = 407) | P |
|---|---|---|---|---|---|---|
| Age at biopsy (mean) | 47.3 | 41.9 | 0.10 | 49.3 | 48.5 | 0.46 |
| Gender (% female) | 55.1 | 57.1 | 0.89 | 57.4 | 69.3 | 0.01 |
| Race (% Caucasian) | 83.0 | 92.3 | 0.83 | 83.9 | 84.6 | 0.57 |
| Hispanic ethnicity (%) | 13.4 | 14.3 | 1.00 | 11.0 | 14.7 | 0.28 |
| BMI (kg/m2) (mean) | 33.8 | 34.8 | 0.53 | 33.9 | 34.6 | 0.33 |
| Diabetes (% with) | 19.6 | 0.0 | 0.08 | 28.7 | 32.2 | 0.44 |
| ALT (U/L) (median) | 51.0 | 57.0 | 0.14 | 65.0 | 82.0 | 0.002 |
| Triglycerides (mg/dL) (median) | 146 | 130 | 0.95 | 151 | 164 | 0.05 |
| Glucose (mg/dL) (median) | 95.5 | 97.0 | 0.82 | 97.0 | 96.0 | 0.55 |
| Insulin (μU/mL) (median) | 15.8 | 13.8 | 0.26 | 18.0 | 22.0 | 0.13 |
| HOMA-IR (median) | 3.6 | 3.3 | 0.25 | 4.2 | 5.0 | 0.29 |
| QUICKI (mean) | 0.318 | 0.328 | 0.35 | 0.311 | 0.305 | 0.17 |
| ANA positive (%) | 27.3 | 0.0 | 0.11 | 25.8 | 23.2 | 0.61 |
| Steatosis grade (%) | | | n/a | | | n/a |
| 0 - <5% | 11.9 | 7.1 | | 3.7 | 0.7 | |
| 1 - 5-33% | 54.6 | 0.0 | | 69.1 | 19.9 | |
| 2 - 34-66% | 24.2 | 14.3 | | 27.2 | 40.3 | |
| 3 - >66% | 9.3 | 78.6 | | 0.0 | 39.1 | |
| Steatosis location (%) | | | 0.18 | | | <0.0001 |
| 0 - Zone 3 | 51.6 | 35.7 | | 46.3 | 33.7 | |
| 1 - Zone 1 | 2.6 | 0.0 | | 0.0 | 0.5 | |
| 2 - Azonal | 25.3 | 14.3 | | 36.0 | 22.4 | |
| 3 - Panacinar | 19.6 | 50.0 | | 17.7 | 43.5 | |
| Lobular inflammation (%) | | | n/a | | | n/a |
| 0 - none | 1.6 | 0.0 | | 0.0 | 0.0 | |
| 1 - <2 | 86.1 | 0.0 | | 86.8 | 17.7 | |
| 2 - 2-4 | 12.4 | 85.7 | | 13.2 | 59.7 | |
| 3 - >4 | 0.0 | 14.3 | | 0.0 | 22.6 | |
| Chronic portal inflammation (%) | | | 0.85 | | | 0.26 |
| 0 - none | 28.9 | 28.6 | | 12.5 | 10.1 | |
| 1 - mild | 60.3 | 57.1 | | 68.4 | 64.1 | |
| 2 - > mild | 10.8 | 14.3 | | 19.1 | 25.8 | |
| Ballooning (%) | | | n/a | | | n/a |
| 0 - none | 96.4 | 85.7 | | 0.7 | 0.3 | |
| 1 - few | 2.6 | 7.1 | | 58.8 | 23.1 | |
| 2 - many | 1.0 | 7.1 | | 40.4 | 76.7 | |
| Fibrosis (%) | | | 0.37 | | | 0.005 |
| 0 - none | 73.2 | 71.4 | | 14.2 | 5.0 | |
| 1a | 7.4 | 21.4 | | 14.9 | 13.6 | |
| 1b | 1.1 | 0.0 | | 18.7 | 17.1 | |
| 1c | 9.5 | 0.0 | | 0.8 | 0.3 | |
| 2 | 3.7 | 0.0 | | 19.4 | 29.5 | |
| 3 - bridging | 5.3 | 7.1 | | 32.1 | 34.7 | |

  • * Diagnoses of borderline steatohepatitis and cirrhosis were excluded. Values are means for age, BMI, and QUICKI, medians for laboratory measures and HOMA-IR, and percent for categorical variables.

  • P-values derived from chi-square tests for categorical variables (Fisher's Exact test when expected numbers were small), from ANOVA for age at biopsy and BMI, and from the Kruskal-Wallis Test for laboratory measures. P-values for components of the NAS (steatosis amount, lobular inflammation, and ballooning) were not included.

  • Only laboratory values collected within 6 months of the liver biopsy were included. N = 462 for ALT and glucose measurements; N = 465 for triglycerides; N = 454 for insulin and HOMA-IR measurements; N = 495 for ANA. HOMA-IR is the homeostasis model assessment method for insulin resistance, calculated as (fasting insulin (μU/mL)*fasting glucose (mmol/L))/22.5. QUICKI is calculated as: 1/log(insulin μU/mL*glucose mg/dL).

Regression Analyses.

Univariate and multivariate linear and logistic regression analyses were performed as described, where serum ALT and AST, the presence of diabetes, metabolic syndrome as defined by the NCEP,3 calculated HOMA-IR and its inverse log transformation, the QUICKI, were the outcome measures and the NAS (≥5 versus ≤4) and SH diagnosis were covariates. (The results are shown in Table 8.) Both the NAS ≥5 and definite SH were individually highly associated with serum ALT and AST. When both the NAS ≥5 and definite SH were included in the model, the significant association with ALT remained, but the NAS ≥5 showed a stronger association (β = 24.5, P < 0.0001) compared to definite SH (β = 11.8, P = 0.02). The association with AST was highly significant for both NAS ≥5 and definite SH in the multivariate model (P < 0.0001). With respect to the clinical conditions and tests associated with insulin resistance, the SH diagnosis alone was strongly associated with diabetes, metabolic syndrome, HOMA-IR, and QUICKI (P < 0.01 for all). In comparison, the NAS ≥5 alone showed no association with diabetes or metabolic syndrome (P = 0.06 and P = 0.16, respectively), but was associated with HOMA-IR and QUICKI (P = 0.003, P = 0.008, respectively). However, when the diagnosis of definite SH was included in the model with NAS ≥5, the association between definite SH and these measures remained statistically significant, but any contribution by the NAS was lost.

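Table 8. Regression Analysis of Liver Enzymes and Measures of Metabolic Syndrome and Insulin Resistance on Liver Histology (NAS ≥5 and NASH Diagnosis)

| Outcome | Model | Covariate | Measure | Estimate | 95% CI | P |
|---|---|---|---|---|---|---|
| ALT (U/L) | One variable | NAS ≥5 | β | 30.9 | 23.0 to 38.8 | <0.0001 |
| ALT (U/L) | One variable | NASH diagnosis | β | 25.5 | 17.3 to 33.7 | <0.0001 |
| ALT (U/L) | Two variable | NAS ≥5 | β | 24.5 | 15.1 to 34.0 | <0.0001 |
| ALT (U/L) | Two variable | NASH diagnosis | β | 11.8 | 2.1 to 21.3 | 0.02 |
| AST (U/L) | One variable | NAS ≥5 | β | 25.9 | 20.2 to 31.7 | <0.0001 |
| AST (U/L) | One variable | NASH diagnosis | β | 25.7 | 19.8 to 31.6 | <0.0001 |
| AST (U/L) | Two variable | NAS ≥5 | β | 17.3 | 10.4 to 24.1 | <0.0001 |
| AST (U/L) | Two variable | NASH diagnosis | β | 16.1 | 9.1 to 23.0 | <0.0001 |
| Diabetes | One variable | NAS ≥5 | OR | 1.33 | 0.99 to 1.79 | 0.06 |
| Diabetes | One variable | NASH diagnosis | OR | 1.92 | 1.41 to 2.62 | <0.0001 |
| Diabetes | Two variable | NAS ≥5 | OR | 0.90 | 0.63 to 1.29 | 0.58 |
| Diabetes | Two variable | NASH diagnosis | OR | 2.04 | 1.40 to 2.96 | <0.0001 |
| Metabolic syndrome | One variable | NAS ≥5 | OR | 1.21 | 0.93 to 1.58 | 0.16 |
| Metabolic syndrome | One variable | NASH diagnosis | OR | 1.43 | 1.09 to 1.98 | 0.009 |
| Metabolic syndrome | Two variable | NAS ≥5 | OR | 0.98 | 0.71 to 1.36 | 0.91 |
| Metabolic syndrome | Two variable | NASH diagnosis | OR | 1.45 | 1.04 to 2.01 | 0.03 |
| HOMA-IR | One variable | NAS ≥5 | β | 1.39 | 0.48 to 2.31 | 0.003 |
| HOMA-IR | One variable | NASH diagnosis | β | 1.70 | 0.77 to 2.62 | <0.0001 |
| HOMA-IR | Two variable | NAS ≥5 | β | 0.68 | −0.41 to 1.77 | 0.22 |
| HOMA-IR | Two variable | NASH diagnosis | β | 1.32 | 0.21 to 2.42 | <0.0001 |
| QUICKI (× 1,000) | One variable | NAS ≥5 | β | −7.17 | −12.49 to −1.85 | 0.008 |
| QUICKI (× 1,000) | One variable | NASH diagnosis | β | −9.24 | −14.63 to −3.86 | 0.001 |
| QUICKI (× 1,000) | Two variable | NAS ≥5 | β | −3.12 | −9.45 to −3.86 | 0.33 |
| QUICKI (× 1,000) | Two variable | NASH diagnosis | β | −7.51 | −13.95 to −1.06 | 0.02 |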
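
For readers who want to relate the estimates, confidence intervals, and P values in Table 8 to one another, the following is a minimal sketch, not the authors' analysis code. It assumes a normal (Wald) approximation, under which the standard error can be back-calculated from the width of a 95% CI; odds ratios are handled on the log scale. The function name `wald_from_ci` and the choice of the two example rows are illustrative assumptions.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wald_from_ci(estimate, lower, upper, ratio=False):
    """Back-calculate SE, z, and a two-sided P value from an estimate and its 95% CI.

    Assumes a symmetric Wald interval. With ratio=True the inputs are odds
    ratios, so the calculation is done on the natural-log scale.
    """
    if ratio:
        estimate, lower, upper = (math.log(v) for v in (estimate, lower, upper))
    se = (upper - lower) / (2 * Z_95)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# ALT, two-variable model, NASH diagnosis: beta = 11.8, 95% CI 2.1 to 21.3
alt_se, alt_z, alt_p = wald_from_ci(11.8, 2.1, 21.3)
print(f"ALT, NASH diagnosis: SE={alt_se:.2f}, z={alt_z:.2f}, P={alt_p:.3f}")

# Diabetes, one-variable model, NAS >= 5: OR = 1.33, 95% CI 0.99 to 1.79
dm_se, dm_z, dm_p = wald_from_ci(1.33, 0.99, 1.79, ratio=True)
print(f"Diabetes, NAS>=5: SE(log OR)={dm_se:.3f}, z={dm_z:.2f}, P={dm_p:.3f}")
```

Under these assumptions, both back-calculated P values round to the reported ones (0.02 and 0.06, respectively). Small differences are expected because the published estimates and intervals are themselves rounded.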

Discussion

This study was undertaken using a large dataset of prospectively obtained clinical data and results from liver biopsies reviewed in blinded fashion by a committee composed of the pathologists from 9 different centers in the United States involved in the NASH CRN. The aim was to evaluate whether the diagnosis of SH made by the pathologists corresponded to a threshold value of ≥5 for the NAS, which is built from feature-based scores. Several interesting observations can be made. First, of 934 noncirrhotic entry liver biopsies from adult patients with phenotypic NAFLD enrolled in either the Database study or the PIVENS treatment trial, 543 (58%) met histologic criteria for definite SH. In this subset, the NAS (the sum of steatosis, lobular inflammation, and ballooning scores) was ≥5 in 75%, but ≤4 in the remaining 25%. Because liver biopsy review for rigorously conducted treatment trials is done by pathologist(s) blinded to any clinical information, and therefore not influenced by knowledge of gender, ALT, or insulin resistance status, the discordance between the NAS and the diagnosis of NASH has serious implications if a discriminating criterion for trial entry is based on NAS alone. On the other hand, of the 208 biopsies from the noncirrhotic adult Database and PIVENS cohorts that were definitely “not SH,” 14 had NAS ≥5. These results collectively highlight the fact that diagnostic criteria for SH and scoring of particular lesions are related but yield different results.

A second major observation is clear from the regression analyses. When the diagnosis of definite SH and the NAS were analyzed together in relation to clinical features known to be associated with NAFLD, the diagnosis of SH was a stronger predictor of metabolic abnormalities than the score. This further emphasizes the point that the recognition of the histologic pattern of SH cannot be replaced by a numerical score based on the presence and severity of certain features. On the other hand, the NAS is highly correlated with aminotransferase levels, commonly assumed to be markers of liver disease severity.

The NAS was created in the same way and for the same reasons as other systems for “scoring” histologic lesions in liver disease: by an individual or a group of focused liver pathologists evaluating the lesions of significance for the specific disease process (SH or chronic hepatitis) and assigning relative, graduated values to represent severity. None of these systems was developed to replace a diagnostic determination of the disease; that process is the result of assessing a combination of the features (lesions) and their pattern(s). Scoring systems for chronic hepatitis, such as Knodell, METAVIR, Scheuer, and Ishak, were developed for semiquantitative evaluation of liver histology for clinical trials (reviewed6). On the other hand, the histopathologic diagnosis of a disease process derives from several pieces of visual information ultimately integrated to formulate either a diagnosis or a differential diagnosis. This information includes the parenchymal location(s), alterations of surrounding cell compartments, and types of tissue responses (inflammatory cell types, presence and location of fibrosis, cell necrosis or apoptosis, etc). Also included among those “pieces” of information in liver biopsy evaluation are the relationships of the vascular structures within the parenchyma to one another, the amount and types of inflammation, the location of each of the lesions being assessed, and the presence and relative abnormalities of the cell types that make up the parenchyma. The lesions are important individually, in relation to each other and to other features of parenchymal alteration, and thus as part of the composite.

Thus, pathologists utilize multiple inputs to formulate a final diagnosis, a process more complex than the simple addition of any one type or types of lesions to derive a “score.” Although the qualitative recognition of a pattern of injury may at first seem to be less precise and more subjective than a numerical score, our regression analysis indicated that it is a powerful result. Scores, on the other hand, are quite useful in comparative analyses, such as interventional studies, for objective measures of change of specific lesions. The exercises of diagnosis and scoring, while leading to interrelated results, are thus distinct and separate, and, when done properly, serve distinct, separate, and important purposes.

The histologic feature of the NAS that appears to be most significant in the determination of the diagnosis of SH is ballooning. Regardless of the final NAS, >99% of 543 cases with a diagnosis of definite SH had ballooning. Ballooning as an individual feature was significantly correlated with clinical features of insulin resistance in regression analyses. Ballooning remains a challenge for pathologists; as depicted in textbooks, ballooned cells are enlarged and have pale, flocculent cytoplasm and may contain Mallory-Denk bodies. However, in practice, they are not always enlarged, and many do not contain Mallory-Denk bodies. A recent study highlighted the loss of K8/18 detectable by immunohistochemistry as a more sensitive indicator of ballooning.7 This technique is useful in detecting more subtle ballooning, and it is available in most diagnostic pathology laboratories. Other features of hepatocellular injury include acidophil bodies and immunohistochemical markers of apoptosis8, 9; however, these markers have not been validated in terms of replacing ballooning as a key feature in the diagnosis.

Biopsies with definite SH but NAS ≤4 are clinically important because a low NAS could be interpreted as indicating absence of significant disease. These biopsies had milder steatosis (grades 0-1 in 73%) and inflammation (grades 0-1 in 87%) but ballooning in >99% and fibrosis ≥2 in 52%. Conversely, in the NAS ≥5 biopsies with a diagnosis of not SH, 86% had no ballooning, but 93% had >33% steatosis (grades 2-3) and all had at least grade 2 lobular inflammation; 93% had either no fibrosis (71%) or delicate zone 3 perisinusoidal fibrosis only (21%).
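
The arithmetic behind these discordant groups can be illustrated with a short sketch. It assumes the standard NAS component ranges (steatosis 0-3, lobular inflammation 0-3, ballooning 0-2); the `Biopsy` class and the two example biopsies are hypothetical illustrations of the patterns described above, not cases from the study, and the SH diagnosis itself is a pathologist's pattern-based judgment that the code does not attempt to derive.

```python
from dataclasses import dataclass

# Maximum grade for each NAS component (assumed standard ranges).
MAX_GRADE = {"steatosis": 3, "lobular_inflammation": 3, "ballooning": 2}


@dataclass(frozen=True)
class Biopsy:
    """Hypothetical container for the three feature scores that make up the NAS."""
    steatosis: int
    lobular_inflammation: int
    ballooning: int

    def __post_init__(self):
        # Reject grades outside the range defined for each feature.
        for name, top in MAX_GRADE.items():
            value = getattr(self, name)
            if not 0 <= value <= top:
                raise ValueError(f"{name} must be between 0 and {top}, got {value}")

    @property
    def nas(self) -> int:
        """NAS: unweighted sum of steatosis, lobular inflammation, and ballooning."""
        return self.steatosis + self.lobular_inflammation + self.ballooning

    @property
    def meets_threshold(self) -> bool:
        """True when the score reaches the NAS >= 5 threshold discussed in the text."""
        return self.nas >= 5


# Mild steatosis and inflammation with many ballooned cells: low score despite ballooning.
low_nas_with_ballooning = Biopsy(steatosis=1, lobular_inflammation=1, ballooning=2)

# Marked steatosis and grade 2 inflammation without ballooning: threshold is still reached.
high_nas_without_ballooning = Biopsy(steatosis=3, lobular_inflammation=2, ballooning=0)

for label, biopsy in [("ballooning present", low_nas_with_ballooning),
                      ("ballooning absent", high_nas_without_ballooning)]:
    print(f"{label}: NAS={biopsy.nas}, NAS>=5: {biopsy.meets_threshold}")
```

Because the sum weights the three features equally and ballooning contributes at most 2 points, a biopsy can reach NAS ≥5 on steatosis and inflammation alone, or fall at NAS ≤4 despite prominent ballooning.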

The cases with NAS ≤4 but definite SH by diagnostic criteria were also compared with both the NAS ≥5, definite SH group and the NAS ≤4, not SH group. Within the definite SH group, ALT values were higher and the percentage of females was greater with NAS ≥5 than with NAS ≤4, but no other demographic or clinical variables differed significantly.

Major strengths of this study include the large amount of prospectively obtained clinical data from the NASH CRN and the biopsy results, which were obtained through a central review process, over several readings, by up to 9 liver pathologists. It is our hope that our work will allow others in the field to have the confidence to continue to “split” their diagnostic and scoring efforts, and neither confuse diagnosis with “scoring” nor compromise diagnostic categories by using a summed numeric value.

Appendix

Members of the Nonalcoholic Steatohepatitis Clinical Research Network

Clinical Centers:

Baylor College of Medicine, Houston, TX: Stephanie H. Abrams, MD, MS; Leanel Angeli Fairly, RN.

Case Western Reserve University Clinical Centers: MetroHealth Medical Center, Cleveland, OH: Arthur J. McCullough, MD; Patricia Brandt; Diane Bringman, RN (2004-2008); Srinivasan Dasarathy, MD; Jaividhya Dasarathy, MD; Carol Hawkins, RN; Yao-Chang Liu, MD (2004-2009); Nicholette Rogers, PhD, PA-C (2004-2008); Margaret Stager, MD (2004-2009).

Cleveland Clinic Foundation, Cleveland, OH: Arthur J. McCullough, MD; Srinivasan Dasarathy, MD; Mangesh Pagadala, MD; Ruth Sargent, LPN; Lisa Yerian, MD; Claudia Zein, MD.

California Pacific Medical Center: Raphael Merriman, MD; Anthony Nguyen.

Children's National Medical Center, Washington DC: Parvathi Mohan, MD; Kavita Nair.

Cincinnati Children's Hospital Medical Center, Cincinnati, OH: Stephanie DeVore; Rohit Kohli, MD; Kathleen Lake; Stavra Xanthakos, MD.

Duke University Medical Center, Durham, NC: Manal F. Abdelmalek, MD; Stephanie Buie; Anna Mae Diehl, MD; Marcia Gottfried, MD (2004-2008); Cynthia Guy, MD; Meryt Hanna; Paul Killenberg, MD (2004-2008); Samantha Kwan, MS (2006-2009); Yi-Ping Pan; Dawn Piercy, FNP; Melissa Smith.

Indiana University School of Medicine, Indianapolis, IN: Elizabeth Byam, RN; Naga Chalasani, MD; Oscar W. Cummings, MD; Ann Klipsch, RN; Jean P. Molleston, MD; Linda Ragozzino, RN; Girish Subbarao, MD; Raj Vuppalanchi, MD.

Johns Hopkins Hospital, Baltimore, MD: Kimberly Pfeifer, RN; Ann Scheimann, MD; Michael Torbenson, MD.

Mount Sinai Kravis Children's Hospital: Nanda Kerkar, MD; Sreevidya Narayanappa; Frederick Suchy, MD.

Northwestern University Feinberg School of Medicine/Children's Memorial Hospital: Mark H. Fishbein, MD; Katie Jacques; Ann Quinn, RD; Cindy Riazi, RN; Peter F. Whitington, MD.

Seattle Children's Hospital & Research Institute, WA: Melissa Coffey; Sarah Galdzicka; Karen Murray, MD; Melissa Young.

Saint Louis University, St Louis, MO: Sarah Barlow, MD (2002-2007); Jose Derdoy, MD; Joyce Hoffmann; Debra King, RN; Andrea Morris; Joan Siegner, RN; Susan Stewart, RN; Brent A. Neuschwander-Tetri, MD; Judy Thompson, RN.

University of California San Diego, San Diego, CA: Cynthia Behling, MD, PhD; Janis Durelle; Tarek Hassanein, MD (2004-2009); Joel E. Lavine, MD, PhD; Rohit Loomba, MD; Anya Morgan; Steven Rose, MD (2007-2009); Heather Patton, MD; Jeffrey B. Schwimmer, MD; Claude Sirlin, MD; Tanya Stein, MD (2005-2009).

University of California San Francisco, San Francisco, CA: Bradley Aouizerat, PhD; Kiran Bambha, MD (2006-2010); Nathan M. Bass, MD, PhD; Linda D. Ferrell, MD; Danuta Filipowski, MD; Bo Gu (2009-2010); Raphael Merriman, MD (2002-2007); Mark Pabst; Monique Rosenthal (2005-2010); Philip Rosenthal, MD; Tessa Steel (2006-2008).

University of Washington Medical Center, Seattle, WA: Matthew Yeh, MD, PhD.

Virginia Commonwealth University, Richmond, VA: Sherry Boyett, RN, BSN; Melissa J. Contos, MD; Michael Fuchs, MD; Amy Jones; Velimir AC Luketic, MD; Puneet Puri, MD; Bimalijit Sandhu, MD (2007-2009); Arun J. Sanyal, MD; Carol Sargeant, RN, BSN, MPH; Kimberly Noble; Melanie White, RN, BSN (2006-2009).

Virginia Mason Medical Center (original grant with University of Washington), Seattle, WA: Kris V. Kowdley, MD; Jody Mooney, MS; James Nelson, PhD; Sarah Ackermann; Cheryl Saunders, MPH; Vy Trinh; Chia Wang, MD.

Washington University, St. Louis, MO: Elizabeth M. Brunt, MD.

Resource Centers:

National Cancer Institute, Bethesda, MD: David E. Kleiner, MD, PhD.

National Institute of Child Health and Human Development, Bethesda, MD: Gilman D. Grave, MD.

National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD: Edward C. Doo, MD; Jay H. Hoofnagle, MD; Patricia R. Robuck, PhD, MPH (Project Scientist).

Johns Hopkins University, Bloomberg School of Public Health (Data Coordinating Center), Baltimore, MD: Patricia Belt, BS; Frederick L. Brancati, MD, MHS (2003-2009); Jeanne M. Clark, MD, MPH; Ryan Colvin, MPH; Michele Donithan, MHS; Mika Green, MA; Rosemary Hollick (2003-2005); Milana Isaacson, BS; Wana Kim, BS; Alison Lydecker, MPH (2006-2008); Pamela Mann, MPH (2008-2009); Laura Miriel; Alice Sternberg, ScM; James Tonascia, PhD; Aynur Ünalp-Arida, MD, PhD; Mark Van Natta, MHS; Ivana Vaughn, MPH; Laura Wilson, ScM; Katherine Yates, ScM.

Ancillary