Endpoints and clinical trial design for nonalcoholic steatohepatitis


  • Potential conflict of interest: Dr. Chalasani declares that he has served as a consultant in the NAFLD/NASH area for the following entities over the last 24 months: Amylin, Gilead, Genentech, and Fulcrum/Mochida. He has served as a consultant in the drug hepatotoxicity area for the following entities over the last 24 months: Abbott, J&J, KaroBio, Teva, Salix, and Merck. He has received research funding from Eli Lilly and Amylin. Dr. Kowdley has received funds for a clinical trial of NASH from Mochida Pharmaceuticals. The other authors have nothing to report.

  • This work was supported by the American Association for the Study of Liver Diseases and represents the summary of the 2009 workshop on “Endpoints in Nonalcoholic Steatohepatitis” along with updated information where applicable. This work was also supported, in part, by the Intramural Research Program of the National Institutes of Health, National Cancer Institute.


Nonalcoholic fatty liver disease is a common cause of chronic liver disease in the general population. Nonalcoholic steatohepatitis (NASH), the aggressive form of nonalcoholic fatty liver disease, is associated with an increased risk of liver-related mortality and cardiovascular disease. At present, a liver biopsy is the only generally acceptable method for the diagnosis of NASH and assessment of its progression toward cirrhosis. Although several treatments have shown evidence of efficacy in clinical trials of varying design, there are no approved treatments for NASH, and published trials are often too divergent to allow meaningful comparisons. There is thus a lack of established noninvasive, point-of-care diagnostics and approved treatments on one hand and a substantial population burden of disease on the other. Together, these provide the rationale for developing consensus on key endpoints and clinical trial design for NASH. Conclusion: This article summarizes the consensus reached at a meeting of the American Association for the Study of Liver Diseases on the key endpoints and specific trial design issues that are germane to the development of diagnostic biomarkers and treatment trials for NASH. (HEPATOLOGY 2011;)

Nonalcoholic fatty liver disease (NAFLD) is the most common cause of chronic liver disease in most of the Western world.1-3 The clinical-histologic phenotype of the disease extends from nonalcoholic fatty liver to nonalcoholic steatohepatitis (NASH). NASH can progress to cirrhosis4 and is also associated with an increased risk of cardiovascular mortality and type 2 diabetes mellitus.5, 6 Cirrhosis due to NASH increases the risk of hepatocellular carcinoma and NASH contributes substantially to the population burden of hepatocellular cancer.7-9

There are many uncertainties in the diagnostic approaches, evaluation, and management of NASH. The diagnosis currently requires a liver biopsy, which is invasive, somewhat painful, and may be associated with life-threatening complications in some individuals.10 Also, whereas several drugs have shown efficacy in clinical trials of varying design,11-13 there are currently no approved therapies for NASH. Given the lack of noninvasive diagnostics and approved therapies, there is a major public health need to develop diagnostic and therapeutic approaches that are widely applicable.

The American Association for the Study of Liver Diseases (AASLD) convened a research workshop in 2009 to discuss the key questions that are critically important for advancement of the field, the specific populations that need to be treated, and the goals of treatment for each of these populations. These were the bases for recommendations for trial design and development of endpoints for various types of clinical studies of NASH. Safety-related issues and their relevance to trial design and endpoints were also discussed. This report summarizes the discussion and areas of consensus among members of the working group and has been updated to incorporate new data that have been published since this workshop was held. The objective is to provide guidance to investigators and regulators about issues related to clinical trial design and endpoints in clinical trials for NASH. This is not a clinical practice guideline, but rather a general guidance document related to best practices in trial design and performance. However, wherever feasible, the level of evidence to support specific recommendations for endpoints is provided using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system used for AASLD guidelines.14 In other sections, the best practices in trial design and execution, based on expert opinion, are discussed.


Abbreviations: ALT, alanine aminotransferase; AST, aspartate aminotransferase; MELD, Model for End-Stage Liver Disease; MR, magnetic resonance; NAFLD, nonalcoholic fatty liver disease; NAS, NAFLD activity score; NASH, nonalcoholic steatohepatitis; NASH CRN, Nonalcoholic Steatohepatitis Clinical Research Network.

1. Case Definitions for Clinical Trials for NAFLD/NASH

A critical need in the field is to use standard nomenclature and uniform case definitions that will allow individual studies to be compared to others and permit meaningful analysis of groups of studies with similar objectives. The common nomenclature recommended for clinical studies is listed below (Table 1):

Table 1. Diagnostic Categories and Associated Histologic Lesions: Prefibrotic NAFLD/NASH
Diagnosis | Macrovesicular Steatosis | Hepatocyte Ballooning | Zonality | Lobular Inflammation | Portal Inflammation*
Definite steatohepatitis | + (Any degree) | + (Any degree) | Zone 3 accentuation | + (Any degree) | +/– (Any degree)
Borderline zone 3 | + (Any degree) | | Zone 3 | + | +/– (Any degree)
Borderline zone 1 | + (Any degree) | | Zone 1 or panacinar | +/– | + (Common, especially in the panacinar pattern of steatosis)/–
Not steatohepatitis, with steatosis | + (Any degree) | | Any pattern | +/– | +/–
Not steatohepatitis, without steatosis | | | Not applicable | +/– | +/–

  • * Portal inflammation that is “disproportionately present” compared to lobular lesions may be a harbinger of concurrent liver disease.

Fatty Liver

This is defined by >5% macrovesicular steatosis as evaluated by light microscopic examination of a hematoxylin-and-eosin–stained liver section (4-5 μm thick) under a 10× objective lens.

NAFLD is characterized by predominantly macrovesicular steatosis, and the presence of visible steatosis in >5% of hepatocytes is generally accepted as a working definition of a fatty liver.15 Assessment of steatosis at higher magnifications can lead to a higher estimate of the severity of steatosis, and should be avoided.


Steatohepatitis

The minimal criteria for the diagnosis of steatohepatitis include the presence of >5% macrovesicular steatosis, inflammation, and liver cell ballooning, typically with a predominantly centrilobular (acinar zone 3) distribution in adults.

Steatohepatitis is not simply the presence of inflammation and steatosis but is a specific histologic entity as defined above.15-18 Steatohepatitis is more likely to lead to cirrhosis than steatosis alone or steatosis with inflammation.19 The presence of Mallory-Denk bodies is supportive but not required for the diagnosis of steatohepatitis.15 It is, however, recognized that some subjects may have an atypical distribution of changes and that children often have a different distribution of histologic findings compared to adults, e.g., greater portal-based findings than in adults.20, 21 On this basis, it is recommended that steatohepatitis may be further classified as definite or borderline.

Definite Steatohepatitis

This is defined by zone 3 accentuation of macrovesicular steatosis of any grade, hepatocellular ballooning of any degree, and lobular inflammatory infiltrates of any amount.12, 15 Apoptotic bodies are commonly present in definite steatohepatitis. This lesion may or may not include portal inflammation, Mallory-Denk bodies, which are most often noted within ballooned hepatocytes, nonzonal patches of microvesicular steatosis, and megamitochondria, which are most often noted in hepatocytes with microvesicular steatosis.

Borderline Steatohepatitis

Biopsies with this diagnosis do not meet the classical criteria for steatohepatitis. In the zone 3 borderline pattern, the lesions may be predominantly in acinar zone 3, but liver cell ballooning is not classic or is absent, and Mallory-Denk bodies are absent. In the zone 1 borderline pattern, steatosis is either periportal or panacinar; ballooning may be present, but is not classical; and Mallory-Denk bodies are not present. The other findings of definite steatohepatitis may or may not be present to varying degrees, but commonly, portal inflammation and portal fibrosis are present. Lobular inflammation is often less prominent than in definite steatohepatitis. This pattern has been more often seen in pediatric cases of NAFLD.20 It is currently not known if this pattern is associated with a different natural history than definite steatohepatitis.

Disease Activity

  • It is recommended that the NAFLD activity score (NAS) be used to define and quantify disease activity (Grade 1b).

Disease activity refers to the severity of ongoing liver injury as assessed by a liver biopsy. The NAS was developed by the Nonalcoholic Steatohepatitis Clinical Research Network (NASH CRN) to evaluate disease activity.15 It is based on the NASH CRN methodology for scoring the severity of steatosis (0-3), inflammation (0-3), and hepatocellular ballooning (0-2).15 It has been shown that scores ≥4 are associated with increasing likelihood of having steatohepatitis.15 It is, however, emphasized that the presence of steatohepatitis cannot be inferred from the NAS and requires an overall assessment of the presence and distribution of the individual histologic findings.22 The NAS is sensitive to change and is relatively reproducible,15 which allows it to be measured repeatedly over time in the same subject. It has, however, not been validated as a marker for likelihood of progression to cirrhosis or mortality.
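The scoring described above can be illustrated with a short sketch. It assumes that the NAS is the unweighted sum of the three component scores, using the ranges quoted in the text; the class and function names are hypothetical and chosen only for illustration. The sketch returns only a number and makes no diagnosis, in keeping with the caveat that steatohepatitis cannot be inferred from the NAS alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NasComponents:
    """Component scores, using the ranges quoted in the text (hypothetical class)."""
    steatosis: int      # 0-3
    inflammation: int   # 0-3
    ballooning: int     # 0-2

    def __post_init__(self):
        # Reject values outside the published scoring ranges.
        limits = {"steatosis": 3, "inflammation": 3, "ballooning": 2}
        for name, upper in limits.items():
            value = getattr(self, name)
            if not 0 <= value <= upper:
                raise ValueError(f"{name} must be between 0 and {upper}, got {value}")


def nas(components: NasComponents) -> int:
    """Assumed definition: unweighted sum of the three components (range 0-8)."""
    return components.steatosis + components.inflammation + components.ballooning
```

A biopsy scored as steatosis 2, inflammation 1, and ballooning 1 would give a NAS of 4 under this assumption; whether the biopsy shows steatohepatitis remains a separate histologic judgment.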

Stage of Disease

  • It is recommended that a validated method for the staging of NASH be used for assessment of changes in disease stage in clinical trials of NASH. The NASH CRN fibrosis staging system is one such system and is the most validated system currently available.

  • A liver biopsy is the recommended method for assessment of disease stage for phase 2 and 3 clinical trials.

Stage of disease refers to the amount and pattern of fibrosis, as well as parenchymal architectural remodeling, or how far toward cirrhosis the disease has progressed. The NASH CRN fibrosis staging system is widely used and has been validated.15 It differs from other staging systems in that stage 1 is subdivided into three substages: stages 1a and 1b are zone 3, perisinusoidal, and differ only by the character of the collagen deposition (delicate or dense, respectively) and stage 1c is portal or periportal (to represent the pediatric pattern). Thus, these substages do not represent disease progression, but rather varying degrees or distribution of fibrosis. The use of noninvasive biomarkers or quantitative morphometric methods is considered experimental, and a liver biopsy is recommended for assessing NASH stage at this time. This is discussed in greater detail in subsequent sections.
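Because substages 1a, 1b, and 1c do not represent progression, a comparison of baseline and follow-up stage should treat them as the same ordinal level. The following is a minimal sketch of that idea; the mapping table and function name are hypothetical, and the stage labels (0, 1a, 1b, 1c, 2, 3, 4) are assumed from the staging described in this document.

```python
# Hypothetical mapping of fibrosis stage labels to an ordinal level.
# Substages 1a, 1b, and 1c all map to level 1, because the text states
# that they differ in character or distribution, not in progression.
FIBROSIS_STAGE_ORDINAL = {
    "0": 0,
    "1a": 1,
    "1b": 1,
    "1c": 1,
    "2": 2,
    "3": 3,
    "4": 4,
}


def fibrosis_worsened(baseline: str, follow_up: str) -> bool:
    """Return True only if the follow-up stage is a higher ordinal level."""
    return FIBROSIS_STAGE_ORDINAL[follow_up] > FIBROSIS_STAGE_ORDINAL[baseline]
```

Under this mapping, a change from stage 1a to 1c is not counted as worsening, whereas a change from 1b to 2 is.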

Additional Recommendations for the Histologic Categorization of NAFLD

The quality of the histologic data obtained is affected by several factors, such as the manner of procurement (intraoperative techniques may induce inflammation), the type of biopsy (needle core versus wedge), biopsy location, dimensions of the biopsy core, and the inherent variability in the subjective assessment of liver histology (Table 2).15, 23, 24 Ideally, a 2- to 3-cm core (with 10 evaluable portal tracts) of at least 16-gauge diameter from the right lobe should be obtained percutaneously to minimize such technical problems with histologic analysis of NASH.10, 25 Biopsies < 5 mm are discouraged and should not be considered for inclusion in trials. Given the intra- and inter-rater variability in assessment of liver biopsies,15, 23 a blinded central review of the liver biopsies by more than one pathologist is ideal. In the PIVENS (Pioglitazone or Vitamin E for NASH) trial, the sections used to assess eligibility and those used for the final analysis of the data were different, and 17%-26% of subjects failed to meet entry criteria, based on central review of independent sections from the baseline biopsy used for the final analysis.12 It is therefore recommended that the sections used for assessment of eligibility also be used as the baseline data for assessing the impact of treatment.

Table 2. Best Practices for Histologic Assessment of Endpoints in NASH
1. Biopsy technique:
  - Needle core biopsy preferred
  - Biopsy should be obtained with a 16 or lower gauge needle
  - A tissue core ≥ 2 cm long (≥ 10 portal tracts) represents optimal biopsy length
  - Preferably obtain biopsy from the right lobe. If left lobe biopsy is used for entry, a left lobe biopsy should be used at end of study
2. Histologic technique:
  - Use hematoxylin and eosin stain
  - Use Masson's trichrome stain for fibrosis assessment
  - Central staining is ideal to minimize laboratory variability but may not be optimal
  - Role of special stains and quantitative morphometry to assess fibrosis remains experimental
3. Review procedures:
  - Use central review with at least two pathologists
  - The same section used to assess eligibility should be used for the final analysis
  - Histological assessment should record histologic phenotype (fatty liver, definite steatohepatitis, etc.), presence and severity of individual parameters (NASH CRN scoring system preferred), disease activity (NAS), and fibrosis stage. Portal inflammation should be scored separately from lobular inflammation

How to Establish the Nonalcoholic Nature of the Disease

  • A validated questionnaire to quantify and document the amount of alcohol consumption should be used in the context of clinical trials (Recommendation cannot be graded for evidence).

  • The alcohol consumption thresholds to define the “nonalcoholic” nature of the steatohepatitis include <21 units of alcohol/week for men and <14 units of alcohol/week for women over a 2-year time frame prior to the baseline biopsy used to determine eligibility for a clinical trial (Grade 2b).

One drink “unit” or one standard drink is equivalent to a 12-ounce beer, a 4-ounce glass of wine, or a 1-ounce shot of hard liquor. There is a paucity of data to support a specific cutoff versus others, and the recommendations are based on the rarity of clinically significant liver injury from alcohol intake at such levels.26 There are conflicting reports of the impact of small amounts of alcohol consumption on NASH-related outcomes.8, 27 Given the possibility that small amounts of alcohol intake may affect outcomes in NAFLD, the amount and pattern of alcohol consumption during the trial should be documented, using established and validated questionnaires that are already available.28-30
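The thresholds above can be expressed as a simple screening check. This is a sketch only: it assumes that both limits are strict upper bounds (fewer than 21 units/week for men, fewer than 14 for women) and that the average weekly intake over the 2-year window has already been derived from a validated questionnaire. The names used are hypothetical.

```python
# Weekly limits in standard drink units, taken from the recommendation above.
# Treating both limits as strict upper bounds is an assumption of this sketch.
WEEKLY_UNIT_LIMIT = {"male": 21, "female": 14}


def is_nonalcoholic(sex: str, avg_units_per_week: float) -> bool:
    """Return True if average weekly intake is below the limit for this sex.

    avg_units_per_week is assumed to be the average over the 2 years
    preceding the baseline biopsy.
    """
    return avg_units_per_week < WEEKLY_UNIT_LIMIT[sex]
```

For example, a man averaging 20 units per week would fall below the threshold, whereas a woman averaging 14 units per week would not.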

2. Subpopulations of NAFLD and the Types of Studies Required in These Groups

NAFLD has several clinical-histologic phenotypes that have a varying clinical course. These include nonalcoholic fatty liver, fatty liver with modest inflammation alone, NASH with no fibrosis or early-stage disease (stages 0-2), NASH with advanced fibrosis or cirrhosis (stages 3-4), cryptogenic cirrhosis, decompensated cirrhosis, and recurrent NASH after liver transplantation. These have been reviewed in the literature and will not be discussed at length here.1, 4, 19, 31-34 There is broad consensus that the following groups should be targeted for treatment (in order of clinical and public health significance): (1) subjects with the greatest risk of progression to cirrhosis, (2) subjects with cirrhosis, and (3) subjects with recurrent NASH after liver transplant.

2a. Studies in Subjects at Risk of Progression to Cirrhosis

  • Primary endpoints for treatment trials (Table 3):

    • Resolution of steatohepatitis with no worsening of fibrosis (Grade 1b).
    • A minimum two-point improvement in NAS with at least a one-point improvement in more than one category and no worsening of fibrosis (Grade 2b).
    • Improvement in steatosis as determined by magnetic resonance (MR) spectroscopy along with sustained improvement in alanine aminotransferase (ALT) (for short-term phase 1 and 2 trials where a follow-up biopsy is not practical) (Grade 2b).
  • Secondary endpoints for treatment trials (Evidence grade cannot be provided for these):

    • Individual histologic parameters
    • Anthropometric measures
    • Changes in insulin sensitivity and oxidative stress
    • Changes in cardiovascular risk profile
    • Quality of life
    • Economic endpoints
Table 3. Guidance on Design and Endpoints for Clinical Trials for NASH
1. Inclusions:
  - Presence of steatohepatitis (record definite or borderline)
  - Active disease (NAS ≥ 3 or 4)
  - Cytologic ballooning score of 1 or greater
  - Stratify for diabetes
  - Do not include subjects with cirrhosis in clinical trials to define efficacy for prevention of disease progression in subjects at risk of developing cirrhosis, i.e., NASH (objectives of treatment in those with cirrhosis are different)
  - Safety studies may include those with cirrhosis
2. Record at baseline and end of study:
  - Alcohol consumption (amount consumed in last 2-3 years, pattern of drinking, and type of alcohol)
  - Dietary history (use a validated questionnaire)
  - History of physical activity (use a validated questionnaire)
  - Body mass index
  - Waist circumference
  - Measures of insulin sensitivity (oral glucose tolerance test with simultaneous insulin measurement is acceptable, snapshot measures of insulin–glucose relationship such as the homeostatic model are practical but may not be meaningful in diabetic subjects; a fasting insulin level can be used in nondiabetic subjects as a minimum standard)
  - Measures of systemic oxidative stress (4-hydroxynonenal, malondialdehyde, etc.)
  - Measure of glycemic control (hemoglobin A1c)
  - Liver enzymes and functions
  - Fasting lipid profile
  - Exploratory markers for noninvasive assessment of disease status
4. Management of confounders:
  - Record levels of alcohol consumption through the study
  - Provide practical but uniform diet and physical activity recommendations
  - Do not include subjects with >10 pounds weight gain or loss in last 6 months
  - Do not include subjects who have undergone bariatric surgery in last 5 years
  - Include subjects on stable dose of hypolipidemic drugs (if they are taking any) in trials
  - Include subjects with hemoglobin A1c ≤ 9 for diabetic subjects
  - Avoid those taking drugs known to have potential activity against NASH prior to entry into study
  - Monitor all medication intake
5. Duration of study:
  - 12 months for those with histology as primary endpoint
  - < 6 months duration may be reasonable for early phase proof-of-concept studies where tolerability of drug is the main objective of trial along with some evidence of efficacy
  - At least 12-24 months where fibrosis is considered a key endpoint
  - Follow for at least 6 months after drug discontinuation to assess durability of response
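
Several of the inclusion and confounder rules in Table 3 are simple enough to encode as a screening check. The sketch below is illustrative only: the `Candidate` record, its field names, and the `screen` function are hypothetical, and the default NAS minimum of 4 is one of the two values (3 or 4) mentioned in the table. Treating a missing hemoglobin A1c in a diabetic subject as a failure is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical screening record; field names are illustrative."""
    has_steatohepatitis: bool           # definite or borderline, from biopsy
    nas: int                            # NAFLD activity score
    ballooning: int                     # ballooning score (0-2)
    has_cirrhosis: bool
    weight_change_lb_6mo: float         # signed change over the last 6 months
    years_since_bariatric_surgery: Optional[float]  # None if never performed
    is_diabetic: bool
    hba1c: Optional[float]              # hemoglobin A1c value


def screen(c: Candidate, nas_minimum: int = 4) -> List[str]:
    """Return the reasons a candidate fails screening (empty list if none)."""
    reasons = []
    if not c.has_steatohepatitis:
        reasons.append("no steatohepatitis on biopsy")
    if c.nas < nas_minimum:
        reasons.append("NAS below minimum")
    if c.ballooning < 1:
        reasons.append("ballooning score below 1")
    if c.has_cirrhosis:
        reasons.append("cirrhosis present")
    if abs(c.weight_change_lb_6mo) > 10:
        reasons.append("weight change above 10 pounds in last 6 months")
    if c.years_since_bariatric_surgery is not None and c.years_since_bariatric_surgery < 5:
        reasons.append("bariatric surgery in last 5 years")
    if c.is_diabetic and (c.hba1c is None or c.hba1c > 9):
        reasons.append("hemoglobin A1c missing or above 9")
    return reasons
```

Returning the list of reasons, rather than a single yes/no value, makes it straightforward to report why subjects were excluded at screening.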

The primary objective of treatment for NASH is to prevent liver-related mortality, due mainly to the development of cirrhosis, which takes 10-20 years to develop.4 It is impractical to perform studies over this duration to identify treatment benefits due to the logistics of performing such studies. This necessitates the use of surrogate measures of avoidance of cirrhosis and thus liver-related mortality.

The hallmark of disease progression to cirrhosis is increasing fibrosis, as reflected by the disease stage. Other markers of risk of development of fibrosis include increasing age, body mass index, and type 2 diabetes mellitus.35 Also, subjects with steatohepatitis are more likely to develop cirrhosis and experience liver-related mortality than those with fatty liver alone or steatosis and inflammation.4, 19 Therefore, reversal of steatohepatitis with no worsening of fibrosis, or actual improvement in fibrosis, is a reasonable surrogate for prevention of cirrhosis and liver-related death. It follows that subjects with steatohepatitis, stratified for diabetes and including at least a subset with enough fibrosis to allow reliable assessment of change, should be enrolled in pivotal trials to demonstrate a decreased risk of development of cirrhosis. For obvious reasons, subjects with cirrhosis should not be included in such trials. A NAS ≥ 4 is another frequently used inclusion criterion. It allows assessment of change in the individual features of steatohepatitis. The predictive value of NAS for development of cirrhosis or liver-related mortality has not, however, been experimentally verified. Thus, the NAS should not be the sole criterion for inclusion in clinical trials. When used, it should be in conjunction with the histologic diagnosis of steatohepatitis.

The optimum duration of phase 2b and 3 trials remains under debate. Several clinical trials have shown that steatohepatitis can reverse, and its individual features can improve, in as little as 6-12 months.36, 37 Fibrosis progresses slowly over years; likewise, it is generally believed that clinically significant improvement in fibrosis takes longer than improvement in other features of NASH. Although some studies performed for 2 years have failed to show improvement in fibrosis,12 this may reflect lack of activity of a given drug rather than a problem with duration of study. It is therefore recommended that trials that hope to demonstrate improvement in fibrosis should be at least 1-2 years in duration. Also, subjects should be followed for 6 months after drug discontinuation to define the “off-drug response”.

The primary endpoint should be measurable, sensitive to change, clinically meaningful, and consistently quantifiable. Given the relevance of steatohepatitis and fibrosis for the risk of developing cirrhosis, which drives liver-related mortality,4, 19 reversal of steatohepatitis with at least no worsening of fibrosis should be considered a primary endpoint of clinical trials. It is worth noting that although steatohepatitis is more commonly associated with cirrhosis compared to fatty liver alone, it has not been experimentally verified that reversal of steatohepatitis prevents cirrhosis.

The NAS was developed as a measurable system that was also sensitive to change.15 It is not known how much improvement in NAS reflects a clinically significant decrease in the risk of developing cirrhosis or liver-related mortality. Also, the potential impact of improvement of various components of NAS, e.g., steatosis versus inflammation, on the risk of developing cirrhosis is unknown. With disease progression to cirrhosis, active lesions of steatohepatitis may also decrease.38, 39 Therefore, if NAS improvement is used as a primary endpoint, an improvement of at least two points, with contribution from more than one parameter and no worsening of fibrosis, should be used to maximize clinical relevance and robustness of the findings.
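The NAS-based endpoint described above combines three conditions, which can be sketched as follows. The function name and the dictionary keys are hypothetical; fibrosis is passed as an ordinal stage (0-4), and "contribution from more than one parameter" is interpreted here as at least a one-point improvement in two or more components, which is an assumption about how the rule would be operationalized.

```python
from typing import Dict

NAS_COMPONENTS = ("steatosis", "inflammation", "ballooning")


def nas_responder(
    baseline: Dict[str, int],
    follow_up: Dict[str, int],
    baseline_fibrosis: int,
    follow_up_fibrosis: int,
) -> bool:
    """Check the NAS-improvement endpoint for one subject.

    Conditions (all must hold):
      1. total NAS improves by at least two points;
      2. at least two components each improve by one point or more;
      3. fibrosis stage does not worsen.
    """
    # Positive values mean improvement (score went down).
    drops = [baseline[k] - follow_up[k] for k in NAS_COMPONENTS]
    total_improvement = sum(drops)
    categories_improved = sum(1 for d in drops if d >= 1)
    no_fibrosis_worsening = follow_up_fibrosis <= baseline_fibrosis
    return total_improvement >= 2 and categories_improved >= 2 and no_fibrosis_worsening
```

Under this reading, a two-point fall in steatosis alone would not count as a response, because only one component contributed to the change.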

For short-term studies (phase 1 and 2a) designed mainly to assess tolerability of new drugs and to look for futility signals to direct decisions regarding further development, an improvement of hepatic steatosis, as determined by MR spectroscopy, and a sustained improvement in aspartate aminotransferase (AST) and ALT may be used as an efficacy endpoint as well.

NASH is closely associated with obesity, dyslipidemia, type 2 diabetes mellitus, and increased cardiovascular morbidity and mortality. It is theoretically possible that specific treatments may have differential effects on NASH and its associated comorbidities. It is therefore important to capture information related to these in the form of secondary endpoints such as the severity of the individual histologic features of steatohepatitis; progression or regression of fibrosis; changes in body weight, body fat content, and distribution; measures of insulin resistance and oxidative stress; glycemic indices; lipid profile; and enumeration of adverse events. In addition, measures of fatigue and quality of life should be assessed.

2b. Special Considerations for Trials Focused on Diabetic Subjects with NASH

Type 2 diabetes mellitus poses special challenges for the design of clinical trials in NASH, because many drugs used for glycemic control of diabetes are also used for the treatment of NASH. Also, the potential confounding effects of poor glycemic control on the hepatic lesions, as well as the impact of changing the dose or type of antidiabetic drugs are not well established. It is therefore recommended that in trials including diabetic subjects, the diabetes should be at least moderately well controlled (hemoglobin A1c < 9), and subjects should have been on a stable dose of antidiabetic medication for at least 3 months prior to entry. It is permissible for subjects to receive metformin, sulfonylureas, and/or insulin for glycemic control. During the course of the study, if worsening glycemia requires a change to another agent that could potentially affect histologic outcome, the subjects should be taken out of the study and the frequency of such events noted as a secondary endpoint. Such cases are considered treatment failures in an intent-to-treat analysis. It is hoped that randomization will keep the frequency of such events similar in the two arms of the trial. In longer term studies in diabetic subjects, quantification of microalbuminuria, retinal changes, and development of symptoms of neuropathy are additional key secondary endpoints.

2c. Clinical Trials in Subjects with Cirrhosis

  • Primary endpoints:

    • Two-point increase in Child-Pugh score (Grade 1a).
    • Development of clinical decompensation (ascites, encephalopathy, hepatocellular cancer, variceal hemorrhage) (Grade 1a).
    • Increase in Model for End-Stage Liver Disease (MELD) scores (Grade 1a).

In patients with cirrhosis resulting from NASH, histological progression becomes far less meaningful as an endpoint, and the rate of progression to complications and liver failure takes precedence. Subjects with compensated cirrhosis, e.g., Child-Pugh class A and B, should be included in such trials. There are not enough data to make evidence-based recommendations about the MELD score criteria for inclusion and exclusion from clinical trials. It is worth noting that if the scores are very low (<10), the likelihood of clinical events in a 1-2 year time frame is low, which will affect the sample size. On the other hand, if the MELD score is high (>18), many clinical events will occur, and many subjects may undergo liver transplantation in the course of the trial. The key endpoints in this population include development of portal hypertension (measurement of hepatic venous pressure gradient, imaging, or endoscopic findings), worsening MELD or Child-Pugh scores, and clinical complications of cirrhosis such as variceal hemorrhage, ascites, encephalopathy, and hepatocellular carcinoma, as well as mortality. If an agent with the promise of reversal of cirrhosis is studied, the endpoints must be histologically verified and ideally accompanied by a demonstration of improvement in the hepatic venous pressure gradient.
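The three primary endpoints listed for cirrhosis trials can be checked per subject as in the sketch below. The function and key names are hypothetical. Because the text does not specify how large a MELD increase must be, the sketch takes that threshold as a required argument rather than assuming a value, and it reports each endpoint separately rather than combining them.

```python
from typing import Dict, Iterable

# Decompensation events named in the primary endpoint list above.
DECOMPENSATION_EVENTS = frozenset(
    {"ascites", "encephalopathy", "hepatocellular cancer", "variceal hemorrhage"}
)


def cirrhosis_endpoints(
    child_pugh_baseline: int,
    child_pugh_follow_up: int,
    meld_baseline: float,
    meld_follow_up: float,
    events: Iterable[str],
    meld_increase_threshold: float,
) -> Dict[str, bool]:
    """Report which of the three primary endpoints a subject has reached.

    meld_increase_threshold must be supplied by the protocol; no value is
    given in the source text.
    """
    return {
        "child_pugh_two_point_increase": child_pugh_follow_up - child_pugh_baseline >= 2,
        "clinical_decompensation": any(e in DECOMPENSATION_EVENTS for e in events),
        "meld_increase": meld_follow_up - meld_baseline >= meld_increase_threshold,
    }
```

Keeping the endpoints separate leaves the choice of any composite definition to the trial protocol.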

2d. Clinical Trials in Subjects with NASH Following Liver Transplantation

The patient population with NAFLD developing after liver transplant represents a complex group including those in whom the disease is recurrent (based on known NASH before transplant) and those where it develops de novo after transplant.34, 40-42 The latter may include individuals with a pretransplant diagnosis of cryptogenic cirrhosis, which is widely recognized to represent “burned out” NASH in the majority of patients.34, 42 Prevention of development of steatohepatitis and progression to graft failure are key endpoints in this population. Analysis of such endpoints may be confounded by the presence of the metabolic syndrome, use of specific immunosuppressive drugs, donor characteristics, and the recipient's pretransplant diagnosis, all of which must be accounted for in this population.43 Although the role of specific therapeutic agents remains relatively unexplored, the common presence of metabolic syndrome, overweight, and obesity in patients after liver transplantation44 indicates an urgent need for studies of lifestyle modification in this population with the key endpoints as listed above.

2e. Special Considerations in Trials Involving Children

  • Primary endpoints in clinical trials:

    • Reversal of steatohepatitis (definite or borderline versus no steatohepatitis) (Grade 1b).
    • Improvement in NAS of two points without worsening of fibrosis is recommended (Grade 2b).
    • Improvement in hepatic steatosis (with imaging studies, e.g., MR spectroscopy) along with persistent normalization of AST and ALT may be considered as an alternate endpoint, especially in short-term trials focused on safety and proof of biologic effect (phase 1 and 2a trials) (Grade 2b).

The assessment of liver histology as the primary endpoint in clinical trials of NASH in children exposes them to two liver biopsies. The histologic spectrum of NAFLD in children is also more variable than in adults.20, 45 Given the asymptomatic nature of the disease in most children with NASH, this poses obvious logistic and practical issues and underscores the need to develop noninvasive markers that can be used to assess outcomes in this population. The status of biomarker development in this population is not sufficiently mature to make recommendations about any specific biomarker. Therefore, in pediatric trials, the entry criteria are still based on histologic analysis. Trials in children should assess histologic endpoints, because the AST and ALT are not adequate surrogates of histologic activity.46, 47 Histologic endpoints in children are, however, not as well defined as in adults. Either reversal of steatohepatitis (definite or borderline versus no steatohepatitis) or an improvement in NAS of two points, both without worsening of fibrosis, is recommended. Improvement in hepatic steatosis, as determined by imaging studies, e.g., MR spectroscopy, along with persistent normalization of AST and ALT may be considered as an alternate endpoint, especially in early phase studies (phase 1 and 2a). It is important again to note that this combination has not been studied as a formal biomarker of disease outcome and primarily represents clinical expedience.

3. Managing Confounding Variables in the Course of Clinical Trials for NASH

Several factors can potentially confound analysis and interpretation of data in clinical trials for NASH. These include the baseline body mass index; distribution of fat, especially the waist circumference; glycemic control; diet, including total calories, saturated fat, carbohydrates, choline, and foods of varying glycemic index; physical activity; use of medications for diabetes; dyslipidemia; and the amount of alcohol, even when modest, that is consumed. Some of these can be accounted for in the randomization process and need to be documented. It is also critically important to develop standard practices for diet and physical activities as well as rules for the use of concomitant medications. The amount, nature, and pattern of alcohol consumed, if any, during the course of the trial should also be documented using established validated questionnaires.

4. Noninvasive Markers in Clinical Trials for NASH

There is currently considerable interest in the development of noninvasive biomarkers for (1) the diagnosis of NASH (Table 4), (2) the fibrosis stage, and (3) the effect of treatment of NASH. The currently available data on noninvasive markers are not robust enough to replace a liver biopsy for any of these indications. Existing standards for biomarker discovery such as those used by the Early Detection Research Network and the National Cancer Institute (http://cabig.nci.nih.gov) provide guidance on standards for development of biomarkers and should be referred to before designing such studies.

Table 4. Special Considerations for Trials on Noninvasive Biomarkers of Disease in NASH
1. Key concepts:
  - Reproducibility of results
  - Proper experimental design to protect against bias
  - Proper evaluation of algorithms/decision rules
2. Inclusions:
  - Include subjects who are relevant to the specific question to be answered
  - Include the entire spectrum of disease relevant to the population to be studied
  - Avoid having a population skewed toward mild or severe disease
3. Design issues:
  - Define populations that are to be included and excluded
  - Define rules for decision-making, and consider creation of design matrix to evaluate confounders
  - Perform and report estimate of generalizable performance characteristics over a range of sampling effort
4. Reporting standards:
  - Provide STARD or other similar documentation standards data
  - Provide data to allow independent assessment of diagnostic algorithms (define rules for decisions or provide source codes)
  - Provide information on predictive values of specific cutoffs in specific populations
  - Provide information on false negative and false positive results at levels below and above low and high thresholds, respectively
  - Provide data on proportion of subjects in the indeterminate range when test is applied to a population with full spectrum of disease
  - Provide error report for individual subjects who are misclassified and evaluate in context of design matrix to assess for additional confounders

The entry criteria for trials to develop biomarkers will depend on the specific indication for which a biomarker is being developed. For example, studies to develop a biomarker for the diagnosis of NASH should include populations where the clinical possibility of NASH is entertained, such as individuals with persistent elevation of liver enzymes without obvious cause, among others. Similarly, studies of fibrosis biomarkers should include subjects with all stages of fibrosis to allow meaningful analysis. In all these studies, the biomarker must be compared to well-defined histologic findings. The endpoints should not only include the sensitivity, specificity, and the predictive values in a given clinical setting but also the proportion of subjects in the population of interest who would have indeterminate values.
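
The endpoints listed above can be computed from a biomarker with a low (rule-out) and a high (rule-in) cutoff, with histology as the reference standard. The sketch below is illustrative: the function name and the cutoffs are hypothetical, and it assumes that sensitivity, specificity, and predictive values are calculated only among subjects with a determinate result, with the indeterminate proportion reported separately. Other conventions for handling indeterminate results exist.

```python
from typing import Optional, Sequence


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def evaluate_biomarker(values: Sequence[float], has_nash: Sequence[bool],
                       low: float, high: float) -> dict:
    """Summarize a two-cutoff biomarker against histology (the reference standard).

    value >= high -> test positive; value <= low -> test negative;
    anything in between is counted as indeterminate.
    """
    if len(values) != len(has_nash):
        raise ValueError("values and has_nash must have the same length")
    tp = fp = tn = fn = indeterminate = 0
    for value, diseased in zip(values, has_nash):
        if value >= high:
            if diseased:
                tp += 1
            else:
                fp += 1
        elif value <= low:
            if diseased:
                fn += 1
            else:
                tn += 1
        else:
            indeterminate += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "indeterminate_fraction": _ratio(indeterminate, len(values)),
    }
```

Because predictive values depend on disease prevalence, the figures returned apply only to the population from which `values` and `has_nash` were drawn.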

A key issue in the conduct of biomarker discovery studies is the avoidance of bias. Several reports provide guidance on how to avoid bias in such studies.48, 49 The generalizability of the results must be evaluated in rigorously performed validation studies in large, well-characterized, independent cohorts in a variety of settings. Analogous to the CONSORT statement for clinical trials, the Standards for Reporting of Diagnostic Accuracy (STARD) document is recommended for the reporting of clinical trials of the use of noninvasive methods for assessment of disease status in NAFLD.50, 51 The STARD checklist provides guidance on what is to be reported and also offers recommendations on graphical representation of data.

5. Safety-Related Endpoints in NASH

Excess mortality in subjects with NASH is related to liver-related deaths, cardiovascular disease, and nonhepatocellular cancers.31-33 An ideal treatment for NASH should be one that not only improves the liver disease but also reduces the risks of cardiovascular outcomes and development of diabetes and cancers. This is likely to be too ambitious a goal in the context of clinical trials designed to obtain approval for improvement in liver disease. However, in the context of pivotal trials for NASH, measures of cardiovascular health, such as the fasting lipid profile (including small dense low-density lipoprotein and high-density lipoprotein subclasses), carotid intimal thickness, and markers of systemic and vascular inflammation (for example, C-reactive protein), should be measured to provide reassurance that there are no “alarm” signals.

Hepatic steatosis has been shown to be associated with increased susceptibility to endotoxin-mediated liver injury and may modulate sensitivity to acetaminophen-induced injury52, 53; it is not known if this holds true for other forms of drug-induced liver injury. Recently, the U.S. Food and Drug Administration published a position paper that provides guidance to the pharmaceutical industry for the identification of the hepatotoxic potential of a compound in premarketing clinical trials. However, patients with NASH frequently have spontaneous fluctuations of liver enzymes, and these rules may not fully apply in NASH. A proposal for identifying signals of hepatotoxicity and defining rules of discontinuation on the basis of U.S. Food and Drug Administration recommendations and the specificities of NASH-induced liver disease is presented in Table 5.54-57

Table 5. A Proposal for the Diagnosis of Potential Hepatotoxicity in NASH Trials
Suspected hepatotoxicity (stopping rules) during clinical trials
  • Increase in serum AST and/or ALT > 3 times the baseline value or > 500 IU/L (on one reading, unless the sample is hemolyzed)
  • New onset jaundice (total bilirubin > 3 mg/dL) that is not explained by Gilbert's syndrome or hemolysis, irrespective of other liver biochemistries
  • Increase in serum AST and/or ALT < 3 times the baseline value but associated with symptoms (fatigue, nausea, vomiting)
Signals of hepatotoxic potential from clinical trials
  • An excess of aminotransferase elevations to >3× baseline values when compared to the control group
  • Marked elevations of aminotransferases to 5×, 10×, or 20× baseline values in the test drug group and not seen (or less frequent) in the control group
  • One or more Hy's law cases (serum bilirubin >2× upper limit of normal in a setting of pure hepatocellular injury with no other explanation) in the test drug group, accompanied by an overall increased incidence of aminotransferase elevations >3× baseline in the test drug group compared to placebo
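
The stopping rules in Table 5 can be expressed as a simple check on a subject's liver biochemistries. The sketch below is illustrative only. It reads the first criterion as "more than 3 times baseline or above 500 IU/L", treats "new onset" jaundice as a total bilirubin that was not above 3 mg/dL at baseline, and uses hypothetical names (`LiverPanel`, `stopping_rule_triggers`) and flags; it is not a substitute for clinical adjudication.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LiverPanel:
    """Hypothetical record of one set of liver biochemistries."""
    ast: float               # IU/L
    alt: float               # IU/L
    total_bilirubin: float   # mg/dL


def stopping_rule_triggers(baseline: LiverPanel, current: LiverPanel, *,
                           symptomatic: bool = False,
                           hemolyzed: bool = False,
                           bilirubin_explained: bool = False) -> List[str]:
    """Return the Table 5 stopping rules met by `current` relative to `baseline`."""
    triggers = []
    # Rule 1: AST and/or ALT > 3x baseline or > 500 IU/L on a single reading.
    marked = (
        current.ast > 3 * baseline.ast or current.alt > 3 * baseline.alt
        or current.ast > 500 or current.alt > 500
    )
    if marked and not hemolyzed:
        triggers.append("aminotransferase_elevation")
    # Rule 2: new-onset jaundice not explained by Gilbert's syndrome or hemolysis.
    if (current.total_bilirubin > 3 and baseline.total_bilirubin <= 3
            and not bilirubin_explained):
        triggers.append("new_jaundice")
    # Rule 3: a smaller rise (< 3x baseline) accompanied by symptoms.
    rise = current.ast > baseline.ast or current.alt > baseline.alt
    if rise and not marked and symptomatic:
        triggers.append("symptomatic_rise")
    return triggers
```

An empty list means that none of the three stopping rules is met. Referencing thresholds to each subject's baseline rather than to the upper limit of normal reflects the fluctuating liver enzymes typical of NASH.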

It is likely that drug–drug interactions, off-target toxicity, and facilitation of development of neoplasia will be rare events. Standardized definitions of “harms” and “clinical phenotypes” of toxicity would allow safety data from multiple trials to be combined, thus greatly expanding the safety database and generating reliable large-scale evidence. It is recommended that robust postmarketing surveillance through mechanisms similar to the Sentinel network be used to track rare side effects.58

In summary, the field of NASH is evolving rapidly. Despite many advances in the understanding of the epidemiology and natural history of the disease and clear evidence that it contributes to the population burden of chronic liver disease, there are not yet any approved therapies for this condition. It is hoped that the recommendations here will help to both stimulate additional clinical trials for the treatment of NASH and provide guidance to both regulators and investigators about key inclusion and exclusion criteria and specific endpoints that should be evaluated in the context of such trials.