American Association for the Study of Liver Diseases Endpoints Conference: Design and Endpoints for Clinical Trials in Primary Biliary Cirrhosis


  • Potential conflict of interest: Dr. M. Silveira is the recipient of the 2009 AASLD Advanced/Transplant Hepatology Fellowship Award, Dr. E. Brunt has no disclosures, Dr. J. Heathcote has received grant funding (both restricted and unrestricted) from Axcan Pharma and Intercept Pharma, Dr. G. Gores has no disclosures, Dr. K. Lindor has received research support from Axcan Pharma and Intercept Pharma, and Dr. M. Mayo has received restricted grant funding from Intercept Pharma.


A group of multidisciplinary experts on primary biliary cirrhosis (PBC) and its complications convened on May 31, 2009, under the aegis of the American Association for the Study of Liver Diseases (AASLD) in order to identify the most appropriate design and endpoints for clinical trials of PBC based on current evidence and expert experience. The natural history of PBC was reviewed as well as current therapies. The current approaches to evaluating therapies for disease progression and symptoms as priorities were discussed. Appropriate aspects of trial design including entry criteria, study duration, and appropriate handling of issues such as stratification of subjects and use of ursodeoxycholic acid (UDCA), were identified and discussed. After a full day of presentations, in a consensus manner, appropriate endpoints for clinical efficacy trials regarding PBC and its complications were agreed upon and are reported in this summary.


AASLD, American Association for the Study of Liver Diseases; ALP, alkaline phosphatase; ALT, alanine aminotransferase; AMA, anti-mitochondrial antibodies; AST, aspartate aminotransferase; ELISA, enzyme-linked immunosorbent assay; EV, esophageal varices; FIS, fatigue impact score; FDA, U.S. Food and Drug Administration; HCV, hepatitis C virus; MRS, Mayo Risk Score; MELD, Model for End-Stage Liver Disease; PHG, portohepatic gradient; PBC, primary biliary cirrhosis; UDCA, ursodeoxycholic acid; ULN, upper limits of normal; VAS, visual analog scale.


PBC is a relatively rare but important cause of chronic cholestatic liver disease that affects predominantly middle-aged women.1 Elevations in serum alkaline phosphatase (ALP) levels are the biochemical hallmark of PBC. In this context, the best diagnostic tool for PBC is the measurement of anti-mitochondrial antibodies (AMA), which are characteristic of PBC and are present in at least 95% of the cases.2 Liver biopsy is not required for clinical diagnosis in PBC,3 but may be helpful in selected situations such as stratification of patients within clinical trials and in some cases may be used as a study endpoint. Earlier epidemiological studies report an annual incidence ranging from 2.27 to 32 per million4, 5 with a female-to-male sex ratio averaging approximately 10:1,6 although over the last decade the disease has been reported more frequently, perhaps at least in part due to greater disease awareness.7 PBC occurs in all races and ethnicities; the highest rates of prevalence and incidence have been within Caucasian populations from the United Kingdom,4 Scandinavia,8 and Minnesota,9 and the disease is infrequently reported in Africa and the Indian subcontinent.

PBC is a slowly progressive disease that causes substantial loss of intrahepatic bile ducts, ultimately resulting in cholestasis, advanced fibrosis, cirrhosis, and liver failure. As such, PBC is an important indication for liver transplantation.1 Cirrhosis may also lead to hepatocellular carcinoma in PBC. Histologically, the disease is characterized by chronic portal inflammation with infiltration, destruction and loss of the epithelial cells in the small-sized and medium-sized bile ducts.10 Progression of disease occurs at different rates and with varying degrees of severity in different patients.2 The natural history of PBC can be divided into four phases. The silent or preclinical phase is characterized by isolated AMA positivity and normal serum biochemistries11; this phase may last many years. The next phase is characterized by gradual elevation of the serum ALP levels. The vast majority of newly diagnosed patients with PBC present without symptoms attributable to liver disease and are in this asymptomatic phase. Although this phase may last up to 20 years, only 30%-50% of patients remain asymptomatic after 5 years of follow-up.12 As PBC is diagnosed increasingly earlier, this percentage may be increasing. Patients in the symptomatic phase will most often complain of fatigue and/or pruritus, but may also report abdominal pain. Symptoms related to portal hypertension usually appear later, with 20% developing ascites and 10% developing bleeding varices within a 10-year period.13 Patients who develop portal hypertension are often either anicteric or mildly jaundiced. If untreated, median survival ranges from 6 to 10 years, with an accelerated course after development of ascites and hepatic encephalopathy. Once progressive jaundice develops, patients enter a pre-terminal phase, which can last up to 2 years.13

UDCA has been the drug most widely evaluated in the treatment of PBC. Treatment with UDCA may delay disease progression and prolong survival free of liver transplantation.14 Most of the trials of UDCA have not recruited sufficient patients to have the power to show an effect of therapy on survival. Therefore, the evidence that UDCA inhibits the progression of the disease has been delayed until long after the completion of the trials that demonstrated improvement in markers of cholestasis. Several studies evaluating long-term survival of patients have been published and uniformly they indicate that those who demonstrated biochemical response to adequate doses of UDCA for prolonged periods of time have longer survival free of liver transplantation and longer overall survival.15-18 Currently, treatment with UDCA in a dose of 13-15 mg/kg/day is recommended as therapy for PBC by the AASLD19 and is approved for this indication by the U.S. Food and Drug Administration (FDA).

Despite significant evidence that points to an autoimmune etiology of the disease, the precise etiology of PBC remains unknown. As long as the etiology of PBC remains unidentified and not measurable, determining surrogate endpoints is particularly difficult. A parallel can be drawn to the history of conducting clinical trials in hepatitis C. With the identification of the hepatitis C virus (HCV) and the development of internationalized units of amplified HCV RNA, it became possible to use HCV RNA as a surrogate endpoint in clinical trials. Histological improvement in fibrosis, decline in the rates of death/transplantation, and other “hard outcomes” were not demonstrated until years after sustained virological response. Importantly, these hard outcomes were not realized before licensing and widespread availability of antiviral medication. The cost of large randomized placebo controlled trials of HCV was significant and was borne largely by the pharmaceutical industry. If the endpoints of HCV trials had rested upon hard outcomes such as survival, the duration and cost would have likely prohibited the development of what we now know to be successful therapies. Similarly, without a known etiology of PBC, the design and execution of clinical trials is significantly hampered.

Multiple issues add to the challenge for design and execution of clinical trials in PBC. The relative rarity of PBC, as compared to HCV, poses a challenge to achieving requisite sample sizes. In addition to small sample size, the slow rate of disease progression limits the evaluation of effects of therapy on survival and survival free of liver transplantation. Other issues that make trials even more complex to evaluate include the wide range of disease severity and response to UDCA, which are often inversely correlated. Meta-analyses could help overcome the effects of limited sample size; however, the marked heterogeneity of some of the clinical trials conducted to date has limited the interpretation of combined analyses. Thus, standardization of study design and appropriate endpoints are crucial for the advancement of clinical research in PBC and for capturing benefits in health outcomes that might be achieved with new interventions.

Design of Clinical Trials for PBC

Entry Criteria

A detailed description of the diagnosis of PBC in clinical practice is beyond the scope of this report. The interested reader is referred to the recent AASLD practice guidelines published on PBC.19

For the purpose of enrollment into clinical trials, the presence of AMA in the context of cholestatic liver biochemistries (i.e., elevated alkaline phosphatase, with or without mild elevation of aminotransferases) in the absence of biliary obstruction19 should be considered standard entry criteria. This is because little is known about AMA-negative PBC and whether it represents a different pathological entity. Furthermore, AMA-negative patients represent such a small percentage of patients with PBC that their exclusion from clinical trials is justified. There are several methodologies for determining AMA, including indirect immunofluorescence and enzyme-linked immunosorbent assay (ELISA). ELISA is recognized as more sensitive, but perhaps not as specific.20 Not all centers provide both options for testing; therefore, no specific type of AMA testing should be required for trial entry criteria. Detection of AMA by any method is acceptable for diagnosis. The presence of antibody, rather than the magnitude of antibody level, establishes AMA-positivity.

Liver biopsy is also considered desirable before entry into most trials aimed at altering the course of the disease. Diagnosis of concurrent pathological processes is an invaluable consideration for liver biopsy; steatohepatitis has been documented in up to 5% of other forms of chronic liver disease,21 and the presence of “overlap” features of autoimmune hepatitis is best documented by liver biopsy findings. In a recent study, 19% of patients with otherwise typical PBC could be given a diagnosis of “probable” autoimmune hepatitis, according to the revised Autoimmune Hepatitis Group scoring system22 and were thus considered as having PBC-autoimmune hepatitis overlap. The histological finding of interface hepatitis in biopsy was one of five features that distinguished between overlap and nonoverlap patients. After 6 years of follow-up, the patients with overlap had significantly worse clinical outcomes than the nonoverlap PBC patients, thus confirming the potential importance of using liver biopsy to establish the correct diagnosis.23 It was recognized that in one study, the presence of AMA and cholestatic enzymes in a middle-aged woman had a 98% positive predictive value for the presence of PBC,3 obviating the need for a liver biopsy for diagnostic purposes. However, this study was conducted and validated in centers of excellence for PBC by experienced investigators. It is not certain whether these criteria would hold the same positive predictive value when widely applied. Moreover, patients whose liver biopsies have interface hepatitis may have a more rapidly progressive disease than patients without this histologic feature24; thus, interface hepatitis is an important prognostic parameter. Other histological factors may also have prognostic significance, such as the degree of ductopenia.25 Therefore, treatment trials for the liver disease of PBC require histologic assessment for patient stratification. Liver histology may also be a desirable study endpoint (e.g., fibrosis progression or regression).

For some trials, such as those evaluating symptoms such as pruritus and fatigue in patients with an established diagnosis of PBC, liver biopsy for stratification or as a study endpoint may not be as important. Thus, a liver biopsy at study entry may not be required.

Recommendations for Entry Criteria into Clinical Trials:

  • 1. The diagnosis of PBC for enrollment of patients into clinical trials should be established by the presence of cholestasis in the absence of biliary obstruction and the presence of AMA (regardless of titer or diagnostic test type).
  • 2. Liver biopsy before entry into therapeutic trials is desirable for stratification and may have a role as a secondary endpoint in longer trials.
  • 3. Trials evaluating treatment for pruritus and fatigue in patients with an established diagnosis of PBC may not require a liver biopsy at study entry.
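As an illustration only, the three entry recommendations above could be encoded as a screening check. This is a minimal sketch: the `Screening` record, its field names, and both function names are hypothetical and are not part of the consensus statement.

```python
from dataclasses import dataclass


@dataclass
class Screening:
    # All field names are hypothetical, chosen for this sketch.
    ama_positive: bool          # AMA detected by any method, at any titer
    cholestasis: bool           # cholestatic liver biochemistries (e.g., elevated ALP)
    biliary_obstruction: bool   # obstruction identified during workup


def eligible(s: Screening) -> bool:
    """Recommendation 1: cholestasis, no biliary obstruction, AMA present."""
    return s.ama_positive and s.cholestasis and not s.biliary_obstruction


def biopsy_expected(symptom_trial: bool) -> bool:
    """Recommendations 2 and 3: an entry biopsy is desirable for therapeutic
    trials but may not be required for pruritus or fatigue trials."""
    return not symptom_trial
```

Note that the check deliberately ignores AMA titer and assay type, in line with recommendation 1.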

Duration of Trials

There is an intimate link between the duration of the study and the choice of outcome. Each study must be long enough, and the degree of change in outcome large enough, to ensure that the measured difference exceeds normal variation. The consensus opinion was that for biochemical outcomes, a minimum of 3 months is needed. Change in ALP levels should be far greater than 10%, which is within natural variation during stable disease. For histological outcomes, a study of at least 2 years is needed, based on studies that have described the natural history of histologic progression.26, 27 For symptoms such as fatigue and pruritus, a study duration of approximately 3 months may be sufficient.

Recommendations for Duration of Trials:

  • 4. The optimal duration of trials should be based on the primary endpoint:
    • Biochemical endpoint: a minimum of 3 months is required.
    • Histologic endpoint: a minimum of 2 years is required.
    • Symptom severity (such as fatigue and pruritus): approximately 3 months may be sufficient.
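The duration guidance can be sketched as a simple lookup, together with a check that an observed ALP change exceeds the 10% attributed to natural variation. The names below are hypothetical. The 10% figure is treated here only as a floor, since the text asks for a change "far greater" than 10%, and the symptom duration is approximate in the source.

```python
# Minimum trial duration in months by primary endpoint, taken from the
# recommendations above. The "symptom" value is approximate in the source.
MIN_DURATION_MONTHS = {
    "biochemical": 3,
    "histologic": 24,
    "symptom": 3,
}

# Relative ALP change considered to be within natural variation.
ALP_NATURAL_VARIATION = 0.10


def duration_is_adequate(primary_endpoint: str, planned_months: float) -> bool:
    """True if the planned duration meets the minimum for the endpoint."""
    return planned_months >= MIN_DURATION_MONTHS[primary_endpoint]


def alp_change_exceeds_variation(baseline_alp: float, follow_up_alp: float) -> bool:
    """True if the relative ALP change is larger than natural variation.

    This is a necessary floor only; a meaningful change should be far larger.
    """
    relative_change = abs(follow_up_alp - baseline_alp) / baseline_alp
    return relative_change > ALP_NATURAL_VARIATION
```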

Stratification of Subjects

As in other diseases, trials in patients with PBC may be susceptible to disease heterogeneity and confounding variables. These would include variables that may affect the endpoints of the study and thus require balanced representation between the comparison groups. Although random assignment is expected to avoid large imbalances, stratification will make certain that an important predictor is evenly distributed between the study groups. Stratification is relevant in PBC trials, which tend to be limited in size, whereas it may be less important in large trials (>1000 participants) where randomization is more likely to result in even distribution among the study groups.
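One common way to implement the stratification described above is permuted-block randomization within each stratum. The sketch below is an assumption for illustration and not a method prescribed by the conference; the arm labels, the block size, and the function name are hypothetical.

```python
import random
from collections import defaultdict


def stratified_block_randomize(subjects, arms=("UDCA + placebo", "UDCA + new agent"),
                               block_size=4, seed=0):
    """Assign subjects to arms using permuted blocks within each stratum.

    subjects: iterable of (subject_id, stratum) pairs, in enrollment order.
    Returns a dict mapping subject_id to the assigned arm.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    pending = defaultdict(list)  # unused assignments remaining per stratum
    allocation = {}
    for subject_id, stratum in subjects:
        if not pending[stratum]:
            # Start a new balanced block for this stratum and shuffle it.
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
            pending[stratum] = block
        allocation[subject_id] = pending[stratum].pop()
    return allocation
```

Because every completed block contains each arm equally often, the arms stay balanced within a stratum (for example, UDCA responders versus incomplete responders) even when the trial is small.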

Natural history studies have identified numerous markers of progressive disease in PBC, including but not limited to: response to UDCA,15-18 histology (degree of interface hepatitis and features of overlap with autoimmune hepatitis),23, 24 biochemical markers (serum bilirubin, albumin and prothrombin time),28 presence and/or degree of portal hypertension,29 certain genetic polymorphisms (apolipoprotein A, tumor necrosis factor [TNF]-alpha), specific autoantibodies (anti-gp210, anti-promyelocytic leukemia protein [PML], anti-sp100, anti-centromere),30, 31 and serum markers of fibrosis (hyaluronic acid, procollagen III, tissue inhibitor metalloproteinase).32

Models using time-fixed Cox proportional hazards regression analysis have been developed to predict survival in PBC. Among several well validated models (European, Mayo, Oslo, Barcelona, Newcastle),13, 28, 33, 34 the Mayo risk score is the most widely used, at least in the United States, and includes the following variables: age, total bilirubin, prothrombin time, albumin and presence/absence of peripheral edema and response to diuretics.28 For example, the model has been used as a historical control to estimate the benefits of treatment, including pharmacological therapy and liver transplantation.35 The risk score would be a rational means to stratify patients for survival analysis of observational studies as well as clinical trials. Finally, it has been used in everyday practice to counsel patients and help make clinical management decisions. This model has the disadvantage of overestimating the survival in patients with poor short-term survival. Of all the serum markers studied to date, serum bilirubin is the best independent predictor of survival.36-38

Table 1 lists several of the more compelling features upon which to consider subject stratification. Determining which ones are important enough to use for stratification of subjects in a clinical trial depends on the endpoint being examined. For example, in a trial evaluating a long term outcome such as survival, stratification for disease stage (i.e., degree of fibrosis) may be considered. Subjects may also be stratified by their symptom status (symptomatic versus asymptomatic). Alternatively, biochemical parameters may be used to ensure that patients with advanced PBC are equally represented between patient groups. Unfortunately, to date, only a limited number of trials have appropriately taken stratification into account in their trial design.39

Table 1. Stratification of Subjects
  Abbreviation: UDCA, ursodeoxycholic acid.

Clinical Features
 Biochemical response to UDCA
Biochemical Features
 Mayo Risk Score
Histological Features
 Interface hepatitis
 Degree of fibrosis
Portal Hypertension
 Portal pressure measurement
 Presence of varices, ascites and/or hepatic encephalopathy
 Platelet count

Recommendations for Stratification of Subjects Included in Trials:

  • 5. Subjects enrolled in therapeutic trials should be appropriately stratified (Table 1).

Use of UDCA in Trials of New Therapies

UDCA is widely considered the standard of care for PBC in North America19 and withdrawing or withholding UDCA treatment is considered impractical and possibly unethical. Therefore, new trials will likely have to compare UDCA with or without the new therapy.

After a minimum of 3-6 months of UDCA therapy, subjects can be divided into two categories: responders and incomplete biochemical responders. Therefore, treatment-naïve patients should be started on adequate doses of UDCA (13-15 mg/kg/day), and it is important to wait 3-6 months after starting UDCA before testing any new drug in order to allow potential improvement in biochemical markers. Improvement in ALP levels may continue for up to 5 years40; therefore, a 3-6 month delay is considered a minimum. Biochemical response to UDCA at 1 year is a strong predictor of long-term prognosis.15-18 Indeed, patients with PBC who have achieved a biochemical response to UDCA have been shown to survive just as long as the normal population.41 Biochemical response has been defined by numerous criteria (summarized in Table 2): the Mayo criteria15 (ALP < 2 times the upper limits of normal [ULN]), the French criteria16 (ALP < 3 times ULN, aspartate aminotransferase [AST] < 2 times ULN, and bilirubin < 1.0 mg/dL), the Spanish criteria17 (decline in ALP of more than 40% of baseline or to a normal value) or the Dutch (Rotterdam) criteria18 (normalization of bilirubin and/or albumin after treatment if one or both were abnormal at baseline). The French criteria16 were recognized as the best validated and easiest to use, and were therefore recommended for use in therapeutic trials.

Table 2. Biochemical Response to Treatment with Ursodeoxycholic Acid
  Abbreviations: ALP, alkaline phosphatase; AST, aspartate aminotransferase; ULN, upper limits of normal.

Angulo et al.15 (1999)
 ALP < 2 times ULN
Pares et al.17 (2006)
 ALP decline > 40% from baseline or to normal value
Corpechot et al.16 (2008)
 ALP < 3 times ULN
 AST < 2 times ULN
 Bilirubin < 1.0 mg/dL
Kuiper et al.18 (2009)
 Normal bilirubin and albumin (if one or both were abnormal before treatment)
 Normal bilirubin or albumin (if both were abnormal before treatment)
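The threshold-based criteria in Table 2 lend themselves to a short sketch. The `Labs` record and the function names are hypothetical, and ALP and AST are assumed to be expressed as multiples of ULN. "Normal" ALP is interpreted as at or below 1 times ULN, which is an assumption. The Dutch (Rotterdam) criteria are omitted because they depend on laboratory-specific normal ranges for bilirubin and albumin.

```python
from dataclasses import dataclass


@dataclass
class Labs:
    # Hypothetical record; ALP and AST are expressed as multiples of ULN.
    alp_x_uln: float
    ast_x_uln: float
    bilirubin_mg_dl: float


def mayo_response(on_treatment: Labs) -> bool:
    """Angulo et al.: ALP < 2 times ULN."""
    return on_treatment.alp_x_uln < 2


def french_response(on_treatment: Labs) -> bool:
    """Corpechot et al.: ALP < 3x ULN, AST < 2x ULN, bilirubin < 1.0 mg/dL."""
    return (on_treatment.alp_x_uln < 3
            and on_treatment.ast_x_uln < 2
            and on_treatment.bilirubin_mg_dl < 1.0)


def spanish_response(baseline: Labs, on_treatment: Labs) -> bool:
    """Pares et al.: ALP decline of more than 40% from baseline, or normal ALP."""
    decline = (baseline.alp_x_uln - on_treatment.alp_x_uln) / baseline.alp_x_uln
    return decline > 0.40 or on_treatment.alp_x_uln <= 1.0
```

A patient with ALP at 2.5 times ULN, AST at 1.5 times ULN, and bilirubin of 0.8 mg/dL would count as a responder under the French criteria but not under the Mayo criteria, which shows why a trial must fix one definition in advance.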

Initial treatment strategies would be best served by targeting incomplete biochemical responders. Once efficacy data is obtained, then treatment with the new drug could be compared to UDCA in UDCA-naïve patients. It was recognized that this approach might run the risk of testing new therapies only in subjects who might be too advanced to respond to any treatment. However, it is unethical to remove patients from a treatment that likely normalizes their survival when the effect of the new drug is unknown. The utility of primary prevention treatment, that is, to treat those who are AMA positive without liver disease (i.e., with normal liver biochemistries), is unknown and cannot be advocated at this time.

Recommendations for Use of UDCA in Trials of New Therapies:

  • 6. Therapeutic trials with a new agent should target patients with incomplete biochemical response after 3-6 months of treatment with an adequate dose of UDCA. Complete biochemical response should be defined as ALP < 3 times ULN, AST < 2 times ULN, and bilirubin < 1.0 mg/dL.16
  • 7. All therapeutic trials should compare UDCA with or without the new therapy.
  • 8. Only once robust efficacy data are obtained should treatment with the new drug be compared directly to UDCA.
  • 9. Primary prevention of patients who are AMA-positive without liver disease is not recommended.

Endpoints of Clinical Trials in PBC

There are multiple potential endpoints for clinical trials in PBC. The hard endpoints of death or liver transplantation, although desirable measures of efficacy, were recognized as likely unfeasible in a therapeutic trial. Slow disease progression and the limited availability of study participants are the main reasons these endpoints are not feasible. Response to therapy could be defined as improvement or lack of progression in biochemical markers, prognostic models, portal hypertension, or liver histology. Response to therapy could also be defined as improvement or lack of progression in symptoms such as fatigue or pruritus. All were considered to be important goals of therapy, and each study must determine the aspect of the disease which is most germane to the therapy being evaluated.

Biochemical Markers

Biochemical markers are easy to assess in large trials and are valuable markers of disease activity and severity. Bilirubin has been consistently demonstrated to be the single most important serum marker of survival36-38 and therefore is a desirable endpoint. The limiting factor of using bilirubin as an endpoint is that it only becomes abnormal in late stages of the disease and thus is not a sensitive outcome measure. It is less useful in assessing improvement in patients with mild to moderate disease, a group which may actually be the most responsive to therapy. Response of ALP to therapy has been shown to be a good correlate of both survival17 and liver histology3 in PBC and is used globally in clinical practice to predict the progression of the disease; it is therefore an acceptable therapeutic criterion to monitor PBC treatment. Analysis of UDCA trials demonstrated that normalization of ALP, when assessing a large group, was associated with better survival than expected.15, 16

Recommendations for Biochemical Markers:

  • 10. The inclusion of death or liver transplantation as primary endpoints, though desirable, is unfeasible.
  • 11. The inclusion of biochemical markers is a satisfactory primary endpoint for therapeutic trials; the desirable biochemical response should be ALP < 3 times ULN, AST < 2 times ULN, and bilirubin < 1.0 mg/dL.16

Prognostic Models

Because intervention studies using clinical outcome endpoints, such as liver-related and/or all-cause deaths and the need for liver transplantation, are both very lengthy and expensive to conduct, the use of prognostic models as endpoints can be attractive. Risk scores, such as the Mayo Risk Score (MRS)28 and the Model for End-Stage Liver Disease (MELD) Score42 estimate survival. However, these scores are most useful in patients with sufficient hepatic decompensation to have increased scores; these constitute a minority of patients in the modern era. The MRS was validated and repeatedly cross-validated as an excellent predictor of survival in patients with PBC, although at a time when patients tended to present with more severe liver disease. Numerous trials of UDCA have used the MRS to demonstrate improved actual survival with therapy compared to expected survival. MRS suffers from the same drawback as serum bilirubin: it is insensitive to changes in mild-moderate disease.

Recommendations for Prognostic Models:

  • 12. Prognostic models should not be used as primary endpoints, as they are insensitive to changes in mild and moderate disease.
  • 13. Mayo risk score could be included as a secondary endpoint in clinical trials.

Portal Hypertension

Portal hypertension and its complications develop frequently in patients who are not treated with UDCA. In an older study, approximately one third of patients with PBC were noted to develop esophageal varices (EV) over 6 years43 and rates as high as 25% within 2 years have been reported.44 This number may be significantly reduced (16% versus 58% after 4 years)44 by UDCA treatment, but data are conflicting.45, 46

Although some trials in PBC have addressed complications related to portal hypertension as a relevant outcome, the majority of the clinical trials in PBC conducted to date have not. In the trials that have addressed this, the development of EV has been the most frequently measured endpoint of interest. A few studies have noted that the development of EV in patients with PBC appears to predict the development of symptomatic disease and also of other complications related to portal hypertension such as ascites or portosystemic encephalopathy. However, portal hypertension can be present in the absence of varices and can then only be determined by direct measurement of the portohepatic gradient.

A recent study including direct measurements of the portohepatic gradient (PHG) in patients with PBC reported the presence of portal hypertension (PHG > 6 mm Hg) in 35% of subjects, and severe portal hypertension (PHG > 12 mm Hg, the threshold for the risk of bleeding from EV) in 20% of them.29 In this study, significant differences in survival were noted between patients classified at baseline as having PHG < 6 mm Hg, PHG 6-12 mm Hg, and PHG > 12 mm Hg.29 Higher PHG at baseline was associated with decreased survival. Furthermore, improvement of PHG in response to UDCA was associated with survival free of liver transplantation. Hence, the use of portal hypertension as an endpoint for trials is justified.
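The three baseline PHG groups described in that study can be expressed as a small classifier. This is a sketch with a hypothetical function name and labels. The handling of values exactly at 6 and 12 mm Hg is an assumption, because the text gives the groups only as < 6, 6-12, and > 12 mm Hg.

```python
def classify_phg(phg_mm_hg: float) -> str:
    """Group a portohepatic gradient (mm Hg) into the three reported strata.

    Boundary values of exactly 6 and 12 are placed in the middle group;
    that choice is an assumption of this sketch.
    """
    if phg_mm_hg < 6:
        return "no portal hypertension"
    if phg_mm_hg <= 12:
        return "portal hypertension"
    return "severe portal hypertension"
```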

Portal hypertension can be measured by several different means: indirectly by the development of varices, variceal bleeding, development of ascites, development of encephalopathy, or directly by wedged or direct portal venous pressure measurement. All are acceptable options, but selecting the best way to measure portal hypertension in a trial requires weighing the relative accuracy, invasiveness, and cost for each measure within the circumstances of the trial. Endoscopic evaluation is the gold standard for assessing the development of EV47 but it is invasive, costly, and does not detect early portal hypertension which may be present in the absence of EV. Capsule endoscopy is less invasive but is still not recommended for EV screening.47 Noninvasive predictors of EV based on laboratory tests such as platelet counts may be useful in specific clinical settings but do not provide the needed accuracy to be used as measures in clinical trials. Direct PHG measurement is accurate and highly sensitive to detect small differences in portal pressure. However, it is invasive and not practical. Techniques such as MR elastography may be useful in the noninvasive prediction of portal hypertension in patients with PBC48 but still need further study and validation.

Recommendations for Portal Hypertension:

  • 14. Portal hypertension can be assessed indirectly by several different means including development of varices, variceal bleeding, development of ascites, development of encephalopathy, or directly by wedged or direct portal venous pressure measurement. Portal hypertension can be a legitimate primary endpoint in treatment trials if measured directly. Indirect assessment of portal hypertension as secondary endpoints is encouraged so that further data may be obtained.

Liver Histology

Histological evaluation within the context of clinical trials should focus on the lesions that characterize the disease or lesions that are predictive of outcome. Histologic findings that have been identified as predictors of poor survival include: fibrosis, interface hepatitis,24 periportal bile stasis, extent of periportal cell necrosis,49 cirrhosis and “central” cholestasis.33 In the former study, fibrosis limited to the portal tracts was noted as a mitigating feature. Natural history studies have confirmed progressive histologic disease within 2-3 years, with progression of histologic stage in 79% of patients without cirrhosis at baseline over a median of 3 years, of whom 61% developed cirrhosis.27 Among patients with cirrhosis, only 3% experienced regression of histologic stage during follow-up.27 These findings may help determine study size as well as treatment and follow-up duration.
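To illustrate how such progression rates might feed into study size, the sketch below applies the standard normal-approximation formula for comparing two proportions. The 79% progression rate comes from the natural history data above; the treated-arm rate, the significance level, and the power are hypothetical inputs, and the calculation ignores dropout and time-to-event considerations.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p_control: float, p_treated: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Subjects per arm to detect p_control vs p_treated (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treated) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p_control * (1 - p_control)
                                 + p_treated * (1 - p_treated))) ** 2
    return ceil(numerator / (p_control - p_treated) ** 2)


# Hypothetical scenario: 79% histologic progression without effective
# therapy versus an assumed 59% with a new agent.
subjects_needed = n_per_arm(0.79, 0.59)
```

Halving the assumed treatment effect roughly quadruples the required sample, which is one reason a rare disease such as PBC benefits from endpoints with large expected differences.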

Published trials of medical therapy for PBC that have included assessment of interface hepatitis have confirmed the observation of Corpechot et al.24 that this lesion is refractory to reversal.50, 51 The French study further incorporated this histologic lesion into a Prognostic Score Index with equal value to elevated bilirubin and depressed albumin. Thus, inclusion of interface hepatitis would appear to be necessary in clinical trials. This lesion, however, has, at best, only moderate interobserver agreement52 and thus these data may be challenging to reproduce. A multicenter effort of examining interobserver agreement for nodular regenerative hyperplasia, portal vein lesions, and features of cholestasis is ongoing. Finally, potential use of stains beyond the use of hematoxylin and eosin and trichrome may need to be considered and exploited (e.g., Sirius red staining for assessing fibrosis by quantitative morphometry53).

Improvement in liver histology is a desirable outcome for clinical trials of PBC. Unfortunately, current staging systems, including the commonly used Ludwig54 and Scheuer55 systems, do not provide ordinal assessments for many features that have been associated with poorer prognosis. It is of interest that several recognized lesions of PBC that may contribute to portal hypertension and/or progressive cholestasis are not part of any of these scoring systems, including nodular regenerative hyperplasia, portal venous lesions, lobular inflammation and activity, and damage to canal of Hering structures.49, 56 The Ishak staging system of fibrosis, while originally created for HCV and not PBC, does provide a multistep assessment of fibrosis and has been employed in only one PBC study32 to predict the development of hard outcomes. However, it also ignores lesions of inflammation and ductopenia which are also considered important25 and are likely more useful to detect changes over a shorter time interval. The task of creating and validating a new staging system for PBC to address these shortcomings is being undertaken by participants of the meeting; the results of this systematic study of liver histology in PBC are not yet available.

The recognized limitations of liver biopsy for PBC are many.57 A few limitations include the potential absence of diagnostic lesions on any given biopsy specimen, due largely to the heterogeneity of disease throughout the liver. The distinction between stage 3 (bridging fibrosis) and stage 4 (cirrhosis) may be challenging because biliary fibrosis is characterized by portal-portal bridging, with retention of the vascular relationships (terminal hepatic venules) and acinar architecture. Likewise, there may be clinical evidence of portal hypertension without confirming evidence of advanced fibrosis on the biopsy, especially when the presence of nodular regenerative hyperplasia is recognized in PBC.58 Histological progression does not necessarily occur at the same rate as clinical progression, and the presence of cirrhosis at diagnosis does not correlate with the presence of symptoms. Markov modeling has been used by several investigators to predict histological progression. Locke et al.27 estimated that most treatment naïve patients with PBC will progress histologically within 2 years. Similarly, Corpechot et al.59 used a 2-stage model and showed that only 29% and 13% of untreated patients remain in early stage after 4 and 8 years, respectively.

Recommendations for Endpoints for Liver Histology:

  • 15. Although improvement in liver histology is a desirable endpoint for clinical trials, current staging systems are inadequate and, therefore, the use of histology as a primary endpoint is problematic.
  • 16. Comparative evaluation of predetermined features in liver histology following treatment remains valuable as a potential endpoint.

Novel Endpoints

Surrogate markers of fibrosis, such as serum markers,32 transabdominal elastography,60 and magnetic resonance elastography,61 have early data that are encouraging. For example, in a long-term cohort study of 161 patients with PBC, serum markers of liver fibrosis entered into the Enhanced Liver Fibrosis algorithm accurately predicted the development of future complications or death, particularly at earlier times in the disease process (4 and 6 years prior to the first event),32 when the performance of other prognostic models such as MRS and MELD score is typically limited. Common barriers to the use of surrogate markers of fibrosis include cost and/or limited availability. Furthermore, they are considered too preliminary to use as isolated primary endpoints of a clinical trial. The use of these modalities as secondary endpoints is encouraged so that further data may be obtained.

Recommendations for the Inclusion of Novel Endpoints:

  • 17. Novel endpoints should not be used as primary endpoints in treatment trials. The use of these modalities as secondary endpoints is encouraged so that further data may be obtained.


The conventional view of PBC is of a relatively indolent chronic liver disease, leading ultimately to advanced liver disease with its associated complications. The use of UDCA may lead to successful disease arrest in some individuals. As such, symptoms which are independent of the development of advanced disease, such as fatigue and pruritus, have become a predominant clinical problem. Therefore, improvement in quality of life, including fatigue and pruritus, is an increasingly important therapeutic goal for studies in patients with PBC.

Though there is a reasonable consensus in the literature that fatigue affects 30% to 50% of patients in most disease populations, fatigue can be difficult to quantify, as it is a relatively ill-defined symptom complex. Instruments to measure fatigue fall into two broad groups, generic (e.g., the Fatigue Impact Score [FIS]) and PBC-specific (the PBC-40 fatigue domain). The FIS62 is a questionnaire-based inventory which requires patients to rate the impact of fatigue on 40 aspects of daily life over the previous month. The impact on each category is graded into five levels of severity which are scored from zero to four (with higher scores representing greater fatigue) to give a maximum score of 160. The 40-question FIS consists of three domains assessing the impact of fatigue on psycho-social, cognitive and physical activity. Reliability and construct validity for this score have been established in cross-sectional studies of patients with PBC.63-66 The PBC-40 is a disease-specific health-related quality of life measure developed and validated for use exclusively in PBC.67, 68 The PBC-40 consists of five symptom domains relating to “fatigue”, “itch”, “cognitive” symptoms, “social and emotional” symptoms, and “other symptoms”, and has been used in other studies to assess fatigue severity and response to treatment.69 The ability of these measures to quantify change in the context of therapeutic intervention remains untested in PBC.
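The FIS scoring rule described above (40 items, each graded from 0 to 4 and summed to a maximum of 160) can be sketched as follows. This is a minimal illustration of the arithmetic only; the function name and the validation behavior are assumptions made here, and the sketch does not reproduce the instrument's items or its domain subscores.

```python
from typing import Sequence

N_ITEMS = 40        # number of FIS items, as stated in the text
MAX_ITEM_SCORE = 4  # each item is graded 0 (no impact) to 4 (greatest impact)


def fis_total(responses: Sequence[int]) -> int:
    """Sum the 40 item grades into a total score between 0 and 160.

    Raises ValueError if the number of responses or any grade is out of range,
    so that incomplete questionnaires are not silently scored.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    for grade in responses:
        if not isinstance(grade, int) or not 0 <= grade <= MAX_ITEM_SCORE:
            raise ValueError(f"each grade must be an integer from 0 to {MAX_ITEM_SCORE}")
    return sum(responses)


# Hypothetical respondent who grades every item at the highest severity.
print(fis_total([4] * N_ITEMS))
```

Rejecting incomplete or out-of-range responses, rather than imputing them, is a design choice of this sketch; a trial protocol would need to specify its own handling of missing items.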

Pruritus is best measured by the 5-D70 or a visual analog scale (VAS) in long-term outpatient studies. The 5-D is a multidimensional tool that encompasses the duration, degree, direction, disability, and distribution of pruritus.70 Although it has been specifically validated to detect changes in PBC, it is new and has neither been cross-validated nor widely used. The VAS, despite its uni-dimensional nature and inherent difficulties in comprehension and scoring, has been widely used in the literature and, therefore, is a well-recognized and validated instrument. For short-term studies, measuring scratching activity with a piezoelectric transducer is considered superior because of its objective nature and previous validation in PBC.71 It is important that subjects perform any of these serial assessments at the same time of the day, as both fatigue and pruritus have a diurnal variation and are worst in the evenings. When designing trials, it is important to consider that fatigue persists throughout the course of the disease13, 63 and pruritus may spontaneously improve with disease progression.72

Recommended Endpoints for Symptoms:

  • 18. Fatigue should be measured by FIS or the fatigue domain of PBC-40.
  • 19. Pruritus should be measured by 5-D or VAS instruments.

Future Directions

The ability to identify the cause of the disease can have a major impact on the choice of endpoints for clinical trials. As an example, for non-A, non-B viral hepatitis studies in the 1970s, the endpoint was aminotransferase normalization whereas now the endpoint is eradication of the causative virus. Such a situation in which there is a known etiology does not yet pertain in PBC. There are a number of lines of investigation trying to identify potential causes, including the role of the innate immune system, as well as various xenobiotics, bacteria or other environmental pathogens, but these are not yet mature. Drugs that are directed at reducing the presence of autoantibodies, such as rituximab, have led to some benefit,73 but so far there is no cure, and a cure may not be possible until a cause is found. Should the cause of PBC be defined, the choice of endpoints for future clinical trials may be vastly altered.

Another group of agents that have been used are those directed at the immune response. Steroids, cyclosporin, methotrexate, azathioprine, mycophenolate mofetil, and tacrolimus have not demonstrated efficacy. Other therapies or unconventional immunosuppressants are being tested, such as statin therapy.74, 75 New therapies are being directed at nuclear factor-κB using various activators of peroxisome proliferator-activated receptor α including the fibrates with some evidence of benefit.76-78 Another potential target is nuclear receptors, and studies are currently underway with farnesoid X receptor ligands for patients with PBC. The discovery that mutations in the interleukin-12 pathway are associated with PBC79 may lead to immunotherapy directed at this pathway. Finally, fibrosis is a key component of the end stage of the disease, and there is an increased focus on altering fibrosis with various antifibrotic targets identified.80

To date, nothing on the horizon suggests that any of these emerging therapies will have an impact on trial design in the immediate future. It is hoped that new classes of agents, new targets of action, or ideally a cause, will be identified soon to help in the design of studies.


PBC affects fewer than 200,000 people in the United States and has been designated by the FDA as an orphan disease. If new therapeutic trials are targeted toward patients who are incomplete biochemical responders to UDCA, then the pool of potential study subjects drops to fewer than 40,000 nationwide. Thus, it is unavoidable that future clinical trials will require the collaboration of a large number of centers. A consortium of centers focused upon the study of PBC, including expertise in pathology, genetics, biliary physiology, immunology, biostatistics, and liver transplantation, is urgently needed to spearhead and drive clinical trials of high quality. Other important issues such as the definition and management of PBC variants, genetic associations, and pathophysiology could be addressed if expertise in all autoimmune liver diseases, including hepatocellular and biliary autoimmune diseases, were represented within the consortium.


The expert panel that, together with the authors, participated in providing the background consisted of: John M. Vierling (Baylor College of Medicine, Houston, Texas), Cynthia Levy (University of Florida and Malcolm Randall VAMC, Gainesville, Florida), Olivier Chazouilleres (Hopital Saint-Antoine Service d'Hepatologie, Paris, France), Paul Angulo (University of Kentucky, Lexington, Kentucky), W. Ray Kim (Mayo Clinic, Rochester, Minnesota), David Jones (University of Newcastle, Newcastle, United Kingdom), Claudia O. Zein (Case Western Reserve University, Cleveland, Ohio), Terry M. Therneau, Ph.D. (Mayo Clinic, Rochester, Minnesota), John Senior (Center for Drug Evaluation and Research, Food and Drug Administration), David Shapiro (Intercept Pharmaceuticals).