The American Association for the Study of Liver Diseases (AASLD) recommendations on screening for hepatocellular carcinoma (HCC) have generated some controversy. In the March 6 issue of the Annals of Internal Medicine, there is an article entitled “Screening for Liver Cancer: A Rush to Judgment”.1 In it, the investigators criticize the AASLD recommendations on screening for HCC.2, 3 The basis for their criticism is that the only randomized, controlled trial (RCT) that showed a benefit4 to screening was statistically invalid. They imply that there is no reliable information on HCC screening, and that therefore AASLD should not be recommending screening to patients at risk for HCC. However, in addition to the AASLD, other organizations, such as the U.S. Veterans Administration,5 the World Gastroenterology Association,6 the European Association for the Study of the Liver,7 and the liver disease societies of several Asian countries8, 9 consider the Chinese study to be valid and recommend screening for HCC. The National Comprehensive Cancer Network in the United States also recommends HCC screening.10 All these recommendations recognize the presence of a well-defined at-risk population and the availability of effective treatment for early-stage disease.
There have been two RCTs of HCC screening in China.4, 11 The first found no difference between the screened and unscreened groups.11 However, the conduct of this trial made it impossible to show a difference. Resection was to be used as the treatment of early-stage HCC, but a large proportion of those with screen-detected HCC did not undergo resection. Therefore, this trial failed for methodological reasons and not because screening was ineffective. The second trial, also in China,4 used a cluster randomization method, but then analyzed the results on an individual patient basis. This is not statistically correct, because it ignores the correlation among individuals within a cluster and so overstates the precision of the result. The argument by the investigators of the Annals of Internal Medicine article is that if the study had been correctly analyzed, there would be no statistical difference between the screened and unscreened groups; and furthermore, even if the study had shown a difference in mortality, the results would not be applicable in North America, because in North America, the dominant cause of HCC is hepatitis C, not hepatitis B. Therefore, they argue, HCC screening was not worthy of a high level of recommendation.
There are two issues here: The first is the level of evidence, and the second is the recommendation and the strength of the recommendation. At the time of the initial guidelines, the AASLD was using a grading system that had broader categories with some overlap. A grade 1 level of evidence was defined as that based on RCTs and, to some extent, was to encompass the general consensus of experts in the field who treat these types of patients on a day-to-day basis. If we were to apply our current grading system, outlined in more recently published guidelines and adopted from the American College of Cardiology and American Heart Association,12 this might have been graded as a class I, level B, indicating strong evidence and/or general agreement for a given diagnostic evaluation based on data derived from a single randomized trial. One could argue that a single RCT is less than ample evidence on which to base conclusions, but this narrow approach fails to capture the greater depth of information that supported the recommendation. A class II recommendation would indicate conflicting evidence and/or a divergence of opinion about the usefulness and efficacy of a particular diagnostic evaluation. This would be an unfair “downgrading” of the evidence and expert opinion available to us at this time.
In addition to the Chinese RCT, there are weaker lines of evidence that also suggest that screening for HCC is effective in reducing mortality. These include cost-efficacy analyses in populations with hepatitis C and cirrhosis showing that screening is effective in reducing mortality and can do so at an acceptable cost,13-21 and many studies that show stage migration (i.e., diagnosis at an earlier stage of disease) with screening.22-26 Stage migration is not, of itself, evidence of the efficacy of screening. However, it is a necessary condition for screening to be effective. If earlier diagnosis cannot be achieved, screening will not be of benefit. Many studies of screening are subject to lead-time bias. However, there are some studies that correct for lead-time bias,27, 28 and these show that screening prolongs survival. Although efficacy of screening is determined by a decrease in mortality, and not by improved survival, improved survival is a necessary accompaniment of decreased mortality.
Although the Chinese RCT can be criticized, it is the largest study of its kind, and it corroborates many other studies indicating that screening is likely to decrease mortality from HCC. This was the basis for the recommendation in the AASLD guidelines.
One of the challenges in HCC screening is determining when the risk is high enough to warrant screening. The guidelines were careful to indicate what the basis was for making the recommendations about who was at sufficient risk to warrant screening and how that assessment was reached, allowing readers to assess for themselves the strength of the evidence.
The investigators of the Annals of Internal Medicine article express the concern that the “rush to judgment” will make it more difficult to undertake an RCT in North America. This may be so, but this is by no means the only factor making such a trial very difficult to conduct. Previous attempts to establish an RCT of liver cancer screening have failed. Sample-size calculations suggest that the study will require upward of 10,000 subjects. If the population is to be stratified for baseline factors, such as age, underlying liver disease, stage of liver disease, hepatitis B viral load, and so on, the sample size will be even larger. A recent Australian study has demonstrated that when patients are presented with a balanced description of the advantages and disadvantages of screening (a necessary condition for informed consent), more than 90% would refuse participation in a randomized study, preferring to undergo screening.29 There is also the issue of contamination (i.e., members of the control group get screening outside the trial). Because many of these patients have cirrhosis, they will be getting ultrasounds (US) for other reasons, even in the control group. Given the publicity around the study, some patients in the control group might decide to get US done anyway. It will also be very difficult to standardize treatment. All of these factors militate against the successful conclusion of any RCT of screening for HCC.
The gastrointestinal/hepatology community accepts the need for screening, because when at-risk patients do not undergo screening, they present with symptoms late in the course of their HCC and they die from their cancer within a few weeks to months in almost 100% of cases. In contrast, early detection of HCC is associated with a high rate of cure that may, under the best of circumstances, reach 90%.
Liver cancer is also different from many other cancers, in that there is no curative treatment for intermediate- or advanced-stage tumors. Other cancers that have progressed to more advanced stages may respond to adjuvant chemotherapy or radiation. In contrast, for HCC, neither chemotherapy nor radiation for late-stage disease will reduce mortality. However, there are effective treatments for early-stage disease. Resection, transplantation, and local ablation of small lesions are potentially curative therapies and thus highly likely to lead to reduced mortality. Although, on a population basis, it remains to be demonstrated that these treatments will reduce mortality, it is hard to imagine that a 90% cure rate, such as is achievable with radiofrequency ablation (RFA) of lesions <2 cm in diameter,30 a 30% long-term cure rate with resection,31, 32 and a 70%-80% cure rate with transplantation33, 34 will not translate into a decrease in overall HCC-related mortality, compared to an unscreened group.
Discussions around screening rightly take into account that screening is not an entirely benign process, and that some patients who are labeled as having cancer because of a false-positive screening test result will be worse off than if they had not had screening at all. If screening is not effective, then there will be harms from applying screening, including unnecessary liver biopsies and surgeries, and unnecessary psychological harm. On the other hand, one must also consider the harms that may come from not applying screening when screening is indeed effective, even though the benefit has still to be demonstrated. These include the fact that almost all patients will die of their disease.
In a sense, the issue of harms from screening revolves around overdiagnosis. Overdiagnosis likely occurs with most cancer screening programs, but in the case of HCC, the risk of this is felt to be small.
One of the arguments that rage about screening in general, and the interpretation of screening studies, is that by the time that studies are written, funded, rolled out, and completed, years have passed. Over the same period, technology and expertise in using technology will improve, so that results obtained using the older technology described in the trial are no longer applicable. The Chinese study4 used resection as the only treatment. Today, with a high rate of detection of small HCC using US and with effective treatment by RFA, the morbidity related to the treatment arm of the screening process is much lower than in the Chinese study, and the survival of treated small HCCs is also likely to be better. Better imaging technology and better algorithms have led to a reduction in the need for liver biopsy. Thus, the benefits from screening that were described in the Chinese study would probably be even greater if the study were to be conducted today. Screening has also led to better definition of the early stages of HCC, as well as to the identification of genes that are turned on and off during carcinogenesis, knowledge that will help in understanding the carcinogenic process and ultimately lead to better treatment and better control. This must also be considered a benefit of screening. It is therefore our opinion that given the evidence available to date, failing to provide screening will result in greater harm than providing screening.
Sir Austin Bradford Hill, who conducted the first RCT in humans and who enunciated a number of principles of evidence, had this to say35: “All scientific work is incomplete—whether it be observational or experimental. All scientific work is liable to be upset by advancing knowledge. That does not confer upon us a freedom to ignore the knowledge we already have, or to postpone the action it appears to demand at a given time.”
The knowledge we already have appears to demand that HCC screening be instituted.
More recently, Ken Dryden, a former member of the Canadian Parliament, said, in another context36: “As a society we seldom have the luxury of waiting for science's standard of proof—that's how thousands of asbestos workers and millions of smokers died. We need to take the best science we have, generate more, apply … common sense, and decide.” This statement could have been made specifically about HCC screening.
We all recognize that an RCT of HCC screening is desirable and represents the ideal. We also believe that such a trial is unlikely to be successfully undertaken. We therefore have to deal with the evidence that is currently available, consider the harms, and decide what to do. Until further information suggests otherwise, the recommendation for HCC screening remains in place.