Ultrasound screening and risk factors for death from hepatocellular carcinoma in a high risk group in Taiwan

Although previous studies have demonstrated the ability of ultrasonography (US) screening to detect small asymptomatic hepatocellular carcinoma (HCC), the efficacy of US screening in reducing deaths from HCC remains unresolved. A 2‐stage screening program was designed to identify a high risk group in 7 townships in Taiwan by 6 markers of risk for HCC. Repeated US screening was then offered to those with at least 1 positive result for the 6 markers, at inter‐screening intervals of 3 to 6 months for those with liver cirrhosis or other chronic liver diseases and annually for the remaining subjects with normal US findings. The 4,843 subjects in this cohort were followed up for an average of 7 years. We compared 4,385 attenders with 458 non‐attenders, in conjunction with baseline assessment for self‐selection bias. In addition, we assessed the effects of baseline variables on the risks of HCC incidence, HCC mortality and liver cirrhosis incidence. The difference in mortality between attenders and non‐attenders was then re‐estimated adjusting for significant predictors of cirrhosis, HCC incidence and HCC death as a further guard against baseline differences in risk profile between attenders and non‐attenders. In this high risk group, mortality from HCC was lower by 24% (95% CI: −52 to 62%) in the attenders than in the non‐attenders. After adjustment for sensitivity, the mean sojourn time (MST) was 1.57 years (95% CI: 0.94–4.68) for subjects with liver cirrhosis and 2.66 years (95% CI: 1.68–6.37) for non‐cirrhotic patients. Significant increases in risk of HCC incidence were associated with increasing age, male gender, positive hepatitis B surface antigen (HbsAg), positive hepatitis C antibody (anti‐HCV), high levels of alanine transaminase (ALT) and alpha‐fetoprotein (AFP) and a family history of HCC. 
Significantly increased risks of liver cirrhosis were associated with increasing age, HbsAg, high levels of ALT and of AFP. Significant or borderline significant increases in risk of HCC death were associated with increasing age, male gender, HbsAg, high levels of AST and AFP. Adjusted for the significant variables, the mortality was lower by 41% (95% CI: −20 to 71%, p = 0.1446) in the attenders compared to the non‐attenders. The present study provides suggestive evidence on the efficacy of US screening in a selective high risk group in an endemic area of hepatitis B. A randomized controlled trial would yield definitive evidence. Within the protocol of such a trial, a shorter interscreening interval for patients with liver cirrhosis is suggested. © 2001 Wiley‐Liss, Inc.

Although screening for hepatocellular carcinoma (HCC) with ultrasonography (US) is widely believed to be an effective form of secondary prevention, several issues remain unclear. Firstly, there is a lack of empirical evidence showing the efficacy of US screening in reducing deaths from HCC. Several prospective studies have shown that repeated real-time US, alone or in combination with alpha-fetoprotein (AFP), was able to detect small HCCs. 1–8 For instance, Oka et al., 3 in a study of early detection of HCC in patients with liver cirrhosis, found that 65% of HCCs identified by US were tumors less than 2 cm in diameter. Solmi et al. 4 also showed that all HCCs were diagnosed by US as small hepatocellular carcinomas of 3 cm or less. Although these findings supported the use of US and AFP for detecting asymptomatic or small HCCs, there was still uncertainty about whether screening could reduce deaths from HCC. Two studies 9,10 from Italy showed that the mortality from HCC in the resected group, whose small tumors were detected by US and AFP, was similar to that in the unresected group. This led to the tentative conclusion that US screening failed to bring down the mortality from HCC. Caution should be exercised, however, because both were clinically based studies targeted at subjects with liver cirrhosis: such subjects were more likely than those without liver cirrhosis to go unresected, and the mortality for this group was correspondingly high. Evaluation of the benefit of screening is also necessary in subjects representative of a larger population of hepatitis B carriers rather than only liver cirrhosis patients. There have been several studies 11–14 focusing on screening in chronic carriers of hepatitis B virus (HBV) or hepatitis C virus (HCV) using US, AFP or both. Some studies 6,14 found a benefit of screening with US in terms of detecting small HCCs, as in the above studies based on liver cirrhosis patients. 
Other studies 11,13 demonstrated that chronic hepatitis patients constitute another high-risk group for HCC even in an area non-endemic for HBV. No studies, however, have so far put emphasis on elucidating whether screening for HCC, in particular among non-cirrhotic patients, with US could significantly reduce deaths from HCC. It would be very informative to estimate the efficacy of US screening in a population with a large proportion of non-cirrhotic subjects.
The greatest difficulty in assessing the efficacy of screening for these subjects is the lack of control or non-screened group via a randomized population-based screening trial. Although the best way to evaluate the efficacy of HCC screening with US is a population-based randomized trial, there are difficulties of cost, time, compliance and ethics. Once the high-risk group has been identified, it is difficult and arguably unethical to persuade some members of the group to be randomized to no preventive measure at all. One possible solution is to compare those who comply with invitation to US screening with those who do not, incorporating checks and if possible adjustments for self-selection bias.
Another neglected area is the natural history of tumour progression in HCC. Whether screening for HCC with US is potentially beneficial is highly dependent on the natural history of the disease. For example, the optimal inter-screening interval will depend on the rate of tumour progression from the preclinical screen-detectable phase (PCDP) to the clinical phase (this duration is usually called the sojourn time) in this group. The shorter the sojourn time, the more frequently screening is required.
To understand the benefit of US screening it also seems necessary to examine whether the survival for screen-detected cases is more favorable than that for clinically-detected cases, including interval cancers and refusers. To the best of our knowledge, no studies have so far reported survival figures by detection mode.
A question of particular importance in Taiwan is whether selective or mass screening with US is more appropriate. Mass screening might be preferred in that the incidence of HCC is the highest among all cancers. There are, however, insufficient resources for mass screening with US applied to the general population. If one could define a high risk group to whom US screening could be applied selectively, this might be more cost-effective than mass screening in the general population. Previous research 15 has demonstrated a constellation of risk factors accounting for the occurrence of HCC including HBV, 16–19 HCV, 20–24 family history of HCC 25 and elevated AFP. This indicates that one could first ascertain the high risk group for HCC according to these risk factors and then apply US screening only to this high risk group.
Thus, using a 2-stage screening project from Taiwan, the aims of the present study were to:
1. Assess the likely efficacy of US screening for the high risk group identified by 6 markers at the first stage, by comparing the cumulative mortality of the attenders with that of the non-attenders of the high-risk group, stratified by liver cirrhosis or other chronic liver diseases;
2. Examine whether the cumulative incidence rate by liver cirrhosis or other chronic liver diseases is consistent with the cumulative mortality;
3. Assess the major risk factors for HCC incidence, HCC mortality and cirrhosis incidence and adjust the comparison in (1) for these;
4. Estimate the mean sojourn time (MST) and sensitivity simultaneously for this high risk group in those with and without liver cirrhosis;
5. Make a tentative suggestion for randomized evaluation of US screening in this high risk group based on (1)-(4).

MATERIAL AND METHODS
Data used in our study were derived from a 2-stage screening program that has been conducted since 1991 in 7 townships in Taiwan. A total of 16,652 subjects were first screened for the 6 markers at the first stage. Repeated US screening was offered to subjects with at least 1 positive result from these 6 markers. The markers were positive hepatitis B surface antigen (HbsAg), positive antibody for hepatitis C (anti-HCV), alpha-fetoprotein (AFP) ≥20 ng/mL, aspartate transaminase (AST) ≥40 IU/L, alanine transaminase (ALT) ≥45 IU/L and family history of HCC. The abdominal US screening applied to subjects with a positive marker result was conducted by experienced gastroenterologists. After US screening, subjects with positive US results were referred to medical centres to receive confirmatory diagnosis based on liver needle biopsy, angiography and AFP. Apart from biopsy-confirmed cases, subjects whose AFP level was above 400 ng/mL and who had a positive angiography result were also diagnosed as having HCC. The screening frequency with US was determined by baseline disease status. Those who were diagnosed with hemangioma, pseudotumor, AFP greater than 20 ng/mL or liver cirrhosis were screened every 3 months. The inter-screening interval for subjects with early liver cirrhosis was 6 months. For subjects without these liver diseases a 1-year screening interval was applied.
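The two-stage rule described above can be written down compactly. The following is only an illustrative sketch of the stated thresholds and intervals, not the program's actual software: the names `Markers`, `is_high_risk` and `screening_interval_months`, and the status labels, are hypothetical. Baseline categories whose interval is not spelled out in this paragraph are deliberately rejected rather than guessed.

```python
from dataclasses import dataclass


@dataclass
class Markers:
    """First-stage results for one subject (field names are hypothetical)."""
    hbsag_positive: bool
    anti_hcv_positive: bool
    afp_ng_ml: float
    ast_iu_l: float
    alt_iu_l: float
    family_history_hcc: bool


def is_high_risk(m: Markers) -> bool:
    """True if at least 1 of the 6 first-stage markers is positive."""
    return any([
        m.hbsag_positive,
        m.anti_hcv_positive,
        m.afp_ng_ml >= 20,   # AFP >= 20 ng/mL
        m.ast_iu_l >= 40,    # AST >= 40 IU/L
        m.alt_iu_l >= 45,    # ALT >= 45 IU/L
        m.family_history_hcc,
    ])


# Inter-screening interval in months by baseline status, as stated in the text.
# The status labels themselves are assumptions made for this sketch.
_INTERVALS = {
    "hemangioma": 3,
    "pseudotumor": 3,
    "afp_above_20": 3,
    "liver_cirrhosis": 3,
    "early_liver_cirrhosis": 6,
    "normal": 12,
}


def screening_interval_months(baseline_status: str) -> int:
    """Return the US screening interval for a baseline status label."""
    if baseline_status not in _INTERVALS:
        # The paragraph does not give an interval for other categories.
        raise ValueError(f"interval not specified for {baseline_status!r}")
    return _INTERVALS[baseline_status]
```

For example, a subject whose only abnormal finding is an AFP of 25 ng/mL would be offered US screening, and a subject with early liver cirrhosis at baseline would be re-screened every 6 months.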
A total of 4,843 subjects identified from the first stage with positive results formed the main data set. Of the 4,843 subjects, 458 refused to attend US screening (Table I). The entire cohort was linked with the national death certification data to ascertain causes of death. Interval cancers, including post-screening cancers and cancers diagnosed between screens, were identified during the follow-up period. HCCs detected by US screening were defined as screen-detected cases, which were further divided into 2 groups, the prevalent screen (first screen) and the subsequent screens. Follow-up until Dec 31, 1998 was 7 years on average.
Subjects were classified with respect to disease status at baseline. Subjects with liver cirrhosis were denoted as Group I. Because cases of chronic hepatitis, hemangioma and pseudotumor were sparse, they were grouped together as Group II. Subjects with normal findings at baseline by US were Group III.

Statistical analysis
Calculation of cumulative mortality and incidence was based on the density method developed by Kleinbaum et al. 26 Possible confounders of attender/non-attender status (the 6 baseline criteria, plus age and gender) were assessed for their effect on risk of HCC incidence and risk of cirrhosis incidence by logistic regression. The Cox regression model was fitted to calculate the relative mortality for the screened versus the non-screened, with and without adjustment for the potential confounders, the adjusted analysis serving to address potential self-selection bias. To estimate the progression rate from the PCDP to the clinical phase, a Markov process model 27,28 was employed to calculate the mean sojourn time (MST) for cirrhotic and non-cirrhotic patients respectively. The Markov process model assumes that subjects move between states (in this case, 3 states: no HCC, preclinical HCC, clinical HCC) with exponential distributions of times spent in each state. The MST and sensitivity were estimated simultaneously, i.e., each adjusted for the other. Mortality was compared between those who attended for screening and those who did not. To further illustrate the possible extent of self-selection or healthy volunteer bias, we compared baseline states by the 6 markers defined above between the attenders and non-attenders.

RESULTS
Table I shows numbers screened and cases of HCC by detection mode and Groups I, II and III. Seven-year cumulative incidence of HCC in Group I was approximately 20%. The corresponding figures for Group II and Group III were 1.7% and 0.6% respectively (Fig. 1). The incidence rate was 22% higher in the non-attenders than in those who attended for screening, but this was not significant (p = 0.50). Figure 2 shows cumulative mortality from HCC by time since entry into the study for the attenders and non-attenders. There were 68 HCC deaths in the 4,385 attenders and 9 deaths in the 458 non-attenders. This suggests that the mortality was lower by 24% (95% CI: −52 to 62%) in the attenders compared to the non-attenders. Stratified by Group I, II and III, Group I had the highest cumulative mortality, followed by Group II and Group III (Fig. 3). This was consistent with the above cumulative incidence findings.
The incidence of interval cancers as a percentage of expected incidence (I/E ratio) at 6 months and 1 year since the last negative screen was 20% and 29% respectively in subjects with liver cirrhosis. These are shown in Table II with the corresponding crude sensitivity estimates. Specificity of the ultrasound examination was 98% (82 false positives out of 4,333 examinations in subjects whose final diagnosis was negative). This translates to a positive predictive value of a positive ultrasound of 38% (51 out of 133 positive ultrasounds were eventually diagnosed as HCC). Table III shows the MST and sensitivity mutually adjusted for subjects with liver cirrhosis and non-cirrhotic patients. The MST estimates were 1.57 (95% CI: 0.94–4.68) years and 2.66 (95% CI: 1.68–6.37) years respectively. This suggests that subjects with liver cirrhosis have a shorter sojourn time, i.e., more rapid progression of disease, than those without liver cirrhosis. This accounts for why the above I/E ratio for Group I was higher than that for Groups II and III.
Table IV shows the baseline status for attenders and non-attenders. Note that in terms of the 6 criteria for entry, the proportions in the non-attenders were very similar to those in the attenders.  15, p = 0.0489). When adjusted for these 5 significant factors, the relative risk associated with attending screening was 0.59 (95% CI: 0.29–1.20, p = 0.14). This suggests that the mortality was lower by 41% (95% CI: −20 to 71%) in the attenders compared to the non-attenders.
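As a rough arithmetic check on the counts reported above, the sketch below computes a crude cumulative risk ratio for HCC death (attenders versus non-attenders) with a log-scale Wald interval, together with the specificity and positive predictive value of US. This is not the density-method or Cox estimate used in the study, so the crude figure (about 21% lower mortality) should be expected to differ somewhat from the published 24%. The function name `risk_ratio_ci` is hypothetical.

```python
import math


def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Crude risk ratio (group a vs group b) with a log-scale Wald interval."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions.
    se = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# 68 HCC deaths among 4,385 attenders; 9 among 458 non-attenders.
rr, lower, upper = risk_ratio_ci(68, 4385, 9, 458)
print(f"crude RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")

# Screening-test measures from the reported counts.
specificity = 1 - 82 / 4333   # 82 false positives in 4,333 negative-final-diagnosis exams
ppv = 51 / 133                # 51 HCCs among 133 positive ultrasounds
print(f"specificity = {specificity:.2f}, PPV = {ppv:.2f}")
```

The interval spans 1, in line with the non-significant crude difference reported in the text.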
The rate of death from HCC was much faster in the non-attenders than in the attenders (Fig. 2). The lowest 3-year case fatality rates were observed for screen-detected cases (31% survival), followed by the interval cancer cases (23% survival). The highest case fatality rates were observed in the non-attenders (18% survival). The difference between 3-year survival of screen-detected cases and clinically-detected cases (interval cancers plus non-attenders) reached statistical significance (log-rank χ²(1) = 12.91, p = 0.0016).

DISCUSSION
The major finding in our study was that the mortality was lower by 41% in the attenders compared to the non-attenders. A significant difference in case fatality between screen-detected and clinically-detected cases also suggests a benefit of US screening arising from early detection of small HCC. This suggests that US screening for this high risk group might be worthwhile. Because the present study was not a population-based randomized trial, however, the evaluation of the efficacy of US screening might be subject to selection bias. The multivariate adjustment, nonetheless, tended to increase the effect rather than attenuate it, suggesting that if such a bias is occurring here, it is not reflected in the high risk criteria. In addition, distributions of HbsAg, family history of HCC, AFP and anti-HCV in the refuser group were similar to those in the screened group. The proportion with abnormal AFP, the strongest risk factor for HCC incidence, liver cirrhosis and HCC death, was lower in the refuser than in the screened group (see Table IV). It is therefore unlikely that selective factors are entirely responsible for the large difference in death rates. There is likely to be some selection bias by other factors, as those who refuse interventions are often observed to be substantially different from those who accept, but there is no evidence of such from the factors investigated here. The use of non-attenders for screening as a comparison group is not without precedent. 29,30 It should be acknowledged, however, that the difference in mortality in the screened group compared to the non-attenders, although substantial, was not statistically significant and the possibility of bias remains. The high early mortality in the refuser group may reflect presence of HCC at baseline. This indicates that further follow-up of this cohort and new research on other groups, ideally with a randomized comparison group, is necessary.
Another strategy would be to use death registry linkage to compare the deaths in the cohort actually tested for risk status with the mortality rate in the region as a whole, to assess the public health impact of the strategy of this 2-stage screening. The assessment was performed as follows. There were 16,652 attenders in the 2-stage screening from 7 townships. Application of the overall mortality from HCC, ranging from 34.3 to 94.7 per 10,000, yields a total of 103 expected deaths from HCC. There were 83 observed deaths (77 from the second stage and 6 from the first stage). Calculation of the standardized mortality ratio (SMR) gives 0.81 (95% CI: 0.60–1.08, p = 0.15), suggesting a 19% reduction of mortality due to the 2-stage screening. Note that this includes the refusers. To assess the efficacy of US screening for the high-risk group, we re-calculated the SMR excluding subjects who refused US. This yields 0.74 (95% CI: 0.55–1.00, p = 0.05), indicating a 26% reduction of mortality due to US screening in a high-risk group. Although this may be affected by selection bias, this result is a "per protocol" estimate that is consistent with the "intention to treat" estimate of 0.81.
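The SMR point estimate above follows directly from the observed and expected deaths (83/103). The sketch below reproduces it and attaches an interval using Byar's approximation to the Poisson limits. The interval method used in the study is not stated, so this choice is an assumption and the resulting limits (roughly 0.64 to 1.00) need not match the published 0.60–1.08. The function name `smr_with_byar_ci` is hypothetical.

```python
import math


def smr_with_byar_ci(observed, expected, z=1.96):
    """SMR = observed / expected, with Byar's approximate Poisson limits."""
    smr = observed / expected
    lower = (observed
             * (1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))) ** 3
             / expected)
    o1 = observed + 1
    upper = (o1
             * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3
             / expected)
    return smr, lower, upper


# 83 observed HCC deaths against 103 expected among the 16,652 attenders.
smr, smr_lower, smr_upper = smr_with_byar_ci(83, 103)
print(f"SMR = {smr:.2f} ({smr_lower:.2f} to {smr_upper:.2f})")
```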
It might be argued that the adjustment of the association of screening with HCC death for the risk criteria might constitute overadjustment. Because these are entry criteria, every subject is positive for at least 1 criterion. One may reasonably speculate, however, that the true effect lies between the unadjusted 24% reduction and the adjusted 41% reduction.
It is interesting to note the effects on risk of the baseline factors. In particular, the strongest predictor of HCC incidence, HCC death and cirrhosis was AFP, followed by HbsAg. The increased risk of HCC death in males is of borderline statistical significance and may be a chance finding from multiple testing.

This was the first time, to our knowledge, that the mean sojourn time (MST) of HCC has been estimated from empirical data. The estimates were in agreement with previous findings that liver cirrhosis patients are at higher risk of HCC and that progression in liver cirrhosis patients is faster than in non-cirrhotic patients. The sensitivity for liver cirrhosis subjects at 6 months since the last negative screen was 88%, equal to that for non-cirrhotic subjects. The short MST accounts for why a shorter inter-screening interval would be required for liver cirrhosis patients. The results suggest that in future evaluation of HCC screening an inter-screening interval of 3-6 months for subjects with liver diseases and 1 year for patients with at least 1 of the 6 positive criteria as used in our study might be justified.
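The link between MST and screening interval can be illustrated with the exponential sojourn-time assumption of the Markov model. Under that assumption, a tumour that is in the preclinical phase at the time of a screen surfaces clinically before the next screen with probability 1 − exp(−interval/MST). The sketch below evaluates this for the two reported MST point estimates; it ignores sensitivity and the uncertainty in the MST, and the function name is hypothetical.

```python
import math


def prob_clinical_before_next_screen(interval_years, mst_years):
    """P(preclinical tumour becomes clinical within one interval),
    assuming an exponential sojourn time with mean mst_years."""
    return 1.0 - math.exp(-interval_years / mst_years)


# Reported MST point estimates (years).
MST = {"cirrhotic": 1.57, "non-cirrhotic": 2.66}

for group, mst in MST.items():
    for interval in (0.25, 0.5, 1.0):
        p = prob_clinical_before_next_screen(interval, mst)
        print(f"{group:>13}, interval {interval:4.2f} y: {p:.1%}")
```

With these inputs, about 27% of preclinical tumours in cirrhotic subjects would progress within 6 months, against about 31% within 1 year in non-cirrhotic subjects, which is the sense in which a shorter interval for cirrhotic subjects gives a broadly comparable margin.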
The results here, including the finding from proportional hazards regression that the mortality is lower by 41% in the attenders compared to the non-attenders (albeit a non-significant reduction), suggest that there is some benefit associated with US screening in this high-risk population. The results do not establish such a benefit unequivocally. The implications of our study are that a randomized trial of screening is indicated urgently. Within a trial, from the estimation of MST, a 3- to 6-month inter-screening interval might be appropriate for subjects with liver cirrhosis. For subjects with at least 1 of the 6 criteria positive, annual screening seems to be justified. It should be noted that a randomized trial of this screening strategy would of necessity be a trial of the 2-stage procedure at the population level, with randomization to either a control group receiving usual care or a study group receiving the serological and risk assessment, with regular ultrasound for those fulfilling 1 of the high risk criteria. One could not expect informed consent to randomization after high risk status has been established. On the basis of the "intention to treat" estimate of a 19% mortality reduction and a similar follow-up period, this would mean randomizing 124,000 subjects in a 1:1 ratio for 80% power. Of these, 62,000 would be randomized to the study group and an anticipated 18,000 would meet the criteria for regular ultrasound. Use of the other estimates of benefit obtained above would imply smaller study populations ranging from 32,000 to 78,000. Such a trial would provide important information not only for high risk areas such as eastern Asia, but also for high risk populations in countries that have a low incidence of liver cancer. For example, cohorts of hepatitis infected subjects are being identified in UK prisons. 31
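The order of magnitude of the trial size quoted above can be checked with a standard events-based calculation. The sketch below uses Schoenfeld's approximation for the number of deaths needed in a 1:1 comparison and converts it to subjects using an assumed control-arm risk of 103/16,652 over the follow-up period (the expected deaths from the SMR analysis). The method behind the published figure is not stated, so both the formula and the baseline risk are assumptions; with them the result is roughly 126,000 subjects, the same order as the 124,000 quoted.

```python
import math
from statistics import NormalDist


def trial_size(rate_ratio, control_risk, alpha=0.05, power=0.80):
    """Deaths and total subjects needed for a 1:1 trial (two-sided alpha),
    using Schoenfeld's approximation for the required number of events."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    deaths = 4 * (z_alpha + z_beta) ** 2 / math.log(rate_ratio) ** 2
    # Average death risk across both arms over the follow-up period.
    mean_risk = control_risk * (1 + rate_ratio) / 2
    return deaths, deaths / mean_risk


# Assumed inputs: 19% mortality reduction; baseline risk from 103 expected
# deaths among 16,652 subjects over a similar follow-up period.
deaths_needed, total_subjects = trial_size(0.81, 103 / 16652)
print(f"deaths needed: {deaths_needed:.0f}; subjects: {total_subjects:,.0f}")
```

Larger assumed reductions shrink the requirement quickly, because the number of deaths needed scales with the inverse square of the log rate ratio.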