Potential conflict of interest: Nothing to report.
Early detection of HCC increases the potential for curative treatment and improves survival. To facilitate early detection of HCC, this study sought to identify novel diagnostic markers of HCC using surface-enhanced laser desorption ionization time-of-flight mass spectrometry (SELDI-TOF/MS) ProteinChip technology. Serum samples were obtained from 153 patients with or without HCC, all of whom had been diagnosed with HCV-associated chronic liver disease. To identify proteins associated with HCC, serum samples were analyzed using SELDI-TOF/MS. We constructed an initial decision tree for the correct diagnosis of HCC using serum samples from patients with (n = 35) and without (n = 44) HCC. Six protein peaks were selected to construct a decision tree using this first group. The efficacy of the decision tree was then assessed using a second group of patients with (n = 29) and without (n = 33) HCC. The sensitivity and specificity of this decision tree for the diagnosis of HCC were 83% and 76%, respectively. For a third group, we analyzed sera from seven patients with HCC obtained before the diagnosis of HCC by ultrasonography (US) and from five patients free of HCC for the past 3 years. Use of these diagnostic markers predicted the diagnosis of HCC in six of these seven patients before HCC was clinically apparent without any false positives. Conclusion: Serum profiling using the SELDI ProteinChip system is useful for the early detection and prediction of HCC in patients with chronic HCV infection. (HEPATOLOGY 2007;45:948–956.)
Approximately 170 million people worldwide are infected with HCV, which when persistent can progress to HCC. The incidence of HCC is rising; in the United States over the past 2 decades, age-specific incidence has shifted toward younger people.1 IFN or combined IFN and ribavirin, which are currently the only effective treatments for chronic hepatitis C, reduce the occurrence of HCC.2, 3 Some patients, however, do not receive IFN treatment or fail to clear HCV even with IFN treatment. In addition, a subset of individuals remain unaware that they are infected with HCV; in these patients, HCC may present only in the advanced stage. The prognosis of patients presenting with symptoms related to HCC is extremely poor. In contrast, early detection of HCC before the onset of clinical symptoms can lead to curative treatment, significantly improving prognosis.
Several methods developed for the diagnosis of HCC, including evaluation of serum markers, ultrasonography (US), computed tomography (CT), and magnetic resonance imaging, have been tested clinically. Alpha-fetoprotein (AFP) and des-gamma carboxy prothrombin (DCP), serum proteins that are elevated in HCC, have been the most widely used markers. Although routine screening offers the best chance for early tumor detection and improved survival, the reported sensitivities and specificities of elevated serum AFP and DCP levels vary significantly.4–9 In addition, AFP levels are elevated in only 30% to 40% of patients with HCC, particularly early in the disease process.6 Elevated AFP levels are also seen in patients with noncancerous conditions, such as cirrhosis or exacerbations of chronic hepatitis, which confounds the screening results. Marrero et al.9 reported that DCP levels were more sensitive and specific than AFP testing for differentiating HCC from nonmalignant chronic liver disease. The usefulness of DCP for the detection of early HCC is limited, however. Wang et al.8 reported that the number of patients with small HCC (less than 2 cm) demonstrating elevations in DCP was low (56.5%). AFP-L3, the lectin lens culinaris agglutinin–bound fraction and one of the three AFP glycoforms, is the major glycoform of AFP elevated in the serum of HCC patients. At a cutoff level of 15% of total AFP, the reported sensitivities of AFP-L3 as a method of detecting HCC range from 75% to 96.9% with specificities of 90% to 92.0%.10, 11 Because the high percentage of AFP-L3 observed in HCC is closely related to poor differentiation and biologically malignant characteristics, such as portal vein invasion, of neoplastic cells,11, 12 how useful this test is for the early detection of HCC is unclear. In addition, the diagnosis of small mass lesions using US or CT is relatively inaccurate. Thus, additional biochemical markers are necessary for specific detection of early HCC.
The development of proteomic array technology for serum profiling, in which a ProteinChip Array is coupled with surface-enhanced laser desorption ionization time-of-flight mass spectrometry (SELDI-TOF/MS; Ciphergen Biosystems Inc., Fremont, CA), has created a powerful tool for the discovery of new biomarkers. This technology has been successfully applied using samples from patients with prostate, ovarian, and gastric cancers. The great advantages of this method are speed, high-throughput capability, and the requirement of only a small amount of sample. Although serum AFP levels and US are the most common examination methods used for HCC surveillance, the classification tree algorithm detailed in this study provided a more accurate classification than these examination methods alone.13, 14
This study sought to assess and compare protein expression profiles of sera from patients with or without HCC on a background of chronic liver disease attributable to HCV infection. We assessed the ability of SELDI-TOF/MS ProteinChip technology to identify serum markers that could enable early HCC diagnosis.
AFP, alpha-fetoprotein; AUC, area under the curve; CT, computed tomography; DCP, des-gamma carboxy prothrombin; m/z, mass-to-charge ratio; ROC, receiver operating characteristics; SELDI-TOF/MS, surface-enhanced laser desorption ionization time-of-flight mass spectrometry; US, ultrasonography.
Patients and Methods
A total of 153 male patients with chronic liver disease attributable to HCV infection were selected; serum samples were collected by the Faculty of Medicine of the University of Miyazaki (Miyazaki, Japan). All patients were negative for hepatitis B surface antigen. Seventy-seven of the patients were negative for HCC, which was confirmed by US or CT of the abdomen. Samples from 64 patients with HCC were obtained before treatment. Patients were randomly divided into two groups; the first analysis group was composed of 35 patients with HCC and 44 without, whereas the second analysis group comprised 29 patients with HCC and 33 without. The clinical characteristics of the first and second analysis groups were not significantly different except for the average age (Table 1). In conjunction with an ongoing cohort study, we also obtained prediagnostic sera from seven patients determined to have HCC within 1 year of US screening and five patients who have remained free of HCC for the past 3 years.15 These subjects constitute the third analysis group (Table 2). Twenty-six healthy volunteers without either liver neoplasia or HCV infection served as negative controls. After freezing and thawing once, all samples were separated into 20- to 30-μl aliquots and refrozen at −80°C until analysis.
Table 1. Patient Characteristics in First and Second Analysis Groups
1st Analysis Group
NOTE. Data are shown as means ± SD. Gender: male. Statistical differences were determined by the Mann-Whitney U test. Values of P < 0.05 were considered statistically significant. NS indicates not significant. §§§Although age differed between the 1st and 2nd analysis groups, none of the other factors described differed.
For analysis, we used ProteinChip Arrays (CM10) with anionic surface chemistry. CM10 ProteinChip Arrays incorporate a carboxylate group that acts as a weak cation exchanger. Chips were rinsed with ultra-pure water and placed in a bioprocessor (Ciphergen Biosystems, Inc.), a device that holds 12 chips and allows the application of larger volumes of serum to each chip array. Within the bioprocessor, the chips were washed twice with shaking on a platform shaker at a speed of 300 rpm for 5 minutes in 150 μl binding/washing buffer (50 mM sodium acetate, pH 4.5) per well. Five-microliter serum samples were denatured in 45 μl urea buffer (7 M urea, 2 M thiourea, 4% CHAPS, 1% dithiothreitol, and 2% ampholytes), then diluted 1:9 in binding/washing buffer. After washing the chips extensively in binding/washing buffer, 100 μl of the denatured, diluted serum was applied to each chip spot. The bioprocessor was then sealed and shaken on a platform shaker for 40 minutes. Chips were then removed from the bioprocessor. After washing 3 times in binding/washing buffer, we rinsed the chips once in water. Each spot was then treated twice with 0.5 μl saturated sinapinic acid (SPA) (Nacalai Tesque Inc, Kyoto, Japan) and allowed to air-dry.
Arrays were analyzed using a ProteinChip Reader (ProteinChip Biology System II, Ciphergen Biosystems Inc.). Time-of-flight spectra were generated by laser shots collected in positive mode. Laser intensity ranged from 225 to 240, with a detector sensitivity of 6. An average of 65 laser shots per spectrum was collected. For mass accuracy calibration according to the manufacturer's instructions, 500 nl of a mixture of mass standard calibration proteins (All-in-one Peptide Standard; Ciphergen Biosystems) was applied to a single spot of the normal phase (NP20) chip array, followed by two applications of 1.0 μl saturated SPA. The mass-to-charge ratio (m/z) of each of the proteins captured on the array surface was determined according to externally calibrated standards.
Peak Detection, Data Analysis, and Decision Tree Classification.
Peak detection was performed using Ciphergen ProteinChip Software, version 3.0.2 (Ciphergen Biosystems). Spectra between 1300 and 150,000 m/z were selected for analysis. Smaller masses were not analyzed, because these were determined to be artifacts of energy absorbing molecules. Spectra were normalized to total ion current intensity. In the preliminary examination, we observed significant noise in spectra with ranges less than 3000 m/z. In addition, no differences were apparent in the peaks of spectra at values greater than 10,500 m/z between 4 serum samples from patients with HCC and 4 samples from patients without HCC. Therefore, after baseline subtraction, we performed automatic peak detection in the optimized range of 3000 to 10,500 m/z, using peak auto-detection set to cluster, a first-pass signal/noise ratio of 5, a minimal peak threshold of 20% for all spectra, and a cluster mass window of 0.3% mass.
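The preprocessing steps described above (normalization to total ion current, restriction to the 3000 to 10,500 m/z range, and grouping of peaks within a 0.3% mass window) can be illustrated with a minimal sketch. This is not the Ciphergen ProteinChip Software algorithm, whose internals are not described here; the function names and the simple chain-style clustering rule are assumptions made for illustration only.

```python
from typing import List

def normalize_to_tic(intensities: List[float]) -> List[float]:
    """Scale a spectrum so that its intensities sum to 1 (total ion current)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum has no signal to normalize")
    return [x / total for x in intensities]

def in_analysis_range(mz: float, low: float = 3000.0, high: float = 10500.0) -> bool:
    """Keep only peaks inside the optimized m/z range used in the study."""
    return low <= mz <= high

def cluster_peaks(mz_values: List[float], window: float = 0.003) -> List[List[float]]:
    """Group sorted m/z values whose relative gap to the previous peak
    is within `window` (0.003 = 0.3% of mass). Hypothetical clustering rule."""
    clusters: List[List[float]] = []
    for mz in sorted(mz_values):
        if clusters and (mz - clusters[-1][-1]) / clusters[-1][-1] <= window:
            clusters[-1].append(mz)
        else:
            clusters.append([mz])
    return clusters

# Illustrative m/z values only; peaks from different spectra that fall
# within 0.3% of each other end up in the same cluster.
peaks = [mz for mz in [2500.0, 4067.0, 4070.0, 4435.0, 12000.0] if in_analysis_range(mz)]
print(cluster_peaks(peaks))
```

Peaks outside the analysis range are dropped before clustering, mirroring the order of operations in the text.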
Based on the peak intensities of the 55 signal clusters obtained, a decision tree was constructed from the first analysis group. For each sample, the intensity values for each peak within the 3000-10,500 m/z range were input into Biomarker Patterns Software (Ciphergen Biosystems) and classified according to the tree analysis described.13, 16 Decision trees classify spectrum patterns through sequential questioning, in which the next question asked depends on the answer to the current question.17 With a decision tree, classification of patterns begins at the root node, following the appropriate links based on the answers obtained to the questions posed at each node.
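The sequential questioning performed by a decision tree can be sketched as follows. The tree below is purely hypothetical: the m/z values, intensity thresholds, and branch layout are invented for illustration and are not the classifiers or cutoffs produced by Biomarker Patterns Software in this study.

```python
from dataclasses import dataclass
from typing import Dict, Union

@dataclass
class Node:
    mz: float                    # peak (m/z) queried at this node
    threshold: float             # normalized intensity cutoff
    low: Union["Node", str]      # followed when intensity <= threshold
    high: Union["Node", str]     # followed when intensity > threshold

def classify(node: Union[Node, str], peaks: Dict[float, float]) -> str:
    """Start at the root and follow one branch per node until a leaf label is reached."""
    while isinstance(node, Node):
        node = node.low if peaks[node.mz] <= node.threshold else node.high
    return node

# Hypothetical two-level tree; each node queries exactly one mass.
tree = Node(mz=4000.0, threshold=2.0,
            low=Node(mz=5000.0, threshold=1.5, low="non-HCC", high="HCC"),
            high="HCC")

print(classify(tree, {4000.0: 3.0, 5000.0: 0.0}))
print(classify(tree, {4000.0: 1.0, 5000.0: 1.0}))
```

Each node asks a single question about one peak, so the path taken through the tree depends on the answers obtained at the preceding nodes.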
Reproducibility is critical for reliable disease diagnosis and early detection. We examined the reproducibility of our assay system using pooled normal sera from 2 individuals.13 Four protein peaks randomly selected over the course of the study were used to calculate the coefficient of variation (CV) as described.18 We then determined the reproducibility of the SELDI spectra, both within and between arrays (intra-assay and interassay, respectively). The intra-assay (spot-to-spot) CV was 10.2% for peak intensity and 0.25% for mass accuracy. The interassay (chip-to-chip) CV was 15.9% for peak intensity and 0.67% for mass accuracy. We also observed minimal variation of day-to-day instrumentation (data not shown).
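For reference, the CV is the standard deviation expressed as a percentage of the mean. A minimal sketch is given below; the replicate intensities are invented for illustration, and the use of the sample (n − 1) standard deviation is an assumption, because the text does not state which estimator was used.

```python
from statistics import mean, stdev
from typing import Sequence

def coefficient_of_variation(values: Sequence[float]) -> float:
    """Return the CV in percent: 100 * sample SD / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m

# Hypothetical replicate intensities of one peak measured on several spots.
replicates = [9.0, 10.0, 11.0]
print(f"CV = {coefficient_of_variation(replicates):.1f}%")
```

The same function applies to peak intensity and to measured m/z; only the input values change.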
Values shown are the means ± SD. Statistical differences, including laboratory data and individual peaks in SELDI-TOF/MS, were determined by the Mann-Whitney U test. Values of P < 0.05 were considered statistically significant. The discriminatory power for each putative marker was described via receiver operating characteristics (ROC) area under the curve (AUC). These statistical analyses were performed using STATVIEW 4.5 software (Abacus Concepts, Berkeley, CA), SPSS software (SPSS Inc., Chicago, IL), or Ciphergen ProteinChip Software, version 3.0.2.
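The Mann-Whitney U statistic and the ROC AUC are closely related: the AUC equals U divided by the number of case-control pairs. The sketch below illustrates that relationship in plain Python. It is not the STATVIEW or SPSS implementation, it does not compute a P value, and the intensities are invented for illustration.

```python
from typing import Sequence

def mann_whitney_u(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Count case-control pairs in which the case value is higher (ties count 0.5)."""
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

def roc_auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """AUC = U / (n_cases * n_controls): the probability that a randomly
    chosen case has a higher value than a randomly chosen control."""
    return mann_whitney_u(cases, controls) / (len(cases) * len(controls))

# Hypothetical peak intensities for a marker that is higher in HCC sera.
hcc = [3.9, 8.4, 2.1, 5.0]
non_hcc = [1.9, 2.5, 1.2, 4.1]
print(roc_auc(hcc, non_hcc))
```

An AUC of 0.5 corresponds to no discrimination, and values below 0.5 indicate a marker that is lower in cases than in controls.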
Sample numbers for the first group used to develop the decision tree were small. A cross-validation approach using multiple decision trees would be more suitable for the construction of a final decision tree model.19 In this study, we validated the models using a 10-fold cross-validation approach to construct the final decision tree model as described previously.16, 18 The result of the biomarker patterns software using this approach differed from the classification and regression tree analysis by univariate analysis (Mann-Whitney U test).20
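A 10-fold cross-validation partitions the training samples into 10 folds, each of which serves once as the held-out set while the model is built on the other nine. The sketch below shows only the splitting step for the 79 samples of the first analysis group; the random, unstratified assignment and the function name are assumptions, and the procedure used internally by Biomarker Patterns Software may differ.

```python
import random
from typing import Iterator, List, Tuple

def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 35 HCC + 44 non-HCC samples in the first analysis group.
splits = list(k_fold_indices(79, k=10))
for train, test in splits:
    print(len(train), len(test))
```

Every sample is held out exactly once, so the pooled held-out predictions give an estimate of performance that does not reuse the data on which each tree was built.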
Detection of HCC (Data Analysis).
We aimed to identify a single protein peak or pattern of peaks that could distinguish HCC patients from individuals without HCC. Initially, we analyzed serum samples from the first analysis group, comprising 35 patients with HCC and 44 patients without HCC, using the SELDI ProteinChip system. Peaks were detected automatically after baseline subtraction using Ciphergen ProteinChip Software, version 3.0.2. This analysis identified 55 protein peak clusters, seen in the spectrum representations of the two groups (HCC and non-HCC) within the 3000 to 10,500 m/z range (Fig. 1). Eight protein peaks were overexpressed, whereas 10 protein peaks were down-regulated significantly in sera from HCC patients in comparison with those from patients without HCC. The mean amplitudes of the peaks for the 2 patient groups are shown in Table 3.
Table 3. Discriminatory Peaks and Mean Values Between Groups (HCC* and Non-HCC Group)
HCC (n = 35)        Non-HCC (n = 44)
3.94 ± 4.56         1.92 ± 1.79
8.36 ± 4.28         6.49 ± 3.99
13.61 ± 10.10       8.94 ± 8.42
26.87 ± 18.11       18.20 ± 15.09
8.40 ± 5.94         5.26 ± 4.42
12.76 ± 14.78       5.86 ± 5.37
4.39 ± 3.08         3.20 ± 2.45
16.10 ± 10.69       10.36 ± 7.26
1.27 ± 0.74         2.10 ± 1.21
0.90 ± 0.77         2.43 ± 2.50
2.02 ± 1.18         2.45 ± 1.50
1.98 ± 1.17         3.45 ± 2.84
1.65 ± 4.95         2.51 ± 3.53
3.12 ± 1.35         3.31 ± 1.41
3.45 ± 2.24         5.08 ± 3.86
5.49 ± 9.46         12.32 ± 14.63
1.23 ± 1.73         2.31 ± 2.63
1.14 ± 0.80         1.94 ± 1.71
2.42 ± 1.33         4.04 ± 3.27
0.82 ± 0.52         1.19 ± 0.67
NOTE. Data are shown as means ± SD. Statistical differences were determined using the Mann-Whitney U test. †Peaks selected in the final classification model by decision tree analysis.
Abbreviation: *hepatocellular carcinoma.
Structure of the Decision Tree.
Decision trees are flowchart-like tree structures that repeatedly split data sets into subsets in accordance with the given cancer versus noncancer classification task. Each classifier, a simple rule applied to each patient, queries only one mass. Serum samples isolated from 35 HCC patients and 44 chronic liver disease patients without HCC served as the training set. Using the normalized peak intensities of these 55 signal clusters, we constructed and evaluated decision trees using the training set. Peaks with a high discriminatory power were used to create 6 mass classifiers (m/z = 3444, 3890, 4067, 4435, 4470, and 7770) of differing complexities. Although 2 of these classifiers did not differ significantly between patients with and without HCC (m/z = 3444 and 3890), the decision tree generated using the combination of these 6 protein peaks correctly classified 97% of HCC samples (Fig. 2, Table 3).
Testing the Decision Tree.
To determine the accuracy and validity of the algorithm, we reevaluated the decision tree (Fig. 2) that had been constructed using the training set, using the first test set (second analysis group). To evaluate the classification performance, we determined the sensitivity and specificity of the algorithm for the differentiation between patients with and without HCC. The decision tree algorithm correctly diagnosed 83% (24 of 29) of patients with HCC and 76% (25 of 33) of patients without HCC. Although the ROC AUCs of the 6 mass classifiers were 0.70, 0.61, 0.71, 0.64, 0.66, and 0.70, which individually were more discriminatory than existing serum marker methods, the decision tree algorithm had the highest discriminatory power (Tables 3, 4). Twenty-six healthy volunteers were all correctly identified as free of HCC. The accuracy of the algorithm for HCC diagnosis was higher than that of other known tumor markers (Table 4).
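The sensitivity and specificity reported for the second analysis group follow directly from the counts given above (24 of 29 patients with HCC and 25 of 33 patients without HCC correctly classified). A minimal sketch of the arithmetic, with illustrative function names:

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of patients with HCC who were classified as HCC."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of patients without HCC who were classified as non-HCC."""
    return true_neg / (true_neg + false_pos)

# Counts from the second analysis group.
tp, fn = 24, 29 - 24   # HCC patients: correctly classified, missed
tn, fp = 25, 33 - 25   # non-HCC patients: correctly classified, misclassified

print(f"sensitivity = {100 * sensitivity(tp, fn):.0f}%")
print(f"specificity = {100 * specificity(tn, fp):.0f}%")
print(f"overall accuracy = {100 * (tp + tn) / (tp + fn + tn + fp):.0f}%")
```

The overall accuracy line is derived from the same four counts and is included only for orientation.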
Table 4. Comparisons of Hepatocellular Carcinoma Diagnostic Rates for the Multiple Marker and Three Additional Tumor Marker Analyses in the Second Analysis Group
NOTE. †excluding subject whose data could not be obtained.
Abbreviation: *alpha fetoprotein, **Lens culinaris agglutinin-reactive fraction of alpha-fetoprotein, *** des-γ-carboxy prothrombin, ****receiver operating characteristic area under the curve.
AFP* (>20 ng/mL)
DCP†,*** (>40 mAU/mL)
Decision Tree Predicts HCC Occurrence.
The most fundamental requirement for serum-based marker detection is identification of carcinoma at an early stage when treatment has the greatest impact on prognosis. We investigated the predictive performance of our classification system using a second test set (3rd analysis group) of samples taken from 7 patients 1 year before the development of HCC and 5 patients with chronic liver disease remaining free of HCC for at least 3 years. Six of the 7 (86%) patients who later developed HCC were classified into the HCC group using the classifiers described previously (Fig. 2, Table 3), even though the HCC was undetectable by US at the time of serum testing. All 5 patients without HCC were classified into the non-HCC group. These results indicate that this decision tree analysis is useful for the early diagnosis of HCC.
Proteomic analyses of sera and liver tissues from patients with HCC associated with HBV or HCV infection have been used to identify new biomarkers predicting HCC development, leading to improved prognosis.21–27 Because many analyses use 2-dimensional electrophoresis, the proteins used in such investigations must typically be greater than 10,000 daltons in molecular weight.21, 25–29 Analyzing serum or another body fluid that is easy to obtain from patients to predict disease or evaluate treatment efficacy would be ideal. In this study, we used the SELDI ProteinChip system to analyze serum samples from patients with HCC. This affinity-based mass spectrometric method, which combines chromatography and MS, is suitable for the analysis of both proteins and low-molecular-weight peptides.14 Although we did not identify a single effective biomarker, we developed a new decision tree, using a cross-validation approach, that uses a multimarker algorithm of 6 proteins capable of diagnosing and predicting HCC at least 1 year before the appearance of clinically detectable disease in patients infected with HCV.
Ninety percent of the protein content of serum is composed of 10 proteins, including albumin and IgG; an additional 12 proteins make up 90% of the remaining 10%. Thus, only 1% of the protein content of serum is of interest as potential biomarkers in proteomic studies.30 Several proteomic methods combine high-resolution separation of complex protein mixtures with additional protein identification methods, such as MS. To identify the low-abundance proteins of interest, one must remove the most abundant proteins from the serum by techniques such as immunodepletion. These methods are only reliable if the assumption that biomarkers are not bound to major circulating proteins is correct. If bound to these proteins, low-abundance biomarkers would be lost by immunodepletion techniques, leading to the loss of valuable diagnostic information.31 Therefore, we did not remove major serum proteins (albumin and IgG) from the samples in this study; analysis using the SELDI ProteinChip system can be performed without immunodepletion.
Patient characteristics such as sex and age, the sample collection method, the processing and storage of samples, and the data analysis methods may introduce bias into proteomics-based biomarker discovery attempts. Because HCC occurs more frequently in males than females, we developed our classification model using male patients only. As a result, our study was not designed to address the benefit of our classification model for females with HCC. Villanueva et al.,32 however, reported that gender did not appear to affect the peptide profile. We also evaluated five female patients with HCC; the peak intensity at 8136 m/z was elevated to a similar degree as that seen in male patients with HCC. Currently, a prospective study of female patients with or without HCC is underway to validate the utility of this classification model as a marker for the detection of HCC, particularly at early stages.
We demonstrated that 18 of the selected 55 protein peaks within an m/z range of 3000 to 10,500 differed between patients with and without HCC by univariate analysis. Based on the peak intensities of the 55 protein peaks, 6 peaks were selected to construct the decision tree for the first analysis group using Biomarker Patterns Software and a 10-fold cross-validation approach. Two of those 6 peaks (3444 and 3890 m/z), however, were not significantly different between the HCC and non-HCC groups by univariate analysis (P values of 0.2, Table 3). The selection process to construct the decision tree was not based on univariate analysis; the presented decision tree was developed using multivariate binary logistic regression to determine the peaks best able to differentiate patients with and without HCC.19, 33 In fact, the ROC AUCs of these 6 peaks were between 0.61 and 0.71, which tended to be more discriminatory than other serum markers. The decision tree proved to be best able to predict the presence of HCC in comparison with other serum markers. For these reasons, analysis of all 6 peaks, including the 2 peaks that were not significantly different between patients with and without HCC (peaks at m/z = 3444 and 3890), had the highest discriminatory power.
The algorithm used in this study is well established as a diagnostic tool for malignant neoplasms.13, 16, 34, 35 In comparison with the use of a single biomarker for the diagnosis of disease, multiple-biomarker analysis has both higher sensitivity and specificity. Indeed, our multimarker analysis was more accurate than existing tumor marker analysis methods (Table 4). Multimarker analysis is useful to predict HCC in patients with liver cirrhosis, which has high malignant potential and heterogeneous characteristics. Complex serum proteomic patterns may reflect the underlying pathological state of an organ, including HCC. Recently, Schwegler et al.16 reported an algorithm using the seven peaks that scored highest by SELDI TOF/MS. The determined classification tree, however, could not distinguish HCC from chronic liver disease; using 38 SELDI peaks, the sensitivity and specificity (61% and 76%) for distinguishing chronic HCV from HCV-HCC were lower than those determined for the decision tree constructed in this study. Schwegler et al. demonstrated that their sensitivity and specificity values increased to 75% and 92%, respectively, when AFP/DCP/GP73 was added to their classification model. In our model, although the sensitivity increased to 92%, specificity did not increase (52%) after the addition of AFP/AFP-L3/DCP to our classification. Serum GP73 levels, which were not available for examination in our study, or other as-yet-unknown characterizations of these patients may affect the predictive capability of this method. 
Although the sensitivity and specificity (92% and 90%) of another proteomics study using SELDI to distinguish chronic liver disease from HCC were higher than those determined in our study, greater than 63% of the study population examined exhibited advanced HCC (stage III and IV).16, 36 Only 14% of the HCC patients included in our study population had stage III or IV disease (Table 1), which likely accounts for the differences in the peaks used in the 2 studies. The characteristics of the patients with HCC will likely affect both the sensitivity and specificity significantly. Thus, our decision tree is more suitable for the diagnosis of early HCC than any previously reported methods.16, 36
Although a serum AFP level greater than 400 ng/ml serves as a useful criterion for the diagnosis of HCC,37 this detection method is insufficiently sensitive to detect small HCCs.38 Although several other markers have been shown to be superior to AFP in detecting early HCC,22, 39, 40 these markers were evaluated in patients with clinically apparent HCC. Thus, their sensitivity and specificity also may not be sufficient to detect early HCC. Our classification tree was able to predict cancer occurrence before HCC was clinically apparent by US. In the third analysis group, we correctly predicted the progression of 86% of the patients to HCC from their prediagnostic serum samples. To screen high-risk patients with chronic liver disease, such as that associated with HCV infection, our multi-marker analysis could help distinguish those patients for whom the combined examination of US, CT, and arterial portography would be recommended.
In their investigation of differential protein expression in HBV-associated and HCV-associated HCC, Kim et al.26 identified 60 proteins displaying significant changes in expression levels between nontumorous and tumorous tissues. Forty-six of these proteins demonstrated an association with viral infection. We analyzed the sera of patients with HBV-associated HCC; the expression of a number of protein markers differed between HCV and HBV infections (data not shown). The biological and pathogenic activities of these 2 viruses are different; the molecular mechanisms underlying the development of hepatitis and hepatocarcinogenesis also may differ between HBV and HCV infections.26, 41 Our analysis of the proteome using the SELDI technique demonstrates that this method also may be useful for investigation of the molecular mechanisms of hepatocarcinogenesis on the background of different viral infections.
A number of the peaks may represent doubly charged peaks; for example, the peak at 4067 m/z may be the doubly charged form of the 8138-m/z peak. One of the peaks in Table 3 included in the classification model also may be a doubly charged peak (3890/7770 m/z), which could affect the independence of the variables. To clarify this possibility, one must identify the individual proteins. The major limitation of the SELDI technique is that identification of individual proteins is often complicated. Lee et al.,42 however, recently isolated complement C3a as a candidate biomarker in human chronic hepatitis C and HCV-related HCC using the SELDI-TOF MS system after serum fractionation, 2-dimensional gel electrophoresis, in-gel digestion, and MS. We are now identifying the single protein represented by the 8138-m/z peak; 3 candidate proteins are known. Although these results must be confirmed by western blotting, SELDI immunoassay suggests that the peak at 4067 m/z is not the doubly charged form of the 8138-m/z peak. Although the ROC AUC results indicate that the serum level of no single protein is sufficient to detect early HCC, identification of proteins altered in the disease may help analyze the molecular mechanisms underlying HCC development and may help identify new therapeutic targets or modalities for the treatment or prevention of HCC.
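Whether two peaks are compatible by mass with being the singly and doubly charged forms of one protein can be screened with simple arithmetic: for a species of mass M, the [M+H]+ ion appears at M + 1.007 and the [M+2H]2+ ion at (M + 2 × 1.007)/2. The sketch below applies this check to the peak pairs mentioned above. The 0.3% tolerance is an assumption borrowed from the cluster mass window used for peak detection, and the function names are illustrative. Passing this check shows only that the masses are compatible; it does not establish that the two peaks derive from the same protein.

```python
PROTON_MASS = 1.00728  # Da, mass of a proton

def expected_doubly_charged(mz_single: float) -> float:
    """m/z of [M+2H]2+ given the observed m/z of [M+H]+."""
    neutral_mass = mz_single - PROTON_MASS
    return (neutral_mass + 2 * PROTON_MASS) / 2

def may_be_doubly_charged(mz_low: float, mz_high: float, tolerance: float = 0.003) -> bool:
    """True if mz_low lies within `tolerance` (relative) of the expected
    doubly charged position of mz_high. The tolerance is an assumed value."""
    expected = expected_doubly_charged(mz_high)
    return abs(mz_low - expected) / expected <= tolerance

for low, high in [(4067.0, 8138.0), (3890.0, 7770.0), (4435.0, 7770.0)]:
    print(low, high, round(expected_doubly_charged(high), 1), may_be_doubly_charged(low, high))
```

Because many unrelated peptides can fall within such a window, a mass-based screen of this kind can only flag candidate pairs for experimental follow-up.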
In patients with HCV infection, serum profiling using the SELDI ProteinChip system is useful both for the early detection of HCC and to distinguish HCC from chronic liver disease in the absence of HCC. Our ability to identify proteomic alterations in serum samples from HCC patients suggests that the SELDI ProteinChip system may be useful to identify proteins associated with HCC in the hopes of developing new therapeutic targets.
We thank Hiroyuki Nakao for suggestions concerning statistical analyses. The authors thank Yuko Nakamura and Yuka Takahama for their technical assistance.