Potential conflict of interest: Dr. Di Bisceglie is a consultant for and received grants from Roche. Drs. Ray and Mao received grants from Roche.
Differential response patterns to optimal antiviral therapy, peginterferon alpha plus ribavirin, are well documented in patients with chronic hepatitis C virus (HCV) infection. Among many factors that may affect therapeutic efficiency, HCV quasispecies (QS) characteristics have been a major focus of previous studies, yielding conflicting results. To obtain a comprehensive understanding of the role of HCV QS in antiviral therapy, we performed the largest-ever HCV QS analysis in 153 patients infected with HCV genotype 1 strains. A total of 4,314 viral clones spanning hypervariable region 1 were produced from these patients during the first 12 weeks of therapy, followed by detailed genetic analyses. Our data show an exponential distribution pattern of intrapatient QS diversity in this study population in which most patients (63%) had small QS diversity with genetic distance (d) less than 0.2. The group of patients with genetic distance located in the decay region (d>0.53) had a significantly higher early virologic response (EVR) rate (89.5%), which contributed substantially to the overall association between EVR and increased baseline QS diversity. In addition, EVR was linked to a clustered evolutionary pattern in terms of QS dynamic changes. Conclusion: EVR is associated with elevated HCV QS diversity and complexity, especially in patients with significantly higher HCV genetic heterogeneity. (HEPATOLOGY 2009;50:1765–1772.)
Hepatitis C virus (HCV) infection is a major public health concern worldwide. Over 2.7 million Americans are chronically infected with HCV, which results in an estimated 10,000 deaths each year and is a leading indication for liver transplantation.1 Currently, optimal antiviral therapy of chronic hepatitis C with peginterferon alpha plus ribavirin cures up to 80% of patients infected with HCV genotypes 2 and 3. However, the same treatment regimen is effective in only about 50% of patients infected with HCV genotype 1.2 It is thus important to be able to identify the factor(s), either host or viral, which affect the results of therapy, as such information may be valuable in improving the current antiviral strategy.
In this setting, HCV quasispecies (QS) characteristics have been a major focus of study in patients undergoing antiviral therapy. However, previous studies have generated conflicting data with regard to the role of HCV QS in the determination of therapeutic efficiency (see recent review3). Such results are to some extent not surprising because the responses to antiviral therapy represent a complex phenotype that is affected by multiple factors from both the virus and host sides. The involvement of these factors certainly interferes with the interpretation of data from HCV QS studies, especially when the study population is small. In addition, the techniques used to assess HCV QS diversity may be another source of discrepancy. The effect of mutations on gel mobility of a given DNA molecule is sometimes unpredictable.4 Thus, data from gel-based assays are not always consistent with the results from cloning/sequencing, which is thought to be the gold standard technique to assess viral diversity. In the current study we performed a detailed QS analysis in 153 patients undergoing combination antiviral therapy (peginterferon alfa-2a plus ribavirin). Compared to many previous studies, the current project has several unique features, including the largest study population, an exclusive focus on HCV genotype 1, and the application of large-scale cloning and sequencing techniques. These characteristics allow a thorough dissection of the potential effect of HCV QS during antiviral therapy.
Abbreviations: EVR, early virologic response; HCV, hepatitis C virus; HVR1, hypervariable region 1; QS, quasispecies; SOC, self-organized criticality; SVR, sustained virologic response.
Patients and Methods
This was an ancillary study of a large clinical trial that aimed to compare the therapeutic efficiency of peginterferon alpha-2a and alpha-2b in treatment-naïve patients with chronic HCV infection.5 Of 380 patients enrolled in the trial, 189 patients were treated with peginterferon alpha-2a and are the subjects of the present study. Patient recruitment was restricted to HCV genotype 1.5 Serum samples were collected at multiple timepoints during the early phase of antiviral therapy, including baseline (w00), w04, w08, and w12. Deidentified specimens were shipped to St. Louis University (SLU) and Johns Hopkins University (JHU) and stored at −80°C until use. For each patient, molecular cloning was planned for two serum samples, one at the baseline and the other at the latest timepoint during the early phase of antiviral therapy before week 12 with a minimum HCV viral load of more than 1,000 copies/mL, approximately equal to 1,111 IU/mL when the HCV RNA level is quantified with Roche Amplicor HCV Monitor, v. 2.0 (lower limit of quantification, 600 IU/mL).
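The sampling rule described above can be sketched as a small selection function. This is only an illustration: the function name, the dictionary layout, and the example viral loads are hypothetical, and the sketch assumes that week 12 itself counts as an eligible timepoint.

```python
def select_followup_week(on_treatment_loads, min_copies_per_ml=1000, last_week=12):
    """Return the latest on-treatment week with a viral load above the cutoff.

    on_treatment_loads maps week number (e.g. 4, 8, 12) to HCV RNA in copies/mL.
    Returns None when no on-treatment sample qualifies, in which case only the
    baseline sample would be cloned.
    """
    eligible = [
        week
        for week, load in on_treatment_loads.items()
        if 0 < week <= last_week and load > min_copies_per_ml
    ]
    return max(eligible) if eligible else None


# Made-up viral loads for two hypothetical patients (not study data)
patient_a = {4: 50_000, 8: 2_000, 12: 300}
patient_b = {4: 500, 8: 100, 12: 50}
week_a = select_followup_week(patient_a)  # latest week still above the cutoff
week_b = select_followup_week(patient_b)  # no qualifying on-treatment sample
```

Under this rule a patient whose viral load falls below the cutoff by week 4 contributes a baseline sample only.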
Molecular Cloning of HCV Hypervariable Region 1 (HVR1).
A 442-basepair (bp) fragment covering HCV HVR1 was amplified by reverse transcription-polymerase chain reaction (RT-PCR), followed by gel purification and TA cloning. About 15 independent clones for each sample were sequenced. Detailed experimental procedures are provided in the Supporting Material.
Raw sequences were edited with the programs ClustalW6 and BioEdit7 in which HCV H77 strain (AF009606) served as the reference sequence. After the removal of primer sequences the target domain for genetic analyses is 399 bp in length. Nucleotide positions containing insertions or deletions within this domain were removed for the present analysis and will be analyzed separately for their potential influence on antiviral therapy. The HCV QS nature was characterized by measuring both genetic complexity and genetic diversity. The definitions and measurement of these genetic parameters are outlined in the Supporting Material.
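As a rough illustration of what intrapatient genetic diversity measures, the sketch below computes the mean pairwise p-distance (the proportion of differing sites) among aligned clones from one sample. The definitions actually used in this study are given in the Supporting Material and may rely on model-corrected distances; the function names and toy sequences here are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def mean_pairwise_distance(clones):
    """Average p-distance over all unordered pairs of clones in one sample."""
    pairs = list(combinations(clones, 2))
    if not pairs:
        raise ValueError("need at least two clones")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy, made-up clones (not study data)
toy_clones = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
toy_diversity = mean_pairwise_distance(toy_clones)
```

An uncorrected p-distance underestimates divergence when multiple substitutions hit the same site, which is one reason model-based distances are usually preferred for a region as variable as HVR1.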
Phylogenetic analysis was used to verify HCV genotypes and/or subtypes and potential sequence clusterings corresponding to response patterns. We constructed two phylogenetic trees, one with all 4,314 clones (big tree) and the other with only 153 clones, each representing the dominant HCV QS variant at baseline from one patient (small tree). Both trees were computed using the program MEGA 4 with the neighbor-joining approach and the maximum composite likelihood model. We also assumed rate variation among sites, with an empirical value for the gamma parameter (α = 0.5). Forty-five reference HCV sequences, representing different HCV genotypes and subtypes, were included.
Values of genetic parameters from comparative analyses were tested for statistical significance using the two-tailed Student's t test. Categorical data from cross analyses were tested for statistical significance using the χ2 test with Yates' correction or Fisher's exact test. With regard to HCV QS diversity at baseline, we explored its potential distribution pattern in this study population (n = 153). In doing so, a one-sample Kolmogorov-Smirnov test was first used to test common distribution patterns. Next, a newly developed procedure was applied to see if the data fit a power-law or power-law-like distribution, such as an exponential distribution, in which the given quantities are tightly clustered around their average values with much reduced probability far from the mean in a one-way direction.8 In this setting the term “decay region” is used to denote the low-boundary phase of the distributions, and the value at the proposed low-bound point (χmin) on the curve was calculated.8 All statistical analyses were done with SPSS (v. 13.0) except for the HCV QS distribution analysis, which was run in MATLAB (http://www.mathworks.com).
Nucleotide Sequence Accession Numbers.
A total of 3,909 distinct HCV QS sequences generated in this study were deposited in GenBank under accession numbers FJ688411 through FJ692319. The entire dataset including 4,314 sequences is available on request.
Experimental Performance and Sample Compilation.
We had a 100% success rate for the amplification of a 442-bp target by using the protocol described above. This high success rate mainly resulted from our efforts to optimize PCR primers, including their positions and composition. In some samples with low viral loads near 1,000 copies/mL (n = 15, 5.7%), we found that it was necessary to increase the input amount of serum RNA for reverse transcription (RT). This was achieved by using 280 μL of serum, instead of the regular 140 μL, for RNA extraction, followed by the elution into the same volume of Tris buffer (60 μL). Most experiments resulted in a single visible band with the expected size on agarose gel. In the molecular cloning experiments the positive rate of recombinant clones was ≈95%.
For each patient, HCV QS profiles were generated at two timepoints, the baseline and the latest timepoint during the early phase of antiviral therapy (≤week 12) with a minimum HCV viral load more than 1,000 copies/mL. Based on this standard we finally identified 110 patients with two timepoints and 43 patients with only one timepoint (baseline), which resulted in 263 serum samples to be studied. A total of 4,314 clones were generated and sequenced from these samples, an average of 16.4 clones per sample. Among 153 patients, 104 (68%) achieved early virological response (EVR), defined as more than a 2 log decrease of HCV RNA level at week 12 compared to the baseline.9 Potential differences between these two groups were examined in terms of viral factors, including baseline viral load, HCV subtypes, QS diversity, complexity, and dynamics.
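The EVR definition used here (a decrease of more than 2 log10 in HCV RNA at week 12 relative to baseline) can be written as a short check. This is a minimal sketch: the function names and example values are hypothetical, and the handling of undetectable week-12 RNA is left out because it is not specified in this section.

```python
import math


def log10_drop(baseline_rna, week12_rna):
    """Log10 decrease in HCV RNA from baseline to week 12 (same units for both)."""
    if baseline_rna <= 0 or week12_rna <= 0:
        raise ValueError("RNA levels must be positive")
    return math.log10(baseline_rna) - math.log10(week12_rna)


def is_evr(baseline_rna, week12_rna, threshold_log10=2.0):
    """True when the decrease exceeds the threshold (more than 2 log10 by default)."""
    return log10_drop(baseline_rna, week12_rna) > threshold_log10


# Made-up examples (not study data)
responder = is_evr(1_000_000, 5_000)       # drop of about 2.3 log10
non_responder = is_evr(1_000_000, 50_000)  # drop of about 1.3 log10
```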
Lack of Association Among Viral Load, Subtypes, and Early Response Patterns.
The phylogenetic tree constructed with all 4,314 clones (big tree) displayed single patient-based clusterings, indicating the lack of HCV coinfection with different subtypes or strains. This observation also argues against contamination during the experimental procedures. We verified the HCV genotype/subtype for all patients through the phylogenetic analysis of 153 viral sequences, each representing the dominant QS variant from one patient. Thus, 115 patients were infected with HCV genotype 1a and 38 with HCV genotype 1b (Fig. 1). HCV genotype 1a isolates further formed three clusters, named subgroups 1, 2, and 3 (Fig. 1). Among 109 patients with HCV subtypes based on the Inno-LiPA HCV II Line Probe assay, 17 appeared to have been mistyped based on our phylogenetic analysis (Fig. 1). Thus, the Inno-LiPA HCV II Line Probe assay seems to have an HCV subtyping error rate of 15.6%, consistent with a previous report.10 More interestingly, the Inno-LiPA HCV II Line Probe assay mistyped 16 of 17 HCV 1a patients as HCV genotype 1b, suggesting the existence of an intrinsic bias toward HCV genotype 1b.
There was no significant difference between HCV subtypes with regard to early response patterns: subtypes 1a and 1b accounted for 76.9% and 23.1% of EVR patients and for 71.4% and 28.6% of non-EVR patients, respectively (Fig. 1). In the phylogenetic tree, HCV genotype 1a strains were further clustered into three subgroups, supported by the bootstrap test (Fig. 1). Again, no statistically significant relationship was detected between HCV 1a subgroups and early response patterns under the current treatment regimen (Fig. 1).
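The subtype comparison can be illustrated with a χ2 test using Yates' correction on a 2×2 table. The counts below are not stated in the text; they are reconstructed on the assumption that the reported percentages describe the subtype split within each response group (80 1a and 24 1b among 104 EVR patients; 35 1a and 14 1b among 49 non-EVR patients). The function names are hypothetical, and the sketch is not a reproduction of the SPSS analysis.

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square statistic with Yates' continuity correction for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a nonzero total")
    # Shrink |ad - bc| by n/2, never below zero
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


def chi_square_p_value_1df(stat):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Reconstructed counts (assumption, see lead-in): rows = EVR, non-EVR; columns = 1a, 1b
subtype_stat = yates_chi_square(80, 24, 35, 14)
subtype_p = chi_square_p_value_1df(subtype_stat)
```

With these reconstructed counts the P value is well above 0.05, in line with the reported lack of association between subtype and early response.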
We also investigated the potential relationship and interactions between pretreatment HCV RNA levels and early response patterns, HCV genotypes, and subgroups. As shown in Table 1, pretreatment HCV viral load was not associated with early virological response patterns (P = 0.137), HCV genotypes (P = 0.489), or HCV 1a subgroups (P = 0.171).
Table 1. Lack of Correlation Between HCV Viral Load at Baseline and Response Patterns, HCV Genotypes, and HCV 1a Subgroups
HCV viral load was determined by using Roche Amplicor HCV Monitor Test, version 2.0 (Roche Diagnostics) and is expressed as average log10 values and standard deviation.
Two-tailed t test under the assumption of two-sample equal variance. NP, not performed.
We separated our amplified region into two domains, HVR1 (81 bp) and non-HVR1 (318 bp), to avoid possible masking of statistical significance due to apparently unequal nucleotide substitution rates between these two domains. Next, we performed the analyses at two levels, pretreatment genetic diversity and its early dynamic changes during antiviral therapy.
With regard to pretreatment genetic diversity, subjects with EVR had higher overall values than the non-EVR group for most parameters measured: for HVR1, d 0.253 versus 0.1723, dS 0.0849 versus 0.0810, and dN 0.1451 versus 0.1033; for non-HVR1, d 0.0333 versus 0.0265, dS 0.0528 versus 0.0528, and dN 0.0064 versus 0.0051. However, only HVR1 dN reached statistical significance (P = 0.039) (Fig. 2).
The criteria for sample selection identified 110 patients with serum samples to be cloned and sequenced at two timepoints, one at baseline and the other at the latest timepoint during the early phase of antiviral therapy (≤week 12) with a minimum HCV viral load more than 1,000 copies/mL. Again, these patients were separated into two groups, EVR (n = 66) and non-EVR (n = 44). We conducted two kinds of group-based analyses, the average change and net change of genetic diversity between two timepoints. Average change simply compares population-based genetic diversity without the consideration of sequence differences between two populations, which is reflected by the net change. In other words, the net change estimates the extent of “clustered evolution” in terms of its phylogenetic representation.11
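The distinction between average change and net change can be made concrete with a toy calculation. The sketch assumes MEGA-style definitions (average change as the difference in within-sample mean distance; net change as the between-sample mean distance minus the mean of the two within-sample distances) and uses uncorrected p-distances. The exact definitions used in this study are in the Supporting Material, and all names and sequences below are hypothetical.

```python
from itertools import combinations, product


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b) / len(seq_a)


def within_mean(clones):
    """Mean distance over all pairs of clones from the same timepoint."""
    pairs = list(combinations(clones, 2))
    if not pairs:
        raise ValueError("need at least two clones")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def between_mean(clones_x, clones_y):
    """Mean distance over all pairs with one clone from each timepoint."""
    pairs = list(product(clones_x, clones_y))
    if not pairs:
        raise ValueError("need at least one clone per timepoint")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def average_change(baseline, later):
    """Baseline diversity minus later diversity (positive = diversity decreased)."""
    return within_mean(baseline) - within_mean(later)


def net_distance(baseline, later):
    """Between-timepoint distance net of within-timepoint diversity."""
    return between_mean(baseline, later) - (within_mean(baseline) + within_mean(later)) / 2


# Toy clones (not study data): equal diversity at both timepoints, shifted population
toy_baseline = ["AAAA", "AAAT"]
toy_later = ["TTAA", "TTAT"]
toy_average = average_change(toy_baseline, toy_later)
toy_net = net_distance(toy_baseline, toy_later)
```

In the toy data the two timepoints have identical within-sample diversity, so the average change is zero, yet the later population has moved away from the baseline population, which only the net measure detects.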
The average change in genetic diversity was calculated based on either HVR1 (81 bp) or the non-HVR1 region (318 bp). The net change was determined with HVR1 only. Both EVR and non-EVR groups displayed a trend toward decreasing genetic diversity over time, shown as positive values for all genetic parameters. However, the average change of non-HVR1 dN increased in the EVR group (Fig. 3). Because the absolute values of the dN change of non-HVR1 are minimal in both EVR and non-EVR groups (0.0013 versus 0.0018), such an increase may not be biologically significant. The EVR group had a more prominent decrease of genetic diversity over time than the non-EVR group (Fig. 3). However, the difference between the two groups was not statistically significant (Fig. 3).
The net change of genetic diversity showed patterns similar to those of the average change. The EVR group was associated with a more apparent decrease of net genetic diversity. Again, the difference between the EVR and non-EVR groups was not supported statistically (Fig. 3).
Genetic complexity was estimated by measuring average Shannon entropy in the HVR1 domain. Like genetic diversity, the EVR group had a higher pretreatment genetic complexity than the non-EVR group at either the nucleotide or amino acid level (Fig. 4). The latter reached statistical significance (P = 0.0499). The average change of genetic complexity was also higher in the EVR group than that in the non-EVR group, although this difference was not statistically supported (Fig. 4).
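One plausible reading of "average Shannon entropy" is the per-site entropy of the clone alignment averaged over all positions. The sketch below follows that reading as an assumption; the study's own definition is in the Supporting Material, and the function names and toy clones are hypothetical. The same code applies to nucleotide or amino acid alignments.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (natural log) of the symbols in one alignment column."""
    total = len(column)
    if total == 0:
        raise ValueError("column must not be empty")
    counts = Counter(column)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def mean_site_entropy(clones):
    """Average per-site Shannon entropy across an alignment of equal-length clones."""
    if not clones:
        raise ValueError("need at least one clone")
    length = len(clones[0])
    if length == 0 or any(len(seq) != length for seq in clones):
        raise ValueError("clones must be non-empty and aligned to the same length")
    return sum(column_entropy(col) for col in zip(*clones)) / length


# Toy clones (not study data): first site invariant, second site split evenly
toy_alignment = ["AC", "AG", "AC", "AG"]
toy_complexity = mean_site_entropy(toy_alignment)
```

An invariant site contributes zero entropy, and a site split evenly between two symbols contributes ln 2, so the toy alignment averages to ln 2 / 2.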
Distribution of QS Diversity in Patients Infected with HCV Genotype 1 Isolates.
The current project, the largest QS study to date to focus on HCV genotype 1, investigated possible distribution patterns of QS diversity. Using HVR1 genetic distance (d), we first plotted its histogram in this study population. The distribution patterns were subsequently estimated by a one-sample Kolmogorov-Smirnov test. The data supported an exponential distribution (P = 0.132) with the exclusion of normal (P < 0.001), uniform (P < 0.001), and Poisson (P < 0.001) distributions (Fig. 5). We further calculated the lower bound (χmin) using described formulas under the hypothesis of either continuous power-law or exponential distribution.8 The power-law distribution was not favored because of the short tail: only 15 patients fell in the power-law region (χmin = 0.58), which is too few to be meaningful. The χmin for the exponential distribution was equal to 0.53, putting the genetic distances of 19 patients in the decay region (Fig. 5).
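For readers unfamiliar with the one-sample Kolmogorov-Smirnov test, the sketch below computes the test statistic against an exponential distribution and an asymptotic P value. It is an illustration, not a reproduction of the SPSS procedure: it assumes the exponential mean is estimated from the sample unless supplied, it uses the large-sample Kolmogorov series for the P value, and the function names and example distances are made up.

```python
import math


def ks_statistic_exponential(values, mean=None):
    """Largest gap between the empirical CDF and an exponential CDF."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")
    if data[0] < 0:
        raise ValueError("exponential data must be non-negative")
    if mean is None:
        mean = sum(data) / n  # rate estimated from the sample (assumption)
    if mean <= 0:
        raise ValueError("mean must be positive")
    d_stat = 0.0
    for i, x in enumerate(data, start=1):
        cdf = 1.0 - math.exp(-x / mean)
        # Compare against the empirical CDF just after and just before x
        d_stat = max(d_stat, i / n - cdf, cdf - (i - 1) / n)
    return d_stat


def ks_asymptotic_p_value(d_stat, n, terms=100):
    """Large-sample two-sided P value from the Kolmogorov distribution."""
    lam = math.sqrt(n) * d_stat
    if lam < 0.2:
        return 1.0  # the series is numerically 1 in this range
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Made-up genetic distances (not study data)
toy_distances = [0.02, 0.05, 0.08, 0.11, 0.15, 0.21, 0.30, 0.42, 0.58, 0.77]
toy_d = ks_statistic_exponential(toy_distances)
toy_p = ks_asymptotic_p_value(toy_d, len(toy_distances))
```

A large P value means the exponential model cannot be rejected, which is weaker than showing that it is the best model. Estimating the mean from the same data also makes the nominal P value conservative.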
We evaluated HCV QS heterogeneity at baseline and its dynamic changes during the early therapeutic period. In this type of study, sampling bias is often a concern due to the lack of normalization of the entry HCV RNA amount used for RT-PCR.12 Thus, viral QS heterogeneity may be to some extent dependent on viral titers. However, under our experimental procedure and study protocol we consider the potential for sampling bias to be minimal, and its effect on our observations and conclusions negligible. First, we previously failed to detect a statistical relationship between QS diversity and viral titers.13 Second, for a given viral region used to measure QS diversity, such as HVR1 in this study, QS diversity is maintained by sequencing an adequate number of clones, usually >10, and the use of fixed PCR primers.14, 15 Finally, we focused on comparative HCV QS analyses between EVR and non-EVR groups. There is no statistical difference in average HCV viral load between the two groups, which may further reduce the sampling bias, assuming it exists. An additional limitation, by the nature of the PEAK study that only followed patients through the first 12 weeks of therapy, is the inability to correlate HCV QS heterogeneity with sustained virologic response (SVR). Rather, we chose to correlate it with EVR. Only 11 patients (7.2%) had a rapid virologic response (RVR), defined as undetectable viral RNA at week 4, making this group difficult to analyze. Nonetheless, we did not find any viral factors analyzed in this study that were specifically associated with RVR (data not shown). Thus, we focused on EVR rather than RVR. EVR is also a response pattern that appears to reflect intrinsic sensitivity of HCV to combination therapy. It has been reported that EVR is a very good predictor of SVR.16 Data interpretation from our study may have general applicability in terms of HCV antiviral therapy.
HCV genotype is a well-documented independent factor affecting the efficiency of antiviral therapy.2 The large size of this study population allowed us to examine whether or not such an observation can be extended to the HCV subtype level. Statistically, we demonstrated that pretreatment HCV viral loads, HCV subtypes, and HCV subgroups within HCV genotype 1a are not determinants of early response patterns. Previous studies suggest that a low baseline HCV viral load is an independent predictor of SVR (reviewed17). In our study, the EVR group even had a higher average viral load, although the differences between groups were not statistically supported (Table 1). The discordance may be attributed to multiple factors, such as different therapeutic regimens, patient selection, and different stages of disease progression.18, 19 Alternatively, it may simply suggest that pretreatment HCV viral load is not an independent predictor of therapeutic efficiency.
An important finding came from our QS analysis. Due to known differences in mutation frequency between HVR1 and non-HVR1 domains, our analysis focused on HVR1. EVR was associated with increased baseline QS diversity, shown as higher values of genetic diversity and complexity. In particular, dN reached statistical significance (P = 0.039). The dN reflects the strength of evolutionary selection that is frequently interpreted as immune pressure.20 Because HVR1 contains putative B and T cell epitopes,21, 22 our observation suggests a role of pretreatment immune status in the determination of early virological response patterns. Compared with previous studies, it should be emphasized that our conclusion rests on greater statistical power given the large number of patients studied. Additionally, two aspects of HCV QS dynamics have been assessed: genetic diversity/complexity (average change) and sequence diversification (net change). In the former aspect, both EVR and non-EVR groups display a general trend of reduced genetic diversity and complexity after the start of antiviral therapy. Such a trend seems more apparent in the EVR group. Given the higher pretreatment genetic diversity and complexity in the EVR group, this trend may simply reflect the elimination of drug-sensitive HCV QS variants. However, this explanation cannot be applied to the sequence diversification (net change), which showed a similar trend. Thus, consistent with previous reports, HCV QS diversification is more likely associated with early response patterns.11
Taking advantage of this largest-ever HCV QS study, we explored potential distribution patterns of intrapatient HCV QS diversity at the population level. In contrast to viral load, which displayed a typical normal (Gaussian) distribution among the 153 patients we studied (data not shown), the QS diversity showed a best fit to an exponential distribution (Fig. 5).23 This finding has several important implications. First, the exponential distribution of QS diversity may explain long-existing controversies with regard to the role of QS diversity in HCV antiviral therapy.11, 24–40 Although the overall EVR rate in this study population is about 68%, the group of patients with HVR1 genetic distance beyond the low bound (χmin = 0.53) has a significantly higher EVR rate (89.5%, P = 0.032) (Fig. 5). In fact, when the patient group with large QS diversity (d >0.53) was excluded, the association between dN values and EVR lost statistical significance (EVR, 0.106 versus non-EVR, 0.092, P = 0.194). Thus, assuming a potential role of QS diversity in antiviral therapy, we can conclude that such a role may become dominant only in patients with large QS diversity (d >0.53). This conclusion also held in the similar analysis using dN values (data not shown). Because the QS diversity in most patients, as measured by genetic distance (d), is less than 0.53 (Fig. 5), intrinsic uncertainty accompanies any HCV genetic study that explores the role of QS diversity in antiviral therapy, especially when the study population is small. For example, in our previous study only one patient was confirmed to have a genetic distance larger than 0.53 among 29 patients studied.36
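The comparison of EVR rates inside and outside the decay region can be illustrated with a two-sided Fisher's exact test. The 2×2 counts below are reconstructed from the reported figures under stated assumptions: 89.5% of 19 decay-region patients is taken as 17 EVR and 2 non-EVR, and the remaining 134 patients as 87 EVR and 47 non-EVR, given 104 EVR patients overall. The text does not state which test or comparison produced the reported P value, so the result need not match it. Function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    if n == 0:
        raise ValueError("table must not be empty")
    total = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum all tables that are no more probable than the observed one
    p_value = sum(p for p in (prob(x) for x in range(lowest, highest + 1))
                  if p <= p_observed * (1 + 1e-7))
    return min(1.0, p_value)


# Reconstructed counts (assumption, see lead-in):
# rows = decay region, remaining patients; columns = EVR, non-EVR
decay_p = fisher_exact_two_sided(17, 2, 87, 47)
```

Exact arithmetic on integers keeps the binomial coefficients free of overflow even with 153 patients, and the small relative tolerance guards against floating-point ties when comparing table probabilities.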
Second, since the introduction of QS theory into virology, most viral genetic studies have simply used this term to describe viral genome heterogeneity. Its original definition is largely ignored.41, 42 According to this theory, all viral variants in infected individuals form a network and act as a unit in response to internal or external interference, such as antiviral therapy. In this study we found that EVR is associated with high QS diversity. Such an observation is not easily understood with classical population biology because high QS diversity indicates an increased possibility of containing drug-resistant viral variants. In this setting, QS theory may provide a plausible explanation by treating the entire viral population as an acting unit, presumably in a state of self-organized criticality (SOC), which is a prevailing hypothesis to explain many complex phenomena in nature, including exponential distribution.43 Consequently, high QS diversity may imply a critical state in which HCV reaches its maximum capability to maintain the QS network. Such a state, as assumed by the SOC hypothesis, would be extremely sensitive to any external stimulus, such as antiviral therapy, resulting in the collapse of the entire QS network (virus extinction). Thus, our data provide indirect evidence in support of QS theory in virology, and the SOC hypothesis should be an important addition to this theory.
Third, the formation of QS diversity results from the complicated interaction between virus and host. The exponential distribution of QS diversity suggests the lack of an ideal genetic distance that may favor HCV infection. In other words, the QS nature is not necessarily the only prerequisite for HCV to establish or maintain its persistent infection. Indeed, most RNA viruses, such as dengue and West Nile virus, share the QS nature.44, 45 However, only a few of them result in a persistent infection in humans.
Finally, our data showed large variability of HCV QS diversity among infected patients. Given the considerable association between EVR and high HCV QS diversity, the modulation of HCV QS diversity may represent a novel strategy for antiviral therapy. The underlying mechanism responsible for the formation of high QS diversity in HCV patients therefore warrants further investigation.
We thank Aaron Clauset (Santa Fe Institute) for assistance with statistical analysis.