Genetic and catalytic efficiency structure of an HCV protease quasispecies


  • Potential conflict of interest: Nothing to report.


The HCV nonstructural protein (NS)3/4A serine protease is not only involved in viral polyprotein processing but also efficiently blocks the retinoic-acid–inducible gene I and Toll-like receptor 3 signaling pathways and contributes to virus persistence by enabling HCV to escape the interferon antiviral response. Therefore, the NS3/4A protease has emerged as an ideal target for the control of the disease and the development of new anti-HCV agents. Here, we analyzed, at a high resolution (approximately 100 individual clones), the HCV NS3 protease gene quasispecies from three infected individuals. Nucleotide heterogeneities of 49%, 84%, and 91% were identified, respectively, which created a dense net that linked different parts of the viral population. Minority variants having mutations involved in the acquisition of resistance to current NS3/4A protease inhibitors (PIs) were also found. A vast diversity of different catalytic efficiencies could be distinguished. Importantly, 67% of the analyzed enzymes displayed a detectable protease activity. Moreover, 35% of the minority individual variants showed similar or better catalytic efficiency than the master (most abundant) enzyme. Nevertheless, and in contrast to minority variants, master enzymes always displayed a high catalytic efficiency when different viral polyprotein cleavage sites were tested. Finally, genetic and catalytic efficiency differences were observed when the 3 quasispecies were compared, suggesting that different selective forces were acting in different infected individuals. Conclusion: The rugged HCV protease quasispecies landscape should be able to react to environmental changes that may threaten its survival. (HEPATOLOGY 2007;45:899–910.)

Hepatitis C virus is a positive-stranded RNA virus that encodes a polyprotein of approximately 3,000 amino acid residues. This polyprotein is processed into structural and nonstructural proteins (NS) by host signal peptidases as well as by two viral proteases, NS2/3 and NS3.1 The role of the NS2/3 protease appears to be limited to the autoproteolytic cleavage in cis of the NS2-NS3 junction.2, 3 The amino-terminal 181 amino acid residues of the NS3 protein encode a serine protease that cleaves at the NS3/4A junction in cis, followed by cleavage at the NS4A/4B, NS4B/5A, and NS5A/5B sites in trans.4, 5 The NS3 protease requires an accessory viral protein, NS4A, for optimal cleavage activity. Recently, the success of HCV in persisting has been shown to be linked to its ability to disrupt host antiviral defenses. The HCV NS3/4 protease functions as an antagonist of virus-induced interferon (IFN) regulatory factor-3 activation and IFN-β expression through its ability to block retinoic-acid–inducible gene I and Toll-like receptor-3 signaling by cleaving Cardif [CARD (caspase and recruitment domain), also termed VISA (virus-induced signaling adaptor), MAVS (mitochondrial antiviral signaling), or IPS-1 (IFN-β promoter-stimulator 1)] and TRIF (Toll/interleukin-1 receptor domain-containing adaptor inducing IFN) proteins, respectively.6, 7 Several NS3/4 protease inhibitors (PIs) have been developed to block this step of the viral life cycle and have been shown, in some cases, to significantly decrease the plasma viral load of infected individuals.8–11

Since the first description of HCV, numerous studies have described the HCV variability found in infected individuals.12, 13 The observed HCV heterogeneity, like that of other RNA viruses, originates from the high rate of incorrect nucleotide substitutions during viral RNA replication (10⁻⁴ to 10⁻⁵ mutations per nucleotide per replication cycle)14 and rapid viral turnover (producing on average 10¹² virions per day in infected individuals).15 The high mutation rate associated with HCV replication results in the generation of swarms of mutants known as viral quasispecies.16, 17 Because the behavior of any particular variant may be influenced by the entire viral population, it has been suggested that the quasispecies, and not individual viral genomes, are the target of selection and random drift.18, 19 Early evolution of the viral quasispecies was shown to predict the clinical outcome of acute hepatitis C.20, 21 Likewise, the long-term viral evolution was shown to correlate with the severity of liver disease.22, 23

The HCV quasispecies has been well characterized for different viral genomic regions. The highest degree of amino acid diversity within an infected individual (10%)20 has been observed in the variable regions of the viral envelope genes (E1 and E2). A lower level of intraindividual amino acid diversity (less than 1%) has been reported for the NS3/4 protease coding region.24 In the current work, we analyzed, at a high resolution (1%), the genotype and enzymatic activity of 3 HCV NS3/4 protease quasispecies. We determined the catalytic efficiency of each variant present in the quasispecies to establish the relationships between genotype, phenotype, and fitness. Finally, a phylogenetic-fitness landscape map was constructed for each quasispecies.


Abbreviations: IFN, interferon; NS, nonstructural protein; PI, protease inhibitor.

Patients and Methods


Three HCV-infected individuals (A, B, and C) were chosen for this study. The HCV genotype for the 3 individuals was 1b. Individuals A and B were co-infected with HIV type 1 (HIV-1), whereas individual C was HCV monoinfected (Table 1). Individuals A and B were selected from a previous study.25 Likewise, individual C was selected from another previous study.13 The HCV genotype was determined by the INNO-LIPA HCV IIR assay (Innogenetics NV, Ghent, Belgium). HCV RNA viral load was quantified by the Amplicor Monitor v2.0 (Roche Diagnostics Systems Inc., Branchburg, NJ). HIV-1 RNA viral load was measured by NASBA (Nuclisens HIV-QT, Biomerieux, Madrid, Spain). After the samples used in this study were taken, individuals A and B were treated for 48 weeks with PEG-Interferon α-2b (Peg-Intron; Schering-Plough, Kenilworth, NJ) in combination with ribavirin (Rebetol; Schering-Plough). Individual A showed a sustained response to therapy, whereas individual B did not respond to therapy.25 Individual C's sample was taken during acute infection; afterward this individual resolved their HCV infection.

Table 1. Clinical and Virological Characteristics of the 3 Studied HCV-Infected Individuals

Characteristic                       Individual A    Individual B    Individual C
HCV viral load (IU/ml)               2,364           109,142         1,300,000
HIV-1 viral load (RNA copies/ml)     <80             <80
ALT (U/l)                            133             78              62
AST (U/l)                            20              27              ND
CD4+ (cells/μl)                      571             992             ND
Number of studied clones             96              97              103
Hn (%)                               48.9            84.5            91.3
Ha (%)                               34.3            48.4            31.1
Dg (% ± SD)                          0.3 ± 0.2       1.1 ± 0.5       1.2 ± 0.6
Da (% ± SD)                          0.6 ± 0.6       1 ± 0.7         0.5 ± 0.5
ds (% ± SD)                          0.45 ± 0.4      2.6 ± 1.1       3.5 ± 1.2
dn (% ± SD)                          0.28 ± 0.2      0.54 ± 0.4      0.22 ± 0.2
Ratio (ds/dn)                        1.6             4.8             15.9

NOTE. Individual C was HCV monoinfected.
Abbreviations: Hn, nucleotide heterogeneity; Sn, nucleotide normalized Shannon entropy; Ha, amino acid heterogeneity; Sa, amino acid normalized Shannon entropy; Dg, genetic distance; Da, amino acid distance; ds, synonymous substitutions; dn, non-synonymous substitutions; ND, not determined.

Recovery and Analysis of HCV Sequences.

HCV RNA extraction and amplification were performed as previously described.26, 27 After viral RNA was isolated, 10 μl resuspended RNA was reverse transcribed at 42°C by using the avian myeloblastosis virus reverse transcriptase (Promega) and the oligonucleotide HCVproR1 (antisense) (5′-GGATGAGTTGTCTGTGAAGAC-3′, residues 3954 to 3974 of the HCV-J strain). Five microliters of the reverse transcriptase reaction were then used in the first PCR amplification with 0.04 U/μl of the proof-reading Platinum Taq DNA polymerase (Invitrogen). The oligonucleotides used for this amplification were HCVproL1 (sense) (5′-GCAAGGGTGGCGACTCCTTGC-3′, residues 3389 to 3409 of the HCV-J strain) and HCVproR1. Nested PCR was then performed with a 5′ oligonucleotide (HCVproL2) encoding an EcoRI site, residues 21 to 34 of NS4, and a dipeptide linker, Gly-Gly, along with residues 2 to 8 of NS3 (residues 3411 to 3431 of the HCV-J strain) (5′-GGGTTGAATTCTATGGCTCCTATTGGATCTGTTGTTATTGTTGGAAGAATTATTTTGTCTGGAAGAGGAGGACCTATCACGGCCTACTCCCAA-3′) and a 3′ oligonucleotide (HCV proR) that was complementary to residues 3930 to 3950 of the HCV-J strain (amino acid residues 175 to 181 of NS3) and encoded an in-frame stop codon flanked by an XhoI site (5′-GGGAGGGGCTCGAGTCAAGACCGCATAGTAGTTTCCAT-3′). The PCR products were digested with EcoRI and XhoI and ligated to pBSK- (Stratagene) to generate a βgal-HCV NS3(2-181)/4(21-34) protease fusion protein (pHCVNS3(2-181)/4(21-34) protease). To ensure that multiple HCV NS3/4 protease templates were present in each analyzed quasispecies, for each sample, 40 different PCR amplifications were performed and pooled before cloning. For each sample, a minimum of 100 individual plasmid clones were obtained and analyzed. The different proteases were sequenced with the flanking oligonucleotides T3 (5′-AATTAACCCTCACTAAAGGG-3′) and T7 (5′-TAATACGACTCACTATAGGG-3′) using the Big Dye v3.1 kit in the 3100 DNA sequencing system (Applied Biosystems).
Sequence alignment and editing were performed with the Sequencher version 4.1 (GeneCodes) software program. For the phylogenetic analysis, the PAUP* 4.0 software package28 was used with a GTR + G model of evolution. The final graphical output was created with the TREEVIEW software program.29 Nucleotide distances were obtained from the distance matrix generated with the PAUP* 4.0 software package used in the phylogenetic analysis. The amino acid distances, with the Poisson correction, were calculated with the MEGA 2 software package.30 The normalized Shannon entropy (Sn) was calculated as $S_n = -\sum_i (p_i \ln p_i) / \ln N$, where N is the total number of sequences analyzed and $p_i$ is the frequency of each sequence in the viral quasispecies. Sn varies from 0 (no complexity) to 1 (maximum complexity).31 To determine possible selective pressures, the proportion of synonymous substitutions per potential synonymous sites and the proportion of nonsynonymous substitutions per potential nonsynonymous sites were calculated with the WET software program (Windows Easy Tree software package,∼dopazo/software/wet.html). To estimate codon-specific selection pressures, we used a maximum likelihood method implemented in the CODEML software program from the PAML version 3.14 software package.32 To assess evidence for positive selection, different models of codon evolution were compared using likelihood ratio tests (M0 vs. M3, M1 vs. M2, and M7 vs. M8). A single codon subjected to positive selection can then be identified by a Bayesian method implemented in the same software (Table 2).
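The normalized Shannon entropy defined above can be illustrated with a minimal Python sketch. The function name and the toy sequences are assumptions made for illustration only; they are not taken from the study or from any of the software packages it cites.

```python
import math
from collections import Counter


def normalized_shannon_entropy(sequences):
    """Return Sn = -sum_i(p_i * ln p_i) / ln N for a list of sequences.

    N is the total number of sequences analyzed and p_i is the frequency
    of each distinct sequence, as in the definition given in the text.
    """
    n = len(sequences)
    if n < 2:
        # ln N is zero (or undefined), so report no complexity.
        return 0.0
    counts = Counter(sequences)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(n)


# Hypothetical toy populations of 4 clones each (not data from the study).
identical = ["ACGT"] * 4                       # a single sequence: Sn = 0
all_unique = ["ACGT", "ACGA", "ACGC", "ACGG"]  # every clone distinct: Sn = 1
mixed = ["ACGT", "ACGT", "ACGT", "ACGA"]       # one master plus one variant

for name, population in [("identical", identical),
                         ("all_unique", all_unique),
                         ("mixed", mixed)]:
    print(name, round(normalized_shannon_entropy(population), 3))
```

Dividing by ln N is what bounds the value between 0 and 1: a population in which every clone is different reaches the maximum entropy ln N.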

Table 2. Positive Selection in the HCV NS3/4 Protease Coding Region

Subject   Model comparison      −2(lnλ)     df   P value       Positions
A         M0 vs. M3 (k = 3)     0.21        4    P < 0.25
          M1 vs. M2             0.000002    2    P < 0.25
          M7 vs. M8             0.18        2    P < 0.25
B         M0 vs. M3 (k = 3)     5.6144      4    P < 0.2
          M1 vs. M2             2.3426      2    P < 0.25      R26 and V172
          M7 vs. M8             7.4032      2    P < 0.025
C         M0 vs. M3 (k = 3)     0.2018      4    P < 0.25
          M1 vs. M2             2.9654      2    P < 0.2
          M7 vs. M8             55.8        2    P < 0.0005

NOTE. Only the positive selected codons (P > 0.95) under model M8 are listed. Abbreviation: df, degrees of freedom for each model comparison.
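A likelihood ratio test of nested codon models compares the statistic −2(lnλ) = 2(lnL_alternative − lnL_null) with a chi-square distribution whose degrees of freedom equal the difference in the number of free parameters. The Python sketch below shows that calculation under stated assumptions: the log-likelihood values are hypothetical, the function names are illustrative, and it is an assumption that the P values in Table 2 were obtained against a chi-square reference in this conventional way. It is not the CODEML implementation. Because every comparison here has an even df (2 or 4), the upper-tail probability has an exact closed form and needs only the standard library.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Uses the exact closed form valid for positive even df:
    exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (statistic, p_value) for two nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf_even_df(stat, df)


# Hypothetical log-likelihoods chosen so that the statistic is about 7.4,
# the order of magnitude of the largest M7 vs. M8 statistics in Table 2.
stat, p = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-996.2984, df=2)
print(f"-2 ln(lambda) = {stat:.4f}, p = {p:.4f}")
```

With df = 2 the tail probability reduces to exp(−x/2), so a statistic of 7.4032 falls below 0.025 and a statistic of 55.8 falls below 0.0005.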

Genetic Screen for Determining the Catalytic Efficiency of HCV NS3/4 Proteases.

The catalytic efficiency of the different HCV NS3/4 proteases was determined using a bacteriophage lambda (λ)–based genetic screen as previously described.26, 27 Escherichia coli JM109 cells containing plasmid pcI.HCVcro27 that included the NS4B/NS5A (NEDCSTPCSGSWLRDVW) or the NS5A/NS5B (ASEDVVCCSMSYTWTGA) cleavage site were then transformed with plasmid pHCVNS3(2-181)/4(21-34) protease. The resulting cells were grown overnight at 30°C in the presence of 0.2% maltose, harvested by centrifugation, and resuspended to an optical density at 600 nm of 2.0 per ml in 10 mM MgSO4. To induce the expression of HCV NS3(2-181)/4(21-34) protease, cells (200 μl) were incubated in 1 ml of Luria-Bertani medium containing 12.5 μg tetracycline, 20 μg ampicillin, 0.2% maltose, 10 mM MgSO4, and 1 mM isopropyl-β-D-thiogalactopyranoside (IPTG) for 1 hour. Thereafter, cell cultures were infected with 10⁷ PFU of λ phage. After 3 hours at 37°C, the titer of the resulting phage was determined by co-plating the cultures with 200 μl of Escherichia coli XL-1 Blue cells (adjusted to an optical density at 600 nm of 2.0 per ml in 10 mM MgSO4) on Luria-Bertani plates using 3 ml top agar containing 12.5 μg tetracycline per milliliter, 0.2% maltose, and 0.1 mM IPTG. After incubation at 37°C for 6 hours, the resulting phage plaques were counted to score growth. In the experiments in which E. coli cells expressed the wild-type HCV I389/NS3-3 NS3/4 protease obtained from the subgenomic replicon,33, 34 λ phage replicated up to 6000-fold more efficiently than in cells that did not express the HCV NS3(2-181)/4(21-34) construct.

Phylogeny-Fitness Landscape Map.

For each quasispecies, the amino acid phylogenetic relationships and the enzymatic activity of the different HCV NS3/4 protease variants were plotted together with the median-joining method implemented in the NETWORK version software program.35

Molecular Modeling.

For each protease variant, a 3-dimensional (3D) protease image was generated. Visualization and structure manipulation were done using the molecular modeling package PyMol.36 The BK strain of the HCV NS3/4 protease37 was used as 3D backbone.

Nucleotide Sequence Accession Number.

HCV NS3 sequences obtained and characterized in this study have been submitted to the GenBank database under accession numbers EF013788 to EF014086.


Results

Genetic Structure of HCV NS3/4 Protease Quasispecies.

Individual plasmid clones carrying the HCV NS3 protease coding region were generated from a single time-point plasma sample from 3 HCV-infected individuals (Table 1). We cloned and sequenced 96, 97, and 103 clones from individuals A, B, and C, respectively. Neighbor-joining phylogenetic reconstruction of all NS3 protease nucleotide sequences was performed to determine the evolutionary relationships of the different variants. Sequences from each individual produced a monophyletic group, which was supported by bootstrap analysis (Fig. 1). Visual inspection of this tree and intrasample nucleotide distance (Dg) calculations (Table 1) showed that sample A was more homogeneous, at the nucleotide level, than samples B and C. Intrasample genetic distances were 0.3 ± 0.2, 1.1 ± 0.5, and 1.2 ± 0.6 for samples A, B, and C, respectively. Likewise, Shannon entropy (Sn) (0.59, 0.92, and 0.96, respectively) and nucleotide heterogeneity (Hn) (48.9, 84.5, and 91.3, respectively) calculations confirmed that quasispecies A was more homogeneous than quasispecies B and C (Table 1). The homogeneity of quasispecies A correlated with the low HCV load of this sample. A different picture was observed when calculations were performed at the amino acid level. The lowest values for Da, Ha, and Sa were found within quasispecies C, which was the most heterogeneous quasispecies at the nucleotide level (Table 1 and Fig. 1). This result reflected the different selective constraints acting on the viral quasispecies. Quasispecies A and C, although very different at the nucleotide level, displayed similar amino acid diversity. The proportion of synonymous substitutions per potential synonymous sites (ds) and the proportion of nonsynonymous substitutions per potential nonsynonymous sites (dn) were calculated to search for positive selection within the 3 analyzed HCV NS3 protease quasispecies (Table 1).
The ds/dn ratios were greater than 1 in the three quasispecies, suggesting a preponderance of genetic drift over selection within the studied coding region. Nevertheless, different values were found for each quasispecies (1.6, 4.8, and 15.9, respectively), indicating, once again, that different selective constraints could be acting on different quasispecies. Positive selective pressures were also searched for by determining the codon-specific selection with the PAML 3.14 software package. We identified 2 positively selected codons, R26 and V172, both in quasispecies B (Table 2). The substitution R26K is a very common polymorphism found in genotype 1 isolates.38 The two identified codons, R26 and V172, have not been described as belonging to a CTL, T helper, or antibody epitope.38
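The ds/dn ratios quoted above follow directly from the mean ds and dn values reported in Table 1. The short Python sketch below reproduces that arithmetic; the dictionary layout and function name are illustrative choices, and only the numeric values come from the table.

```python
# Mean ds and dn (%) for each quasispecies, copied from Table 1.
TABLE_1 = {
    "A": (0.45, 0.28),
    "B": (2.6, 0.54),
    "C": (3.5, 0.22),
}


def ds_dn_ratio(ds, dn):
    """Ratio of synonymous to nonsynonymous substitutions per site."""
    if dn == 0:
        raise ValueError("dn is zero; the ratio is undefined")
    return ds / dn


ratios = {name: round(ds_dn_ratio(ds, dn), 1)
          for name, (ds, dn) in TABLE_1.items()}
print(ratios)  # {'A': 1.6, 'B': 4.8, 'C': 15.9}
```

The ratio rises from A to C mainly because ds increases while dn stays low, which is consistent with quasispecies C being the most diverse at the nucleotide level but among the least diverse at the amino acid level.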

Figure 1.

Neighbor-joining phylogram of HCV NS3/4 protease sequences from quasispecies A, B, and C. Phylogenetic reconstruction was generated using a GTR +G model implemented in the PAUP* 4.0 beta 8 software package. Bootstrap analysis (1000 repetitions) was performed to determine the reliability of the sample grouping (numbers at branch nodes). The HCV subgenomic replicon I389/NS3-3 sequence33, 34 was used as the outgroup.

Amino acid alignments showed a similar population structure for the 3 studied quasispecies (Fig. 2). One major form, with frequencies of 58%, 45%, and 66%, respectively, was identified in the 3 quasispecies. A second group of sequences, representing between 2% and 6% of sequences, could also be found in every quasispecies. A common characteristic shared by the 3 studied quasispecies was the high proportion of unique variants. This diversity could allow the quick adaptation of these viral quasispecies to environmental changes. For instance, mutations conferring resistance to NS3/4 PIs were found in 2 quasispecies (Fig. 2). The mutations V36A and R109K, which were found in quasispecies B (clones B19 and B54), have been implicated in the acquisition of resistance to the HCV NS3/4A protease inhibitors VX-950 and SCH 6, respectively.39 Likewise, substitution V170A, found in quasispecies C (clone C30), has been implicated in the acquisition of resistance to the protease inhibitor SCH 503034.40

Figure 2.

Amino acid sequence alignment of the 3 HCV NS3/4 quasispecies. Amino acid changes are indicated relative to the master sequence. The amino acid sequences are annotated by a capital letter for each sample. The number (%) of occurrences within each sample of identical amino acid sequences is given on the right at the end of each sequence. Dots indicate amino acid sequence identity. Asterisks indicate amino acid stop codons.

The number of mutated residues found in the 3 quasispecies was 39, 59, and 35, respectively. In the 3 quasispecies, the mutated residues were scattered throughout the protein without any region accumulating a higher number of substitutions (Fig. 2). A 3D image representing the crystallographic HCV NS3/4 protease structure showed that most of the variability appeared at the protease surface, leaving critical structures such as the active site or the substrate-binding regions almost free of mutations (data not shown). In conclusion, a number of common and differential traits could be found when the genetic composition of the 3 studied quasispecies was compared. In all cases, a huge number of different protein variants were observed, including mutants having substitutions involved in the acquisition of resistance to current NS3/4A PIs. Although a high proportion of protein variants could be found in the 3 quasispecies, different degrees of nucleotide heterogeneity and diversification were detected between quasispecies, indicating that different selective forces could be acting on different infected individuals.

Catalytic Efficiency of HCV NS3/4 Protease Quasispecies Variants.

The enzymatic activity of the different identified single mutant proteases (33, 47, and 32, for quasispecies A, B, and C, respectively) was determined by using a bacteriophage λ-based genetic screen.26, 27, 41–45 First, we evaluated the enzymatic activities of single-variant proteases by engineering the HCV polyprotein NS5A/NS5B cleavage site into the cI λ repressor.27 The enzymatic activities were expressed relative to the activity of the HCV subgenomic replicon I389/NS3-3 NS3/4 protease (100%).33, 34 The three master protease enzymatic activities were 53.2% ± 10% (A24), 73.9% ± 5% (B9), and 165% ± 10% (C1) (Fig. 3). To verify that the observed differences between the 3 master sequences were attributable to differences found in the NS3 protease region and not to possible differences in the NS4A cofactor region or in the target NS5A/NS5B cleavage site, the consensus nucleotide sequence of the NS4A cofactor and the target NS5A/NS5B coding regions was determined for the 3 viral samples. These sequences confirmed that the 3 samples had an identical NS5A/NS5B cleavage site. Within the NS4A cofactor, the substitution R34K was found in samples A and B; nevertheless, site-directed mutagenesis of this residue demonstrated that this substitution did not affect the catalytic efficiency of our protease construct (data not shown). Within the 3 viral populations there were proteases with a reduced or undetectable enzymatic activity (less than 1% of the activity of the I389/NS3-3 protease). The percentage of defective proteases (less than 1% of the activity) was 39%, 29%, and 31% for the A, B, and C quasispecies, respectively. Consequently, 67% of all analyzed proteases displayed a detectable enzymatic activity (Fig. 3).
Nonfunctional proteases with mutations located at the catalytic triad (H57, D81, and S139) (clones A71, B28, B108, and C7) or at the zinc-binding site (C97, C99, C145, and H149) (clones A22, A42, B6, B20, B42, and C138) (Figs. 2, 3) were found, demonstrating the specificity of the genetic screen employed here to measure the catalytic efficiency of the different protease variants. Remarkably, 42% and 37% of the protease variants from quasispecies B and C, respectively, showed a similar or higher catalytic efficiency than the master protease. Of note, several of the high-fitness minority variants, mainly in the C quasispecies, had only one substitution when compared with the master sequence. Finally, the 3 proteases bearing substitutions involved in the acquisition of resistance to current PIs (B19, B54, and C30) showed a catalytic efficiency similar to that displayed by the master protease (Fig. 3).
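The way activities are reported in this section can be sketched in a few lines of Python: each phage titer is expressed as a percentage of the wild-type I389/NS3-3 titer, and a variant is classed as defective below 1%. The function names and the example titers are hypothetical, and the simple titer ratio is an assumption about the normalization rather than a description of the authors' exact procedure. The last function pools the per-quasispecies percentages quoted in the text to recover the overall functional fraction.

```python
def relative_activity(variant_titer, wildtype_titer):
    """Phage growth of a variant as a percentage of the wild-type control.

    Assumes a plain ratio of titers; the wild type is defined as 100%.
    """
    if wildtype_titer <= 0:
        raise ValueError("wild-type titer must be positive")
    return 100.0 * variant_titer / wildtype_titer


def is_defective(activity_percent, threshold=1.0):
    """Defective means less than 1% of wild-type activity, as in the text."""
    return activity_percent < threshold


def functional_percentage(groups):
    """Pool (number_of_variants, percent_defective) pairs across samples."""
    total = sum(n for n, _ in groups)
    defective = sum(n * pct / 100.0 for n, pct in groups)
    return 100.0 * (total - defective) / total


# Hypothetical titers (PFU/ml), for illustration only.
activity = relative_activity(variant_titer=5.3e6, wildtype_titer=1.0e7)
print(activity, is_defective(activity))

# Variant counts (33, 47, 32) and defective percentages (39, 29, 31)
# for quasispecies A, B, and C, as reported in the text.
pooled = functional_percentage([(33, 39), (47, 29), (32, 31)])
print(round(pooled))  # 67
```

Weighting each percentage by the number of variants tested in that quasispecies gives roughly 67% functional proteases overall, in agreement with the figure stated above.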

Figure 3.

Comparative growth of phages containing different HCV NS3/4 protease single variants targeting the NS5A/NS5B cleavage site. The growth of phages encoding a single protease variant (white bars) was compared with the growth of wild-type I389/NS3-3 protease (100%) (dashed bar). The gray bar indicates the master protease. At least 2 replicates were performed for each sample.

To investigate why high-efficiency minority variants were so abundant in quasispecies C, we decided to analyze the catalytic efficiency of the entire C quasispecies with a different target cleavage site. We chose the NS4B/NS5A cleavage site, which is also targeted in trans by the NS3/4 protease. First, we analyzed the catalytic efficiency of the 3 quasispecies master sequences with the NS4B/NS5A cleavage site (Fig. 4). These results were also expressed relative to the activity of the subgenomic replicon I389/NS3-3 NS3/4 protease (100%). Overall, the replicon and the 3 master proteases processed the NS4B/NS5A cleavage site more efficiently than the NS5A/NS5B site (97.4% ± 28%, 155.5% ± 42%, and 232% ± 57% for A24, B9, and C1, respectively) (Fig. 4). Importantly, and similar to the result obtained with the NS5A/NS5B cleavage site, the catalytic efficiency of the A24 master protease was, to a similar extent, lower than that of B9. Likewise, the catalytic efficiency of the B9 master protease was lower than that of the C1 protease. In contrast to the results obtained with the C1 master protease and the other 2 master proteases, 85% of quasispecies C minority variants had a lower catalytic efficiency with the NS4B/NS5A cleavage site than with the NS5A/NS5B cleavage site (Fig. 5). Overall, the variants that showed a higher catalytic efficiency than the master protease against the NS5A/NS5B cleavage site had a low efficiency when tested with the NS4B/NS5A cleavage site (Fig. 5). Remarkably, the C19 variant, which displayed more than 200% of the activity of the C1 master protease with the NS5A/NS5B cleavage site, showed no activity at all with the NS4B/NS5A cleavage site. Only variant C50 displayed, with both cleavage sites, a better catalytic efficiency than the master protease, suggesting that the viral complexity in this genomic region may not exclusively depend on the enzyme catalytic efficiency.
Likewise, variants such as C42, C124, and C141, which showed no activity with the NS5A/NS5B cleavage site, had a modest but detectable enzymatic activity with the NS4B/NS5A cleavage site (Fig. 5). Overall, the results obtained after analyzing this second cleavage site suggested that the master proteases could be the members of the quasispecies optimally capable of processing their different target cleavage sites with high efficiency.

Figure 4.

Comparative growth of phages containing the three HCV NS3/4 master proteases targeting the NS5A/NS5B or NS4B/NS5A cleavage site. The growth of phages encoding the master proteases A24, B9, and C1 (gray and black bars) was compared with the growth of wild-type I389/NS3-3 protease (100%) targeting the NS5A/NS5B cleavage site (dashed bar). The gray bars represent phages targeting the NS5A/NS5B cleavage site. The black bars represent phages targeting the NS4B/NS5A cleavage site. The dotted black bar indicates wild-type I389/NS3-3 protease targeting the NS4B/NS5A cleavage site. At least 2 replicates were performed for each sample.

Figure 5.

Comparative growth of phages containing quasispecies C HCV NS3/4 protease single variants targeting the NS5A/NS5B or NS4B/NS5A cleavage site. The growth of phages encoding a single protease variant (white and black bars) was compared with the growth of wild-type I389/NS3-3 protease (100%) targeting the NS5A/NS5B cleavage site (dashed bar). Wild-type I389/NS3-3 black bar represents cleavage at the NS4B/5A site. The white bars represent phages targeting the NS5A/NS5B cleavage site. The black bars represent phages targeting the NS4B/NS5A cleavage site. The gray and dotted bars indicate the quasispecies C master protease (C1) targeting the NS5A/NS5B and NS4B/NS5A cleavage sites, respectively. At least 2 replicates were performed for each sample.

For each of the 3 HCV NS3/4 protease quasispecies, a phylogeny–fitness landscape map was constructed by correlating the amino acid phylogenetic relationship and the catalytic efficiencies of the different variants. This correlation was performed by using the median-joining network method.35 The 3 quasispecies formed a landscape in which the ancestor (network center) sequence coincided with the master sequence and with one of the adaptive peaks (i.e., proteases with high catalytic efficiency) surrounded by closely related minority variants (Fig. 6). Within populations B and C, other adaptive peaks, formed by minority variants, were also detected around the central peak formed by the master sequence (Fig. 6). The most complex landscape was found within the B quasispecies, which had the highest amino acid diversity (Fig. 6 and Table 1). Quasispecies C formed the simplest landscape, regardless of the target cleavage site used to calculate the catalytic efficiencies. In short, several adaptive peaks were identified in the three populations, indicating that the 3 HCV protease quasispecies landscapes illustrated here were rugged.
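The notion of a rugged landscape with several adaptive peaks can be made concrete with a small Python sketch. This is not the median-joining network method used for Fig. 6; it is a simplified stand-in that, as an assumption, treats two variants as neighbors when they differ at exactly one position and calls a variant a peak when its activity exceeds that of every sampled neighbor. The sequences and activity values are hypothetical toy data.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def adaptive_peaks(fitness):
    """Return variants fitter than all of their sampled one-step neighbors.

    fitness maps each sequence to its relative catalytic efficiency.
    Variants with no sampled neighbor are not reported as peaks.
    """
    peaks = []
    for seq, value in fitness.items():
        neighbors = [other_value
                     for other, other_value in fitness.items()
                     if other != seq and hamming(seq, other) == 1]
        if neighbors and all(value > n for n in neighbors):
            peaks.append(seq)
    return peaks


# Hypothetical toy landscape (activity in % of wild type), not study data.
toy_landscape = {
    "AAAA": 100.0,  # stands in for a master sequence
    "AAAT": 40.0,
    "AATA": 10.0,
    "AATT": 120.0,  # a minority variant two steps from the master
    "TAAA": 0.0,    # a defective variant
}

peaks = adaptive_peaks(toy_landscape)
print(peaks)            # ['AAAA', 'AATT']
print(len(peaks) > 1)   # True: more than one peak, i.e. a rugged landscape
```

In this toy case the master and a two-step minority variant are both local optima separated by lower-activity intermediates, which is the qualitative pattern the text describes for populations B and C.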

Figure 6.

Phylogeny–fitness landscape map. Median-joining network showing the amino acid phylogenetic relationship and the catalytic efficiencies of the different protease variants. This phylogenetic reconstruction was performed by using the median-joining network method.35 Circles represent the different protease single clones. Circle size is proportional to the observed clone frequency in the quasispecies. Circle color represents the protease catalytic efficiency (fitness) relative to the wild-type HCV I389/NS3-3 protease. For quasispecies C, 2 maps are shown; these represent the results obtained with the NS5A/NS5B and NS4B/NS5A cleavage sites, respectively.


Discussion

The genetic analysis performed in this study showed low genetic diversity compared with other genomic regions subjected to strong selective pressure, such as the variable regions of the envelope proteins. Nevertheless, the high resolution of the quasispecies shown here also reflected the ability of HCV to explore a huge range of NS3/4 protease genetic configurations. Furthermore, the phenotypic resolution of these quasispecies has also allowed us to demonstrate that most of these genetic configurations had a detectable enzymatic activity, and that several variants had similar or better fitness than the master when assayed with the target NS5A/NS5B cleavage site. HCV replication generates variant NS3/4 proteases with quantitatively different functional properties. As expected, within these genetic configurations, mutations involved in the acquisition of resistance to current NS3/4A PIs were also found. Importantly, all the variants bearing resistance substitutions displayed a high catalytic efficiency. This is especially important because it is expected that the application of PIs would result in the rapid outgrowth of this resistant minority population. Functional differences between HCV variants may play an important role in various aspects of viral life cycle and pathogenicity.46 Together with functional quasispecies variants, several nonfunctional variants with extremely reduced enzymatic activity were also observed. Some of these variants had substitutions within the catalytic site or the zinc-binding site, demonstrating that nonfunctional variants can be encapsidated through cooperation with infectious genomes. Low-fitness variants may serve as a molecular reservoir capable of reacting swiftly to a selective pressure. Viral populations can preserve less fit variants to ensure the success of the whole quasispecies.18

The genetic variability found here is also remarkable because the effective cleavage of Toll/interleukin-1 receptor domain-containing adaptor inducing IFN (TRIF) and Cardif is required for optimal viral replication.47 These 2 host substrates are effectively immutable because they are not expected to undergo sequence evolution, unlike the viral substrate sequences, which can be expected to mutate in response to transient selective pressure.48, 49 The finding that minority variants had a very different catalytic efficiency depending on the target cleavage site tested suggests that some minority variants might be selected on the basis of their better efficiency in cleaving TRIF or Cardif. The 3 master proteases analyzed in this study seemed to be optimally capable of efficient cleavage of the 2 target sites tested. However, their efficiencies differed significantly from one another. Whether differences in the NS3/4 protease catalytic efficiency can be related to virus fitness, virulence, or pathogenicity remains to be determined.50

The simplicity of the genetic screen model used here to characterize the catalytic efficiency of the HCV NS3/4 protease can be seen as a complement to the classical biochemical approach for monitoring NS3/4 proteolytic activity. An interesting result of the current study has been the finding that our system easily allows the characterization of enzymes with different proteolytic activities. The catalytic efficiencies of the different variant proteases not only give a precise picture of the selection forces that act in the evolution of this protein but also highlight essential positions of the enzyme. Coupling mutant sequence libraries with this positive genetic selection system will allow the study of a huge number of functional mutants.

We recently performed a similar study with the HIV-1 protease.51 HIV-1 and HCV share some characteristics, being both RNA viruses and producing persistent infections in humans. Consequently, we have found some common traits between these proteases. First, every quasispecies displayed a large number of fitness optima or peaks, suggesting that protease quasispecies complexity does not exclusively depend on the enzyme catalytic efficiency. Second, phylogenetic–fitness maps were rugged. The adaptive peak of the master protease was surrounded by other adaptive peaks formed by related minority variants, indicating that the master sequence may walk through the quasispecies fitness landscape by single mutational steps without being trapped at suboptimal alleles.52 Third, although the analyzed quasispecies shared some traits such as the presence of several fitness optima, differing nucleotide heterogeneity and diversification were detected between different quasispecies, reflecting that different selective forces were acting on different infected individuals.

To understand the HCV interaction with its host, protease landscape maps in which genetic information is associated with enzyme activities should provide clues to explain adaptive mechanisms during the complex processes of genetic drift and selective pressures exerted by the host immune system or antiviral therapy.