Human leukocyte antigen–associated sequence polymorphisms in hepatitis C virus reveal reproducible immune responses and constraints on viral evolution


  • Potential conflict of interest: Nothing to report


CD8+ T cell responses play a key role in governing the outcome of hepatitis C virus (HCV) infection, and viral evolution enabling escape from these responses may contribute to the inability to resolve infection. To more comprehensively examine the extent of CD8 escape and adaptation of HCV to human leukocyte antigen (HLA) class I restricted immune pressures on a population level, we sequenced all non-structural proteins in a cohort of 70 chronic HCV genotype 1a-infected subjects (28 subjects with HCV monoinfection and 42 with HCV/human immunodeficiency virus [HIV] coinfection). Linking of sequence polymorphisms with HLA allele expression revealed numerous HLA-associated polymorphisms across the HCV proteome. Multiple associations resided within relatively conserved regions, highlighting attractive targets for vaccination. Additional mutations provided evidence of HLA-driven fixation of sequence polymorphisms, suggesting potential loss of some CD8 targets from the population. In a subgroup analysis of mono- and co-infected subjects some associations lost significance partly due to reduced power of the utilized statistics. A phylogenetic analysis of the data revealed the substantial influence of founder effects upon viral evolution and HLA associations, cautioning against simple statistical approaches to examine the influence of host genetics upon sequence evolution of highly variable pathogens. Conclusion: These data provide insight into the frequency and reproducibility of viral escape from CD8+ T cell responses in human HCV infection, and clarify the combined influence of multiple forces shaping the sequence diversity of HCV and other highly variable pathogens. (HEPATOLOGY 2007.)

Recent studies suggest that immune control of hepatitis C virus (HCV) is possible1–3 and the role of CD8 T cells is supported by studies linking particular human leukocyte antigen (HLA) class I alleles with control of HCV.4 How viral infection persists in the face of an activated host immune response is poorly understood. Several mechanisms have been suggested that may contribute to the failure to contain HCV: these include impairment of cellular effector functions,3, 5 suppression of antigen-specific cells by regulatory T cells,6–8 dendritic cell dysfunction,9 T cell exhaustion,10, 11 or deletion of antigen-specific T cells in the liver.12 However, persistence of HCV may also be facilitated by viral evolution that enables evasion of host immune responses occurring over the course of an individual infection. In the chimpanzee model, a strong association has been demonstrated between viral persistence and the development of CD8 escape mutations.13, 14 Moreover, recent studies have begun to clarify the propensity for viral escape from CD8+ T cell responses in human HCV infections15–21 and accumulation of escape mutations in a region of non-structural protein 3 (NS3) has been reported.15, 22 In HIV-1 the majority of mutations arising outside of the envelope gene are in fact driven by host CD8+ T cell responses.23 Therefore, viral escape in targeted epitopes can be widespread, and such studies are beginning to solidify our understanding of the role of host immune pressures in shaping the diversity of these pathogens. Unfortunately, the limited size of previous CD8 escape studies in human HCV infection,15–21 combined with the difficulty of detecting ex vivo CD8+ T cell responses in the peripheral blood,24–26 have precluded a broad assessment of the extent and frequency of CD8 escape mutations in HCV infection.

We sought, therefore, to determine the extent to which CD8 escape mutations were arising in a cohort of chronic HCV-infected subjects by identifying HLA class I-associated sequence polymorphisms across the HCV proteome. The relationship between HLA allele expression and viral sequence diversity in HCV will elucidate the extent to which CD8 escape occurs in human HCV infection, and provide insight into possible mechanisms by which various HLA alleles are associated with resolution of HCV infection.4 We sequenced all non-structural proteins from 70 chronic genotype 1a-infected subjects, including a portion of E2, and related viral sequence polymorphisms to the HLA class I alleles expressed by each subject. Utilizing both a previously published statistical approach,27 and a novel phylogenetic approach, multiple HLA-associated sequence polymorphisms were identified both within and outside of previously described CD8 epitopes, revealing the commonality by which viral escape from CD8+ T cell responses is occurring in human HCV infection.


aa, amino acid; CTL, cytotoxic T lymphocyte; HCV, hepatitis C virus; HIV, human immunodeficiency virus; HLA, human leukocyte antigen; nt, nucleotide; SIV, Simian immunodeficiency virus.

Patients and Methods


Seventy subjects with chronic HCV infection were recruited from the Hepatology outpatient clinic of the Massachusetts General Hospital in Boston, the Lemuel-Shattuck Hospital and the Fenway Community Health Center. Subjects were included if they presented with a positive HCV-RNA test in serum and no clinical evidence for acute infection. Only subjects infected with genotype 1a were chosen. Twenty-four of the 70 subjects were non-Caucasian, reflecting individuals of mostly African-American and Hispanic descent, and 42 of the 70 subjects were co-infected with HIV-1. The study was approved by the local Institutional Review Board, and all subjects gave written informed consent.

Detection of HLA Class I-Associated Sequence Polymorphisms

Two methods for the detection of HLA class I-associated sequence polymorphisms were used: a purely statistical algorithm based on a previously published method by Moore et al.27 and a novel phylogenetic algorithm that controls for potential sample bias due to the shared phylogeny of the analyzed sequences. A detailed description of both methods is provided as Supplementary Material (available at the Hepatology website), together with detailed information about the amplification and sequencing procedures.


Identification of Positive HLA-Associated Sequence Polymorphisms in HCV

Seventy subjects chronically infected with HCV genotype 1a were identified from our Hepatology outpatient clinics in Boston. We performed population sequencing of a 7208-nucleotide region of the HCV genome (nt 1944-9152 of H77) spanning amino acid 535 of E2 to amino acid 2937 of NS5B and representing 80% of the expressed open reading frame. HLA typing revealed that the frequencies of the most common HLA-A, -B, and -C alleles in our cohort were largely reflective of North American Caucasian populations (Fig. 1). To identify HLA-associated sequence polymorphisms we modified an algorithm previously utilized for the detection of HLA allele-associated polymorphisms in HIV-1 reverse transcriptase by Moore et al.,27 and included an analysis for associations within a 9 amino acid sliding window (see methods in Supplementary Material).
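The sliding-window idea can be sketched as follows. This is a minimal illustration only: the actual algorithm is described in the Supplementary Material, and the function names (`consensus`, `window_tables`), the rule "any difference from the window consensus counts as a variant", and the toy alignment are all assumptions made for this example.

```python
from collections import Counter

WINDOW = 9  # window length in amino acids, as stated in the text


def consensus(seqs):
    """Most common residue at each position of an aligned set of sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def window_tables(seqs, has_allele, window=WINDOW):
    """Yield (start, [[a, b], [c, d]]) for every window position.

    a/b: allele carriers with / without a variant in the window
    c/d: non-carriers with / without a variant in the window
    (Assumed definition: a variant is any difference from the consensus.)
    """
    cons = consensus(seqs)
    for start in range(len(cons) - window + 1):
        ref = cons[start:start + window]
        table = [[0, 0], [0, 0]]
        for seq, carrier in zip(seqs, has_allele):
            variant = seq[start:start + window] != ref
            table[0 if carrier else 1][0 if variant else 1] += 1
        yield start, table


# Toy alignment (hypothetical data, not taken from the study cohort).
seqs = ["ATDALMTGF", "ATDALMTGF", "ATDALMTGY", "ATDALMTGY", "ATDALMTGY"]
has_allele = [True, True, False, False, False]
tables = dict(window_tables(seqs, has_allele))
print(tables[0])
```

Each resulting 2×2 table would then be passed to an exact test and corrected for the number of windows and alleles examined; those steps are left out here.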

Figure 1.

HLA class I allele distribution of cohort. The frequencies of the most common HLA-A, -B, and -C alleles in the study cohort were reflective of the allele frequencies commonly observed in North American Caucasian populations.

A total of 15 HLA-associated sequence polymorphisms (adjusted P < 0.05) were identified across the HCV proteome (Table 1A), some at single residues and others through a 9 amino acid sliding window (W). Associations were found for HLA-A, -B, and -C alleles and within all proteins except for NS4A/B. Sequence alignments for three examples (#2, #5, and #12) that illustrate an increased frequency of mutations in subjects expressing the corresponding HLA allele are shown in Fig. 2. Three of the identified associations were within, or overlapped with, described CD8+ T cell epitopes for which HCV escape has been documented during acute or chronic HCV infection. Notably, mutations had occurred within the recently described B27-ARMILMTH2841-2849 epitope in 100% of the B27 subjects (Fig. 2A), similar to the high frequency of escape previously observed by Neumann-Haefelin et al.28 Because of such associations within described epitopes, we included a subsequent analysis of associations within a panel of 123 mapped and HLA-defined human HCV CD8+ T cell epitopes previously published or identified in our laboratory (data not shown). Unadjusted P values were determined for this more focused query to identify associations significant at P < 0.05. Here a total of 11 additional HLA associations were identified (Table 1B), including the three denoted above in Table 1A. Some of these additional associations were borderline significant when adjusted for multiple comparisons, suggesting that larger datasets might detect these in an unbiased screen as in Table 1A.

Table 1. Positive HLA Class I-Associated Sequence Polymorphisms

A. Positive HLA Associations Adjusted for Multiple Comparisons

# | HLA | Protein | Residue | Test Context | Predicted Epitope | 2×2 P-Value (Unadjusted) | 2×2 P-Value (Adjusted) | 2×2 q-Value | Phylogenetic P-Value | Phylogenetic q-Value
1^a | A01 | NS3 | 1436 (W) | ATDALMTGY | ATDALMTGY^a | 0.000003 | 0.000 | 0.013022 | 0.000125 | 0.139666
2^a | B27 | NS5B | 2841 (W) | ARMILMTHF | ARMILMTH^a | 0.000878 | 0.000 | 0.492247 | 0.001344 | 0.477314
12^a | B35 | NS3 | 1356 (W) | TVPHPNIEE | HPNIEEVAL^b | 0.000511 | 0.042 | 0.378891 | 0.000511 | 0.374489
14 | B08 | E2 | 644 (W) | CNWTRGERC | multiple | 0.000361 | 0.045 | 0.258336 | 0.036364 | 0.888502

B. Additional Positive HLA Associations in Defined Epitopes (unadjusted)

# | HLA | Protein | Residue | Test Context | Described Epitope | 2×2 P-Value (Unadjusted) | 2×2 P-Value (Adjusted) | 2×2 q-Value^c | Phylogenetic P-Value^d | Phylogenetic q-Value^c
1^a | A01 | NS3 | 1436 (W) | ATDALMTGY | ATDALMTGY^c | 0.000003 | 0.000 | 0.013022 | 0.000125 | 0.139666
 |  |  | 2841 (W) | ARMILMTHF | ARMILMTH^b | 0.000878 | 0.000 | 0.492247 | 0.001344 | 0.477314
16 | B08 | NS3 | 1395 (W) | HSKKKCDEL | HSKKKCDEL^b | 0.000763^e | 0.093 | 0.496823 | 0.000763 | 0.451654
 |  |  | 1398 (W) | KKCDELAAK | HSKKKCDEL^b | 0.000410 | 0.051 | 0.286650 | 0.000410 | 0.367591
17 | B08 | NS3 | 1402 (W) | ELAAKLVAL | ELAAKLVGL | 0.001921 | 0.217 | 0.659854 | 0.001921 | 0.541280
12^a | B35 | NS3 | 1359 (W) | HPNIEEVAL | HPNIEEVAL^b | 0.003032 | 0.226 | 0.635288 | 0.000607 | 0.390826
18 | B40 | NS5A | 2152 (W) | HEYPVGSQL | HEYPVGSQL | 0.005435 | 0.357 | 0.667755 | 0.160000 | 0.969594
 |  |  | 2163 (W) | EPEPDVAVL | EPEPDVAVL | 0.012405^e | 0.652 | 0.744563 | 0.003368 | 0.534692
22 | A24 | NS4B | 1745 (W) | VIAPAVQTN | VIAPAVQTNW | 0.035830 | 0.979 | 0.866996 | 0.186678 | 0.967253
 |  |  | 1746 (W) | IAPAVQTNW | VIAPAVQTNW | 0.035830 | 0.979 | 0.866996 | 0.186678 | 0.967253

  • (W): association observed in the sliding window analysis
  • a: included in Tables 1A and 1B
  • b: CD8 escape was documented during acute HCV infection
  • c: q-values <0.2 underlined
  • d: phylogenetic P value <0.05 bold
  • e: P > 0.05 if IUPAC codes were translated as an X (see methods in Supplementary Data)
  • f: detailed information about previously described epitopes is included in the HCV immunology database, with the exception of the recently described HLA-B27 restricted epitope (ARMILMTH) (see Neumann-Haefelin et al.28)

Figure 2.

Amino acid alignments of positive HLA-associations. Sequence alignments for HLA-associated polymorphisms #2, #5, and #12 from Table 1A are shown. Sequences derived from subjects expressing the corresponding HLA allele are shown above the line, while those from subjects not expressing the allele are shown below the line. The residue/window with the strongest association is highlighted in grey; regions for which CD8 epitopes that match the associated HLA allele are described or predicted are boxed. (A) Alignment of an association in the previously described B27-ARMILMTH epitope. (B) Alignment of an association overlapping with the previously described B35-HPNIEEVAL epitope. (C) Alignment of an association inside a predicted HLA-B35 epitope.

Therefore, a total of 23 positive HLA-associated sequence polymorphisms were detected in this cohort of genotype 1a infected subjects, suggesting the potential for numerous CD8+ T cell responses to be exerting immune pressure and driving sequence variation in human HCV infection. Fifteen of the 23 associations (65.2%) were restricted by HLA-B alleles, a striking result given the observed dominant role for HLA-B alleles in mediating the evolution of HIV-1.29

Co-infection with HIV may lead to decreased cellular immunity against HCV.30 To determine whether HIV co-infection status has an impact on the observed HLA associations, we reevaluated five of the strongest positive associations based on HIV status and CD4 count (Supplementary Fig. 1). In two of the five cases the association held up with a significant P value in both subgroups (HIV+ and HIV−), in two cases the association was lost in the HIV-negative group, and in the remaining case the association was lost in both groups. No significant associations were detected when subjects were stratified by CD4 counts (<300/μl; data not shown). However, because of the decreased sample size the statistical power was substantially reduced in these subgroup analyses, and a much larger sample size is needed to address this question adequately. We also tested whether sequence polymorphisms are located in the flanking regions of described epitopes, where they could potentially block proteasomal processing. No HLA class I-associated sequence polymorphisms in the five residues flanking previously described CD8 epitopes passed the threshold of significance after adjustment for multiple comparisons.

Negative HLA Associations or “Negatopes”

A few studies have begun to identify negative HLA associations, or ‘negatopes’ in HIV-1.27, 31, 32 Negatopes represent associations between expression of an HLA allele and amino acid substitutions that already represent highly prevalent or consensus residues (>50%) in the population.31 An analysis similar to those of Table 1 identified a total of seven negative HLA associations across the HCV proteome, including two located in previously defined CD8 epitopes (Table 2A and B). Sequence alignments for two examples (#1 and #6) that illustrate preferential selection of consensus residues in subjects expressing the corresponding HLA allele are shown in Fig. 3. Notably, three of these negatopes were restricted by HLA-A02, the most frequent HLA allele in our cohort (Fig. 1). These data raised the possibility that escape mutations within HCV CD8 epitopes restricted by common HLA class I alleles may be more prone to accumulate in the population due to continuous exposure to these selection pressures.32, 33

Table 2. Negative HLA Class I-Associated Sequence Polymorphisms

A. Negative HLA Associations Adjusted for Multiple Comparisons

# | HLA | Protein | Residue | Test Context | Predicted Epitope | 2×2 P-Value (Unadjusted) | 2×2 P-Value (Adjusted) | 2×2 q-Value | Phylogenetic P-Value | Phylogenetic q-Value

B. Additional Negative HLA Associations in Defined Epitopes (unadjusted)

# | HLA | Protein | Residue | Test Context | Described Epitope | 2×2 P-Value (Unadjusted) | 2×2 P-Value (Adjusted) | 2×2 q-Value | Phylogenetic P-Value | Phylogenetic q-Value
6 | A02 | E2 | 723 (W) | FLLLADARV | FLLLADARV | 0.002483 | 0.211 | 0.624181 | 0.007091 | 0.659425

Figure 3.

Amino acid alignments of negative HLA-associations. Sequence alignments for two examples of negative HLA-associated polymorphisms are shown. Sequences derived from subjects expressing the corresponding HLA allele are shown above the line, while those from subjects not expressing the allele are shown below the line. The residue/window with the strongest association is highlighted in grey; regions for which CD8 epitopes that match the associated HLA allele are described or predicted are boxed. (A) Alignment of an association inside a predicted HLA-A2 epitope. (B) Alignment of an association in the previously described A2-FLLLADARV epitope.

Phylogenetic Analysis of HLA Associated Polymorphisms

The above statistical approach to identify HLA-associated sequence polymorphisms examines at face value the relationship between sequence polymorphisms in HCV and HLA expression. This approach, however, incorrectly treats the phylogenetically related samples as independent events, and interprets each sequence without regard to the sequence of viruses at the time of transmission, prior to immune selection pressure. Phylogenetic analysis allows statistical estimates of sequence history, which can provide additional power to the assessment of HLA-associated polymorphisms.

Tables 1A and B provide the unadjusted phylogenetic P values for each of the originally defined positive HLA associations, and Tables 2A and B the values for the originally defined negative HLA associations (see column Phylogenetic P Value). Notably, 9 of the 23 positive HLA associations were significantly weakened under this analysis; many of those that failed had already exhibited weaker P values in the original screen. Of the initial seven negative associations identified, two (#4 and #7) did not satisfy an unadjusted P value of 0.05 upon reanalysis using the phylogenetic approach (Table 2A and B; see Phylogenetic P Value). We discuss three examples at length.

One negative association, an HLA-A02 association within a defined CD8 epitope (#7; A02-ALSTGLIHL684-692), was decidedly refuted by the phylogenetic analysis. Fig. 4A illustrates the HLA-sorted alignment of sequences within this epitope, while Fig. 5A illustrates the phylogenetic relationship of these sequences and the pattern of amino acid substitutions within this epitope. Although an association was found between the expression of HLA-A02 and expression of the consensus serine (S) residue (P < 0.039; Fig. 4A), no significance was found (P = 1.000) when assessing whether sequences likely to have originally contained a threonine residue (T) preferentially mutated to consensus serine in the presence of HLA-A02 (measurement of an A02 negatope; Fig. 5A; “escape” table). Rather, the tree reveals that all sequences containing the non-consensus threonine (T) residue are in fact phylogenetically related to one another, regardless of the expression of HLA-A02. That is, the sequences with a non-consensus residue were all related by common descent; the non-consensus residue was not associated with expression of HLA-A02. Together these data reveal a profound influence of founder effects within this epitope.

Figure 4.

Examples of HLA-associations differentially impacted by the phylogenetic analysis. (A) The A02-ALSTGLIHL epitope for which a negative association was detected. (B) The A01-ATDALMTGY epitope for which a positive association was detected. (C) The B35-EPEPDVAVL epitope for which a positive association was detected.

Figure 5.

Phylogenetic tree illustrating tree-based Fisher's exact calculations. A phylogenetic tree based on sequences derived from all 70 subjects was constructed. At the end of each branch of the tree is indicated the inferred amino acid present in the ancestral sequence (‘ancestral’) and the amino acid at the residue under investigation in the particular strain (‘current’). Also indicated at each branch and internal node of the tree is an additional indicator, a number from 0–9 in tenths or X for 1.0, which gives the probability that each sequence exhibits the consensus form. Sequences derived from subjects expressing the particular HLA allele of interest are distinguished with a purple line, while sequences derived from other subjects in the study exhibit a bold grey line. Additional untyped genotype 1a control sequences, shown in light grey, are added to strengthen the tree. At the top of each figure the specific consensus residue under consideration is shown in red, with flanking residues in black. Three tables are provided at the bottom of each figure. The first of these, a standard 2×2 Fisher's exact test, considers all HLA-typed sequences in the tree and indicates whether there is significant selection for residues in subjects expressing the given HLA allele. Two additional tree-based Fisher's exact tests are then shown. The first, “Reversion”, tests whether sequences at the test residue are reverting towards the consensus residue versus remaining stable. The second, “Escape”, tests whether sequences at the test residue are mutating away from the consensus residue versus remaining stable. (A) The A02-ALSTGLIHL epitope for which a negative association was detected. (B) The A01-ATDALMTGY epitope for which a positive association was detected. (C) The B35-EPEPDVAVL epitope for which a positive association was detected.

In contrast, the phylogenetic method (tree-based Fisher's exact tests) supported HLA-associated immune pressure for association #1, epitope A01-ATDALMTGY1436–1444 (Figs. 4B and 5B). Here, 10/11 sequences were likely to have mutated from the consensus Y to the variant F in the presence of the selecting HLA-A01 allele, versus only 11/43 in the absence of the HLA allele (Fig. 5B; “escape” table, P = 0.0001). The phylogenetic approach also enables the identification of reversions, cases where mutations are selectively being driven back toward consensus residues in the absence of a given HLA allele. Here the “reversion” table indicates whether sequences have mutated towards the consensus Y in the absence of the HLA-A01 allele. There was no significant P value for “reversion” (Fig. 5B). However, there was a trend suggesting that a larger dataset might have revealed reversion at this site in the absence of HLA-A01 mediated selection pressure, potentially indicating a fitness cost for this substitution in genotype 1a.
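The logic of the "escape" and "reversion" tables can be illustrated with a short sketch. It assumes that an inferred ancestral residue is already available for each sequence; the likelihood-based reconstruction on the tree, which is the core of the published method, is not reproduced here. The names `tree_tables` and `upper_tail` and the record layout are hypothetical.

```python
from math import comb


def tree_tables(records, consensus):
    """Build 'escape' and 'reversion' 2x2 tables.

    records: iterable of (ancestral, current, has_allele) tuples, where
    `ancestral` is the inferred residue before selection (assumed given).
    """
    escape = [[0, 0], [0, 0]]     # rows: allele+/-; cols: mutated away / stable
    reversion = [[0, 0], [0, 0]]  # rows: allele+/-; cols: reverted / stable
    for ancestral, current, has_allele in records:
        row = 0 if has_allele else 1
        if ancestral == consensus:
            escape[row][0 if current != consensus else 1] += 1
        else:
            reversion[row][0 if current == consensus else 1] += 1
    return escape, reversion


def upper_tail(table):
    """One-sided Fisher's exact P: probability of >= table[0][0] events in row 0."""
    (a, b), (c, d) = table
    row0, col0, n = a + b, a + c, a + b + c + d
    hi = min(row0, col0)
    return sum(comb(col0, k) * comb(n - col0, row0 - k)
               for k in range(a, hi + 1)) / comb(n, row0)


# Counts quoted in the text for the "escape" table of association #1:
# 10 of 11 HLA-A01 sequences mutated Y -> F, versus 11 of 43 without HLA-A01.
p = upper_tail([[10, 1], [11, 32]])
print(f"{p:.6f}")
```

With these counts the one-sided value is roughly 1.25 × 10^-4, in line with the P = 0.0001 quoted for the "escape" table; whether the published test is one- or two-sided is not stated in this section.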

Finally, association #19, the HLA-B35 restricted epitope in NS5A (B35-EPEPDVAVL2163–2171), illustrates a case in which the tree strengthens an HLA association for the single residue at position 2171 (Figs. 4C and 5C). In this case, there is no indication of an elevated rate of reversion in the absence of the presenting HLA.

Application of q-Values

Because any screening approach can identify false positives, we also included an analysis of q-values, which are based on the concept of the false discovery rate and designed specifically for the analysis of genome-wide data sets.34 Here we utilized a q-value criterion of 0.2, which corresponds roughly to a P value of 0.003, and indicates that only 20% of the associations significant at this threshold are likely to represent false positives. With this criterion we identified three particularly strong positive HLA associations within our dataset (Table 1A; see column Phylogenetic q-Value for Associations #1, #2, and #19), all in previously described epitopes. For comparison, the q-value for the original corresponding 2×2 P value is also included (Table 1A; see column 2×2 q-Value). In examining the 2×2 q-value and the phylogenetic q-value analyses, two positive associations were shared (#1, #2), while the phylogenetic analysis strengthened one (#19). Notably, however, none of the negative HLA associations remained significant with q-values <0.2 by either approach (Table 2, see columns Phylogenetic q-Value and 2×2 q-Value).
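To make the q-value idea concrete, the sketch below computes false discovery rate adjusted values with the Benjamini-Hochberg step-up rule. This is a stand-in chosen for simplicity (it equals a q-value estimate with the proportion of true nulls fixed at 1); the estimator actually used in the study is not specified in this section, and the input P values are hypothetical. The published q-values depend on every test in the screen, so they cannot be reproduced from the table rows alone.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that q never decreases with P.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


pvals = [0.001, 0.02, 0.03, 0.5]  # hypothetical P values from four tests
q = bh_qvalues(pvals)
significant = [i for i, value in enumerate(q) if value < 0.2]
print(q, significant)
```

Calling a test significant at q < 0.2 means that about one in five of the calls at that threshold is expected to be a false positive, which is the interpretation given in the text.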

Thus, from this analysis of a modest-sized data set of HCV sequences a total of 23 positive and 7 negative HLA-associations were identified using a previously published statistical approach. Upon implementation of a novel phylogenetic approach, and the use of q-values to more appropriately deal with false discovery rates, three particularly strong positive associations exhibiting q-values <0.2 were identified. It is notable that three of the associations in particular that failed the stricter q < 0.2 criteria, HLA-B08 HSKKKCDEL1395–1403, HLA-B35 HPNIEEVAL1359–1367 and HLA-B35 HPTLVFDITK881–890, reside within CD8 epitopes in which viral escape during acute HCV infection has previously been well described.15, 17 Therefore, these data suggest that larger datasets are needed to sufficiently power the identification of HLA associations in highly variable pathogens such as HCV.

Correlation of HLA-Associated Polymorphisms with Detection of CD8 Epitopes

To evaluate the data from the sequencing approach we compared the HLA-associated sequence polymorphisms with Elispot data from 10 patients utilizing a comprehensive method with overlapping peptides.30, 35 In these 10 patients a total of six CD8 responses were detected (Table 3). Notably the autologous viral sequence in five of these targeted epitopes showed sequence polymorphisms consistent with escape (bolded). We further analyzed all regions with HLA-associated polymorphisms in described epitopes restricted by each subject's HLA-alleles. Despite lack of detection of CD8 T cell responses with standard techniques ex vivo, 11 of 21 (52.4%) previously described epitopes also showed the corresponding polymorphism. Interestingly, after bulk-stimulation of PBMC the A1-1436 epitope (ATDALMTGY) was detectable in an additional two of seven HLA-A1 positive subjects without an ex vivo response suggesting that utilizing standard techniques may underestimate the CD8 immune response (data not shown).

Table 3. Sequence Polymorphisms in Targeted and Untargeted Previously Described CD8 Epitopes

Patient | A | A | B | B | Cw | Cw | Targeted CD8 Epitopes^a | Untargeted HLA-Matched CD8 Epitopes
2 | 01 | 02 | 08 | 55 | 03 | 07 | A1-1426, B55-2898^b, B55-2568^b | A2-2146, B8-1395, B8-1402
4 | 02 | 26 | 35 | 38 | 04 | 12 | B35-881 | B35-1359, B35-2171, A2-2146
6 | 01 | 02 | 08 | 44 | 01 | 01 | ND-2197 | A1-1436, B8-1395, B8-1402, A2-2146
8 | 02 | 02 | 40 | 44 | 03 | 05 | negative | A2-2146, B40-2152
9 | 02 | 24 | 08 | 41 | 07 | 17 | negative | B8-1395, B8-1402, A24-1745, A2-2146
10 | 01 | 74 | 08 | 18 | 02 | 07 | A1-1436 | B8-1395, B8-1402

  • Epitopes that are consistent with escape are in boldface
  • ND: epitope restriction not determined
  • a: These CD8 responses have been previously published.25, 30, 35
  • b: Consistent with escape - not included in manuscript because of minimum HLA criteria.


Recent reports have clarified the role of viral escape in leading to the loss of CD8+ T cell responses in human HCV infection,15–21 contributing to our understanding of the multitude of mechanisms impacting viral persistence in the face of an active immune response. However, little is known regarding the rate at which HCV actually escapes from CD8+ T cell responses at the population level, nor the extent to which HLA class I-associated immune pressures are driving the sequence diversity of HCV. A recent study reports adaptation of HCV to HLA-class I-associated selection pressure in a region of NS3.22 Here we demonstrate that numerous positive HLA class I associations can be detected throughout the HCV proteome at the population level, some occurring with surprising reproducibility, and supporting viral escape from numerous CD8+ T cell responses in human HCV infection. Furthermore, only a few potential negative associations (negatopes) were identified. This suggests that, although CD8 immune pressures are likely to affect the frequencies of viral epitopes, their proposed role in driving extinction of particular CD8 epitopes at the population level must be interpreted cautiously. Together these data provide clearer insight into the extent to which viral escape from CD8+ T cell responses is occurring in human HCV infection.

The detection of multiple HLA-associated sequence polymorphisms across the HCV proteome reveals that some CD8+ T cell responses are predictably mounted against specific CD8 epitopes, and that CD8 escape can reproducibly occur. For example, in the HLA-A01 epitope ATDALMTGY1436–1444 only 16/52 (31%) of subjects lacking HLA-A01 exhibited a tyrosine to phenylalanine (Y→F) substitution, but in subjects expressing HLA-A01 the frequency of this substitution was highly elevated (17/18; 94%), suggesting that HLA-A01 subjects must routinely target this epitope. In line with this hypothesis, 11/28 (39%) HLA-A01 positive subjects in our Boston cohort have a detectable ex vivo response against this epitope (unpublished results). Another example is the B27-ARMILMTHF2841–2849 epitope, where 6/6 (100%) HLA-B27 subjects exhibited sequence polymorphisms as compared to only 9/64 (14%) non-B27-presenting subjects. Neumann-Haefelin et al. have previously observed that 5 of 6 HLA-B27 subjects that spontaneously resolved infection mounted detectable CD8 responses against this epitope, but only 3 of 8 subjects with chronic HCV infection did so.28 This HLA-B27 association was one of the strongest associations and is notable because it identified a potential novel B27 epitope in this region prior to its recent publication. Ray et al. recently observed many amino acid substitutions associated with the expression of particular HLA alleles within another cohort of women accidentally infected with HCV in a common-source outbreak.16 While this study was limited to only 22 subjects, and only within defined CD8 epitopes, knowledge of the infecting strain permitted a similar analysis. Notably, many of the same HLA-associated polymorphisms identified by Ray et al. were evident in our dataset, clearly illustrating the reproducibility of viral escape within HCV, even across different cohorts. 
Thus, the commonality of some HLA-associated polymorphisms, which reflect CD8+ T cell immune pressures, indicates that indeed some CD8 responses are highly reproducible and consistently driving viral escape in HCV. In previous studies no clear immunodominance of specific CD8 epitopes could be detected during chronic infection.25, 35 However, the current sequencing approach suggests that some escaped epitopes were previously targeted and immune responses are no longer detectable with standard techniques. Indeed, this loss of the associated CD8 T cell response following CTL escape is also highly typical in both HIV and SIV.36 Examining viral sequence evolution may therefore provide a powerful surrogate marker for the detection of CD8+ T cell responses against HCV and a better understanding of the breadth and specificities of these responses.
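The strength of the HLA-A01 example can be checked directly from the counts quoted above (17 of 18 carriers versus 16 of 52 non-carriers with the Y→F substitution). The following is a plain two-sided Fisher's exact test written with the standard library; the function name is hypothetical, and the two-sided rule used (summing all tables no more probable than the observed one) is an assumption about how the published 2×2 values were obtained.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact P value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row0, col0, n = a + b, a + c, a + b + c + d
    denom = comb(n, row0)

    def pmf(k):
        # Hypergeometric probability of k events in row 0 with fixed margins.
        return comb(col0, k) * comb(n - col0, row0 - k) / denom

    lo, hi = max(0, row0 - (n - col0)), min(row0, col0)
    observed = pmf(a)
    # Sum every table that is at least as unlikely as the observed one.
    return sum(pmf(k) for k in range(lo, hi + 1)
               if pmf(k) <= observed * (1 + 1e-9))


# Rows: HLA-A01 carriers, non-carriers; columns: Y->F present, absent.
p_a01 = fisher_exact_two_sided([[17, 1], [16, 36]])
print(f"{p_a01:.2e}")
```

The result is on the order of 3 × 10^-6, the same magnitude as the unadjusted 2×2 value listed for association #1 in Table 1; that agreement is suggestive only, since the tabulated value comes from the sliding window analysis.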

Studies in both HIV-1 and SIV are now revealing that there are limitations to the ability of these highly variable pathogens to support sequence polymorphisms. Viral escape in HIV-1 is often limited to a single residue within the CD8 epitope, and often further limited to substitution by only a single alternative residue.23 In addition, reversion of escape mutations has now been commonly observed upon transmission of viruses to a subsequent host.15, 16, 37, 38 Here we described various HLA-associated sequence polymorphisms in HCV CD8 epitopes that exhibit varying degrees of conservation. Again the A01-ATDALMTGY1436–1444 and B27-ARMILMTH2841–2849 epitopes provide illustrative examples (Figs. 2A, 4B, 5B). In the case of the A01 epitope, these data indicate that there may be fewer functional constraints at work to maintain one particular residue at this position. In a similar analysis in a different cohort a negative association has been described for this epitope indicating potential deletion from this population.22 However, phylogenetic evaluation of the same association in our cohort reveals a trend towards reversion in the absence of selection pressure. Alternatively, the B27 epitope region appears much more conserved and therefore may exact higher fitness costs to the virus upon escape. Recent data in both HIV-1 and SIV now illustrate the specific impact that particular CD8 escape mutations can have on viral replication capacity,39, 40 and suggest some finite space within which these pathogens can functionally exist. Indeed, it is notable that some of the HLA associations we report were driven by a single polymorphic site within the epitope, suggesting strict constraints on sequence variation at many residues across the HCV proteome.

Recent studies suggest that continuous exposure of highly polymorphic viruses to focused immune pressures may result in the accumulation of CD8 escape mutations, and thus the eventual loss of some CD8 epitopes within a population.31, 32 A follow-up evaluation of these data using the novel phylogenetic approach described herein revealed that most of those negative HLA associations were due to founder effects rather than immune pressure.42 Applying this phylogenetic approach to the current HCV dataset revealed that some of the negative HLA associations in our study were similarly influenced by founder effects. Overall, comparison of the Fisher's exact test and the phylogenetic approach in this modestly sized data set also revealed that while some associations were weakened by the phylogenetic approach, others were strengthened. Larger datasets are needed, therefore, to determine to what degree both positive and negative HLA associations are present in both HIV-1 and HCV at the population level, and to what degree these two approaches to identifying HLA associations are complementary.

Taken together, these data reveal the combined influence of multiple forces shaping the sequence diversity of HCV in the human population. The vast sequence diversity of viruses such as HCV, HIV-1, and SIV, and the highly polymorphic nature of the MHC class I loci, represent critical evolutionary characteristics governing the survival of both pathogen and host. Examination of these complex interactions in larger cohorts reveals patterns of host/pathogen co-evolution and a clearer picture of factors governing immune control.27, 29, 31, 41 Our data provide an important step towards elucidating the role of CD8 escape mutations in contributing to viral persistence and control of HCV.


We thank David Heckerman for pointing us to the q-test.