Pathogens adapt to antibody surveillance through amino acid replacements in targeted protein regions, or epitopes, that interfere with antibody binding. However, such escape mutations may exact a fitness cost due to impaired protein function. Here, it is hypothesized that the recurring generation of specific neutralizing antibodies to an epitope region as it evolves in response to antibody selection will cause amino acid reversions by releasing early escape mutations from immune selection. The plausibility of this hypothesis was tested with stochastic simulation of adaptation at the molecular sequence level in finite populations. Under the conditions of strong selection and weak mutation, the rates of allele fixation and amino acid reversion increased with population size and selection coefficients. These rates decreased with population size, however, if mutation became strong, because clonal interference reduced the rate of adaptation. The model successfully predicts the rate of reversion per allele fixation for an important human immunodeficiency virus type 1 (HIV-1) antibody epitope region. Therefore, antibody selection may generate complex adaptive dynamics.
Vertebrate adaptive immunity involves the specific recognition of pathogens and foreign molecules, or antigens, by T cells and B cells (Kindt et al. 2006). Membrane-bound receptors that recognize specific regions of antigens (epitopes) are generated during the maturation of both types of cells in a process unique to adaptive immunity. During the maturation of each cell, segments of receptor genes present in the germ line are randomly rearranged to form receptor-coding genes, giving each cell a single, randomly determined antigen specificity. By this process, T cells are estimated to exhibit approximately $10^{9}$ receptor specificities and B cells approximately $10^{10}$ specificities, although the majority of each of these is eliminated because they recognize host antigens (Kindt et al. 2006). On subsequent exposure to antigen, cells bearing receptors with high affinity to an epitope on the antigen are clonally expanded. Therefore, the adaptive immune response is expected to impose strong selection on pathogens, discriminating among individual variants (Hamilton 1992; Frank 2002). Adaptive immunity also exhibits memory in the form of long-lived memory T and B cells that are expanded on initial exposure to an antigen and proliferate rapidly upon reexposure to the same antigen. As such, recognition of a particular pathogen variant by adaptive immunity may last the lifetime of the host.
The nature of this selection and the evolutionary responses by pathogens have been well studied for humans naturally infected with human immunodeficiency virus type 1 (HIV-1) and for macaques experimentally infected with a simian immunodeficiency virus (SIV). In the cell-mediated branch of adaptive immunity, cytotoxic T lymphocytes (CTLs) recognize processed epitopes in the form of short peptides (8–10 amino acids) from proteins of pathogens that infect cells. These epitopes are presented on the surface of all nucleated cells by class I major histocompatibility complex (MHC) proteins (Kindt et al. 2006). Unequivocal evidence of selection by CTLs in vivo was first reported for rhesus macaques experimentally infected with an SIV (Evans et al. 1999; Allen et al. 2000). In these studies, macaques with different class I MHC genotypes, presenting different SIV CTL epitopes, selected for amino acid variants in epitope and nearby regions that interfered with epitope processing and presentation by the host's MHC or with epitope recognition by the host's CTLs. Such escape mutations, however, may exact a fitness cost to the virus in terms of reduced protein function, as demonstrated directly for HIV-1 (Schneidewind et al. 2007). Moreover, the particular escape mutation selected may depend on the trade-off between the increase in fitness due to immune evasion and the decrease in fitness due to impaired protein function (Schneidewind et al. 2008). Given fitness costs of escaping immune surveillance, it is predicted that in the absence of immune selection an escape mutation will revert to wild type. This has been observed with the transmission of virus between hosts with disparate class I MHC genotypes, such that escape mutations in the donor are no longer under selection in the recipient (Friedrich et al. 2004; Leslie et al. 2004; Fernandez et al. 2005; Navis et al. 2008).
Reversion of escape mutations has also been observed for HIV-1 transmitted between hosts with the same class I MHC genotype (Allen et al. 2004). In this case, the reversion is attributed to a delay in the immune response to the targeted epitope in the recipient because the targeted epitope is initially absent. However, the reversion of escape mutations in a new host may be delayed if a second mutation compensating for the fitness loss caused by the escape mutation has also been selected in the donor (Crawford et al. 2007).
In the humoral branch of adaptive immunity, B cells recognize soluble antigens and antigens on the surface of pathogens (Kindt et al. 2006). On exposure to antigen, B cells undergo affinity maturation, a process unique to these cells. Somatic hypermutation introduces mutations into the variable regions of the genes coding for the antigen-binding regions of membrane-bound immunoglobulin receptors at a rate of approximately $10^{-3}$ per nucleotide per cell generation. Thus, as each cell divides it produces descendants varying in the antigen-binding region of their receptor. Those descendants that bind to antigen with higher affinity have higher survival and undergo further hypermutation and proliferation. Repetition of this cycle produces B cells with immunoglobulin receptors with very high affinity to the circulating antigen, and these receptors are eventually secreted as antibodies. Neutralizing antibodies target HIV-1 envelope glycoproteins on the surface of the virus (Zolla-Pazner 2004; Pantophlet and Burton 2006), selecting for neutralization-resistant virus (Albert et al. 1990; Richman et al. 2003; Wei et al. 2003; Frost et al. 2005; Mahalanabis et al. 2009; Moore et al. 2009; Rong et al. 2009). The resistant variants then stimulate the production of new antibodies and the cycle is repeated, causing the emergence of resistant virus at intervals of 3–10 months (Richman et al. 2003; Wei et al. 2003). Escape from antibody neutralization may involve single amino acid replacements in an epitope (McKeating et al. 1993; Yoshiyama et al. 1994; Mo et al. 1997; Zwick et al. 2005; Shibata et al. 2007), which presumably interfere directly with antibody binding, or replacements outside of the epitope, which may affect antibody binding indirectly through conformational changes (Watkins et al. 1996; Rong et al. 2009). These escape mutations may affect protein function and reduce viral viability (McKnight et al. 1995; Mo et al. 1997; Manrique et al. 2007).
Consistent with escape mutations exacting a fitness cost, reversions have also been observed with the removal of antibody selection on the bacterium Mycoplasma bovis in culture (Le Grand et al. 1996).
Here, it is hypothesized that the generation of high-affinity specific antibodies causes amino acid reversions in epitopes within a host through the repeated selection of escape mutants. This is envisioned to occur because selection of consecutive escape mutants in the same epitope by different neutralizing antibodies may release earlier escape mutations at individual sites from immune selection, allowing them to revert to wild type. This form of fluctuating selection is in contrast to that observed for CTL surveillance, in which immune selection on an epitope is relaxed when a virus infects a new host individual that does not target the epitope. A population genetic model incorporating antibody selection and genetic drift is developed and investigated through stochastic simulation. Antibody selection is shown to cause amino acid reversions, and the rate of reversion is shown to increase with the strength of selection. However, clonal interference may reduce the rate of reversion in moderately large populations. Predictions of moderate rates of reversion in HIV-1 are consistent with estimates from empirical data for an important antibody epitope region. The conditions of strong selection and weak mutation observed for HIV-1 and diverse other pathogens suggest that amino acid reversion due to antibody selection may be common. Antibody selection may generate complex adaptive dynamics.
A set of amino acid sites that interact with antibodies (binding sites) is represented as a binary string of length $L$, with 0s representing wild-type amino acids and 1s representing mutant amino acids. The amino acid sites, which are not necessarily sequential in a protein, represent sites at which mutations may interfere with antibody recognition. A unique binary string is referred to as an allele, and an allele that is targeted (recognized) by an antibody is an epitope. Although similar epitopes are known to be bound by the same antibody, sites at which mutations produce such antibody cross-reactivity were not considered. The sites modeled are only those at which mutation potentially abrogates recognition by an antibody, converting an epitope into an allele that “escapes” recognition. In the absence of an antibody response, the wild-type allele, with no mutations, has the highest fitness (highest protein function) and each mutation causes an equal and independent decrease in fitness. Therefore, the relative fitness of allele $i$ is
$$w_i = (1 - s_m)^{m_i} (1 - s_e)^{e_i},$$
where $s_m$ is the selection coefficient for a mutant residue relative to a wild-type residue, and therefore takes a value from 0 to 1, $m_i$ is the number of mutant sites in allele $i$, $s_e$ is the selection coefficient for an epitope relative to an untargeted allele (the reduction in relative fitness due to antibody recognition), ranging from 0 to 1, and $e_i$ is an indicator variable that is 0 if allele $i$ is not currently targeted by an antibody and has never been targeted, and 1 if the allele is currently targeted or has been targeted previously.
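The fitness model described here, in which each mutant site multiplies fitness by (1 − s_m) and antibody recognition multiplies it by (1 − s_e), can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `relative_fitness` and its argument names are hypothetical and are not taken from the deposited code.

```python
def relative_fitness(allele: str, targeted: bool, s_m: float, s_e: float) -> float:
    """Relative fitness of a binary allele such as "001".

    allele   -- string of 0s (wild type) and 1s (mutant), one character per binding site
    targeted -- True if the allele is, or has ever been, targeted by an antibody
    s_m      -- fitness cost per mutant residue (hypothetical argument name)
    s_e      -- fitness cost of being recognized as an epitope (hypothetical argument name)
    """
    m_i = allele.count("1")        # number of mutant sites in the allele
    e_i = 1 if targeted else 0     # indicator for current or past antibody targeting
    return (1.0 - s_m) ** m_i * (1.0 - s_e) ** e_i
```

For example, with s_m = 0.1 and s_e = 0.5, an untargeted single mutant has fitness 0.9, and the same allele once targeted has fitness 0.9 × 0.5 = 0.45.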
When an epitope is driven to extinction, the most common allele is then targeted by an antibody, becoming an epitope (if this allele was previously targeted, it remains an epitope). This is consistent with the observation for HIV-1 that the virus population within a patient is resistant to neutralization by contemporaneous blood serum and sensitive to neutralization by serum from later time points, which is explained by a lag in generating neutralizing antibodies to the existing viral variants (Richman et al. 2003; Wei et al. 2003). Any new mutation in the epitope's $L$ binding sites, including a mutation to wild type, that produces a new allele will serve as an escape mutation from the targeting antibody and from any other antibody that has previously been elicited to other epitopes. Therefore, escape mutations increase the component of fitness due to neutralization resistance, but will carry a fitness cost of decreased protein function if the mutated residue is not wild type. Note that because of immunological memory, if a mutation produces an allele that has previously been targeted, the allele continues to be treated as an epitope. Under this model, the following chain of events, involving only single mutations, would lead to an amino acid reversion. Assuming $L = 3$ antibody-binding sites, the wild-type allele, 000, is initially fixed in the population. This allele is immediately targeted by an antibody and is therefore an epitope with fitness $(1 - s_e)$. Any mutation will provide escape from antibody recognition. If the escape mutant 001, with fitness $(1 - s_m)$, replaces the wild-type allele, it becomes an epitope with fitness $(1 - s_m)(1 - s_e)$. If the subsequent escape mutants spreading to fixation are 101 followed by 100, then the amino acid at the third site, providing the initial escape mutation, has reverted to wild type and is counted as a reversion. Note that for any escape mutation to spread, $s_m < s_e$ is required, because it must be true that $(1 - s_m) > (1 - s_e)$.
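The chain of events described here can be made concrete with a small sketch that counts reversions along a sequence of fixed alleles. It assumes, as in the model, that the wild-type allele is fixed initially, so that any change from 1 to 0 at a site is a return to wild type. The function name `count_reversions` is hypothetical.

```python
def count_reversions(fixed_alleles):
    """Count, per site, the changes from mutant (1) back to wild type (0)
    along a time-ordered list of fixed alleles, e.g. ["000", "001", "101", "100"].
    """
    reversions = [0] * len(fixed_alleles[0])
    # Compare each fixed allele with the one that replaced it.
    for previous, current in zip(fixed_alleles, fixed_alleles[1:]):
        for site, (before, after) in enumerate(zip(previous, current)):
            if before == "1" and after == "0":
                reversions[site] += 1
    return reversions


# The example from the text: the third site mutates first and later reverts.
print(count_reversions(["000", "001", "101", "100"]))
```

For the path 000 → 001 → 101 → 100, only the third site registers a reversion.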
The evolution of a population was simulated stochastically in discrete generations with a Wright–Fisher model of reproduction. A binary sequence $L$ sites long has $c = 2^{L}$ possible alleles. Recurrence equations were used to track changes in allele frequencies due to selection and mutation. If allele $i$ has frequency $x_i$ before selection, then its frequency after selection is
$$x_i' = \frac{x_i w_i}{\bar{w}},$$
where $w_i$ is the relative fitness of allele $i$ and $\bar{w} = \sum_{j=1}^{c} x_j w_j$ is the mean fitness of the population.
The frequency of allele $i$ after mutation, $v_i$, is the sum of the products of the frequency of each allele $j$ before mutation (that is, after selection), $x_j'$, and the probability of obtaining $i$ after mutation of $j$:
$$v_i = \sum_{j=1}^{c} x_j' \, \mu_f^{n^{10}_{ij}} \, \mu_b^{n^{01}_{ij}} \, (1 - \mu_b)^{n^{11}_{ij}} \, (1 - \mu_f)^{n^{00}_{ij}},$$
where $\mu_f$ is the forward per-amino acid mutation rate, $\mu_b$ is the backward mutation rate, $n^{10}_{ij}$ is the number of amino acid mutants (1) in allele $i$ but not in $j$, $n^{01}_{ij}$ is the number of wild-type amino acids (0) in $i$ but not in $j$, $n^{11}_{ij}$ is the number of amino acid mutants in both $i$ and $j$, and $n^{00}_{ij}$ is the number of wild-type amino acids in both $i$ and $j$. Recombination was not considered because, although HIV-1 has a high rate of crossovers between the two RNA copies of its genome during reverse transcription, recombination between nonidentical genomes depends on the frequency of coinfection of a host cell, which is rare (Josefsson et al. 2011). Genetic drift was incorporated by generating a random count for each allele in the next generation from a multinomial distribution with probabilities of possible mutually exclusive outcomes on any trial equal to the allele frequencies in the current generation ($v_1, \ldots, v_c$), and a number of independent trials equal to the population size, $N$. This approach produces the same results as exact stochastic simulation, but is computationally much faster (Gillespie 1993).
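One generation of the procedure described here (selection, then mutation, then multinomial sampling of N individuals) might be sketched as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not the deposited simulation code: all function and variable names are hypothetical, fitness values are passed in as a list so that the bookkeeping of which alleles are targeted stays outside the sketch, and the multinomial draw is emulated by N weighted draws with `random.Random.choices`.

```python
import random
from collections import Counter
from itertools import product


def all_alleles(L):
    """All 2**L binary alleles of length L, e.g. "000", "001", ..."""
    return ["".join(bits) for bits in product("01", repeat=L)]


def mutation_prob(i, j, mu_f, mu_b):
    """Probability that allele j is converted to allele i by mutation."""
    p = 1.0
    for site_i, site_j in zip(i, j):
        if site_j == "0":   # wild-type site: forward mutation or no change
            p *= mu_f if site_i == "1" else 1.0 - mu_f
        else:               # mutant site: backward mutation or no change
            p *= mu_b if site_i == "0" else 1.0 - mu_b
    return p


def next_generation(freqs, fitness, alleles, N, mu_f, mu_b, rng):
    """Return allele frequencies after selection, mutation, and drift."""
    # Selection: weight each frequency by relative fitness and renormalize.
    w_bar = sum(x * w for x, w in zip(freqs, fitness))
    after_selection = [x * w / w_bar for x, w in zip(freqs, fitness)]
    # Mutation: sum over all source alleles j for each target allele i.
    after_mutation = [
        sum(x_j * mutation_prob(i, j, mu_f, mu_b)
            for x_j, j in zip(after_selection, alleles))
        for i in alleles
    ]
    # Drift: N independent draws with the post-mutation frequencies as
    # probabilities, which is equivalent to one multinomial sample.
    draws = rng.choices(range(len(alleles)), weights=after_mutation, k=N)
    counts = Counter(draws)
    return [counts.get(k, 0) / N for k in range(len(alleles))]
```

Passing a seeded `random.Random` instance as `rng` makes a replicate reproducible. Because the mutation probabilities for any source allele sum to 1 over all target alleles, the post-mutation frequencies remain a valid probability vector for the sampling step.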
With strong selection ($Ns > 1$) and weak mutation ($N\mu \ll 1$), a new beneficial mutation that survives stochastic loss in a finite population is expected to spread to fixation before the emergence of the next beneficial mutation to survive stochastic loss (Gillespie 1991; Orr 2002). Under these conditions, there is no clonal interference between beneficial mutations (Gerrish and Lenski 1998; Rozen et al. 2002) and the probability that a particular beneficial mutation is the next to spread to fixation is proportional to its selection coefficient (Orr 2002). With clonal interference, rates of adaptation are lower (Gerrish and Lenski 1998; Rozen et al. 2002; Kim and Orr 2005; Barrett et al. 2006). These adaptive dynamics were explored in simulations by modifying population size, selection coefficients, and the available number of beneficial mutations.
Ranges of parameter values are centered around those estimated for HIV-1. The effective population size of HIV-1 is approximately $10^{3}$ infected cells (Leigh Brown 1997; Nijhuis et al. 1998; Rodrigo et al. 1999; Drummond et al. 2002; Seo et al. 2002; Achaz et al. 2004; Shriner et al. 2004), which is much lower than the census population size of approximately $10^{7}$–$10^{8}$ (Chun et al. 1997). The model population sizes are effective population sizes and ranged from $N = 10^{2}$ to $10^{5}$. The forward mutation rate was kept constant at $\mu = 10^{-5}$ per site per generation, the high rate typical of HIV-1 and other RNA viruses (Mansky and Temin 1995; Sanjuan et al. 2010). Backward mutation involves mutation to a specific (wild-type) amino acid. Because by single nucleotide mutational steps an amino acid may mutate to a maximum of six different amino acids (assuming all nucleotide changes at the first and second codon positions are nonsynonymous), the backward mutation rate was one-sixth the forward rate. Therefore, scaled (forward) mutation, $N\mu$, ranged from $10^{-3}$ to 1, spanning a broad range of mutation strengths. Selection on antibody epitope regions of HIV-1 may be strong (Williamson 2003) and is consistent with a selection coefficient of $s \approx 0.1$–$0.9$ (da Silva 2010). Selection coefficients used were $s = 0.01$ and 0.1 for the cost of escape and 0.5 and 0.9 for antibody selection, giving a broad range of scaled selection: $Ns = 1$ to $10^{5}$. HIV-1 neutralizing antibody epitopes typically range from four to 15 amino acids (Yusim et al. 2009), but with only a few amino acids important in antibody binding (e.g., Stanfield et al. 2004; Zwick et al. 2005; Bell et al. 2008; Bryson et al. 2009). Analysis of amino acid sites for evidence of selected mutations in an important HIV-1 antibody epitope region in the present study showed that, averaged across patients, approximately four sites are involved (see Discussion). Therefore, the numbers of epitope-binding sites were $L = 3$ and 5.
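The scaled parameters quoted here follow directly from the stated population sizes, mutation rate, and selection coefficients, and a short sketch makes the arithmetic explicit. The function names are hypothetical. The cutoff used to separate weak from strong mutation (a scaled mutation rate of 0.5) is an arbitrary assumption, chosen only so that a scaled rate of 0.1 is classed as weak and a scaled rate of 1 as strong.

```python
MU_FORWARD = 1e-5                 # forward mutation rate per site per generation
MU_BACKWARD = MU_FORWARD / 6      # backward rate is one-sixth the forward rate


def scaled_parameters(N, s, mu=MU_FORWARD):
    """Return (N*s, N*mu), the scaled selection and scaled mutation."""
    return N * s, N * mu


def regime(N, s, mu=MU_FORWARD, weak_mutation_cutoff=0.5):
    """Classify a parameter set; the cutoff of 0.5 is an assumption for illustration."""
    Ns, Nmu = scaled_parameters(N, s, mu)
    selection = "strong selection" if Ns > 1 else "weak selection"
    mutation = "weak mutation" if Nmu < weak_mutation_cutoff else "strong mutation"
    return selection, mutation


# Tabulate the parameter grid: N = 10^2 .. 10^5 and the four selection coefficients.
for N in (10**2, 10**3, 10**4, 10**5):
    for s in (0.01, 0.1, 0.5, 0.9):
        Ns, Nmu = scaled_parameters(N, s)
        print(f"N={N:>6}  s={s:<4}  Ns={Ns:<9.2f} Nmu={Nmu:<7.3f} {regime(N, s)}")
```

The smallest combination gives a scaled selection of 1 and the largest 9 × 10^4, which is the approximate upper bound of 10^5 quoted in the text.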
Simulations were run for 2000 generations. With an HIV-1 generation length of approximately two days (Markowitz et al. 2003; Murray et al. 2011), this is equivalent to about 11 years, the typical life span of an untreated patient. Results are means of 100 replicate simulations.
The computer code and sequence data analyzed have been deposited in the Dryad repository (datadryad.org) (doi:10.5061/dryad.69mk7).
ALLELE FIXATIONS MAY CAUSE REVERSIONS
Under some conditions, several alleles spread to fixation over the 2000 simulated generations. An example is shown from a single replicate simulation with $L = 3$ epitope-binding sites, population size $N = 10^{3}$, mutant selection coefficient $s_m = 0.1$, and epitope selection coefficient $s_e = 0.5$ (Fig. 1). As in all simulations, the allele with the wild-type amino acid (0) at each site, in this case 000, was initially fixed. This allele was immediately targeted by an antibody and eventually replaced by an allele carrying an escape mutation. When the new, beneficial allele fixed, it was also targeted by an antibody and then replaced by a new allele carrying an escape mutation. Note that each fixation of an allele involved a single mutational step, as expected under the conditions of strong selection ($Ns > 1$) and weak mutation ($N\mu \ll 1$): 000 → 010 → 011 → 111 → 101 → 100. This sequential fixation of alleles caused the amino acids at sites 2 and 3 to revert to wild type.
FIXATIONS AND REVERSIONS UNDER STRONG SELECTION AND WEAK MUTATION
With population sizes $N = 10^{2}$–$10^{4}$, the conditions for strong selection ($Ns > 1$) and weak mutation ($N\mu \ll 1$) were generally met because $\mu = 10^{-5}$ and selection coefficients ranged from 0.01 to 0.9. Under these conditions, only one beneficial allele, which differs from the previously fixed epitope sequence by a single mutation, is expected to spread toward fixation at a time, as shown above. For this range of population sizes, the number of allele fixations and the number of reversions to the wild-type amino acid per site occurring over 2000 generations generally increased with $N$ and the selection coefficients (Table 1). For example, with $L = 5$, $s_m = 0.1$, and $s_e = 0.9$, the mean number of fixations increased from 18.12 to 24.76, and the mean number of reversions per site increased from 1.57 to 2.98, when $N$ increased from $10^{3}$ to $10^{4}$. The increase in numbers of fixations and reversions with $N$ and the selection coefficients is explained by the increase in the strength of selection, $Ns$, while mutation remains weak ($N\mu \ll 1$). That is, in the absence, or near absence, of clonal interference, fixation occurs at a higher rate because of stronger scaled selection.
Table 1. The effects of the number of epitope-binding sites (L), population size (N), selection by antibodies (se), and the fitness cost of escape mutations (sm) on the mean numbers (100 replicate simulations) of allele fixations (F) and reversions to wild type per site (R) over 2000 generations.
The highest values occurred with $N = 10^{4}$, $s_m = 0.1$, and $s_e = 0.9$. For $L = 3$, these are 7.38 fixations and 1.39 reversions per site, and for $L = 5$ these are 24.76 fixations and 2.98 reversions per site (Table 1). These numbers of fixations are close to the maxima for each number of antibody-binding sites: seven possible escape alleles (out of eight possible alleles) for $L = 3$, and 31 for $L = 5$.
FIXATIONS AND REVERSIONS UNDER STRONG SELECTION AND STRONG MUTATION
With $N = 10^{5}$, $N\mu = 1$, and therefore the condition for weak mutation ($N\mu \ll 1$) is no longer met. There are two potentially important consequences of strong mutation. First, a spreading beneficial allele may give rise, by a single mutation, to a fitter allele that may then also spread, resulting in the fixation of an allele with two mutations. Second, more than one beneficial allele may arise directly from an existing allele by single mutations. In both cases, clonal interference is generated (Gerrish and Lenski 1998). Increasing $N$ from $10^{4}$ to $10^{5}$ reduced both the number of allele fixations and the number of reversions per site (Table 1). For example, with $L = 5$, $s_m = 0.1$, and $s_e = 0.9$, the number of fixations decreased from 24.76 to 7.38, and the number of reversions per site decreased from 2.98 to 1.44. The proportional decrease was greater for $L = 5$ than $L = 3$ because of the greater opportunity for clonal interference with a greater number of possible escape mutations. Figure 2 shows frequencies of alleles from single replicate simulations under these conditions. With $N = 10^{4}$, and thus weak mutation ($N\mu = 0.1$), alleles tend to spread sequentially and fixations are common, whereas with $N = 10^{5}$, and thus stronger mutation ($N\mu = 1$), fixations are fewer because of the many competing escape alleles.
The recurrent targeting of the same epitope region by different specific antibodies was shown by stochastic simulation to cause multiple fixations of antibody escape alleles, which in turn caused reversions at individual antibody-binding sites. The reversions were due to mutant residues being released from functioning in antibody escape as new escape mutants were selected, and the constant selection for high protein function. With a moderately strong cost of escape ($s_m = 0.1$), strong selection by antibodies ($s_e = 0.9$), five epitope-binding sites, and a population size of $10^{4}$, there were a mean of 24.76 allele fixations and a mean of 2.98 reversions per site over 2000 generations.
Recent studies have emphasized that clonal interference will be strongest in moderately large populations (Gerrish and Lenski 1998; Otto and Barton 2001; Barton and Otto 2005; Kim and Orr 2005). In very small populations, beneficial mutations that survive stochastic loss spread to fixation one at a time, consistent with the expectations from strong selection and weak mutation (Gillespie 1991; Orr 2002). In very large populations, multiple beneficial mutations are likely to occur in the same individual and thus fix together. It is in moderately large populations that multiple beneficial mutations, having survived stochastic loss, and occurring separately in different genomes, will compete and thus reduce the rate of adaptation. This situation is equivalent to negative linkage disequilibrium among beneficial mutations, which reduces the variance in fitness and, by Fisher's fundamental theorem of natural selection, decreases the rate of adaptation (Fisher 1930; Otto and Barton 2001; Barton and Otto 2005). These effects have been confirmed in the present model, explaining the reduced rates of fixation and reversion when the population size is increased to $10^{5}$.
Pathogens as diverse as the protist Plasmodium falciparum and the bacterium Escherichia coli exhibit weak mutation ($N\mu \ll 1$) and are likely to exhibit strong selection ($Ns > 1$) (da Silva 2010), suggesting that amino acid reversions due to antibody selection may be common. However, there appear to be no published reports of such reversions in vivo. The reason may be that there are few long-term studies of antibody epitope sequence changes in any pathogen. A search of the entire PubMed database (http://www.ncbi.nlm.nih.gov/pubmed) with the terms “antibody” and “escape” and “reversion” anywhere in an article returned 15 articles. However, none of the articles report reversions in the sense used here, where a reversion occurs to an earlier residue within the host; several report naturally occurring or selected reversions to wild type, strain consensus residues, or a virulence phenotype not initially present in the host.
Therefore, the rate of reversions in an HIV-1 antibody epitope region was calculated from abundant publicly available sequence data and compared to predictions from the model. HIV-1 exhibits strong selection and weak mutation (da Silva 2010). For HIV-1, with $N \approx 10^{3}$, $\mu \approx 10^{-5}$, and $s_m \approx 0.1$, the model predicts that over 2000 generations (∼11 years), there will be on average 0.87 (mean for $s_e = 0.5$ and 0.9) and 1.40 reversions per site for $L = 3$ and 5 binding sites, respectively (Table 1). For the purpose of comparisons with estimates from empirical data, which are based on varying numbers of viral sequence samples taken from patients over varying numbers of years, these values were normalized by the number of allele fixations observed for the same conditions, giving 0.137 and 0.085 reversions per site per fixation for $L = 3$ and 5, respectively. Amino acid reversion frequency was calculated for the HIV-1 exterior envelope glycoprotein (gp120) third variable region (V3). The 35 amino acid V3 region is an important target of neutralizing antibodies (Zolla-Pazner 2004; Pantophlet and Burton 2006) and interacts with host-cell chemokine receptors in a crucial step in cell entry (Huang et al. 2005, 2007; Xiang et al. 2010), generating much research interest and sequence data. This region has also been shown to be under positive selection (e.g., Nielsen and Yang 1998; Gerrish 2001; Williamson 2003; Templeton et al. 2004). V3 sequences from the most frequently sequenced HIV-1 subtype (B) were downloaded from the HIV Sequence Database (http://www.hiv.lanl.gov) for patients from whom at least 20 viral sequences were sampled in each of at least three years (the minimum number of samples required to detect a reversion). Sequences that were not 35 amino acids long, contained undefined residues or did not have terminal cysteines, as observed for functional sequences, were discarded. The resulting sequences did not need to be aligned.
This dataset contained 20 patients, each with an average of four samples taken over 5.75 years and an average total of 135.6 sequences (a total of 2711 sequences for all patients). A change in the 50% consensus amino acid at one or more sites between consecutive samples from a patient was used as an indicator of recent or impending fixation of a V3 allele. Similarly, a change in the 50% consensus amino acid at a site to a residue present in an earlier sample was used to indicate a reversion. Averaged across patients, there were 3.95 amino acid sites involved in 2.10 fixations and 0.18 reversions per site. This gives 0.086 reversions per site per fixation, which is very close to the predicted value of 0.085 for five epitope-binding sites. In contrast, with $N = 10^{4}$, the model predicts 0.191 and 0.122 reversions per site per fixation for three and five epitope-binding sites, respectively. Therefore, the model appears to predict the observed amino acid reversion frequency in an important HIV-1 antibody epitope region.
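The consensus-based counting of fixations and reversions described here might be sketched as follows for a single patient. This is one possible reading of the procedure rather than the analysis code actually used, and it rests on several assumptions: the 50% consensus is taken to be the residue found in more than half of the sequences in a sample, a site with no such residue is treated as uncalled and skipped, at most one fixation is counted per pair of consecutive samples, and a reversion is counted when the consensus at a site returns to a residue that was the consensus in an earlier sample. All function names are hypothetical.

```python
from collections import Counter


def consensus(sequences, threshold=0.5):
    """Per-site residue whose frequency exceeds the threshold, or None if there is none."""
    calls = []
    for column in zip(*sequences):   # iterate over alignment columns (sites)
        residue, count = Counter(column).most_common(1)[0]
        calls.append(residue if count / len(column) > threshold else None)
    return calls


def fixations_and_reversions(samples):
    """samples: time-ordered list of samples, each a list of equal-length sequences.

    Returns (number of fixations, list of reversion counts per site).
    """
    calls = [consensus(sample) for sample in samples]
    n_sites = len(calls[0])
    current = list(calls[0])                     # latest called consensus per site
    earlier = [set() for _ in range(n_sites)]    # consensus residues already replaced
    fixations = 0
    reversions = [0] * n_sites
    for call in calls[1:]:
        changed = False
        for site, residue in enumerate(call):
            if residue is None or residue == current[site]:
                continue                         # uncalled site or no change
            if current[site] is not None:
                changed = True
                if residue in earlier[site]:     # return to an earlier consensus residue
                    reversions[site] += 1
                earlier[site].add(current[site])
            current[site] = residue
        if changed:
            fixations += 1
    return fixations, reversions


def reversions_per_site_per_fixation(reversions_per_site, fixations):
    """Normalize a mean number of reversions per site by the number of fixations."""
    return reversions_per_site / fixations
```

Applied to the patient averages reported in the text, the normalization is 0.18 / 2.10 ≈ 0.086 reversions per site per fixation.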
The model does not take into account possible interactions among mutations in their effects on fitness, that is, fitness epistasis. Mainly compensatory interactions have been reported for the HIV-1 V3 region in its evolution from using the primary chemokine coreceptor for cell entry to using an alternative coreceptor (da Silva et al. 2010). Such interactions may increase the rate of fixation if one or more compensatory mutations fix subsequently to the fixation of an escape mutation. More importantly, compensatory mutations would reduce the rate of reversion as they decrease the fitness cost of an escape mutation. Nevertheless, the model without the complication of epistasis successfully predicted the number of reversions per site per fixation observed for V3.
Frequent reversions at epitope-binding sites resulting from fluctuating antibody selection may help explain the strong linear relationship between the among-population mean site-specific frequency of an amino acid and its effect on fitness in HIV-1 V3 (da Silva 2006). This relationship has been explained as the result of strong fluctuating selection and weak mutation (da Silva 2010). Under the conditions of strong directional selection and weak mutation, the probability that a particular beneficial mutation, among several, is the next to spread to fixation is proportional to its selection coefficient (Gillespie 1991; Orr 2002). With fluctuating selection, such steps in an adaptive walk are constantly being repeated, and when averaged across populations, the frequency of a mutant may be proportional to its effect on fitness (da Silva 2010).
These results show that the recurring generation of specific, neutralizing antibodies to an epitope region may cause amino acid reversions. Reversions are expected to be especially common when selection is strong and mutation is weak, because under these conditions clonal interference does not reduce the rate of adaptation. HIV-1 evolves under strong selection and weak mutation, and its predicted rate of reversions normalized by the number of allele fixations matches that observed for an important antibody epitope region. Antibody selection may generate complex adaptive dynamics.
Associate Editor: L. Meyers
This research was supported by The School of Molecular and Biomedical Science, and its Discipline of Genetics, at The University of Adelaide. L. S. Ling helped compile sequence data for HIV-1. An earlier version of the manuscript was greatly improved by the suggestions of two anonymous reviewers.